Is Musculoskeletal Ultrasonography an Operator-Dependent Method or a Fast and Reliably Teachable Diagnostic Tool? Interreader Agreements of Three Ultrasonographers with Different Training Levels

Objectives. To assess interreader agreements and a learning curve among three musculoskeletal ultrasonographers with different levels of experience (senior, junior, and beginner). The senior served as the imaging “gold standard”. Methods. Clinically dominant joints (finger, shoulder, knee, tibiotalar, and talonavicular) of 15 rheumatoid arthritis (RA) patients were examined by three ultrasonographers with different levels of experience (senior 10 years, junior 10 months, and beginner one month). Each examiner reported the ultrasonographic findings for every patient without knowledge of the other investigators' results. κ coefficients, percentage agreements, sensitivities, and specificities were calculated. Results. 120 joints of 15 RA patients were evaluated. Comparing the junior's and the beginner's results each to the senior's findings, the overall κ for all examined joints was 0.83 (93%) for the junior and 0.43 (76%) for the beginner. Regarding the different joints, the junior's findings agreed very well with the senior's findings (finger joints: κ = 0.82; shoulder: κ = 0.9; knee: κ = 0.74; tibiotalar joint: κ = 0.84; talonavicular joint: κ = 0.84), while the beginner's findings showed only fair to moderate agreement (finger joints: κ = 0.4; shoulder: κ = 0.42; knee: κ = 0.4; tibiotalar joint: κ = 0.59; talonavicular joint: κ = 0.35). Overall, the beginner's results clearly improved from κ = 0.34 (agreement of 67%) at baseline to κ = 0.78 (agreement of 89%) at the end of the evaluation period. Conclusions. The ultrasonographic evaluation by an investigator with ten months of experience was in substantial agreement with that of a senior ultrasonographer. Agreement between a beginner and a highly experienced ultrasonographer was only fair at the beginning, but over the course of the study, which comprised ultrasonographic sessions with 15 RA patients, the beginner clearly improved in musculoskeletal ultrasonography.


Introduction
Rheumatoid arthritis (RA) is characterised by synovitis and erosions of small finger joints, though large joints are also commonly affected. Due to technical improvements, musculoskeletal ultrasonography (US) has become an established method to detect soft tissue inflammatory processes and early superficial bone lesions in patients with RA. In comparison to other imaging methods used as diagnostic tools in rheumatology, US has remarkable advantages such as easy and quick access, noninvasiveness, low cost, the ability to scan multiple joints, repeatability, and high patient acceptability [1]. However, it has been stated but not sufficiently investigated that musculoskeletal US is one of the most operator-dependent imaging techniques [2,3]. The inter- and intraobserver variations have only been tested in a minority of studies [2][3][4]. In recent studies of the European League against Rheumatism (EULAR) working group for imaging in RA, interobserver reliabilities, sensitivities, and specificities in comparison with MRI were found to be moderate to good [2,3]. Nevertheless, further standardisation of US scanning techniques and of the definitions of different pathological lesions is needed to increase the interobserver agreement in musculoskeletal US so that results of US reports can be compared in multicenter studies.
International Journal of Rheumatology
Further studies are also necessary to explore the time needed to acquire the practical US skills required to perform a sufficient musculoskeletal US scan. A learning curve has been assessed by D'Agostino et al., who evaluated synovitis in MCP, PIP, and MTP joints. They found that at least 70 examinations were necessary to develop ultrasonographical competence in detecting synovitis in small joints [5]. Bone erosions and other commonly affected joints were not included in that study. Hence, we performed a study with three ultrasonographers with different levels of experience (senior, junior, and beginner) who evaluated small finger, tibiotalar, and talonavicular joints as well as large joints like the shoulder and knee of RA patients, in order to differentiate the learning ability with regard to the different joints and joint pathologies. The junior's and the beginner's results were compared with those of the senior, who served as the imaging "gold standard".

Patients and Joint Regions.
Fifteen patients with RA (ten female, five male, mean (SD) age 58 (±14.9) years, range 29 to 84) according to the American College of Rheumatology criteria [6] were recruited from the Rheumatology Outpatient Clinic of the University Hospital Goettingen, Germany. Each patient was examined once by all three ultrasonographers on the same examination day, within an evaluation period of two months. Altogether, 120 clinically dominant joints were assessed in this study. These were finger joints (MCP II and III, PIP II and III), shoulder, knee, tibiotalar, and talonavicular joints.

Musculoskeletal Ultrasonographers.
US was performed by three ultrasonographers with different levels of experience. The senior ultrasonographer (AKS) had worked as an MD in musculoskeletal US for ten years and was a member of the EULAR and OMERACT US expert group; he was therefore considered a US specialist. The junior ultrasonographer (JG) also worked as an MD during the study and had ten months of experience in the field of musculoskeletal US. She was using US as a routine diagnostic tool as well as in clinical research studies, especially for scanning finger and toe joints. On average, she scanned two to three patients per day. The beginner ultrasonographer (SO) was still a medical student during this study and had one month of musculoskeletal US experience. She underwent one month (12 hours per week) of practical ultrasonographical training sessions ("hands-on" training and didactic instruction in standard scans) before the beginning of the study and had therefore completed 50 hours of US training before the study's onset.
The three ultrasonographers reported and documented their US findings independently and unaware of the other investigators' results at the same visit of each patient.
The finger joints MCP II, III and PIP II, III were examined both for synovitis (Figures 1(a) and 1(b)) and for erosions, each from the dorsal and from the palmar view. The MCP joint II was also scanned from the radial side for erosions. Synovitis was defined as both synovial hypertrophy and effusion [11]. An interruption of the bone surface visible in two perpendicular planes was described as an erosion, as defined by the OMERACT group [12].
For the shoulder joint, emphasis was placed on the following pathologies. Firstly, tenosynovitis of the long biceps tendon was recorded if hypo-/anechoic thickened tissue, with or without fluid, was seen within the tendon sheath in two perpendicular planes [12]. Secondly, we looked for subdeltoid bursitis in the anterior (Figure 1(c)), lateral, and dorsal view. Further, the joint was evaluated for partial/full rotator cuff rupture. The humeral head surface was also evaluated for erosions according to the OMERACT definition of erosion [12]. A pathologic distension of the joint capsule with an intraarticular effusion and/or synovial proliferation in the dorsal and anterior region was defined as synovitis.
The evaluated pathologies for the knee joint were suprapatellar effusion, synovitis, and erosions of the medial and lateral joint recess and popliteal cysts according to German standard scans for knee joint examination [9].
The tibiotalar joint was examined both for effusion (Figure 1(d)) and for erosion, while the talonavicular joint was assessed for effusion only, according to the definition of the US ankle and foot examination [10]. The pathologies for the ultrasonographic investigation are listed in Table 1. All of the assessed pathologies were evaluated on a qualitative yes/no (1/0) basis.
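Because every pathology is scored as 1 (present) or 0 (absent), sensitivity and specificity relative to the senior's reading can be obtained by simple counting. The following is a minimal sketch under that assumption; the function name and the example scores are hypothetical and are not data from this study.

```python
import math

def sensitivity_specificity(reference, reader):
    """Return (sensitivity, specificity) of `reader` against `reference`.

    Both arguments are equal-length sequences of 0/1 scores, one entry
    per assessed pathology; `reference` plays the role of the senior.
    """
    if len(reference) != len(reader):
        raise ValueError("both score lists must have the same length")
    tp = sum(1 for r, x in zip(reference, reader) if r == 1 and x == 1)
    fn = sum(1 for r, x in zip(reference, reader) if r == 1 and x == 0)
    tn = sum(1 for r, x in zip(reference, reader) if r == 0 and x == 0)
    fp = sum(1 for r, x in zip(reference, reader) if r == 0 and x == 1)
    # A ratio is undefined when the reference contains no positives/negatives.
    sens = tp / (tp + fn) if (tp + fn) else math.nan
    spec = tn / (tn + fp) if (tn + fp) else math.nan
    return sens, spec

# Hypothetical scores for eight findings (illustration only).
senior = [1, 1, 1, 1, 0, 0, 0, 0]
beginner = [1, 1, 0, 0, 0, 0, 0, 1]
print(sensitivity_specificity(senior, beginner))
```

In this made-up example the reader misses two of four positive findings and reports one false positive.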

Interreader and Learning Curve Sessions.
Three ultrasonographers with different levels of experience evaluated 15 RA patients during an examination period of two months. On the one hand, interreader results were evaluated with regard to the different joints and joint pathologies. On the other hand, an overall κ and percentage agreement were calculated for the findings of each of the 15 US examination sessions. For the beginner's results, a learning curve was developed.
The κ coefficients were classified as follows: κ < 0.0: poor, κ = 0-0.20: slight, κ = 0.21-0.40: fair, κ = 0.41-0.60: moderate, κ = 0.61-0.80: substantial, and κ = 0.81-1.0: almost perfect agreement [13].
The interreader agreements are reported both over the learning period and for the individually examined joint regions.
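The κ statistic and the agreement categories listed above can be sketched in a few lines. This is an illustrative implementation of the standard two-reader Cohen's κ for binary scores, not the software used in the study; the function names, the example scores, and the handling of values that fall between the published category limits (for example 0.205) are assumptions.

```python
import math

def cohen_kappa(reader_a, reader_b):
    """Return (observed agreement, Cohen's kappa) for two 0/1 score lists."""
    if len(reader_a) != len(reader_b) or len(reader_a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(reader_a)
    p_o = sum(1 for a, b in zip(reader_a, reader_b) if a == b) / n
    pos_a = sum(reader_a) / n  # share of positive scores, reader A
    pos_b = sum(reader_b) / n  # share of positive scores, reader B
    # Agreement expected by chance from the marginal frequencies.
    p_e = pos_a * pos_b + (1 - pos_a) * (1 - pos_b)
    if p_e == 1.0:
        return p_o, math.nan  # kappa is undefined without any variation
    return p_o, (p_o - p_e) / (1 - p_e)

def agreement_category(kappa):
    """Map kappa to the categories used in the text (upper limits inclusive)."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.0, "almost perfect")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")

# Hypothetical scores for eight findings (illustration only).
senior = [1, 1, 1, 1, 0, 0, 0, 0]
reader = [1, 1, 1, 0, 0, 0, 0, 1]
p_o, kappa = cohen_kappa(senior, reader)
print(p_o, kappa, agreement_category(kappa))
```

Applied to the overall values reported in this study, `agreement_category(0.83)` gives "almost perfect" and `agreement_category(0.43)` gives "moderate".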
The interreader agreements between junior and senior across the US evaluation sessions (15 in total) were substantial to almost perfect (mean κ = 0.83; mean agreement = 93%). Throughout the study, the junior ultrasonographer consistently maintained high agreement levels (Figure 2(a)). The interreader agreements between beginner and senior clearly improved from κ = 0.34 (agreement 67%) at the first US evaluation session to κ = 0.78 (agreement 89%) at the end of the two-month evaluation period, as presented in a learning curve (Figure 2(b)). The beginner's improvement was most apparent after the 10th evaluation session.
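As a plausibility check, the beginner's κ values and percentage agreements can be related through the standard definition of Cohen's κ, where p_o is the observed agreement and p_e the agreement expected by chance. The chance agreement is not reported in the study; the values below are back-calculated from the rounded published figures and are therefore only approximate.

```latex
\begin{align*}
\kappa &= \frac{p_o - p_e}{1 - p_e}
\quad\Longrightarrow\quad
p_e = \frac{p_o - \kappa}{1 - \kappa} \\[4pt]
\text{first session:}\quad p_e &\approx \frac{0.67 - 0.34}{1 - 0.34} = \frac{0.33}{0.66} = 0.50 \\[4pt]
\text{last session:}\quad p_e &\approx \frac{0.89 - 0.78}{1 - 0.78} = \frac{0.11}{0.22} = 0.50
\end{align*}
```

Under this reading, roughly half of the yes/no ratings would coincide by chance alone, which is why an observed agreement of 67% corresponds to only fair agreement on the κ scale.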

Discussion
During the last decade, musculoskeletal US has become an indispensable diagnostic tool in the management of rheumatic diseases. Especially in patients with RA, it is important for both diagnosis and disease monitoring. It is widely used as an important outcome measure in therapeutic trials in RA [14][15][16][17]. The main criticism of US is that it is one of the most operator-dependent imaging methods. However, Scheel et al. were recently able to show moderate to good interreader agreements in the first interobserver variability study performed by 14 experts of the EULAR working group [2]. Confirming these findings in a larger study, Naredo et al. also found moderate to good interobserver reliabilities between 23 European musculoskeletal ultrasound experts [3]. Nevertheless, further standardisation of US scanning techniques and of the definitions of different pathological lesions is needed to increase the interobserver agreement in musculoskeletal US [18][19][20]. D'Agostino et al. performed the first study examining the rate at which rheumatologists with little or no experience in musculoskeletal US develop adequate skills to undertake a US evaluation. They showed that 70 examinations (including training sessions) are necessary to assess synovitis of the small MCP, PIP, and MTP joints accurately [5]. In our study, three ultrasonographers with different levels of experience evaluated small and large joints which are commonly affected in RA. We decided to assess clinically dominant finger joints, shoulder, knee, tibiotalar, and talonavicular joints for synovitis, erosions, and other typical RA joint pathologies. The wrist was excluded because erosions there are difficult to detect and to distinguish from physiological irregularities. A musculoskeletal ultrasonographer with ten months of experience reached substantial to almost perfect agreement with an investigator with ten years of experience (mean κ = 0.83).
In the detection of rotator cuff ruptures and popliteal cysts, the junior even reached a κ-value of 1. In contrast to our earlier study [2], however, in which the knee showed a κ-value of 1, the ten-month-experienced ultrasonographer this time reached an agreement of only κ = 0.74 for the knee. As Table 2(a) shows, this was because the junior ultrasonographer reached only κ = 0.57 in detecting suprapatellar effusion and κ = 0.63 in detecting erosions in the knee joint. This might be explained by difficulties in detecting small amounts of fluid in the suprapatellar recess (possibly because dynamic examination with contraction of the quadriceps muscle was not performed). In addition, the junior ultrasonographer's experience was mainly in assessing small finger and toe joints in research studies in this field. Consequently, the junior reached substantial to almost perfect agreement in the finger joint examination for erosions (κ range 0.71-1.0) and synovitis (κ range 0.62-1.0), except for MCP III. In the examination of MCP joint III, the junior reached κ-values of only 0.48 (dorsal) and 0.62 (palmar) in detecting synovitis (Table 2(a)). Interestingly, Szkudlarek et al. also found the highest US interobserver variability in the MCP joint III (ICC = 0.57) in comparison to other small joints (MCP II, PIP II, MTP I, II) [4]. In contrast to the very good results reached by the junior investigator in the examination of erosions, the beginner's results were in part extremely poor, especially in the detection of erosions in the radial part of the MCP II joint (Table 2(b)). This is a region of high interest for the detection of early erosions in RA. Consequently, special training in the accurate assessment of this region is strongly needed to improve the detection of erosions. With regard to the included joints, the most difficult joint to assess seemed to be the talonavicular joint, with a fair κ agreement of 0.34.
This can most likely be explained by the fact that small amounts of fluid were not detected by the beginner, especially at the beginning of the study. However, during the study period, which included ultrasonography of 15 patients with RA and the examination of 120 RA joints, the beginner's ultrasonographical competence clearly improved, and the beginner reached substantial agreement with the senior (from κ = 0.34 to κ = 0.78).
In this study, junior and beginner ultrasonographers were both taught by the same senior, who served as the imaging "gold standard", a constellation that carries the risk of reduced objectivity. Furthermore, intrareader agreement was not examined in this study. Larger studies with students of different US training levels from various US backgrounds are needed to confirm our results. Another limitation is that the reliability of power Doppler US, an emerging and important tool in the assessment of synovitis activity, was not tested in this study.
Taking our study results together, we were able to show that a US investigator with ten months of experience reached good to almost perfect agreement with a ten-year-experienced senior ultrasonographer and that an ultrasonographer with little experience substantially improved her US competence within a period of two months. This suggests that a relatively short teaching time can already lead to sufficient diagnostic US findings in grey-scale musculoskeletal ultrasonography. Therefore, the main criticism of musculoskeletal US as an operator-dependent and difficult-to-learn method might be attenuated.