Early Learning Curve in the Assessment of Deep Pelvic Endometriosis for Ultrasound and Magnetic Resonance Imaging

Purpose. We aimed to compare the learning curves of an ultrasound trainee (obstetrics and gynecology resident) and a radiology trainee when assessing pelvic endometriosis. Methods. Consecutive patients with suspected endometriosis were prospectively enrolled in a tertiary center. They underwent ultrasound and magnetic resonance imaging preoperatively, both of which were reported according to the International Deep Endometriosis Analysis (IDEA) group consensus. Trainees reported on deep endometriosis (DE), endometriomas, frozen pelvis, and adenomyosis. Using Kappa agreement, their findings were compared against laparoscopy/histology and expert findings. The learning curve was considered positive when performance improved over time and indeterminate in all other cases. Results. Reports from thirty-five women were divided chronologically into 3 equal blocks to assess the learning curve. For ultrasound, trainee versus expert showed a positive learning curve in overall pelvic DE assessment. There was excellent agreement for adenomyosis (Kappa = 1.00, p = 0.09), frozen pelvis (Kappa = 0.90, p = 0.01), bowel (Kappa = 1.00, p = 0.01), and bladder DE assessment (Kappa = 1.00, p = 0.01). Endometrioma and uterosacral ligament assessment showed an indeterminate curve. For radiology, trainee versus expert showed a positive curve when detecting adenomyosis (Kappa = 0.42, p = 0.09) and bladder DE (Kappa = 1.00, p = 0.01). The assessment of endometriomas, frozen pelvis, overall pelvic DE, bowel, and uterosacral ligament DE showed an indeterminate curve. Agreement between trainees and laparoscopy/histology showed a positive curve for bladder (both) and frozen pelvis (ultrasound only). Conclusion. A positive learning curve can be seen in some areas of pelvic endometriosis mapping after as few as 35 cases, but a bigger caseload is required to demonstrate the curve in full. The ultrasound trainee had positive learning curves in more anatomical locations (bladder, adenomyosis, overall bowel DE, frozen pelvis) than the radiology trainee (bladder, adenomyosis), which could be down to individual factors, differences in training, or the imaging method itself.


Introduction
Accurate preoperative mapping of pelvic endometriosis is crucial for individualized treatment. It is important that professionals reading images report systematically on the presence of adenomyosis, endometriomas, frozen pelvis (as an indirect sign of endometriosis [1]), and deep endometriosis (DE) lesions. Ultrasound and magnetic resonance imaging (MRI) are the predominant methods for evaluating pelvic endometriosis, but only ultrasound has an internationally accepted consensus on terms, definitions, and measurements: the International Deep Endometriosis Analysis (IDEA) group consensus [2]. No similar document guides MRI reporting; however, the European Society of Urogenital Radiology (ESUR) has published guidelines on the technical protocol for pelvic MRI in endometriosis [3].
Ultrasound is widely accepted as a method of choice for detecting endometriomas [4], and it was shown to have similar accuracy to MRI in diagnosing adenomyosis [5] and DE [6]. Despite good evidence on the accuracy of ultrasound [6], its wide availability, and no contraindication for use, it is frequently not the diagnostic modality of choice due to various reasons. One such reason is the lack of training and skills in this area. In order to even consider a new imaging method, one has to contemplate the necessary training requirement, characterised by the learning curve.
The learning curve can be described as an improvement in the performance of a given task. In ultrasound, this would consist of not only gaining theoretical knowledge and its application in pattern recognition but also learning probe manipulation, which requires good hand-eye coordination and manual dexterity. For MRI, the learning curve may be shorter since manual dexterity is not necessary. Accuracy is expected to plateau after a certain number of cases.
In this paper, we aimed to compare the learning curves of an obstetrics and gynecology (O&G) trainee using ultrasound and a radiology trainee using MRI when evaluating pelvic endometriosis, where expert reports and histologically confirmed laparoscopic findings served as reference standards.

Methods
This prospective study was conducted at a tertiary referral endometriosis center. It aimed to compare the learning curves of an ultrasound trainee and a radiology trainee when assessing pelvic endometriosis (adenomyosis, endometriomas, frozen pelvis, and DE) in the same cohort of patients using one predefined protocol, which was based on the International Deep Endometriosis Analysis (IDEA) group consensus [2] adapted for MRI, as per Indrielle-Kelly et al. [7]. The diagnostic performance of the trainees was compared against the findings of in-house ultrasound and radiology experts and also against histologically confirmed laparoscopic findings.
There are several ways of assessing a learning curve, and in this study, we used the following model which was previously employed in other research studies [8,9]. Before the analysis, the participants were divided into 3 blocks based on the chronological order. The learning curve was then assessed as an improvement of agreement between trainees and experts over time across these blocks.
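The chronological division into blocks can be illustrated with a short sketch. It is a minimal illustration rather than the procedure actually used in the study: the function name `split_into_blocks` is hypothetical, and the rule for distributing a remainder (earlier blocks receive the extra case) is an assumption, since the text does not state how a cohort that does not divide evenly by 3 was handled.

```python
def split_into_blocks(cases, n_blocks=3):
    """Split a chronologically ordered list of cases into contiguous blocks.

    Assumption: when the cohort size is not divisible by n_blocks,
    the earlier blocks each receive one extra case.
    """
    if n_blocks < 1:
        raise ValueError("n_blocks must be at least 1")
    base, extra = divmod(len(cases), n_blocks)
    blocks, start = [], 0
    for i in range(n_blocks):
        size = base + (1 if i < extra else 0)
        blocks.append(cases[start:start + size])
        start += size
    return blocks


# Hypothetical usage: 35 case identifiers in order of recruitment.
cohort = list(range(1, 36))
blocks = split_into_blocks(cohort)
block_sizes = [len(b) for b in blocks]
```

Keeping the blocks contiguous preserves the recruitment order, which is what allows a change in agreement from one block to the next to be read as a change over time.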
Participants.
Consecutive patients with suspected pelvic endometriosis planned for surgical treatment were enrolled in the study in a tertiary endometriosis centre. Endometriosis was suspected based on the symptoms, previous basic imaging, or findings from diagnostic laparoscopy performed in a district hospital. The inclusion criteria consisted of age 18-50 years, planned surgical treatment of pelvic endometriosis, no changes in the hormonal treatment in the last 4 months, and ultrasound and MRI to surgery time < 4 months. The exclusion criteria were age outside the desired range, suspected malignancy, delay between index imaging and surgery (reference) longer than 4 months, missing one of the 3 imaging investigations which were offered as part of the study, and/or participants declining surgery. The participants were divided into three blocks based on the order in which they were recruited. All participants underwent two ultrasound assessments, one by the ultrasound trainee and one by the ultrasound expert. Concurrently, the MRI examination was scheduled and evaluated by a radiology trainee and an expert. All four examiners were blinded to previous clinical and surgical findings and other imaging. The findings by trainees were not considered when planning for surgery.

Subjects.
Both trainees were residents in the final years of their training, and despite having intermediate skills and experience in gynecological imaging, neither had prior experience in endometriosis mapping (i.e., description of locations, size and numbers of DE lesions, endometriomas, adenomyosis, and frozen pelvis). The ultrasound trainee (T.I.) was a 4th year resident in O&G with intermediate ultrasound skills (3 years of experience, consisting of approximately 500 gynecologic ultrasound cases), doing her postgraduate studies in endometriosis ultrasound. The radiology trainee (P.H.) was a 5th year resident in general radiology with no special interest in gynecology. The ultrasound experts (D.F., F.F.) and the radiology expert (A.B.) were all specialists in their respective fields with more than 10 years of postresidency experience in advanced pelvic imaging. We did not recruit more than one ultrasound trainee due to the ethical issue of subjecting participants to multiple unnecessary vaginal scans.

Index Tests.
Both imaging modalities were reported using the ultrasound-specific protocol based on the IDEA consensus [2]. For MRI, the protocol was adapted with some modifications [7], including removing site-specific tenderness as a soft marker and replacing the sliding sign with signs of adhesions from distorted anatomy (e.g., "ear sign"). The settings and technical protocols reflected routine clinical practice. Plain transvaginal and transabdominal ultrasound examinations were performed without any bowel preparation or gel sonography using a Voluson E10 (GE Medical Systems, Zipf, Austria) in a gynecology setting. The MRI assessment was done using a 3 Tesla Siemens MRI scanner with a phased-array coil (Skyra, Siemens AG, Erlangen, Germany) according to the protocol recommended by the European Society of Urogenital Radiology (ESUR) [3], including the intravenous application of a spasmolytic agent, with no vaginal or rectal contrast agents.

Reference Standard.
Trainees were assessed against two reference standards. The first standard was represented by reports from expert imaging where the trainee's diagnostic performance in the three blocks was assessed against the expert's findings. The second reference standard was a laparoscopic evaluation with histological confirmation in most cases. Anatomical sites with a normal appearance on laparoscopy were not biopsied; hence, histological confirmation was missing for those sites. Only sites judged as affected were either resected or biopsied, providing histological confirmation. Adenomyosis was not assessed on laparoscopy because only 1 patient had a hysterectomy.
Learning Procedure.
The ultrasound trainee was assessed by the ultrasound experts, and the radiology trainee was assessed by the radiology expert, as being at a comparable level of their respective training. Both trainees conducted self-study prior to the study, focusing on relevant guidelines and imaging protocols (IDEA [2], ESUR [3]) and pattern recognition in endometriosis. The ultrasound trainee (T.I.) scanned patients with their consent under the indirect supervision of the ultrasound experts (F.F., D.F.) and was blinded to the clinical findings and other imaging reports. Apart from regular meetings with the supervisors to discuss cases (indirect supervision), the O&G trainee was also involved in the patients' clinical care, including assistance during surgical treatment of endometriosis, which provided retrospective correlation between the ultrasound and intraoperative findings. The radiology trainee (P.H.) was also blinded to the previous findings and reported MRI independently of the radiology expert (A.B.). He had regular meetings to review the imaging reports and images with the supervisor and went through operative notes retrospectively on the computer.
The learning curve was assessed as "positive" when the agreement was increasing with the increasing number of cases between the blocks and as "indeterminate" when the performance plateaued or the improvement was inconsistent.
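The positive/indeterminate rule can be written down as a small function. This is a hedged sketch, not the authors' procedure: the name `classify_learning_curve` is hypothetical, and reading "increasing" as a strict rise between every pair of consecutive evaluable blocks is an assumption. Blocks in which agreement could not be computed are passed as `None` and skipped.

```python
def classify_learning_curve(block_kappas):
    """Label a learning curve from per-block agreement values.

    Assumption: 'positive' means agreement rises strictly from each
    evaluable block to the next; a plateau or any drop is 'indeterminate'.
    Blocks without an agreement value are given as None and ignored.
    """
    usable = [k for k in block_kappas if k is not None]
    if len(usable) < 2:
        # A single block cannot show a trend.
        return "indeterminate"
    rising = all(later > earlier for earlier, later in zip(usable, usable[1:]))
    return "positive" if rising else "indeterminate"


# Hypothetical per-block agreement values (not data from this study).
steady_gain = classify_learning_curve([0.40, 0.60, 1.00])
plateau = classify_learning_curve([0.60, 0.60, 0.80])
dip = classify_learning_curve([0.80, 0.50, 1.00])
two_blocks = classify_learning_curve([None, 0.50, 0.90])
```

Under these assumptions the four examples are labeled positive, indeterminate, indeterminate, and positive, respectively.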
Statistical Analysis.
Kappa value (k) was used to evaluate the level of agreement between the trainees and laparoscopy/histology reference and the trainees and experts in all three blocks individually and then overall in the whole cohort. When certain anatomical sites of endometriosis involvement were missing in the block, the learning curve was calculated from 2 blocks only.
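For readers unfamiliar with the statistic, the following sketch computes an unweighted Cohen's Kappa for two raters from paired present/absent findings. It assumes that the Kappa used in the study was the standard unweighted Cohen's Kappa, which the text does not state explicitly; the function name `cohen_kappa` and the example ratings are hypothetical. When both raters report the same single category for every case, chance agreement equals 1 and Kappa is undefined, which the sketch signals by returning `None`; this is one way a block can become non-evaluable.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two equal-length sequences of labels.

    Returns None when Kappa is undefined, i.e. when both raters used
    one identical category for every case (chance agreement equals 1).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return None
    return (observed - expected) / (1.0 - expected)


# Hypothetical findings for one anatomical site (1 = lesion reported, 0 = none).
trainee = [1, 0, 1, 1, 0, 0]
expert = [1, 0, 0, 1, 0, 0]
kappa = cohen_kappa(trainee, expert)  # observed 5/6, expected 1/2
```

In this example the raters agree on 5 of 6 cases and chance agreement is 0.5, so Kappa is (5/6 - 1/2) / (1 - 1/2), or about 0.67. Applying the same function to each chronological block separately gives the per-block values from which a learning curve can be read.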

Ethical Approval.
The local ethics committee approved the study protocol, and informed consent was obtained from all subjects (study number 1249/16 S-IV, approved version 1486/16 IS).

Learning Curves. The results are in
The radiology trainee versus expert showed a statistically significant positive learning curve in adenomyosis (Kappa = 0.42, p = 0.09) and bladder DE detection (Kappa = 1.00, p = 0.01). The radiology trainee had an indeterminate learning curve in the assessment of bowel lesions, endometriomas, uterosacral ligaments, and frozen pelvis. The learning curve of pelvic DE detection did not show any obvious improvement and was also assessed as indeterminate.
The agreement of both trainees with expert imaging was better than the agreement with the laparoscopy in the majority of cases. Both trainees reached an excellent agreement

Discussion
This study is the first to assess the learning curve of endometriosis assessment by ultrasound and MRI in one cohort of patients using the IDEA consensus [2]. It counts among the few studies describing the real-life learning curve for ultrasound without using offline assessments of images and/or video clips. We showed that after as few as 35 cases, the ultrasound trainee had a positive learning curve in more anatomical locations than the radiology trainee, reaching excellent agreement in frozen pelvis, adenomyosis, bowel, and bladder DE assessment, while the radiology trainee achieved excellent agreement in bladder DE detection only. Choosing an O&G trainee for ultrasound and a radiology trainee for MRI reflects the typical representation of the two specialities actively involved in endometriosis imaging. Endometriosis centres can choose which imaging modality to use, but given our results, the choice should not be based solely on the need for training in ultrasound. We show that accurate MRI reading also depends on the caseload, as described by its learning curve. Another strength of our study is the comparison drawn against two reference standards. In the early learning curve, it is more meaningful to compare the performance against expert imaging because it reflects the gold standard in imaging. Difficulties in detecting certain lesions (small vaginal nodules, multiple bowel lesions, etc.) will also affect the accuracy of an expert, providing a performance adjustment for the trainee's accuracy. In the later learning curve, when expert levels in imaging are being reached, the comparison with laparoscopy/histology is more accurate, because ultimately, visual and histological confirmation is the gold standard in the diagnosis of endometriosis. This was demonstrated in our early learning curve, where agreement with expert imaging was achieved more easily and quickly than agreement with laparoscopy.
The main limitation of our study is a small sample size where the incidence of lesions in certain anatomical sites was too low to assess a meaningful learning curve (for instance involvement of rectovaginal septum). Also, the number of trainees (2) introduces a possible bias due to personal factors. The individual learning potential of a single trainee may not be representative of a learning potential of all trainees, and any generalisation to sonographers and radiologists in training should be done with caution. In regard to data analysis, it could be argued that the use of cumulative summation tests for the learning curve (LC CUSUM) [11] might have been more appropriate. LC CUSUM offers a learning curve with a predefined threshold at which the trainee is deemed competent. In view of the limited number of cases and a small likelihood of reaching competency in all areas, we aimed to provide more graphic analysis of the development of a positive/indeterminate early learning curve, which Kappa agreement describes better. This should however have no effect on possible future comparison because even though the results are reported in different formats, they all answer the same question, which is how many cases are required to reach an expert level.
One of the interesting aspects of our study was the unexpected discrepancy between the learning curves in ultrasound and MRI. A possible explanation for this finding lies in the individual trainees, their training, or the imaging modality itself. The first is related to the individual learning ability of the trainees, their speed of internalizing new information, and their skillset. From the training perspective, although both trainees received feedback on their reporting skills, the O&G trainee was directly involved in providing medical care to the participants. We assume that the learning of the ultrasound trainee was enhanced by their involvement in other aspects of endometriosis care, such as direct contact with the patients, multidisciplinary meetings, and assistance in theatres with the possibility to correlate the real-life appearance of pelvic endometriosis with ultrasound images. The third aspect is possibly enhanced learning in ultrasound as an imaging modality, stemming from the combination of soft markers (such as site-specific tenderness) and the imaging method itself. Tenderness during ultrasound examination guides the sonographer to the points of likely involvement, increasing the chances of detecting small nodules, such as those on the uterosacral ligaments and bowel. Although it was not in the design of this study, we can presume that adding clinical examination (such as bimanual palpation) to the ultrasound examination would enhance the training as well, making the sonographer/gynecologist's learning curve even steeper.
Another unexpected finding was the trainees' inconsistent accuracy in detecting endometriomas, with worse agreement with experts than with laparoscopy. On retrospective review of all the cases, experts reported in more detail, including small endometriomas that were ignored at surgery, while the trainees tended to focus on bigger lesions, which in turn explained the seemingly better trainee-laparoscopy agreement in the endometrioma assessment. There were also two cases of ovarian abscess, where intracystic content of ground glass appearance is not distinguishable from endometrioma on ultrasound but is easy to differentiate on the T1 and T2 MRI sequences. This resulted in a better diagnostic performance of the MRI trainee in the first block.
Previous research assessed the learning curve in endometriosis mapping in several ways. Guerriero et al. studied the learning curve on offline and hands-on training and suggested that between 17 cases (bladder DE) and 44 cases (uterosacral ligaments) are required to reach a predefined threshold in accuracy. Extrapolated to our study, it represents approximately 100-150 cases to achieve a plateau in all areas, and our 35 cases therefore truly correspond to the early stages of the learning curve of DE assessment. Bazot et al. [12], however, showed for the learning curve of ultrasound assessment of endometriomas that the inter-trainee variability was very wide and suggested that the assessment of the learning curve might require a more individual approach in training, rather than standardising a set number of cases for everyone in training. Future studies should address the learning curve in pelvic endometriosis assessment in its entirety with a hands-on setting, preferably undertaken in tertiary centers to ensure a steady flow of disease-positive cases. Since the learning curve is not a uniform entity for all trainees, employing several trainees in one study would be beneficial to define a range of cases required to achieve competency. This should then be reflected in the requirements for endometriosis centre accreditation.
In conclusion, this unique study comparing the early learning curve of an O&G trainee using ultrasound and a radiology trainee using MRI when evaluating pelvic endometriosis showed a positive learning curve in several areas in as little as 35 cases. A bigger caseload would be required to demonstrate the learning curve in full. Secondly, we found that the ultrasound trainee had positive learning curves in more anatomical locations (bladder, adenomyosis, overall bowel DE, frozen pelvis) than the radiology trainee (bladder, adenomyosis), which could be down to individual factors, the difference in training, or the imaging method itself.

Data Availability
The data that support the findings of this study are available from the corresponding author, AB, upon reasonable request.

Conflicts of Interest
The authors declare that they have no conflict of interest.