Validation of the National Aeronautics and Space Administration-Task Load Index as a tool to evaluate the learning curve for endoscopy training

1Department of Medicine (Division Gastroenterology), University of Calgary, Calgary, Alberta; 2Department of Gastroenterology, Gloucestershire Hospitals NHS Foundation Trust Hospital, United Kingdom; 3Office of Undergraduate Medical Education, University of Calgary, Calgary, Alberta
Correspondence: Dr Sylvain Coderre, Department of Medicine (Division Gastroenterology), University of Calgary, 3500-26th Avenue Northeast, Calgary, Alberta T1Y 6J4. Telephone 403-943-5708, fax 403-943-4471, e-mail coderre@ucalgary.ca
Received for publication March 11, 2013. Accepted November 16, 2013

BACKGROUND: Although workload is assessed in a variety of fields, no endoscopy-specific workload assessment tool exists.
OBJECTIVE: To validate such a workload tool and use it to map the progression of novice gastroenterology trainees performing their first endoscopies.
METHODS: Eight novice gastroenterology trainees and 10 practicing gastroenterologists or surgeons completed the National Aeronautics and Space Administration-Task Load Index (NASA-TLX) workload assessment tool. An exploratory factor analysis was performed to establish a streamlined endoscopy-specific task load index, which was subsequently validated. The Endoscopy Task Load Index was used to monitor the progression of the trainees' exertion and self-assessed performance over their first 40 procedures.
RESULTS: Factor analysis of the NASA-TLX yielded two principal components: a measure of exertion and a measure of self-efficacy, which became the components of the newly validated Endoscopy Task Load Index. A steady decrease in self-perceived exertion was observed throughout the training period, more rapid for gastroscopy than for colonoscopy. Self-efficacy scores for gastroscopy increased rapidly over the first few procedures and then reached a plateau. For colonoscopy, reported self-efficacy improved progressively over the first three quartiles of procedures, followed by a drop in self-efficacy scores in the final quartile.
DISCUSSION: The present study validated an Endoscopy Task Load Index that can be completed in less than 1 min. The practical implications of such a tool for endoscopy education include identifying the periods of greatest exertion perceived by novice endoscopists, facilitating an appropriate level of guidance from trainers.

The road to acquiring competence, and at times excellence, in the performance of any skill requires a combination of innate biological capacities, dedicated teachers and many hours of training. Skills acquisition has been described as a sequential process involving three major phases (1). The first phase, or novice phase, involves intense concentration to fully understand the activity and avoid making mistakes. The second phase is an evolution to a more fluid, less cognitively arduous step in which trainees begin to perform at an acceptable level, with fewer major mistakes. The final phase involves a process of automation, in which the skill is precisely and smoothly performed with little or no conscious cognitive involvement. Different terminology has been used to describe a similar sequence of events. The terms 'unconscious' and 'conscious' incompetence have been used to describe the early training stage, evolving to conscious competence (akin to Ericsson's second phase) and, finally, unconscious competence for the more automated phase of skill acquisition (2). First described in 1971, endoscopy of the entire colon, or colonoscopy, is a common diagnostic procedure performed worldwide (3).
During a colonoscopy training session, a novice endoscopist must attend to myriad sensory stimuli: visual stimuli from the endoscopic image of the colon on the monitor; verbal stimuli from the patient, nurse and trainer; as well as tactile/proprioceptive stimuli from the instrument itself (4). In addition to these sources of cognitive load (5), colonoscopy also places physical demand on the trainee, such as the process of straightening the colonoscope or loop resolution, a technique critical to the successful, safe and comfortable advancement of the instrument through the colon (6). Examination of the upper gastrointestinal tract (ie, esophagogastroduodenoscopy [EGD]) also imposes many of these workload demands, although likely to a lesser extent.
The above-mentioned sources of workload, in addition to other potential sources such as time demand and frustration/anxiety, are captured in a workload assessment tool known as the National Aeronautics and Space Administration Task Load Index (NASA-TLX) (7). This subjective assessment of workload has been used in >500 published articles worldwide (8) in several domains including medicine (9). Subjective ratings are able to tap the essence of workload, and provide a valid, sensitive and practically useful indicator (7).
The objectives of the present study were threefold. The first was to identify the principal components of the NASA-TLX tool when applied to endoscopy training and to then use these data to create an endoscopy-specific rating tool. Our second objective was to assess the construct validity of the endoscopy task load rating tool. Our final study objective was to use the endoscopy task load rating tool to map our trainees' workload and perceived performance during the early phases of their training. Such attempts to measure mental effort during deliberate practice (10), and to relate these measures to the level of performance, have been shown to adequately represent the efficiency of the ongoing learning processes (11). We hypothesized that as trainees progress through their initial endoscopies, a steady, gradual decrease in workload would occur, correlating with an equally steady improvement in performance.

Participants
Participants were eight first-year gastroenterology residents at the University of Calgary (Calgary, Alberta). This gastroenterology residency training program lasts two years, during which residents perform, on average, 200 colonoscopies and 200 EGDs. Proficiency in both procedures is a requirement for graduating from the program. The study was conducted over two time periods, July 1 to September 30, 2009, and July 1 to September 30, 2010. There were four participants during each time period. This enabled data collection for the residents' procedural learning curve during the first three months of their training. Ten practicing gastroenterologists and colorectal surgeons with >5 years of experience in performing colonoscopy were asked to complete the NASA-TLX rating after two consecutive colonoscopies. Before beginning the study, ethics approval from the Conjoint Health Research Ethics Board at the University of Calgary was obtained, in addition to written informed consent from each of the participants.

Materials
A slightly modified version of the NASA-TLX rating tool was used for the present study. The unmodified NASA-TLX tool was initially piloted on three experienced gastroenterologists performing colonoscopy on two occasions. After their feedback, six items on the rating scales were kept, but some of the descriptors were modified for clarity and/or to make them more relevant to endoscopy. The final rating tool had six items: mental demand, physical demand, time demand, effort, performance, and frustration and anxiety (Appendix 1). Each of these items was rated on a visual analogue scale that was interpreted as the participant's subjective rating of each variable.

Procedure
The present analysis was a prospective observational cohort study. The participants were asked to complete the NASA-TLX rating for each of their colonoscopy and EGD procedures during the study time period. Cognizant of the effect that patient variability may have on the workload of endoscopy, the participants were asked not to rate procedures for patients known to have had previously difficult colonoscopies (failed or successful), one or more pelvic surgeries, two or more abdominal surgeries or previous colonic resection, or for patients deemed to experience excessive anxiety over the procedure. These exclusion criteria were not used for the colonoscopies rated by the practicing gastroenterologists and colorectal surgeons.

Statistical analyses
The reliability of the NASA-TLX survey was assessed using Cronbach's α coefficient. Before performing factor analysis, the Kaiser-Meyer-Olkin (KMO) test was used to assess the appropriateness of performing this analysis on this dataset (ie, to ensure KMO statistic >0.5). Using the NASA-TLX survey as the unit of analysis, an exploratory factor analysis on the individual items of this tool was performed. This technique reduces a set of items to a smaller number of underlying principal components and, in doing so, uncovers the latent structure of the set of items (12). Factor analysis can evaluate discrimination by statistically testing whether two or more items differ. Items are considered to be measuring different constructs if they load most heavily on different principal components (13). Items that load most heavily, or converge, on the same principal component are considered to be measuring the same construct.
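As an illustration of the reliability step, the following minimal sketch computes Cronbach's α from a matrix of item ratings using the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total score). The function name and the ratings are hypothetical and are not study data; the study itself used STATA.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of surveys, each a list of k item ratings."""
    k = len(rows[0])
    # Sample variance of each item across surveys (columns of the data matrix).
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the total score (sum of all items) across surveys.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical visual analogue ratings (0 to 100): four surveys, three items.
ratings = [
    [70, 65, 80],
    [50, 55, 60],
    [30, 35, 40],
    [20, 15, 25],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Values approaching 1 indicate that the items vary together across surveys, which is the sense in which the α coefficients reported for the colonoscopy and EGD surveys are read.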
In the analysis, a Pearson product moment correlation matrix for the NASA-TLX items was initially constructed, and principal component analysis was then used to extract factors. A cut-off threshold of eigenvalue ≥1 (Kaiser rule) was used for factor extraction. Factor loading was then performed on the extracted factors, followed by factor rotation using the Varimax method with Kaiser normalization (12). A cut-off threshold of 0.5 was used for factor loading.
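The extraction and rotation steps can be sketched as follows, assuming NumPy is available. This is an illustrative reimplementation on simulated data, not the STATA procedure used in the study; the function names, the simulated six-item dataset and its two-factor structure are all assumptions.

```python
import numpy as np

def extract_components(data, min_eigenvalue=1.0):
    """PCA on the Pearson correlation matrix, keeping eigenvalues >= 1 (Kaiser rule)."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)            # returned in ascending order
    order = np.argsort(eigvals)[::-1]                  # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= min_eigenvalue
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings

def varimax(loadings, max_iter=100, tol=1e-8):
    """Varimax rotation with Kaiser (row) normalization."""
    h = np.sqrt((loadings ** 2).sum(axis=1, keepdims=True))  # root communalities
    A = loadings / h                                         # normalize each item row
    p, k = A.shape
    R = np.eye(k)
    prev = 0.0
    for _ in range(max_iter):
        L = A @ R
        target = L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(A.T @ target)
        R = u @ vt                                           # orthogonal rotation matrix
        if s.sum() < prev * (1 + tol):                       # criterion stopped improving
            break
        prev = s.sum()
    return (A @ R) * h                                       # undo the normalization

# Simulated surveys: items 0-2 share one latent factor, items 3-5 another.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 300))
items = np.column_stack([0.8 * f + 0.6 * rng.normal(size=300)
                         for f in (f1, f1, f1, f2, f2, f2)])

eigvals, loadings = extract_components(items)
rotated = varimax(loadings)
strong = np.abs(rotated) >= 0.5          # 0.5 loading cut-off, as in the text
print(np.round(eigvals, 2))
print(np.round(rotated, 2))
```

Absolute values are used for the 0.5 cut-off because the sign of a principal component is arbitrary.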
Having identified the principal components of the NASA-TLX tool, simplification of this tool was sought using a data-reduction technique. For this, a weighted composite score was created for each component on which more than one item loaded. Weighting for each item corresponded to its factor loading score. Linear regression was then used to identify the minimum number of items that would allow explanation of ≥80% of the variance (R²) of each weighted composite score.
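One way to read this data-reduction step is as a search for the smallest subset of items whose regression on the weighted composite reaches R² ≥ 0.80. The sketch below follows that reading, assuming NumPy; the exhaustive subset search, the weights and the simulated ratings are assumptions for illustration, because the source does not describe how candidate subsets were chosen.

```python
from itertools import combinations
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X, with intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def smallest_subset(items, weights, names, threshold=0.80):
    """Smallest item subset explaining >= threshold of the weighted composite."""
    composite = items @ weights                 # loading-weighted composite score
    for size in range(1, len(names) + 1):
        best = max(combinations(range(len(names)), size),
                   key=lambda idx: r_squared(items[:, list(idx)], composite))
        r2 = r_squared(items[:, list(best)], composite)
        if r2 >= threshold:
            return [names[i] for i in best], r2

names = ["mental demand", "physical demand", "time demand", "effort", "frustration"]
weights = np.array([0.8, 0.7, 0.6, 0.8, 0.7])   # hypothetical factor loadings

# Simulated ratings: five items sharing one common source of variation.
rng = np.random.default_rng(1)
common = rng.normal(size=200)
items = 50 + 15 * (0.7 * common[:, None] + 0.7 * rng.normal(size=(200, 5)))

chosen, r2 = smallest_subset(items, weights, names)
print(chosen, round(float(r2), 2))
```

Using all items always returns R² = 1, since the composite is an exact linear combination of them, so the search is guaranteed to terminate.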
To compare NASA-TLX scores for colonoscopies performed by practicing gastroenterologists/surgeons and residents, a two-sample t test was used, with Cohen's d as a measure of effect size. A repeated-measures ANOVA was used to evaluate whether there was a change in NASA-TLX ratings over time. For this analysis, the between-subject variable was participant and the within-subject variable was procedure number. STATA version 11.0 (StataCorp, USA) was used for the analyses.

Principal components of the NASA-TLX for endoscopy
In the factor analysis, all 276 surveys for colonoscopy and 128 surveys for EGD were included. The participants completed a mean of 34.5 (range 12 to 54) surveys for colonoscopy and 32 (range 13 to 45) for EGD. The alpha coefficients and KMO statistics for the colonoscopy surveys were 0.78 and 0.78, respectively, while the corresponding results for EGD were 0.90 and 0.82.
For the colonoscopy surveys, two principal components (eigenvalues of 3.0 and 1.0) were identified that explained 66% of the overall variance. Five of the six individual items loaded on the first factor, which was interpreted as 'exertion', while a single item loaded on the second factor ('self-efficacy'). Factor loading scores are shown in Table 1. For the weighted exertion score, no single variable could explain ≥80% of the variance, but the combination of effort and physical demand provided an R² of 0.89. Therefore, the tool was simplified to include three items: effort and physical demand, combined as a weighted score (hereafter referred to as 'exertion'), and performance, as an indicator of 'self-efficacy'. These two measures, exertion and self-efficacy, are based solely on endoscopists' self-assessment (via NASA-TLX) and not on any specific objective measures of achievement (such as cecal intubation, detection rates, etc).
For analysis of the EGD surveys, the same two principal components explained 80% of the overall variance. Loading of individual items paralleled that of the colonoscopy data (Table 1), as did data reduction to simplify the rating of the exertion score. Once again, the combination of effort and physical demand provided the optimal R² (0.96), suggesting that the same simplified survey could be used to evaluate both colonoscopy and EGD procedures.

Comparison of exertion and self-efficacy ratings for practicing gastroenterologists/surgeons and residents
Practicing gastroenterologists/surgeons had significantly lower ratings for task load when performing colonoscopy compared with residents. The mean (± SD) task load rating for practicing gastroenterologists/surgeons was 27.8±15.3 compared with fourth quartile rating for residents of 50.2±20.0 (d=1.26; P<0.0001). Practicing gastroenterologists also had significantly higher self-efficacy ratings when performing colonoscopy (89.2±9.1 compared with fourth quartile rating for residents of 44.8±26.6 [d=2.23; P<0.0001]).
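The reported effect sizes can be checked from the summary statistics alone. The sketch below assumes the pooled SD is the root mean square of the two group SDs; the source does not state which pooling formula was used, and a sample-size-weighted pooled SD would give slightly different values. The function name is illustrative.

```python
from math import sqrt

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with the pooled SD taken as the root mean square of the two SDs."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd

# Practicing gastroenterologists/surgeons versus residents (fourth quartile).
task_load_d = cohens_d(27.8, 15.3, 50.2, 20.0)
self_efficacy_d = cohens_d(89.2, 9.1, 44.8, 26.6)
print(round(task_load_d, 2), round(self_efficacy_d, 2))  # 1.26 2.23
```

Under this assumption, both values agree with the effect sizes reported above.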

Changes in exertion and self-efficacy ratings with training
Because of the wide range of surveys completed by the trainees, procedures were broken down into quartiles (for colonoscopy: first quartile ≤9 procedures, second quartile ≤18 procedures, third quartile ≤29 procedures, fourth quartile ≤54 procedures). For both colonoscopy and EGD, there was a significant reduction in residents' ratings of exertion from the first to the fourth quartile of procedures (P<0.0001 for both). The exertion ratings for the quartiles of procedures are shown in Figure 1. Over the duration of the study, there was a slight increase in perceived self-efficacy (P=0.049) for colonoscopy and a marked increase in self-efficacy for EGD (P<0.0001). These data are also shown in Figure 1.
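The quartile grouping for colonoscopy can be expressed as a simple lookup on the procedure number. The sketch below uses the cut-offs stated above; the function name and the example ratings are hypothetical and are not study data.

```python
from bisect import bisect_left
from statistics import mean

# Upper bounds of the four colonoscopy quartiles, as stated in the text.
CUTOFFS = [9, 18, 29, 54]

def quartile(procedure_number):
    """Return the quartile (1 to 4) that a colonoscopy procedure number falls in."""
    if not 1 <= procedure_number <= CUTOFFS[-1]:
        raise ValueError("procedure number outside the studied range")
    return bisect_left(CUTOFFS, procedure_number) + 1

# Hypothetical (procedure number, exertion rating) pairs for one trainee.
ratings = [(1, 82), (9, 75), (10, 70), (18, 66), (25, 58), (29, 57), (30, 52), (54, 49)]

by_quartile = {}
for number, score in ratings:
    by_quartile.setdefault(quartile(number), []).append(score)

# Mean rating per quartile, the quantity plotted against experience.
means = {q: mean(scores) for q, scores in sorted(by_quartile.items())}
print(means)
```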

DISCUSSION
Our first objective was to create an endoscopy-specific rating tool by identifying the principal components of the NASA-TLX tool when applied to endoscopy training. Our factor analysis (Table 1) revealed two principal components: a measure of exertion (combination of the effort and physical demand items from NASA-TLX) and a measure of self-efficacy (performance item from NASA-TLX). Therefore, our 'Endoscopy Task Load Index' is now a simplified version of the NASA-TLX, with only three items required: effort, physical demand and performance. A comparison of the exertion and self-efficacy scores between these novice endoscopists and the practicing gastroenterologists/surgeons demonstrated evidence for construct validity of this tool.
Our final study objective was to map the evolution of exertion and self-efficacy during initial training procedures of our novice endoscopists. As shown in Figure 1, for both gastroscopy and colonoscopy there was a steady decline in self-perceived exertion over this training time period. It is notable, however, that the mean exertion score of our novices at the end of their first three months (50.2) was still substantially higher than the mean exertion score of our experts (27.8). The self-efficacy scores for gastroscopy increased rapidly over the first few procedures and reached a plateau thereafter. For colonoscopy, there was a progressive increase in reported self-efficacy over the first three quartiles of procedures, followed by a drop in self-efficacy scores over the final quartile. The final mean self-efficacy score for the novices was much lower than that for their expert counterparts.
From these findings, it appears that for gastroscopy, there is a rapid acquisition of basic skills resulting in a corresponding decrease in exertion over the first training procedures, with a rapidly increasing perception of self-efficacy that quickly plateaus at a relatively stable level. This finding may reflect the ease of intubation during gastroscopy as well as the relatively consistent anatomy encountered. For the most part, once intubation of the esophagus is mastered, the manoeuvres to navigate through the esophagus into the stomach and duodenum are consistent. The higher exertion and lower self-efficacy reported in the first few procedures may represent the intubation process, which can be challenging and create a high amount of cognitive and physical demand. Once competence and experience are attained with esophageal intubation, the trainee can relax and dedicate more time to fine-tuning the skills necessary for the remainder of the procedure.
For colonoscopy, a similar pattern of decreasing exertion over the first training procedures emerges, although the decline in exertion scores is more gradual than for gastroscopy. This is perhaps not surprising because colonoscopy is generally believed to be more intrinsically challenging and demanding than gastroscopy. As well, the decline may be more gradual because novice trainees over their first few colonoscopies become more comfortable with several of the 'basic' colonoscopy manoeuvres (eg, movement of dials, torque steering), yet become aware of more advanced and challenging techniques, such as loop resolution and even possibly polypectomy. This awareness of colonoscopy looping, and the need to resolve it, may explain the perceived drop in self-efficacy apparent in the last quartile of study colonoscopies. Learning curve theories demonstrate that learning curves, in general, do not proceed in a smooth, linear fashion, but are characterized by constant fluctuations (14). For colonoscopy training, it is quite possible that trainees spend their first training procedures acquiring the 'basics' of colonoscopy: movement of the dials, tip control and torque steering. During this time, they become increasingly adept and comfortable with these basic techniques, with a concomitant decrease in exertion and increase in self-efficacy. However, at some point (approximately 30 procedures in our study), trainees (with the advice of their trainers) become aware that more advanced techniques are required to navigate consistently and safely to the cecum. Specifically, loop formation and the art of loop reduction become important. The majority of difficulties encountered during a colonoscopy result from lack of progression of the instrument on insertion, commonly as a consequence of looping of the colonoscope within the colon. Progressing safely, effectively and with minimal patient discomfort normally requires straightening of the instrument and resolution of the loop (6). The decreased self-efficacy scores in this period may, in fact, reflect loop reduction: because less of the colonoscope is inserted into the patient, the inexperienced trainee interprets this as a shorter distance reached when, in reality, they are in a better position to proceed.

Figure 1) Change in endoscopy task load (top) and self-efficacy with experience in performing colonoscopy and esophagogastroduodenoscopy (EGD) (with CIs)

There are several practical implications for endoscopy education that can be derived from the present study. First, the study validated a brief Endoscopy Task Load Index that can be completed in <1 min. The Endoscopy Task Load Index is easily applied and can be tracked over an entire training period, as well as applied to therapeutic interventions such as polypectomy, managing gastrointestinal bleeds and endoscopic retrograde cholangiopancreatography. It has the potential to identify periods of higher perceived exertion and facilitate appropriate levels of guidance from the trainers. Early in their training, novices will not be able to handle the additional workload imposed by questioning or verbose directions from the trainer. Verbal instruction of scope position and movement at this stage can be simplified by using a set of 12 direct, simple terms such as: tip up, tip down, tip left, tip right, clockwise torque, anticlockwise torque, insufflate, aspirate, advance/push forward, withdraw/pull back, stop and slowly (15). As the trainee becomes more comfortable and exertion decreases, the amount of direct, hands-on supervision can be gradually decreased, thereby fostering more independence.
Work has been performed on developing formal curricula encompassing both physical and cognitive aspects of procedure teaching for colonoscopy (16,17). This is a concept that is at much further stages of evolution in the surgical realm (18-20). Much of the existing work in endoscopy has focused on technical skills through simulators (21-25) and with the aid of other learning tools (4). The Endoscopy Task Load Index could be a valuable guide to the potential benefit (or harm) of new educational interventions.
While it is clear that skills acquisition improves with experience, debate remains as to the number of procedures required for independent competence (26-28). A tool such as the Endoscopy Task Load Index, followed longitudinally, could provide a better definition of the transition point between the intermediate and the fully automated phases of expertise development. While our study clearly showed a difference between the trainees and the experts, we did not extend it sufficiently to help determine that exact point of transition. Such knowledge may also be helpful in assessing and following expert clinicians in need of more advanced skills training.
There were several important limitations to the present study. First, the sample size was small and the study did not extend beyond the first 50 procedures; hence, other important transition points on the road to colonoscopy competence (200 procedures by some groups [29]), such as polypectomy, cannot be ascertained. In addition, we did not evaluate the trainers' hands-on involvement and degree of supervision, which naturally varies among educators and could have influenced perceived cognitive workload of the trainee. The study used group learning curves, which can yield a misleading picture of what is occurring in individual subjects (30). Furthermore, the study did not include objective parameters, such as cecal intubation, time and detection rates, nor was repeat assessment of practicing gastroenterologists/surgeons performed. Future studies will address the impact of magnetic endoscopic imaging (6) on both novice workload and colonoscopy performance.
Colonoscopy skills training and acquisition is in an exciting state of flux and development. With increasing emphasis on quality assurance measures in endoscopy, formalizing skills training to ensure competence of those providing the service is of paramount importance. The inclusion of objective tools that encompass both the physical and cognitive components of learning will be best suited to identify the optimal methods of teaching.
DISCLOSURES: This work originated from the University of Calgary, Canada, in affiliation with Gloucestershire Hospitals NHS Foundation (United Kingdom).

TABLE 1
Principal components of the National Aeronautics and Space Administration Task Load Index (NASA-TLX) for colonoscopy and esophagogastroduodenoscopy (EGD)
Columns: NASA-TLX item; Colonoscopy, Factor 1 (Exertion); Colonoscopy, Factor 2 (Self-efficacy); EGD, Factor 1 (Exertion); EGD, Factor 2 (Self-efficacy)