TEFTOM: A Promising General Trauma Expectation/Outcome Measure—Results of a Validation Study on Pan-American Ankle and Distal Tibia Trauma Patients

Background. In orthopedics, there is no instrument specifically designed to assess patients' expectations of their final surgery outcome in general trauma populations. We developed the Trauma Expectation Factor Trauma Outcome Measure (TEFTOM) to investigate the fulfilment of patients' expectations one year after surgery as a measure of general trauma surgical outcomes. The aim of this paper was to assess the psychometric characteristics of this new general trauma outcome measure. Methods. The questionnaire was tested in 201 ankle and distal tibia fracture patients scheduled for surgery. Patients were followed up for twelve months. The TEFTOM questionnaire was evaluated for its criterion validity, internal consistency, reproducibility, and responsiveness. Results. TOM showed good criterion validity against the American Academy of Orthopaedic Surgeons Foot and Ankle Scale (Pearson's correlation coefficient = 0.69–0.77). Internal consistency was acceptable for TEF (Cronbach's alpha = 0.65–0.76) and excellent for TOM (Cronbach's alpha = 0.76–0.85). Reproducibility was moderate to very good (intraclass correlation coefficient (ICC) ≥0.67) for TEF and very good (ICC ≥0.92) for TOM. TOM also proved to be responsive to changes in patients' condition over time (Wald test; P < 0.001). Conclusions. TEFTOM is a promising tool for measuring general trauma outcomes in terms of patients' expectation fulfilment that proved to be valid, internally consistent, reproducible, and responsive to change.


Background
Ministries of health and healthcare providers from various countries are shifting their focus from clinical processes to outcomes, that is, concentrating on the quality rather than quantity of healthcare [1][2][3]. How much hospitals get paid for a procedure may soon depend in part on such measurements of outcome [4]. Furthermore, patient-reported outcomes in addition to clinical quality indicators [5] are becoming more popular [6][7][8]. It is therefore essential to have valid and reliable outcome measures tailored to each field of application.
Generic measures have been proposed for chronic disease or injury [9,10] to assess provider performance in improving the patient's condition. However, these measures are inappropriate for evaluating trauma outcomes, as no reliable baseline function measurements are available for trauma patients. While a baseline measurement of condition is neither possible nor helpful, a baseline expectation measurement of condition is achievable. Discovering this information may be the key to a new outcome paradigm that considers patients' expectations in light of their final personal outcome. Mondloch et al. recommended working toward a core set of reliable and valid measures of patients' expectations, bearing in mind that "the best prediction of outcome would be an expectancy measure whose domain of behaviour matches that of the outcome" [11].
In orthopaedics, the relationship between patients' expectations and outcomes has been evaluated in various trauma conditions [12][13][14][15]. This work led to the development of surgery expectation surveys for specific trauma conditions. In the authors' opinion, these instruments can provide a way to learn about patients' perspectives, thus providing the surgeon with a template to guide formal discussion about realistic and unrealistic goals and a prospective record that can be used jointly by the surgeon and the patient postoperatively to assess the surgical outcome.
A systematic review of over 300 musculoskeletal outcome measures currently available for clinical and research purposes [16] revealed that there was no instrument specifically designed to assess patients' expectations for their final surgery outcome in general trauma populations scheduled for surgery.
We developed a patient self-rating instrument, the Trauma Expectation Factor Trauma Outcomes Measure (TEFTOM), as a standardised tool to investigate the fulfilment of patients' expectations as a measure of general trauma surgical outcomes.
The TEFTOM questionnaire consists of two portions: the TEF portion is administered preoperatively to assess patients' expectations following consultation with the orthopaedic surgeon, whereas the TOM portion is administered one year after surgery to assess surgery outcome and quantify the extent to which patients' expectations have been fulfilled. The questionnaire was evaluated in a prospective, multicentre study in ankle and distal tibia fracture patients. The aim of this paper is to describe the assessment of the psychometric characteristics (criterion validity, internal consistency, reproducibility, and responsiveness) of this new and promising general trauma outcome measure.

Development of the TEFTOM Questionnaire.
Our research team of trauma surgeons, led by the primary author (MS) and the senior author (BH), performed a systematic review of over 300 musculoskeletal outcome measures currently available for clinical and research purposes [16]. Based on this screening of the available tools, we considered it essential to develop a generic "core set" tool to assess the role of patient expectations in predicting the final outcome after trauma, which had not yet been considered in the general orthopaedic trauma population; this instrument was required to be clinician and patient friendly as well as parsimonious. This was the basis for developing the TEFTOM tool, which was assembled from adaptations of a well-established "core set" questionnaire developed for spine/lower back pain research that has already been proven to be valid, reliable, and responsive [17]. The ten items of the TEFTOM questionnaire were selected to cover the five essential domains proposed by Deyo et al. [18] and Bombardier et al. [19]: pain, physical function, disability, injury satisfaction, and overall satisfaction.
The generated 10-item TEFTOM tool was pretested on 20 patients with a distal tibia fracture. Patients were asked to provide feedback on format, comprehensibility, and content. No ambiguous, inappropriate, or unclear questions were reported. None of the questionnaires returned had blank items. Patients were also asked to report whether they felt that some important issues were missing. None of them reported any concern. Based on this feedback and review by the study team, the psychometric assessment of the questionnaire was initiated.
The TEFTOM instrument comprises two parts of 10 items each, which address the five domains mentioned above (see Appendices A and B). The most important aspect of this measure is that each question can be easily adapted to assess either expectation or outcome. The ten items are individually scored using a 5-point rating scale (from 0 to 4). An overall score ranging from 0 (lowest expectation/outcome) to 40 (highest expectation/outcome) is calculated by first reversing the point scale for items 1 to 7 and then summing all 10 items into a raw score.
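The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring software: the function name `teftom_total` is hypothetical, reversal is assumed to mean replacing a response x by 4 − x, and incomplete questionnaires are simply rejected because the text does not describe how missing items are handled.

```python
from typing import Sequence

MAX_ITEM_SCORE = 4            # items are rated on a 5-point scale from 0 to 4
REVERSED_ITEMS = range(1, 8)  # items 1 to 7 (1-based) are reversed before summing


def teftom_total(responses: Sequence[int]) -> int:
    """Return the raw TEF or TOM score (0 to 40) for one questionnaire.

    Assumption: "reversing" an item means mapping x to 4 - x.
    """
    if len(responses) != 10:
        raise ValueError("expected exactly 10 item responses")
    total = 0
    for item_number, value in enumerate(responses, start=1):
        if not 0 <= value <= MAX_ITEM_SCORE:
            raise ValueError(f"item {item_number} is outside the 0-4 range")
        if item_number in REVERSED_ITEMS:
            value = MAX_ITEM_SCORE - value  # reverse the point scale
        total += value
    return total
```

Under these assumptions, a questionnaire answered with 0 on items 1 to 7 and 4 on items 8 to 10 yields the maximum score of 40, and the opposite pattern yields 0.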
The tool was developed to be self-administered. The reference period for expectations and outcomes was set to one year after surgery. Response options were chosen as ordinal rating scales to offer a clear distinction between choices and were designed so that each item would have the same weight.
While these questions were developed in English, they were deemed suitable for meaningful translation into Portuguese by clinicians in Brazil for one of our international study centres.

TEFTOM Validation Study.
A prospective multicentre study was conducted to evaluate the performance of the TEFTOM questionnaire. Although the target population of the TEFTOM tool is general trauma patients, we performed our validation study in patients with ankle and distal tibia fracture scheduled for surgery.
The main inclusion criterion was an isolated ankle or distal tibia fracture in patients aged 18 years or older who had provided written informed consent to participate in the study. The operative procedure was required to be performed within four weeks after the injury.
Exclusion criteria included previous internal fixation surgery of the injured ankle and medical conditions that affect bone union, such as metastatic cancer and metabolic bone disease. Polytraumatised patients were also excluded. Other excluded patients were those with severe dementia or other severe mental health problems that would preclude them from completing study questionnaires, those who knew that they would be unable to attend all scheduled study-related followup visits, those participating in other clinical trials of a drug or device, and prisoners.
Five participating clinics in Brazil, Canada, and the United States (USA) were involved. Institutional review board approval was obtained at each centre, and informed consent was obtained from each participant included in the study.

Followup Examinations.
Although the questionnaire is designed to be self-administered before surgery and then one year after surgery, for the questionnaire validation study patients were actively followed up at several time points during one year. Patients underwent a physical assessment and personal interview by the treating surgeon to complete the TEFTOM questionnaire as follows: the TEF portion was administered before and immediately after surgery, and then at two weeks, six weeks, three months, and six months after surgery, while the TOM portion was administered at three, six, and twelve months after surgery. Additional retesting of the TEF and TOM portions was performed at the same time points in those patients who consented to participate in a test-retest reliability study. In addition, patients completed the American Academy of Orthopaedic Surgeons Foot and Ankle Scale (AAOS), the Foot and Ankle Outcome Score (FAOS), and the Short Form-36 Health Survey (SF-36) questionnaires at the same time points scheduled for the TOM portion of the questionnaire.

Instruments.
AAOS comprises 20 questions that cover four subscales targeting pain, function, stiffness and swelling, and giving way [20] and was used to assess criterion validity of the TOM portion of the questionnaire. FAOS is a disease-specific measure for ankle instability consisting of 42 items, which measure five subscales: pain, other symptoms, activities of daily living, sports and recreation, and quality of life [21]. SF-36 is a comprehensive questionnaire on general patient health and well-being [22]. All three instruments were self-administered. For use at the Brazilian site, the instruments were translated into Portuguese.

Patient Demographics and Characteristics.
A sample size of 200 patients was planned for this investigation. A total of 204 patients were recruited between October 2006 and January 2009; the entire study period spanned from when the first patient was recruited in October 2006 to the last patient's 1-year followup visit in February 2010. Two patients declined to participate in the study before the surgery was performed, and another patient who had surgery 45 days after the trauma event was excluded from the study. Four additional patients did not satisfy the isolated ankle or distal tibia fracture inclusion criterion but received a waiver from the orthopaedic surgeon who rated their secondary fractures as unlikely to interfere with their perception of the ankle/distal tibia fracture. Therefore, a total of 201 patients were included in the data analysis. Patient baseline sociodemographic characteristics are presented in Table 1.
Included in the study were 181 ankle fractures (90%; AO Foundation and Orthopaedic Trauma Association (AO/OTA) Type 44) and 20 distal tibia fractures (10%; AO/OTA Type 43) (Figure 1) [23]. Seven fractures (3.5%) One hundred and thirty-four (67%) fractures were treated by chief surgeons who had previous experience with over 30 procedures using the same surgical technique. In 194 patients (97%), the surgeons were satisfied with the final immediate postoperative outcome. In 7 patients (3%) the surgeons were not satisfied with the final surgical outcome: in 6 patients surgeons reported failure to achieve anatomic reduction, and in 1 patient surgeons reported severe comminution of the medial malleolus.

Statistical Analysis.
Patients recruited in the study were considered for the analysis up to the point of their last study visit. Patient recruitment and followup visit records obtained within the following time windows were analysed: 2 weeks before surgery, the immediate postoperative period within 5 days after surgery, 2 weeks ±7 days, 6 weeks ±14 days, 3 months ±30 days, 6 months ±45 days, and the final period of 10 to 20 months, based on the assumption that the 2-year outcome would resemble what was achieved at 1 year. Statistical analyses were performed using Intercooled Stata Version 11 statistical software (Stata Corp, College Station, TX).
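As an illustration of how visit records might be assigned to the analysis windows listed above, the following Python sketch maps the number of days since surgery to a window label. It is not the study's data-management code. The names `VISIT_WINDOWS` and `assign_window` are hypothetical, a month is assumed to be 30 days, the preoperative window is assumed to cover days −14 to −1, and the day of surgery is assumed to belong to the immediate postoperative window.

```python
from typing import Optional

DAYS_PER_MONTH = 30  # assumption: the text does not define month length in days

# (label, first day, last day), both inclusive, counted from the day of surgery
VISIT_WINDOWS = [
    ("preoperative", -14, -1),                      # 2 weeks before surgery
    ("immediate postoperative", 0, 5),              # within 5 days after surgery
    ("2 weeks", 14 - 7, 14 + 7),                    # 2 weeks +/- 7 days
    ("6 weeks", 42 - 14, 42 + 14),                  # 6 weeks +/- 14 days
    ("3 months", 3 * DAYS_PER_MONTH - 30, 3 * DAYS_PER_MONTH + 30),
    ("6 months", 6 * DAYS_PER_MONTH - 45, 6 * DAYS_PER_MONTH + 45),
    ("1 year", 10 * DAYS_PER_MONTH, 20 * DAYS_PER_MONTH),  # 10 to 20 months
]


def assign_window(days_from_surgery: int) -> Optional[str]:
    """Return the window label for a visit, or None if it falls outside all windows."""
    for label, first_day, last_day in VISIT_WINDOWS:
        if first_day <= days_from_surgery <= last_day:
            return label
    return None
```

With these assumed boundaries the windows do not overlap, so each visit maps to at most one time point, and visits falling in the gaps (for example, day 6) are excluded from the analysis.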
In order to estimate averages for the scores of interest at each time point we used mixed-effects linear regression with random patient effects to account for repeated (longitudinal) measurements on the same patient. Likelihood ratio tests were performed to verify trends over time.
The TEFTOM questionnaire was evaluated for its criterion validity, internal consistency, reproducibility, and responsiveness.

Criterion Validity.
The criterion validity of the TOM element of the questionnaire was assessed by means of Pearson's correlation coefficient against the AAOS, which is a gold standard measure of the condition of interest. A coefficient between 0.8 and 1 indicates "very good or strong" correlation. Correlations of the TOM score with FAOS and SF-36 were also investigated.
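For readers who wish to reproduce this kind of analysis, Pearson's correlation coefficient between paired TOM and AAOS scores can be computed as in the minimal Python sketch below. The function name `pearson_r` is hypothetical, and the sketch uses the textbook formula rather than the Stata routine used in the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)
```

Each pair would hold the TOM and AAOS scores of the same patient at the same followup visit; patients missing either score would have to be excluded before the call.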

Internal Consistency.
Internal consistency of the TEFTOM was assessed using Cronbach's alpha, which is calculated from the pairwise correlations between the questionnaire items. This measure ranges from zero to one, where the following benchmarks were considered: 0.6-0.7 acceptable consistency, 0.7-0.8 satisfactory consistency, and 0.8 or higher very good consistency [25]. Values of 0.95 or higher are not necessarily desirable and most often indicate the redundancy of the items. Therefore, the goal of designing a reliable instrument in terms of internal consistency is to include similar items that are related (i.e., internally consistent), yet individually provide some unique information.
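A minimal Python sketch of Cronbach's alpha is given below for illustration. It uses the common variance-based form, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score); the text does not state whether this form or the standardised, correlation-based variant was applied, so the choice here is an assumption, as is the function name `cronbach_alpha`.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Variance-based Cronbach's alpha.

    rows: one entry per patient, each holding that patient's k item scores.
    """
    if len(rows) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same k >= 2 items")
    # Sample variance of each item across respondents
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed score across respondents
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

For the TEFTOM, each row would contain the ten item scores of one patient at one time point, after reversal of items 1 to 7.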

Test-Retest Reliability (Reproducibility).
The reproducibility of the TEFTOM questionnaire was assessed by means of calculating the intraclass correlation coefficient (ICC) at different time points. ICC measures the agreement between scores obtained by the same subject separated by a short period of time. An ICC of 1 indicates absolute agreement and is obtained when every patient scores exactly the same when the tool is readministered on a second occasion.
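The following Python sketch illustrates one way to compute an ICC from test-retest data. The ICC model used in the study is not specified in the text; the sketch assumes the one-way random-effects, single-measurement form, ICC = (MSB − MSW) / (MSB + (k − 1) × MSW), where MSB and MSW are the between-subject and within-subject mean squares and k is the number of administrations per patient. The function name `icc_oneway` is hypothetical.

```python
from typing import Sequence


def icc_oneway(scores: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC for single measurements.

    scores: one entry per patient, each holding that patient's k repeated scores
    (for test-retest data, k = 2).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least 2 patients")
    k = len(scores[0])
    if k < 2 or any(len(s) != k for s in scores):
        raise ValueError("every patient needs the same k >= 2 scores")
    grand_mean = sum(sum(s) for s in scores) / (n * k)
    subject_means = [sum(s) / k for s in scores]
    # Between-subject and within-subject sums of squares
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum((v - m) ** 2 for s, m in zip(scores, subject_means) for v in s)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator
```

Consistent with the definition above, identical test and retest scores for every patient give a within-subject mean square of zero and therefore an ICC of 1.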

Responsiveness (Sensitivity to Change).
A multilevel mixed-effects linear regression was also used to test responsiveness of the TOM score. If the TOM portion was responsive to change, a significant increase in the measured outcome would be observed between the 3- and 6-/12-month followup evaluations based on the Wald test.

Results
The patient recruitment and followup flow chart is provided in Figure 2. Followup rates were consistently at or above 74% (148/201).
Average patient expectations as measured with the TEF portion of the questionnaire ranged from 33.9 to 35.3 points over the 6-month period (Table 2), yet patients were not consistent in reporting their expectations over time (likelihood ratio test; P < 0.001).
All outcome measures improved over time between the 3-month and 1-year evaluations (Table 2). The 1-year mean TOM score was less prone to a ceiling effect (i.e., reaching the uppermost end of the scale) than the AAOS and FAOS "activities of daily living" dimension. In fact, the average 1-year TOM score lay 20% below the upper limit of the scale, whilst the mean AAOS score was 11% below the upper limit of the AAOS scale. While 10% of the patients had reached the maximum score of 40 on the TOM scale at 1-year followup, 14% of the patients had an AAOS score of 100 at 1 year.
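The two ceiling-effect indicators reported here can be expressed as a short Python sketch. It assumes that "below the upper limit" means the gap between the scale maximum and the mean score, divided by the scale maximum; the function names are hypothetical.

```python
from typing import Sequence


def ceiling_fraction(scores: Sequence[float], max_score: float) -> float:
    """Share of patients who reached the maximum score of the scale."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(1 for s in scores if s >= max_score) / len(scores)


def relative_gap_to_ceiling(scores: Sequence[float], max_score: float) -> float:
    """Distance of the mean score from the scale maximum, as a fraction of the maximum."""
    if not scores:
        raise ValueError("need at least one score")
    mean_score = sum(scores) / len(scores)
    return (max_score - mean_score) / max_score
```

Under this reading, a mean TOM score of 32 on the 40-point scale corresponds to a gap of 20%.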

Criterion Validity.
The TOM questionnaire showed good criterion validity with the AAOS (Figure 3). Pearson's correlation coefficients ranged from 0.69 to 0.77 between the 3-month and 1-year testing period (Table 3). Correlation coefficients for validation against the five FAOS dimensions and the SF-36 Physical Component Summary were all equal to or lower than 0.7. In addition, correlations between the TOM portion and the unweighted average of the three SF-36 physical subscales ranged from 0.66 to 0.73 between the 3-month and 1-year testing period.

Internal Consistency.
Internal consistency of the TEF portion proved to be acceptable, with Cronbach's alpha ranging from 0.65 to 0.76 over the 6-month testing period (Table 4). Internal consistency of the TOM tool proved to be excellent, with Cronbach's alpha ranging from 0.76 to 0.85 (Table 5).

Test-Retest Reliability (Reproducibility).
A total of 80 patients consented to participate in the reliability assessment of the TEF tool. These patients were contacted by phone on average one day after their clinical examination, that is, one day after the day on which the form was completed (range: 0-14 days; median: 0). At all postoperative time points, TEF reliability was moderate to very good, with ICCs between 0.67 and 0.94 (Table 4). Sixty-two patients participated in the TOM reliability assessment. As for the TOM questionnaire, patients were contacted by phone on average one day after their clinical examination (range: 0-14 days; median: 0). The ICCs ranged between 0.92 at three months and 0.96 at twelve months (Table 5) indicating very good reproducibility of the TOM tool.

Responsiveness (Sensitivity to Change).
According to the responsiveness evaluation, the TOM score significantly increased by an average of 4 points from the 3- to 6-month evaluation (Wald test; P < 0.001) and by 6 points from the 3- to 12-month evaluation (Wald test; P < 0.001) (Table 2).

Discussion
Recent trends in clinical trial research have placed an increasing focus on understanding health outcomes from the patient's perspective [6]. This has led to the recent implementation of new regulations by the United Kingdom National Health Service and definitive guidelines on patient-reported outcomes set by the US Food and Drug Administration [4,26]. The use of standardised outcome measures based on detailed information from the patient is now an important factor when considering the primary treatment objectives of every new clinical trial. Our development of the TEFTOM questionnaire highlights this increasing requirement for outcome measures focusing on patient-rated assessments. Furthermore, the TEFTOM instrument can specifically be used to evaluate patient-rated expectations; this is of particular importance when considering that the average trauma patient, as opposed to those with chronic disease, cannot adequately provide a baseline score. Reliable indicators of healthcare quality are important to accurately measure performance and promote improvements in services [27]. In the context of trauma surgery, pretraumatic conditions are, in most cases, too high a benchmark for success. Previous research in this field has indicated that informed patient expectations for their surgery outcome may represent a valid means of assessing the quality of a surgical procedure following a fracture [11]. It was in this spirit that we developed the TEFTOM questionnaire. We believe that the TEF score, obtained after the orthopaedic surgeon has informed the patient of their individual condition, chances of recovery, and possible consequences of surgery, can reliably quantify expectations on outcome after surgery and thus produce an individual summary expectation factor to be used as a reference to evaluate the recovery process as well as the final outcome. With the TOM score obtained after surgery, a patient- and condition-specific indicator of the ability to fulfil those expectations can then be produced.
Being able to provide an individual measure of expectation fulfilment is the striking advantage of the TEFTOM questionnaire. Most outcome assessment tools base judgment solely on the observation of general average trends. The outcome instrument TOM had good criterion validity against the AAOS, a tool that aims at measuring a similar construct. We found a lower correlation with the FAOS tool; this may be explained by the fact that TOM summarises five dimensions in a single overall score, whereas FAOS considers five dimensions individually. The moderate correlations (i.e., 0.5 to <0.7) between the TOM instrument and the Physical Component Summary of the SF-36 throughout the 1-year testing period may be due to the fact that the dimensions covered by the TOM instrument are not equivalent to those covered by the general SF-36 questionnaire. Moreover, the aspect of injury satisfaction is a specific dimension of the TOM tool that is not available in the SF-36 questionnaire. The correlation between AAOS and SF-36 has been reported to be 0.65 [13] and is based on the unweighted mean of the three SF-36 physical subscales. This is similar to the correlation between TOM and the mean of the three SF-36 subscales measured in our study population. An additional advantage of the TOM tool over the AAOS is that missing item imputation is rarely required for the TOM tool. For instance, many patients could already provide complete answers to the TOM tool at the 3-month examination, while many patients who did not recover completely could not answer the complete set of questions of the AAOS tool. Moreover, a ceiling effect was observed with the AAOS, whereas the TOM tool was less inclined to be affected by an upper limit. These positive characteristics of the TOM tool indicate that it might be better at evaluating the healing process than the AAOS.
The limitations of this study include the focus of testing the TEFTOM questionnaire on trauma patients with isolated ankle and distal tibia fractures and using a Pan-American population. However, to obtain first-hand experience with this new measure, we decided to target a specific trauma condition. We believe TEFTOM is a promising general outcome measure that could also be adapted for use with nonoperative treatments. This cohort of patients, however, was not studied and should be the subject of future investigations. Cross-cultural adaptation testing is currently underway for TEFTOM to be used at an international level.

Conclusions
On the basis of this first validation study, TEFTOM proved to be valid, internally consistent, reproducible, and responsive to change in assessing the condition of ankle and distal tibia fracture patients after surgery. We believe that TEFTOM is a promising tool for measuring general trauma outcomes and performance. As an indicator of patient expectation fulfilment, this new measure might have powerful implications for the assessment of healthcare quality within the field of traumatology.