Data equivalency of an interactive voice response system for home assessment of back pain and function

1Liberty Mutual Research Institute for Safety, Hopkinton; 2Department of Family Medicine and Community Health, University of Massachusetts Medical School, Worcester, Massachusetts, USA
Correspondence and reprints: Dr William Shaw, Liberty Mutual Research Institute for Safety, 71 Frankland Road, Hopkinton, Massachusetts 01748, USA. Telephone 508-497-0253, fax 508-435-8136, e-mail william.shaw@libertymutual.com
WS Shaw, SK Verma. Data equivalency of an interactive voice response system for home assessment of back pain and function. Pain Res Manage 2007;12(1):23-30.

Periodic monitoring of symptoms by telephone or postal questionnaire has been a common method of outcome assessment in studies of pain treatment and rehabilitation. Technological advances have provided new methods for collecting survey data, including electronic questionnaires (1), hand-held personal digital assistants (2), internet-based questionnaires (3), computer-assisted telephone interviewing (4) and interactive voice response (IVR) systems (5). While previous studies have established the validity of pain assessment using postal questionnaires (6), electronic questionnaires (1) and live telephone interviewing (7), the equivalency of IVR has not been studied.
IVR systems collect survey data by telephone using automated, interactive scripts and push-button or recorded responses. Calls can be respondent-initiated or system-initiated, and respondents are led through an interactive menu that provides instructions, poses a set of standardized, prerecorded questions, and specifies response choices. IVR eliminates the human resources necessary to reach respondents at home, conduct telephone interviews and enter data. For respondent-initiated calls, the IVR system can be accessed at any time of day or night, allowing access to hard-to-reach groups (8). IVR is also well-suited for assessing health information that is potentially stigmatizing, for example, alcohol consumption (5,9) or medication compliance (10), because it may reduce embarrassment or other unintended observer effects. It has been suggested that IVR systems could be designed for clinical use to monitor symptoms without requiring office visits, to access and update patient information, to facilitate shared decision-making and to provide self-management instructions (11,12).
One potential application of IVR is home monitoring of early recovery from acute back pain. Although the majority of back pain episodes resolve within one month and require minimal intervention, a subset of cases with a seemingly benign clinical presentation at intake can progress to chronic or recurrent pain (13,14). Thus, one recommended strategy for improving back pain outcomes has been the early identification of patients at greatest risk for persistent pain and disability (15). A growing number of prospective cohort studies (16,17) have identified disability risk factors such as working conditions, pain intensity, pain beliefs and expectations for recovery during the critical subacute stage of back pain. During this stage (one to six months postinjury), IVR systems may be useful in assessing pain symptoms and emerging risk factors for chronic pain.
In the present study, patients presenting to occupational health clinics with a recent onset of work-related acute back pain were surveyed and then assessed one month later by telephone for outcomes of pain, function, treatment and work status. Although the comparison of assessment methods was not an original focus of the study, the introduction of an IVR system to facilitate automated data collection partway through the study provided an opportunity to compare data collection methods. Criteria for equivalency of IVR to live telephone interviewing were that IVR responders should have identical demographic and injury characteristics, and identical outcome results, to non-IVR responders.

METHOD

Participants
Participants recruited into the study were 608 working adults (198 women, 410 men) seeking treatment for work-related back pain at eight occupational health clinics in the northeastern United States between September 2000 and October 2002. All participants received evaluation and treatment from an occupational medicine provider in accordance with standard practice guidelines (18), which, in most cases, recommend acute pain relief, conservative care, reassurance and encouragement to resume normal activity as soon as possible.
Inclusion criteria were: nonspecific sacral, lumbar or thoracic back pain; acute onset or exacerbation; pain duration 14 days or less; participants filed a workers' compensation claim; 18 years of age or older; and fluency in English. Patients with thoracic pain were not excluded from the study because of difficulty in discriminating between thoracic and lumbar pain in a self-report questionnaire before medical evaluation. Prior work with workers' compensation claimants suggests that noncervical cases of back pain can be categorized as 75% lumbar, 12% thoracic and 13% unspecified (19).
Participants were grouped into three principal categories for analysis: IVR responders, who chose to use an IVR option at one-month follow-up; telephone responders, who completed the follow-up by live telephone interview after failing to use the IVR option within five days; and pre-IVR telephone responders, who completed the one-month follow-up by live telephone interview before the IVR option was made available (Figure 1). Demographic characteristics of the three groups are shown in Table 1. Ages of participants ranged from 18 to 80 years (mean ± SD 36.1±11.0 years), with 90% younger than 50 years of age (two participants older than 67 years of age were working part-time but still covered under workers' compensation benefits). The majority of the patient sample could be characterized as young to middle-aged workers with a high school education and low to moderate income who were working for medium- to large-sized employers. Median job tenure was two years, and occupations were mostly blue-collar trades and skilled service providers (Table 2). As shown in the table, the most frequent occupational categories and injury types compared favourably with national statistics on work-related back pain (20).

Procedures
Eligible patients were identified by front desk staff or clinicians during an initial evaluation for a recent onset of back pain. Details of the research study were described, and a consent form was provided to review and sign. The consent form described confidentiality of questionnaires, assured participants that no questionnaire responses would be placed in medical records or shared with employers, and gave notice of a US$25 incentive for participation. After any questions or concerns were addressed, patients were provided a self-report questionnaire to report demographic background, circumstances of injury, current level of pain and potential disability risk factors. Participants returned the completed form to the reception desk before leaving the clinic.
A follow-up period of one month was chosen to distinguish acute cases of pain (resolved within one month) from subacute (lasting greater than one month) because return-to-work rates are dramatically reduced beyond one month of work absence (21). Twenty-eight days after their initial visit (28 to 42 days since pain onset), participants were mailed a postcard that specified a toll-free telephone number and personal identifier for accessing the IVR system. The computerized data collection system allowed participants to call at any time and enter data by push-button responses to recorded questions for tracking improvements in pain, function and ability to work (assessment of disability risk factors was not repeated to maintain an interview duration of less than 20 min). Participants not responding within five days after the mailing were called by a trained interviewer, who conducted the follow-up assessment in a live telephone interview instead. Before activating the IVR system in October 2001, all participants were assessed by live telephone interview only.

22), an 18-item self-report measure designed to assess risk factors for delayed pain recovery and return to work. Questions refer to physical health risks, workplace factors, pain, mood and expectations for recovery, and responses include a variety of Likert rating scales. In combination with demographic variables, the BDRQ has a screening sensitivity of 74.3% and specificity of 70.1% to predict which patients with acute back pain will experience delayed recovery beyond one month (22). Three factor scores from the BDRQ (pain, emotional distress and physical job concerns) have internal consistencies (alpha) of 0.68, 0.70 and 0.60, respectively. Sample items from the BDRQ include: "How worried are you that future physical activity may increase your back pain or result in re-injury?" and "Do you think that you will be able to do your regular job without any restrictions 4 weeks from now?"
Pain: Participants reported their current level of back pain on an 11-point scale from '0' (no pain at all) to '10' (worst pain possible)

Although the recurrent nature of back pain has led to some controversy about optimal methods for assessing return to work over longer follow-up periods, self-reported work status is a reasonably accurate measure during the first several weeks after pain onset (29).
Treatment helpfulness: A nine-item measure was adapted from the Treatment Helpfulness Questionnaire developed by Chapman et al (30) for assessing patient perceptions of treatment modalities offered in multidisciplinary pain centres. Participants rated the effectiveness of up to nine possible medical and self-management treatments for their back pain symptoms (eg, physical therapy, prescription medications, ice pack) on a scale from '1' (extremely harmful) to '5' (extremely helpful). A total score was based on the mean for all applicable treatments. Test-retest reliability of the measure is 0.88 (30).
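The scoring rule for treatment helpfulness (the mean rating over applicable treatments only) can be illustrated with a minimal sketch. The function name, the use of `None` to mark a treatment that was not received, and the example ratings are assumptions made for illustration; they are not taken from the study's data files.

```python
from statistics import mean
from typing import Optional, Sequence


def helpfulness_total(ratings: Sequence[Optional[int]]) -> Optional[float]:
    """Mean rating across the treatments a participant actually rated.

    Assumed convention: None marks a treatment that was not applicable.
    Ratings run from 1 (extremely harmful) to 5 (extremely helpful).
    """
    applicable = [r for r in ratings if r is not None]
    for r in applicable:
        if not 1 <= r <= 5:
            raise ValueError(f"rating out of range 1-5: {r}")
    # With no applicable treatments there is nothing to average.
    return mean(applicable) if applicable else None


# Hypothetical participant who rated four of the nine treatments.
example = [4, 5, None, 3, None, None, 4, None, None]
print(helpfulness_total(example))
```

Averaging over applicable items, rather than summing, keeps scores comparable between participants who received different numbers of treatments.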

IVR application
In October 2001, an IVR system was introduced to provide push-button telephone responses for the outcome measures described above. The application was developed using the Teleflow computer software package (Engenic Corporation, Canada). After calling a toll-free telephone number and entering a unique identifier, callers were provided basic instructions from a prerecorded script and then routed through each of the assessment domains (a total of 32 to 40 questions). Full lists of possible response categories were repeated after each question; however, callers could enter a response at any time and advance to the next question.
In some cases, skips were inserted to omit questions that may be invalidated by prior responses; for example, those still out of work were not asked whether they were working at a light duty job. Callers had the option to return to a previous question if a response was entered erroneously. Also, assessments could be partially completed, then resumed in a subsequent call. Completion of the full IVR assessment required 12 min to 22 min (depending on whether respondents listened to all available response options before responding), and data were automatically time- and date-stamped and entered into a single spreadsheet containing follow-up results.
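The skip and go-back behaviour described above can be sketched in a few lines. This is not the Teleflow application; the question keys, prompts, key codes and the use of '*' to step back are all hypothetical, chosen only to show how a skip rule and a return-to-previous-question option might interact.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Answers = Dict[str, str]


def always(answers: Answers) -> bool:
    """Default rule: ask the question regardless of earlier answers."""
    return True


@dataclass(frozen=True)
class Question:
    key: str
    prompt: str
    ask_if: Callable[[Answers], bool] = always


# Hypothetical three-question script with one skip rule.
QUESTIONS: List[Question] = [
    Question("work_status", "Working? 1=no, 2=modified duty, 3=full duty"),
    Question("light_duty", "Is the job light duty? 1=yes, 2=no",
             ask_if=lambda a: a.get("work_status") != "1"),
    Question("pain", "Rate your pain from 0 to 10"),
]


def run_survey(keypresses: List[str]) -> Answers:
    """Replay key presses; '*' returns to the previously asked question."""
    answers: Answers = {}
    asked: List[int] = []  # indices of questions actually asked, in order
    presses = iter(keypresses)
    i = 0
    while i < len(QUESTIONS):
        question = QUESTIONS[i]
        if not question.ask_if(answers):
            answers.pop(question.key, None)  # drop any stale answer
            i += 1
            continue
        key = next(presses, None)
        if key is None:  # caller hung up; keep the partial record
            break
        if key == "*":
            if asked:
                i = asked.pop()
            continue
        answers[question.key] = key
        asked.append(i)
        i += 1
    return answers


print(run_survey(["1", "7"]))
print(run_survey(["1", "*", "2", "1", "5"]))
```

Returning the partial record when input runs out mirrors the option to resume an assessment in a later call, although resumption itself is not modelled here.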

Data analysis
To evaluate the possibility of an IVR self-selection bias, the three groups were compared on demographic variables, symptom characteristics and disability risk factors at intake (one-way ANOVA for continuous variables, χ² for categorical variables). To test whether IVR assessment was equivalent to that of live telephone interviewing, the three groups were compared on follow-up measures of pain, functional limitation, treatment helpfulness and return to work. Reliability and validity of IVR assessment were determined by comparing the internal consistency and correlations among outcome measures. All variables subjected to analyses of variance had sufficiently normal distributions to conduct parametric tests without data transformations.
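For readers unfamiliar with the one-way ANOVA used for the continuous intake variables, the following is a minimal sketch of how the F statistic and its degrees of freedom are formed. The toy scores are invented for illustration and are not study data; obtaining a P value would additionally require the F distribution, which is left out here.

```python
from statistics import mean
from typing import Sequence, Tuple


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented scores for three small groups (illustration only).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(f_stat, df_b, df_w)
```

With three groups, the between-groups degrees of freedom are always 2, and the within-groups degrees of freedom equal the number of respondents minus 3.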

RESULTS

Feasibility
Before implementing IVR as an assessment option, 240 patients were recruited into the study and assessed at their initial clinic visit for acute back pain. Of this pre-IVR group, 227 could be reached by a live telephone interviewer for the one-month follow-up assessment (a 94.6% retention rate).
After the IVR option was implemented, an additional 368 patients were recruited into the study. Of these, 131 (35.6%) used the IVR option to complete the one-month follow-up assessment, and the remaining 189 (51.4%) were contacted by a live telephone interviewer after no IVR attempt was made within five days. Forty-eight participants could not be reached by telephone (an 87.0% retention rate). The mean duration between intake and follow-up was 29.2±6.6 days for those who took advantage of the IVR option, and 35.4±7.7 days for those who were interviewed by telephone after failing to access the IVR system. Thus, most IVR responders called on the day they received the postcard invitation. Before introducing the IVR system, the mean follow-up time was 30.5±7.8 days.
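The retention and utilization percentages reported above follow directly from the recruitment counts. The short sketch below recomputes them; the variable names are ours, and the counts are those given in the text.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts as reported in the text.
PRE_IVR_RECRUITED = 240
PRE_IVR_REACHED = 227
POST_IVR_RECRUITED = 368
IVR_RESPONDERS = 131
TELEPHONE_RESPONDERS = 189

not_reached = POST_IVR_RECRUITED - IVR_RESPONDERS - TELEPHONE_RESPONDERS
pre_ivr_retention = pct(PRE_IVR_REACHED, PRE_IVR_RECRUITED)
ivr_use = pct(IVR_RESPONDERS, POST_IVR_RECRUITED)
telephone_use = pct(TELEPHONE_RESPONDERS, POST_IVR_RECRUITED)
post_ivr_retention = pct(IVR_RESPONDERS + TELEPHONE_RESPONDERS,
                         POST_IVR_RECRUITED)

print(not_reached, pre_ivr_retention, ivr_use, telephone_use,
      post_ivr_retention)
```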
Seven participants reported technical problems or difficulties understanding IVR instructions and were unable to complete the IVR survey. A follow-up telephone call from research staff was necessary to complete their surveys. The majority of IVR respondents (60%) called into the system during daytime hours (08:00 to 17:00), followed by 34% during evening hours (17:00 to 22:00), 5% during nighttime hours (22:00 to 06:00), and 1% during early morning hours (06:00 to 08:00).

Self-selection IVR bias
Demographic comparisons of the three participant groupings are shown in Table 1. There were no statistically significant group differences in demographic characteristics of age, sex, education, marital status or job tenure (P>0.05). However, IVR use was greater among lower income participants, χ²=30.43 (degrees of freedom=14), P=0.007. Sixty-one per cent of those reporting an annual income of less than $15,000 used the IVR option, versus 39.9% IVR use among others.
Health-related questions from the BDRQ are shown in Table 3. These questions were completed at the initial medical evaluation for back pain. There were no statistically significant differences between IVR and telephone respondents by injury type, previous back pain, pain avoidance beliefs, expectations for return to work, exercise habits, health rating, body mass index or mood (P>0.05). There was, however, a group difference in pain ratings whereby those reporting more pain at intake were more likely to take advantage of the IVR option one month later, F(2,544)=3.11, P=0.04. Due to a small negative correlation between income and pain rating (r=-0.13, P=0.002), the relationship between pain and IVR use was no longer statistically significant after controlling for income.

Equivalency of IVR
Outcomes of return to work, pain, functional limitation and treatment helpfulness are summarized according to assessment method in Table 4. At the one-month follow-up, the largest group of participants (47.9%) had resumed their regular job assignment, although 28% of these individuals believed they were accomplishing less at work because of back pain (an item in the RDQ). One hundred ninety participants (35%) were working modified or restricted duties because of back pain, and 95 (17%) were still out of work. There were no statistically significant differences in one-month work status between IVR and live telephone interview respondents (P>0.05).
In terms of health outcomes, IVR respondents reported more functional limitation and less helpful treatments, but no differences in pain ratings. Because there were significant group differences in the number of days before follow-up data were obtained, the analyses were repeated in an analysis of covariance controlling for the actual number of days between intake and follow-up. The adjusted means are shown in Table 4. After controlling for differences in timing of assessments, the group differences in functional limitation and treatment helpfulness were no longer statistically significant (P=0.08 and P=0.06, respectively).

Reliability and validity
To provide an estimate of the reliability of IVR versus live telephone interviewing, the internal consistency (alpha) of the 16-item RDQ was compared between groups. Among IVR respondents (n=131), the internal consistency (alpha) of the scale was 0.72 (standardized item alpha=0.81). Among all live telephone respondents (n=416), the internal consistency (alpha) of the scale was 0.73 (standardized item alpha=0.80).
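Internal consistency (Cronbach's alpha) for a yes/no scale such as the RDQ can be computed from item-level responses as sketched below. The response matrix is a toy example with three items and four respondents, invented for illustration; it is not study data, and the function name is ours.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[int]]) -> float:
    """Cronbach's alpha; each row holds one respondent's item scores."""
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item across respondents.
    item_variances = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the total score (sum of items) across respondents.
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: 4 respondents x 3 yes/no items (1 = limited, 0 = not limited).
toy_rows = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
print(cronbach_alpha(toy_rows))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why similar values in the two response groups suggest the scale behaved similarly under both assessment methods.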
To provide an estimate of the validity of IVR versus live telephone interviewing, the within-group correlations between pain and functional limitation were compared. A test of group interactions showed no statistically significant differences in the association between pain and function in the two groups (F[1,543]=0.72, P=0.40). Among IVR respondents (n=131), the Pearson correlation was 0.61. Among all live telephone respondents (n=416), the Pearson correlation was 0.57.
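As a complementary check on the two correlations, a Fisher r-to-z comparison of independent correlations can be sketched as follows. This is a different technique from the group interaction test reported above, so it need not reproduce the reported F or P values; it is shown only as one common way to compare two Pearson correlations. The function name is ours, and the inputs are the correlations and group sizes given in the text.

```python
from math import atanh, sqrt
from statistics import NormalDist
from typing import Tuple


def compare_correlations(r1: float, n1: int,
                         r2: float, n2: int) -> Tuple[float, float]:
    """Fisher r-to-z test for two independent Pearson correlations.

    Returns the z statistic and the two-sided P value.
    """
    z1, z2 = atanh(r1), atanh(r2)  # Fisher transformation
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Correlations and group sizes as reported in the text.
z_stat, p_value = compare_correlations(0.61, 131, 0.57, 416)
print(round(z_stat, 2), round(p_value, 2))
```

Under this alternative test the difference between the two correlations is also well short of conventional significance.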

DISCUSSION
Implementing IVR as an automated option for recording pain and disability appears to have had little, if any, effect on the responses of participants in the present prospective cohort study of acute back pain. Although the IVR system reduced staffing requirements for the study, more than one-half of participants failed to use the IVR option, and it was necessary for interviewers to contact these individuals by telephone later. IVR assessment appears equally feasible and valid at different levels of age, education and income; in fact, lower income patients were more likely to take advantage of the IVR option. While more work is needed to demonstrate the potential feasibility of IVR for clinical applications, the present study found no evidence of a systematic bias that might invalidate IVR as an inexpensive tool for data collection.
The low IVR utilization rate in this study (35%) may be due to a number of factors. First, although participants were informed of the follow-up telephone assessments at the time of study recruitment, they were not informed of the IVR option nor provided instructions for IVR use; this option was described in a postcard sent one month later. Some participants may have failed to recognize the postcard or disposed of it inadvertently. Second, others who intended to use the IVR system may have simply procrastinated beyond the five-day time frame before a live attempt was made. A longer call-in period may have generated a higher use rate. Third, participants were provided no special incentive for using the IVR option (besides added convenience). Other incentives (eg, the ability to access useful medical advice) may increase utilization rates. Because ratings of treatment helpfulness were generally high, it is unlikely that frustration with medical care had any bearing on IVR utilization rates. Although the IVR utilization rate was low, there were considerable cost savings (approximately US$100 per IVR call) in the reduced hours required by paid interviewers.
Only one other study (31), a random digit dialing study of attitudes about mass media, evaluated self-selection bias among IVR responders. The researchers found that IVR users were younger (which they attributed to technology savvy) and less educated (which the researchers attributed to greater frankness in reporting education via IVR). We found no association of IVR use with age, but there was a greater preference for IVR use among lower income patients. The latter finding cannot be explained by more frank IVR reporting of income in the present study because this variable was assessed as part of an earlier pen-and-paper measure. An alternative explanation is that lower income participants may have opted for IVR to overcome telephone access problems associated with shared living arrangements or unusual working hours. Therefore, IVR surveys may have potential benefits for improving response rates among low income respondents, a common problem in mail-in and Web-based surveys (32,33).
Participants who reported higher levels of back pain at their initial medical evaluation were more likely to use the IVR option one month later, but this association was confounded by income. After controlling for income, IVR responders were no different on pain levels at intake, and there were no significant differences in pain report at the one-month follow-up. Therefore, we can conclude that patients experiencing delayed pain and functional recovery were no less likely than others to use the IVR option, and this may have important implications given the tendency of patients reporting more symptoms to withdraw from pain research studies and intervention protocols (34).
Implementation of IVR in the present study had a mixed effect on the timeliness of follow-up assessments. Among those who used the IVR option, the follow-up time was improved by an average of one day (29.2 versus 30.5 days); however, the five-day waiting period before calling nonrespondents resulted in a substantial delay for reaching others (35.4 versus 30.5 days). The net effect was a delay of 2.4 days (32.9 versus 30.5 days). A possible method for improving IVR response rates is to program the system to call participants repeatedly until data collection is complete, but this is far more intrusive and less convenient for participants.
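The net delay quoted above is the respondent-weighted mean of the two post-IVR follow-up times compared with the pre-IVR mean. A brief sketch of that arithmetic, using the group sizes and means reported in the Results (variable names are ours):

```python
# Group sizes and mean follow-up times (days) as reported in the text.
n_ivr, days_ivr = 131, 29.2
n_phone, days_phone = 189, 35.4
days_pre_ivr = 30.5

# Weighted mean across everyone followed up after IVR was introduced.
pooled_days = (n_ivr * days_ivr + n_phone * days_phone) / (n_ivr + n_phone)
net_delay = pooled_days - days_pre_ivr

print(round(pooled_days, 1), round(net_delay, 1))
```

Because telephone responders outnumbered IVR responders, their longer follow-up time dominates the pooled mean.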
A primary concern was informational bias, wherein IVR may produce different assessment results without the demand characteristics of responding to a live telephone interviewer. Those who used IVR reported more functional limitation on the RDQ and less helpful treatments, but these differences were explained by the time delay in reaching the comparison group of telephone respondents (the comparison group had, on average, five additional days to recover). Therefore, we conclude that there were no differences between IVR and live telephone interviewing. This contrasts with the findings of Millard and Carver (35) that IVR users with back pain reported greater emotional concerns, and poorer mood and overall health. Other studies have found no differences between IVR and written questionnaires for assessing musculoskeletal function (36) and daily allergy symptoms (37). In addition to finding no significant differences in health report in the present study, there were also no differences in scale reliability or in correlations between outcome measures.
For IVR systems to have clinical utility for monitoring of patient pain and function, several obstacles must be overcome. First, low response rates must be improved by providing incentives or by incorporating medical information, instructions, or feedback that would be useful to patients. For example, patients with acute back pain could be provided information about self-care for pain management, goals for resuming various physical activities, and realistic expectations for treatment and recovery. Such IVR counselling methods have been successful in providing tailored advice to patients on cancer screening (38), smoking cessation (39), elderly caregiving (40) and cardiac rehabilitation (41). Second, the utility of IVR should be improved by integrating IVR data with clinical decision-making for recommending treatments, making referrals and assessing the need for follow-up visits. For example, patients showing slower than normal recovery may be automatically referred for physical therapy. For patients with growing concerns about resuming work, providers could intensify their communications with employers. IVR data collection has been noticeably absent from the ongoing debate over the use of electronic health records to improve coordination of care (42).
Although IVR evaluation was not the primary objective of our data collection efforts in the present study, these data have provided a reasonable opportunity for evaluating the equivalency of IVR data collection to live telephone interviewing. Other study designs may provide additional opportunities for more detailed psychometric evaluation (eg, test-retest reliability or randomized designs). A randomized study in which participants are randomly assigned to either IVR or live telephone assessment and provide preferences or perceptions of IVR use would be optimal. Limitations of the present study include its nonrandomized design, a relatively low IVR response rate and the potential confound of secular trends, because IVR was introduced partway through the study. Also, results of the study may not generalize to other pain conditions that are of a more long-lasting or disabling nature.
The present study is the first to evaluate the equivalency of IVR to live telephone interviewing for assessing pain-related outcomes. Conclusions of this study are that IVR assessment of pain-related outcomes seems a reasonable alternative to live telephone interviewing, that IVR may be preferred by lower income groups, and that approximately one-third of respondents take advantage of the IVR option without any special incentive. Although a number of obstacles must be overcome to integrate IVR data collection into routine clinical practice, this option has potential for monitoring patient recovery and coordinating care after the onset of back pain. Future studies could investigate ways to improve IVR utilization rates among patient populations, track changes in symptoms over time and integrate IVR systems with clinical care.

Figure 1) Response to follow-up before and after implementing an interactive voice response (IVR) data collection system

TABLE 1
df Degrees of freedom; IVR Interactive voice response

TABLE 2
*United States Department of Labor, Bureau of Labor Statistics, 2000 (20); †Bodily reactions are cases, usually nonimpact, in which injury or illness resulted from free bodily motion and excessive physical effort (eg, to avoid a falling object)

at the initial visit and at the one-month follow-up assessment. Two-point changes in this scale have been shown to represent clinically meaningful changes that exceed the bounds of measurement error (23). Pain complaints related to other health concerns or body regions were not assessed.

Functional limitation: Functional limitation due to back pain was assessed at one-month follow-up using a 16-item abbreviated form of the Roland-Morris Disability Questionnaire (RDQ) (24,25). Respondents report whether each of 16 daily living activities is limited (yes/no) due to pain. A total score is the sum of all positive responses. Sample items include "In the past 2 weeks, because of back pain, have you talked less with those around you?" and "…have you kept rubbing or holding areas of your body that hurt or are uncomfortable?" The RDQ has good reproducibility, construct validity and responsiveness to intervention (26). One-week test-retest reliability for the RDQ is 0.88 (26) and internal consistency is 0.88 (27) (current sample alpha=0.73). The RDQ correlates well with other established measures of physical function (28).

Return to work: Participants provided details about current work status, any temporary modifications or physician restrictions, and the cumulative duration of work absence and work modification. These data produced three levels of work resumption: not working; working modified or alternate duty; and working full duty.