Comparability of Internet and telephone data in a survey on the respiratory health of children

1Direction de la Santé Publique de Montréal; 2Université de Montréal, Département de médecine sociale et préventive; 3Clinique interuniversitaire de santé au travail et de santé environnementale, Institut thoracique de Montréal, Montréal, Québec

Correspondence: Ms Céline Plante, Direction de la Santé Publique de Montréal, 1301 rue Sherbrooke Est, Montréal, Québec H7L 3L1. Telephone 514-528-2400 ext 3285, fax 514-528-2459, e-mail cplante@santepub-mtl.qc.ca

Response rates to surveys, particularly telephone surveys, have been declining over the past 30 years (1,2) and, for general population surveys in North America, have been reported to be approximately 50% to 60% (1,3). This decline is caused by growing refusal to participate and by difficulties in contacting individuals (2) due to the increased use of answering machines, call screening devices (3) and cellular telephones (4). Because low response rates may affect survey validity (3,5-7), supplementary efforts are often necessary to maintain response rates in the acceptable range, which results in increased survey costs (3,6,8). The Internet, alone or in combination with other survey modes, has been proposed as a means to both control costs and improve response rates (9,10). Internet questionnaires are usually less costly to administer than postal questionnaires (11), and can be completed at any time, regardless of where the respondents are located. However, the Internet is seldom the only mode of choice because Internet access is still not universally available (68% in 2006, and 76% in 2009 in North America [12]), and there is usually no available list of all users that would enable researchers to obtain a random sample of subjects. In addition, concern has been expressed that information from two or more sources of data within a single study may not be comparable.
Several authors have reported differences in responses between self-administered questionnaire (mail, computer or Internet) and interview (telephone or face to face) modes (3,8,13,14). Two main factors could explain these discrepancies: socioeconomic heterogeneity among respondents between the different modes; and differences related to the modes themselves (8). Regarding the former, individuals with access to the Internet differ from those who do not have access with respect to socioeconomic status, which is an important variable frequently associated with health outcomes. Regarding the latter, reading questions instead of listening to them read aloud on the telephone, answering at one's own pace instead of being pressed to, choosing responses among multiple choices when reading versus listening, or answering intimate or sensitive questions posed by an interviewer versus a computer are among situations in which answers may not be entirely comparable. Indeed, several studies have highlighted differences in answers to sensitive or other types of questions elicited by Internet, computer or mail versus telephone questionnaires (9,15,16),

BACKGROUND: Mixing survey administration modes has generated concern about the comparability of responses between modes.
OBJECTIVE: To explore the differences in respondent profiles and responses between Internet and telephone questionnaires in a survey on respiratory diseases.
METHODS: The data were generated from a mixed Internet and telephone survey of respiratory diseases among children in Montreal (Quebec) in 2006. Comparison of 12 selected questions was performed after standardization for respondent education and income. Stratification of the analysis by education and income categories was also performed for the questions with significantly divergent responses.
RESULTS: Six questions showed significant differences in responses between modes after standardization. The largest differences among the closed-ended questions were observed for highly prevalent symptoms: dry cough during the night (difference of 9% for positive answer [P<0.01]) and symptoms of allergic rhinitis (difference of 7% for positive answer [P<0.01]). A large discrepancy was also found for the multiple-choice question that included an open-ended response (ie, free answer). Among the three potentially sensitive questions, a desirability bias was probably present in one question on smoking habits (difference of 2.6% for positive answer [P<0.05]).
CONCLUSION: The differences observed between Internet and telephone responses to selected questions were not completely explained by socioeconomic disparities among the respondents. In a mixed-mode survey (Internet and telephone), caution should be used when formulating sensitive, complex, open-ended and long questions, and those related to highly prevalent and nonspecific symptoms.

RESULTS (French abstract): Six questions show significantly different responses between the two modes after standardization. The largest differences are observed among the closed-ended questions on highly prevalent symptoms: dry nocturnal cough (difference of 9% in positive answers, P<0.01) and symptoms of allergic rhinitis (difference of 7% in positive answers, P<0.01). A large discrepancy is also found for a question with multiple choices and an open-ended item. Among the three potentially sensitive questions, a desirability bias appears to be present for the one on tobacco use (difference of 3% in positive answers, P<0.05).
CONCLUSION (French abstract): The differences observed between the responses to the Internet and telephone questionnaires for the selected questions are not completely explained by socioeconomic disparities among the respondents. When a survey combines Internet and telephone modes, care must be taken in formulating sensitive or complex questions, questions that include an open-ended item, and questions on highly prevalent and nonspecific symptoms.
even after adjustment for selection bias or in the absence of socioeconomic variation (11,13).
The aim of the present survey was to quantify how the prevalence rates of respiratory diseases (asthma, rhinitis and/or infections) among children varied across the Island of Montreal (Quebec) and to identify the factors associated with their distribution. A mixed-mode survey using the Internet and telephone was chosen for the present study given the target group (young families familiar with the Internet) and the desire to reach the various socioeconomic groups.
Given the reported associations between socioeconomic status (SES), asthma rates and access to the Internet, we aimed to assess the comparability of data collected using these two modes. The main objective of the present study was to examine the differences in respondent profiles and the distributions of responses collected from the Internet and the telephone questionnaires. A secondary objective was to study the effect of contact methods (mail only versus both mail and telephone) and SES on mode selection by respondents.

Survey
The present study targeted children six months to 12 years of age living on the Island of Montreal. The survey was conducted by a private firm during the spring and summer of 2006. A probability sample of 17,661 names and addresses of targeted parents was obtained from the administrative database of the Régie de l'Assurance Maladie du Québec (RAMQ, Provincial Health Insurance Board). Through automated and subsequent manual processes, 12,678 addresses (72% of the initial sample) could be matched to a telephone number. Families were contacted using two procedures: a letter was first sent to each family in the sample and, when a number was available, telephone calls followed the letter. The letter invited parents or legal guardians either to answer the questionnaire directly on the Internet or to contact the firm by telephone. Interviewers initially offered the respondent an opportunity to complete the questionnaire immediately or later by telephone, or on the Internet. A personal identification number included in the letter gave secure access to a dedicated Internet site where the online questionnaire could be completed. A minimum of 10 calls were made to contact respondents. In an attempt to improve the response rate, a different interviewer made additional callbacks, and a second letter was sent to families for whom no telephone number was identified. Internet and telephone questionnaires were identical and completed during the same period. On average, the telephone survey lasted 23 min, but the duration of the Internet survey is unknown. There were 300 questions that focused on the child's current and past respiratory health, use of medication and health care services, family history of allergies, home environmental exposures, and sociodemographic, household and lifestyle factors.
The survey was completed by 7964 respondents, of whom 4155 (52.2%) answered on the Internet and 3809 (47.8%) by telephone. The response rate was 71% for the group for which a telephone number could be paired to an address, and approximately 30% for the remaining group without an available telephone number. This approximation was made on the assumption that the proportion of valid unreturned mailouts, which is unknown but needed to compute the response rate, is the same as the (known) proportion of valid telephone numbers.
The study protocol was approved by the Montreal Public Health Department Human Subjects Research Ethics Committee. Consent was sought from all participants before they completed the questionnaire, and all were assured that the collected information would remain confidential and anonymous.

Analysis
Socioeconomic characteristics of the respondents, including sex, country of birth, education and family income, were compared according to survey mode. The socioeconomic characteristics of Internet respondents were also compared according to contact procedure (mail contact only versus mail and telephone). The χ² test was used to measure the significance of differences in response distributions. Log binomial regression, suitable for binary response, was used to study the effect of socioeconomic characteristics and contact procedure on the choice of survey mode.
A subset of questions representing diverse task complexities and important outcomes, and some pertaining to potentially sensitive issues (these were related to smoking habits during pregnancy, presence of pest animals, and breast- or formula-feeding of newborns), were selected. One question with multiple choice responses and one with an open-ended response (ie, free answer) were also included. For the question with an open-ended response, which pertained to changes in behaviour or home modifications made to alleviate the child's asthma symptoms, the seven most frequent answers were retained for analysis. All questions offered a nonresponse option. The differences between survey modes were tested using the χ² test. Nonresponse frequencies were also compared.
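The mode comparison described above can be illustrated with a minimal sketch of Pearson's χ² test for a 2×2 table (survey mode by yes/no answer). The function name `chi2_2x2` and the counts are hypothetical and chosen only for illustration; they are not survey data, and the software used by the authors is not stated. With one degree of freedom the P value can be written with the complementary error function, so only the standard library is assumed.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
        [[a, b],   e.g. Internet:  yes, no
         [c, d]]   e.g. telephone: yes, no
    Returns (statistic, p_value) for 1 degree of freedom.
    Assumes every row and column total is greater than zero."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts (not taken from the survey).
stat, p = chi2_2x2(10, 20, 20, 10)
print(f"chi2 = {stat:.3f}, P = {p:.4f}")
```

For tables with more than two response categories, the same statistic generalizes to a sum over all cells, with the P value taken from a χ² distribution with more degrees of freedom.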
Because the difference in responses between modes may be attributable to dissimilarity in the socioeconomic characteristics of respondents, the responses were standardized according to two categories of family income and two categories of respondent education, based on the entire sample proportions of those categories. The cut-off point for family income was $35,000/year, and the education level was classified below or at/above high school. The education level or family income was not available for 861 respondents, resulting in 7103 completed questionnaires.
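The standardization step can be sketched as a direct standardization: within each survey mode, the proportion of positive answers is computed per income/education stratum and then averaged using the stratum shares of the entire sample as weights. The function name, stratum labels, weights and counts below are all hypothetical; the sketch shows the general technique under these assumptions and is not the authors' code.

```python
def standardized_proportion(counts, weights):
    """Directly standardized proportion of positive answers.

    counts:  {stratum: (positives, respondents)} for one survey mode
    weights: {stratum: share of the entire sample in that stratum}
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("stratum weights must sum to 1")
    return sum(w * counts[s][0] / counts[s][1] for s, w in weights.items())

# Hypothetical strata: (family income, education) relative to the two cut-offs.
weights = {
    ("low", "low"): 0.1,
    ("low", "high"): 0.2,
    ("high", "low"): 0.2,
    ("high", "high"): 0.5,
}
# Hypothetical (positives, respondents) per stratum for each mode.
internet = {
    ("low", "low"): (30, 100),
    ("low", "high"): (40, 200),
    ("high", "low"): (50, 200),
    ("high", "high"): (100, 500),
}
telephone = {
    ("low", "low"): (35, 100),
    ("low", "high"): (50, 200),
    ("high", "low"): (30, 100),
    ("high", "high"): (44, 200),
}

p_internet = standardized_proportion(internet, weights)
p_telephone = standardized_proportion(telephone, weights)
print(f"Internet: {p_internet:.3f}, telephone: {p_telephone:.3f}")
```

Because both modes are weighted to the same stratum distribution, any difference that remains between the two standardized proportions cannot be attributed to the modes having different income/education mixes across these four strata.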
If initial differences in responses to a question between Internet and telephone survey modes were statistically nonsignificant after standardization, this suggested that the differences were related only to variations in socioeconomic characteristics of respondents according to survey mode. If significant differences remained after standardization, the analysis was further stratified according to four education/income strata to investigate possible trends in responses. To identify the most probable factors explaining the remaining differences, we examined the observed trends across socioeconomic strata (eg, difference only observed among lowest socioeconomic stratum) and the proportions of positive answers in the two modes, taking into account known factors from the literature (eg, a factor known to produce higher positive answers among Internet respondents) and details of the survey methods (eg, possible influence of the interviewer).

RESULTS
Respondents' socioeconomic profile according to survey mode
Table 1 summarizes the distribution of family income, respondents' education, sex and country of birth according to response mode. A clear trend can be observed: Internet respondents had higher incomes and education levels compared with telephone respondents. There was a slight statistical difference in respondent sex, with men responding more frequently by telephone. The distribution of respondents' origin indicated that immigrants, except those born in Europe, were less prone to answer on the Internet. This was especially true for respondents born in Africa, the Caribbean and Bermuda.

Profile of Internet respondents according to contact procedure
Among the 1516 respondents who were contacted by mail only, 435 called the survey firm back and were interviewed, and 1081 completed the Internet questionnaire. This group of Internet respondents had different socioeconomic characteristics from the group of Internet respondents for whom a telephone number was available (n=3074) (Table 2). Specifically, the former group had a lower family income and education level, and included more immigrants (Table 2).

Factors influencing the choice of survey mode
Table 3 summarizes the results of a regression model reporting the probability of choosing the Internet instead of the telephone, according to selected respondent characteristics and contact mode. The regression analysis showed that a high education level and being contacted by mail only were the strongest factors among those significantly associated with a preference for the Internet. Higher family income, female sex, and coming from Canada or the United States were less strongly associated.
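As a rough cross-check on the contact-procedure effect, a crude (unadjusted) ratio can be computed from counts given in the text. The sketch assumes that every respondent falls into one of the two contact groups, so that the mail-and-telephone group is the 7964 respondents minus the 1516 contacted by mail only; this split is an inference from the reported counts, not a figure stated in the paper. For a single binary predictor, the exponentiated coefficient of a log binomial model equals this ratio of proportions. The authors' model also adjusted for respondent characteristics, so its estimate need not match the crude value.

```python
# Counts reported in the text.
total_respondents = 7964
mail_only_total = 1516        # respondents contacted by mail only
mail_only_internet = 1081     # of those, answered on the Internet
mail_phone_internet = 3074    # Internet respondents with a telephone number

# Assumption: all remaining respondents were contacted by mail and telephone.
mail_phone_total = total_respondents - mail_only_total

p_mail_only = mail_only_internet / mail_only_total
p_mail_phone = mail_phone_internet / mail_phone_total

# Crude ratio of the probabilities of choosing the Internet.
crude_ratio = p_mail_only / p_mail_phone

print(f"Internet share, mail only:          {p_mail_only:.1%}")
print(f"Internet share, mail and telephone: {p_mail_phone:.1%}")
print(f"Crude ratio: {crude_ratio:.2f}")
```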

Response to questions according to survey mode
The frequency of item nonresponse (ie, refusal or don't know) for the selected questions was low regardless of mode, and varied from 0% to 4.8% (Table 4). Nevertheless, for 11 of the 12 questions, the nonresponse rate was higher in the questionnaires completed on the Internet, the greatest difference being 2.9% (question [Q] 7).
Six of the 12 questions showed significant differences in responses between modes after standardization (Table 4). Among the eight nonsensitive simple 'yes/no' questions (Q1 to Q8 of Table 4), three displayed significant differences in response frequencies. Q2 on dry nocturnal cough and Q6 on allergic rhinitis symptoms showed relatively large discrepancies between modes, with 8.9% and 7.0% more positive answers, respectively, in the Internet questionnaires, while Q1 showed only a slight difference (2.2%).
For the potentially sensitive questions (Q9, Q10 and Q11), Q9 (concerning mother smoking during pregnancy) demonstrated a higher positive response rate among Internet respondents (15.9% for Internet versus 13.3%) and Q10 concerning the presence of pest animals or insects in the house showed the opposite trend (18.0% for Internet versus 20.4%). These differences were slight but significant. For Q11 (on breastfeeding), the differences were not significant.
The answers to Q12, concerning home or behaviour changes made because of the child's asthma, showed large differences between modes. The respondent could choose more than one answer among the first four items given, and add other answers of his or her own (open-ended response), up to a maximum of five items. The first four items (listed choices) were more frequently chosen by Internet respondents, while the open-ended response was more frequently chosen by the telephone respondents, with the total number of different changes made in the home higher in this group.
The results of stratified analysis according to education and income categories for the questions showing significant differences are presented in Table 5, except for Q12, for which the number of responses per item was too small for stratification. For three questions (Q1, Q9 and Q10), differences between survey modes were not significant for three or all four categories of stratification. For the first question on wheezing, differences within strata seemed larger in respondent groups with low education, although not significantly so. The same was observed for Q9 (smoking during pregnancy), with the difference being significant for the lowest education-income category. There was no clear trend observed for Q10 (presence of pest animals or insects). For Q2 (dry cough during the night) and Q6 (symptoms of allergic rhinitis), differences between survey modes were significant in three or all four categories. The percentage of positive answers was systematically higher among Internet respondents for these two questions, and the difference was larger among low education-income categories.

DISCUSSION
When tasked with analyzing data obtained by telephone and Internet for the same survey, one should verify whether they are comparable. This is not an easy task when respondents in the two modes differ socioeconomically, which adds to potential errors due to the modes themselves; both can affect the comparability and validity of the data. This was especially true in the present survey because asthma and the other respiratory diseases are known to be associated with SES. The main objective of the present analysis was to study the differences in respondent profiles and the distributions of responses between the Internet and the telephone questionnaires. A secondary objective was to investigate the effect of contact procedure (mail only versus both mail and telephone) and SES on mode selection by respondents.
It came as no surprise that the socioeconomic characteristics of respondents in the present survey differed between those who opted for the Internet and those who opted for the telephone. Other studies have shown an association between administration mode and ethnic origin, age, education and income (13,17). In the present study, Internet respondents were typically more educated, had a higher income and were generally less likely to be immigrants. Among the Internet respondents, however, there were markedly more immigrants in the group contacted by mail only, compared with the group also contacted by telephone, especially those from Central and South America, Europe and Africa. This may indicate that these immigrants tried to avoid the use of a language that they did not speak fluently. It may have also been due to a greater number of addresses not paired with a telephone number among immigrants, who are more frequently tenants and move residences more frequently (data not shown).
Regression analysis showed that a high level of education had the strongest influence on the choice of Internet, even more than family income. However, the contact procedure had a stronger effect: those contacted by mail only were 53% more likely to respond by Internet than those receiving interviewer's call(s) following the letter(s). In other words, if reached by telephone, respondents would tend to complete the questionnaire immediately over the telephone, and those contacted by mail only would tend to go online instead of calling the survey firm. This is probably due to the pressure exerted by the interviewers to obtain a completed questionnaire as soon as possible. In interpreting this result, one should keep in mind that some respondents had no choice because they did not have access to the Internet (unfortunately, we do not know who did not have such access).
Nonresponse (ie, don't know/refusal) was more frequent in the questionnaires completed on the Internet. Fricker and Schonlau (18) reported similar results in a literature review on Internet surveys. Greene et al (9) reported more nonresponse on Internet surveys for complex questions. It is quite possible that the telephone interviewers inadvertently pressed the respondents to answer or provided some explanations that helped the respondent to answer.
Five of the 11 closed questions chosen for comparison showed differences according to survey mode after standardization on income and education. Two of the three potentially sensitive questions showed significant differences between modes (smoking during pregnancy and pest animals or insects), these probably being the most sensitive. For smoking during pregnancy, the analysis showed that positive answers were more frequent among Internet respondents and the stratification revealed that this trend was more pronounced in the two lowest education-income categories, thus suggesting a desirability bias. Regarding the question on pest animals, the differences observed do not suggest the presence of desirability bias. Other authors have found that self-administered questionnaires are more efficient for sensitive questions than interviews (8,13,15).
As reported by Fricker et al (13), differences between survey modes increase with complexity, such as in open-ended questions. This was the case with the question on behaviour or home modifications made to alleviate the child's asthma symptoms. The open item was more frequently used by telephone respondents, while the four listed choices were selected more frequently by Internet respondents, especially those related to smoking habits. By systematically asking for an additional open item, the telephone interviewers may have increased the response to such an item.
The largest differences among the closed-ended questions were observed in the ones pertaining to dry cough during the night (8.9%) and symptoms of allergic rhinitis (7.0%). Positive answers were systematically more frequent among Internet respondents in all education and income strata. These symptoms are very frequent, the prevalence of dry cough in the present study being approximately 50%, and that of rhinitis symptoms approximately 30%. In the subquestion asking whether the coughing or wheezing woke the child during the night, the difference between the two modes was no longer significant. This may indicate that this more specific question was less influenced by varying interpretations and by the time available to respond. Brøgger et al (19) also found greater discrepancy for questions on coughing when comparing respiratory symptoms and risk factors between telephone interviews and postal questionnaires on the same respondents. Regarding questions on symptoms of allergic rhinitis, it is also possible that reading instead of listening to the question modified or facilitated the understanding of this long question. Finally, the question on wheezing or whistling also showed a small but significant difference between the two modes.
There was no difference between the two survey modes for the questions related to the diagnosis of asthma, use of bronchodilator medication and mold in the house. This has important validity implications because these were key variables in our study (20).

Table 4. Comparison of positive responses according to administration mode on standardized data for income and education*
Question (Q)†
*Data standardized on two categories of family income and two categories of respondent education, based on the entire sample proportions. The cut-off point for family income was $35,000/year and the education level was classified below or at/above high school.
†Q1, Q2, Q3, Q4, Q5, Q7, Q8, Q9 and Q10 had three response categories: 'yes', 'no' and 'don't know/refused to answer'. Two response categories of smokers (everyday, sometimes) were aggregated.
‡P<0.05 (χ² test).
§The first four choices were given in the order presented; separate χ² tests were performed on the frequency of each item among respondents between the two modes.