There is great concern nowadays about alcohol consumption and drug abuse, especially among young people. By analyzing the social environment in which adolescents are immersed, together with a series of measures of alcohol abuse risk, personal situation, and perception obtained from questionnaires such as AUDIT, FAS, and KIDSCREEN, it is possible to gain insight into the current situation of a given individual regarding his/her consumption behavior. This analysis, however, requires tools that ease questionnaire creation, data gathering, curation, and representation, and the later analysis and visualization for the user. This research presents the design and construction of a web-based platform that facilitates each of these processes by integrating the different phases into an intuitive system with a graphical user interface. The interface hides the complexity underlying the questionnaires and techniques used and presents the results in a flexible and visual way, avoiding any manual handling of data during the process. The advantages of this approach are shown and compared with the previous situation, in which some of the tasks were accomplished through time-consuming and error-prone manual manipulation of data.
Computing and Information Science play an increasingly important role in healthcare studies and applications [
Data gathering is usually carried out with online surveys designed and created with well-known free (e.g., Google Docs [
Curation of the data, once they have been obtained from the surveys, is usually left to the professional/researcher who designed the questionnaire. They are responsible for obtaining the information in a format that is useful for their work. This process is most often a manual one, so it is usually very time consuming and error prone.
There are a number of specialized questionnaires that could be reused without the need of creating them from scratch. These questionnaires are used for obtaining a number of psychosocial measures regarding individuals. Some examples are FAS II (Family Affluence Scale) [
When social network analysis (SNA) is included in healthcare studies, the former considerations acquire greater relevance. Obtaining social relationships from surveys requires complex questions in which the participants are usually arranged into matrix-like structures. The results of these kinds of questions consist of a large amount of interrelated data that is quite difficult to handle manually, especially when dealing with hundreds or thousands of individuals. Moreover, when the same questionnaire is used with different groups of people, the resulting spreadsheets have a different number of rows and, more problematically, columns, which makes them even more difficult to handle. The proper representation of this kind of information is crucial for the later phase of analysis and visualization, because the algorithms used for social network analysis must be run by computers owing to their computational complexity and execution time.
Another level of complexity regarding the gathering and processing of data appears if the study of a given set of people is carried out at different points in time, that is, if it is desired to know how the social relationships and metrics evolve in time. These kinds of studies are popular in today’s approaches to social-based substance consumption research.
This research aims to study, design, and develop a solution able to automate every step in the process of questionnaire design and deployment, data gathering and curation, and later analysis and visualization, in order to reduce the drawbacks and difficulties of manual handling described above and to help professionals/researchers focus on their work rather than on time-consuming, error-prone tasks.
In order to achieve these objectives, a number of techniques and tools have been used and created, allowing data to be captured and represented in the computer and shown in a friendly graphical user interface. Any healthcare professional or researcher, without deep knowledge of computing, is able to design and build a personal questionnaire, publish it in order to gather data, and see the results in the most convenient way for the purpose for which it was constructed.
The ideas and tools described in this research have been applied to a study of influence mechanisms on alcohol consumption among adolescents. The study uses a questionnaire to gather data about the student's psychosocial situation, his/her life perception and socioeconomic position, alcohol consumption habits, and friendship and family network. It combines this complex questionnaire with techniques from social network analysis (SNA) in order to find and demonstrate the mentioned influence mechanisms, as well as to obtain a picture of the situation of each student and his/her class regarding alcohol consumption levels.
The rest of the paper is organized as follows. Section
Current research on substance consumption and abuse tries to study a wider range of factors that may be involved in consumption habits. In particular, the social environment in which the individual is immersed is of special interest. This is even more important when the study population is adolescent, because adolescence is a stage in life where close friends and colleagues may influence each other’s lifestyle and habits, including alcohol and substance consumption (a survey of this kind of study for the case of the European Union can be found in [
Traditional tools to gather information about individual consumption habits and other psychosocial and socioeconomic measures for an individual include a number of well-known, validated, and standardized questionnaires. Some of them have been included in the solution proposed in this research:
- AUDIT (Alcohol Use Disorders Identification Test) [
- FAS II (Family Affluence Scale II) [
- ESTUDES (Poll about Drug Use in Secondary School in Spain) [
- KIDSCREEN-27 (Health Related Quality of Life Questionnaire for Children and Young People and Their Parents) [
- Self-efficacy [
These questionnaires have a scoring method in order to obtain a quantitative and/or qualitative result that characterizes the individual into a number of categories that can be later used for decision making or further analysis. Table
AUDIT risk level scoring [
Risk level | Intervention | AUDIT score |
---|---|---|
Zone I | Alcohol education | 0–7 |
Zone II | Simple advice | 8–15 |
Zone III | Simple advice plus brief counseling and continued monitoring | 16–19 |
Zone IV | Referral to specialist for diagnostic evaluation and treatment | 20–40 |
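The mapping in the table above is simple enough to express directly in code. The following is a minimal sketch in Python, not the platform's actual implementation (which the paper describes as PHP-based); the names `AuditZone` and `audit_zone` are hypothetical, and only the score ranges and interventions come from the table.

```python
from typing import NamedTuple


class AuditZone(NamedTuple):
    zone: str
    intervention: str


# (lowest score, highest score, zone) rows copied from the AUDIT table above.
_AUDIT_ZONES = [
    (0, 7, AuditZone("Zone I", "Alcohol education")),
    (8, 15, AuditZone("Zone II", "Simple advice")),
    (16, 19, AuditZone(
        "Zone III",
        "Simple advice plus brief counseling and continued monitoring")),
    (20, 40, AuditZone(
        "Zone IV",
        "Referral to specialist for diagnostic evaluation and treatment")),
]


def audit_zone(score: int) -> AuditZone:
    """Return the risk zone and intervention for a total AUDIT score."""
    for low, high, zone in _AUDIT_ZONES:
        if low <= score <= high:
            return zone
    raise ValueError(f"AUDIT score must be between 0 and 40, got {score}")


if __name__ == "__main__":
    result = audit_zone(17)
    print(result.zone, "-", result.intervention)
```

Keeping the ranges in a data table rather than in nested conditionals mirrors the idea of storing the scoring rules of each validated test once and reusing them for every respondent.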
The proposed application has a login system through which the professional or researcher, once registered, accesses a personalized control panel where she can perform all the tasks needed for her research. The application has been developed with a web architecture, using a web browser for the graphical user interface and the latest standards for developing these kinds of applications (HTML5, jQuery, AJAX, CSS3). The server side has been developed in PHP, with MySQL as the database management system.
Some libraries and APIs have been used for displaying the data, especially when dealing with graphs that show social connections (e.g., SigmaJS:
Three different user roles can be distinguished in the application:
- Super Administrator has full permissions to manage questionnaires, users, and respondents (excluding access to personal data as described in data protection laws).
- Interviewer/pollster can create and edit questionnaires, using the validated tests available in the platform or creating questions from scratch.
- Respondent has access to the application only for filling in the questionnaire.
A researcher or professional is able to accomplish the following tasks:
- Manage questionnaires (both validated and customized)
- Create and edit individual questions or question groupings for the questionnaires
- Manage interviewers
- Analyze and visualize the data from the questionnaires that have already been filled in
The next sections describe each of these functionalities, showing how the application eases the work of the user.
From this menu item, the user can see a listing of the different validated tests that are common in the field of healthcare studies (as is the case of the aforementioned AUDIT, FAS II, KIDSCREEN, etc.).
The user can browse and inspect in detail the questions and characteristics of the questionnaire, the responses available for each question along with their scoring, and the general score and its meaning for the whole questionnaire.
A user with the Super Administrator role is able to add new validated questionnaires. In that case, the reference containing the complete description of the questionnaire must be provided.
Users with the Super Administrator or interviewer role can design, create, edit, or delete questions inside their questionnaires. To do so, there is an editing window where he/she can see the questionnaires he/she has created and, within each of them, add, delete, or edit questions. An interviewer, however, cannot see the questionnaires created by other interviewers.
For the questions that have been created by the interviewer, the score can be assigned at design time. Also, questions can be grouped into sets, giving a total score for the set based on a formula including the scores of the different questions within it, in a similar way to the validated tests that exist in the platform.
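As an illustration of how per-answer scores and a set-level formula might fit together, here is a minimal Python sketch. It is not the platform's data model: the class names `Question` and `QuestionSet`, the `formula` parameter, and the example questions are all hypothetical, and the default formula (a plain sum) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Question:
    text: str
    # Answer label -> score, assigned by the interviewer at design time.
    answer_scores: Dict[str, float]

    def score(self, answer: str) -> float:
        return self.answer_scores[answer]


@dataclass
class QuestionSet:
    name: str
    questions: List[Question]
    # Formula combining the individual scores; a plain sum is assumed here.
    formula: Callable[[List[float]], float] = sum

    def total(self, answers: List[str]) -> float:
        if len(answers) != len(self.questions):
            raise ValueError("one answer per question is required")
        scores = [q.score(a) for q, a in zip(self.questions, answers)]
        return self.formula(scores)


if __name__ == "__main__":
    # Invented example questions, for illustration only.
    frequency = Question("How often do you drink?",
                         {"never": 0, "monthly": 1, "weekly": 2})
    quantity = Question("How many drinks on a typical day?",
                        {"1-2": 0, "3-4": 1, "5+": 2})
    habits = QuestionSet("habits", [frequency, quantity])
    print(habits.total(["weekly", "3-4"]))
```

Passing a different callable as `formula` (for example, a mean or a weighted sum) changes how the set is scored without touching the individual questions.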
The procedure for creating a new question is below (see Figure
- Adding a question with a number of possible answers, giving a value for each one of them
- Grouping a number of questions and giving a value to the total score based on a formula using the scores of the individual questions
The creator of the question can also indicate if the response given needs to be anonymized so as not to violate personal data protection laws. The system will automatically perform a substitution of data from these fields into anonymized ones.
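The paper does not specify how the substitution is performed. One common approach, shown here purely as an assumption, is to replace each flagged value with a stable pseudonym derived from a salted hash, so that the same person still maps to the same identifier across questions. The function name `anonymize`, its parameters, and the `anon-` prefix are hypothetical.

```python
import hashlib
from typing import Any, Dict, Set


def anonymize(response: Dict[str, Any],
              sensitive_fields: Set[str],
              salt: str) -> Dict[str, Any]:
    """Return a copy of `response` with flagged fields replaced by pseudonyms.

    The same (salt, value) pair always yields the same pseudonym, so
    relationships between records are preserved.
    """
    cleaned = {}
    for field, value in response.items():
        if field in sensitive_fields:
            digest = hashlib.sha256(
                (salt + str(value)).encode("utf-8")).hexdigest()
            cleaned[field] = "anon-" + digest[:12]
        else:
            cleaned[field] = value
    return cleaned


if __name__ == "__main__":
    raw = {"name": "Student A", "audit_score": 9}
    print(anonymize(raw, {"name"}, salt="study-2017"))
```

Strictly speaking this is pseudonymization rather than full anonymization: anyone holding the salt and a list of candidate names could recompute the mapping, so the salt would need to be kept secret.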
Adding a question.
Users with the Super Administrator or interviewer role can design, create, edit, or delete questionnaires. There is a dedicated space for this, along with a listing of the questionnaires that each user is able to manage. An interviewer can only manage questionnaires created by himself/herself.
Adding a questionnaire will prompt for the following information (see Figure
- The questionnaire type
- A description for the questionnaire
- Questions and sets of questions or validated questionnaires to be integrated in this one
- Generating questions based on the group of people to be polled: this feature is especially useful for capturing social network data
When creating and editing a questionnaire, it is possible to add an existing, validated test or a group of existing questions that have already been created by someone else or to add new questions (see Figure
Section to add questionnaires.
Modal window to add questions to questionnaire.
For capturing relationships, it is possible to introduce a list of people who are going to fill in the questionnaire in order to use their data for building questions related to social relationships. As an example, when trying to find how social relationships influence alcohol consumption in adolescents, it is necessary to have a listing of students in a class in order to build questions of the type “How often do you go out for an alcoholic drink with the following colleagues.”
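The idea of combining a class roster with a question template can be sketched as follows. This is an illustrative Python example under assumed names (`relationship_questions`, the `prompt` and `rows` keys), not the platform's implementation: each respondent receives the same prompt with one row per classmate, excluding himself/herself.

```python
from typing import Dict, List


def relationship_questions(roster: List[str],
                           prompt: str) -> Dict[str, dict]:
    """Build one matrix-style question per respondent from a class roster.

    Each respondent gets the prompt with every other student as a row.
    """
    questions = {}
    for respondent in roster:
        others = [name for name in roster if name != respondent]
        questions[respondent] = {"prompt": prompt, "rows": others}
    return questions


if __name__ == "__main__":
    roster = ["Student A", "Student B", "Student C"]
    prompt = ("How often do you go out for an alcoholic drink "
              "with the following colleagues?")
    for respondent, question in relationship_questions(roster, prompt).items():
        print(respondent, "->", question["rows"])
```

Because the questions are derived from the roster rather than typed in by hand, correcting a name or adding a student only requires changing the list and regenerating.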
Users with the Super Administrator or interviewer role can also list the people with the respondent role, and add, edit, or delete them. When adding new respondents, groups can also be created and assigned different questionnaires to fill in.
A questionnaire can be assigned more than once to the same individual or group of individuals. This is used when the study needs the same set of data obtained at different points in time, for example, when studying how consumption habits and friendship relationships evolve over time, with the aim of finding influence or selection processes in the social environment of the individual.
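To make the idea of comparing repeated administrations concrete, the sketch below compares the friendship ties reported in two waves. It is only an illustration of the kind of analysis such data allows, not a feature the paper claims for the tool; the function name `compare_waves` and the representation of ties as `(from, to)` pairs are assumptions.

```python
from typing import Dict, Set, Tuple

Tie = Tuple[str, str]  # (student who names, student who is named)


def compare_waves(wave1: Set[Tie], wave2: Set[Tie]) -> Dict[str, object]:
    """Summarize how a set of directed ties changed between two waves."""
    kept = wave1 & wave2        # ties reported in both waves
    dissolved = wave1 - wave2   # ties reported only in the first wave
    formed = wave2 - wave1      # ties reported only in the second wave
    union = wave1 | wave2
    # Jaccard index: share of all observed ties that are stable.
    stability = len(kept) / len(union) if union else 1.0
    return {"kept": kept, "dissolved": dissolved,
            "formed": formed, "stability": stability}


if __name__ == "__main__":
    first = {("A", "B"), ("B", "A"), ("A", "C")}
    second = {("A", "B"), ("B", "A"), ("C", "B")}
    summary = compare_waves(first, second)
    print(summary["formed"], summary["dissolved"], summary["stability"])
```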
The application has a user interface to show the results of the different questionnaires that a user is able to manage. This part of the application shows the resulting piece of data once processed or curated, that is, once the calculations of the different scores have been performed.
As well as plain qualitative and quantitative measures obtained from data curation, there is a part of the application devoted to showing social relationships obtained with the questionnaires. This type of information is shown by means of different graphs where nodes and edges represent individuals (with sizes, shapes, and colors representing different characteristics of the individual) and relationships (friendship, drinking companions, etc.), respectively.
Traditional research using social network analysis (SNA) has been carried out by using tools like UCINET [
Application showing social network of friendship relationships.
This part of the application provides the following functionalities, based on the kind of information that can be obtained from the data:
- Load the set of data coming from the responses to the questionnaire and show the different networks that can be of use for gaining insight into alcohol use related to friendship (at school level) and family networks.
- Show the data graphs and the results of the analysis of social patterns in alcohol consumption, taking into account different levels of peer relationship (acquaintance, partner, and friend) and also the family environment.
- Show basic data and a report-like description of any individual in the network about his/her alcohol consumption status, especially concerning alcohol use disorder risk or any kind of relevance within the network that the individual may have (being a mediator, an influencer for others, etc.). See Figure
- Show, for each individual, who can act as an influencer for him/her.
- Show a report-like description similar to the one stated previously, but for entire social networks and any groups that may be found.
- Show and report the relationship that may hold between alcohol consumption, socioeconomic status, self-perception, and self-efficacy.
- Show and report factors that may be related to the level of alcohol consumption, like polydrug use and consumption and relationship environments.
The application must show the information in such a format that any health professional or researcher can understand it, without need for knowing anything about social network analysis techniques or terminology. The interaction with the tool and the information presented must utilize commonly used terminology when describing adolescent characteristics, friendship or family relationships, and alcohol consumption habits.
Application showing information about an individual.
Figure
Figure
One of the most difficult parts of the application design and development is the part devoted to social network analysis. It was necessary to study the different sets of techniques and algorithms used in this field in order to run the algorithms and represent the information in the most convenient way for the user.
In order to obtain relevant information from SNA techniques, the questions and responses to be included in the questionnaires must be carefully chosen [
Further distinctions may be made within each category depending on the characteristics of the relationship. For example, we can ask students to name their best friends, but this relationship may not be reciprocated by the people they name as their friend. Also, we can ask each student to weight the friendship relationship (according to a scale going from “I know him” to “he is my best friend,” e.g.) [
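The distinction between reciprocated and unreciprocated nominations can be checked mechanically once the answers are stored as directed ties. The following Python sketch is illustrative only; the function name `split_by_reciprocity` and the input shape (a mapping from each student to the set of friends he/she names) are assumptions.

```python
from typing import Dict, FrozenSet, Set, Tuple


def split_by_reciprocity(
    nominations: Dict[str, Set[str]],
) -> Tuple[Set[FrozenSet[str]], Set[Tuple[str, str]]]:
    """Separate mutual friendships from one-way nominations.

    Returns (mutual, one_way): `mutual` holds unordered pairs, `one_way`
    holds (namer, named) pairs that were not reciprocated.
    """
    # Directed ties, ignoring any self-nomination.
    edges = {(a, b) for a, named in nominations.items()
             for b in named if a != b}
    mutual = {frozenset((a, b)) for a, b in edges if (b, a) in edges}
    one_way = {(a, b) for a, b in edges if (b, a) not in edges}
    return mutual, one_way


if __name__ == "__main__":
    answers = {"A": {"B", "C"}, "B": {"A"}, "C": set()}
    mutual, one_way = split_by_reciprocity(answers)
    print(mutual)   # the A-B friendship is reciprocated
    print(one_way)  # A names C, but C does not name A
```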
Once the relationships established among different sets of students have been captured, a number of algorithms may be run in order to obtain interesting information about individuals or groups. Different measures can be obtained for each individual showing their relevance in the network of friends, which may indicate, for example, a capacity to influence others regarding alcohol consumption. It is also important to detect groups of people having strong connections among them. Clustering algorithms have also been used in order to show this information [
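As one concrete example of a per-individual relevance measure, the sketch below computes normalized in-degree centrality: the fraction of the other students who name a given student. It stands in for the broader family of measures mentioned above and is not claimed to be the specific metric used by the authors; the function name and input shape are assumptions.

```python
from collections import Counter
from typing import Dict, Set


def in_degree_centrality(nominations: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of the other students who name each student.

    `nominations` maps each student to the set of students he/she names.
    """
    students = set(nominations)
    for named in nominations.values():
        students |= set(named)
    # Count nominations received, ignoring self-nominations.
    received = Counter(b for a, named in nominations.items()
                       for b in named if a != b)
    denominator = max(len(students) - 1, 1)
    return {s: received[s] / denominator for s in students}


if __name__ == "__main__":
    answers = {"A": {"B"}, "B": {"C"}, "C": {"B"}}
    for student, value in sorted(in_degree_centrality(answers).items()):
        print(student, value)
```

A student named by many classmates scores close to 1, which is one simple indication of a potentially influential position in the friendship network.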
The application that has been presented in this paper was developed during a collaborative effort where healthcare and computer science researchers worked together with the aim of facilitating studies about alcohol consumption in adolescent population combining traditional alcohol consumption habit measures with metrics and tools from social network analysis. The objective was to cover every step of the process, from the design and creation of the questionnaires used to gather data to the presentation of the resulting sets of information, hiding the cumbersome calculations used for scoring the different tests and the complex concepts and algorithms used by SNA techniques.
The initial trigger for this collaborative work was the set of difficulties found by the healthcare team when facing a study in which a complex questionnaire was created from scratch, consisting of 252 questions for gathering data about alcohol consumption and friendship relationships in different secondary schools. The questionnaire was initially created with a generic online tool. It was later filled in by a total of 214 students from 9 classrooms across 3 different secondary schools. A total of 145520 questions were answered and stored in a plain spreadsheet table.
Some of the difficulties that were found during the process were as follows.
- Existing validated tests like AUDIT, FAS II, and others had to be introduced one question at a time when creating the questionnaire. This led to mistakes or omissions that had to be detected and solved in a time-consuming review process; even after this process, some of the questions still contained misspellings when the questionnaire was published.
- Questions created for gathering a number of relationships (friendship, consumption companions, family relationships, etc.) had to be constructed by introducing the list of involved students again and again; even with some copy-paste tricks, the process was quite tedious and time consuming. Moreover, if a mistake was made in the name of a student, or if a student had to be removed from the questionnaire (or a new one added) once the list had already been introduced, all the questions had to be carefully edited.
- Once one questionnaire was completed, there was no way to use a part of it in another, new one; the only solution was to copy and paste the whole questionnaire and edit the copy.
- Once the questionnaires were filled in, researchers had to spend about 30 minutes on each respondent in order to obtain the scores from their responses. Moreover, the data had to be processed by people who know how each of the questions or validated tests is scored.
- Social network analysis tools like Gephi [
- The final set of processed data consisted of isolated pieces of information, that is, graphs displaying social relationships and spreadsheets containing scores from the questionnaire. There was no easy way to navigate the results or to query for a given set of individuals with a given score. It was very difficult to obtain a good insight into the situation of the studied individuals and groups regarding their alcohol consumption situation and behavior.
It was clear that if the questionnaire was to be presented again to the same set of students in order to study the evolution of their consumption or friendships over time, the results from one of the experiments were going to be very difficult to compare with the others.
The application developed during this research helps to solve the aforementioned problems. The development was time consuming, but the advantages are clear, especially since the system can be used by many studies from now on:
- Validated tests are now stored in the platform, with the set of questions and their corresponding response values, along with the formulae used to obtain the final score. Other questions that were created from scratch can also be reused for future questionnaires.
- Questions aimed at obtaining social relationships can now be fed with individual data by combining a listing of the involved individuals with the given question. If an individual is added, removed, or edited, the whole set of questions is automatically updated.
- Once the questionnaire is filled in, scores are calculated and social network analysis techniques are applied immediately and without manual error, because the formulae and methods used are stored in the application.
- Having integration in mind from the beginning allowed building an application where all the information is interrelated. Graphs showing social networks can be used to navigate to the information of an individual, including his/her scores and personal data, just by clicking on the node icon representing that individual.
- Results from several applications of the same questionnaire can be easily shown in the application by means of tabs representing the responses obtained at different times.
The application has been used for research purposes, studying a number of secondary school communities and trying to find the relationship between alcohol consumption habits and social metrics from friendship and family networks. The tool helped to find the main social network parameters for the different groups and individuals involved in the study, revealing a strong connection between alcohol consumption habits and group formation (by selection processes) or leadership.
From the point of view of survey design and creation, the team in charge of building questionnaires evaluated the tool as very useful and user friendly compared with the manual work that they had to perform previously.
Computing and Information Science are very important tools for today’s healthcare studies and applications. The complexity of the techniques and tools used in these studies and the increase in the amount of data that can be obtained from individuals and groups make it necessary to use automated processes for the gathering, manipulation, analysis, and visualization of the information.
Much of the data that can be obtained from an individual comes from online surveys that are designed each time a study is to be carried out. This leads to time-consuming and error-prone processes that can easily be automated. In this sense, collaboration among researchers in healthcare, knowledge representation, and computing science is crucial to building the tools needed to avoid this situation.
Complex tools like the one presented in this paper make it possible for healthcare professionals or researchers to run studies without being experts in the techniques underlying the automated processes that the application runs internally (e.g., knowing how to score the different tests or how social network analysis is carried out).
The use of tools like the one described in this paper helps to focus on the goals of the studies and not on data gathering or manipulation that can be easily automated. Information processing and visualization are also greatly improved if the application is properly designed to display the data in an integrated, visual, and flexible user interface.
As a future line of work, new functionalities that automatically provide insight into the situation and changes in the relationships of the same set of individuals at different points in time would be a good enhancement for the tool, as they would improve the usefulness of the application for research purposes. A study of how this tool may help in real scenarios is also planned as future work; the tool will be presented to a number of healthcare and education professionals in order to explore and test the possible applications and benefits of the system, obtaining valuable feedback that can be used to enrich it.
The authors declare that there are no conflicts of interest regarding the publication of this paper.