COVID-19 Information on YouTube: Analysis of Quality and Reliability of Videos in Eleven Widely Spoken Languages across Africa

Introduction. Whilst the coronavirus disease 2019 (COVID-19) vaccination rollout is well underway, there is concern in Africa, where less than 2% of global vaccinations have occurred. In the absence of herd immunity, health promotion remains essential. YouTube has been widely utilised as a source of medical information in previous outbreaks and pandemics. There are limited data on COVID-19 information in YouTube videos, especially in languages widely spoken in Africa. This study investigated the quality and reliability of such videos.
Methods. Medical information related to COVID-19 was analysed in 11 languages (English, isiZulu, isiXhosa, Afrikaans, Nigerian Pidgin, Hausa, Twi, Arabic, Amharic, French, and Swahili). Cohen's Kappa was used to measure inter-rater reliability. A total of 562 videos were analysed. Viewer interaction metrics, video characteristics, source, and content type were collected. Quality was evaluated using the Medical Information Content Index (MICI) scale and reliability was evaluated using the modified DISCERN tool.
Results. Inter-rater agreement (Kappa coefficient) was statistically significant for all languages (p < 0.01). Informative videos (471/562, 83.8%) accounted for the majority, whilst misleading videos (12/562, 2.13%) were minimal. Independent users (246/562, 43.8%) were the predominant source type. Information on transmission (477/562 videos, 84.9%) was most prevalent, whilst content covering screening or testing was reported in less than a third of all videos. The mean total MICI score was 5.75/25 (SD 4.25) and the mean total DISCERN score was 3.01/5 (SD 1.11).
Conclusion. YouTube is an invaluable, easily accessible resource for information dissemination during health emergencies. Misleading videos are often a concern; however, our study found a negligible proportion. Whilst most videos were fairly reliable, the quality of videos was poor, with a notable dearth of information covering screening or testing. Governments, academic institutions, and healthcare workers must harness the capability of digital platforms, such as YouTube, to contain the spread of misinformation.

At the onset of the pandemic, the African response of instituting swift lockdowns was applauded [3]. However, the WHO has cautioned that the actual number of COVID-19 cases in Africa is far higher than reported, owing to limited testing. This was despite the Partnership to Accelerate COVID-19 Testing (PACT) initiative by the African Union Commission and the Africa Centers for Disease Control (Africa CDC), which aimed at improving testing capacity [4]. Officially, 8.5 million cases were reported for the African continent, though the WHO found that only 14.2% of cases were detected [5]. Furthermore, according to a WHO report in October 2021, 70 million tests had been conducted, a fraction of the continent's population of 1.3 billion [5].
The epidemiological situation is further compounded by a poor vaccination plan. According to a recent analysis by the WHO, Africa scored 33% readiness for the rollout of the COVID-19 vaccine, far below the required 80% benchmark [6]. Furthermore, vaccine inequity and nationalism have impeded vaccinations in Africa, with only 153.95 million doses administered compared with the global total of 10.34 billion as of 11 February 2022 [7]. This translates to less than 2% of global vaccinations occurring in Africa. Adjusted for population, Africa has received only 48.31 doses per 100 people, whilst the global figure is 162.58 per 100. This is a concern, as unvaccinated people have a higher risk of severe disease and hospital admission. Moreover, a recent study found that mortality amongst critically ill patients is higher in Africa than on other continents; this is attributed to a lack of healthcare resources and to comorbidities, including HIV/AIDS and other chronic diseases [8].
The Internet is not only a key source of information for the public but also a catalyst of misinformation, owing to its capability of spreading information widely and rapidly [9]. According to Internet traffic estimates, as of December 2021, YouTube (a freely available, easy-to-use Internet video-sharing platform with more than 2 billion users) was the second most popular website globally, with 1 billion hours of video watched daily [10,11]. YouTube has been widely utilised as a source of medical information in previous outbreaks and pandemics [12][13][14]. Although the information found on this platform is questionable because of personal opinions, anecdotes, blind authorship, and a lack of credible sources, it is widely popular and trusted by viewers [9,15].
A systematic review by Osman et al. of 202 articles assessing 22,300 health-related YouTube videos found that about 41% of health-related content, objectively assessed using standard scoring systems across a wide range of medical specialties, was reported to be inaccurate or "not useful", while only about 19% of such content was categorized as useful, with more than half reflecting commercial interests [16]. The same study also revealed that 44% of articles highlighted the role of verified health sources in promoting reliable information on YouTube, with 11% discouraging the use of YouTube as a platform for promoting actionable health-related content.
Google Trends, on 21 January 2022, indicated that the most searched topic on YouTube for the past 12 months, worldwide, was the coronavirus pandemic [17]. There have been a few studies evaluating the quality and reliability of COVID-19 information on YouTube. Such studies report information as being suboptimal or poor, with over 25% of videos being misleading [18][19][20]. There remains a paucity of studies focusing on languages in Africa. In this study, the authors aim to evaluate the usefulness of YouTube as a source of COVID-19 information in selected languages widely spoken across Africa, by characterizing the quality, reliability, viewer interaction metrics, source, type, and content of videos.

Methods
The methods used to carry out this work are explained in the following sections.
2.1. Languages.
Videos in eleven languages (English, isiZulu, isiXhosa, Afrikaans, Nigerian Pidgin, Hausa, Twi, Arabic, Amharic, French, and Swahili), which are widely spoken or representative of all regions of Africa, were evaluated.

Video Collection.
YouTube was accessed on 5 September 2020, using a new account with default settings, after clearing the cache of the Internet browser to minimize bias caused by cookies, personal preferences, and browser history. The search terms used included "coronavirus" and "COVID-19", each followed by the name of the language as a specifier (Figure 1).
The first 50 relevant results were screened, and relevant videos that contained COVID-19 medical or epidemiological information were included.
Videos that concerned nonmedical information (that is, socioeconomic impacts of lockdown or politics), were in another language, had no audio, or duplicated previous search results were excluded.
After the initial 50 videos were screened, this process was repeated with the second and third search terms for a total of 50 videos for each language (with the exception of English, for which 75 videos were included owing to the higher volume of videos available). Figure 1 indicates the data collection flow. Cumulatively, 562 videos were marked for evaluation.
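The screening steps described above can be illustrated with a minimal sketch. The `SearchResult` record, its field names, and the `screen_results` helper are hypothetical names introduced here for illustration; the study describes manual screening and does not report any code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical record for one search hit; field names are assumptions."""
    video_id: str       # e.g. an identifier taken from the video URL
    language: str       # language spoken in the video
    has_audio: bool
    is_medical: bool    # contains COVID-19 medical or epidemiological information


def screen_results(results: List[SearchResult],
                   target_language: str,
                   already_included: Iterable[str] = (),
                   limit: int = 50) -> List[SearchResult]:
    """Apply the stated exclusion criteria to the first `limit` results of one search."""
    seen = set(already_included)   # videos kept from earlier search terms
    included = []
    for result in results[:limit]:
        if not result.is_medical:
            continue               # nonmedical content (e.g. politics, lockdown economics)
        if result.language != target_language:
            continue               # video is in another language
        if not result.has_audio:
            continue               # video without audio
        if result.video_id in seen:
            continue               # duplicate of an earlier result
        seen.add(result.video_id)
        included.append(result)
    return included


example_results = [
    SearchResult("a", "Hausa", True, True),
    SearchResult("b", "English", True, True),   # wrong language
    SearchResult("c", "Hausa", False, True),    # no audio
    SearchResult("d", "Hausa", True, False),    # nonmedical
    SearchResult("a", "Hausa", True, True),     # duplicate
]
kept = screen_results(example_results, "Hausa")
print([r.video_id for r in kept])
```

Passing the identifiers kept from an earlier search term as `already_included` mirrors the exclusion of videos duplicated from previous search results.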
The final videos were saved to a playlist for evaluation, as search results on YouTube can change on a daily basis, and the Uniform Resource Locator (URL) of each video was captured [21]. This methodology is in accordance with previous studies [22][23][24]. Video characteristics such as the duration of the video (using the "mm:ss" format for minutes and seconds), upload date, interaction metrics (number of views, likes, dislikes, comments, and subscribers), video quality, and hashtags were all recorded at the time of video collection.
Global Health, Epidemiology and Genomics

Evaluation.
Videos in each language were scored by two independent reviewers who had a medical background and were proficient in the respective language. The Kappa coefficient (K) was used to determine the degree of agreement between the two reviewers for each language.
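For readers unfamiliar with the statistic, the following is a minimal sketch of how Cohen's Kappa can be computed for two raters. It assumes the raters assign categorical labels (here, content types) to the same videos; the study does not state which ratings the coefficient was computed on, and the example labels are invented for illustration only.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's Kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must score the same non-empty set of videos")
    n = len(rater_a)
    # Proportion of videos on which the two raters gave the same label
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Illustrative labels only; these are not data from the study.
rater_1 = ["informative", "informative", "misleading", "news", "informative", "personal"]
rater_2 = ["informative", "informative", "misleading", "informative", "informative", "personal"]
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

A value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance.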
The reliability and quality of each video were assessed. Reliability was evaluated using a modified DISCERN tool [22,25]. This tool comprises five questions assessing clarity, reliability of the source, lack of bias, supplementary references, and mention of uncertainty; each has a "Yes" or "No" response (Supplementary Table 1). "Yes" indicates good reliability and scores 1 point, whilst "No" indicates poor reliability and scores 0 points. Consequently, each video can obtain a cumulative score between 0 (lowest reliability) and 5 (highest reliability).
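The scoring rule above can be sketched as follows. The item keys are shorthand chosen here for the five questions and are not the wording of the tool itself.

```python
from typing import Mapping

# Shorthand keys for the five modified DISCERN questions (names are assumptions).
DISCERN_ITEMS = ("clarity", "reliable_source", "lack_of_bias", "references", "uncertainty")


def discern_score(answers: Mapping[str, bool]) -> int:
    """Sum one point per 'Yes' answer, giving a total between 0 and 5."""
    missing = [item for item in DISCERN_ITEMS if item not in answers]
    if missing:
        raise ValueError(f"Unanswered DISCERN items: {missing}")
    return sum(1 for item in DISCERN_ITEMS if answers[item])


# Illustrative answers only: clear, sourced, and unbiased, but no references or uncertainty.
example_answers = {
    "clarity": True,
    "reliable_source": True,
    "lack_of_bias": True,
    "references": False,
    "uncertainty": False,
}
print(discern_score(example_answers))
```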
Quality was assessed using the Medical Information Content Index (MICI) scale. This tool was previously devised for a study on the Ebola epidemic [21]. The MICI is a 5-point Likert scale assessing five components of medical information: prevalence, transmission, clinical symptoms, screening/testing, and treatment/outcomes of disease. A maximum score of 25 was possible. Each component was graded utilising criteria adapted from a similar study evaluating the quality of COVID-19 information on YouTube [18]. Guidelines from the WHO and CDC were used for developing the criteria for the five components of the MICI scale (Supplementary Table 2).
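As a minimal sketch of how a total MICI score is assembled, the function below sums the five component scores. It assumes that each component is graded from 0 to 5 and that a component not covered in a video scores 0; the actual grading criteria are those in Supplementary Table 2, and the component keys are shorthand introduced here.

```python
from typing import Mapping

# Shorthand keys for the five MICI components (names are assumptions).
MICI_COMPONENTS = (
    "prevalence",
    "transmission",
    "clinical_symptoms",
    "screening_testing",
    "treatment_outcomes",
)
MAX_COMPONENT_SCORE = 5  # five components at up to 5 points each give the maximum of 25


def mici_total(scores: Mapping[str, int]) -> int:
    """Sum the component scores; components absent from `scores` count as 0 (assumption)."""
    total = 0
    for component in MICI_COMPONENTS:
        value = scores.get(component, 0)
        if not 0 <= value <= MAX_COMPONENT_SCORE:
            raise ValueError(f"{component} score {value} is outside 0-{MAX_COMPONENT_SCORE}")
        total += value
    return total


# Illustrative video covering only transmission and clinical symptoms.
print(mici_total({"transmission": 3, "clinical_symptoms": 2}))
```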
The sources of videos were classified into one of the following groups: independent users, government/national agencies, news agencies, academic institutions/hospitals, and medical advertisements/for-profit companies. The type of content was classified into four distinct groups: informative (factual information regarding prevention, screening, signs and symptoms, testing, treatment, and epidemiology), misleading (scientifically inaccurate information or content that is not evidence-based), personal (based on an individual's experience or that of a friend or family member), and news updates (content concerning statistical updates on cases, mortality, and recovery in the absence of information on symptoms, prevention, or management of COVID-19). Furthermore, any incorrect or unscientific claims were documented.

Data Analysis.
The like-dislike ratio was calculated by dividing the number of likes by the number of dislikes. Views per day were calculated by dividing the total number of views by the number of days since the video was uploaded. Descriptive statistics were reported as means, standard deviations, medians, and interquartile ranges. Categorical data were analysed using chi-square tests and continuous data using ANOVA. A p value of less than 0.05 was considered significant. Data were analysed using R Studio (version 3.6.3, Vienna, Austria).
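The two derived interaction metrics and the descriptive summaries can be sketched as below. This is an illustration in Python rather than the R code used in the study, which is not reported. How videos with zero dislikes or same-day uploads were handled is not stated, so the choices marked in the comments are assumptions, and the quartile method of Python's `statistics.quantiles` may differ from the R default.

```python
from datetime import date
from statistics import mean, median, quantiles, stdev
from typing import Optional, Sequence


def like_dislike_ratio(likes: int, dislikes: int) -> Optional[float]:
    """Likes divided by dislikes; undefined (None) when there are no dislikes (assumption)."""
    if dislikes == 0:
        return None
    return likes / dislikes


def views_per_day(views: int, upload_date: date, collection_date: date) -> float:
    """Total views divided by the number of days since upload."""
    days_online = max((collection_date - upload_date).days, 1)  # assumption: at least one day
    return views / days_online


def describe(values: Sequence[float]) -> dict:
    """Mean, standard deviation, median, and interquartile range of a sample."""
    q1, _, q3 = quantiles(values, n=4)
    return {"mean": mean(values), "sd": stdev(values), "median": median(values), "iqr": (q1, q3)}


# Illustrative numbers only; these are not data from the study.
print(like_dislike_ratio(90, 10))
print(views_per_day(1000, date(2020, 8, 26), date(2020, 9, 5)))
print(describe([10, 20, 30, 40, 50]))
```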

Results
The Kappa coefficient of agreement between the researchers for all eleven languages was statistically significant (p < 0.001; Table 1).
The majority of videos were classified as informative (471/562, 83.8%), with over 229 million views, while misleading videos constituted 12/562 (2.13%) of total videos, with over 75,000 views in total. News updates had the highest median number of views at 88,837 (7,669-493,260), while informative videos had the lowest at 222. Furthermore, misleading videos had the highest like:dislike ratio at 49.15 (32.5-66.9), while news updates had the lowest at 8.95 (5.5-13.6).

MICI: Quality.
Overall, information on the transmission of COVID-19 was most prevalent (477/562 videos, 84.9%), whilst content covering screening or testing was least reported, with less than a third of videos containing such information (Table 4 and Supplementary Table 2). The mean total MICI score was 5.75/25 (SD 4.35). The highest total MICI scores were attained by Amharic and Hausa (13.1/25 and 10.68/25, respectively), whilst Nigerian Pidgin, Twi, and Amharic had total scores below 4/25 and, consequently, the lowest scores of all languages analysed. Differences in MICI scores by language were statistically significant (p < 0.001).
Informative videos had the highest total MICI score (6.1/25) and, consequently, the highest score for all MICI indicators, aside from screening/testing, where personal experience and misleading videos reported the highest score. Personal experience videos had the lowest total MICI score (2.65/25).
Government videos had the highest score for prevalence and transmission (tied with medical adverts), and the highest proportion of information pertaining to transmission and screening/testing. Academic institutions/hospitals reported the highest scores for clinical symptoms, screening/testing, and treatment/outcomes, but the lowest score for prevalence. News agencies reported the lowest score for transmission and the lowest proportion of videos covering transmission and clinical symptoms. Stratified by source, the highest total scores were attained by government (6.3 ± 4.59) and academic institutions/hospitals (6.09 ± 3.84), whilst news agencies had the lowest (4.94 ± 3.96).

DISCERN: Reliability.
The mean total DISCERN score was 3.01 ± 1.11 out of a possible five (Table 5 and Supplementary Table 2). The highest DISCERN item scores overall were reported for Item 1 (whether the video's aims are clear and achieved), 0.97 ± 0.18, and Item 3 (whether the information is balanced and unbiased), 0.91 ± 0.28, and the lowest for Item 5 (whether any areas of uncertainty were mentioned), 0.25 ± 0.44. Differences in all DISCERN items were statistically significant across languages, while differences in DISCERN Items 1-3 were statistically significant across content types, and differences in all DISCERN items aside from Item 5 were statistically significant across sources.
Across content types, news updates had the highest DISCERN scores for all items aside from Item 1, where informative videos had the highest score. Misleading videos had the lowest score for most DISCERN items aside from Item 2 (a reliable source of information used), where personal experience videos had the lowest. Misleading and personal experience videos had the lowest score for Item 3. Informative and misleading videos had the lowest score for Item 5. The highest total score across content types was achieved by news updates, whilst misleading videos had the lowest.
Across sources, the highest total DISCERN score was obtained by videos from academic institutions/hospitals (3.46 ± 0.88), whilst the lowest was from independent users (2.73 ± 1.18). Items 1 and 3 had full or close to full scores across the sources.

Discussion
Videos in English recorded the highest number of views, followed by Arabic and French. This is not surprising, as these languages are among the most widely spoken languages in the world [26]. A previous study on YouTube as a source of information on COVID-19 showed that English-language videos had the second highest number of views, after Hindi-language videos [27].
The majority (83.8%) of the videos studied were informative, with misleading videos constituting only 2%. This was encouraging, as YouTube has become a major means of gaining information during pandemics [12,13]. The sharing of misleading information on YouTube about the COVID-19 pandemic has been a major challenge, as reported in a number of studies. Li et al. reported that over a quarter of the viewed content contained misleading information [20]. Khatri et al. [18] and the authors of [27] reported 10% and 8% of misleading videos, respectively, in their studies. The lower percentage seen in this study could potentially be accounted for by its timing, as it was conducted later than the previous studies, providing the opportunity for more robust measures to curtail the infodemic. Gallotti et al. noted that social media information improved with an increase in infections [28].
Despite their modest number, misleading videos in our study had the highest like:dislike ratio, whilst videos from government sources had the lowest. Misleading videos were proportionally liked five times more than their government counterparts. This phenomenon may be attributed to the sensational nature of misleading videos, which may encourage individuals to like and share such content. Furthermore, the lower digital support from users for government videos suggests a lack of confidence and trust in society. This is supported by a report indicating that most African citizens did not trust their governments to provide accurate information about case counts and mortality during this pandemic [29].
Our study revealed that independent users were the predominant source of information on the pandemic. These findings are supported by a previous study that evaluated YouTube as a source of information on the West Nile virus [14]. This is expected, as YouTube is content-driven and content creators on YouTube are paid for views [30]. However, YouTube has been identified as a major source of health information with a broad viewership. In view of this, professional bodies and healthcare organizations have been implored to embrace this technology for the sharing of medical information [18,20,31,32]. IsiXhosa was the only language in which the largest share (39%) of content came from academic institutions or hospitals. This may be incidental, as no studies are available to corroborate this finding.
Overall, across all languages, content on transmission was the most reported (84.9%). This finding did not differ from that of Khatri et al. [18]. Content on screening or testing was the least reported, and even when this category of information was present, it was of poor quality, as indicated by low MICI scores. This was corroborated by [27] and by Ataç et al. [33]. With the advent of variants such as those identified in South Africa, low-quality information on the technical aspects of the pandemic is worrying [34]. Overall, videos scored poorly for the quality of treatment information present. Brandi Ramos et al. noted that although videos on possible treatments for COVID-19 had problems with quality, they had high viewer rates [35]. This further underscores the grave challenge posed by sensationalism. It is particularly disturbing given the promotion of unproven treatments, noting earlier reports of chloroquine poisoning in Nigeria following attempts to use it as a treatment [35]. Furthermore, the falsehood of ivermectin as a cure continues to be perpetuated, as does vaccine misinformation driving hesitancy, despite the evidence.
Government/national agencies and academic institutions/hospitals had the highest and second-highest MICI scores, respectively. Nagpal et al. opined that reliable medical information with higher MICI scores could be provided by academic centres, government, and news agencies, and Singh et al. recommended that these institutions utilise YouTube as a platform for the dissemination of medical information [31,36]. In our study, videos from these sources scored the highest among the various sources. However, the absolute MICI scores were still poor. This further emphasizes the potential for improvement in the quality of the content of videos produced by these sources. Videos from news agencies had the lowest overall MICI scores, whereas Andika et al. reported that news agencies had greater odds of uploading useful videos compared to other sources [36]. Our finding suggests that news agencies need to improve the quality of the information shared; in particular, information pertaining to treatment/outcomes is lacking.
The overall DISCERN score of 3.01 ± 1.11 suggests that most videos across all languages were fairly reliable. Similar findings were reported by Khatri et al. [18]. The presence of videos from academic institutions and hospitals was encouraging and in contrast to the findings of Nagpal et al., where these institutions did not contribute to uploading videos regarding the Ebola epidemic at the time [32]. In the current pandemic, it has been demonstrated that videos by healthcare professionals are more reliable than those of independent users [37,38]. Szmuda et al. also noted that videos uploaded by physicians were the most reliable [39].

4.1. Recommendations.
Several approaches have been implemented internationally and regionally in Africa to curb the infodemic in the setting of the COVID-19 pandemic. Governments of various countries held press briefings to give accurate COVID-19 information and combat fake news [40]. In Ghana, the Ministry of Information regularly issued flyers and press briefings to educate and inform the public about the COVID-19 situation in the country [41]. South Africa has utilised traditional and social media to frequently dispel myths and encourage vaccination uptake. Government and nonprofit organizations in countries such as Benin, Nigeria, and Sierra Leone have used platforms such as WhatsApp to disseminate information about the pandemic [42]. However, response and adherence to public health interventions in Africa are usually marred by mistrust in governments alongside their perceived poorly performing health systems [43,44]. This makes content on mainstream social media, YouTube in particular, which is largely contributed by independent users, a generally more acceptable source of information to the public.
Several measures should be taken to improve the quality and reliability of the information on YouTube. Firstly, public health agencies should collaborate with various YouTube content producers, particularly influential independent users [20]. This will not only minimize the spread of misinformation but also attract viewership from a wider audience. Secondly, a qualified organization should be established and given the mandate to vet health-related videos before they are uploaded to YouTube [45]. Such an organization would help flag both poor-quality videos and those with inaccurate information. Lastly, similar to the peer review process for journal articles, health-related YouTube videos should be subjected to rapid review by health experts [46]. The health experts would ascertain the validity and reliability of the information contained in those videos, thus ensuring high-quality, evidence-based information and preventing misinformation from being viewed and shared by consumers.

Strengths and Limitations.
This study has several strengths and limitations. It is the first study to date to investigate the quality and reliability of YouTube videos in a multitude of languages widely spoken across the African continent. The sample size, a cumulative total of 562 videos, is larger than that of other similar studies.
The African continent is home to thousands of indigenous languages. Despite our attempt at selecting some of the widely spoken languages representative of various regions in Africa, we could not accommodate several other languages with significant geographic and demographic representation across Africa. Qualitative studies with thematic analysis and behavioral studies are untapped territories that may provide a complementary view with regard to YouTube's role in health promotion, particularly in Africa.

Conclusion
YouTube is an invaluable, easily accessible resource for information dissemination during health emergencies, as it has a broader and faster mainstream reach than on-site community sensitization by appropriate health authorities. However, a concerning number of misleading videos abound on this platform; consequently, before people access quality and reliable information during health emergencies, a vast majority may already be misinformed, compromising adherence to proven public health measures. Fortunately, in our study, misleading videos accounted for a minor proportion. Although the YouTube videos assessed showed favorable levels of cumulative reliability, we found poor quality, especially with regard to information pertaining to screening, testing, and treatment. Academic medical centers, governments, and news agencies should improve the quality of their video content in these areas.
Various strategies need to be implemented to ensure that a high caliber of medical information is available online to the public. Ultimately, this is the shared responsibility of the public, government, and YouTube itself. We therefore implore all stakeholders to join the concerted global efforts in health promotion and in harnessing digital media, during this pandemic and beyond, in order to contain the spread of misinformation and achieve pandemic control [47].

Data Availability
The dataset used to support the findings of this study can be obtained from the corresponding author upon reasonable request.

Disclosure
An earlier version of the manuscript was presented as a preprint entitled "COVID-19 information on YouTube: analysis of quality and reliability of videos in eleven widely spoken languages across Africa" [46].