Genomic Epidemiology of African Swine Fever Virus Identified in Domestic Pig Farms in South Korea during 2019–2021



Introduction
African swine fever (ASF) is a highly contagious viral disease that affects domestic pigs and wild boars and causes severe economic losses and trade disruptions in the swine industry worldwide. The disease is caused by the ASF virus (ASFV), a large double-stranded DNA virus belonging to the Asfarviridae family. ASFV causes a wide range of clinical signs and high mortality rates in domestic pigs, with acute forms characterized by high fever, depression, and hemorrhage [1]. ASFV genotyping is based on the analysis of specific genetic markers such as the major capsid protein p72 encoded by the gene B646L and the central variable region (CVR) within B602L [2][3][4][5][6]. ASFV strains have been classified into 24 genotypes, of which only genotypes I and II have been found outside Africa [4,7,8].
Genotype I ASFV strains were first reported in Portugal in 1957 and were subsequently detected in several countries, including Spain, France, Italy, Brazil, the Dominican Republic, and Haiti, in the 1960s and 1970s [9,10]. ASF caused by genotype I strains has since been eradicated from all of these countries except Sardinia, Italy, where it remained endemic [11]; in 2021, China reported genotype I ASFV isolates for the first time in Asia [12]. Genotype II strains were introduced into Georgia in 2007, which marked their first outbreak outside Africa, and had spread throughout the Trans-Caucasian region and into Europe by 2014 [13,14]. In Asia, China reported its first ASF outbreak on a pig farm in 2018, followed by many other countries, including Mongolia, Vietnam, Cambodia, North Korea, Laos, the Philippines, Myanmar, Timor-Leste, Indonesia, and India. Since 2007, most ASFVs reported in European and Asian countries have been genotype II [15][16][17][18][19][20][21][22][23][24][25][26][27].
In South Korea, ASF was first detected in 2019 on a pig farm in Paju, Gyeonggi Province. By the end of 2021, 21 outbreaks had been reported in Korea [28]. In a previous study that analyzed 12 gene markers using partial sequencing [28], all 21 ASFV strains isolated from affected pig farms were found to belong to p72 genotype II, serogroup 8, with intergenic region (IGR) I73R-I329L II and CVR 1. Notably, no tandem repeat sequence (TRS) insertions were detected in the IGR A179L-A137R and IGR MGF 505 9R/10R, and no variations were observed in the O174L, K145R, MGF 505-5R, CP204L, or Bt/Sj regions among the 21 Korean isolates. In addition, the analyzed genes of these isolates were identical to those of Georgia 2007/1, the Chinese strains Pig/HLJ/2018 and China/2018/AnhuiXCGQ, and the Vietnamese strain ASFV_NgeAn_2019. However, further analysis revealed that X69R, located in the J268L region of the 18th isolate (Korea/Pig/Goseong/2021), had a single tyrosine (Y) insertion at position 209. The previous study [28] concluded that this finding implies slight variation among the ASFVs circulating in South Korea from 2019 to 2021 and that the source of the virus on the 18th infected farm (Korea/Pig/Goseong/2021) differed from those of the other 20 pig farms. However, the detailed epidemiology of ASFV outbreaks in South Korea has not yet been fully determined owing to the low resolution of the available genomic data.
Recently, next-generation sequencing (NGS) methods have been applied for whole-genome sequencing of viruses, including ASFV [29][30][31][32]. To further understand the epidemiology, transmission, and evolution of ASFV in South Korea, we conducted whole-genome sequencing of 21 strains isolated between 2019 and 2021 using NGS methods. We analyzed genetic polymorphisms among Korean ASFVs and conducted a spatiotemporal transmission analysis using a time-scaled phylogenetic tree.

Materials and Methods
2.1. Samples. A total of 21 outbreaks were identified in South Korea between 2019 and 2021 (14 in 2019, 2 in 2020, and 5 in 2021). Three outbreaks each in 2019 and 2021 were detected through nationwide monitoring of pig farms, which was initiated following the initial ASF outbreak in September 2019. In addition, two positive pigs were detected during testing at a slaughterhouse in 2020, leading to the tracing and confirmation of their farm of origin and a neighboring farm. The remaining 13 outbreaks were confirmed based on notifications from pig farmers reporting sick or dead pigs. Samples (blood or spleen) from all 21 outbreaks were collected for further analysis. Real-time PCR confirmed the presence of ASFV (Table 1).

2.2. DNA Extraction and Real-Time PCR for ASFV Detection.
Viral DNA was extracted from the samples using the Maxwell® RSC Total Nucleic Acid Kit and Maxwell RSC Whole Blood DNA Extraction Kit (Promega, Madison, WI, USA) according to the manufacturer's instructions. The extracted DNA was stored at −20°C until analysis. Real-time PCR targeting the B646L gene encoding p72 for the detection of ASFV genomic DNA was performed using a Bio-Rad CFX-96 system (Bio-Rad, Hercules, CA, USA) as described in the World Organisation for Animal Health (WOAH) Manual [33,34].

Transboundary and Emerging Diseases

2.3. Whole-Genome Sequencing. DNA sequencing libraries for Illumina MiniSeq (Illumina, San Diego, CA, USA) were prepared using the Illumina Nextera XT DNA Library Preparation Kit and the Nextera XT Index Kit v2 Set A (Illumina) according to the manufacturer's instructions. For target enrichment, a library was prepared using an Enzymatic Preparation Kit (Celemics, Seoul, Republic of Korea). The prepared genomic DNA library was hybridized with capture probes using the Celemics Target Enrichment Kit (Celemics). The capture probes were chemically synthesized to hybridize with the target region, and the captured regions were amplified by post-capture PCR to enrich the captured DNA. Before sequencing, library quality was assessed using a BioAnalyzer 2100 (Agilent, Santa Clara, CA, USA) with an Agilent Bioanalyzer DNA High Sensitivity kit (Agilent), and libraries were quantified using a dsDNA High Sensitivity Assay kit (Thermo Fisher Scientific) and a Qubit 2.0 Fluorometer (Thermo Fisher Scientific). MiniSeq sequencing was conducted in 150 bp paired-end mode using the MiniSeq High Output Reagent kit (300-cycle) (Illumina) according to the manufacturer's instructions.

Data Analysis.
Adapter sequences and low-quality sequencing reads with a quality score below 70 were trimmed using the BBDuk v38.84 program. Taxonomic classification with the KRAKEN2 program (https://ccb.jhu.edu/software/kraken2/) was used to determine the percentage of the remaining reads derived from the ASFV genome. The trimmed sequencing reads were then assembled by reference mapping against ASFV Georgia 2007/1 (GenBank Accession number NC_044959) using Geneious Prime software (https://www.geneious.com/). To minimize erroneous mappings, a maximum of 10% mismatch was allowed during reference mapping. Consensus sequences were generated considering only sites with coverage depths >20. NGS and assembly data are summarized in Table 2.
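The coverage rule used for consensus generation can be illustrated with a minimal Python sketch. It assumes a simplified pileup (one string of observed bases per reference position) and a majority-rule base call. Geneious Prime's actual consensus settings are not described here, so the function name `consensus_from_pileup`, the majority rule, and the `N` mask are illustrative assumptions; only the depth threshold (>20) comes from the text.

```python
from collections import Counter

MIN_DEPTH = 20  # from the text: only sites with coverage depth > 20 are used


def consensus_from_pileup(pileup, min_depth=MIN_DEPTH, mask="N"):
    """Build a consensus string from a simplified pileup.

    pileup: list of strings, one per reference position, each holding
            the bases observed in the reads covering that position.
    Positions whose depth is not strictly greater than min_depth are
    masked (assumption: masked with 'N').
    """
    consensus = []
    for bases in pileup:
        if len(bases) <= min_depth:
            consensus.append(mask)
            continue
        # Assumption: simple majority rule for the base call.
        base, _count = Counter(bases).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


example = ["A" * 25, "C" * 20, "G" * 15 + "T" * 10, "T" * 5]
print(consensus_from_pileup(example))
```

In this toy input, the second position has a depth of exactly 20 and is therefore masked, because the rule requires a depth strictly greater than 20.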
Owing to the limitations of our short-read sequencing system, we were unable to obtain sequences for the terminal inverted repeat regions at both ends of the genome. The whole-genome sequences were deposited in GenBank (Table 1).
[29,32], because these European viruses are phylogenetically distinct from the Korean isolates. A total of 64 reference genomes, including the Georgia 2007/1 strain, were selected for phylogenetic analysis (Table S2). The 21 ASFV genomes analyzed in this study, along with the reference genomes, were aligned using the Multiple Alignment using Fast Fourier Transform (MAFFT) method and manually trimmed to the same length as Georgia 2007/1, yielding an alignment of approximately 187,420 sites, including gaps. G/C homopolymers and inverted terminal repeats, which are prone to sequencing errors, were excluded. A maximum-likelihood phylogenetic tree was constructed using RAxML v8.2.7 with the general time reversible (GTR) nucleotide substitution model. Bootstrap analysis with 500 replicates was used to assess the statistical support of the phylogenetic tree. Georgia 2007/1 was used as the root of the phylogenetic tree.
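The exclusion of G/C homopolymers from the alignment could be sketched as follows. This is only an illustration under stated assumptions: the text does not give a minimum run length, so `min_run=5` is hypothetical, and the helper names `gc_homopolymer_spans` and `drop_columns` are not part of any tool named above. Runs are located on the reference row and the same columns are removed from every aligned sequence.

```python
import re


def gc_homopolymer_spans(seq, min_run=5):
    """Return half-open (start, end) spans of G or C runs of at least
    min_run bases. min_run=5 is a hypothetical threshold."""
    pattern = re.compile(r"G{%d,}|C{%d,}" % (min_run, min_run))
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


def drop_columns(alignment, spans):
    """Remove the given column spans from every row of an alignment.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    excluded = set()
    for start, end in spans:
        excluded.update(range(start, end))
    return {
        name: "".join(b for i, b in enumerate(row) if i not in excluded)
        for name, row in alignment.items()
    }


# Toy alignment; the sequences are invented for illustration.
aln = {
    "ref": "ATGGGGGGTACCCCCAT",
    "iso": "ATGGGGG-TACCCCCAT",
}
spans = gc_homopolymer_spans(aln["ref"])
print(spans)
print(drop_columns(aln, spans))
```

Masking by column rather than per sequence keeps the alignment rectangular, which tree-building programs require.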

Variant Confirmation
In addition, a time-scaled phylogenetic tree was constructed using the BEAST v1.10.4 program [35]. An uncorrelated relaxed clock model and the GTR nucleotide substitution model with gamma-distributed rate variation (GTR + γ) were used. Four Markov chain Monte Carlo runs, each comprising 150 million steps, were run in parallel. The parameters and trees were sampled every 10,000 steps, resulting in 40,000 parameter states and posterior trees. TRACER v1.5 was used to analyze the parameters, with 10% of each result discarded as burn-in [36]. All parameters had an effective sample size greater than 200. A time-scaled maximum clade credibility tree was generated using TreeAnnotator v1.10.4 (https://beast.community/treeannotator) in BEAST and visualized using FigTree v1.4.3 (http://tree.bio.ed.ac.uk/software/figtree/). The ASFV/Korea/Pig/Inje2/2021 strain was excluded from the time-scaled phylogenetic analysis because of suspected recombination. Reference sequences that deviated from the expected mutation rate and were considered erroneous were also excluded.

Results
3.1. Whole-Genome Comparative Analysis. Comparative analysis of the 21 ASFV whole-genome sequences with the Georgia 2007/1 reference strain sequence revealed 33 mutations, including single nucleotide polymorphisms (SNPs) and insertion/deletion polymorphisms (Indels), in the 21 Korean isolates. The Inje2/2021 strain had multiple additional mutations in the MGF 505-9R gene. Of the 33 mutations identified, 17 were nonsynonymous, five were synonymous, and the remaining 11 were located in IGRs that do not code for any protein. All Korean isolates shared 11 mutations, of which six were nonsynonymous: T26425C (N329S) in MGF 360-10L, A44576G (K323E) in MGF 505-9R, T134514C (N414S) in NP419L, T170862A (I195F) in I267L, a truncation mutation (stop codon) in MGF 110-1L caused by C7059T (W197*), and a frameshift deletion causing protein truncation at position 12578 in the ASFV G ACD 00190 gene. In addition, a GAATATATAG insertion was found in the IGR between the I73R and I329L genes in all Korean isolates, indicating that all Korean isolates belonged to the IGR II genotype, based on the TRS classification of the IGR between I73R and I329L. The GAATATATAG insertion was confirmed by Sanger sequencing in a previous study [28].
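The distinction between synonymous, nonsynonymous, and truncating substitutions can be made concrete with a short Python sketch based on the standard genetic code. The codons used in the example are hypothetical: they were chosen only because they reproduce the amino acid changes reported above (K to E, N to S, W to stop) and are not taken from the ASFV genome. The function name `classify_substitution` is likewise illustrative.

```python
# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon, alt_codon):
    """Return (ref_aa, alt_aa, kind) for a codon change on the coding strand."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "truncation"  # premature stop codon
    else:
        kind = "nonsynonymous"
    return ref_aa, alt_aa, kind


# Hypothetical codons chosen to mirror the reported amino acid changes.
for ref, alt in [("AAA", "GAA"), ("AAT", "AGT"), ("TGG", "TGA"), ("CTT", "CTC")]:
    print(ref, alt, classify_substitution(ref, alt))
```

For genes encoded on the reverse strand, the genomic substitution would first need to be reverse-complemented before the codon lookup; that step is omitted here for brevity.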

Phylogenetic Analysis. A phylogenetic tree was constructed to analyze the genomic epidemiology of ASFVs in South Korea. The maximum-likelihood phylogenetic tree revealed that genotype II ASFVs formed two distinct subgroups: Asian and European. All the Korean isolates clustered within the Asian subgroup (Figure 3). Although at least two distinct clusters specific to Korea were identified in the maximum-likelihood phylogenetic tree, most nodes in the tree did not receive strong bootstrap support (<70). Therefore, the maximum-likelihood tree alone could not resolve the detailed genetic epidemiology, and further analysis was required.
In addition, a time-scaled Bayesian phylogenetic tree was generated to gain further insights into the detailed genetic epidemiology of ASFVs in South Korea. The time-scaled phylogenetic tree revealed that the Korean ASFVs were divided into at least three subgroups, with each subgroup sharing a common node supported by a high posterior probability (>0.9) (Figure 4). Notably, each ASFV subgroup exhibited a geographical pattern (Figures 4 and 5). Viruses isolated from north Gyeonggi-do (Yeoncheon, Paju) in 2019 and west Gangwon-do (Hwacheon, Hongcheon) during 2020-2021 formed a cluster in the phylogenetic tree, designated as Korean subgroup I. This cluster also included two ASFV whole-genome sequences (YC1/2019 and HC224/2020) detected in wild boars in 2019 (Yeoncheon) and 2020 (Hwacheon). The viruses isolated from west Gyeonggi-do (Gimpo, Ganghwa) clustered in Korean subgroup II. Furthermore, viruses isolated from Gangwon-do (Yeongwol, Goseong, and Inje) formed a distinct cluster, designated as Korean subgroup III. These findings suggest that at least three distinct viruses were introduced into South Korea through west and north Gyeonggi-do and east Gangwon-do. Five other isolates detected in Gyeonggi-do (Ganghwa, Paju) did not cluster with the other Korean isolates with a high posterior probability. These phylogenetic outliers indicate the possibility of multiple introductions of ASFVs into South Korea.

Discussion
In this study, we conducted a comprehensive analysis of the whole-genome sequences of 21 ASFVs isolated in South Korea between 2019 and 2021. We identified 33 mutations in the Korean isolates compared with the reference strain Georgia 2007/1. Of the 17 nonsynonymous mutations, four substitutions (T26425C, A44576G, T134514C, and T170862A), one truncation (C7059T), and one frameshift mutation (A deletion at 12578) were shared by all 21 Korean ASFVs. These mutations were also detected in ASFVs isolated from various Asian and European countries between 2007 and 2021 [29]. The A44576G (K323E) substitution in MGF 505-9R was documented in an ASFV isolate from Armenia as early as 2007. Similarly, the truncation caused by C7059T and three substitutions (A44576G, T134514C, and T170862A) were present in ASFVs detected in several countries between 2017 and 2020. These results indicate that the mutations shared by the Korean isolates occurred before the virus was introduced into South Korea.
SNPs and Indels found in Korean isolates have also been detected in isolates from other countries. The frameshift mutation at position 12578 and the T26425C substitution have been detected in multiple countries and years. The frameshift mutation at position 12578 was the more prevalent of the two and was found in ASFVs from Lithuania, Poland, China, Vietnam, Russia, and Germany between 2014 and 2021. The T26425C substitution was observed in ASFVs from China, Timor-Leste, Vietnam, Poland, and Armenia in 2018 and 2019. These findings suggest that viruses were brought into Korea on multiple occasions, rather than the mutations arising during local transmission. Furthermore, we identified a unique C insertion at position 33042/3 [...] shown any mutations in these proteins in this study. Further studies are required to determine whether these mutations, particularly protein truncations, contribute to changes in the biological characteristics of the virus.
Our study also provides further support for the existence of distinct subgroups of ASFVs in Korea, indicating multiple introductions of the virus over time. Through time-scaled phylogenetic analysis, we found that subgroups I and II were initially isolated in Yeoncheon and Gimpo, respectively, and that outlier groups that did not cluster with other Korean isolates were identified in Ganghwa and Paju. These findings suggest the continuous introduction of distinct viruses in areas near the demilitarized zone (DMZ). In subgroup III, the first outbreak was observed in Yeongwol, Gangwon-do, in 2021, a region geographically distant from the previously affected areas. Our phylogenetic analysis suggests that subgroup III viruses may have been introduced through the DMZ near the east coast. However, it is important to acknowledge the limitations of our epidemiological investigation, particularly the absence of related reference sequences, including those from North Korea and wild boars. This leaves open the possibility that these clusters result from ASFV genetic mutations in wild boars, despite the virus's low mutation rate. Hence, global collaboration on whole-genome sequencing of ASFVs from domestic pigs and wild boars is necessary to enhance our understanding of the epidemiology and transmission dynamics of the virus.
Recombination is a potential source of viral evolution, including changes in host range, virulence or pathogenesis, tissue tropism, resistance to antivirals, and the emergence of new viral diseases [38]. Previous studies have reported recombination events in ASFV, including homologous recombination, leading to genomic Indels that contribute to the genetic diversity of the virus [39], and recombination among different genotypes facilitated by the presence of recombination hotspots, resulting in the generation of diverse genetic strains [40]. In our study, we identified possible self-recombination in the MGF 505 genes of the ASFV/Korea/Pig/Inje2/2021 strain. We detected a cluster of 13 SNPs within a 52 bp segment of the MGF 505-9R gene, and the resulting sequence is identical to the corresponding sequence of MGF 505-10R. Recombination can occur through template switching by the polymerase among DNA or RNA strands that have high sequence identity [38]. Although MGF 505-9R and MGF 505-10R encode paralogous proteins, their original sequences showed an overall identity of only approximately 61.4%. However, MGF 505-9R showed 80.9% identity with MGF 505-10R in the region where the putative recombination occurred. The precise mechanism of recombination between different gene locations in viruses has not yet been fully determined. Although Inje2/2021 exhibited potential self-recombination in MGF 505-9R, there were no discernible differences in virulence in pigs compared with the first Korean ASFV, Paju1/2019 [28].
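The two quantities discussed above, pairwise sequence identity and the clustering of SNPs within a short segment, can be computed as in the following Python sketch. It is a simplified illustration, not the procedure used in this study: the helper names `percent_identity` and `densest_snp_window` are hypothetical, gap-containing columns are ignored by assumption, and the toy inputs are invented. Only the 52 bp window width is taken from the text.

```python
def percent_identity(a, b):
    """Percent identity of two aligned sequences of equal length.
    Assumption: columns containing a gap ('-') in either row are ignored."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a.upper(), b.upper())
                if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def densest_snp_window(snp_positions, window=52):
    """Return (start_position, snp_count) for the window of the given
    width (in bp) that contains the most SNPs."""
    positions = sorted(snp_positions)
    best_start, best_count = None, 0
    left = 0
    for right, pos in enumerate(positions):
        # Shrink from the left until the window spans fewer than `window` bp.
        while pos - positions[left] >= window:
            left += 1
        count = right - left + 1
        if count > best_count:
            best_start, best_count = positions[left], count
    return best_start, best_count


# Toy inputs, invented for illustration only.
print(percent_identity("ACGT-A", "ACCTTA"))
print(densest_snp_window([10, 500, 505, 510, 540, 551, 900]))
```

A dense run of SNPs inside one short window, against a background of very few differences elsewhere in the genome, is the kind of signal that prompts a check for recombination rather than independent point mutations.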
It is worth noting that although recombination events between different ASFV isolates have been reported, self-recombination of ASFV has not been previously documented [40,41]. MGF 505/530 genes, along with MGF 360 genes, are believed to play important roles in virus tropism, virulence, and suppression of the interferon response, although the precise functions of the encoded proteins are not fully understood [42]. Further studies are needed to elucidate the mechanisms underlying this recombination event and its implications for changes in biological characteristics, particularly the roles of proteins within individual viruses.
Despite the limitations of our study, primarily the lack of related reference sequences, genomic epidemiology using whole-genome sequences of ASFV provides valuable information on viral epidemiology. Therefore, continuous molecular epidemiological studies based on whole-genome sequences of ASFV are crucial for monitoring the origin of outbreaks and strengthening surveillance efforts. In this study, we present evidence for multiple possible introductions of ASFV into South Korea through the DMZ. These findings underscore the persistent challenge of repeated introduction of ASFVs into South Korea despite the elimination of previous strains through quarantine strategies. Therefore, intensive disease control measures are needed in regions such as Ganghwa-gun and Paju-si, where wild animals are likely to cross the DMZ, to prevent the introduction of new viruses.
In conclusion, our study provided valuable insights into the genetic diversity, mutations, and subgroups of ASFVs in South Korea. The identified mutations and subgroups suggest multiple introductions of ASFV strains into the country over time. These findings emphasize the need for intensified disease control measures, particularly in regions near the DMZ, to prevent the introduction of new viruses.

Data Availability
The whole-genome sequences used in this study were deposited in GenBank, and the accession numbers are provided in Table 1. Sequence alignments and results from phylogenetic analyses are available from the corresponding authors on reasonable request.

Authors' Contributions
Seong-Keun Hong collected samples and outbreak data. Da-Won Kim, Ji-Yun Kim, and Dong-Wook Lee performed genetic data curation. Da-Won Kim performed genetic analysis. Oh-Kyu Kwon and Da-Won Kim wrote the first draft of the manuscript. Jin-Ju Nah, Yeun-Hee Kim, and Hae-Eun Kang revised the draft manuscript. Jung-Hoon Kwon and Yeun-Kyung Shin revised the final manuscript. Jin-Jun Nah, Hae-Eun Kang, and Yeun-Kyung Shin obtained funding for this work. Oh-Kyu Kwon and Da-Won Kim contributed equally to this work.

FIGURE 1: Mutations in whole-genome sequences of African swine fever viruses isolated from South Korea compared with the reference strain Georgia 2007/1. The nucleotide numbers are based on the reference sequence (Georgia 2007/1). The location of gaps in the reference sequence generated by nucleotide insertions in South Korean isolates is marked with a slash (/) between two nucleotide numbers. The sequences in open reading frames (ORFs) are highlighted in red in the reference sequence, whereas those located in intergenic regions (IGRs) are shown in black. Nonsynonymous mutations are marked with filled stars. Three nucleotide insertions between 20405 and 20406 are shown in the ASFV/Korea/Pig/Goseong/2021 strain. Mutations clustered in MGF 505-9R, assumed to be the result of self-recombination in ASFV/Korea/Pig/Inje2/2021, and the GAATATATAG insertion at the I73R-I329L IGR in all Korean isolates are not included in this figure.

FIGURE 3: Maximum-likelihood phylogenetic tree of genotype II African swine fever viruses. Taxa in red font are ASFV isolates from South Korean domestic pigs, and taxa in green font are ASFV isolates from South Korean wild boars. The phylogenetic tree was constructed using the maximum-likelihood method in RAxML with 500 bootstrap replicates. Bootstrap values are shown at each node. Each node sharing identical nucleotide mutations, insertions, or deletions among Korean isolates is annotated.

FIGURE 4: Time-scaled maximum clade credibility tree of genotype II African swine fever viruses. Taxa in red font are ASFV isolates from South Korean domestic pigs, and taxa in green font are ASFV isolates from South Korean wild boars. The size and color of each node (circle) indicate the posterior probability of the corresponding clade. Korean ASFVs were divided into three groups according to the posterior probability.

TABLE 1: List of African swine fever viruses in samples analyzed in this study.

TABLE 2: Next-generation sequencing results of ASFV isolates analyzed in this study.
The number of total reads and the percentage of reads mapped to the ASFV genome.
A: Taxonomic classification of sequencing reads was assigned using the KRAKEN2 program to calculate the percentage of ASFV reads among the total reads.
B: The mean number of reads per nucleotide site.
C: The percentage of the sequenced genome compared with the whole-genome sequence.
D: The four viruses detected in 2021 (Goseong/2021, Inje1/2021, Hongcheon/2021, and Inje2/2021) were subjected to target enrichment before NGS.

FIGURE 2: Diagram illustrating self-recombination in the ASFV/Korea/Pig/Inje2/2021 strain. Thirteen mutations were identified within a 52 bp segment of the MGF 505-9R gene, and identical sequences were found in the MGF 505-10R gene. The original sequences (above) were identified in all other Korean isolates. The results indicate possible self-recombination between MGF 505-9R and MGF 505-10R in the ASFV/Korea/Pig/Inje2/2021 strain (below).