Integrative Functional Genomic Analysis in Multiplex Autism Families from Kazakhstan

The study of extended pedigrees containing autism spectrum disorder- (ASD-) related broader autism phenotypes (BAP) offers a promising approach to the search for ASD candidate variants. Here, a total of 650,000 genetic markers were tested in four Kazakhstani multiplex families with ASD and BAP to obtain data on de novo mutations (DNMs) and on common and rare inherited variants that may contribute to the genetic risk of developing autistic traits. The variants were analyzed in the context of gene networks and pathways. Several previously well-described enriched pathways were identified, including ion channel activity, regulation of synaptic function, and membrane depolarization. These pathways may be crucial not only for the development of ASD but also for BAP. The results also point to several additional biological pathways (circadian entrainment, NCAM and BTN family interactions, and interaction between L1 and ankyrins) and hub genes (CFTR, NOD2, PPP2R2B, and TTR). These results suggest that further exploration of protein-protein interaction (PPI) networks combining ASD and BAP risk genes can help identify novel or overlooked molecular mechanisms of ASD.


Introduction
ASD is a spectrum of conditions characterized by a wide range of atypical behaviors, difficulties in social interaction and communication, severely restricted interests, and frequently repetitive behaviors. The importance of ASD research stems from the high prevalence of the disorder worldwide, including in Kazakhstan. According to official data, there were 4,887 children with ASD in Kazakhstan in 2021, but experts believe that the true figure is about ten times higher. Based on WHO and CDC statistics, there are at least 30,000 children with ASD in Kazakhstan (https://inbusiness.kz/ru/last/v-kazahstane-30tysyach-detej-stradayut-autizmom).
The etiology of ASD is extremely complex and is probably determined by a combination of genetic susceptibility and environmental factors. Determining the specific contribution of these factors to ASD is difficult due to the lack of population-based, longitudinal evidence necessary to establish conclusive links between exposure, genotypic responses, and phenotypic consequences [1]. Some studies steered the debate toward the greater importance of environmental factors rather than a genetic predisposition to ASD [2,3]. Other studies showed little support for general environmental influences [4,5]. The most recent studies suggest that environmental exposures may be a catalyst for deleterious DNMs leading to ASD [1], whereas genetic factors are considered the predominant causes of ASD [6,7]. A strong contribution of heritable factors to the etiology of ASD is supported by twin studies and studies of first-degree relatives. Indeed, the risk of a child being diagnosed with ASD is increased at least 25-fold in a family where a brother or sister has already been diagnosed with autism [8]. Independent twin studies show concordance rates of 60-92% in monozygotic twins versus 0-10% in dizygotic twins [9,10]. If only one twin has ASD, the other twin may have delayed speech and reading and spelling difficulties [11]. The study by Folstein et al., which described mild cognitive and behavioral impairments in siblings and parents of affected children, led to the concept of the BAP [11]. ASD families with multiple affected members and relatives with BAP appear to have a higher genetic loading for ASD [12], making them a good model for studies in which environmental factors are excluded or have minimal influence. Such families are not uncommon in ASD, and several studies of such pedigrees have been published [13-16]. BAP is also fairly prevalent in ASD families. A large-scale study by Sasson et al. estimates that the prevalence of BAP among parents of children with ASD ranges from 14 to 23% [17]. A meta-analysis of twin studies found no discontinuity between ASD and BAP in genetic modeling, suggesting that ASD as a disorder can be conceptualized as the extreme of BAP symptoms/behaviors [18]. If this is the case, the inclusion of individuals with BAP in a study of multiplex families should increase the power of the study to determine the genetic structure of ASD [16,19].
A comprehensive understanding of the genetic structure of ASD requires unbiased knowledge of the number of risk loci, their penetrance, and allele frequencies [19]. The data collected to date provide conclusive evidence for three categories of genetic variation: common SNPs (MAF > 1%), inherited rare variants (MAF < 1%), and DNMs, which are identified in the proband and are not found in the genome of the biological parents [20]. Genetic models suggest that at least 50% of the variance in ASD may be due to common inherited variants [21], which act in aggregate while having little effect individually. Despite evidence for a significant role of common variants in ASD risk, rare genetic variations may be associated with higher individual risk [22]. Maintenance of genetic susceptibility to ASD despite reduced transmission of risk variants may be due to DNMs [23,24]. The relative contribution of spontaneous DNMs to ASD etiology is estimated at between 5 and 15% [23]. In several cases of syndromic ASD, a single DNM appears to be sufficient to cause the onset of ASD symptoms [25], suggesting that this DNM disrupts key loss-of-function intolerant genes. Despite the considerable genetic heterogeneity underlying ASD, there is compelling evidence that a large number of risk genes can be integrated into a much smaller number of protein-protein interaction (PPI) networks [26]. Previous studies have shown that ASD genes functionally converge in synapse development, axon alignment, neuron motility, synaptic transmission, chromatin remodeling, transcription and translation regulation, ion transport, and cell adhesion [27-32]. As far as we know, these studies mainly investigated genes affected in children with ASD, but not in relatives with subclinical BAP phenotypes.
The foregoing suggests that inclusion of ASD-related genes from first-degree relatives with BAP in the PPI network may help to better understand the development of autistic traits in the family. Will the main pathways of development of autistic traits change from those shown so far in this case? If ASD is simply the extreme end of the distribution of autistic traits that make up BAP, there will not be a large shift in the main trajectory. However, will other less studied convergent signaling mechanisms or protein interactions contributing to ASD pathology be identified? The previously discovered BAP genes [33][34][35][36][37] lead to the assumption that BAP gene loci generally correlate with ASD loci. However, several loci were found to be significant only for BAP [13], suggesting that the absence of the BAP putative risk gene in the PPI networks may be a missing link to understanding the initial biological mechanisms of ASD. Therefore, here, we focused on a set of four extended pedigrees with ASD and BAP. The aim of the study was to identify putative candidate genes and to investigate functional relationships between these genes using PPI network analysis. This is the first genetic study of Kazakhstani families with ASD.

Materials and Methods
2.1. Sampling. Families for this study were selected using a database of 400 Kazakhstani families with ASD children. The database was created within the framework of the previously implemented project 0118РК00503 in 2018-2021. We applied the following inclusion criteria for families: two or more children with ASD, BAP among first-degree relatives, and Kazakhstani ancestry. Exclusion criteria were a simplex family and/or fragile X syndrome. A total of 13 families (3%, 95% CI: 1.7-5.5%) met the inclusion and exclusion criteria. Three families were out of the country at the time of the study, two were single-parent families, and four families declined to participate. Thus, four families took part in the study. Saliva samples were collected from all children with ASD as well as from their parents and neurotypical siblings using a collection kit (Zeesan) provided by TellmeGen.
Collection was conducted after obtaining informed consent from at least one of the parents. The study was approved by the Ethics Committee of the Institute of Human and Animal Physiology, Almaty, Kazakhstan. The children recruited in this study were diagnosed with ASD by a psychiatrist. The Childhood Autism Rating Scale (CARS) was used to assess the severity of ASD [38]. The Broad Autism Phenotype Questionnaire (BAPQ) was used to assess BAP traits [39,40].
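The proportion of eligible families reported in this subsection (13 of 400) and its confidence interval can be checked with a short calculation. The sketch below assumes an exact (Clopper-Pearson) binomial interval; the text does not state which method was used, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_decreasing(f, target, iters=60):
    """Find p in [0, 1] with f(p) == target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for k successes in n trials."""
    lower = 0.0 if k == 0 else bisect_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else bisect_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


eligible, screened = 13, 400  # counts taken from the text
lower, upper = clopper_pearson(eligible, screened)
print(f"{eligible / screened:.1%} (95% CI: {lower:.1%}-{upper:.1%})")
```

The printed interval can be compared with the reported 1.7-5.5%. A different interval method (for example, Wilson) would give slightly different bounds.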

2.2. Data Generation. DNA isolation from the collected biomaterial and data generation were performed using the Infinium Global Screening Array (GSA) v3.0 run on the Illumina iScan Platform at TellmeGen CA (Valencia, Spain). A total of 650,000 genetic markers were analyzed using 10,000 probes (99.99% reliability). A triplicate analysis was performed.
2.3. Data Analysis. Family trees were generated using the GenoPro2020 software (https://genopro.com/2020/). TellmeGen CA applied standardized quality control measures to filter out low-quality data (a call rate lower than 0.99) from the SNP list and compiled all obtained results into csv files, which were sent to our laboratory for further analysis.
The GSA includes ∼640,000 single nucleotide polymorphisms (SNPs) and ∼10,000 indels (insertions/deletions). SNPs that were missing from a fraction of individuals in the cohort were filtered out. SNPs with a MAF > 1% associated with ASD according to the GWAS catalog (p < 0.00001) were included in the list of common variants. SNPs with a MAF ≤ 0.01% associated with ASD according to the ClinVar database and inherited by a child with ASD from a parent with BAP were included in the list of rare inherited variants.
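The two inclusion rules above can be written as a small filter. This is a minimal sketch, not the pipeline used in the study: the `Variant` record, its field names, and the example identifiers are hypothetical, and the thresholds are copied from the text as written.

```python
from dataclasses import dataclass
from typing import Optional

COMMON_MAF_MIN = 0.01   # common variants: MAF > 1%
RARE_MAF_MAX = 0.0001   # rare variants: MAF <= 0.01%, as written in the text
GWAS_P_MAX = 1e-5       # GWAS catalog association cutoff (p < 0.00001)


@dataclass
class Variant:
    """Hypothetical per-SNP record; field names are illustrative only."""
    rsid: str
    maf: float                       # minor allele frequency as a fraction
    gwas_p: Optional[float] = None   # ASD association p value, if any
    clinvar_asd: bool = False        # listed in ClinVar as ASD-associated
    in_asd_child: bool = False       # carried by a child with ASD
    in_bap_parent: bool = False      # carried by a parent with BAP


def classify(v: Variant) -> Optional[str]:
    """Return 'common', 'rare_inherited', or None if neither rule applies."""
    if v.maf > COMMON_MAF_MIN and v.gwas_p is not None and v.gwas_p < GWAS_P_MAX:
        return "common"
    if (v.maf <= RARE_MAF_MAX and v.clinvar_asd
            and v.in_asd_child and v.in_bap_parent):
        return "rare_inherited"
    return None


examples = [
    Variant("rs_example_1", maf=0.23, gwas_p=3e-7),
    Variant("rs_example_2", maf=0.00005, clinvar_asd=True,
            in_asd_child=True, in_bap_parent=True),
    Variant("rs_example_3", maf=0.004, clinvar_asd=True,
            in_asd_child=True, in_bap_parent=True),
]
labels = {v.rsid: classify(v) for v in examples}
print(labels)
```

Keeping the thresholds as named constants makes it easy to rerun the filter under a different rare-variant cutoff.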
DNMs were identified according to the following scenario: both parents are homozygous for the reference allele and the child is heterozygous, i.e., carries one copy each of the REF and ALT alleles. The variants were classified as pathogenic, likely pathogenic, of uncertain significance (VUS), likely benign, or benign according to the ACMG (American College of Medical Genetics and Genomics) guidelines [12]. Pathogenic mutations included premature stop codon variants (frameshift and nonsense mutations), variants causing incorrect splicing, and variants with previously established pathogenic effects according to the ClinVar database. In silico tools such as SIFT (Sorting Intolerant From Tolerant, http://sift-dna.org) and Polymorphism Phenotyping-2 (PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/) were used to predict deleterious effects of missense variants on protein structure and function. To identify clinically relevant rare DNMs, we filtered out variants that were most likely nonpathogenic (benign and likely benign) or that had a MAF > 1%.
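The trio rule for calling a DNM is simple enough to state in code. The sketch below assumes VCF-style genotype strings ("0/0" for homozygous reference, "0/1" for heterozygous), which is an illustrative encoding rather than the format of the csv files used in the study; the variant labels are placeholders.

```python
HOM_REF = ("0", "0")
HET = ("0", "1")


def alleles(genotype):
    """Parse 'a/b' or 'a|b' into a sorted allele pair; None if missing."""
    parts = genotype.replace("|", "/").split("/")
    if len(parts) != 2 or "." in parts:
        return None  # no-call genotypes cannot support a DNM call
    return tuple(sorted(parts))


def is_de_novo(father, mother, child):
    """True when both parents are REF/REF and the child is REF/ALT."""
    return (alleles(father) == HOM_REF
            and alleles(mother) == HOM_REF
            and alleles(child) == HET)


# Hypothetical trio genotypes: (father, mother, child) per variant.
trios = {
    "variant_A": ("0/0", "0/0", "0/1"),  # candidate DNM
    "variant_B": ("0/1", "0/0", "0/1"),  # inherited from the father
    "variant_C": ("0/0", "./.", "0/1"),  # mother not genotyped
}
candidates = [name for name, gts in trios.items() if is_de_novo(*gts)]
print(candidates)
```

Requiring both parental genotypes to be called is a conservative choice: a missing parental call is treated as no evidence rather than as a reference genotype.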

2.4. Data Visualization and Functional Interpretation. To characterize the relationships between the ASD/BAP candidate genes in each family, we projected them into the PPI network. InnateDB (Knowledge Resource for Innate Immunity Interactions and Pathways, https://www.innatedb.com/) was used to retrieve predicted interactions for the identified candidate genes [42,43]. The OmicsNet 2.0 software (https://www.omicsnet.ca) was used to construct the PPI network. This is a novel web-based tool for the creation and visualization of complex biological networks. The software supports ten molecular interaction databases for protein-protein, miRNA-target, TF-target, and enzyme-metabolite interactions and provides multiple methods for network customization, using WebGL technology to enable native 3D display of complex biological networks in modern web browsers [44]. The WalkTrap algorithm in OmicsNet 2.0 was applied to further partition the PPI network into modules. The algorithm assumes that a random walker tends to be trapped in dense parts of a network corresponding to modules.
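The random-walk intuition behind WalkTrap can be illustrated on a toy graph. The sketch below computes the short-walk distance used in the standard formulation of the algorithm (the difference between t-step walk distributions, weighted by node degree). It is not OmicsNet's implementation, it omits the hierarchical merging step, and the six-node graph is invented for illustration.

```python
from collections import defaultdict
from math import sqrt


def walk_distribution(adj, start, t):
    """Probability of being at each node after t random-walk steps from start."""
    dist = {start: 1.0}
    for _ in range(t):
        nxt = defaultdict(float)
        for node, prob in dist.items():
            share = prob / len(adj[node])  # step to a uniformly chosen neighbour
            for nb in adj[node]:
                nxt[nb] += share
        dist = dict(nxt)
    return dist


def walktrap_distance(adj, i, j, t=3):
    """Degree-weighted distance between the t-step distributions of i and j."""
    pi = walk_distribution(adj, i, t)
    pj = walk_distribution(adj, j, t)
    return sqrt(sum((pi.get(k, 0.0) - pj.get(k, 0.0)) ** 2 / len(adj[k])
                    for k in adj))


# Toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}

d_within = walktrap_distance(adj, 0, 1)   # same dense module
d_between = walktrap_distance(adj, 0, 4)  # different modules
print(d_within, d_between)
```

Nodes in the same dense region see the rest of the graph in a similar way after a few steps, so their distance is small. WalkTrap merges communities greedily on the basis of this distance.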
Functional annotation and enrichment analysis of genes were performed according to the GO (Gene Ontology, http://geneontology.org/) [45], KEGG (Kyoto Encyclopedia of Genes and Genomes, https://www.genome.jp/kegg), and REACTOME (http://www.reactome.org) databases using g:Profiler (https://biit.cs.ut.ee/gprofiler/). This is an open web server for characterizing and manipulating gene lists. It is updated every three months following the quarterly releases of the Ensembl databases [46]. The g:Profiler Bonferroni correction was used, and only pathways with an adjusted p value (p_adj) < 0.05 were considered significantly enriched.
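The enrichment criterion can be sketched as an over-representation test followed by a Bonferroni adjustment. The example below assumes a one-sided hypergeometric test, which is the usual choice for this kind of analysis; the gene counts are invented, and g:Profiler's own background set and term handling are not reproduced here.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k term genes in a list of n genes,
    when K of the N background genes are annotated to the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented example: 60 candidate genes against a background of 20,000 genes,
# tested against three terms given as (genes in term, overlap with the list).
N, n = 20000, 60
terms = {"term_A": (150, 6), "term_B": (400, 2), "term_C": (90, 0)}

raw = {name: hypergeom_tail(k, n, K, N) for name, (K, k) in terms.items()}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
significant = [name for name, p in adjusted.items() if p < 0.05]
print(adjusted, significant)
```

Bonferroni is the most conservative of the common corrections, so terms that pass it are unlikely to be artifacts of multiple testing, at the cost of missing weaker signals.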

Results
3.1. Characteristics of Subjects. The study included four multiplex families (Figure 1). The mean age (± standard deviation) of the ASD children was 9.1 ± 4.2 years. The ratio of male to female children with ASD was 7 : 1. The mean ages of parents and neurotypical siblings were 39.3 ± 3.5 and 14.5 ± 7.8 years, respectively. Family 1 has two boys with moderate ASD and one neurotypical girl. Family 2 has two sons with severe and moderate ASD and one neurotypical daughter. Family 3 has two sons with moderate autism. In Family 4, the mother has two children from different marriages. The son has severe ASD, and the daughter has moderate autism. The BAPQ data indicated that the fathers from Families 1, 2, and 3 and the mothers from Families 2 and 4 have autistic traits with high scores across the domains of ASD. The fathers from Families 1 and 3 and the mother from Family 2 have high aloofness subscale scores, while the father from Family 2 has pragmatic language deficits. The mother from Family 4 has either BAP or ASD and shows rigid personality and pragmatic language deficits. All family members are Kazakh except the father and his daughter from Family 4, who are Russian.
DNMs were found only in children with ASD but not in neurotypical siblings. In total, 12 heterozygous DNMs were identified in three families, including nine missense variants, two nonsense mutations, and one splice variant (Table 3). No DNMs were detected in Family 3. We found no identical mutations in ASD siblings.
3.3. PPI Network and Functional Enrichment Analysis. We prioritized 57, 60, 58, and 73 candidate genes in Families 1, 2, 3, and 4, respectively. The PPI networks for these genes were constructed for each pedigree. As a result, four networks with the following properties were obtained: 614 nodes, 672 edges, and 35 seeds for Family 1; 746 nodes, 870 edges, and 36 seeds for Family 2; 669 nodes, 743 edges, and 39 seeds for Family 3; and 923 nodes, 1092 edges, and 50 seeds for Family 4. After partitioning into modules, these networks were divided into 14 significant modules for Families 1 and 2, 11 modules for Family 3, and 20 modules for Family 4 (Figure 2). The number of connections of a node, or degree centrality (DC), showed that ten genes, namely HDAC4, CFTR, MECP2, NOD2, PPP2R2B, TCF4, TRIM33, TSC2, TTN, and TTR, play a nodal role in the generated networks and form the largest modules (Table 4). The highest-ranking node in all networks was HDAC4 (DC = 202), except in Family 4, where CFTR played a greater role (DC = 222). In Families 2 and 3, another high-ranking node was TTN (DC = 104). TCF4, PPP2R2B, and HDAC4 were common hub genes for all four networks.
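Degree centrality, as used above, is the number of distinct interaction partners of a node. A minimal sketch of how hub genes can be ranked from an edge list follows. The edges are invented for illustration (the partner names P1 to P4 are placeholders) and do not reproduce the networks in Table 4.

```python
def degree_centrality(edges):
    """Count distinct neighbours of each node in an undirected network."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


# Hypothetical edge list; a duplicated edge is counted only once.
edges = [
    ("HDAC4", "P1"), ("HDAC4", "P2"), ("HDAC4", "P3"), ("HDAC4", "TCF4"),
    ("TCF4", "P4"), ("TTR", "P4"), ("P1", "HDAC4"),
]

dc = degree_centrality(edges)
# Rank nodes by decreasing degree, breaking ties alphabetically.
hubs = sorted(dc, key=lambda node: (-dc[node], node))
print(hubs[:3], dc)
```

Using sets of neighbours rather than raw edge counts keeps the measure stable when the same interaction is reported by more than one source.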
We then assumed that the set of identified genes for each pedigree work together and can be integrated into a single module. We defined them as disease modules and performed the enrichment analysis. A total of 92 enriched terms for  Table 5.

Discussion
Recent studies suggest that, in models of the genetic architecture of ASD, common and rare variants interact additively to form susceptibility [47-49]. Common variants likely play a major role in population-level susceptibility, whereas rare mutations contribute substantially to individual susceptibility [21]. Following this hybrid model, we used polygenic risk scores to analyze four extended pedigrees of Kazakhstani ancestry and prioritized ASD risk genes with common and rare inherited and DNM variants. The combination of ASD and BAP was used to improve the performance of risk gene identification. We then performed integrative analysis by constructing PPI networks. We were particularly interested in the nodal elements of the obtained PPI networks. We hypothesized that any perturbation at these important nodes could trigger abnormal conditions such as diseases [50,51]. According to the obtained results, ten genes clearly formed potentially important nodes in the PPI networks. Six of these genes, namely HDAC4, MECP2, TCF4, TRIM33, TTN, and TSC2, belong to SFARI categories 1-2 (high-confidence and strong candidate genes) and are widely associated with the neuropathological mechanisms of ASD [31,52-70]. The CFTR, NOD2, PPP2R2B, and TTR genes were not found in the SFARI databases, and data on the role of [...] [71,72]. However, although the exact mechanism is not clear, there is some evidence of a link between these genes and ASD. The CFTR gene controls secretion and absorption of ions and water in epithelial tissues [73]. Immunohistochemical staining with a mouse monoclonal antibody directed against the C-terminal amino acid sequence of human CFTR revealed diffuse neuronal expression of CFTR in ten human control fetuses at 13 to 40 weeks of gestation [74]. This study showed that CFTR has an early and widespread distribution during development.
In addition, a case of autism associated with a genetic variant of CFTR and early exposure to herpes simplex virus (HSV) has been described [71]. The NOD2 gene belongs to the intracellular NOD-like receptor family and plays an important role in the immune response to intracellular bacterial lipopolysaccharides (LPS) [75]. Its central role in maintaining the balance between the gut microbiota and the host immune response to control inflammation [76] makes NOD2 one of the most important susceptibility genes for inflammatory bowel diseases [77-82]. At the same time, a number of studies confirm that autistic children are at higher risk for these diseases [83-88]. Moreover, there is evidence of an association between maternal inflammatory bowel disease and ASD in children [89,90]. The PPP2R2B gene encodes a neuron-specific B regulatory subunit of protein phosphatase 2 (PP2A), which regulates synaptic plasticity [91]. Some studies suggested that DNMs in the PPP2R2B gene may partially contribute to the genetic landscape of intellectual disability [92], but we found only one study linking this gene to ASD [72]. However, this gene may be a strong ASD candidate given a recent study that used a neuron-specific protein network of ASD risk genes to highlight a role for another subunit of PP2A (PPP2R5D) in dendrites and synapses [31]. Another strong candidate may be the TTR gene, which is involved in the transport of thyroid hormones [93] and retinol [94]. The involvement of TTR in novel functions, such as neuroprotection, has been recognized only recently, and knowledge in this area is still evolving [95]. In addition, TTR has been shown to interact with the GABA-A receptor subunit and regulate its expression and function [96]. GABA receptors play an important role in brain development and synchronization of neural network activity.
Since these receptors are located on synaptic and extrasynaptic membranes, a deficiency of GABA receptors leads to a lack of neurotransmission and is associated with ASD [97,98].
Considering that disease genes tend to cluster and cooccur at central sites in the network [48], the above-mentioned genes may represent a priority list for further validation studies.
Another rationale for constructing a PPI network with ASD and BAP risk genes was to identify convergent signaling pathways. Despite the multiplicity of ASD risk genes in each pedigree, our results suggest overlapping functions involving a limited number of biological pathways. Thus, most of the ASD networks are localized in specific cellular compartments such as axons, ion channel complexes, and synapses, whereas most biological processes involve ion channel activity, regulation of synaptic function, and membrane depolarization. These findings confirm the results of previous studies that described synaptic functions and ion channel activity in the development of ASD [31,99-102] and allow us to hypothesize that the main course of development of autistic traits from BAP to ASD does not change. However, we also identified several novel or poorly characterized signaling pathways, such as circadian entrainment, neural cell adhesion molecule 1 (NCAM1) interaction, butyrophilin family (BTN2 and BTN3) interaction, and the interaction between L1 and ankyrins. The first of these pathways may [...] [103-105]. The genes that form the NCAM1 interactions gene set are involved in neuronal development and synaptic plasticity [106], so this pathway is perhaps not so unexpected for ASD. NCAM1 can apparently be considered a general vulnerability factor for neurological and psychiatric disorders [107]. The role of BTN2 and BTN3 and related proteins in neurodevelopmental disorders is much less studied [108,109]. BTNs are regulators of immune responses and exert both stimulatory and inhibitory effects on immune cells [110-112]. The enriched BTN gene set is consistent with previous findings of a dysregulated immune system in ASD [113-120]. Ankyrin B (AnkB) is an adaptor and scaffold for motor proteins and various ion channels that is expressed ubiquitously in the organism, including the brain [121].
L1 interaction with AnkB mediates branching and synaptogenesis of cortical inhibitory neurons. AnkB mutations and polymorphisms are associated with ASD [23,69,122,123], but the detailed mechanisms underlying the neurological symptoms associated with AnkB are unknown. Interestingly, both the NCAM1 interaction pathway and the interaction between L1 and ankyrins were prioritized in a study of the role of rare variants in biological processes and molecular pathways leading to the pathogenesis of Alzheimer's disease [124], indicating the prospects for their further investigation in the context of neurological disorders.
The final important finding of our study is the identification of DNMs in affected children. Detailed information on these DNMs can be found in Table 6. Some of these DNMs have been previously described in ASD and/or other neurodevelopmental disorders [125-130], and others are indirectly associated with ASD. In this context, the p.Ala797Asp mutation in the potassium channel gene KCNH2 was of particular interest. This DNM results in a nonconservative amino acid substitution of a nonpolar alanine residue with a negatively charged aspartic acid residue at a conserved position (https://www.ncbi.nlm.nih.gov/clinvar/variation/200440/). In silico analysis predicted that this mutation affects protein structure or function (https://www.ncbi.nlm.nih.gov/clinvar/variation/200440/). Data on the clinical significance of this variant are lacking. This study appears to be the first report of this DNM in an affected individual.
Taken together, the DNMs that we found only in children with ASD cannot explain the heritable nature of ASD in the studied families. However, because their greatest number was found in children with severe autism (child AU209 with the most severe ASD has four DNMs), we can assume that common (dbSNP) and rare inherited variants represent a common genomic burden. Their combinations converge in common biological processes and likely contribute to increased susceptibility to ASD, while the severity of ASD is determined by DNMs. Similarly, it has been previously reported that patients carrying DNMs in two or more candidate genes exhibit more severe phenotypes of ASD [131]. At the same time, the results showed that the genetic heterogeneity of ASD is so great that different DNMs could be identified even in siblings.

Table 6.

Gene: GCK
Gene product and function: Glucokinase; phosphorylates glucose to produce glucose-6-phosphate (the first step in most glucose metabolic pathways)
Expression: Pancreas and liver
Description: The variant was changed to likely pathogenic upon submission. Other variants in the GCK gene that alter enzyme activity have been associated with various types of diabetes and hyperinsulinemic hypoglycemia [139]. There is a report showing that neonatal hypoglycemia increases the risk of ASD threefold in children born at term [140].

Gene: HOGA1
Variant: c.769T>G
Gene product and function: 4-Hydroxy-2-oxoglutarate aldolase 1; catalyzes the final step in the metabolic pathway of hydroxyproline, releasing glyoxylate and pyruvate
Expression: Kidney, liver, heart, fat, and brain
Description: The variant results in a nonconservative amino acid change in the encoded protein sequence. Three of five in silico tools predicted a deleterious effect of the variant on protein function. The variant was found at a frequency of 6.4E-05 in 249558 control chromosomes, most notably at a frequency of 0.00082 within the East Asian subpopulation in the gnomAD database. This frequency is not higher than the maximum expected for a pathogenic variant in HOGA1 causing primary hyperoxaluria, type III (0.0015), and does not allow conclusions to be drawn about the significance of the variant. c.769T>G has been reported in the literature in homozygous and compound heterozygous states in several individuals with primary hyperoxaluria, type III [141-144]. These data indicate that the variant is very likely to be associated with disease. At least one publication reports experimental evidence evaluating an impact on protein function and demonstrated that the variant resulted in no measurable activity [145]. Two clinical diagnostic laboratories have submitted clinical-significance assessments for this variant to ClinVar after 2014 and classified the variant as pathogenic/likely pathogenic. There are reports that hyperoxaluria may be involved in the pathogenesis of ASD in children [146,147].

Gene: COL3A1
Variants: c.637G>A, c.2022G>T
Gene product and function: Collagen type III alpha 1 chain; encodes the pro-alpha1 chains of type III collagen, which is found in extensible connective tissues such as skin, lung, uterus, intestine, and the vascular system, often in association with type I collagen
Expression: Gall bladder, placenta, and 12 other tissues
Description: The variants are associated with Ehlers-Danlos syndrome, type 4 [148,149]. Ehlers-Danlos syndrome type 4 shares several similar neurophenotypes with ASD, such as mood disorders, proprioceptive impairments, sensory hyper/hyposensitivities, eating disorders, and suicidality [150].

Description: [...] [151,152]. These data do not allow any conclusion about variant significance. At least one publication reports experimental evidence indicating that the variant reduced carnitine transport activity to less than 20% of wild-type in vitro [152]. Three clinical diagnostic [...] without evidence for independent evaluation. These laboratories cited the variant with conflicting assessments: one laboratory classified the variant as likely pathogenic, one laboratory classified the variant as likely benign, and a third laboratory classified the variant as of uncertain significance. Based on the evidence outlined above, until additional information becomes available, the variant was classified as VUS-possibly pathogenic. An association between primary carnitine deficiency and ASD has been reported [153,154]. It is hypothesized that carnitine deficiency in the brain causes nonsyndromal autism with extreme male tendency [155].

Description: [...] The variant is predicted to result in loss of normal protein function due to protein truncation, as the last 124 amino acids of the protein are lost.

Gene: CBS
Variant: c.457G>A
Gene product and function: Cystathionine beta-synthase; catalyzes the conversion of homocysteine to cystathionine, the first step of the transsulfuration pathway
Expression: Liver, brain, and 6 other tissues
Description: The variant involves the modification of a conserved nucleotide located within the pyridoxal phosphate-dependent enzyme domain (InterPro). 5/5 in silico tools predict a deleterious outcome for this variant. This variant was found in 1/156632 control chromosomes at a frequency of 0.0000064, which does not exceed the estimated maximum expected allele frequency of a pathogenic CBS variant (0.0030414). In addition, functional studies in yeast suggest that the variant may affect protein function [156]. The variant has been reported in a Saudi Arabian family, in which two affected patients with homocystinuria were homozygous for the variant inherited from unaffected heterozygous parents [157]. There are reports that children with classical homocystinuria may have isolated ASD due to cystathionine-β-synthase deficiency [158,159].

Gene: NSD1
Variant: c.5146+1G>A
Gene product and function: Nuclear receptor binding SET domain protein 1; enhances androgen receptor transactivation
Expression: Testis, thyroid, and 25 other tissues
Description: The variant affects a donor splice site in intron 14 of the NSD1 gene. It is expected to disrupt RNA splicing and likely results in an absent or disrupted protein product. Donor and acceptor splice site variants generally result in loss of protein function [160], and loss-of-function variants in NSD1 are known to be pathogenic and the major cause of Sotos syndrome [161-163]. One report found several rare variations of the NSD1 gene in individuals with ASD, although the variants were not considered pathogenic [164].

Gene: PCNT
Variant: c.5767C>T
Gene product and function: Pericentrin; interacts with the microtubule nucleation component gamma-tubulin and is probably important for the normal functioning of centrosomes, the cytoskeleton, and cell cycle progression
Expression: Testis, bone marrow, and 24 other tissues
Description: The variant has been classified as pathogenic according to ACMG in the context of microcephalic osteodysplastic primordial dwarfism type II. The variant produces a premature translational stop signal (p.Arg1923*) in the PCNT gene. It is expected to result in an absent or impaired protein product. Loss-of-function variants in PCNT are known to be pathogenic [165,166].

Gene: KCNH2
Variant: p.Ala797Asp
Description: The variant results in a nonconservative amino acid substitution of a nonpolar alanine residue with a negatively charged aspartic acid residue at a position that is conserved across species. In silico analysis predicts Ala797Asp is probably damaging to the protein structure/function. Mutations in nearby residues (Glu788Asp, Glu788Lys, Arg791Trp, Gly800Glu, Gly800Trp) have been reported in association with Long QT syndrome (LQTS), further supporting the functional importance of this region of the protein. Furthermore, the Ala797Asp variant was not observed in approximately 6,500 individuals of European and African American ancestry in the NHLBI exome sequencing project, indicating it is not a common benign variant in these populations. In summary, while Ala797Asp is a good candidate for a disease-causing mutation, with the clinical and molecular information available at this time we cannot unequivocally determine the clinical significance of this variant.

Gene: HEPACAM
Variant: c.740C>T
Gene product and function: Hepatic and glial cell adhesion molecule; acts as a homodimer and is involved in cell motility and cell-matrix interactions
Expression: Brain, fat, and liver
Description: The variant was classified as a variant of unknown significance for megalencephalic leukoencephalopathy with subcortical cysts. Other rare mutations in the HEPACAM gene have been found to cause either macrocephaly and mental retardation with or without autism or benign familial macrocephaly [129].

Gene: WNT10A
Variant: c.1087A>C
Gene product and function: Wnt family member 10A; a member of the WNT gene family, involved in oncogenesis and several developmental processes, including cell fate regulation and cell patterning during embryogenesis
Expression: Skin, placenta, and 16 other tissues
Description: The variant was not observed in significant frequency in approximately 5300 individuals of European and African American ancestry in the NHLBI exome sequencing project, suggesting that it is not a common benign variant in these populations. The variant is a semiconservative amino acid substitution that may affect secondary protein structure because these residues differ in some properties. This substitution occurs at a position that is conserved across species, and in silico analysis predicts that this variant is likely to affect protein structure/function.

Limitations.
We understand that this study has many limitations given the latest genomic technologies, bioinformatics methods, and large-scale studies [132-134]. However, paradoxically, the large amount of data generated by these studies has raised new challenges and questions, and many more studies and approaches are needed to unravel the complex mechanisms of ASD. In our brief study, we attempted to use a novel approach by constructing PPI networks based on putative causative genes for ASD and BAP. For our study, we chose extended pedigrees, which provided a good opportunity to examine inherited genetic risk factors. We integrated three major genetic components of ASD, and we believe that the genes identified in this study are penetrant enough to cause ASD-related traits and should be prioritized for further validation. However, the number of variants that microarrays can contain is limited. The GSA tends to focus on relatively common variants, so the study design is biased toward them.
It is possible that other undetected or uncharacterized variants not included in this study play a critical role. Risk alleles may also take the form of rare inherited copy number variants (CNVs) [135-138]; therefore, examination of CNVs within these families will be a subject of further study. In addition, we performed our analysis with samples that came mainly from families of Kazakh descent. For this reason, our results cannot be generalized to other populations without further investigation. Future approaches should ideally use whole-genome sequencing in extended pedigrees of Kazakh and other ancestries, in conjunction with comprehensive clinical validation of detected deleterious variants.

Conclusion
This study is an attempt to describe the genetic trajectory of autistic trait development in four extended pedigrees of Kazakhstani ancestry. Construction of networks based on putative causative genes for ASD and BAP revealed no differences in major functional pathways compared with those shown in previous studies for ASD only. Nevertheless, our study uncovered several nodal genes and signaling pathways that have not previously been associated with ASD but for whose relevance there are strong biological arguments. The obtained results highlight the importance of including subclinical phenotypes in the search for inherited causes of ASD and provide insights into previously unknown convergent disease pathways. The study is also interesting regarding new DNMs that may contribute to the pathogenesis of ASD.

Data Availability
Genomic data have been deposited in a cloud file storage and are available at https://drive.google.com/drive/folders/1XyIhBp7i8IJJZq7l-aQ3OI0FoNPW1qi6?usp=sharing. The processed data used to support the findings of this study are included in the provided tables.

Conflicts of Interest
The authors declare no conflict of interest.