Identification of a β-Arrestin 2 Mutation Related to Autism by Whole-Exome Sequencing

Autism spectrum disorder (ASD) is a complex neurological disease characterized by impaired social communication and interaction skills, rigid behavior, restricted interests, and repetitive activities. The disease has a high degree of genetic heterogeneity, and the genetic cause of ASD in many autistic individuals is currently unclear. In this study, we report a patient with ASD whose clinical features included impaired social interaction, impaired communication, and repetitive behavior. We examined the patient's genetic variation using whole-exome sequencing and found new de novo mutations. After analysis and evaluation, ARRB2 was identified as a candidate gene. To study the potential contribution of ARRB2 to human brain development and function, we first evaluated the expression profile of this gene in different brain regions and developmental stages. We then used weighted gene coexpression network analysis to analyze the associations between ARRB2 and ASD risk genes. Additionally, the spatial conformation and stability of the wild type and mutant ARRB2 proteins were examined by simulation. Finally, we established a mouse model of ASD, in which ARRB2 expression was abnormal. Our study suggests that ARRB2 may be a risk gene for ASD, but the contribution of de novo ARRB2 mutations to ASD remains unclear. This information provides a reference for the etiology of ASD and may aid mechanism-based drug development and treatment.


Introduction
Autism spectrum disorder (ASD) is a complex, lifelong neurodevelopmental disease with a high degree of genetic heterogeneity, characterized by impaired social communication and interaction skills, stereotyped behavior, restricted interests, and repetitive activities [1,2]. The prevalence of ASD has been steadily increasing since the first epidemiological study, and seven studies have shown that 4.1 of every 10,000 people in the UK have ASD [3]. Early research indicates that the incidence of ASD in males is 4-5 times that in females, although this difference is smaller among individuals with intellectual disabilities [4]. Genetic factors play a key role in the etiology of ASD and act in combination with early environmental factors. Studies have shown that the incidence of childhood trauma, self-mutilation, and suicidal thoughts and behaviors is significantly increased in ASD patients [5], resulting in a heavy financial burden on families and society [6]. In the UK, the lifetime cost per ASD patient is £920,000 (US$1.36 million) [7].
Recent studies have shown that, although the heritability of ASD is high, the disease has a high degree of genetic heterogeneity, involving both rare and common genetic variants [8-10]. Furthermore, the genetic etiology of approximately 90% of ASD patients is not clear [11,12]. Individual common genetic variants have a weak effect on the risk of ASD, but the combined effect of common low-impact variants is also related to ASD. Rare variants that affect the risk of ASD involve hundreds of genes in total [13,14]. Genetic and epidemiological studies show that de novo mutations, that is, spontaneous rare mutations that occur in affected children but not in their unaffected parents [15,16], contribute significantly to ASD, and approximately 3%-10% of de novo mutations in ASD-related risk exons are explained [9,17,18].
In this study, we examined a family including a child with ASD and found a new de novo mutation of ARRB2. Gene expression experiments confirmed the possible connection between ARRB2 and ASD. It is important to analyze mutations of a single disease-causing gene in ASD to understand its mechanism and physiological effects.

Materials and Methods
2.1. Patients and Samples. We obtained the medical history of a child with ASD. The patient's language comprehension was severely impaired, and the tone and speed of speech were abnormal. The patient failed to establish relationships with other children and experienced extreme loneliness, and the patient's posture was rigid with repetitive movements. To further investigate the cause of ASD, we collected peripheral blood from the patient and the patient's parents and extracted genomic DNA (gDNA) from the three family members. Additionally, the family history of the proband was collected. All participants signed an informed consent form.

Whole-Exome Sequencing. Whole-exome sequencing (WES) was completed at Shanghai Yuanshen Biomedical Technology Co., Ltd. The company used Agilent's liquid-phase chip capture system to enrich human exon region DNA and then performed high-throughput deep sequencing on the NovaSeq 6000 platform. An Agilent SureSelect v6 kit was used for library construction and data analysis. After the library passed quality control, it was sequenced on the NovaSeq 6000 platform according to the effective concentration of the library and the data output requirements. The final sequencing results were exported in Excel format. After the raw sequencing reads were obtained, bioinformatic analysis was performed against the reference genome (hg38).

Evaluation of Mutation Sites and Screening of Candidate Genes. Online software programs including LRT, MutationTaster, MutationAssessor, and VEST3 (https://sites.google.com/site/revelgenomics/) were used to predict the pathogenicity of the mutations, and the GERP, phyloP, and SiPhy programs were used to analyze the conservation of mutation sites [19]. Additionally, using the Gene Ontology (GO, http://www.geneontology.org/) database, genes were classified according to their roles in biological processes, cellular components, and molecular functions, and Discovery Studio18 was used to simulate the structures of the wild type and mutant proteins.

Gene Coexpression and Genetic Interaction Network.
We extracted a dataset of ASD genes and performed weighted gene coexpression network analysis (WGCNA) [19]. This dataset comprised a population of boys with relatively spontaneous ASD and excluded patients with known genetic diseases or recognizable phenotypes or syndromes, as well as patients with mental retardation or primary seizures. Distinct expression characteristics were found between individuals with ASD and sibling controls. The data were microarray (chip) data; one-to-many probes were removed, and among many-to-one probes, the most highly expressed probe was selected.
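The probe filtering described above can be illustrated with a minimal Python sketch. It assumes that "one-to-many" means a probe annotated to more than one gene and that "highly expressed" means the highest mean expression across samples; the function name, data layout, and toy values are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(expression, probe_to_genes):
    """Collapse probe-level expression to one profile per gene.

    expression: dict mapping probe ID -> list of values (one per sample)
    probe_to_genes: dict mapping probe ID -> list of annotated gene symbols
    """
    by_gene = defaultdict(list)
    for probe, values in expression.items():
        genes = probe_to_genes.get(probe, [])
        # Drop unannotated probes and one-to-many probes (one probe, several genes).
        if len(genes) != 1:
            continue
        by_gene[genes[0]].append((mean(values), values))
    # Many-to-one: keep the probe with the highest mean expression for each gene.
    return {
        gene: max(candidates, key=lambda c: c[0])[1]
        for gene, candidates in by_gene.items()
    }


# Toy data (illustrative values only).
toy_expression = {
    "probe_1": [5.0, 6.0],  # maps to ARRB2, mean 5.5
    "probe_2": [8.0, 9.0],  # maps to ARRB2, mean 8.5 (retained)
    "probe_3": [7.0, 7.0],  # maps to two genes (removed)
}
toy_annotation = {
    "probe_1": ["ARRB2"],
    "probe_2": ["ARRB2"],
    "probe_3": ["GENE_A", "GENE_B"],
}
collapsed = collapse_probes(toy_expression, toy_annotation)
```

Other collapsing rules (for example, highest variance) are also common; the choice of mean expression here is only one reasonable reading of the text.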
Several of the interaction types (action modes) provided by the STRING database were selected and mapped as a network showing the different interaction types, with weight values provided by the database.

Animals.
All animal experiments performed in this study were approved by the Animal Protection and Utilization Committee (Department of Experimental Animal Science, Shanghai University) and were performed in strict accordance with the Animal Care and Use Program of the Experimental Animal Institute of Shanghai University.
Both C57 males and females aged 3-4 weeks were obtained from the Shanghai Experimental Animal Center, Chinese Academy of Sciences. All mice were reared under standard conditions (23 ± 2°C; 55% ± 5% humidity) on a 12-hour light-dark cycle.
Adult male and female mice were mated overnight. The date of detection of a vaginal plug was recorded as day 0.5 of pregnancy. Adult female mice were given a single intraperitoneal injection of valproic acid (VPA) (500 mg/kg in 0.85% saline) on day 12.5 of pregnancy. The control group was injected with the same dose of normal saline at the same time. Female mice were allowed to breed freely, and mice were weaned on day 23 after birth. Subsequently, male mice were selected as experimental subjects. Behavioral tests were performed on mice at 7-8 weeks.

Behavioral Tests.
Behavioral tests were performed on mice according to the method described by St Omer et al. [20]. At least 10 experimental animals were included in each group. See the Supplementary Materials 2.5.1-2.5.4 for test methods.
2.6. Western Blotting. For western blotting, a ProteinExt mammalian membrane protein extraction kit (Transgen Biotech) was used to extract proteins from hippocampal tissue samples of C57 mice, and a BCA kit (Beyotime, Shanghai, China) was used to determine the protein concentration. Thirty microliters of protein from each sample was separated on a precast gel (Tanon, Biofuraw), and the protein was transferred to a polyvinylidene fluoride membrane (Immobilon-P, 0.45 μm, Millipore, Germany) using transfer solution. The membrane was blocked with QuickBlock™ Western blocking buffer (Beyotime, P0252) for 30 minutes at room temperature and incubated with the primary antibody (rabbit anti-β-Arrestin2 (CST, 1 : 1000) or mouse anti-GAPDH (Affinity, 1 : 3000)) at 4°C overnight. After washing with phosphate-buffered saline with Tween (PBST), the membrane was incubated with a horseradish peroxidase-linked anti-rabbit or anti-mouse secondary antibody (1 : 10,000; Santa Cruz Biotechnology, USA) in PBST with 5% BSA at room temperature for 1 hour. Immunoblots were visualized using ECL (Immobilon, Millipore, Germany) and imaged on a Tanon 5200S system (China).

2.7. Immunofluorescence. First, slides containing the cells were immersed three times in a petri dish containing PBS. Next, the slides were fixed with 4% paraformaldehyde for 15 minutes and incubated with 0.5% Triton X-100 (prepared in PBS) for 20 minutes at room temperature. The slides were blocked at room temperature for 30 minutes. Then, the blocking solution was drained, a sufficient amount of the diluted primary antibody (rabbit anti-β-Arrestin2 (PTG, 1 : 200) or mouse anti-NeuN (Abcam, 1 : 500)) was added, and the slides were placed in a wet box and incubated at 4°C overnight. Next, the slides were dipped in PBST, the excess liquid on the slides was drained, and a diluted fluorescent secondary antibody (Alexa Fluor 594 anti-rabbit (1 : 500; Abcam) or Alexa Fluor 488 anti-mouse (1 : 500; Abcam)) was added. The slides were incubated for 1 hour at 20-37°C in a wet box, protected from light. Afterwards, the slides were incubated for 5 minutes in 4′,6-diamidino-2-phenylindole (DAPI), again protected from light. After nuclear staining, excess DAPI was washed off with PBST, liquid was blotted from the slides, mounting solution containing an antifluorescent quencher was applied, and the slides were observed with a confocal laser scanning microscope (LSM 710, Carl Zeiss).

Statistical Analysis. Data are expressed as the mean ± SEM. One-way analysis of variance (ANOVA) followed by a least significant difference (LSD) post hoc test was performed using Origin 8. A value of P < 0.05 was considered statistically significant.
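For readers unfamiliar with the quantities named in the statistical analysis, the following is a minimal Python sketch of the mean ± SEM summary and the one-way ANOVA F statistic. It is an illustration only: the study itself used Origin 8, and the function names and toy values below are assumptions rather than the authors' code or data.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    return mean(values), stdev(values) / sqrt(len(values))


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy values (illustrative only, not data from this study).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
```

The P value is then obtained from the F distribution with the two degrees of freedom; in practice a library routine such as `scipy.stats.f_oneway` would normally be used instead of a hand-written function.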

Results
Analysis of Sequencing Results and Candidate Gene Selection. Blood samples were collected from the child with ASD and the child's parents. The quality control of the WES analysis is described in Table 1 in the Supplementary Materials. Many mutations were detected, but only a limited number were likely to be associated with the disease. To screen out disease-related mutations from the large number detected, we used existing databases, software, and other tools, including the 1000 Genomes, ExAC, and esp6500siv2_all databases, and screened the mutations based on gene, mutation type, mutation details, population mutation frequency, and pedigree. The 1000 Genomes, ExAC, and esp6500siv2_all databases were used to identify mutations that are common in the population, which were excluded, so that only rare mutations (frequency < 0.05) were retained. Mutations that were present in the patient but not in the parents were then identified based on the pedigree. After preliminary screening, 20 insertion-deletion (INDEL) (Supplementary Materials [...]). Additionally, the conservation of mutation sites was analyzed using the GERP, phyloP, and SiPhy tools. Mutations at more conserved sites were considered to have a greater impact on the protein. Based on the above prediction results, we selected five candidate genes (Table 1). Next, the GO database was used to classify the mutant genes according to their roles in biological processes, cellular components, and molecular functions (Figure S1). We screened three GO annotations related to synapses (Table 2) [21]. Combining this with the preceding analysis of the pathogenicity and conservation of the mutation sites, we identified ARRB2 as the target gene.
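The frequency and pedigree screening steps can be sketched as a simple filter. The example below is a hedged illustration, not the pipeline used in the study: the record layout, field names, and toy variants are hypothetical, and it assumes that a variant absent from a database counts as rare and that "de novo" means present in the child and absent in both parents.

```python
# Hypothetical field names; real annotation tables use their own column names.
POPULATION_DBS = ("1000_genomes", "ExAC", "esp6500siv2_all")


def is_rare(variant, max_af=0.05):
    """True if the allele frequency is below max_af in every database.

    A missing frequency (None) is treated as "not observed", that is, rare.
    """
    for db in POPULATION_DBS:
        af = variant["freq"].get(db)
        if af is not None and af >= max_af:
            return False
    return True


def is_de_novo(variant):
    """True if the variant is present in the child but in neither parent."""
    return (
        variant["in_child"]
        and not variant["in_father"]
        and not variant["in_mother"]
    )


def screen(variants, max_af=0.05):
    """Return the IDs of rare de novo variants."""
    return [v["id"] for v in variants if is_rare(v, max_af) and is_de_novo(v)]


# Toy records (illustrative only, not variants from this study).
toy_variants = [
    {"id": "v1", "freq": {"ExAC": 0.0001},
     "in_child": True, "in_father": False, "in_mother": False},
    {"id": "v2", "freq": {"1000_genomes": 0.20},
     "in_child": True, "in_father": False, "in_mother": False},
    {"id": "v3", "freq": {},
     "in_child": True, "in_father": True, "in_mother": False},
]
candidates = screen(toy_variants)
```

Here `v2` is removed because it is common in the population and `v3` because it is inherited, leaving `v1` as the only rare de novo candidate.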

Distribution of the Candidate Gene. Subsequently, the expression of ARRB2 in different brain regions and developmental stages was evaluated to investigate the potential contribution of this gene to human brain development and function. We obtained gene expression levels measured using microarrays from the Human Brain Transcriptome database (HBT, http://hbatlas.org/) [19]. Throughout development and adulthood, ARRB2 is stably and highly expressed in the neocortex, hippocampus, amygdala, striatum, mediodorsal nucleus of the thalamus, and cerebellar cortex (Figure 1(a)). Next, we extracted primary neurons from the hippocampus of newborn C57 mice for immunostaining. Blue fluorescence indicated DAPI-stained nuclei, red fluorescence indicated cells expressing ARRB2, and green fluorescence indicated NeuN-positive cells. The results showed that ARRB2 was expressed in the hippocampus and was colocalized with neurons (Figure 1(b)).

Coexpression and Interaction of ARRB2 and ASD.
We used the GEO dataset GSE65106, which includes 21 samples from ASD patients. A total of 21,408 probes, corresponding to 21,408 genes, were analyzed. WGCNA was used to construct the coexpression network. The genes associated with ARRB2 were selected from the coexpression network and clustered into a module by WGCNA. A total of 26 related genes were selected. A network diagram was constructed based on the topological overlap measure (TOM) between each gene and ARRB2 (Figure 2(a)). Additionally, several interaction types (action modes) provided in the STRING database were selected, and a graph was created to show the different interaction types and the weight values provided by the database (Figure 2(b)). A larger weight value indicated a stronger association.
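To make the topological overlap measure concrete, the sketch below computes the standard unsigned TOM from an adjacency matrix in plain Python. It assumes the usual WGCNA definition, TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij sums the products of adjacencies to shared neighbors and k_i is the connectivity of gene i. The adjacency values are toy numbers, and the soft-thresholding step that produces the adjacency from correlations is not shown because its power is not reported in the text.

```python
def tom_similarity(adj):
    """Compute the unsigned topological overlap matrix.

    adj: symmetric list-of-lists adjacency matrix with values in [0, 1];
    diagonal entries are ignored.
    """
    n = len(adj)
    # Connectivity of each gene: sum of its adjacencies to all other genes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Strength of connections through shared neighbors u.
            shared = sum(
                adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j)
            )
            tom[i][j] = (shared + adj[i][j]) / (
                min(k[i], k[j]) + 1.0 - adj[i][j]
            )
    return tom


# Toy adjacency: gene 0 is linked to genes 1 and 2; genes 1 and 2 are not
# directly linked to each other.
toy_adj = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.0],
    [0.5, 0.0, 0.0],
]
toy_tom = tom_similarity(toy_adj)
```

In this toy case genes 1 and 2 have zero direct adjacency but a nonzero TOM because they share gene 0 as a neighbor, which is the property that makes TOM useful for grouping genes into modules.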

Mutation of ARRB2.
By combining the LRT, MutationTaster, MutationAssessor, VEST3, GERP, phyloP, and SiPhy results, we found that the ARRB2-R8W site is highly conserved and that the mutation may disrupt the function of the protein. Therefore, we simulated the structure of the ARRB2-R8W protein. Figures 3(a) and 3(b) show images before and after the interaction of the wild type protein with the substrate, respectively. Figure 3(c) shows the different stages of ligand binding to the wild type ARRB2 protein. Three types of interactions occur between the ARRB2 pocket and the ligand: hydrogen bonds, hydrophobic interactions, and electrostatic forces. ASP331 binds to the ligand electrostatically via its negative charge. MET207, VAL123, TRP134, PHE326, TRP334, and TYR342 form hydrophobic interactions with the ligand via van der Waals forces. ARG212, ASP331, and GLU53 bind to the ligand through conventional hydrogen bonds. We also noted a steric interaction between GLU326 and the ligand (Figure 3(d)). However, the structural simulation showed that the ARRB2-R8W mutation had little effect on the protein structure.
To verify our predictions, we established a VPA-induced mouse ASD model. Behavioral testing showed that the mouse model exhibited autism-like behaviors such as impaired social behavior, repetitive behaviors, and anxiety (Figure S2). Western blots of hippocampal protein from mice in the normal and ASD groups showed that the expression of the ARRB2 protein was increased in the ASD group (P < 0.01) (Figures 4(a) and 4(b)).

Conclusions
In recent years, exome sequencing has been widely used for screening of various diseases because of its simplicity and cost-effectiveness [22]. We used a series of analysis tools and found a de novo mutation of ARRB2. The ARRB2 mutation satisfied all the screening conditions and was present in a child with ASD but not in the parents. Furthermore, the mutation was predicted to be harmful by multiple software programs. The functional annotation results showed that this gene is related to neurons. β-arrestin 2 is a member of the arrestin family. As a multifunctional adaptor, it plays an important role in regulating G protein-coupled receptor transport and signaling [23]. ARRB2 is highly expressed in brain tissue and plays a key role in regulating systemic immune responses by modulating various signaling pathways [24]. Additionally, ARRB2 is associated with a variety of neurological disorders, such as Parkinson's disease [25], depressive behavior [26], and Alzheimer's disease [27]. However, few studies have revealed a direct relationship between ARRB2 and ASD. We evaluated the expression profile of ARRB2 in different human brain regions and developmental stages. Considering the high expression level of ARRB2 in the human brain, it may be important for early brain development and normal brain function, and therefore, it is a potential candidate gene for diseases related to brain function. The hippocampus is an important brain area in ASD research [28]. ARRB2 is expressed in the hippocampus of mice, which is consistent with previous results reported by Gurevich [29]. The coexpression and genetic interaction network analysis indicated that ARRB2 may exert effects similar to those of several candidate ASD genes. Considering that SNPs may cause structural or functional abnormalities of the encoded protein, tools such as MutationAssessor can be used to predict functionally harmful effects of mutation sites in protein sequences.
However, in the protein structure simulation, mutation of the ARRB2-R8W site had little effect on the protein structure. The mutation site is a nonsynonymous SNP, and nonsynonymous SNPs can affect protein function by reducing the solubility of the protein and/or destabilizing the protein structure [30,31]. The R8W site is highly conserved and lies near the beginning of the amino acid sequence, which may explain why the mutation does not have a major impact on the protein structure. Tryptophan is more hydrophobic than arginine, and their charges are different [32]. The substitution alters the charge of the wild type residue, which may result in the loss of interactions with other molecules. Furthermore, the solubility of the protein may be reduced, thereby affecting its function. Studies have hypothesized that amino acid substitutions will not cause any major structural disturbances but will only change the thermodynamic stability of the underlying state [33,34]. In a study of an amyloid protein, the fibrillation kinetics of the mutant were severely slowed, and the thermodynamic stability was decreased. Despite this instability, the resulting amyloid structure was still relatively undisturbed [35].
Figure 4: The expression of ARRB2 in the hippocampus. (a, b) Western blot analysis revealed that ARRB2 protein levels were increased in the VPA group vs. the control group. *P < 0.05, **P < 0.01, ***P < 0.001. n = 7.
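The physicochemical contrast between arginine and tryptophan discussed above can be made explicit with a small Python sketch. It uses the Kyte-Doolittle hydropathy index (higher values are more hydrophobic) and approximate side-chain charges at neutral pH; both are standard textbook values brought in here for illustration and are not part of the study's own analysis. The function name is hypothetical.

```python
# Kyte-Doolittle hydropathy index; higher values are more hydrophobic.
# Only the two residues involved in R8W are listed.
KD_HYDROPATHY = {"R": -4.5, "W": -0.9}

# Approximate net side-chain charge at neutral pH.
SIDE_CHAIN_CHARGE = {"R": 1, "W": 0}


def describe_substitution(wild_type, mutant):
    """Return the change in hydropathy and charge for a substitution."""
    return {
        "hydropathy_change": KD_HYDROPATHY[mutant] - KD_HYDROPATHY[wild_type],
        "charge_change": SIDE_CHAIN_CHARGE[mutant] - SIDE_CHAIN_CHARGE[wild_type],
    }


# R8W: arginine (R) replaced by tryptophan (W).
r8w = describe_substitution("R", "W")
```

On this scale the substitution increases hydropathy (the residue becomes less polar) and removes one positive charge, in line with the loss of charge-based interactions and the possible reduction in solubility described in the text.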
We identified the ARRB2 gene mutation and predicted through bioinformatic analysis that it may be a pathogenic site. Analysis of functional mutations in individual disease-causing genes of ASD is very important for determining the mechanism of ASD and obtaining pharmacological insights. Exposure to VPA can have permanent adverse effects on neural and behavioral development. Children or rodents exposed to VPA prenatally may have a greatly increased risk of ASD [36]. To verify our predictions, we established a VPA-induced mouse model of ASD, which showed defects including decreased social ability, increased stereotyped behavior, and anxiety (Figure S2). In this model, ARRB2 expression was abnormal, which is consistent with our prediction.
In this study, we identified ARRB2 through WES and verified its presence through a series of bioinformatics and molecular experiments. ARRB2 may be linked to ASD. However, the mechanism by which ARRB2 is involved in ASD still requires further exploration.

Data Availability
The data are available in the GEO dataset GSE65106 and in the Supplementary Materials.

Ethical Approval
I have read and have abided by the statement of ethical standards for manuscripts submitted to Biomed Research International.