Genome-Wide Characterization and Expression Analysis of the HD-ZIP Gene Family in Response to Salt Stress in Pepper

HD-ZIP is a unique type of transcription factor in plants, which are closely linked to the regulation of plant growth and development, the response to abiotic stress, and disease resistance. However, there is little known about the HD-ZIP gene family of pepper. In this study, 40 HD-ZIP family members were analyzed in the pepper genome. The analysis indicated that the introns number of Ca-HD-ZIP varied from 1 to 17; the number of amino acids was between 119 and 841; the theoretical isoelectric point was between 4.54 and 9.85; the molecular weight was between 14.04 and 92.56; most of them were unstable proteins. The phylogenetic tree divided CaHD-ZIP into 4 subfamilies; 40 CaHD-ZIP genes were located on different chromosomes, and all of them contained the motif 1; two pairs of CaHD-ZIP parallel genes of six paralogism genes were fragment duplications which occurred in 58.28~88.24 million years ago. There were multiple pressure-related action elements upstream of the start codon of the HD-Z-IP family. Protein interaction network proved to be coexpression phenomenon between ATML1 (CaH-DZ22, CaHDZ32) and At4g048909 (CaHDZ12, CaHDZ31), and three regions of them were highly homology. The expression level of CaHD-ZIP gene was different with tissues and developmental stages, which suggested that CaHD-ZIP may be involved in biological functions during pepper progress. In addition, Pepper HD-ZIP I and II genes played a major role in salt stress. CaHDZ03, CaHDZ 10, CaHDZ17, CaHDZ25, CaHDZ34, and CaHDZ35 were significantly induced in response to salt stress. Notably, the expression of CaHDZ07, CaHDZ17, CaHDZ26, and CaHDZ30, homologs of Arabidopsis AtHB12 and AtHB7 genes, was significantly upregulated by salt stresses. CaHDZ03 possesses two closely linked ABA action elements, and its expression level increased significantly at 4 h under salt stress. qRT-P-CR and transcription analysis showed that the expression of CaHDZ03 and CaHDZ10 was upregulated under short-term salt stress, but CaHDZ10 was downregulated with long-term salt stress, which provided a theoretical basis for research the function of Ca-HDZIP in response to abiotic stress.


Introduction
Plant transcription factors can be divided into 58 families according to the conserved domain and function [1]. Homeobox (HB) protein is a kind of transcription factor closely related to biological growth and development [2]. In 1983, Garber et al. [3] discovers the HB in Drosophila, and then, it is found in invertebrates, vertebrates, fungi, and other high plants. In 1991, Vollbrecht et al. [4] clones the HB gene with the name of Knotted-1 (Kn-1) in maize, and then, the HB gene is cloned in various plants. According to the location, differences, and homology of HD (homeodomain) sequences, plant HB proteins can be subdivided into six major categories: PHD-Finger, Bell, HD-ZIP, WOX, ZF-HD, and KNOX [5]. HD-ZIP transcription factor is a unique transcription factor in plants and plays roles in the growth, development, disease resistance, and abiotic stress [6]. These functions of HB have been reported in mass [7], ferns [8], monocotyledons [9], and dicotyledons [10,11]. The HD-ZIP transcription factor is mainly composed of two conserved domains, HD (homeodomain) domain and a closely linked LZ (leucine zipper) domain. HD is linked to the specific binding of DNA, and Zip is related to heterodimerization [12]. According to structure and function, HD-ZIP proteins can be divided into four subfamilies: HD-ZIP I, HD-ZIP II, HD-ZIP III, and HD-ZIP IV [13].
HD-ZIP I protein is considered to be involved in the control of plant growth and development and the response to abiotic stress [14]. It is reported that there was a response of HD-ZIP I to acetic acid (ABA) at the transcript level under stress in Arabidopsis [15]. Under NaCl stress, the expressions of Gmhdz51 and Gmhdz 83 of cotton are substantially upregulated at 12 h after treatment [16]. In cotton, GhHB1 (belongs to HD-ZIP I subfamily) may be involved in salt stress and ABA treatment, and the expression of Gh-HB1 increases significantly in early stage of roots, and then decreases sharply, which suggests that HD-ZIP I played an important role in the early development of root [17]. In Arabidopsis, AtHB1 works at a downstream location, and AtPIF1 promotes hypocotyl elongation, particularly in response to short-day photoperiods [18], and mediates the apoptosis of leaf cell [19].
HD-ZIP II subfamily can induce shade avoidance reactions in plants through light signal transduction [20]. For example, when the seeds are stimulated with far-red light in the late germination period, they induce shade avoidance responses [21]. La Rota et al. [22] find that HD-ZIP III is involved in the Arabidopsis sepal development. HD-ZIP is also responsible for regulating plant cell differentiation and participating in the development of apical meristems, embryos, and vascular systems [15]. There are five family members of Arabidopsis HD-ZIP [13], due to differences in domains and expression patterns, each member plays different role [23][24][25]. The PCN gene in poplar belongs to the HD-ZIP subfamily and has a role in regulating xylem cell differentiation [26]. Transgenic poplars overexpressing PCN exhibited a slow deformation of wood and phloem and upregulation of endogenous hormone expression [28]. HD-ZIP IV subfamily is mainly involved in epithelial cell differentiation and root development [29]. In Arabidopsis thaliana, PDF2, ATML1, and ATHB10 are HD-ZIPIV proteins, which have a regulatory effect on the specific expression of outer cortical cells [30], and the double mutants of PDF2 and ATML1 exhibit epidermal deletion [31]. The OCL1 gene is a member of the maize HD-ZIP subfamily, and the N-terminal amino acid of its START domain plays a decisive role in the activity of this gene; overexpression can lead to delaying in flowering [32].
Pepper is a major vegetable, which is widely cultivated in the world. It is both fresh food, and processing raw material for seasoning, medicine, and cosmetics. It has very significant economic value [33]. The pepper genome has been the sequence in 2014 and can be used for gene prediction and annotation and public use [34,35]. Several transcription factor families such as DOF, WRKY, AP2/ERF, and NAC in pepper have been studied [36][37][38][39]. HD-ZIP family genes have been reported in Arabidopsis, rice, poplar, corn, and other plants [40,41], but so far, there is no systematic study of peppers HD-ZIP family. Our research analyzed the bioinformatic characteristics of the HD-ZIP gene family of pepper and supported a theoretical basis for studying the function of this gene in response to salt stress.

Genome-Wide Identification of HD-ZIP Family Genes in
Pepper. We downloaded the pepper HD-ZIP protein sequence (PF00046 and PF02183) from the Pfam database (http://pfam .sanger.ac.uk/) and applied BLAST alignment to the pepper genome database PGP (http://peppergenome.snu.ac.kr/, the protein sequences annotated in CM334 and Zunla-1), the default parameters (Limit Expect Value 1e-5) output data, and the pepper CM334 was identified using the HMMER3.1 software (http://hmmer.org/). For all CaHD-ZIP proteins in the genome, the default output E value is <1 × 10 − 5, and the CaHD-ZIP sequence in the plant transcription factor database is combined to accurately obtain the target sequence [42]. With the help of NCBI-CDD (https://www.ncbi.nlm.nih.gov/ Structure/cdd/wrpsb.cgi) and SMART (http://smart.emblheidelberg .de/smart/save_user_preferences.pl) tools, protein domain identification was performed, and genes without the HD-ZIP main domain were deleted [43]. With the ExPASy-ProtParam tool (https://web.expasy.org/protparam/), the isoelectric point (pI) and molecular weight (MW) of the target sequence are predicted [44]. The online software Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/) was used to predict the exon/intron structure of CaHD-ZIP [45].

Phylogenetic Analysis.
To investigate the phylogenetic relationships of the HD-ZIPs among C. annuum L, O. sativa, and A. thaliana, multiple HD-ZIP protein sequences were aligned, and an unrooted phylogenetic tree was constructed in MEGA 5.05 [46]. The phylogenetic tree was constructed using the neighbor-joining (NJ) method. In the phylogenetic tree, group pattern was evaluated with bootstraps (1000 replicates).

Conserved Motif
Analysis. The MEME software (http:// meme-suite.org/tools/meme) was used to identify conserved motifs in CaHD-ZIP, the maximum number of motifs was 25, and other variables were the default values.

Chromosomal Location and Gene Duplication.
The protein-coding sequence of CaHD-ZIP was mapped to the pepper genome database using BLASTn, and the gene was displayed on the chromosome by TBtools [47].
Plant gene duplication database (Plant Genome Duplication Database PGDD http://chibba.agtec.uga.edu/ duplication/index/locus) was used to identify the gene duplication of pepper HD-ZIP gene [48]. Ka, Ks, and Ka/Ks were estimated using DnaSPV5 [49]. The Ks value of each pair is used to estimate the replication time by the following formula: replication time = Ks/2λ, where λ = 6:1 × 10 −9 . International Journal of Genomics protein and the Arabidopsis HD-ZIP protein, 32 Arabidopsis HD-ZIP proteins which represent the 40 pepper HD-ZIP proteins are uploaded to the string website [51] (https:// string-db.org/) to predict protein interactions. The online program run with default parameters.

Transcriptome Analysis of CaHD-ZIP in Different
Tissues. Based on the CM334 RNA-seq [34], the expression patterns of CaHD-ZIP in different stages and tissues were analyzed; the tissues include root, stem, leaf, pericarp (PC), and placenta (PL) at 6, 16, and 25 days postanthesis (DPA), PC and PL at mature green (MG) and at breaker (B) stages, and PC and PL at 5 and 10 days postbreaker (B5 and B10, respectively).

Plant
Cultivation and NaCl Stress. Healthy pepper seeds (Capsicum annuum L. var. conoides (Mill) Irish) were soaked at room temperature for 6-8 h; then, the seeds were placed in a germination box and covered with wet gauze at 28 ± 2°C. The germinated seeds were sown into a pot with substrate (Vpeat : Vvermiculite = 2 : 1). When the sixth leaves appeared, the seedlings were transferred to barral with 8 L distilled water. Each barrel had four peppers. After a week, the seedlings were treated with 100 mM NaCl solution, and those cultivated with distilled water were control. The pH was adjusted to 7.0 using H2SO4 or NaOH.

Identification and Structure Analysis of the CaHD-ZIP
Gene Family in Pepper. 40 CaHD-ZIP target sequences were obtained, which were named with CaHD-ZIP01 to CaHD-ZI-P40 (Table 1). It could be seen from Table 1 that the num-ber of amino acids in each CaHD-ZIP sequence was between 119 and 841; the theoretical isoelectric point was between 4.54 and 9.85; the molecular weight was between 14.04 and 92.56 kD; the instability coefficient results showed that except CaHDZ29 and CaHDZ32, the other 38 CaHD-ZIP family members were unstable proteins ( Table 1). The core domain analysis showed that the CaHD-ZIPI subfamily had two domains, HD and LZ. Report to the CaHD-ZIPI, CaHD-ZIPII had an additional N-term. For CaHD-ZIP III and CaHD-ZIP IV subfamilies, in addition to the HD and LZ domains, they had a START domain and a MEKHLA domain (S1A). Analysis of the predicted introns and exons of the pepper HD-ZIP gene revealed that 5 genes (CaHDZ 03, CaHDZ17, CaHDZ23, CaHDZP35, and CaHDZ38) had no introns in the pepper genome. Most CaHD-ZIP genes contained 3-18 exon in the coding DNA sequence. 216 introns had 0 phases, and 1 had 2 phases (S1 B).

Phylogenetic Analysis of CaHD-ZIP Genes.
In order to examine the phylogenetic relationship between the 32 HD-ZIP transcription factors in pepper and the known members of other plants, we created a rootless development tree between Arabidopsis, rice, and pepper and implemented it in the MEGA 6 software. The phylogenetic tree implied that there are four groups of HD-ZIP, which is similar to previous studies on sesame, poplar, and corn [40,41,53]. A number of HD-ZIP I to HD-ZIP IV members of pepper are 14, 10, 5, and 11. The tree also showed that most CaHD-ZIP proteins move closer to members of Arabidopsis thaliana than members from rice ( Figure 1). For example, in the third group, CaHD-ZIP 11 was clustered with AtPHV and AtPHB, while Oshox 32 and Oshox 33 in the third group are clustered in a single clade. These results offered an important basis for the prediction of the function of pepper HD-ZIP protein.

Analysis of Conserved Motifs of the CaHD-ZIP Gene
Family. The conserved motif of CaHD-ZIP protein was further analyzed using the meme software. The software detected a total of 25 motifs in the 40 CaHD-ZIP proteins, which were designated 1 to 25 ( Figure 1 and S2). As expected, all identified CaHD-ZIP proteins (except CaHDZ06, 18) contained the LZ domain (motif 3) and the HD domain (motifs 1 and 2). A START field (topic 8) was found in the members of the third and fourth groups but was not found in the first and second groups. Among the third category, Motif15 and Motif18 were found to correspond to the Mekhla domains e. In addition to these, new functional motifs, some domains with unknown functions were also discovered, for example, Motif 6, 9, 10, 11, 12, 23, and 25 (only detected members of the fourth group) and Motif 14, 17, and 19 (only found in members of the third group). The results also indicated that members of the same group of CaHD-ZIP usually had similar motifs and therefore may have similar function.

Chromosomal Locations and Syntenic Analysis. 40
CaHD-ZIP genes were located on twelve chromosomes, 7 CaHD-ZIP genes located on chromosome 02 (CaHDZ09, 10, 11, 12, 13, 14 and CaHDZ15), 6 CaHD-ZIP genes were located on the long arms of chromosome 01 and 03, 3 International Journal of Genomics chromosomes 4, 6, 11, and 12 included 3 genes, respectively, and chromosomes 09 and 10 had two genes, respectively, each of chromosome of 5, 7, and 8 had one CaHD-ZIP gene, and chromosome 0 had two genes of CaHDZ01 and CaHDZ02 (Figure 2). Analysis of HD-ZIP genes duplication of pepper showed that two pairs of paralogism genes (CaH-DZ18 and CaHDZ28 and CaHDZ12 and CaHDZ22) were fragment rep-lication. The nonsynonymous substitution rate (Ka), the synonymous substitution rate (Ks), and the Ka/Ks ratio were shown in S3. It was reported that CaHD-ZIP's two-segment replication happened between 58.28 million and 88.24 million years ago. The Ka/Ks values of the two replication pairs were lower than 0.3, which indicated that there were no significant functional differences between these CaHD-ZIP genes after the replication event.  5 International Journal of Genomics hypoxia-specific induction was only found in two members (CaHDZ06 and CaHDZ36). CaHDZ03 has two closely connected defensive stress response and ABA action elements.

Prediction of Protein-Protein Interaction Network.
In order to further understand the interaction of pepper HD-ZIP proteins, an interactive network based on Arabidopsis orthodoxy was established using STRING. The results showed that there was just one pair of ATML1 (CaHDZ22, 32) and At4g048909 (CaHDZ12, 31). There was a coexpres-sion phenomenon, and the protein homology between the two was high. It has been confirmed by the protein twohybrid test on Arabidopsis [55]. Three high homology regions were verified, one of them was located on HB-2 (CaHDZ27), and it had high homology to HAT22 (CaHDZ9, 13, 24, 29), HAT14 (CaHDZ06, 33), and HAT3 (CaHDZ08). The second region was centered on HB5 (CaHDZ19, 23), and it had high homology with HB6 (CaHDZ15, 25), HB16 (CaHDZ35, 38), and ATHB-7 (CaHDZ03, 17, 26, 30). The last area was a high correlation with HDG11 (CaHDZ28) and    International Journal of Genomics HDG4 (CaHDZ05). The relationship of other HD-ZIPs had yet to be explored (Figure 4).

Expression Analysis of CaHD-ZIP Genes in Different
Tissues and Development Stages. To study the role of CaHD-ZIP in pepper growth and development, we cited publicly available RNA-seq data from 5 tissues (root, stem, leaf, peel, and placenta) to generate CaHD-ZIPs heat map of transcription patterns, which included seven developmental stages of the peel and placenta ( Figure 5 Figure 5(b), S4). It was noted that CaHDZ 01, CaHDZ04, CaHDZ 09, CaHDZ32, CaH-DZ13, CaHDZ19, and CaHDZ35 had higher expression levels after 4 h of salt stress, and CaHDZ 03, CaH-DZ11, and CaHDZ17 had higher expression after 58 h of salt stress. Expression levels of CaHDZ 06 and Ca-HDZ27 were lower than those of the control after 4 h and 58 h of salt stress. It was useful to note that the expression level of CaHDZ03 increased significantly after 4 h of salt stress.

Expression Analysis of CaHD-ZIP Genes in Response to
Salt Stresses. Based on the differential expression data of the transcription, 9 genes were selected in different subfamilies, and qRT-PCR was used to find out the role of the gene under salt stress (Figure 6). The expression levels of CaHDZ03, CaHDZ04, CaHDZ25, CaHDZ32, CaHDZ35, and CaHDZ39 were higher than that of the control after 4 h salt stress; the expression levels of CaHDZ03, CaHDZ04, CaHDZ10, C-aHDZ21, and CaHDZ33 decreased after 4 h of salt stress. The expression of the CaHDZ21, CaHDZ33, and CaHDZ32 genes decreased after 58 h of salt stress. CaHDZ03 and CaHDZ10 can be induced to increase the expression level under short-term salt stress and decreased expression level under long-term salt stress. These results suggested that these genes may play a significant role in peppers response to salt stress.

Discussion
HD-ZIP, as a plant-specific transcription factor, is closely linked to abiotic stress. HD-ZIP transcription factors are widely disseminated in different plants. HD-ZIP genes are highly conserved, but their proteins are different. This study got 40 HD-ZIP transcription factors; those factors contain a highly conserved domain, but most members are unstable proteins. We analyzed 217 introns, 216 of them have phase 0, only one of them has phase 2, and all the introns were    9 International Journal of Genomics available on both sides of the exon. This phase was called symmetrical exons. Based on the early intron hypothesis [56], excessive phase 0 introns and symmetrical exons can effectively promote exon shuffling by avoiding the interruption of the open reading frame, which could accelerate the recombination and exchange of protein domains [57]. There were a large number of phase 0 introns and symmetric exons of CaHD-ZIP genes in our study. It was corresponding with the statement and indicates that exon shuffling may be playing an important role in the evolution.
The phylogenetic tree was constructed using HDZ proteins of pepper, rice, and Arabidopsis. HDZ proteins of pepper can be divided into four categories (HD-ZIP I-IV), and the number of HD-ZIP I, II, III, and IV are 14, 10, and 5, respectively, 11, and 17, 10, 5, and 16 in Arabidopsis [58]; 12, 9, 4, and 7 in physic nut [59]; 11, 7, 5, and 8 in grape [60]; 16, 10, 9, and 10 in sesame [53]. These results suggested that the HD-ZIPI protein was the most abundant type of pepper HD-ZIP transcription factor, and the amount of HD-ZIPIII protein in pepper was the same as that of most plants. In addition, the phylogenetic tree also showed that most CaHDZ proteins were closer to members of Arabidopsis thaliana than members from rice [59] (Figure 7. Motif analysis also demonstrated that CaHDZ protein motifs distrib-uted differently with subfamilies, and genes of the same family had similar structure and function (Figure 1, S2), which provided a powerful guarantee for the evolution of pepper. Gene duplication is the highest evolutionary mechanism that helps plants adapt to various environmental stresses [15]. Compared with 41 pairs of soybean [16] and 10 pairs of cassava [61] paralogism HD-ZIP genes, there were only two pairs of pepper, CaHZ18 and CaHDZ28 and CaHDZ12 and CaHDZ22; those genes distributed on different chromosomes. Paralogous genes originated from the partial duplication of CaHD-ZIP between chromosomes. Replication of two fragments occurred between 5828 and 88.24 million years ago. The genomes of most polyploid plants contain a large number of repetitive chromosome segments, so the probability of partial repetitions is significantly higher than that of tandem repeats and translocations [62]. It speculated that the evolution of the HD-ZIP family of pepper was slow.
More and more evidence shows that the HD-ZIP gene is associated with plant growth and development [16,40,53]. For example, AtHB2 regulates the shade response of red light/far red light, which is linked to the formation of lateral roots. In sesame, HD-ZIP I, II, and III are widely expressed in all tissues [53], and similar phenomena had been found  10 International Journal of Genomics in pepper, but the number of CaHDZ genes expressed was less than that of sesame. In addition, we also found that the CaHD-ZIPIV gene in pepper showed a clear tissue-specific expression pattern ( Figure 5). For instance, CaHDZ28 was hardly expressed in roots but was expressed in other tissues; the expression of CaHDZ12 was opposite to that of CaHDZ28. HD-ZIP IV gene is highly expressed in young leaves and flowers of tomato [63]. These results indicated that different subgroups of HD-ZIP genes have distinct functions. Members of the CaHD-ZIP family have multiple pressure-related action elements upstream of the start codon. These case elements are divided into four main subgroups: stress-response, hormone responsiveness, photosensitivity, and MYB binding site [54]. ABA response element, lowtemperature response element (LTRE), and dehydration response element (Dre) are the main transcription factors regulating ABA signal transduction and participate in salt and drought stress [64]. CaHD-ZIP gene contained ABA signal transduction, and the greater the number of related action elements, the higher the gene expression level under salt stress. A lot of evidence shows that HD-ZIP I protein is involved in developmental reprogramming, c and coping with environmental pressure [65]. For example, AtHB7 and AtHB12 can act as negative feedback on the effect of ABA signal in plants under water shortage [65,66]. Some of the HD-ZIP I genes in pepper, such as CaHDZ03, 10, 17, 25, 34, and 35, were significantly induced in response to salt stress. Notably, the expression of CaHDZ07, 17, 26, and 30, homologs of Arabidopsis AtHB12 and AtHB7 genes, was markedly upregulated by salinity stresses, indicating that these genes may regulate drought and salt tolerance through an ABAdependent pathway. In particular, CaHDZ03 possessed two closely linked defense stress responses and ABA action elements, and its expression level increased significantly at 4 h under salt stress. The correlation analysis of qRT-PCR ( Figure 6) and transcription data (S4) showed that the expression of CaHDZ03 gene can be upregulated under short-term salt stress. In addition, we found that the defense stress-response may also be related to those gene expressions which contain defense stress-response elements. CaHDZ03, CaHDZ19, CaHDZ25, and CaHDZ26 had higher expressions than other genes under salt stress; CaHDZ10 was increased under short-term salt stress but decreased under long-term salt stress.

Conclusion
In the present study, we identified 40 HD-ZIP genes in pepper, including four types, which were unevenly distributed on 12 chromosomes. Syntenic analysis showed that CaHDZ18, CaHDZ28, CaHDZ12, and CaHDZ22 are fragment duplication. Two fragment duplications occurred at 58.28~88.24 million years ago. There were multiple upstream of the start codon of HD-ZIP family members. There was coexpression between ATML1 (CaHDZ22, CaHDZ32) and At4g048909 (CaHDZ12, CaHDZ31), and there were three regions with  11 International Journal of Genomics high homology of them. Expressions of the CaHD-ZIP gene ranged with plant tissue and developmental stage. The HD-ZIPI responded more significantly to salt stress than other subfamilies.

Data Availability
The attached table contains all the data used to fund the results of this study. Transcription data under salt stress has not been uploaded to the NCBI because this article has not been published, but S4 contains the data needed for this article. The qRT-PCR data used to support the findings of this study are included within the supplementary information file (Table S7).