Human Surfactant Protein-A Gene Locus for Genetic Studies in the Finnish Population

Lung surfactant lowers surface tension, but surfactant proteins also have other functions. Surfactant protein A (SP-A) has a well-defined role in innate immunity. The human SP-A gene locus is on chromosome 10q21 through q24 and consists of two highly homologous functional SP-A genes (SP-A1 and SP-A2) and a pseudogene. Several alleles that differ by a single amino acid have been identified for both SP-A genes. The SP-A gene locus has been shown to be sufficiently polymorphic for genetic studies in the American population. In this study, we analysed the SP-A allele frequencies in a Finnish population (n = 790) and found them to differ from the frequencies observed in the US. Furthermore, we describe several new alleles for both SP-A genes. The heterozygosity indices and polymorphism information content values ranged from 0.50 to 0.62, indicating that the SP-A gene locus is polymorphic enough for studies associating the locus with pulmonary diseases.


Introduction
Surfactant is a complex mixture of lipids and proteins. Its best-characterised function is its ability to lower surface tension at the air-liquid interface, thus preventing lung collapse [1]. Surfactant also has other functions: it participates in innate host defence and takes part in the inflammatory processes of the lung [2]. The products of the surfactant protein (SP-A, SP-B, SP-C and SP-D) genes are crucial for many aspects of surfactant biology and physiology [2].
* Correspondence to: Mikko Hallman, MD, PhD, Department of Paediatrics, University of Oulu, Kajaanintie 50, Fin-90220 Oulu, Finland. Fax: +358 8 315 5559; E-mail: mhallman@cc.oulu.fi.
SP-A is the most abundant surfactant protein. It is a member of a family of collagenous carbohydrate-binding proteins called collectins [2]. The gene locus for human SP-A genes is located on chromosome 10q21 through q24 [5] and consists of two functional SP-A genes (SP-A1 and SP-A2) [13,20] and a pseudogene [14]. Each SP-A gene comprises four coding exons and spans a region of less than 5 kb. The DNA sequences of the two functional SP-A genes are highly homologous; within the coding regions the homology is > 99% [13,20]. Both gene products are required for fully functional and stable mature SP-A protein [9]. Each functional SP-A gene encodes a precursor that undergoes a number of post-translational modifications to give rise to a multimeric (octadecameric) active protein that consists of six trimeric subunits [2].
The SP-A gene locus has been reported to be sufficiently polymorphic to be an informative marker in genetic studies on the American population [7]. SP-A allele frequencies have been shown to vary between races [12]. The aim of the present study was to determine the allele and haplotype frequencies in a Finnish population. The SP-A haplotype frequencies were found to differ from those reported for the US. The SP-A gene locus was polymorphic enough to be used as a marker in genetic studies in the Finnish population as well.

Sample collection and study population
The sample collection for this study was carried out in the Department of Paediatrics, University of Oulu, Finland. The parents of the neonates gave written informed consent for their infants' blood samples to be used in this study. The ethical committee of the University Central Hospital approved the study protocol. Samples were collected from infants born in 1997–99.
Disease Markers 16 (2000) 119-124. ISSN 0278-0240 / $8.00 © 2000, IOS Press. All rights reserved.
Two hundred and twenty-five cord blood specimens were obtained from all infants born at term (i.e. gestation ≥ 37.0 weeks) during one month at the Oulu University Hospital. In addition, 354 blood samples from premature infants and 211 from infants born at term were analysed for SP-A genotypes. Altogether, 790 blood samples were genotyped for the SP-A genes.

DNA samples
Whole blood samples (0.5–3 ml) were collected from the cord into plastic EDTA tubes and stored at −70 °C until analysis. Genomic DNA was isolated from the whole blood specimens using the Puregene DNA Isolation Kit (Gentra Systems). An aliquot of the DNA solution was diluted to 50 ng/µl to be used for PCR amplification. When whole blood samples were not available (66 subjects), genotypes were determined using a blood spot dried on filter paper. A 3 mm disk (corresponding to about 12,000 white blood cells) was punched from the blood spot using a hand-held paper punch (Wallac, Finland). The punch was decontaminated between samples by repeatedly punching clean filter paper. DNA was bound to the disk and cellular contaminants were released by three successive 15-minute incubations with 50 µl of DNA Purification Solution (Gentra Systems), followed by three washes with 100% ethanol. After drying at 55 °C or room temperature, the purified paper disk was used directly as a template for PCR amplification. A blank paper disk treated in the same manner was included in each series of PCR reactions as a control for DNA cross-contamination.

Sequencing human SP-A1 and SP-A2 genes
Gene-specific amplifications of the SP-A1 and SP-A2 genes were performed using two sets of gene-specific primers, with the second set nested within the first. A gene-specific forward primer, 5′-ACTCCATGACTGACCACCTT-3′ at nucleotide position 469-488 of the SP-A1 gene (Katyal et al. [13]) or 5′-ATCACTGACTGTGAGAGGGT-3′ at position 472-491 of the SP-A2 gene, was used together with a common reverse primer 5′-TGCCACAGAGACCTCAGAGT-3′ at position 3845-3864 for the first PCR amplification. The 10 µl reaction mixture contained 50 ng of template DNA, 1 µl of 10x PCR buffer, 0.2 µl of AmpliTaq Gold DNA Polymerase, 0.1 mM of dNTPs and 0.2 µM of each primer. The cycling conditions were as follows: initial denaturation at 95 °C for 5 min, followed by 26 cycles at 95 °C for 30 s, 58 °C for 30 s and 72 °C for 2 min 30 s, followed by final extension at 72 °C for 5 min. For the nested PCR reaction, a common forward primer 5′-GATGGGCTCACGGCCATCCC-3′ at position 1023-1042 was used together with a gene-specific reverse primer 5′-GAGGCCGAAGGCCAGAGAGC-3′ at position 3364-3387 (SP-A1) or 5′-GAAACTGAAGGCCAGACAGGA-3′ at position 3374-3397 (SP-A2). The 50 µl reaction mixture contained 5 µl of 10x PCR buffer, 0.5 µl of AmpliTaq Gold Polymerase, 0.1 mM of dNTPs and 0.2 µM of each primer. The cycling conditions were as follows: initial denaturation at 95 °C for 5 min, followed by 26 cycles at 95 °C for 30 s, 58 °C for 30 s and 72 °C for 2 min 30 s, followed by final extension at 72 °C for 10 min. The resulting 2.4 kb fragments were purified with a DNA purification kit (Qiagen) and used as a template for sequencing.
The following sequencing primers were used for both SP-A genes unless stated otherwise. Exon 2 was sequenced using the forward primer 5′-GATGGGCTCACGGCCATCCC-3′; exon 3 using the forward primer 5′-ACCAGTTGTGGGTGACAGAT-3′ and the reverse primer 5′-GGGTTTGTCTGATCCCCATC-3′; exon 4 using the forward primer 5′-GGGCAGAGTTCCAGGATTG-3′; and exon 5 using the forward primer 5′-GCTTAGAGACAAAGTGGTCA-3′, together with the reverse primer 5′-GAGGCCGAAGGCCAGAGAGC-3′ for SP-A1 and the reverse primer 5′-GAAACTGAAGGCCAGACAGGA-3′ for SP-A2.
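As a quick consistency check on the nested PCR described above, the expected product sizes can be estimated from the stated primer positions. The sketch below assumes that the common forward primer position (1023) and the gene-specific reverse primer end positions (3387 and 3397) refer to the same coordinate system within each gene and that positions are inclusive; the function name is illustrative only.

```python
def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length in bp of a PCR product spanning inclusive positions."""
    return reverse_end - forward_start + 1


# Positions taken from the text; treating them as one shared, inclusive
# coordinate system per gene is an assumption made for this estimate.
NESTED_FORWARD_START = 1023
REVERSE_PRIMER_END = {"SP-A1": 3387, "SP-A2": 3397}

lengths = {
    gene: amplicon_length(NESTED_FORWARD_START, end)
    for gene, end in REVERSE_PRIMER_END.items()
}

for gene, bp in lengths.items():
    # Report the size in kb to one decimal place.
    print(f"{gene}: {bp} bp (about {bp / 1000:.1f} kb)")
```

Under these assumptions both products come out at roughly 2.4 kb, in line with the fragment size reported for the nested amplification.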

Genotyping of SP-A genes
Genotyping was carried out as described earlier [3]. In brief, SP-A genes were amplified using gene-specific primers and conditions as described previously [7]. Genomic clones containing the SP-A1 gene and the SP-A2 gene were included in all PCR reactions as controls for the gene specificity of amplification. A PCR-cRFLP-based method was used to detect single nucleotide polymorphisms at codons 19, 50, 62, 133 and 219 of the SP-A1 gene and at codons 9, 91, 140 and 223 of the SP-A2 gene. Codon 85 was analysed for both genes to further ensure the gene specificity of the PCR amplification. Different combinations of polymorphisms at these sites distinguish between different alleles. At present, 19 different alleles have been described for the SP-A1 gene (denoted as 6Aⁿ) and 15 for SP-A2 (denoted as 1Aⁿ) [3]. The two SP-A genes have been shown to be in marked linkage disequilibrium [7] and are thus suitable for haplotype analysis. The haplotypes for the SP-A1 and SP-A2 genes are denoted as 6Aⁿ/1Aⁿ.

Statistical analyses
Allele frequency comparisons were performed using chi-square analysis. The allele distributions in the American and Finnish populations were compared using 2 × k tables. The observed genotype frequencies were compared to the expected Hardy-Weinberg distribution using chi-square analysis. To ensure the validity of the chi-square analysis, rare genotypes with expected cell counts of less than five were pooled together. The observed heterozygosity was the fraction of heterozygotes in the study population. The expected heterozygosity was determined on the basis of the expected distribution of different alleles in a given population, assuming random mating. Polymorphism information content (PIC) was calculated according to the formula

$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} p_i^{2} - \sum_{i=1}^{n} \sum_{j \neq i} p_i^{2} p_j^{2}$$

where p_i is the frequency of the ith allele, n is the total number of alleles and i ≠ j. Haplotype frequencies were first determined on the basis of homozygous genotypes. Using this information, haplotypes for heterozygous genotypes were assigned according to the highest likelihood. The statistical significance of the linkage disequilibrium was calculated by comparing the expected haplotype frequencies (based on allele frequencies) to the observed ones using chi-square analysis.
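The Hardy-Weinberg comparison with pooling of rare genotypes described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the input format (a list of two-allele tuples) and the example genotypes are assumptions, and the p-value lookup is left to a statistics package.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotypes, min_expected=5.0):
    """Chi-square statistic for observed vs. Hardy-Weinberg expected
    genotype counts. Genotypes whose expected count is below
    min_expected are pooled into a single class.
    Returns (statistic, number_of_classes_compared)."""
    n = len(genotypes)
    # Each individual contributes two alleles.
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    # Treat genotypes as unordered pairs.
    observed = Counter(tuple(sorted(g)) for g in genotypes)

    chi2 = 0.0
    classes = 0
    pooled_obs = 0.0
    pooled_exp = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # p^2 for homozygotes, 2pq for heterozygotes.
        prob = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
        expected = n * prob
        obs = observed.get((a, b), 0)
        if expected < min_expected:
            pooled_obs += obs
            pooled_exp += expected
        else:
            chi2 += (obs - expected) ** 2 / expected
            classes += 1
    if pooled_exp > 0:
        chi2 += (pooled_obs - pooled_exp) ** 2 / pooled_exp
        classes += 1
    return chi2, classes


# Hypothetical genotypes (allele labels "A" and "B" are placeholders).
example = [("A", "A")] * 25 + [("A", "B")] * 50 + [("B", "B")] * 25
statistic, n_classes = hwe_chi_square(example)
```

The example population is exactly in Hardy-Weinberg proportions, so the statistic is zero. The appropriate degrees of freedom depend on the number of classes remaining after pooling and on the number of allele frequencies estimated from the data, so they are not computed here.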

New SP-A alleles and differences compared to previously published SP-A sequences
To ensure that the alleles found in the American population were also present in the Finnish population, the coding exons of six individuals were sequenced. Variation was observed at seven codons within the coding exons of the SP-A1 gene (codons 19 GC/TG, 39 CAC/T, 50 C/GTC, 62 CCA/G, 133 ACA/G, 184 TAC/T and 219 C/TGG) and at five codons within the coding exons of the SP-A2 gene (codons 9 AA/CC, 50 C/GTC, 91 C/GCT, 140 TCC/T and 223 A/CAG). The allelic variation seen at codons 39 and 184 of SP-A1 and at codon 50 of SP-A2 has not been reported previously. Variations at codons 39 and 184 did not change the encoded amino acid, whereas variation at codon 50 caused a change of leucine to valine. Furthermore, codon 246 of the SP-A2 gene was consistently GAG, whereas it has been reported to be GAT [13]. The observed variation within the non-coding exons and introns is shown in Table 1.

Evaluation of the genetic informativeness of the SP-A gene locus
The usefulness of a genetic marker varies according to its degree of informativeness. This is shown by the polymorphism information content (PIC) value of the marker and by its heterozygosity index. The expected heterozygosity index was 0.56 for SP-A1 and 0.62 for SP-A2, while the observed indices were 0.54 and 0.61, respectively. The PIC values were 0.50 for SP-A1 and 0.59 for SP-A2. These results indicate that both of these SP-A gene loci are highly polymorphic and can therefore be used in studies evaluating the importance of the SP-A gene locus in pulmonary and infectious diseases.
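To make the two informativeness measures concrete, the sketch below computes the expected heterozygosity (one minus the sum of squared allele frequencies, assuming random mating) and the PIC in its commonly used form, which additionally subtracts the products of squared frequencies over all pairs of different alleles. That the study used exactly this PIC definition is an assumption, and the allele frequencies in the example are hypothetical, not the study's data.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity under random mating: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i != j of p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(len(freqs))
        if i != j
    )
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies for illustration only; they sum to 1.
example_freqs = [0.5, 0.3, 0.2]
h_exp = expected_heterozygosity(example_freqs)
pic_value = polymorphism_information_content(example_freqs)
print(f"expected heterozygosity = {h_exp:.4f}, PIC = {pic_value:.4f}")
```

PIC is never larger than the expected heterozygosity, which matches the pattern of the values reported above for both genes (0.50 versus 0.56 for SP-A1 and 0.59 versus 0.62 for SP-A2).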

Discussion
Surfactant proteins are crucial for many aspects of surfactant biology and physiology [2]. Apart from being essential for the normal surfactant structure [15], the human SP-A protein has a clearly defined role in innate host defence [16,17]. In addition to type II alveolar cells, the SP-A gene is expressed in both peripheral and central airways, including the Eustachian tube [18].
Individual SP-A alleles have been shown to have different properties. One SP-A1 allele (6A²) has been associated with low mRNA levels [6], whereas 6A³ allele-derived mRNA was more prone to the destabilising effect of glucocorticoid treatment than 6A²- or 1A⁰-derived mRNA in a cell culture model [11]. Furthermore, specific SP-A1 and SP-A2 alleles (6A² and 1A⁰) have been suggested to be associated with respiratory distress syndrome (RDS) in premature infants [12,19]. Deficiency in SP-A expression in mice is associated with enhanced susceptibility to specific pulmonary infections [2]. Accordingly, one can speculate that allelic variability in the SP-A gene locus plays a role in the genetic predisposition to a variety of other pulmonary and infectious diseases. In addition to RDS, these include recurrent infections of the lung and the airways, acute respiratory failure in adults (ARDS), chronic obstructive pulmonary disease [4,8] and reactive airway disease. The three new allelic variants of the SP-A genes, described in the present study, are so-called silent variants. The population frequency and possible association of these alleles with disease remain to be studied.

Conclusion
Based on our results, the SP-A gene locus is sufficiently polymorphic to be used as a marker in studies of the genetic component of pulmonary and infectious diseases in the Finnish population. The new single nucleotide polymorphism seen in the present study potentially increases the information value of the SP-A gene locus as a genetic marker.