Genetic Variants of Surfactant Proteins A, B, C, and D in Bronchopulmonary Dysplasia

BPD_28D (O2 dependency at 28 days of life) and BPD_36W (O2 dependency at 36 wks post-menstrual age) are diseases of prematurely born infants exposed to mechanical ventilation and/or oxygen supplementation. To determine whether genetic variants of surfactant proteins (SP-A, SP-B, SP-C, and SP-D) and SP-B-linked microsatellite markers are risk factors in BPD, we performed a family-based association study using a Greek study group of 71 neonates (<30 wks gestational age) from 60 families, with 52 BPD_28D and 19 BPD_36W affected infants. Genotyping was performed using newly designed pyrosequencing assays and previously published methods. Associations between genetic variants of SPs and BPD subgroups were determined using the Transmission Disequilibrium Test (TDT) and the Family Based Association Test (FBAT). Significant associations (p ≤ 0.01) were observed for alleles of SP-B and SP-B-linked microsatellite markers, and for haplotypes of SP-A, SP-D, and SP-B. Specifically, allele B-18_C was associated with susceptibility in BPD_36W, and microsatellite marker allele AAGG_6 was associated with susceptibility in the BPD_28D/36W group. Haplotype analysis revealed ten susceptibility haplotypes and one protective haplotype for SP-B and SP-B-linked microsatellite markers, and two protective SP-A-SP-D haplotypes. The data indicate that SP loci are linked to BPD. Studies in different study groups and/or of larger sample size are warranted to confirm these observations and to delineate the genetic background of BPD subgroups.


Introduction
Bronchopulmonary dysplasia (BPD) is the most common cause of morbidity in prematurely born infants who require prolonged mechanical ventilation [30]. BPD is almost exclusively a disease of severely premature (24-28 weeks of gestation), extremely low birth weight (ELBW, less than 1000 g) infants who have been exposed to high airway pressures and/or high inspired oxygen concentrations [26,38]. Despite the dramatic improvements in neonatal care, the incidence of BPD is on the rise due to the increased survival of the smallest infants [2]. BPD infants have structurally and biochemically immature lungs, the development of which is disrupted by a number of harmful stimuli [29,30]. These processes lead to impaired alveolar and capillary growth and, overall, to abnormal structural development of the lung that can have long-term consequences [3,14].
It has been noted that genetic factors may contribute to BPD [27]. Gene expression profile studies of an animal model of BPD implicated several genes in the pathogenesis of this disease [57]. Candidate gene and linkage analysis are approaches used to identify genes associated with multifactorial diseases such as BPD. Although there have been very few studies on the genetic background of BPD in humans, preliminary reports have shown an association between BPD and surfactant proteins [39,62]. A recent case control study has demonstrated that the frequency of SP-B intron 4 deletion variant alleles is increased in BPD versus control infants even when essential external confounding factors, such as birth order, are included in the analysis [50].
Surfactant proteins (SPs) are good candidate genes of neonatal disease due to their role in lung development and maturation [4,15,19,63]. These proteins are components of pulmonary surfactant, a lipoprotein complex necessary for lung function [15]. Surfactant protein (SP-) B and SP-C play important roles in surfactant structure and surface tension lowering properties [61], while SP-A and SP-D play a role in local host defense and regulation of inflammatory processes in the lung [6,46,65]. Deficiency of surfactant can result in Respiratory Distress Syndrome (RDS) in premature infants. The lungs of these infants have been identified to have low SP levels [9,10]. Association of genetic variants of SP-A and SP-B with RDS has been demonstrated by both case control and family-based linkage studies [16-19,21,22,41,42]. Since BPD is commonly preceded by RDS, overlapping underlying mechanisms regulated by the same genetic factors may play a role in the etiology of both diseases.
In this report, we sought to perform a family based association study in order to identify whether alleles and/or haplotypes of surfactant protein genes (SP-A, SP-B, SP-C, and SP-D) and SP-B-linked microsatellite markers are susceptibility or protective factors in BPD. Associations between single nucleotide polymorphisms (SNPs) of surfactant proteins and BPD were tested using Transmission Disequilibrium Test (TDT) [55] and Family Based Association Test (FBAT) [24,34,47]. The extended TDT (ETDT) [7,52] and multi-allelic FBAT [25] were used to test for markers with multiple alleles.
To accelerate genotype analysis necessary for this study, we adapted our previously described PCR-cRFLP method [11] to the recently introduced pyrosequencing method. Pyrosequencing is a primer extension based method that has been used for a variety of biological applications, including SNP genotyping, SNP discovery, haplotyping, insertion/deletion studies, methylation studies and allele frequency studies [12,49]. This flexible, high-throughput method of genotyping improves on our previously published PCR-cRFLP method [11] by reducing the time and effort necessary for data acquisition.

Table 1. Characteristics of population used in the study

Number of children per family    BPD at 28 days    BPD at 36 weeks
1                                35¥ + 6*          11 + 4*
2                                4                 2
3                                1                 0
Total number of families         46                17

*These families have a second child that was diagnosed with either RDS or the other form of BPD.
¥Two families have only mothers and no fathers; all others have both mother and father.
One family has only mother and no father; all others have both mother and father.

Study population
The study population consisted of 60 families from Greece with 71 affected infants, as outlined in Table 1. All infants required supplemental oxygen at 28 days of life. All infants were intubated and required FiO2 > 0.30 on day 1 of life. Sixty-four of them had RDS and were treated with surfactant. Out of 71 affected infants, 52 fulfilled the criteria for BPD at 28 days as described by Bancalari et al. [1]. All infants in the BPD at 28 days category met the Bancalari et al. criteria of supplemental oxygen requirement 28 days after birth, persistent abnormalities in the chest radiograph, and tachypnea in the presence of rales or retractions. This subgroup of infants is referred to here as BPD 28D. Nineteen infants went on to be oxygen dependent at 36 weeks of post-menstrual age [53]. This subgroup of infants is referred to here as BPD 36W. When the two subgroups are combined for analysis, they are referred to as BPD 28D/36W. The study protocol was approved by institutional committees of the participating hospitals, and written parental consent was obtained from each family.

Fig. 1 (legend, partial). [...] In this case two nucleotides (C and T) precede the SNP (A/G). Therefore, in the sequencing reaction, a G nucleotide (negative control, arrow) is dispensed. This is followed by dispensation of a C and a T that sequence the CT preceding the A/G SNP. Because another G nucleotide is immediately adjacent to the A/G SNP, the specific SNP signal for G overlaps with the flanking G, as represented by the double peak height in panel A.I. A negative control C nucleotide (arrow) is dispensed following the A/GG sequence, and sequencing of surrounding bases continues as in panel B. Panel C: Both SNPs (AA140 and AA223) are sequenced at once. Owing to the dispensation order set-up, the AA223 SNP is sequenced first (red bars), as in panel B, followed by a negative control T. The dispensation of the following C and T nucleotides sequences through both the AA223 and AA140 sequences at once (block arrow, red and blue bars overlap). Dispensation of the A and G nucleotides sequences through the AA140 SNP site (blue bars). This SNP site is followed by a negative control (T nucleotide, arrow). The remaining stretch of nucleotides is the surrounding sequence for both AA140 (AG, blue bar) and AA223 (CTTTTCCCCG, red bar).

Isolation of
according to manufacturer's instructions. DNA was eluted in the final step using 200 µl of DNase free dH2O (Ambion). Upon extraction DNA was stored at −20 °C until further handling.

2.3. Pyrosequencing: General considerations, primer design, template preparation and plate set-up

a) Background: Pyrosequencing is a primer based DNA sequencing method, where DNA synthesis is monitored by four different enzymes and detected through luminescence in the form of a light pulse. It is performed on single stranded DNA templates produced by PCR and separation of the biotinylated single strands (see below). Hybridization of a sequencing primer to the complementary bases on the single stranded PCR template then initiates nucleotide incorporation by DNA polymerase. As each new nucleotide is incorporated, a pyrophosphate (PPi) group is released in an equimolar proportion to the amount of the incorporated nucleotide. This pyrophosphate is converted to ATP by the ATP sulfurylase, which in turn drives the conversion of luciferin to oxyluciferin by luciferase. The light produced as a result of this reaction is detected by a charge-coupled device (CCD) camera. Each light signal is proportional to the number of nucleotides incorporated onto the DNA strand. As each nucleotide is dispensed, the complementary strand is synthesized and its signals generate a Pyrogram™. Each light signal corresponds to a single peak in the pyrogram. Pyrograms are scored by pattern-recognition software that compares the predicted SNP pattern (histogram) to the observed pattern (pyrogram) (Pyrosequencing AB, Uppsala, Sweden). The ability to quantitate each peak of the pyrogram provides numerous advantages. For example, efficiency of the PCR amplification can be determined from the peak height. Moreover, specificity of the template or any non-specific allelic amplification can be monitored by ensuring appropriate heights of specific peaks, and lack of incorporation of negative control nucleotides.

Table 2. SNPs of surfactant protein genes studied. SP-A1, SP-A2, SP-C, and SP-D SNP positions are based on the amino acid number of the amino acid sequence deduced from the cDNA (for SP-D the amino acid numbering starts after cleavage of the signal peptide, while for SP-A and SP-C it starts prior to the signal peptide cleavage). The first letter denotes the gene (A for SP-A, C for SP-C, D for SP-D), and the second letter (A) stands for amino acid. For SP-B SNPs, the numbers refer to the nucleotide position of the polymorphism [36]. The polymorphic variant of each SNP is given in parenthesis. For SP-A, haplotypes consisting of SNPs within the coding sequence that may or may not change the encoded amino acid are denoted as 6A or 1A for SP-A1 and SP-A2, respectively. For example, the SP-A2 1A 2 haplotype referred to in Fig. 3 consists of SNPs encoding amino acids 9, 91, 140, and 223 (AA9 C/AA91 G/AA140 C/AA223 C) [11]. The SP-D locus is in physical proximity with the SP-A locus [23], and in Fig. 3 the co-transmission of the SP-D SNPs with the 1A 2 haplotype is shown.

b) Primer design for pyrosequencing: PCR primers were designed using commercially available software DNAStar (www.dnastar.com) or Pro-oligo (www.changbioscience.com) (Table 3). Multiplex PCR and/or multiplex pyrosequencing primers were designed to ensure that no homo- and/or heterodimers occurred, and that there were no false priming sites for either of the primers on either template. A biotin label was added to the 5' end of one of the primers in each primer pair. The biotin labeling of the primers is currently carried out by Biomers (www.biomers.com), although other companies have been used in the past. Sequencing primers for the pyrosequencing reaction were designed using the SNP Primer Design Software (technical support web site of Pyrosequencing AB, Uppsala, Sweden). These primers anneal directly in front of the SNP position (or several nucleotides before it) on the reverse strand of the PCR template to be analyzed.
Depending on the position of the sequence primer docking, the dispensation order of nucleotides is determined by the Pyrosequencing SNP analysis software (Fig. 1). Nucleotides that are not present in the sequence can serve as negative controls, and are dispensed, if possible, at the beginning of the sequencing reaction as well as in positions flanking the SNP position (Fig. 1). The negative control ensures the specificity of the priming and sequencing reaction. The DNA strand to be analyzed is elongated through the SNP site.

c) Template preparation and plate set-up for pyrosequencing: Twenty µl of biotinylated PCR product was immobilized to streptavidin-Sepharose HP beads (Amersham Biosciences) and incubated with binding buffer (10 mmol/L Tris-HCl, 2 mol/L NaCl, 1 mmol/L EDTA, 1 mL/L Tween 20) in a 96-well microtiter plate for 10 min at 25 °C with mixing at 1400 rpm. The 96-well plate was then processed through vacuum filtration using the Vacuum Prep Tool (Pyrosequencing AB), where the bead-bound PCR products were transferred to the filter at the end of each 96-Filter probe, and the remaining liquid was removed by vacuum filtration. The Vacuum Prep Tool was passed through a series of solutions including the denaturation solution (0.2 mol/L NaOH) that removed non-biotinylated strands. Sepharose bead-bound biotinylated strands were retained on the filter and rinsed in washing buffer (10 mmol/L Tris-acetate, pH 7.6). The vacuum was removed to release the beads from the filter. Bead-bound single-stranded DNA templates for each SNP were resuspended in annealing buffer (20 mM Tris-acetate, pH 7.6; 2 mM magnesium acetate) containing 20 pmol of each respective sequencing primer, in wells of a PSQ 96 plate (Pyrosequencing AB). This mixture was then heated at 80 °C for 2 min on a compact heat block and cooled to room temperature for 10 min to facilitate annealing of the Pyrosequencing primers to templates.
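The relationship between dispensation order, negative controls, and predicted peak heights can be illustrated with a short sketch. It is not the Pyrosequencing AB software; the function names and the toy sequences `CTAG`/`CTGG` (a C and a T preceding an A/G SNP that is followed by a G, as in the Fig. 1 legend) are assumptions chosen for illustration.

```python
def predict_pyrogram(sequence, dispensation):
    """Predicted peak heights for one template.

    sequence: bases to be synthesized after the sequencing primer (5'->3').
    dispensation: order in which nucleotides are dispensed.
    A dispensed nucleotide is incorporated once for every consecutive
    matching base; a non-matching nucleotide gives a zero peak, which is
    what a negative control is expected to show.
    """
    pos = 0
    peaks = []
    for nt in dispensation:
        run = 0
        while pos < len(sequence) and sequence[pos] == nt:
            run += 1
            pos += 1
        peaks.append(run)
    return peaks


def predict_genotype_pyrogram(allele_sequences, dispensation):
    """Average the per-allele predictions (two sequences for a diploid sample)."""
    per_allele = [predict_pyrogram(s, dispensation) for s in allele_sequences]
    return [sum(p) / len(per_allele) for p in zip(*per_allele)]


# Hypothetical dispensation: G (negative control), C, T, A, G, C (negative control)
order = "GCTAGC"
homozygous_a = predict_genotype_pyrogram(["CTAG", "CTAG"], order)
homozygous_g = predict_genotype_pyrogram(["CTGG", "CTGG"], order)
heterozygous = predict_genotype_pyrogram(["CTAG", "CTGG"], order)
```

Under these assumptions the G homozygote shows no A peak and a double-height G peak, while the heterozygote shows a half-height A peak and a 1.5-height G peak; both negative controls remain at zero.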
The plate was placed into the PSQ HS 96MA System (Pyrosequencing AB) for analysis. Simplex pyrosequencing was performed for genotyping SNPs of all four genes. Where multiplexing could be achieved, duplexing of two SNPs of the same gene was designed, with the sequencing reaction taking place from either one or two PCR templates. Table 2 shows the SNPs studied for all four surfactant protein genes and SP-B linked microsatellite markers, and Table 3 lists the primers used in the study.

Table 3. Primers used in the study

Cycling conditions for SP-A (gene specific reaction) were as follows: 95 °C for 2 min followed by 35 cycles of 95 °C for 30 s, 58 °C for 1 min, and 72 °C for 1 min. This was followed by one cycle of 72 °C for 5 min. PCR profiles for the SP-A nested reaction were as follows: 95 °C for 2 min followed by 35 cycles of 95 °C for 30 s, 60 °C for 30 s, and 72 °C for 30 s. This was followed by one cycle of 72 °C for 5 min. An aliquot of the PCR reaction was run on an 8% PAGE gel to verify size and quantity of PCR product, and most importantly to ensure the presence of a single PCR band in simplex PCR or two bands in multiplex PCR. The outline of the SP-A SNP genotyping approach is presented in Fig. 2.

Genotyping of SP SNPs and SP-B microsatellite markers
ii) SP-A1: primer pair 1327A/293 was used to generate a 549 bp long PCR product, which was used as template for nested reactions for AA19, AA50, and AA62. Due to the spatial proximity of AA19 and AA50, these SNPs were amplified on the single DNA strand of a 187-base product generated with the 1327A/1328 primer pair. The 549 bp PCR product was also used as a template in a nested reaction with primers 1344/1345 to generate a fragment for analysis of AA62. The second gene spe-
Due to the cost of biotinylated primers, the objective was to have as many overlapping primers as possible in the nested reactions for the two genes. Therefore, the same primer pairs were used for AA9, AA19, and AA50; for AA133 and AA140; and for AA223 and AA219.
b) SP-A pyrosequencing: All pyrosequencing reactions were first performed in simplex, where a single SNP was analyzed at a time, using 20 µl of the PCR reaction. Where possible, cost effective multiplex pyrosequencing assays were designed. For SP-A1, AA19 and AA50 were sequenced in duplex using 20 µl of a single template, and both SNPs were analyzed in forward assays. A reverse AA133 assay was performed in multiplex with a forward AA219 assay, using 15 µl of AA133 PCR template and 25 µl of AA219 template, to achieve an optimal ratio of signal intensity from each template. For SP-A2, 20 µl of AA140 PCR product was pyrosequenced in multiplex with 20 µl of AA223 PCR product. Both SNPs were pyrosequenced either in simplex or multiplex assays as schematically presented in Fig. 1. Overall, complete genotyping of both SP-A genes requires 12 PCR reactions and 6 pyrosequencing reactions.

c) SP-A molecular haplotypes: The ability to pyrosequence AA19 and AA50 on a single DNA strand allows direct molecular haplotyping of these two SNPs. In SP-A1, the AA19(C)AA50(G) haplotype was not observed in 152 individual DNA samples genotyped by pyrosequencing and in 100 samples genotyped by PCR-cRFLP [11]. FBAT was used to estimate frequencies of the SP-A1 AA19(C/T)AA50(C/G) haplotypes in 1534 individuals. The frequency of the AA19(T)AA50(G) haplotype was 0.532, the frequency of AA19(T)AA50(C) was 0.383, the frequency of AA19(C)AA50(C) was 0.083, while the frequency of AA19(C)AA50(G) was 0.001. These observations together indicate that the SP-A1 haplotype AA19(C)AA50(G) is extremely rare (approximately 1 in 1000).
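When both SNPs are read from a single strand, each chromosome yields a directly observed (phased) haplotype, and frequencies follow from simple counting. The sketch below illustrates this with invented toy chromosomes; only the `reported` frequencies are taken from the text, and the function name is an assumption.

```python
from collections import Counter


def haplotype_frequencies(phased_haplotypes):
    """Relative frequency of each directly observed (AA19, AA50) haplotype."""
    counts = Counter(phased_haplotypes)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


# Toy data (hypothetical): ten chromosomes, each read as (AA19 allele, AA50 allele)
chromosomes = [("T", "G")] * 5 + [("T", "C")] * 4 + [("C", "C")] * 1
toy_freqs = haplotype_frequencies(chromosomes)

# Frequencies reported in the text for the four possible haplotypes
reported = {("T", "G"): 0.532, ("T", "C"): 0.383, ("C", "C"): 0.083, ("C", "G"): 0.001}
reported_total = sum(reported.values())  # 0.999 after rounding to three decimals
```

The reported values sum to 0.999 rather than 1, which is consistent with rounding each estimate to three decimal places.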

II. SP-B analysis:
a) SP-B SNP genotyping: Genotyping for the SP-B SNPs B-18(A/C), B1013(A/C), and B9306(A/G) was performed for all samples using the PCR based converted Restriction Fragment Length Polymorphisms (cRFLP) method, as described before [36,51]. Genotyping for SP-B marker B1580(C/T) was done by both PCR-cRFLP and pyrosequencing. PCR for marker B1580(C/T) was performed using oligos 1313/1390, and cycling conditions were as follows: 95 °C for 2 min followed by 35 cycles of 95 °C for 30 s, 58 °C for 1 min, and 72 °C for 1 min. This was followed by one cycle of 72 °C for 5 min. An aliquot of the PCR reaction was run on an 8% PAGE gel to verify size and quantity of PCR product, and most importantly to ensure the presence of a single PCR band. Pyrosequencing of the PCR product was done using sequencing primer 1341.

b) Microsatellite genotyping was performed for SP-B-linked microsatellites, AAGG, D2S2232, and D2S388, as described previously [31].

III. SP-C analysis:
Genotyping for SP-C markers CA138(A/C) and CA186(A/G) was performed as previously described [51] using the PCR-cRFLP method.

IV. SP-D analysis:
Simplex or multiplex PCR for DA11 and DA160 of SP-D was performed using primer pair 1301/1305 for DA11 and primer pair 1306/1307 for DA160. PCR profiles for both simplex and multiplex SP-D reactions were as follows: 95 °C for 2 min followed by 35 cycles of 95 °C for 30 s, 59 °C for 30 s, and 72 °C for 30 s. This was followed by one cycle of 72 °C for 5 min. An aliquot of the PCR reaction was run on an 8% PAGE gel to verify size and quantity of PCR product, and most importantly to ensure the presence of a single PCR band in simplex PCR or two bands in multiplex PCR. These SNPs were pyrosequenced either in simplex or multiplex forward assays using oligos 1301 and 1308, respectively.

Statistical analysis
The PedCheck program was used to check the compatibility of genotypes at each marker locus within families prior to analysis [44]. Marker loci with incompatible parental and offspring genotypes were treated as missing in those families. Association between alleles of surfactant protein genes and BPD was tested using the Transmission Disequilibrium Test (TDT) [56] and the Family Based Association Test (FBAT) [24,34,47].
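The kind of within-family compatibility check described above can be sketched for a single marker as follows. This is a minimal illustration and not PedCheck's algorithm; the function names and the representation of a genotype as a pair of allele labels (with `None` for an unavailable parent) are assumptions.

```python
def is_mendelian_consistent(father, mother, child):
    """True if the child's two alleles can be explained by the available parents.

    Genotypes are 2-tuples of allele labels; a missing parent is None.
    Works for SNPs and for multi-allelic microsatellite markers alike.
    """
    a, b = child
    if father is None and mother is None:
        return True
    if father is None:
        return a in mother or b in mother
    if mother is None:
        return a in father or b in father
    return (a in father and b in mother) or (a in mother and b in father)


def mask_incompatible(trios):
    """Treat a marker as missing in families whose genotypes are incompatible."""
    cleaned = []
    for father, mother, child in trios:
        if is_mendelian_consistent(father, mother, child):
            cleaned.append((father, mother, child))
        else:
            cleaned.append((None, None, None))  # marker set to missing for this family
    return cleaned


trios = [
    (("A", "C"), ("A", "A"), ("A", "C")),  # compatible
    (("A", "C"), ("A", "A"), ("C", "C")),  # incompatible: mother cannot transmit C
    (None, ("6", "7"), ("3", "6")),        # mother-only family, compatible
]
checked = mask_incompatible(trios)
```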
TDT analysis was performed using GENEHUNTER [33] (Whitehead Institute for Biomedical Research, MIT) to determine transmission of individual surfactant protein (SP-A, SP-B, SP-C, and SP-D) marker alleles and SP-B-linked microsatellite marker alleles from heterozygous parents to affected offspring, and also to test for transmission of SP-B haplotypes of two, three, and four marker loci. In this analysis it is sufficient for only one parent to be heterozygous for the marker in order for a family to be considered informative and thus used in the study. Extended TDT (ETDT) analysis was performed to assess linkage of a multi-allele locus to the disease locus [7,52].
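For a single allele, the TDT reduces to a McNemar-type comparison of how often heterozygous parents transmit versus do not transmit that allele to affected offspring, chi-square = (b − c)² / (b + c) with one degree of freedom. The sketch below is a minimal illustration for complete, Mendelian-consistent trios; it is not GENEHUNTER's implementation, and the function name and toy genotypes are assumptions.

```python
import math


def tdt(trios, allele):
    """Allele-wise TDT for complete, Mendelian-consistent trios.

    trios: iterable of (father, mother, affected_child) genotypes, each a 2-tuple.
    Returns (b, c, chi2, p): b = transmissions and c = non-transmissions of
    `allele` from parents carrying exactly one copy of it.
    """
    b = c = 0
    for father, mother, child in trios:
        parents = (father, mother)
        n_het = sum(1 for p in parents if p.count(allele) == 1)
        n_hom = sum(1 for p in parents if p.count(allele) == 2)
        # Copies in the child that must have come from heterozygous parents
        transmitted = child.count(allele) - n_hom
        b += transmitted
        c += n_het - transmitted
    if b + c == 0:
        return b, c, float("nan"), float("nan")
    chi2 = (b - c) ** 2 / (b + c)
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return b, c, chi2, p


# Toy trios (hypothetical genotypes at a C/A marker)
toy_trios = [
    (("A", "C"), ("A", "A"), ("C", "A")),  # one heterozygous parent, C transmitted
    (("A", "C"), ("A", "C"), ("C", "C")),  # both heterozygous, C transmitted twice
    (("A", "C"), ("C", "C"), ("A", "C")),  # heterozygous parent transmitted A
    (("A", "A"), ("A", "A"), ("A", "A")),  # uninformative
]
b, c, chi2, p = tdt(toy_trios, "C")
```

As stated in the text, a family contributes as soon as one parent is heterozygous; families with no heterozygous parent add nothing to b or c.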
FBAT analysis was performed using the online program, http://www.biostat.harvard.edu/fbat/fbat.htm. Bi-allelic FBAT analysis was performed for SP SNP markers and selected microsatellites [24,34,47]. Multi-allelic analysis, where association of the entire locus with disease is examined, was also performed for marker loci with multiple alleles [24]. All of the FBAT analyses were performed assuming an additive model. The minimum size [minsize] of FBAT analyses was set to 4, indicating that the test statistic was not computed when the number of informative families available was fewer than 4. Informative families are families with a non-zero contribution to the FBAT statistic. For each FBAT marker, the Z-statistic and the corresponding p value are listed. A significant p value and a positive Z-statistic are indicative of a susceptibility marker allele for disease, while a significant p value and a negative Z-statistic are indicative of a protective marker allele for disease. TDT and FBAT analyses were performed for the following groups: BPD 28D, BPD 36W, and BPD 28D/36W.
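The sign convention for the Z-statistic and the role of the minimum number of informative families can be illustrated with a simplified sketch of a bi-allelic, additive-model statistic. It assumes complete trios with one affected child each and is not the FBAT program, which also handles missing parents and larger sibships; the function names are assumptions.

```python
import math


def offspring_distribution(father, mother, allele):
    """Mendelian distribution of X = copies of `allele` in a child, given the parents."""
    dist = {}
    for pa in father:
        for ma in mother:
            x = (pa == allele) + (ma == allele)
            dist[x] = dist.get(x, 0.0) + 0.25
    return dist


def fbat_z(trios, allele, minsize=4):
    """Additive-model Z statistic; returns None below `minsize` informative families."""
    u = var = 0.0
    informative = 0
    for father, mother, child in trios:
        dist = offspring_distribution(father, mother, allele)
        mean = sum(x * p for x, p in dist.items())
        v = sum((x - mean) ** 2 * p for x, p in dist.items())
        if v == 0:
            continue  # no heterozygous parent: zero contribution to the statistic
        informative += 1
        u += child.count(allele) - mean  # observed minus expected allele count
        var += v
    if informative < minsize:
        return None
    z = u / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return z, p, informative


# Toy data (hypothetical): four trios in which the heterozygous father transmits C
toy = [(("A", "C"), ("A", "A"), ("A", "C"))] * 4
z_c, p_c, n_c = fbat_z(toy, "C")   # positive Z: allele over-transmitted
z_a, p_a, n_a = fbat_z(toy, "A")   # negative Z: allele under-transmitted
too_few = fbat_z(toy[:3], "C")     # fewer than 4 informative families
```

In this simplified trio setting, Z² equals the TDT chi-square (here 2² = 4 = (4 − 0)²/4), which is in line with the single-marker agreement between the two tests reported in this study.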
Stratified FBAT analyses were also performed within each group based on baby's steroid treatment, mother's steroid treatment, and baby's surfactant therapy. Haplotype analyses were performed using the FBAT program, which implements an EM based algorithm [25]. The EM algorithm is a general method of finding the maximum-likelihood estimate of the parameters of an underlying distribution from a given data set when the data set is incomplete or has missing values. For single marker analysis, FBAT and TDT gave the same results, although some differences were observed in the haplotype analysis, perhaps due to subtle differences in the assumptions each method makes (see discussion). For both TDT and FBAT analysis, results were noted as significant when p was smaller than or approaching 0.01. Results were not corrected for multiple comparisons.
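How an EM algorithm resolves missing phase information can be illustrated with the textbook case of estimating haplotype frequencies from unphased genotypes of unrelated individuals. This is a generic sketch, not the family-based implementation in FBAT; the function names and the toy genotypes are assumptions.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with an unphased genotype.

    genotype: list of per-SNP allele pairs, e.g. [("C", "T"), ("C", "G")].
    """
    pairs = set()
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = tuple(g[c] for g, c in zip(genotype, choice))
        h2 = tuple(g[1 - c] for g, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Maximum-likelihood haplotype frequencies by expectation-maximization."""
    expansions = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for pairs in expansions for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in expansions:
            # E-step: posterior weight of each phase resolution
            weights = [freq[h1] * freq[h2] * (1 if h1 == h2 else 2) for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: re-estimate frequencies from expected haplotype counts
        n_chrom = 2 * len(genotypes)
        freq = {h: counts[h] / n_chrom for h in haps}
    return freq


# Toy genotypes (hypothetical) at two SNPs; only the double heterozygotes are ambiguous
genotypes = (
    [[("T", "T"), ("G", "G")]] * 4
    + [[("T", "T"), ("C", "C")]] * 3
    + [[("C", "C"), ("C", "C")]] * 1
    + [[("C", "T"), ("C", "G")]] * 2
)
est = em_haplotype_frequencies(genotypes)
```

In this toy data set the ambiguous double heterozygotes are resolved almost entirely as C-C/T-G, because those haplotypes are supported by the unambiguous individuals, and the estimated frequency of the C-G haplotype shrinks towards zero.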
Inferred haplotype reconstruction was performed using FBAT, where the most likely haplotypes were assigned for each individual infant. FBAT compares the distribution of test statistics using the conditional offspring genotype distribution under the null hypothesis [25]. The offspring genotype probabilities were computed given the parental mating type.

Results
I. Family-based allele association analysis:
a) SP SNPs marker alleles and SP-B-linked microsatellites: We examined transmission of individual SNPs for all five surfactant protein genes and SP-B-linked microsatellite markers (Table 2). Tables 4A and 4B summarize the results of TDT and bi-allelic FBAT analysis of SP-B and SP-B-linked markers, respectively. In both TDT and FBAT analysis, allele 6 of the AAGG marker was found to be transmitted more frequently to the affected offspring in BPD 28D/36W (p = 0.011, Table 4). The SP-B marker locus B-18(A/C) was found to be associated with the BPD 36W group by both TDT and FBAT (p = 0.018), where the A allele was associated with protection and the C allele with susceptibility to disease. No significant associations were found for individual allelic variants of SP-A, SP-C, and SP-D by either TDT or FBAT.

b) SP marker alleles stratified for pre-natal/neonatal treatments: Stratified bi-allelic FBAT analysis was performed to investigate the possible effect of pre-natal/neo-natal treatments on the outcome of BPD babies. Analyses were stratified according to three co-variates: 1) lack or presence of pre-natal steroids, 2) lack or presence of post-natal steroids, and 3) lack or presence of surfactant therapy. Of significance is that in the combined BPD group and BPD 28D, allele 4 of the D2S388 marker locus was associated with protection from disease in babies that did not receive prenatal steroid treatment (p = 0.01). When the effect of post-natal steroids was investigated, allele 7 of the AAGG marker was associated with a protective effect (in BPD 28D, p = 0.014), while allele AAGG 6 (in the combined BPD group, p = 0.019) and B-18 C (in BPD 36W, p = 0.018) were associated with susceptibility to disease in babies treated with post-natal steroids. Allele AAGG 6 was also found to be related to susceptibility in the combined BPD group that received surfactant therapy (p = 0.008). These results are in concordance with those of the nonstratified analysis, as would be expected, since the majority of infants received post-natal steroid treatment. On the other hand, in the smaller subgroup of BPD 28D (babies that did not receive post-natal surfactant therapy), allele 4 of the D2S388 marker was associated with a protective effect from disease (p = 0.008). The functional significance of these associations remains to be determined, particularly since the number of families under study is small for all groups except the BPD at 28 days post-natal steroid treatment group.

II. Family-based haplotype association analysis:
a) SP-A and SP-D SNP haplotypes: Figure 3 depicts the findings of association between haplotypes of SP-A2 and SP-D genes and disease groups by TDT. In the BPD 28D group the DA160(G)SP-A2(1A 2) (p = 0.005) and the DA11(C)DA160(G)SP-A2(1A 2) (p = 0.014) haplotypes of SP-D/SP-A2 were not transmitted to the affected child. Therefore these haplotypes may act as protective factors in the BPD 28D subgroup. No significant associations were found for SP-A and SP-D haplotypes in the BPD 36W subgroup. Neither ETDT nor bi- and multi-allelic FBAT analysis showed any significant association between individual SP-A alleles and BPD disease subgroups.

b) Haplotype analysis of SP-B SNP marker alleles and SP-B-linked microsatellite markers: Haplotype analyses were performed by both TDT (generated by GENEHUNTER) and FBAT for two, three, and four markers at a time, and by FBAT for five, six, and seven markers at a time. The results of TDT and FBAT analyses are listed in Tables 5A and 5B, respectively. Of note is that in most cases individual alleles found to be associated with either protection from or susceptibility to disease in Table 4 were also found to form haplotypes that were associated with the same overall effect. SP-B-linked microsatellite markers were found in several disease associated haplotypes. The frequency of transmission of haplotype AAGG 6/B-18 C to babies in the BPD 28D/36W group was increased by TDT (p = 0.008), as was that of the AAGG 6/B-18 C/B1013 A haplotype (p = 0.014). Moreover, the frequency of transmission of the AAGG 7-containing haplotypes to babies in all three subgroups was decreased by TDT (p = 0.02), and the D2S2232 1/AAGG 7 haplotype was found less frequently in the BPD 28D/36W group (p = 0.014). Figure 4 provides a schematic presentation of haplotypes among different disease subgroups by the two different tests, TDT and FBAT.

Presence of the AAGG 6 allele in haplotypes was consistently associated with susceptibility to disease, while the presence of the AAGG 7 allele was associated with a protective effect from disease. Although the AAGG 3 allele was found in three susceptibility haplotype combinations, this allele may not be as informative as the AAGG 6 and AAGG 7 alleles, because it was also found in protection associated haplotypes (not shown: p 0.05). The AAGG 3-containing haplotypes are D2S2232 2/AAGG 3, found to associate with the BPD 28D group by TDT; D2S388 5/D2S2232 2/AAGG 3, found to associate with both BPD 28D and BPD 28D/36W by FBAT (in both p = 0.01); and D2S388 5/D2S2232 2/AAGG 3/B-18 A, found to associate with the BPD 28D/36W group by FBAT (p = 0.01).
The SP-B SNP marker haplotype B-18 C/B1013 A was associated with susceptibility by both TDT and FBAT analysis in BPD 36W (p = 0.002 and p = 0.014, respectively), whereas the B-18 C/B1013 A/B1580 T and B-18 C/B1013 A/B1580 T/B9306 A haplotypes were associated with susceptibility in the same group only by TDT.
Overall, both individual allele and haplotype analysis of SP-B markers indicate an association of the microsatellite marker AAGG 6 allele with susceptibility to disease in infants affected with BPD 28D/36W. SNP marker B-18(A/C) alleles were associated with the more severely affected babies, where the C allele associated with susceptibility and the A allele with protection from BPD 36W.

Discussion
BPD is a multifactorial disease of preterm infants where multiple risk factors contribute to permanent pathophysiologic changes. Numerous genes required for neonatal lung adaptation are likely to be involved in the etiology of this disease. Since surfactant proteins are essential for normal lung function and play a role in both gas exchange and innate immune mechanisms of the lung, they are likely to play a role in the pathogenesis of BPD. Susceptibility to RDS, the common precursor of BPD, has been shown to be linked to surfactant proteins, and SP-A in particular [16-18,20,21,48]. Recently, in a case-control study, an SP-B intron 4 deletion variant was identified as a risk factor in BPD [50]. In order to avoid spurious associations secondary to population stratification, characteristic of case-control studies, we performed family based association analyses. Such analyses are not confounded by population characteristics and provide a more powerful analysis even in small sample sizes [54]. The data from TDT and FBAT, two powerful family based association tests, revealed: a) association of SP-B with susceptibility to BPD 36W; b) association of the microsatellite marker AAGG 6 with susceptibility in the BPD 28D/36W group; c) ten susceptibility haplotypes and one protective haplotype for SP-B and SP-B-linked microsatellite markers across the three subgroups (BPD 28D, BPD 36W, and BPD 28D/36W); and d) two protective SP-A/SP-D haplotypes for BPD 28D, detected by TDT analysis.
Of the 71 infants in the study, 52 had the less severe BPD 28D, while 19 infants were affected with the more severe BPD 36W. Although the sample size in BPD 36W may be small, the use of the family based association approach may mitigate the margin of error due to small sample size [54]. However, it should be noted that the majority of observations in the BPD 28D/36W group overlap with those for the BPD 28D sub-group, perhaps due to the predominance of the sample size of this group.
The SP-A2/SP-D haplotypes found to associate with protection from BPD 28D have not been previously studied. The lack of association with SP-A in the Finnish BPD study is most likely due to the fact that their study group consisted of more severely affected babies (86 infants had BPD at 36 weeks, and 21 had BPD at 28 days). The inflammatory components that appear to be at work in the early stages of the disease could be modulated by SP-A and SP-D [28], which may be consistent with their immune related functions [46]. Although it is unknown how the SP-A2(1A 2 )/SP-D haplotypes may protect from BPD, we speculate that this protection is likely due to adequate levels of SP-A and/or SP-D at a critical developmental stage of the lung. In-vitro studies have shown genotype-dependent levels of SP-A mRNA [32] as well as differences among SP-A2 variants in 3'UTR mediated expression [59]. It is unlikely that the association reflects functional differences, because the SP-A2, 1A 2 variant is identical in the amino acid sequence of the mature protein with several other SP-A2 variants [11]. With regards to SP-D, the DA11(C/T) SNP was recently identified to affect oligomerization, function, and serum concentrations of the protein [35]. Individuals homozygous for the DA11(C/C) variant were found to have significantly lower SP-D serum levels than the DA11(T/T) homozygous individuals. Polymorphisms of SP-A were not investigated in this study [35].
SP-B SNP variant alleles and/or haplotypes, on the other hand, were associated only with BPD_36W. Specifically, the C variant of the B-18(A/C) polymorphism within the 5'UTR of SP-B was found, either alone or in various haplotypes, to associate with susceptibility to BPD_36W by both TDT and bi- and multi-allelic FBAT. The association of the SP-B intron 4 polymorphism with BPD in a Finnish study group [50], along with the present findings, supports a role of SP-B in BPD pathogenesis. Recently, a dual role for SP-B, as a protein essential for surfactant function and as an anti-inflammatory mediator, has been proposed that could explain involvement of this protein in BPD [13] at multiple levels. Whether the B-18(A/C) SNP and the intron 4 variants [50] play regulatory roles in SP-B gene expression can only be speculated. Inadequate levels of SP-B may contribute to the progression of BPD, especially in the later stages of the disease. Infants that require continued ventilatory support, as in BPD, experience transient episodes of surfactant dysfunction that are associated with deficiency in SP-B [43].
Figure legend (fragment): [...] (Table 5A), and below the schematic presentation susceptibility haplotypes are noted (Table 5B). The nucleotide for each SNP is also noted by a number: A = 1, C = 2, G = 3, and T = 4. For example, in BPD_36W the susceptibility haplotype B-18_C/B1013_A/B1580_T/B9306_A is also noted as haplotype 2/1/4/1. The alleles for each microsatellite marker locus have been defined previously [31] and are noted by numbers 1, 2, 3, etc. The haplotypes noted here are described in detail in Tables 4 and 5. Only significant findings where the p value approaches 0.01 are presented. The plus (+) sign in the left hand column denotes a haplotype that was found by either TDT or FBAT, while the minus (−) sign denotes that a particular haplotype was not found to have a significant p value by either TDT or FBAT.
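The numeric nucleotide coding given in the figure legend (A = 1, C = 2, G = 3, T = 4) can be illustrated with a minimal Python sketch. The function name `encode_haplotype` and the list-of-letters input are assumptions made here for illustration; they are not part of the study's analysis pipeline.

```python
# Numeric nucleotide coding stated in the figure legend: A = 1, C = 2, G = 3, T = 4.
NUCLEOTIDE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode_haplotype(alleles):
    """Return the slash-separated numeric code for an ordered list of SNP alleles.

    `alleles` is a hypothetical input format: one nucleotide letter per SNP locus,
    in the order the loci appear in the haplotype.
    """
    return "/".join(str(NUCLEOTIDE_CODE[a.upper()]) for a in alleles)


# Worked example from the legend: B-18_C / B1013_A / B1580_T / B9306_A
print(encode_haplotype(["C", "A", "T", "A"]))  # 2/1/4/1
```

Microsatellite alleles are not covered by this mapping, since they are already numbered 1, 2, 3, etc. as defined in [31].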
Multiallelic SP-B-linked microsatellite markers were also used to increase resolution in defining the genetic background of BPD subgroups. Of the three markers studied, AAGG, and specifically allele AAGG_6, was found to associate with BPD_28D/36W. This microsatellite marker is located 26 kb upstream from SP-B, and allele 6 consists of 24 AAGG repeats [31]. No difference in transcription factor binding among the alleles of AAGG was found by current bioinformatics techniques (JASPAR transcription factor database). It is currently unknown whether this marker imparts a direct functional effect on SP-B in relation to BPD, or whether it is linked to another gene and/or to a functional genetic variant(s) located in, or close to, the SP-B gene.
To identify genes between the microsatellite marker AAGG locus and the SP-B SNPs, an Ensembl gene search (www.ensembl.org) was conducted. The only known gene located within this region is granulysin, a T cell activation gene. Although granulysin has not been implicated in the pathogenesis of BPD, this gene plays a role in immunity against intracellular pathogens, and may therefore be involved in the modulation of inflammatory responses in BPD [45]. However, the disease locus does not have to be limited to this region, and may in fact be much further upstream or downstream. Although association data may identify and implicate genes in the clinical course of a disease, functional analysis of variants of interest should be conducted to better understand their role in BPD pathogenesis.
Multilocus haplotype association analyses are advantageous because they provide greater power to detect association than single nucleotide methods [66]. The results from these analyses confirmed and extended the trends seen by individual SNP analysis. Combinations of B-18_C, B1013_A and B1580_T were found in susceptibility haplotypes by both TDT and FBAT. Moreover, significant three-marker linkage disequilibrium for B-18(A/C), B1013(A/C), and B1580(C/T) has been reported in several populations [37]. In the Greek population the frequency of the B-18_C/B1013_A/B1580_T and B-18_A/B1013_C/B1580_C haplotypes of these three marker loci is the same (0.274) [37]. Therefore, association of the CAT haplotype of these marker loci with susceptibility to BPD in this study is not an artifact of the overall population characteristics, but may rather be disease related.
Although FBAT and TDT analyses identified similar alleles and/or haplotypes associated with BPD, some findings differed between the two analyses. For example, FBAT revealed haplotypes with additional upstream loci (D2S388_5/D2S2232_2/AAGG_3 and D2S388_5/D2S2232_2/AAGG_3/B-18_A) for BPD_28D and BPD_28D/36W. The basis for the observed differences is currently unknown, but it may be due to differences in approach and/or the underlying assumptions of the two tests. Haplotype TDT uses the maximum likelihood approach based on the Lander-Green hidden Markov model (HMM) [33], while FBAT uses a weighted conditional approach [25]. Therefore, differences in results are more likely to occur for low frequency haplotypes. Moreover, haplotype FBAT assumes no recombination among the marker loci [25], and haplotype TDT assumes no linkage disequilibrium [33]. Recent studies have used these two tests in complement for either single nucleotide polymorphism analysis [5,40,64] or haplotype analysis [8]. However, the partial discrepancies between the results of these two family based association tests, in those studies as well as in the present one, point to the need for future studies that compare different analytical methods in parallel. Such efforts could identify differences in approach and assumptions, clarify the implications of such differences, and clearly define the advantages and disadvantages of each approach.
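For orientation, the classic single-allele TDT statistic can be sketched in a few lines of Python. This is the basic McNemar-type form, chi-square = (b - c)^2 / (b + c) with one degree of freedom, where b and c are the numbers of heterozygous parents who did and did not transmit the allele of interest. It is not the likelihood-based haplotype TDT or the weighted conditional FBAT used in this study, and the function name `tdt` and the counts shown are hypothetical, chosen only for illustration.

```python
import math


def tdt(transmitted, untransmitted):
    """Classic biallelic TDT: return (chi-square statistic, p value) with 1 df.

    transmitted   -- heterozygous parents who transmitted the allele of interest
    untransmitted -- heterozygous parents who did not transmit it
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Upper tail of a chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, not taken from the study data.
chi2, p_value = tdt(transmitted=15, untransmitted=4)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

Because only heterozygous parents contribute, the number of informative transmissions falls quickly for rare alleles and haplotypes, which is one reason results from different family based tests may diverge at low frequencies.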
Due to the presumed overlap between BPD and RDS, it was hypothesized that similar genetic association findings would be seen in these two diseases [17,22]. It was, therefore, surprising that the B1580_T variant was found in the susceptibility haplotypes in the present study. This allele codes for the non-glycosylated Ile (ATT) variant [58]. This variant has been associated with decreased risk of RDS in blacks in the presence of the SP-A1 allele 6A^3 [17], whereas the glycosylated Thr (ACT) variant has been shown to associate with susceptibility to RDS, especially in first born and male infants, in a study of prematurely born twins [17,41]. However, the present finding is consistent with previous observations that factors that may protect from RDS can increase the risk for BPD. For example, prenatal inflammation (chorioamnionitis) and increased post-natal lung inflammation associate with a decreased incidence of RDS among preterm babies, but with an increased risk for chronic lung injury following preterm delivery [60].
In summary, the factors that predispose infants to develop BPD following RDS are likely complex and presumably involve several loci with small effects, which makes traditional genetic studies difficult. In the present study, using two different family based association tests, we show that the SP-A/SP-D locus associates with BPD_28D, and the SP-B and SP-B-linked microsatellite marker loci with BPD_36W and BPD_28D/36W. However, larger studies with various BPD subgroups are needed to extend these results and enable localization and characterization of genetic factors.