Sequences of Cμ and the VH1 Family in LG7, a Clonable Strain of Xenopus, Homozygous for the Immunoglobulin Loci

Twenty-eight heavy-chain variable (VH1) region genes and the immunoglobulin (IgM) heavy-chain constant region of an isogenic Xenopus hybrid, X. laevis/X. gilli, LG7, have been sequenced. The LG7 clone represents the first Xenopus hybrid that is homozygous for the IgH locus. The VH1 family was specifically investigated because VH1 genes are used by the antibodies produced during the Xenopus antidinitrophenol (DNP) response. These VH1 germ-line sequences establish a so-called "dictionary" that is available for studying somatic hypermutational mechanisms in immunized frogs.


diversity (Wabl and Du Pasquier, 1976; Du Pasquier and Wabl, 1978). As useful as they are, these clonable animals are still heterozygous and therefore genetically complex. It would considerably simplify immunogenetic studies if some LG hybrid clones could be made homozygous at the major histocompatibility complex (MHC) or immunoglobulin (Ig) loci. We report here such a clone, LG7, that is homozygous at the Ig heavy-chain (H) locus. In addition, we report most of its VH1 germ-line sequences as well as the sequence of its IgM (μ) heavy-chain gene. This VH family is involved in the antidinitrophenol (DNP) response, one of the best studied Xenopus antibody responses (Brandt et al., 1980; Schwager et al., 1988). Thus, these germ-line sequences establish a "dictionary" necessary for future studies on somatic generation of the antibody repertoire in this species.


RESULTS

LG7 Is Homozygous for the Ig Loci

LG7 frogs first attracted our interest because they yielded fewer restriction fragments than other LG hybrids (Schwager et al., 1988). Whether this was due to homozygosity or to the loss of an IgH chromosome was not known. To determine the cause of LG7's unique restriction fragment length polymorphism (RFLP) pattern, we used a variety of techniques. The previous RFLP analyses were confirmed by testing different digests of LG genomic DNAs with probes from all 11 VH families, light-chain probes for Vρ and Cρ (kindly provided by our colleague J. Schwager; Schwager et al., 1991), and Cμ probes. The individual patterns could be grouped into four RFLP classes, identical for each probe (LG3,15; LG17,5,46; LG14,6; and LG7). LG7 consistently gave half as many restriction fragments as the other LG hybrids; Fig. 1 shows examples of Southern blots hybridized with various VH probes. Moreover, every band in LG7 that hybridized to a VH probe could also be found in LG15; thus, LG15 seems to contain an LG7-like haplotype.

To determine whether there are differences in expressed Ig haplotypes, we analyzed the serum from LG7, LG15, and outbred frogs in enzyme-linked immunosorbent assays (ELISA) with different anti-Xenopus Ig monoclonal antibodies (Mab). LG7 was always negative with Mab 14G1, a monoclonal that specifically detects laevis μ Ig (Du Pasquier et al., 1985), indicating that only gilli μ-type Ig is produced. All three frogs routinely tested positive with Mabs for Xenopus IgY (Mab 11D5), Xenopus light chain (Mab 409B8), and Ig μ (Mab 10A9; Table 1). Additional evidence for LG7 possessing only the gilli Ig haplotype was found by screening LG7 genomic libraries with a Cμ probe. Only gilli μ constant regions could be sequenced from two different LG7 genomic libraries. Even though laevis and gilli Cμ cross-hybridize and share long stretches of sequence identity, the introns are of different sizes, with characteristic repeats, and thus are easy to identify (Du Pasquier, unpublished).

FIGURE 1. Genomic Southern blot analyses of erythrocyte DNA from LG frogs. DNA was digested to completion with either HindIII or EcoRI, electrophoresed on 0.7% agarose gels, and Southern blotted. The filters were hybridized with probes for (A) VH I, VH III, (B) VH V, VH VII, and washed under stringent conditions. Numbers correspond to LG nomenclature (Kobel and Du Pasquier, 1977). (C) Southern blot analysis of LG7 small-egg progeny. Genomic DNA from seven individual LG7 small-egg tadpoles (numbered 1-7) and an adult LG7 (labeled A) were digested with EcoRI and hybridized with a VH probe. Markers (in kb) are indicated in the margins.

Although the foregoing data did not exclude loss of a chromosome or deletion of one of the Ig loci, additional experiments do so. Chromosome spreads of phytohemagglutinin-stimulated LG7 spleen cells contained 36 chromosomes, the number expected of LG hybrids. Moreover, when Southern blots were titrated with Cμ and Cρ probes, the signal intensity of LG7 genomic DNA was equal to that of LG15 and LG14; that is, LG7 contains the same number of copies of Cμ and Cρ as LG15 and LG14 (data not shown).

LG hybrids produce two types of eggs: small eggs that can be either normal haploids or aneuploids, and large, genetically identical diploid eggs that are used to propagate the species. If LG7 were heterozygous, segregation of IgH (or any other gene family) polymorphism should be detected in progeny. However, the segregation cannot be followed in the large (2n) eggs because they are always identical. Instead, the small eggs (1n) that contain a segregated set of chromosomes must be analyzed. We compared, in Southern blots, digests of genomic DNA from haploid LG7 embryos with adult LG7 DNA. The RFLP pattern of every haploid embryo tested was identical with the adult parent LG7 pattern when tested with different VH probes and with a Cμ probe (Fig. 1C). Thus, LG7 is homozygous at the IgH locus. This, and the fact that the LG7 pattern of Ig H-chain genes is identical to half of the pattern of LG15 Ig genes, suggests that LG7 is homozygous for at least the gilli Ig H-chain gene set present in LG15.
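The segregation argument can be illustrated with a minimal sketch that models each IgH haplotype as a set of restriction-fragment sizes. The band sizes and the function names below are hypothetical and chosen only for illustration; they are not taken from the blots in Fig. 1. The point is that a haploid small egg carries a single haplotype, so its pattern can match the diploid adult pattern only if both adult haplotypes yield the same bands.

```python
def southern_pattern(haplotypes):
    """Return the set of band sizes seen when the given haplotypes are blotted together."""
    bands = set()
    for hap in haplotypes:
        bands |= set(hap)
    return bands


def haploid_matches_adult(hap_a, hap_b):
    """For a diploid adult carrying hap_a and hap_b, report whether each possible
    haploid small egg (carrying only one haplotype) shows the adult band pattern."""
    adult = southern_pattern([hap_a, hap_b])
    return [southern_pattern([hap]) == adult for hap in (hap_a, hap_b)]


# Hypothetical band sizes in kb, for illustration only.
gilli_like = frozenset({2.1, 4.4, 6.6})
laevis_like = frozenset({3.0, 5.2, 9.4})

homozygous = haploid_matches_adult(gilli_like, gilli_like)
heterozygous = haploid_matches_adult(gilli_like, laevis_like)
print(homozygous, heterozygous)
```

Under this simplified model, only the homozygous adult yields haploid progeny whose patterns all equal the adult pattern, which is the outcome reported for the LG7 small-egg embryos.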

LG7 Is Derived from the Gynogenetic Development of an LG15 Small Egg

Small eggs rarely develop beyond hatching unless the second polar body is not extruded, so that diploidy is restored and a viable embryo is created. This is a rare occurrence. When small-egg embryos do survive, their genetic loci will still be heterozygous if there is a crossover between a locus and the centromere. Thus, heterozygosity is a function of gene position; heterozygosity increases with increasing distance from the centromere (Nace et al., 1970; Kobel and Du Pasquier, 1977).

We believe that the LG7 hybrid developed from an accidentally selected viable diploid LG15 small egg (see Fig. 2). Because all LG hybrids are tested by skin graft as a routine control, this animal was easily isolated. Later, it was found to produce both big and small eggs, thus making it clonable.

FIGURE 2. (Diagram labels: crossing-over; LG15-like animal, Ig heterozygous.) Upper pathway generates an LG15-like animal; the lower pathway generates the homozygous LG7 animal. The asterisks (*) represent the point where the prevention of the second polar body extrusion occurs and a viable diploid egg is produced.

Genomic VH1 Gene Sequences

Three genomic libraries representing altogether 1.7 × 10^6 recombinant phage (three to four genome equivalents) were made from LG7 red cell genomic DNA and screened unamplified with a Xenopus family VH1 cDNA probe at high stringency. A total of twenty-eight different VH1 elements were isolated and sequenced, and the alignments of their encoded amino acids are shown in Fig. 3. All exhibit a typical VH gene structure (Kabat et al., 1991) and all contain specific features characteristic of Xenopus VH1 genes. The complementarity-determining regions (CDRs) show limited variability, and many different genes contain identical CDR1s and very similar if not identical CDR2s. The CDR2s exhibit the most sequence divergence, which is restricted to their amino-terminal (NH2-) portions (Schwager et al., 1988, 1989). Six sets of the genomic VH1s contain identical CDR1s and three sets contain identical CDR2s (Table 2). The CDR1s are fifteen nucleotides in length and can differ by as much as 60%. Only the nucleotides at positions 100-103, encoding MET 34, never vary. Genomic g44A was the only gene that contained a CDR1 identical to one published previously from a Xenopus laevis genome (pLL1.4; Schwager et al., 1989). The largest sequence difference found in the VH1 CDR1s is a change of nine nucleotides (g46C vs. g345A).

LG7 VH1 CDR2s vary in length from 48-54 nucleotides (16-18 amino acids) and can differ by as much as 43% (g2A vs. g345A).
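The CDR figures quoted above are related by simple arithmetic: nine differing nucleotides in a fifteen-nucleotide CDR1 correspond to 60%, and CDR2 lengths of 48 and 54 nucleotides correspond to 16 and 18 codons. The sketch below shows one way such values could be computed from aligned sequences. The two sequences are placeholders built only to differ at nine of fifteen positions; they are not the g46C or g345A CDR1s, and the function names are assumptions made for this illustration.

```python
def percent_difference(seq_a, seq_b):
    """Percent of positions that differ between two equal-length, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


def codon_count(n_nucleotides):
    """Number of codons encoded by a nucleotide stretch whose length is a multiple of 3."""
    if n_nucleotides % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return n_nucleotides // 3


# Placeholder 15-nt sequences differing at exactly nine positions (not real CDR1s).
placeholder_a = "A" * 15
placeholder_b = "C" * 9 + "A" * 6

cdr1_difference = percent_difference(placeholder_a, placeholder_b)
cdr2_codons = (codon_count(48), codon_count(54))
print(cdr1_difference, cdr2_codons)
```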

FIGURE 3. Alignment of encoded amino acids of VH1 genomic genes (in one-letter code) beginning with residue 1. The genes in bold type were found expressed in a cDNA library made from immunized LG7 frogs. Sequence identities are indicated by (') and gaps (-) are introduced to maximize homology. The asterisks (*) indicate stop codons. The CDR boundaries are marked and are according to Kabat et al. (1991). Genes g345A and g342A are pseudogenes and gene g5 is truncated because it was only partially sequenced. The master sequence was chosen for length. GenBank accession numbers are at the right margin.

Overall nucleotide sequence identities range from 82-96%, and as expected, the framework (FR) regions are the most highly conserved. The VH1 nucleotide sequences beginning with the initiation codon (ATG) of the split leader and continuing to the conserved nonamer of the recombination signal sequence (RSS) are aligned in Fig. 4. The conserved octamer ATGCAAAT and sequences analogous to a TATA box that are found 5' of all VH genes (Parslow et al., 1984) and act as the Ig H-chain promoter (Wirth et al., 1987) were identified in twenty-five of the VH1 genes. Three of the genes, g349A, g349C, and g343, were cloned using a restriction site just 3' of their leader sequences and so their promoters were not identified. Genes g349A and g343 are both viable VH genes because they are found expressed in an LG7 cDNA library (Wilson et al., 1992). Very little sequencing was performed 5' of the leader sequences and consequently no second possible promoter regions, as have been previously identified in VH1 genes (Schwager et al., 1989), were found.

The LG7 VH1 leader regions range from sixteen to nineteen amino acids in length and several genes (e.g., g7B, g7C, g2A, g35) contain almost identical leader sequences. Gene g44A has the longest leader intervening intron, of 108 nucleotides (Fig. 4A). The VH1 genes varied in overall length from 98-101 amino acids. The heptamers of the RSS are identical in all but one of the VH1 genes. Two nucleotide substitutions in g22 change its heptamer to CACACTA. The nonamer sequences are much more variable and only six genes contain the consensus G/ACAAAAACA previously identified as the VH1 nonamer (Schwager et al., 1989). The RSS 23-bp spacers vary widely. Six genes contain identical 23-bp spacers, whereas others differ by more than 50% (g345A vs. g352).
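As a small illustration of the heptamer comparison, the sketch below lists the positions at which the g22 heptamer differs from a reference. The text does not quote the shared heptamer, so the reference used here, CACAGTG, is an assumption: it is the canonical RSS heptamer commonly cited for vertebrate V genes, and it may not be the exact sequence shared by the other LG7 VH1 genes. The function name is likewise hypothetical.

```python
# Assumption: canonical vertebrate RSS heptamer, not quoted in the text.
REFERENCE_HEPTAMER = "CACAGTG"
# Heptamer reported for gene g22 in the text.
G22_HEPTAMER = "CACACTA"


def substitutions(reference, query):
    """Return (1-based position, reference base, query base) for each mismatch."""
    if len(reference) != len(query):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, r, q)
        for i, (r, q) in enumerate(zip(reference, query))
        if r != q
    ]


g22_changes = substitutions(REFERENCE_HEPTAMER, G22_HEPTAMER)
print(g22_changes)
```

Against this assumed reference, the comparison yields two mismatches, at positions 5 and 7, which agrees with the two substitutions described for g22.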

Many of the genomic VH1s were isolated more than once, both from the same library and from the different libraries. Eight of the recombinant phage contained more than one VH1 gene (e.g., g349A, B, C) and at least four recombinant phage (from seven chosen at random) tested positive for members of additional VH families by hybridization with family-specific oligonucleotide probes (Haire et al., 1990). Three of the recombinant phage contained VH II hybridizing fragments and one different recombinant phage contained a VH VIII hybridizing fragment (data not shown). These results in part agree with recent reports that show evidence for interspersion of Xenopus VH families (Schwager et al., 1989; Haire et al., 1991).

Two of the twenty-eight genomic genes are clearly pseudogenes. Gene g345A contains two stop codons in its FR3 region, beginning at nucleotide positions 238 and 247, and gene g342A contains a stop codon in its leader sequence. Genes g342A and g342B are located on the same recombinant phage and differ by four bases. Both genes contain an extra different base in their leader intron sequences and, more important, g342A contains two extra bases in its leader, one of which is responsible for generating the stop codon (TAA) directly after the initiation codon ATG. The close sequence similarities and their close association in the DNA suggest that these two genes are the result of a recent gene-duplication event. Genes g7B and g7C probably represent another example of recent gene duplication; they are found on the same recombinant phage and differ by thirteen bases in their coding regions. Genes g15 and g7C differ by only one base in their CDR2 regions, and are identical everywhere else from 100 bp 5' of their octamer to 80 bp 3' of their RSS sequences (data not shown). Recombinant phage 15 and 7 are different by RFLP analysis, which suggests that these two sequences represent two genes. However, this evidence, even considered with the low (practically nonexistent) frequency of sequencing error (see Materials and Methods) of our reactions, is inconclusive. Both genes g7B and g7C were repeatedly isolated from all libraries, whereas g15 was found only once. Also, the EcoRI fragment containing g15 is approximately equal in size to the EcoRI fragment containing g7C, and so the difference in the two may well be the result of a sequencing artifact.
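Pseudogenes of this kind are usually recognized by scanning the reading frame for premature stop codons. The following is a minimal sketch of such a scan, not the procedure used in this study. The toy sequence is hypothetical and simply places TAA directly after ATG, as described for g342A. The frame check assumes that nucleotide numbering starts at the first base of residue 1, in which case positions 238 and 247 both fall on codon boundaries (codons 80 and 83).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(seq, start=0):
    """Return (1-based nucleotide position, codon) for every in-frame stop codon,
    reading triplets from the 0-based index `start`."""
    seq = seq.upper()
    hits = []
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            hits.append((i + 1, codon))
    return hits


def codon_number(position):
    """Codon index for a 1-based nucleotide position that begins a codon,
    assuming position 1 is the first base of residue 1."""
    if (position - 1) % 3 != 0:
        raise ValueError("position does not begin a codon in this frame")
    return (position - 1) // 3 + 1


toy_leader = "ATGTAAGGCCTG"  # hypothetical sequence, not the real g342A leader
toy_hits = in_frame_stops(toy_leader)
g345a_codons = [codon_number(238), codon_number(247)]
print(toy_hits, g345a_codons)
```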


IgM Heavy-Chain Region

The four exons (Cμ1 through Cμ4) encoding the gilli μ chain of LG7 are shown in Fig. 5. The coding region encompasses 4163 bp from the first amino acid of Cμ1 to the polyadenylation signal. The four exons and their splice sites were identified by comparison with the published laevis heavy-chain cDNA sequence (Schwager et al., 1988) and with LG7 μ cDNA sequences (Wilson et al., 1992). The splice sites obey the traditional rule encountered so far for all Ig genes: the splice junction occurs between the first and second base of the joining amino acid (Brüggerman, 1987). In two places, the exon boundaries of the gilli gene differ from the boundaries predicted from the laevis μ cDNA for the Cμ2 exon. The LG7 Cμ2 exon is two amino acids shorter, ending with a cysteine, making the Cμ3 exon two amino acids longer at its amino-terminal end.

FIGURE 4. Alignment of genomic VH1 genes. (A) Leader sequences beginning at the initiation codon ATG and continuing to the beginning of the VH region (residue 1). Sequence identities are indicated by (') and gaps (-) are introduced only to conserve length. The stop codon TAA in g342A is underlined. (B) VH1 regions begin at residue 1 and continue through the RSS nonamer. The number of nucleotides from the beginning of the first base of residue 1 is shown above the scale. Gaps are introduced to maximize homology. FR and CDR boundaries are according to Kabat et al. (1991). The stops in g345A are underlined.

The deduced amino-acid sequence of the gilli gene differs at thirty-six positions from the published laevis μ cDNA sequence (Schwager et al., 1988). The percent homology of the 452 residues is 79.6% (see Fig. 5); at the DNA level, the homology is 96% (1308/1359 bp identical). This polymorphism, relatively higher at the amino-acid level, is not likely to represent the true allelic polymorphism of Xenopus Ig allotypes because X. laevis and X. gilli are different species even though they can produce fertile offspring. In addition, these substantial differences could explain why species-specific anti-heavy-chain Mabs (like 14G1, see the foregoing) can be produced.
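The DNA-level figure follows directly from the counts given in the text (1308 identical positions out of 1359 aligned). A minimal sketch of the calculation is shown below; the function name is hypothetical and the check covers only the nucleotide comparison.

```python
def percent_identity(identical, aligned):
    """Percent identity from a count of identical positions and the aligned length."""
    if aligned <= 0 or not 0 <= identical <= aligned:
        raise ValueError("counts must satisfy 0 <= identical <= aligned and aligned > 0")
    return 100.0 * identical / aligned


# Counts quoted in the text for the gilli vs. laevis coding-sequence comparison.
dna_identity = percent_identity(1308, 1359)
print(round(dna_identity, 1))
```

The result rounds to the 96% quoted for the DNA-level comparison.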

Only 650 bp beyond Cμ4 were sequenced and no transmembrane exons were found in that stretch. The segments encoding the Xenopus transmembrane region have been published previously (Du Pasquier and Schwager, 1989) and a complete map of the Xenopus IgH locus will be published elsewhere (Du Pasquier, in preparation). The gilli μ locus is also characterized by a long intron between Cμ1 and Cμ2 (1618 bp). This intron contains sequences similar to some enhancer motifs described for Ig genes (Staudt and Lenardo, 1991). An octamer and sequences analogous to μE4 and μE5 are identified in Fig. 5.

FIGURE 5. ... gilli μ are shown. These residues are shown above the LG7 gilli-encoded amino acids. Sequence motifs in the CH1 to CH2 intron that are similar to Ig-enhancer sequences are underlined. These include a SITE E, E-boxes (μE5, μE3, μE4), and a μB motif. A sequence identical to the KBF-A motif in the kappa-chain enhancer is underlined and overlined. The octamer sequence is double underlined. Ig-enhancer sequences are from Staudt and Lenardo (1991). Polyadenylation sites are in bold. GenBank accession number of gilli μ is M97008.


DISCUSSION

In this study, we examined a clone of isogenic Xenopus, the X. laevis/X. gilli hybrid LG7, which probably developed accidentally from an LG15 small egg (Kobel and Du Pasquier, 1986) with a recombined genome. Because this clone is homozygous at the heavy-chain locus, its genetic simplicity makes it the best available strain for comparing germ-line with cDNA Ig sequences. These comparisons are necessary to detect the presence of somatic mutations, an issue that has remained unanswered in the lower vertebrates for the past 18 years, ever since the major differences in antibody repertoires were first reported between amphibians and mammals.

In addition to establishing a dictionary of VH1 sequences from LG7, we have sequenced the μ-chain gene. This will be useful for future studies with conventional heterozygous LG hybrids in order to identify origins of μ cDNAs.

All of the twenty-eight VH1 genes sequenced exhibited the typical vertebrate VH-gene structure (Kabat et al., 1991) and sequence identities ranged from 82-96%, thus easily fulfilling the criteria of VH-gene family membership (Brodeur and Riblet, 1984). In addition, all of the VH1s contain specific features and conserved nucleotides previously described for Xenopus VH1 genes (Schwager et al., 1989). Many shared CDR1s and CDR2s are similar, even though by overall comparison they show the most sequence divergence. Most of this variability is in the first eight to ten amino acids of the CDR2s. The LG7 VH1 sequences further reinforce, by allowing for greater comparisons, the hypothesis that the low heterogeneity of the Xenopus VH pool (at least for this family) may be the result of recent expansion events (Schwager et al., 1989; Du Pasquier and Schwager, 1991).

We are aware that the screening process may have missed genes and that calculating VH family size by Southern blotting can be inaccurate because restriction fragments may contain more than one gene or may comigrate in the gel. The size of the LG7 VH1 family was originally estimated to contain at a maximum thirty genes, and Southern blot analyses with frequent six- or even four-base cutters are consistent with this estimate (data not shown). Because studies in outbred laevis and other LG hybrids place the VH1 family size at approximately sixty in heterozygous individuals (Schwager et al., 1989; Haire et al., 199 ), we believe that very few LG7 genes are missing. Indeed, when the twenty-eight germ-line genes sequenced in this study were used to analyze LG7 somatic mutants (Wilson et al., 1992), only four out of fifty-five cDNAs could not be unambiguously assigned to one of these germ-line VHs.

The LG7 gilli Cμ segment conforms to the organizational pattern seen in all other vertebrate Cμ loci. There are four exons that encode the four protein domains of the secreted μ heavy chain, and a detailed analysis of the inferred amino acid identities between the Xenopus μ chain and those of other vertebrates can be found in Schwager et al. (1988). One unusual feature is the presence of enhancer motifs in the Cμ1 to Cμ2 intron. Whether these sequences have any enhancer function remains to be investigated. Enhancer-like sequences are also found 5' of the putative switch region of the Xenopus laevis IgM gene (Du Pasquier, unpublished), a location analogous to the site of the mammalian IgH-chain enhancer. The Cμ1 to Cμ2 intron length, however, is not unusual. Introns of similar lengths are found between Cμ1 and Cμ2 of the channel catfish, Ictalurus punctatus, and between Cμ2 and Cμ3 of the horned shark, Heterodontus francisci (Wilson et al., 1990; Kokubu et al., 1988).

The LG7 hybrids are the first Xenopus clones to be homozygous at their IgH locus, and preliminary evidence indicates that they may well be homozygous for the ρ light-chain locus. Except for the MHC locus, which is heterozygous (Bernard et al., 1979), homozygosity at other LG7 genetic loci has yet to be investigated. In the future, it may well prove to be worthwhile to create other clonable homozygous LG hybrids by using pressure or cold-temperature shock to increase



