Applications of recombinant DNA technology in gastrointestinal medicine and hepatology: Basic paradigms of molecular cell biology. Part C: Protein synthesis and post-translational processing in eukaryotic cells

Can J Gastroenterol Vol 14 No 7 July/August 2000 603 Department of Medicine, Division of Gastroenterology, McGill University Health Centre, and McGill University Inflammatory Bowel Disease Research Program, Montreal, Quebec; Department of Medicine, Division of Gastroenterology, University of Alberta, Edmonton, Alberta Correspondence: Dr Gary E Wild, Montreal General Hospital, 1650 Cedar Avenue, Montreal, Quebec H3G 1A4. Telephone 514-934-8308, fax 514-934-8411, e-mail gwild@is.mgh.mcgill.ca Received for publication March 23, 1999. Accepted July 15, 1999 REVIEW

The processes of DNA replication and transcription occur inside the nucleus. By contrast, protein synthesis takes place in the cytoplasm (1-8). Protein synthesis is termed 'translation' and is directed by mRNA templates. The translation of mRNA is only the first step in the formation of a functional protein. Importantly, the polypeptide chain must subsequently fold into the appropriate three-dimensional configuration and undergo various processing steps before being converted into its active form. In eukaryotic cells, these processing steps are intimately related to the sorting and transport of different proteins to their appropriate destinations within the cell.
While the regulation of gene expression occurs primarily at the level of transcription, the expression of many genes can also be controlled at the level of translation. Most proteins can be regulated in response to extracellular signals, and intracellular protein levels can be controlled by differential rates of protein degradation. Thus, the regulation of both the amounts and activities of intracellular proteins ultimately determines all aspects of cellular behaviour.
Proteins are synthesized on mRNA templates by a process that is remarkably similar in both prokaryotes and eukaryotes. The mRNAs are translated in the 5′ to 3′ direction, and polypeptide chains are synthesized from the amino to the carboxy terminus. Each amino acid incorporated into the polypeptide chain is specified by a codon, a sequence of three bases (each of which is adenine [A], uracil [U], cytosine [C] or guanine [G]) in the mRNA, as determined by the genetic code. Translation occurs on ribosomes, with tRNA serving as the adapter between the amino acids being incorporated into the nascent protein strand and the mRNA template. Thus, protein synthesis involves interactions between three species of RNA molecules: mRNA templates, rRNAs and tRNAs.

Wild et al
SUMMARY: The translation of mRNA constitutes the first step in the synthesis of a functional protein. The polypeptide chain is subsequently folded into the appropriate three-dimensional configuration and undergoes several processing steps before being converted into its active form. These processing steps are closely related to the cellular events that occur in the endoplasmic reticulum and Golgi compartments, and determine the sorting and transport of different proteins to their appropriate destinations within the cell. While the regulation of gene expression occurs primarily at the level of transcription, the expression of many genes can also be regulated at the level of translation. Most proteins can be regulated in response to extracellular signals. In addition, intracellular protein levels can be controlled by differential rates of protein degradation. Consequently, the regulation of both the amounts and activities of intracellular proteins ultimately determines all aspects of cellular behaviour.
The 5′ end of the mRNA is the cap sequence, followed by the 5′ untranslated region (UTR), and then by the AUG codon, which signals the initiation of translation. Toward the 3′ end of the mRNA, there is a signal for the termination of translation (UAA, UAG or UGA) followed by the 3′ UTR. At the extreme 3′ end of the mRNA is the poly A tail. Protein synthesis starts at the AUG codon and proceeds in the 5′ to 3′ direction until a termination codon is reached, which heralds the end of protein synthesis.
The genetic code comprises 64 codons, each containing three bases (A, U, C or G). The permutations of the four bases in groups of three are shown in Table 1. Of the 64 codons, 61 specify the 20 amino acids and three serve as termination signals. The genetic code, with some minor exceptions, is universal, in that the same codons always code for the same amino acid. Minor variations occur in the mitochondria. More than one codon can code for the same amino acid (Table 1). This is known as 'redundancy of the genetic code'.
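The codon arithmetic above can be checked with a short Python sketch. It is an illustration only: the names `BASES`, `AMINO_ACIDS` and `CODON_TABLE` are chosen here and do not come from the article, and the table is the standard (nuclear) genetic code written in single-letter amino acid notation, with '*' marking a termination codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed with the first base varying slowest and the
# third base fastest, in U, C, A, G order. '*' marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 4 x 4 x 4 = 64 triplets to its amino acid (or stop signal).
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
# Redundancy: how many codons specify each amino acid.
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE), len(sense_codons), len(codons_per_amino_acid))  # 64 61 20
print(stop_codons)  # ['UAA', 'UAG', 'UGA']
print(codons_per_amino_acid["L"], codons_per_amino_acid["M"])  # 6 1
```

Leucine (six codons) and methionine (one codon) mark the two extremes of redundancy in this table.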
TRANSLATION OF mRNA
tRNAs serve as carriers and adapters for the alignment of each of the 20 amino acids with their corresponding codons on the mRNA template (9-14). tRNAs consist of 70 to 80 nucleotides, with a characteristic 'clover leaf' configuration that results from complementary base pairing between regions of the molecule. The tRNAs possess unique identifying sequences that allow the correct amino acid to be attached and aligned with the appropriate codon in the mRNA. All tRNAs have the sequence CCA at the 3′ end, where free amino acids covalently attach to the ribose of the terminal adenosine residue. The attachment of amino acids to specific tRNAs is mediated by 'aminoacyl tRNA synthetases'. Recognition of the mRNA template occurs through interaction with an 'anticodon loop', located at the other end of the tRNA, which binds to the appropriate codon through complementary base pairing. The three-base sequence on the anticodon loop is complementary to a specific codon found in the mRNA. For example, if the codon in the mRNA is GGC (written 5′ to 3′), it is recognized by the tRNA anticodon CCG (written 3′ to 5′, because the codon and anticodon pair in antiparallel orientation).
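The codon-anticodon relationship in the example above can be written as a minimal Python sketch. The function names are hypothetical, and strict Watson-Crick pairing is assumed at all three positions.

```python
# Watson-Crick pairing partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_3_to_5(codon):
    """Anticodon written 3' to 5', aligned base by base under a 5' to 3' codon."""
    return "".join(COMPLEMENT[base] for base in codon)


def anticodon_5_to_3(codon):
    """The same anticodon written in the conventional 5' to 3' direction."""
    return anticodon_3_to_5(codon)[::-1]


print(anticodon_3_to_5("GGC"))  # CCG
print(anticodon_5_to_3("GGC"))  # GCC
```

The two outputs describe the same three bases; only the direction of writing differs, which is why the orientation should be stated whenever an anticodon is quoted.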
While there are 61 codons specifying amino acids, there are fewer than 61 tRNA molecules. Thus, some of the tRNA molecules are able to recognize more than one codon; this phenomenon is called 'wobble'. Wobble effects are found with the third base of the codon.

THE STEPS IN PROTEIN SYNTHESIS
Particles consisting of RNA and protein, known as 'ribosomes', are located in the cytoplasm and serve as the site of protein synthesis. The principal components of the protein synthesis machinery include mRNA, tRNAs, amino acids and ribosomes.
Each ribosome comprises two subunits: the 40S (small) subunit and the 60S (large) subunit. The size of the entire particle is 80S. The 40S subunit is made up of the 18S rRNA and 30 different proteins. The 60S subunit is made up of the 5S, the 5.8S and the 28S rRNAs as well as 50 different protein species. Ribosomal proteins are imported to the nucleolus from the cytoplasm and begin to assemble on pre-rRNA before its cleavage. As the pre-rRNA is processed, additional ribosomal proteins and the 5S rRNA assemble to form preribosomal particles. The preribosomal particles are exported from the nucleus to the cytoplasm, yielding the 40S and 60S ribosomal subunits.
The ribosome physically moves down the mRNA in the 5′ to 3′ direction, with the sequential addition of amino acids from tRNAs to form the nascent polypeptide. Amino acids are attached to tRNA by a process called 'charging', which is mediated by 'aminoacyl tRNA synthetases'. There are 20 different aminoacyl tRNA synthetases, one for each of the 20 amino acids. When the protein is completed, it is released along with the ribosome and tRNA molecules, which are free to begin the cycle again.
Protein synthesis comprises three specific steps: initiation, elongation and termination. Each of these steps involves specific proteins, and the energy for this process is derived from either ATP or GTP. These steps are illustrated in Figure 1.

Initiation of translation: In eukaryotes, the initiation of protein synthesis involves approximately 10 different proteins (Figure 2). The initiation factors eIF-III and eIF-IA bind to the 40S ribosomal subunit. The initiation factor eIF-II binds to GTP to form a complex that binds a tRNA charged with the initiator methionine. The 5′ cap of the mRNA is recognized by eIF-4, which brings the mRNA to the ribosome. The eIF-II-methionine-tRNA-GTP complex subsequently interacts with the 40S subunit at the 5′ end of the mRNA. After binding to the 5′ end of the message, the 40S subunit with the eIF-II-methionine-tRNA-GTP complex moves down the mRNA. This process is known as 'scanning'. Scanning continues until the complex reaches the first AUG (ie, the initiator codon) on the mRNA. When the initiator codon is located, eIF-V triggers the hydrolysis of the GTP bound to eIF-II, followed by the release of eIF-II (complexed to GDP) and the other initiation factors, which are then able to reinitiate the cycle. The 60S ribosomal subunit then joins the 40S complex to form the 80S initiation complex. The formation of this final ribosomal structure signals the completion of the initiation step.
Peptide elongation: The various steps involved in the elongation phase of protein synthesis are illustrated in Figure 3. The ribosome has three sites for tRNA binding, designated the 'peptidyl', 'aminoacyl' and 'exit' sites. The initiator methionine-tRNA is bound at the peptidyl site. The first step in elongation is the binding of the next aminoacyl tRNA to the aminoacyl site by pairing with the second codon on the mRNA. The aminoacyl tRNA is escorted to the ribosome by an 'elongation factor' (eEF-Iα), which is complexed to GTP. The GTP is hydrolyzed to GDP after the correct aminoacyl tRNA is inserted into the aminoacyl site of the ribosome, and the elongation factor bound to GDP is released.
Once the eEF-Iα has left the ribosome, the peptide bond is formed between the initiator methionine-tRNA at the peptidyl site and the second aminoacyl tRNA at the aminoacyl site. This reaction is catalyzed by the large ribosomal subunit. The result is the transfer of methionine to the aminoacyl tRNA at the aminoacyl site of the ribosome, forming a peptidyl tRNA at this position and leaving the uncharged initiator tRNA at the peptidyl site. The next step in elongation is translocation, which requires the elongation factor eEF-II, and is again coupled to the hydrolysis of GTP. During translocation, the ribosome moves three nucleotides along the mRNA, positioning the next codon in an empty aminoacyl site. This step translocates the peptidyl tRNA from the aminoacyl site to the peptidyl site, and the uncharged tRNA from the peptidyl site to the exit site. The ribosome is then left with a peptidyl tRNA at the peptidyl site, and an empty aminoacyl site. The binding of a new aminoacyl tRNA to the aminoacyl site then causes the release of the uncharged tRNA from the exit site. This leaves the ribosome ready for the next amino acid in the growing polypeptide chain.
Termination of translation: Elongation of the polypeptide chain continues until a terminator (ie, stop codon) is translocated into the aminoacyl site of the ribosome. The 'release factor' (eRF) recognizes all three termination codons. The eRF binds to a terminator codon at the aminoacyl site and stimulates the hydrolysis of the bond between the tRNA and the polypeptide chain at the peptidyl site. This results in the release of the completed polypeptide from the ribosome.
The mRNAs are usually translated by a series of ribosomes, spaced at intervals of about 100 to 200 nucleotides. The group of ribosomes bound to an mRNA molecule is called a polyribosome (ie, polysome), and each ribosome within the group functions independently to synthesize a separate polypeptide chain.
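The logic of scanning, elongation and termination described above can be summarized in a short Python sketch. It is a deliberate simplification: the initiation and elongation factors, GTP hydrolysis and the ribosomal sites are not modelled, the function name `translate` and the example message are hypothetical, and the codon table is the standard genetic code in single-letter notation.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code (first base slowest, third base fastest); '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Scan 5' to 3' for the first AUG, then read codons until a stop codon."""
    start = mrna.find("AUG")  # scanning: the first AUG is the initiator codon
    if start == -1:
        return ""  # no initiator codon, so translation never begins
    peptide = []
    # Translocation: advance three nucleotides per elongation cycle.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # termination codon reached: release the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical message: 5' UTR, AUG, three codons, stop codon, 3' UTR.
message = "GGCACC" + "AUG" + "GCU" + "AAA" + "UGG" + "UAA" + "CCGAAAA"
print(translate(message))  # MAKW
```

The peptide is built from the amino terminus (methionine) toward the carboxy terminus, mirroring the 5′ to 3′ reading of the message. If no in-frame stop codon is present, this sketch simply stops at the end of the message.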

REGULATION OF TRANSLATION
Although transcription is the primary level at which gene expression is controlled, the translation of mRNA is an additional regulatory control point in eukaryotic cells (9-14). One of the best examples of translational regulation in eukaryotic cells is the cellular mechanism associated with the regulation of ferritin synthesis. The translation of ferritin mRNA is regulated by the supply of iron (15). More ferritin is synthesized when iron is abundant, and this regulation is mediated by a repressor protein that, when iron is scarce, binds to the iron response element (IRE) in the 5′ untranslated region of ferritin mRNA and blocks its translation. In the presence of iron, the repressor no longer binds to the IRE, and ferritin translation can proceed.
The regulation of ferritin translation by iron is similar to the regulation of transferrin receptor mRNA stability, which is regulated by protein binding to an IRE in its 3′ untranslated region. The same protein binds to the IREs of both the ferritin and the transferrin receptor mRNAs. However, the consequences of the binding of this protein to the two IREs are quite different (15). The protein bound to the transferrin receptor IRE protects the mRNA from degradation rather than inhibiting its translation. These distinct effects probably result from the different locations of the IRE in the two mRNAs. Thus, binding of the same regulatory protein to different sites on mRNA molecules can have distinct effects on gene expression: in one case inhibiting translation, and in the other case stabilizing the mRNA to increase protein synthesis. In the case of the ferritin mRNA, the protein bound to the IRE blocks translation by interfering with 5′ cap recognition and binding of the 40S ribosomal subunit. Binding of this protein to the same sequence in the 3′ UTR of transferrin receptor mRNA protects the mRNA from nuclease degradation and prolongs its half-life.
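The opposite outcomes of one binding event can be laid out as a small truth table. The following Python sketch is purely qualitative; the function name and the outcome labels are chosen here for illustration, and it assumes, as the text implies for a shared binding protein, that the protein leaves both IREs when iron is abundant.

```python
def iron_response(iron_abundant):
    """Qualitative state of the two IRE-containing mRNAs for a given iron supply."""
    # Assumption: the IRE-binding protein occupies both IREs only when iron is scarce.
    protein_bound = not iron_abundant
    return {
        # 5' UTR IRE (ferritin): the bound protein blocks cap recognition and 40S binding.
        "ferritin_translation": "blocked" if protein_bound else "proceeds",
        # 3' UTR IRE (transferrin receptor): the bound protein shields the mRNA from nucleases.
        "transferrin_receptor_mRNA": "protected" if protein_bound else "unprotected",
    }


for iron_abundant in (False, True):
    print(iron_abundant, iron_response(iron_abundant))
```

With scarce iron the cell makes less ferritin and keeps more transferrin receptor mRNA; with abundant iron the pattern reverses.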
Earlier studies suggested that protein folding is a self-assembly process determined primarily by the amino acid sequence of the protein. However, more recent studies have shown that the proper folding of proteins is mediated by the activities of a group of proteins called 'molecular chaperones'. Chaperones catalyze protein folding by assisting the self-assembly process; the folded conformation of a protein is still determined solely by its amino acid sequence. Chaperones bind to and stabilize partially folded polypeptides. In the absence of chaperones, unfolded or incompletely folded polypeptides are unstable within the cell and aggregate into insoluble complexes. Some chaperones bind to nascent polypeptides that are still being translated on ribosomes. This prevents incorrect folding of the amino terminal region of the polypeptide before the synthesis of the chain is terminated. This interaction is important for proteins in which the carboxy terminal region is required for correct folding of the amino terminus. Other classes of chaperones stabilize unfolded polypeptide chains during their intracellular transport to organelles such as the mitochondria. Finally, chaperones are also involved in the assembly of proteins that consist of multiple polypeptide chains.
Many of the molecular chaperones were originally identified as heat shock proteins, a group of proteins that are expressed in cells that have been subjected to increased temperature or other forms of environmental stress. The heat shock proteins appear to stabilize and to facilitate the refolding of proteins that have been partially denatured as a result of exposure to increased temperature. However, many heat shock proteins are expressed under normal growth conditions. They function as molecular chaperones required for polypeptide folding and transport under normal conditions, as well as under conditions of environmental stress. Members of the Hsp70 family stabilize unfolded polypeptide chains during translation as well as during intracellular transport to subcellular compartments, such as the endoplasmic reticulum (ER) and mitochondria. These proteins bind to short segments of seven or eight amino acid residues of unfolded polypeptides and maintain the polypeptide chain in an unfolded conformation, thereby preventing aggregation. Proteins in the Hsp60 family facilitate the folding of proteins into their native conformations. In several instances, members of the Hsp70 and Hsp60 families act together in a sequential fashion and may, therefore, represent a general pathway of protein folding.
In addition to molecular chaperones, cells contain enzymes that catalyze protein folding by breaking and reforming covalent bonds. The formation of disulphide bonds between cysteine residues is an important step in the stabilization of the folded structures of many protein species. In this regard, protein disulphide isomerase (PDI) catalyzes the breakage and reunion of these bonds. Disulphide bonds are usually restricted to secreted proteins and some membrane proteins; in eukaryotic cells, disulphide bonds form in the ER where the activity of PDI is correlated with the level of protein secretion. Another example of an enzyme that plays a pivotal role in protein folding is peptidyl prolyl isomerase, which catalyzes the isomerization of peptide bonds that involve proline residues.
Proteolysis is a critical step in the maturation of many proteins. A simple example of proteolysis is the removal of the initiator methionine residue from the amino terminus of many polypeptides after the growing polypeptide chain leaves the ribosome. As well, proteolytic modification of the amino terminus plays a central role in the translocation of many proteins across the membranes. This includes the translocation of secreted proteins, as well as proteins destined for targeting to the plasma membrane, lysosomes and mitochondria of eukaryotic cells.
Active enzymes and hormones are formed via proteolytic processing of larger precursors. For example, insulin is synthesized as a large precursor polypeptide (pre-proinsulin) that contains an amino terminal signal sequence, which targets the polypeptide chain to the ER. Proinsulin is formed through the removal of the signal sequence during transfer to the ER. Proinsulin is subsequently converted to insulin by proteolytic removal of an internal peptide, leaving two chains held together by disulphide bonds.
The levels of proteins within cells reflect a balance between synthesis and degradation. The differential rates of protein degradation are an important aspect of cell regulation. Rapidly degraded proteins function primarily as regulatory molecules, such as transcription factors. The rapid turnover of these proteins is necessary to allow their levels to respond quickly to external stimuli. Two major pathways mediate protein degradation: the ubiquitin-proteasome pathway and lysosomal proteolysis. The major pathway for selective protein degradation employs ubiquitin as a marker that targets cytoplasmic and nuclear proteins for rapid degradation. Ubiquitin is a 76-amino acid polypeptide that attaches to the amino group of lysine residues. The ubiquitinated proteins are recognized and degraded by a multisubunit protease complex called the proteasome. Ubiquitin is subsequently released and recycled. The other major pathway for protein degradation involves the transport of proteins to lysosomes, where they are taken up and degraded by proteases.
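One common way to make the balance between synthesis and degradation concrete, which is not taken from the article, is to assume a constant synthesis rate k_s and first-order degradation with rate constant k_d, so that dP/dt = k_s - k_d * P. The Python sketch below rests on that assumption; all function names and numerical values are hypothetical.

```python
import math


def steady_state_level(k_s, k_d):
    """Protein level at which synthesis and degradation balance (dP/dt = 0)."""
    return k_s / k_d


def half_life(k_d):
    """Half-life of a protein under first-order degradation."""
    return math.log(2) / k_d


def level_at(t, p0, k_s, k_d):
    """Solution of dP/dt = k_s - k_d * P with starting level P(0) = p0."""
    p_ss = steady_state_level(k_s, k_d)
    return p_ss + (p0 - p_ss) * math.exp(-k_d * t)


# Two hypothetical proteins with the same steady-state level (100 units)
# but very different turnover; both start from zero after a stimulus.
for name, k_s, k_d in [("short-lived", 100.0, 1.0), ("long-lived", 1.0, 0.01)]:
    print(name, round(half_life(k_d), 2), round(level_at(1.0, 0.0, k_s, k_d), 1))
```

Under these assumptions, the rapidly degraded protein approaches its new level far sooner than the stable one, which is the quantitative sense in which rapid turnover lets regulatory proteins respond quickly to external stimuli.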

CELLULAR COMPARTMENTALIZATION OF PROTEIN SORTING AND INTRACELLULAR TRANSPORT
Eukaryotic cells are distinguished from prokaryotic cells by the presence of membrane-delimited compartments, wherein specific cellular activities occur. The sorting and targeting of proteins to their appropriate destinations such as the plasma membrane, the ER or the Golgi complex are key features in the maintenance of these specific cellular activities (22,25,30-32).
Proteins destined for the ER, Golgi apparatus, lysosomes, plasma membrane or cellular secretion are synthesized on ribosomes that are bound to the ER membrane. Nascent polypeptide chains are transported from the cytoplasm into the ER, where protein folding and further processing occur before transport to the Golgi apparatus via ER-derived vesicles. In the Golgi apparatus, proteins are further processed and sorted for transport to the plasma membrane and lysosomes, or export from the cell as secretory proteins. The various cellular compartments associated with protein sorting and transport are depicted in Figure 4.
Proteins synthesized on free ribosomes either remain in the cytosol or are transported to other compartments, including the nucleus. Nuclear proteins are targeted to the nucleus by specific 'nuclear localization signals' that direct their transport through the 'nuclear pore complex'. The first nuclear localization signal characterized was that of the SV40 viral T antigen. The amino acid sequence proline-lysine-lysine-lysine-arginine-lysine-valine is necessary for the nuclear transport of the T antigen and other types of cytoplasmic proteins. Proteins are transported through the nuclear pore complex; this process is mediated by the action of a nuclear receptor called 'importin'.
Protein targeting to the ER: Ribosomes that participate in the synthesis of proteins that are ultimately destined for secretion are targeted to the ER (20,25,29-37). This targeting is directed by the amino acid sequence of the newly synthesized polypeptide chain, rather than by the intrinsic properties of the ribosome. A signal sequence spans about 20 amino acids, includes a stretch of hydrophobic residues and is located at the amino terminus of the polypeptide chain. As they emerge from the ribosome, signal sequences are recognized and bound by a signal recognition particle (SRP), which consists of six polypeptides and a small cytoplasmic RNA. The binding of the SRP inhibits translation and targets the complex (polypeptide chain, SRP, ribosome) to the rough ER. This is mediated by binding to the SRP receptor on the ER membrane. Binding to the receptor releases the SRP from the ribosome and the signal sequence of the polypeptide chain. The ribosome subsequently binds to the protein translocation complex of the ER membrane, and the signal sequence is inserted into an ER membrane channel. Translation resumes, and the growing polypeptide chain is translocated across the membrane into the ER lumen. The signal sequence is cleaved by the action of signal peptidase, and the polypeptide is liberated into the ER lumen.
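The two targeting signals described in this section, the SV40-type nuclear localization signal and the amino terminal signal sequence, can be illustrated with a toy classifier in Python. This is a sketch under stated assumptions rather than a prediction method: the set of residues counted as hydrophobic, the threshold of eight consecutive hydrophobic residues, the function names and the example sequences are all hypothetical choices made for this illustration.

```python
SV40_NLS = "PKKKRKV"  # Pro-Lys-Lys-Lys-Arg-Lys-Val in single-letter code
HYDROPHOBIC = set("AVLIMFW")  # assumed set of hydrophobic residues


def longest_hydrophobic_run(segment):
    """Length of the longest uninterrupted stretch of hydrophobic residues."""
    best = run = 0
    for residue in segment:
        run = run + 1 if residue in HYDROPHOBIC else 0
        best = max(best, run)
    return best


def has_signal_sequence(protein, window=20, min_run=8):
    """Look for a hydrophobic stretch within the first `window` residues."""
    return longest_hydrophobic_run(protein[:window]) >= min_run


def predicted_destination(protein):
    """Assign a compartment from the two signals discussed in the text."""
    if has_signal_sequence(protein):
        return "ER (secretory pathway)"
    if SV40_NLS in protein:
        return "nucleus"
    return "cytosol"


# Hypothetical sequences, invented for illustration only.
secreted = "MKT" + "LLVLAVLLAF" + "SQADEGKRTS"
nuclear = "MSTDG" + SV40_NLS + "DELQASG"
cytosolic = "MSTDGDELQASGKE"
for protein in (secreted, nuclear, cytosolic):
    print(predicted_destination(protein))
```

The signal sequence is tested first because it is read as the chain emerges from the ribosome, whereas a nuclear localization signal acts on the completed protein.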
The sec-61 complex comprises three membrane-spanning proteins and is the principal component of the ER protein conducting channel in mammalian cells. The targeting of secretory proteins to the ER is illustrated in Figure 5. Proteins destined for incorporation into the plasma membrane, ER membranes, Golgi or lysosomes are inserted initially into the ER membrane, instead of being liberated into the ER lumen. These proteins then proceed to their final destination along the following secretory pathway: ER to Golgi to plasma membrane or lysosomes. These proteins are transported along this pathway as membrane constituents, which differentiates the process from that of secretory proteins. These integral membrane proteins are embedded in the plasma membrane by hydrophobic regions that span the phospholipid bilayer of the membrane. The orientation of proteins inserted into the ER, Golgi, lysosomal and plasma membranes is established as the polypeptide chain is inserted into the ER. The ER lumen is topologically equivalent to the exterior of the cell membrane, such that the domains of plasma membrane proteins that are exposed at the level of the cell surface correspond to the regions of polypeptide chains that are translocated into the ER.
A variety of orientations of membrane proteins are found in eukaryotic cells. Transmembrane proteins are observed with either the carboxy or amino termini exposed to the cytosol (Figure 6). Integral membrane proteins span the membrane via alpha-helical regions, which consist of 20 to 25 hydrophobic amino acids. Some integral membrane proteins span the plasma membrane only once, while others have multiple membrane-spanning regions. As well, some proteins are oriented in the membrane with their amino terminus on the cytoplasmic side, and others have their carboxy terminus exposed to the cytoplasm. Two additional features of membrane proteins play a key role in determining their orientation: the stop-transfer sequence and the internal signal sequence. The consequences of these sequences in determining membrane protein orientation are illustrated in Figures 7, 8 and 9.

Figure 6) Possible orientations of membrane proteins. Integral membrane proteins span the membrane via alpha-helical regions of 20 to 25 hydrophobic amino acids, which can be inserted in a variety of orientations. The proteins at left and centre each span the membrane only once, but they differ in whether the amino (N) or carboxy (C) terminus is on the cytoplasmic side. On the right is an example of a protein that has multiple membrane-spanning regions

Protein processing in the ER: A variety of modifications to polypeptides at the level of the ER include folding and assembly, as well as covalent modifications (16-20,24,29,38).
The proteolytic cleavage of the signal sequence takes place as the polypeptide chain is translocated across the ER membrane. The translocation occurs while translation is still in progress, and molecular chaperones facilitate the folding of the polypeptide chains. The binding protein BiP is a member of the Hsp70 family of chaperones that mediate protein folding and the assembly of multisubunit proteins within the lumen of the ER (Figure 10). The correctly assembled proteins are released from BiP and are available for export to the Golgi apparatus. By contrast, abnormally folded or improperly assembled proteins remain bound to BiP and are retained within the ER where they are subsequently degraded. Disulphide bond formation is an important aspect of protein folding and assembly within the ER. This process is facilitated by the enzyme protein disulphide isomerase, which is located within the lumen of the ER.
Some proteins are anchored within the plasma membrane by glycosylphosphatidylinositol (GPI) anchors, which are assembled in the ER membrane. The GPI anchors are added immediately after completion of protein synthesis to the carboxy terminus of some proteins, which are subsequently transported to the cell surface via the secretory pathway. Their orientation within the ER dictates that GPI-anchored proteins reside on the outside of the cell.

Figure 7) The insertion of a membrane protein with a cleavable signal sequence and a single stop-transfer sequence. The signal sequence is cleaved as the polypeptide chain crosses the membrane, so the amino (N) terminus of the polypeptide chain is exposed within the endoplasmic reticulum lumen. However, translocation of the polypeptide chain across the membrane is halted by a stop-transfer sequence that anchors the protein to the membrane. The ribosome is released from the membrane, and continued translation results in a membrane-spanning protein with its carboxy (C) terminus on the cytoplasmic side

Figure 8) The insertion of membrane proteins with an internal noncleavable signal sequence. Internal noncleavable signal sequences result in the insertion of polypeptide chains in either orientation in the endoplasmic reticulum (ER) membrane. Top The signal sequence directs insertion of a polypeptide such that its amino (N) terminus is exposed on the cytoplasmic side. The remainder of the polypeptide is translocated into the ER as translation proceeds. The signal sequence is not cleaved, so it acts as a membrane-spanning sequence that anchors the protein to the membrane with its carboxy (C) terminus within the ER lumen. Bottom Other internal signal sequences are oriented to direct the transfer of the N terminal portion of the polypeptide across the membrane. Continued translation results in a protein that spans the ER membrane with its N terminus in the lumen and its C terminus in the cytoplasm. This orientation is the same as that resulting from insertion of a protein that contains a cleavable signal sequence followed by a stop-transfer sequence

Transport of proteins from the ER: Proteins travel along the secretory pathway in transport vesicles derived from the ER (22,23,25,31,32,39,40). These vesicles subsequently fuse with the membrane of the Golgi apparatus. The subsequent steps in the secretory pathway involve vesicular transport between the different Golgi compartments, and from the Golgi to the plasma membrane or lysosomes. The Golgi apparatus consists of a series of membrane-delimited cisternae and associated vesicles. Proteins derived from the ER enter at the cis face and exit the Golgi from its trans face. Proteins marked for residence within the ER are recognized by the Golgi and are returned to the ER. Other proteins are carried by transport vesicles to the trans Golgi network, where the final stages of protein modification are completed, before being targeted to lysosomes and to the plasma membrane.
Most proteins travel from the ER to the Golgi. However, some proteins particular to the functioning of the ER must be retained within that organelle (eg, BiP, signal peptidase, protein disulphide isomerase). Targeting sequences specifically designate proteins destined for retention in the ER or transport to the Golgi (Figure 11). The proteins that are retained in the ER lumen contain the targeting sequence KDEL (single letter amino acid code; lysine-asparaginaseglucine-leucine) at their carboxy terminus. The retention of certain transmembrane proteins within the ER is dictated by the carboxy terminal sequence KKXX. Soluble ER proteins are packaged into vesicles and transported into the Golgi, where they are subsequently retrieved and returned to the   Figure 12). Proteins destined for transport from the ER are selectively packaged into transport vesicles targeted to the Golgi apparatus. Thus, protein export from the ER is controlled not only by retention and retrieval signals, but also by targeting signals that mediate the selective transport to the Golgi. Protein glycosylation: Protein glycosylation takes place on specific asparagine residues (N-linked glycosylation) while a translation is taking place (23,31,33,40). The oligosaccharide is synthesized on a dolichol carrier, which is anchored to the ER membrane. The membrane-bound enzyme oligosaccharyl transferase transfers the oligosaccharide unit to acceptor asparagine residues in the consensus sequence (asparagine)-X-serine/threonine, where X represents any other amino acid. Thereafter, three glucose residues and one mannose residue are trimmed while the protein is still within the ER. The sequence of steps associated with protein glycosylation in the ER is illustrated in Figure 13. The N-linked oligosaccharides are processed within the Golgi complex in an ordered sequence of reactions. The first modification is the removal of three additional mannose residues. 
This occurs on proteins destined for secretion or for targeting to the plasma membrane. This is followed by the sequential addition of an N-acetylglucosamine residue, the removal of two more mannoses and the addition of fucose as well as two more N-acetlyglucosamines. Finally, three sialic acid residues and three galactose moities are added, and these reactions occur at the level of the trans Golgi network. The processing of the N-linked oligosaccharide of lysosomal proteins differs from that of secretory and plasma membrane proteins. The proteins destined for incorporation into lysosomes are modified by mannose phosphorylation, followed by the removal of the N-acetylglucosamine group, leaving mannose 6-phosphate residues on the N-linked oligosaccharide. These phosphorylated mannose residues are specifically recognized by the mannose 6-phosphate receptor in the trans Golgi that directs the trafficking of these proteins to lysosomes. Proteins can also be modified by the addition of carbohydrates to the side chains of serine and threonine residues within specific sequences of amino acids (O-linked glycosylation). The serine or threonine is usually linked directly to N-acetylgalactosamine to which other sugars can be subsequently added. Protein sorting and transport from the Golgi apparatus: Proteins are transported from the Golgi apparatus to their ultimate destinations via the secretory pathways. This involves sorting of the proteins into different kinds of transport vesicles that bud from the trans Golgi network and deliver their contents to the appropriate cellular addresses (25,31,32,41). In the absence of specific targeting signals, proteins are delivered to plasma membranes by bulk flow; proteins are transported in a nonselective fashion from the ER to the Golgi and ultimately to the cell surface. 
This bulk flow pathway accounts for the incorporation of new proteins and lipids into the plasma membrane as well as for the continuous secretion of certain proteins from the cell.
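The consensus sequence for N-linked glycosylation described above (asparagine-X-serine/threonine) lends itself to a simple sequence scan. The following is a minimal sketch in Python, not a validated prediction tool. The function name `find_sequons`, the `exclude_proline` flag and the example peptide are hypothetical. Excluding proline at position X is an assumption added here; it goes beyond the "any other amino acid" wording of the text and can be switched off.

```python
import re

# Asn-X-Ser/Thr, written with a lookahead so that overlapping sites are found.
# ASSUMPTION: proline is excluded at position X in the default pattern.
SEQUON_NO_PROLINE = re.compile(r"N(?=[^P][ST])")
SEQUON_ANY_X = re.compile(r"N(?=.[ST])")


def find_sequons(sequence, exclude_proline=True):
    """Return 1-based positions of asparagines in an N-X-S/T motif."""
    pattern = SEQUON_NO_PROLINE if exclude_proline else SEQUON_ANY_X
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    peptide = "MNASNPTNKT"  # hypothetical peptide, for illustration only
    print(find_sequons(peptide))                         # [2, 8]
    print(find_sequons(peptide, exclude_proline=False))  # [2, 5, 8]
```

A match marks only a candidate acceptor site; whether a given asparagine is actually glycosylated in the ER cannot be read from the sequence alone.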
The bulk flow pathway leads to continuous, unregulated protein secretion. In contrast, in some cell types a distinct regulated secretory pathway exists in which specific proteins are secreted in response to particular stimuli. Examples of regulated secretion include the release of hormones and neurotransmitters, and the release of digestive enzymes from the pancreatic acinar cells. These proteins are packaged into specialized secretory vesicles, which store their contents until specific signals direct their fusion with the plasma membrane. The sorting of proteins into the regulated secretory pathway involves the recognition of signal patches shared by multiple proteins that enter this pathway. Proteins that function within the Golgi complex must be retained within that organelle. Retention of Golgi membrane proteins is based on the transmembrane domains of those particular proteins. Golgi membrane proteins have short transmembrane alpha-helices of about 15 amino acids, which contribute to the retention of these proteins within the Golgi complex. As well, signals in the cytoplasmic tails of some Golgi proteins mediate the retrieval of these proteins from subsequent compartments along the secretory pathway.
Figure 12) Proteins that enter the endoplasmic reticulum (ER) are transported to the Golgi and subsequently to the plasma membrane. Specific signals cause proteins to be returned from the Golgi to the ER, to be retained within the Golgi, to be retained in the plasma membrane or to be transported to endosomes and lysosomes. Proteins may be transported between the plasma membrane and endosomes.
Figure 13) The sequential process of protein glycosylation in the endoplasmic reticulum (ER). Asn Asparagine; N Amino terminus; P Phosphate
The plasma membrane of polarized epithelial cells such as the enterocyte is divided into apical and basolateral domains. Each domain contains compartment-specific proteins related to the unique functions of each domain. In some types of epithelia, membrane proteins are sorted at the level of the trans Golgi network for selective transport to the domains of the plasma membrane. The GPI anchor is one signal that directs proteins to the apical membrane domain.
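The sorting rules described so far can be summarized as a decision sequence. The sketch below is a deliberate simplification for illustration: the `Protein` record, its field names and the function `predict_destination` are hypothetical, and the order in which the signals are tested is an assumption rather than a statement about the order in which the cell reads them. Golgi retention by short transmembrane domains is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    sequence: str                          # one-letter amino acid sequence
    is_transmembrane: bool = False
    has_mannose_6_phosphate: bool = False  # added in the Golgi to lysosomal enzymes
    has_gpi_anchor: bool = False
    has_regulated_signal_patch: bool = False


def predict_destination(protein):
    """Map the sorting signals named in the text to a destination."""
    tail = protein.sequence[-4:]
    # Soluble ER residents carry KDEL at the carboxy terminus.
    if not protein.is_transmembrane and tail == "KDEL":
        return "ER lumen"
    # ER transmembrane proteins carry KKXX (X is any residue).
    if protein.is_transmembrane and len(tail) == 4 and tail.startswith("KK"):
        return "ER membrane"
    if protein.has_mannose_6_phosphate:
        return "lysosome"
    if protein.has_regulated_signal_patch:
        return "regulated secretory vesicle"
    if protein.has_gpi_anchor:
        return "apical plasma membrane"
    # No signal: default bulk flow to the cell surface.
    return "plasma membrane or constitutive secretion (bulk flow)"


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(predict_destination(Protein("MKTAYIAKDEL")))
    print(predict_destination(Protein("MSTLLKKAA", is_transmembrane=True)))
    print(predict_destination(Protein("MSTLL", has_mannose_6_phosphate=True)))
    print(predict_destination(Protein("MSTLL")))
```

The final branch encodes the point made in the text that delivery to the cell surface requires no signal at all; every other destination depends on a specific retention, retrieval or targeting signal.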
A specific receptor in the trans Golgi network recognizes mannose 6-phosphate residues. The resulting complexes comprise receptor plus lysosomal enzyme, and are packaged into transport vesicles destined for lysosomes.
Vesicular transport: The first step in vesicular transport is the formation of a vesicle by a process of 'budding' from the membrane. The cytoplasmic surfaces of these transport vesicles are coated with proteins. Three types of coated vesicles that participate in vesicular transport have been characterized (42-50). Clathrin-coated vesicles are responsible for the uptake of molecules from the plasma membrane by endocytosis, as well as the transport of molecules from the trans Golgi network to lysosomes (Figure 14). The two remaining types of coated vesicles that arise from the ER and Golgi complex are called non-clathrin-coated or COP-coated vesicles. COP-I-coated vesicles arise from the Golgi apparatus, whereas COP-II-coated vesicles bud from the ER. The COP-II-coated vesicles transport material from the ER to the Golgi, whereas COP-I-coated vesicles mediate transport between Golgi stacks, recycling from the Golgi to the ER and possibly other transport processes.
The binding of clathrin to membranes is mediated by adaptins. These adaptins are responsible for the assembly of clathrin-coated vesicles at the plasma membrane and at the trans Golgi network, as well as being responsible for selecting specific molecules to be incorporated into the vesicles.
Distinct protein complexes comprise the coats of COP-I- and COP-II-coated vesicles. The components of the COP-I coat interact with the KKXX motif that is responsible for the retrieval of ER proteins from the Golgi apparatus, which is consistent with the role of COP-I-coated vesicles in recycling from the Golgi to the ER. The budding of clathrin-coated and COP-I-coated vesicles from the trans Golgi network requires the activity of a GTP-binding protein called ADP-ribosylation factor (ARF) (Figure 15). ARF is related to the Ras proteins, whose genes frequently act as oncogenes in human cancers. ARF bound to GTP associates with the Golgi membranes and is required for the binding of either COP-I-coat components or clathrin adaptins.
Figure 14) The incorporation of lysosomal proteins into clathrin-coated vesicles. Proteins targeted for delivery to lysosomes are marked by mannose 6-phosphates, which bind to mannose 6-phosphate receptors in the trans Golgi network. The mannose 6-phosphate receptors span the Golgi membrane and function as binding sites for cytoplasmic adaptins, which in turn bind clathrin. Clathrins comprise three protein chains that associate with each other to form a lattice structure that distorts the membrane and promotes vesicle budding.
The hydrolysis of the bound GTP then converts ARF to the GDP-bound state. This leads to the disassembly of the vesicle coat before fusion with the target membrane. The GDP-bound ARF is subsequently reconverted to the GTP-bound state. This is mediated by the action of a Golgi membrane protein that promotes a GDP-GTP exchange process. This leads to another cycle of coatomer assembly. P Phosphate
Several other Ras-related GTP-binding proteins have also been characterized in the secretory process. These include more than 30 Ras-related proteins (termed Rab proteins) that are implicated in vesicular transport in eukaryotic cells.
Two types of events characterize vesicle fusion with its target. First, the transport vesicles recognize the correct target membrane. Second, the vesicle and target membranes fuse, thus delivering the contents of the vesicle to the target organelle. Recognition between the vesicle and its target is mediated by interactions between unique pairs of transmembrane proteins. In contrast, fusion between the vesicle and target membranes arises from the action of general fusion proteins.
Biochemical analyses of reconstituted vesicular transport systems from mammalian cells have defined two classes of proteins involved in vesicle fusion. N-ethylmaleimide-sensitive fusion protein (NSF) is a soluble cytoplasmic protein that binds to membranes with other proteins called soluble NSF attachment proteins (SNAPs). NSF and SNAPs bind to families of specific membrane receptors called SNAP receptors (SNAREs). According to the SNARE hypothesis, interactions between specific SNAREs on the vesicle and specific SNAREs on the target membrane dictate the specificity of vesicle fusion. Following specific vesicle-target interaction, the SNARE complex recruits NSF and SNAPs, resulting in the fusion of the vesicle and target membranes. For example, transport from the ER to the Golgi requires SNAREs that are located on both the vesicle and target membranes. These interactions are additionally regulated by the Rab GTP-binding proteins that are essential for vesicle transport. The SNARE hypothesis provides a central framework for understanding the molecular mechanisms of vesicle docking and fusion.
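The two-step logic of the SNARE hypothesis (specific recognition, then fusion by general machinery) can be written as a toy model. This is only a sketch under stated assumptions: the SNARE labels, the `COGNATE_PAIRS` set and the function `try_fuse` are hypothetical placeholders and do not correspond to named proteins, and Rab regulation is reduced to a single yes/no flag.

```python
from dataclasses import dataclass, field


@dataclass
class Vesicle:
    vesicle_snare: str
    cargo: str


@dataclass
class TargetMembrane:
    name: str
    target_snare: str
    contents: list = field(default_factory=list)


# HYPOTHETICAL pairings: which vesicle SNARE matches which target SNARE.
COGNATE_PAIRS = {
    ("vesicle-SNARE-1", "target-SNARE-1"),  # e.g. an ER-to-Golgi route
    ("vesicle-SNARE-2", "target-SNARE-2"),  # e.g. a Golgi-to-lysosome route
}


def try_fuse(vesicle, target, rab_active=True):
    """Deliver cargo only if recognition succeeds; return True on fusion."""
    # Step 1: recognition depends on a specific SNARE pair (Rab-regulated).
    pair = (vesicle.vesicle_snare, target.target_snare)
    if not rab_active or pair not in COGNATE_PAIRS:
        return False
    # Step 2: fusion itself uses general factors (NSF and SNAPs), so the
    # same code path serves every cognate pair.
    target.contents.append(vesicle.cargo)
    return True


if __name__ == "__main__":
    golgi = TargetMembrane("Golgi", "target-SNARE-1")
    print(try_fuse(Vesicle("vesicle-SNARE-1", "secretory protein"), golgi))  # True
    print(try_fuse(Vesicle("vesicle-SNARE-2", "acid hydrolase"), golgi))     # False
    print(golgi.contents)  # ['secretory protein']
```

The separation of the two steps mirrors the text: specificity sits entirely in the pairing table, whereas the fusion step is identical for all routes.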
The major functions of lysosomes are related to the digestion of material taken up from outside the cell by endocytosis. Lysosomes are formed by the fusion of transport vesicles arising from the trans Golgi network with endosomes, which contain the molecules taken up by endocytosis at the level of the plasma membrane. Acid hydrolases are targeted to lysosomes by mannose 6-phosphate residues, which are recognized by mannose 6-phosphate receptors in the trans Golgi network and packaged into clathrin-coated vesicles. After removal of the clathrin coat, these transport vesicles fuse with endosomes, and the acidic internal pH results in dissociation of the hydrolases from the mannose 6-phosphate receptor. The hydrolases are thus released into the lumen of the endosome. The endosomes then mature into lysosomes as they acquire a full complement of acid hydrolases that digest the molecules taken up by endocytosis.

CONCLUSIONS - CYSTIC FIBROSIS AS A PARADIGM OF MUTATIONS LEADING TO ALTERATIONS IN TRANSCRIPTIONAL AND POST-TRANSCRIPTIONAL PROCESSING OF AN INTEGRAL MEMBRANE TRANSPORT PROTEIN
The largest family of membrane transport proteins consists of the ABC transporters, so designated because they contain a basic structural unit characterized by six transmembrane domains followed by highly conserved ATP binding cassettes. One of the most important members of the ABC family of transporters is the product of the gene responsible for cystic fibrosis (CF). This gene encodes a protein, the CF transmembrane conductance regulator (CFTR), which functions as a chloride ion channel in epithelial cells (51-53).
CF is the most common (one in 2500 newborns) lethal recessive genetic disease of Caucasians. The fundamental physiological abnormality in CF is characterized by failure of cyclic adenosine monophosphate (cAMP) regulation of chloride transport across epithelial cell membranes. The CFTR gene maps to chromosome 7 and comprises 27 exons (ie, 230 kilobases of DNA) that encode a glycosylated protein containing 1480 amino acids with a molecular mass of 170 kilodaltons. The CFTR gene product has two transmembrane domains, each containing six membrane-spanning segments, two nucleotide binding domains (NBD) and a regulatory (R) domain (Figure 16). The hydrolysis of ATP occurs at the NBD sites, while the R domain plays an inhibitory role in keeping the chloride channel closed. The closed state of the chloride ion channel arises through the dephosphorylation of the R domain.
The CFTR is restricted to the apical membrane domain of epithelial cells, where it functions as a cAMP-dependent channel that allows the selective transport of chloride ions across the epithelial cell membrane. The binding of ATP leads to the gating of the chloride ion channel. As well, the CFTR is regulated by phosphorylation, which is accomplished by the action of a cAMP-dependent protein kinase A. The phosphorylation of the R domain results in a conformational change that leads to the opening of the chloride channel. The phosphorylated R domain plays a stimulatory role by enhancing the interaction of NBDs with ATP. The binding of ATP by the NBDs and its subsequent hydrolysis serve to control the opening and closing of the chloride channel. The activated CFTR conducts chloride ions out of the epithelial cell and functions as a regulatory switch that allows cAMP to inhibit sodium ion absorption through sodium ion channels, and stimulate chloride ion secretion through channels distinct from the CFTR.
Chloride conductance at the apical membrane domain is dramatically reduced in CF. This is explained on the basis of quantitative or qualitative alterations in the CFTR, such that the clinical phenotype of CF patients is characterized by the inability of epithelial cells to transport or secrete chloride. The specific deletion of three base pairs in exon 10 results in the loss of a phenylalanine residue at position 508 within one of the ATP binding domains of the CFTR protein (ΔF508). This particular mutation is associated with 70% of the mutant alleles in CF. More than 800 additional mutations within the CF gene comprise the remaining 30% of the mutant alleles in CF.
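The effect of a three-base-pair deletion on the reading frame can be checked with a short translation script. The sketch below uses the standard genetic code. The 15-nucleotide fragment is assumed here to represent CFTR codons 506 to 510 and should be verified against a reference sequence before reuse; the single-nucleotide insertion is a purely hypothetical contrast and is not a reported CFTR mutation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# ASSUMPTION: fragment taken to cover codons 506-510 (Ile-Ile-Phe-Gly-Val).
WILD_TYPE = "ATCATCTTTGGTGTT"

# In-frame deletion of three nucleotides (CTT) spanning codons 507 and 508.
delta_f508 = WILD_TYPE[:5] + WILD_TYPE[8:]

# HYPOTHETICAL single-nucleotide insertion, shown only to contrast a frameshift.
frameshift = WILD_TYPE[:3] + "T" + WILD_TYPE[3:]

if __name__ == "__main__":
    print(translate(WILD_TYPE))   # IIFGV
    print(translate(delta_f508))  # IIGV
    print(translate(frameshift))  # IYLWC
```

Because three nucleotides are removed, the downstream codons stay in frame and only the phenylalanine is lost. Inserting a single nucleotide instead changes every codon that follows, which is why frameshift mutations typically abolish the protein rather than alter one residue.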
The ΔF508 mutation, for example, results in defective post-translational processing and intracellular trafficking of the CFTR such that it does not reach the apical membrane domain. Other mutations in the CFTR reduce its function in CF patients by a variety of mechanisms that act at one or several points in the flow of DNA to RNA to protein. Five classes of CFTR mutations (Table 2) have been described, and the molecular consequences of these different classes of mutations are illustrated in Figure 16. However, the various classes of CFTR mutations are not mutually exclusive. For example, in the ΔF508 CF mutation, the deletion of phenylalanine leads to misprocessing of the CFTR but also failure of the CFTR protein to respond normally to activation signals.
In summary, mutations in the CFTR gene lead to alterations in transcription, post-transcriptional processing, translation and post-translational processing of the CFTR membrane protein along the secretory pathway. Importantly, the various types of CFTR mutations underscore the importance of each of these critical steps in the regulation of CFTR gene expression.

ACKNOWLEDGEMENTS:
This work was supported by operating grants from the Medical Research Council of Canada and the Crohn's and Colitis Foundation of Canada. Dr Gary E Wild is a chercheur boursier clinicien of Les Fonds de la Recherche en Santé du Québec. Dr Wild wishes to extend his appreciation to Drs David Fromson, John Southin, Howard Bussey and Bruce Brandhorst of the McGill Biology Department. Their tireless efforts in the area of undergraduate science education fostered a sense of inquiry and collegiality that guided a cohort of students through the early Recombinant DNA era.

TABLE 2 Cystic fibrosis transmembrane conductance regulator (CFTR) mutations
Class I: Mutations result in abnormal protein synthesis, with premature termination of CFTR mRNA translation. This is the result of base substitutions that create stop codons (eg, G542X) or of frameshift mutations such as the 390insT resulting from the insertion of a single nucleotide. Mutations in this class result in a dramatic decrease in the numbers of functioning CFTR channels.
Class II: Mutations result in the defective processing or intracellular trafficking of CFTR protein such that it does not reach its intended address at the brush border membrane, eg, ΔF508 and N1303K.
Class III: Mutations lead to defective regulation of the CFTR, even though it reaches the brush border membrane, eg, G551D.
Class IV: Mutations are such that the CFTR reaches the brush border membrane, but conductance is defective because of altered channel properties such as gating, eg, R117H and R347P.
Class V: Mutations are associated with reduced synthesis of the CFTR. This class may include promoter mutations that reduce transcription; nucleotide alterations that promote alternative splicing; and amino acid substitutions that cause insufficient levels of functional CFTR molecules, eg, 3849 + 10 kilobase pairs C→T.