Papillomavirus Life Cycle Organization and Biomarker Selection

Human papillomaviruses (HPVs) are a diverse group of viruses that cause epithelial lesions of varying severity. Of the 100 or so types that have been identified, around 40 can infect the cervix, with a subset of these causing lesions that can progress to high-grade neoplasia and cervical cancer. These high-risk types are prevalent in the general population, and can predispose to the development of cancer in women who cannot resolve their infection. Virus infection usually leads to the establishment of productive flat warts, or to maintenance of the viral genome in an asymptomatic or latent state. Virus synthesis depends on the ordered expression of viral gene products as the infected basal cell migrates towards the epithelial surface. E7 is expressed in the lower epithelial layers, and is followed eventually by the expression of E4 and L1 closer to the epithelial surface. This ordered pattern changes in characteristic ways during neoplastic progression and latency, and can be irreversibly fixed following integration of the viral genome into the host cell chromosome. Our understanding of expression patterns and their significance is beginning to explain the nature of disease progression, and offers a rational basis for the selection of biomarkers that may be used to predict disease status and prognostic outcome.


The diversity of papillomavirus types that infect humans
Human papillomaviruses (HPVs) complete their productive cycle in stratified epithelial tissue such as cutaneous skin or mucosal epithelium. They can infect many epithelial sites and cause a wide variety of epithelial lesions, including common warts, verrucas, laryngeal papillomas, and genital condyloma. The different types of epithelial lesion are, in general, caused by different groups or types of HPV, with some types showing a very restricted tissue tropism [6,34]. This is the case with viruses such as HPV1, which causes cutaneous lesions at palmar and plantar surfaces (Fig. 1A).
DNA sequence analysis over the last 25 years has shown that papillomaviruses are a very diverse group with over 100 human members [34]. Most HPV types belong to the Alpha or Beta genus, with the two groups having quite distinct biology and life cycle patterns. The Alpha papillomaviruses are found only in humans and primates, and it is this group that contains the HPV types that are frequently associated with cervical cancer. HPV16 (α9) causes over 50% of all cervical cancers, while HPV18 (α7) is responsible for around 20% of cases (Fig. 1B). These HPV types are classified as high-risk, and are amongst 20 or so such viruses that infect the cervix (Fig. 1B). The Alpha papillomaviruses also contain low-risk members that infect the cervix, but these are not generally associated with cervical cancer. HPV types such as HPV6 and HPV11 (α10) are the best studied of these, and are also responsible for the production of external genital warts that can be a problem in young adults. Although not generally life threatening, such lesions can be difficult to treat effectively in some patients [94]. The Alpha papillomavirus genus also contains cutaneous HPV types such as HPV2 (α4), which is a prominent cause of common warts in children. The different biology of Alpha papillomavirus members (high-risk, low-risk and cutaneous) is clearly reflected at the level of virus evolution, when whole genomic sequences are compared (Fig. 1B).
Beta papillomaviruses are evolutionarily distinct from the Alpha genus (Fig. 1A), and appear to cause widespread inapparent or asymptomatic infections in the general population, with children becoming infected at an early age. In immunosuppressed patients, and in individuals suffering from the inherited disease Epidermodysplasia Verruciformis (EV), these viruses can spread unchecked, and have been implicated in the development of non-melanoma skin cancer [55,96]. EV patients carry mutations in their EVER1/TMC6 or EVER2/TMC8 genes, which render them particularly susceptible to these viruses [97,112]. The remaining HPV types come from the Gamma, Mu and Nu genera, and cause visible cutaneous papillomas that do not generally progress to cancer (Fig. 1A). Only two Mu HPV types are known (HPV1 and 63), and the Nu genus comprises only one member.

Fig. 1. The association of papillomaviruses with disease in humans. A. Human papillomaviruses are contained within five evolutionary groups. HPV types that infect cervical mucosa are contained within the Alpha group. Beta, Gamma, Mu and Nu papillomaviruses typically infect cutaneous sites. The type of lesion associated with each group is shown in the boxes. B. The Alpha papillomaviruses can be subdivided into three categories (high-risk, low-risk and cutaneous), on the basis of DNA sequence analysis. The evolutionary relationships shown on the left are also manifest in their biology. High-risk types come from the Alpha 5, 6, 7, 9 and 11 groups. The frequency with which the different HPV types are found in cervical cancers (squamous cell carcinoma and adenocarcinoma/adenosquamous carcinoma) is shown in the central columns (based on information contained in [79]).

Similarities in the organization of all papillomavirus genomes
All papillomaviruses consist of a double-stranded circular genome of around 8 kb containing one coding strand, encapsidated in an icosahedral protein shell made up of a major (L1) and minor (L2) coat protein (Fig. 2A). 360 molecules of L1 and approximately 12 molecules of L2 are needed for the formation of an infectious virion [76]. The L1 and L2 proteins are conserved across widely divergent papillomaviruses, and along with E1 and E2, are key viral gene products that are thought to have been present in the common ancestor of all animal papillomaviruses [113]. Other viral genes, such as E6, E7, E4 and E5, have been acquired or significantly modified during evolution, and are not necessarily present, or may not have the same function, in all papillomavirus types [34]. All human papillomaviruses have E6 and E7 open reading frames (ORFs), but the Beta papillomaviruses lack an E5 ORF. Avian papillomaviruses lack a clearly recognizable E4 ORF, and do not contain canonical E6 and E7 open reading frames similar to those found in human papillomaviruses [83,113]. The roles of the viral gene products have been most thoroughly worked out for the Alpha HPV types, and more specifically for the high-risk types associated with cervical cancer. While similarities exist amongst HPV types, it is also clear that there are significant differences in the way different papillomaviruses regulate their genes, and in the function of specific viral gene products, and that these differences reflect the very different biologies of the different viruses.

Life cycle organisation and the basis for biomarker selection
Irrespective of their evolutionary origin, all papillomaviruses must complete their life cycle in the epithelial tissue that they infect, and produce infectious particles that are eventually released from the epithelial surface. This requires the timely and coordinated expression of the different viral gene products as the infected cell moves towards the epithelial surface (Fig. 2B). This highly regulated pattern of gene expression allows the different stages of the virus life cycle to be completed appropriately, and provides a basis for the selection of biomarkers that may be useful in diagnosis. Life cycle events can be divided into a number of phases that are outlined below.

Primary infection of the epidermis
Papillomavirus infection begins when virus particles gain access to the epithelial basal layer, which in uninfected epidermis contains cells that are mitotically active. Such cells are normally stimulated to divide by growth factors that originate from the underlying stromal tissue, and in uninfected epithelium, it is the continual division of these cells that allows renewal of the epidermis as surface cells exfoliate. It is thought that papillomavirus particles associate with proteoglycans on the cell surface, as well as with secondary receptors that facilitate viral uptake [64,72,91,104]. At most epithelial sites, access to the basal cell layer requires a break in the integrity of the epithelial sheet. Although this may require the presence of microwounds or more obvious damage to the epithelial layer, some papillomaviruses are thought to infect sites where access to the basal layer is naturally facilitated, such as at the base of the hair follicle, or sites where columnar and stratified epithelial cells meet each other (such as at the cervical or anal transformation zone) (Fig. 2B). Successful infection leads to the establishment of the papillomavirus genome as a stable episome that is maintained at low copy number without integration into the host cell chromosome. Replication of viral DNA in these cells depends on the presence of the cellular replication machinery, with viral DNA replication accompanying cellular DNA replication as the cells progress through S-phase. The only viral proteins that are thought to be necessary in these cells are E1 and E2, which can be expressed from the differentiation-dependent promoter (PL, also referred to as P670 in HPV16) (Fig. 2B), and which play an important role in viral DNA replication and genome partitioning during cell division. The E2 protein contains a C-terminal DNA binding domain that recognizes a repeated sequence motif in the viral non-coding region [35]. One of these E2-binding motifs is adjacent to the viral origin of replication, and it is thought that E2 recruits E1 to this region by direct binding through its N-terminus. This region of E2 contains binding sites for several viral and cellular proteins such as L2 [57], and in the case of Bovine Papillomavirus type 1, the cellular protein Brd4 [74,122]. The latter interaction is important for the proper segregation of replicated episomes during cell division. Although initially involved in basal cell expansion following infection, the viral E6 and E7 proteins may not be required in the long term in cells of the basal layer, as these cells are already able to cycle as a result of their close proximity to the basal lamina. During productive infection, HPV proteins cannot easily be detected in the basal cell layer, although the E7 protein can be seen in cervical neoplasia where expression levels are not properly controlled [44].

Fig. 2. A. The six early open reading frames (E1, E2, E4, E5 (green), and E6 and E7 (red)) are expressed from either the P97 early (PE) or P670 late (PL) promoters during epithelial cell differentiation. The late open reading frames (L1 and L2 (yellow)) are also expressed from P670, following a change in splicing patterns, and a shift in polyadenylation site usage (from PAE (early polyadenylation site) to PAL (late polyadenylation site)). All the viral genes are encoded on one DNA strand. The Long Control Region (LCR from 7156 to 84) is enlarged to reveal the E2 binding sites and the P97 promoter. E1 and SP1 binding sites are also shown. B. The part of the transformation zone where the columnar cells of the endocervix meet the stratified squamous cells of the ectocervix is illustrated diagrammatically. Virus infection occurs through a microwound, or where access to basal/reserve cells is facilitated (arrows). Cells expressing cell cycle markers are shown with red nuclei. The appearance of such cells above the basal layer is a consequence of E6 and E7 expression. The expression of viral proteins necessary for genome replication occurs in cells expressing E6 and E7 following activation of P670 in the upper epithelial layers (cells shown in green with red nuclei). The L1 and L2 genes (yellow) are expressed in a subset of the cells that contain amplified viral DNA in the upper epithelial layers. Infectious particles are shed from the epithelial surface as the cells are lost through desquamation. The timing and extent of virus gene expression is summarized at the right of the figure using arrows. The consequence of expressing viral gene products in this ordered way is shown at the far right.

Cell proliferation during productive infection
Papillomavirus entry is thought to be followed by a period of cell proliferation, in which the infected cell is driven to divide in order to produce a sheet of basal cells that harbour viral episomes. The HPV E6 and E7 proteins are integral in driving cell proliferation following infection, as well as in stimulating cells to re-enter the cell cycle following their migration towards the epithelial surface (Fig. 2B). The molecular basis of HPV E6/E7 function is now well established, as the deregulation of these gene products is important in the development of cervical neoplasia and eventually cancer.
The key function of E7 lies in its ability to associate with the cell cycle regulator pRb (and other members of the pocket protein family), and to disrupt the association between pRb and the E2F family of transcription factors in the absence of growth factor stimulation (Fig. 3). The released E2F can subsequently transactivate cellular genes involved in viral DNA replication such as cyclins A and E, allowing cells above the basal layer to support DNA replication. E7 also associates with histone deacetylases [12], components of the AP1 transcription complex [2], and the cyclin dependent kinase inhibitors p21 and p27 [49], which also regulate cell proliferation. During productive HPV infection, it appears that the level of p21 and p27 is critical in determining whether an infected cell can effectively enter S-phase following migration from the basal layer (Fig. 3). It is thought that high levels of p21 and p27 lead to the formation of inactive complexes with E7 and cyclin E, which stalls the cell in a pre-S-phase state and prevents E7-mediated cell proliferation (Fig. 3). The ability of E7 to drive cells into S-phase in differentiating epithelium appears limited to those cells in the lesion which express p21 and p27 at low levels, or which express sufficient E7 to overcome the block to cell cycle progression [89]. Immunostaining studies clearly show that cyclin E is detectable in a subset of differentiating epithelial cells during productive infection, and indeed, cyclin E has been proposed as a marker of HPV infection because it is only transiently expressed (and is therefore not readily detectable) during the normal cell cycle. E2F-activated cellular proteins such as cyclin E are regarded as surrogate markers of E7 expression when present above the basal layer in productive lesions.
Other E2F-activated gene products and/or S-phase proteins such as PCNA, Ki67, p16, MCM and possibly also survivin [11], may serve similar purposes as diagnostic markers, and may be detected above the basal layer during productive papillomavirus infection. The presence of cyclin E and p16 appears linked to E7 abundance, and may offer additional information (Fig. 3). The function of the HPV E6 protein complements that of E7, and in the high-risk HPV types, the two proteins are expressed together from a single polycistronic mRNA species [107]. A primary role of E6 is its association with p53, which in the case of the high-risk HPV types, mediates p53 ubiquitination and degradation (Fig. 3). This contributes to survivin accumulation [1], and prevents growth arrest or apoptosis in response to E7-mediated cell cycle entry in the upper epithelial layers, which would otherwise occur through activation of the ARF pathway (Fig. 3). E6's general role as an anti-apoptotic protein is further emphasized by the finding that it also associates with Bak [114] and Bax [69]. This role of E6 is important in the development of cervical cancers as it compromises the effectiveness of the cellular DNA damage response, and allows the accumulation of secondary mutations to go unchecked. The E6 protein of the high-risk HPV types also plays a role in mediating cell proliferation independently of E7 through its C-terminal PDZ-ligand domain [115]. E6-PDZ binding can mediate suprabasal cell proliferation [84,85] and may contribute to the development of metastatic tumours by disrupting normal cell-cell contacts. In productive lesions, cells are driven into cycle only in the lower epithelial layers, and extend towards the epithelial surface to varying extents depending on lesion grade and the nature of the infecting HPV type [92].
Genome maintenance and the replication of genomes in the upper epithelial layers appear critically dependent on E6 and E7 expression, as well as on the viral proteins involved in origin binding. With the exception of E7 and its surrogate markers, however, viral proteins are not readily detectable prior to the onset of genome amplification during normal productive infection [36,44]. The low level expression of viral proteins in the basal layer is thought to reflect, at least in part, the need of the virus to avoid detection by the host's immune system.

Amplification of viral genomes
The maintenance of viral genomes in the proliferating basal cells, and in cells of the lower epithelium, is a common feature of all papillomaviruses. During productive infection, genome amplification is triggered as the infected cell differentiates and is pushed towards the epithelial surface by the division of the cells beneath. The initiation of late events can vary greatly in its timing, and in canine papillomas (caused by COPV), genome amplification begins in a subset of cells in the basal layer [86].
Although the onset of genome amplification in basal cells is not seen with any of the known HPV types, HPV1 and 63 reproducibly trigger genome amplification in the cell layers immediately above [13,42]. The differences between different HPV types (and indeed evolutionary groups) can be clearly seen when lesions from the same epithelial site are compared. HPV1, HPV2 and HPV4, which can all cause palmar and plantar warts, come from three different evolutionary groups. HPV2 shows a pattern that is characteristic of the Alpha group, and resembles the HPV types that cause genital warts (e.g. HPV6 and 11) and cervical lesions (HPV16 and 31) in its timing of genome amplification. Lesions from the same site caused by HPV1 and 4 show expression patterns that are clearly distinct [75].

Fig. 3. [...] ii) The consequence of the above is the release of E2F, which can then act as a transcription factor. iii) E2F stimulates the expression of proteins necessary for S-phase entry such as PCNA and MCM. In uninfected cells, p16 regulates the continued expression of these proteins by inhibiting the activation of CyclinD/Cdk. In HPV infected cells, p16 is unable to exert a regulatory effect, as S-phase entry is stimulated by E7 rather than CyclinD/Cdk. iv) The uncontrolled expression of p16 that can sometimes be seen following E7 expression is accompanied by a rise in the levels of p14. This leads to MDM inactivation and to an increase in p53 that is countered by the E6 protein. In uninfected cells, p14 levels are regulated by p16 feedback and p53 levels remain low. v) The E6 protein forms a tripartite complex with E6AP and p53, which leads to the degradation of p53 and prevents p53-mediated growth arrest and/or apoptosis. E6 can also stimulate the degradation of PDZ-domain proteins, which affects cell adhesion and signalling. B. The fate of cells expressing E7 can vary during natural infection. i) In the presence of high levels of p21 and lower levels of E7, E7-mediated cell proliferation can be stalled, and CyclinE/Cdk accumulates, probably as an inactive complex. ii) When E7 levels are high, p21-mediated growth arrest is thought to be inhibited as a result of association. In such cells, CyclinE/Cdk exists as a low-level active complex.

The specific events that trigger the onset of genome amplification are not well understood, but depend in part on changes in the cellular environment as the infected cell migrates towards the epithelial surface. Of key importance is the up-regulation of the differentiation-dependent promoter, which for many HPV types resides within the E7 reading frame (Fig. 2B). Promoter activation, which is not dependent on viral genome amplification [7,106], leads to an increase in the levels of viral proteins necessary for replication, including E1, E2, E4 and E5 (Fig. 2B). Of these, the E4 protein is by far the most abundant, with the protein becoming easily detectable by immunostaining in the upper epithelial layers. In accordance with our understanding of papillomavirus protein function, cells supporting genome amplification stain positive for both E7 and E4, and can be shown to be replication competent using in vitro assays (Fig. 4).
Although E1, E2 and E5 expression is thought to accompany the expression of E4 (Fig. 2B), these proteins have proved difficult to detect in HPV-induced lesions. E1 is thought to be a very minor gene product, as codon usage within the E1 ORF is sub-optimal and E1 transcripts are difficult to detect. E2, by contrast, is thought to be expressed more abundantly. The E2 protein has been convincingly localized to cells supporting genome amplification in bovine warts caused by BPV, but its identification in human lesions is currently less conclusive [70,109]. The mRNA species that encode E1, E2 and E4 terminate at the early polyadenylation site and, for many of the Alpha papillomaviruses, include E5 as their second or third open reading frame. E5 is a transmembrane protein that is localized predominantly in the ER. The E5 protein can associate with the vacuolar proton ATPase and can delay the process of endosomal acidification [37,61,111], which affects the recycling of growth factor receptors on the epithelial surface. The consequence of this is an increase in EGF-mediated receptor signalling, which is thought to contribute (along with E6 and E7) to the maintenance of a replication competent environment during genome amplification [27]. Support for such a role during the virus life cycle comes from the analysis of E5 mutant genomes of HPV16 and 31, which show a lower level of genome amplification than do wild type genomes following propagation in raft culture [43,51]. The detection of E5 by immunostaining is compromised by the fact that the protein is predominantly membrane bound, and that epitope availability is therefore limited. The E5 protein has been detected in lesions induced by BPV where it localises to the basal cell layer and to differentiated epithelial cells supporting genome amplification [17]. In biopsies of cervical lesions caused by HPV16, one report has shown the protein to be present in the lower epithelial layers, but not specifically in basal cells [21].
As with E2, there are only a limited number of studies that have investigated E5 distribution in vivo, and it is difficult to be certain as to its precise expression pattern.
Although the molecular basis for the role of E1 and E2 in replication is well understood, the role of E4 is not yet fully worked out. Abundant 16E4 can inhibit cell cycle progression by associating with CyclinB/Cdk and CyclinA/Cdk during G2, and this leads to cell cycle arrest at the G2/M boundary by sequestering these proteins in the cytoplasm [31,32]. It has been suggested that the continued expression of E7 in a cell containing abundant E4 (as seen in vivo) may lead to the maintenance of an S-phase environment, which allows accumulation of the viral genomes. The correlation between E4 expression and genome amplification is an invariant characteristic of all papillomavirus types that have been analysed [75,92], and in several model systems, the inability to express E4 leads to a reduced ability to support the amplification of viral genomes [82,93,119]. Although the precise contribution of E4 remains to be established, its associations with cyclin dependent kinases [30,31] and with E2 [105] are likely to play important roles. It is intriguing to note that the increase in E2 that is thought to accompany E4 must eventually down-regulate the viral early promoter that regulates E7 production and entry into S-phase. The steady rise in E2 may act both to initiate genome amplification, and in a timely manner, to turn it off once amplification is complete. Such a scenario is supported by immunostaining data, which show only a narrow overlap between cells expressing E7 and those expressing E4 (Fig. 4).

Virus assembly and release
The final stage in the virus life cycle involves the packaging of viral genomes into infectious particles that are shed from the epithelial surface. The two virus coat proteins (L1 and L2) are expressed only in cells that have already undergone genome amplification and which contain E4 in their cytoplasm. The E4 protein persists to the epithelial surface and is thought to have an additional role in virus release by associating with cytokeratins [39,98,118] and with the cornified envelope [14,15]. Both L2 and L1 are detectable by immunostaining, with the expression of L2 preceding the expression of L1 [38,47] (Fig. 2B). It appears that the precise timing of capsid synthesis is regulated at the level of RNA processing and protein synthesis [103,127,130], with a change in splice site usage leading to the production of transcripts that terminate at the late rather than the early polyadenylation site. Important for this is a splicing silencer in the HPV16 L1 gene [129], and the presence of negative regulatory elements that destabilize HPV late transcripts [28,73], and allow the preferential synthesis of early transcripts in proliferating cells. Furthermore, the pattern of codon usage within the late genes is distinct from that of the host cell, and it is thought that this may further contribute to the suppression of L1 expression in the lower epithelial layers [128,130].
In addition to the virion structural proteins, the E2 protein is also thought to be required for the assembly of infectious particles. Both L2 and E2 localize to PML (promyelocytic leukaemia) bodies within the nucleus, with E2 recruiting replicated genomes to this site [16,33]. Virus assembly occurs when L1 capsomeres enter the nucleus and are recruited to PML structures by L2 [47]. Although papillomavirus particles can assemble in the absence of L2, packaging occurs more efficiently in its presence [108], and viral infectivity is greatly enhanced [100]. In model systems of the virus life cycle, the inability to express L2 leads to a 10-fold reduction in packaging efficiency and a 100-fold reduction in virus infectivity when compared to wild type viral genomes [59]. At a molecular level, the association of L2 with L1 requires hydrophobic sequences near to the C-terminus of the protein that are thought to insert into the central hole of the pentavalent L1 capsomeres [46]. The assembly of capsids is confined to the nucleus, and is dependent on sequences within the C-terminal region of the L1 protein that are essential for capsomere interactions. These protein-protein interactions are stabilized in the upper epithelial layers by disulphide bonding, which contributes to the stability of the released particles [16,99]. It is thought that the accumulation of virion structural proteins to high levels is retarded until the cell reaches the upper epithelial layers in order to limit detection by the host immune response.

Abortive infection and neoplastic progression
DNA tumour viruses, such as Polyomaviruses or Adenoviruses, generally cause cancers when their normal pattern of gene expression is compromised, and in many instances, this is also how papillomavirus-associated cancers can arise. High-risk genotypes such as HPV16 infect cervical epithelium, with cervical cancer arising within the transformation zone of women who have long-term/persistent infections. The transformation zone is the site where the columnar cells of the endocervix meet the stratified squamous cells of the ectocervix, and this appears to be a particularly unstable site for high-risk HPV types. HPV16 infections can be found at many sites in the anogenital tract, including the vagina, vulva and penis, but despite this widespread distribution, the incidence of cancer at these sites is generally quite low (0.001% of the population). At the cervical transformation zone, the frequency of HPV-associated cancers is around 20 to 40 times higher than this, and in the absence of screening, affects around 30-40 people per 100,000 (depending on the population being studied). Only at the anus, in men who have sex with men, does the incidence of HPV-associated cancer rise to levels that are similar to those found at the cervix, and it is interesting to note that at both these susceptible sites there is a transformation zone [8].

Fig. 5. [...] This can lead to viral latency or to viral clearance. The deregulation of viral gene expression can also lead to cervical neoplasia. In the early stages of disease, this is reversible with some CIN2 reverting to CIN1 and below. The long-term deregulation of the viral oncogenes during persistent infection can lead to secondary changes in the cellular genome that contribute to the neoplastic phenotype. Integration of the viral genome into the host cell chromosome, which is thought to be a rare event, can eventually fix E6/E7 expression, predisposing to the accumulation of further genetic abnormalities and the development of cancer.

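The incidence figures quoted above can be put on a common scale with a short calculation. The sketch below is illustrative only: it assumes that the 0.001% figure and the per-100,000 figure are expressed on the same basis (the text does not state the time period), and the function name is hypothetical.

```python
from fractions import Fraction


def percent_to_per_100k(percent: str) -> Fraction:
    """Convert a percentage of the population into cases per 100,000."""
    return Fraction(percent) / 100 * 100_000


# Figure quoted in the text for anogenital sites other than the cervix.
baseline = percent_to_per_100k("0.001")

# "Around 20 to 40 times higher" at the cervical transformation zone.
low, high = 20 * baseline, 40 * baseline

print(f"other sites: {baseline} per 100,000")
print(f"cervix:      {low} to {high} per 100,000")
```

Under that assumption, the multiplier gives a range that brackets the 30-40 per 100,000 quoted for unscreened populations.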
HPV infection of the cervix can have a number of different outcomes, but in most cases it is thought that a period of productive infection is eventually followed by lesion regression and viral clearance (Fig. 5), or by the maintenance of viral genomes as latent episomes in cells of the basal layer. Resolution of infection as a result of a cell-mediated immune response is thought to be the primary mechanism by which infections are cleared, with most women who become infected showing no evidence of the infecting virus type after 18 months [121]. Primary infections are relatively common in young women, and affect between 20% and 40% depending on the population and geographical location, with the cumulative incidence of exposure over 5 years being as high as 60% [24,29,120]. This high lifetime risk of infection (estimated at approximately 80%) contrasts markedly with the relatively low risk of developing cervical cancer, and it is now well established that cancer progression typically occurs over years or decades in patients who do not effectively resolve their initial infection.
The precursors of cervical cancer have been recognized for many years, and are classified by pathologists as cervical intraepithelial neoplasia (CIN) of varying grade. Flat condylomata and CIN1 are low-grade lesions in which the productive cycle of the virus is supported to some degree, with cells that are being driven through the cell cycle being confined to the lower epithelial layers (Fig. 6). In CIN1, such cycling cells typically occupy the lower third of the epithelium, with mitotic figures being occasionally apparent in these cell layers. High-grade lesions such as CIN2 and CIN3 are characterised by the persistence of cycling cells into higher cell layers, and in CIN3, such cells are detectable at the epithelial surface. Such lesions represent abortive infections in which viral gene expression is not properly controlled, and late events in the virus life cycle are not properly supported (Fig. 6). The accurate identification of lesion grade has prognostic significance, as it has been estimated that around 20% of CIN1 will progress to CIN2, and that around 30% of CIN2 will progress to CIN3 if left untreated. CIN3 are generally considered to be the direct precursors of cervical cancer, and it has been suggested that around 40% of CIN3 lesions will progress to cervical cancer in the absence of intervention [95]. This figure may need to be re-evaluated in light of recent data from vaccination trials, which show that lesions graded as CIN2 or higher can be detected in around 1% of infected women within a few years of infection [54], and it is likely that many of these either regress or revert to lower grade disease rather than progressing to cancer. The aim of cervical screening is to predict the severity of cervical disease from the analysis of cells taken from the epithelial surface, and this approach has been hugely successful in managing cervical cancer in developed countries [95]. 
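As a rough illustration of how the stage-to-stage estimates quoted above combine, the sketch below multiplies them together. This is a deliberate simplification that is not made in the text: it assumes that each step is independent, that lesions are left untreated, and that regression is ignored. The variable and function names are hypothetical.

```python
from fractions import Fraction

# Approximate stage-to-stage progression fractions quoted in the text.
P_CIN1_TO_CIN2 = Fraction(20, 100)
P_CIN2_TO_CIN3 = Fraction(30, 100)
P_CIN3_TO_CANCER = Fraction(40, 100)


def chained_progression(*steps: Fraction) -> Fraction:
    """Multiply successive progression fractions (assumes independent steps)."""
    result = Fraction(1)
    for step in steps:
        result *= step
    return result


cin1_to_cin3 = chained_progression(P_CIN1_TO_CIN2, P_CIN2_TO_CIN3)
cin1_to_cancer = chained_progression(
    P_CIN1_TO_CIN2, P_CIN2_TO_CIN3, P_CIN3_TO_CANCER
)

print(f"CIN1 -> CIN3:   {float(cin1_to_cin3):.1%}")
print(f"CIN1 -> cancer: {float(cin1_to_cancer):.1%}")
```

On these assumptions only a few percent of untreated CIN1 lesions would be expected to reach cancer, which is in keeping with the point that most infections and low-grade lesions do not progress.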
The molecular events that cause a productive lesion to progress to neoplasia and eventually to cancer are not precisely understood, but appear to reflect changes in viral gene expression, and in particular, an elevation in the level of E6 and E7 expression [116]. Using molecular markers that can distinguish between key life cycle phases (i.e. cell proliferation, genome amplification and virus synthesis) it appears that these events change in a very ordered way during progression to high-grade neoplasia, with the thickness of the E7-expressing layer increasing markedly with increasing disease severity [75]. This deregulation of early gene expression is accompanied by a concomitant failure to trigger late events until the cells are close to the epithelial surface, and in CIN3, genome amplification is often confined to small pockets at the epithelial surface. Changes in the timing of viral gene expression are accompanied by changes in the levels of the viral proteins, and this is likely to have a key influence in determining lesion grade. The E7 protein can be detected more easily in high-grade neoplasia than in productive lesions [36,44,45], and this is also evident when staining is carried out with surrogate markers of elevated E7 expression such as p16INK4A, which can be detected in some CIN1 lesions as well as in CIN2 and CIN3. The molecular basis for changes in the pattern of E6 and E7 expression in neoplasia is not well understood, but may reflect the effects of hormones or cytokines acting on cells in the transformation zone.

Development of cervical cancer
High-grade neoplasia such as CIN2 and CIN3 are regarded as precursor lesions that have a high frequency of progressing to cancer. Although it is clear that cervical cancer can sometimes develop rapidly from such lesions, more typically it requires long-term/deregulated expression of the viral oncogenes, which is facilitated by the integration of HPV DNA into the host cell chromosome. Integration is typically a random event, and can occur at any site in the host cell DNA, with different levels of disruption to viral gene expression occurring as a result. Several reports have suggested that integration is favoured at common fragile sites, and in some instances, the expression pattern of cellular genes near to the integration site may contribute to the development of cancer [123]. The most significant consequence of integration however, is the deregulated expression of the viral E6 and E7 genes, which leads to persistent stimulation of S-phase entry, and the loss of the cellular p53 protein as a result of E6 activity. This combination of events predisposes to the accumulation of secondary genetic changes in the host cell genome, and compromises the ability of the cell to effectively repair damaged DNA. Although high-risk papillomaviruses express genes that stimulate cell proliferation and inhibit the p53-mediated DNA damage response, their presence alone is not sufficient to induce cervical cancer. The accumulation of mutations in the host-cell genome is an important and necessary event in the development of cervical cancer, and can occur over time in women who carry persistent active infections. Such mutations can occur by chance, or can arise as a result of environmental events. Tobacco metabolites, which are present in the cervical secretions of women who smoke, are thought to act as co-factors in the development of cervical cancer [56,58]. Multiparity and the long-term use of oral contraceptives are also associated with increased risk [19,77], as is the presence of additional genital infections such as chlamydia. Given the requirement for secondary genetic changes, it is not difficult to see why cervical cancer develops at the site of infection over the course of years or decades. The average age of women presenting with visible high-grade neoplasia (CIN3) is in the late twenties, whereas the mean age of women with cervical cancer is in the fifties [9].
Fig. 6. Changes in expression patterns that accompany cervical cancer progression. Low-grade cervical lesions such as CIN1 (LSIL) are similar to productive infections caused by other HPV types. In such lesions, the three phases in the virus life cycle are distinguishable using antibodies to E7, E4 and L1 (shown diagrammatically on the left). During neoplastic progression (CIN2, CIN3), the levels of E6 and E7 are de-regulated, which facilitates the accumulation of secondary genetic changes that predispose to the development of cervical cancer. Although the order of events is unaltered (E7 then E4 then L1), cells expressing E7 extend closer and closer towards the epithelial surface as the grade of neoplasia increases. In high-grade neoplasia (CIN3), genome amplification is restricted to small pockets of differentiated cells, with the expression of L1 occurring only rarely. The development of cervical cancer typically involves the integration of viral DNA into the host cell genome, which stabilizes E6/E7 expression and facilitates the accumulation of secondary changes in the host cell.
Integration is an important event in the development of most cervical cancers. The retention of E6 and E7 and the continued expression of these proteins is usually accompanied by the loss of regulatory genes such as E2 and E4, which are thought to antagonize E7-mediated cell proliferation during productive infection [30,31,48,52,81]. Integration also leads to the loss of negative regulatory elements at the 3' end of HPV early transcripts, which contributes further to the deregulated expression of viral oncogenes [62]. The requirement of these genes for maintenance of the cancer cell phenotype is shown clearly by studies that have aimed to inhibit their expression in cervical cancer cells. Cell lines such as SiHa and HeLa, which have been grown in the laboratory for almost 50 years, undergo apoptosis and cell death following the inhibition of E6 function by various approaches [18,53,63]. This also occurs following the re-introduction of E2, which suppresses the expression of the viral oncogenes by binding to the URR and down-regulating the early promoter [52,88]. Levels of E7 expression can also be affected by exposure to glucocorticoids and progesterone, as well as by methylation, and by changes in chromatin organization [3,65,90].
Although many different HPV types can infect the cervix, they are not all associated with cervical cancer. Most of the high-risk HPV types are contained within a separate evolutionary branch that is clearly distinct from the low-risk cutaneous and mucosal alpha papillomaviruses [102]. Even within the high-risk HPV group, there are, however, considerable differences in the frequency of cancer association [23,25]. This can be seen from the fact that HPV16 and HPV31, which are closely related at the evolutionary level, are associated with very different levels of cervical cancer (HPV16, 53.5%; HPV31, 2.9%) [79]. Although these differences are not fully understood, the E6 and E7 proteins encoded by the high-risk types are thought to have specific functions that are important in cancer progression. High-risk (but not low-risk) E7 proteins can stimulate the accumulation of centrosomal abnormalities in cell culture and transgenic animals, and it has been suggested that this may increase the chance of genetic errors during each round of cell division [40,41,101]. This function of E7 is not totally dependent on its ability to bind to pRb and other members of the pocket protein family, which is of key importance in driving cell proliferation. High-risk E7 proteins bind pRb more efficiently than the low-risk E7 proteins [50,78], and contribute to cancer progression by stimulating the proteasome-mediated degradation of pRb [5,10]. Like E7, the E6 proteins also differ in function when low-risk and high-risk viruses are compared. The ability of high-risk E6 proteins to form a tripartite complex involving E6AP and p53 is well documented, and leads to the ubiquitination and degradation of the p53 protein. The low-risk E6 proteins bind p53 with lower affinity, and have no significant ability to bind E6AP and to stimulate the degradation of p53 [60,110].
The loss of the p53-mediated DNA damage response in cells that are driven into S-phase over years or decades allows the accumulation of secondary changes in the host cell chromosome that eventually lead to cancer. The high-risk E6 proteins differ from the low-risk proteins in having a PDZ-binding domain at their C-terminus, which allows the binding and degradation of cellular targets containing PDZ motifs, such as hDlg and hScrib, which are involved in the regulation of cell growth and attachment [124]. PDZ binding is a characteristic of high-risk E6 proteins, and it is thought that the loss of cell-cell contacts mediated by tight junctions may contribute to the loss of cell polarity seen in HPV-associated cervical cancers [80]. The association with PDZ-domain proteins is distinct from E6's ability to bind and degrade p53, and appears important for cell proliferation in experimental systems [66,85]. High-risk E6 proteins have also been reported to activate the catalytic subunit of telomerase, hTERT (human telomerase reverse transcriptase) [67]. Telomerase activity is usually absent in somatic cells, and its absence leads to the shortening of telomeres as cells divide. E6-mediated hTERT activation stimulates the addition of hexamer repeats to the telomeric ends of chromosomes, which is expected to inhibit cell senescence and extend the lifespan of cells harbouring HPV genomes. Such a function may predispose to persistent infection.
Most cervical HPV infections do not lead to cervical cancer, with the vast majority causing only transient infections. It has been suggested that HPV16 and 18 may be more persistent than other HPV types [20], with persistence becoming a problem in women over 30 [117].
In experimental models, lesion formation is followed by lesion regression in as little as 5 weeks [22,87], whereas in infected women, most HPVs are thought to be cleared within 18 months, with re-infection by the same HPV type being uncommon [121]. The immune system is clearly important in controlling persistence, and patients with immune defects can develop widespread lesions that are refractory to treatment. In model systems and in humans, regressing warts are associated with lymphocyte infiltration and elimination of infected tissue over the course of a few weeks [26,68,87]. Although papillomavirus persistence is thought to depend in part on the genetic background of the host and the nature of the infecting HPV type, it appears that as a group, the Alpha papillomaviruses have developed characteristics that may enhance the longevity of infection. Amongst these characteristics are the regulation of E-cadherin expression and Langerhans cell density by E6 [71,125], the interference with MHC presentation by E5 [125], and interference with the function of interferon regulatory factor 3 by E7 [4]. The fact that papillomaviruses avoid cell lysis, and express their late proteins only in the upper epithelial layers, may further restrict detection by the immune system. The viral early proteins, which are expressed in the lower epithelial layers, are thought to be expressed at levels below those required to effectively trigger a host immune response. The stimulation of a cell-mediated response that can clear infection appears to depend on the cross-priming of dendritic cells by viral antigens expressed on keratinocytes. It is uncertain whether the virus remains latent in the basal layer following regression, but this is suspected, and may explain the frequent detection of HPV DNA in the absence of disease.
It appears that methylation may also play a role in controlling viral gene expression, and that under some circumstances, this may lead to an asymptomatic infection without a preceding productive phase [65] or immune response. Latent infection is thought to require E1 and E2 to allow basal cell replication of viral episomes, but is not thought to depend on the expression of E6 and E7 [126].

A consensus view of biomarker expression patterns during infection
Productively infected cells need to express the viral gene products in a defined order for infectious virions to be assembled at the epithelial surface. During productive infection, cell cycle markers such as PCNA and MCM are confined to the lower epithelial layers, with their presence being a direct consequence of E6/E7 activity. A separate and distinct type of biomarker identifies cells that are undergoing genome amplification, with the E4 protein being the most abundant member of this group. Virus assembly follows genome amplification, and is marked eventually by the appearance of capsid proteins (L1 and L2) in the upper epithelial layers. Virion structural proteins represent a third category of biomarker, and identify cells that are completing their productive cycle. Our current theories suggest that infected cells express each of these markers in turn during epithelial differentiation, and that the extent of productive infection can be established by considering the timing and extent of their expression.
During neoplastic progression, it appears that two types of change can occur. Firstly, viral gene expression becomes de-regulated and E6/E7 levels in the basal and parabasal cells increase. Given the known functions of these proteins, this is predicted to be a significant event in stimulating the accumulation of genetic errors in the host cell chromosome. Integration eventually fixes the expression of E6/E7 in the cell, which contributes further to the chance of progression. The second type of change relates to the regulation of late events, which become progressively retarded as the grade of neoplasia increases. This can be seen by immunostaining as an increase in the thickness of the E7-expressing layers, and a reduction in the extent to which E4 is expressed in the upper layers of the epithelium. Whether these two events are linked remains to be established, as do the precise mechanisms that regulate such activity.