A Pilot Study Comparing HPV-Positive and HPV-Negative Head and Neck Squamous Cell Carcinomas by Whole Exome Sequencing

Background. Next-generation sequencing of cancers has identified important therapeutic targets and biomarkers. The goal of this pilot study was to compare the genetic changes in a human papillomavirus- (HPV-)positive and an HPV-negative head and neck tumor. Methods. DNA was extracted from the blood and primary tumor of a patient with an HPV-positive tonsillar cancer and those of a patient with an HPV-negative oral tongue tumor. Exome enrichment was performed using the Agilent SureSelect All Exon Kit, followed by sequencing on the ABI SOLiD platform. Results. Exome sequencing revealed slightly more mutations in the HPV-negative tumor (73) in contrast to the HPV-positive tumor (58). Multiple mutations were noted in zinc finger genes (ZNF3, 10, 229, 470, 543, 616, 664, 638, 716, and 799) and mucin genes (MUC4, 6, 12, and 16). Mutations were noted in MUC12 in both tumors. Conclusions. HPV-positive HNSCC is distinct from HPV-negative disease in terms of evidence of viral infection, p16 status, and frequency of mutations. Next-generation sequencing has the potential to identify novel therapeutic targets and biomarkers in HNSCC.


Introduction
Tobacco use has steadily declined over the last four decades [1]. In parallel, there has been a decline in cancers of most sites in the upper aerodigestive tract [2]. The exception to this trend is cancers of the oropharynx, particularly those of the palatine and lingual tonsils, which are caused by oncogenic subtypes of the human papillomavirus (HPV) [3]. The rise in incidence of HPV-positive head and neck squamous cell carcinoma (HNSCC) has been dramatic, causing the rates of tonsillar cancer to increase by as much as threefold in some countries [3,4]. HPV-positive patients experience markedly better survival, and their tumors are molecularly distinct from traditional head and neck cancers [5]. Overexpression of p16 and proteolysis of p53 are nearly universal in HPV-positive tumors, in contrast to the frequent loss of p16 and point mutations in p53 that are found in HPV-negative cancers [5]. However, the specific mechanisms responsible for improved survival in HPV-positive patients have not been fully elucidated.
Next-generation sequencing has yielded important insights into the pathogenesis of other cancers by identifying biomarkers and therapeutic targets. High-throughput sequencing of HNSCC tumors has recently been reported, and NOTCH inactivation was the most significant finding [6,7]. This pilot study aims to contrast the mutations seen in an HPV-positive and an HPV-negative tumor using whole exome sequencing and to further our understanding of the mutations that define HNSCC.

Patient Selection and Tumor and Blood Sample Collection.
Ethics approval was obtained from Western University Health Sciences Research Ethics Board. Informed consent was obtained from patients undergoing ablative surgery for head and neck cancer to have a portion of their tumor stored, a 10 mL blood sample taken, and their clinical parameters prospectively collected. Two patients were identified for this pilot study: a 49-year-old nonsmoking male with a T2N0 tonsillar cancer treated with transoral robotic surgery and neck dissection and an 81-year-old female with a history of heavy smoking with a T2N0 oral tongue cancer treated with partial glossectomy, neck dissection, and free flap reconstruction. Primary site tumor specimens were taken from the center of the resection specimen. Ten mL of venous blood were drawn intraoperatively into heparinized collection tubes.

p16 Immunohistochemistry.
For each patient, a portion of the primary tumor was fixed in formalin and embedded in paraffin. The blocks were then sectioned (5 µm thick). p16 immunohistochemistry was performed as previously described using a mouse monoclonal antibody against p16 (MTM Laboratories, Heidelberg, Germany) at 1:500 dilution [8]. Immunohistochemistry scoring was conducted by two study pathologists (BW and KK) blinded to HPV status and patient information. Scoring was as described by Begum et al., with strong and diffuse staining (>80 percent of tumor cells) regarded as a positive result and absent or focal staining regarded as negative [9].

DNA Extraction from Blood and Tumor Tissue.
DNA was extracted from 10 mL of whole blood using the QIAamp Blood Maxi kit following instructions provided by the manufacturer (Qiagen, Valencia, CA, USA). DNA was extracted  [11]. Samples were initially locally realigned using the IndelRealigner walker from the GATK package with known insertions and deletions found in dbSNP 135. This was followed by base quality recalibration from GATK. Default parameters were used for both steps except for SOLiD-specific parameters in the recalibration step. Reads without any color space calls were marked as failing vendor quality and thus were removed from further downstream analysis. In addition, reads that had a reference base inserted into the reads due to inconsistent color space calls had those bases set to Ns with base qualities of zero. Finally, variants were called and filtered using the GATK UnifiedGenotyper and VariantFiltration walkers, again with default settings. To be considered for further downstream analysis, a tumor variant had to have at least 8x coverage within the target regions; there were 37,038,261 sites (71.86%) for the HPV-positive tumor and 39,150,091 sites (75.96%) for the HPV-negative tumor that met this criterion. In addition to coverage, the following requirements, as assessed by the VariantFiltration walker, had to be met: (i) variant quality equal to or greater than 30, (ii) variant confidence/quality by depth (QD) equal to or greater than 2.0, is the total mapping quality zero reads and DP is the unfiltered read depth.
A reference variant required a minimum read depth of 8x within the target region for further consideration (38,673,520 sites (75.03%) and 38,058,450 sites (73.84%) for HPV-positive and HPV-negative tumors, respectively). This left 36,020,799 and 37,049,778 comparable sites in the HPV-positive and HPV-negative tumors, respectively. Using in-house custom Perl code, somatic variants within the targeted regions were identified. To be classified as a somatic variant, the following conditions had to be met: (1) a tumor variant was identified by GATK that met the above filtration requirements and (2) the corresponding position in the normal sample had 8x coverage and did not have a GATK variant call. Somatic variants were annotated with refGene annotations (http://varianttools.sourceforge.net/Annotation/RefGene), and consequences were identified using ANNOVAR v2012-03-08 [12].
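The two somatic-variant conditions above can be illustrated with a minimal Python sketch. This is not the authors' Perl code: the `VariantCall` record, the dictionary-based inputs, and the function names are hypothetical, and only the thresholds stated in the text (8x coverage, variant quality of at least 30, QD of at least 2.0) are applied. The mapping-quality-zero criterion is not fully specified in the text and is therefore left out.

```python
from dataclasses import dataclass

MIN_DEPTH = 8    # minimum read depth at a site (8x), as stated in the text
MIN_QUAL = 30.0  # minimum variant quality, criterion (i)
MIN_QD = 2.0     # minimum quality by depth (QD), criterion (ii)


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one variant call at a site."""
    qual: float
    qd: float


def passes_filters(call, depth):
    """Apply the coverage, quality, and QD thresholds to a tumor call."""
    return depth >= MIN_DEPTH and call.qual >= MIN_QUAL and call.qd >= MIN_QD


def somatic_sites(tumor_calls, tumor_depth, normal_calls, normal_depth):
    """Return the sorted sites that satisfy both somatic conditions.

    tumor_calls / normal_calls: dict mapping (chrom, pos) -> VariantCall
    tumor_depth / normal_depth: dict mapping (chrom, pos) -> read depth
    """
    somatic = []
    for site, call in tumor_calls.items():
        # Condition 1: the tumor variant meets the filtration requirements.
        if not passes_filters(call, tumor_depth.get(site, 0)):
            continue
        # Condition 2: the normal sample has >= 8x coverage and no call.
        if normal_depth.get(site, 0) < MIN_DEPTH or site in normal_calls:
            continue
        somatic.append(site)
    return sorted(somatic)


# Toy data (invented for illustration only).
tumor_calls = {
    ("chr1", 100): VariantCall(qual=50.0, qd=5.0),  # passes both conditions
    ("chr1", 200): VariantCall(qual=45.0, qd=4.0),  # also called in normal
    ("chr2", 300): VariantCall(qual=20.0, qd=5.0),  # fails variant quality
    ("chr2", 400): VariantCall(qual=60.0, qd=6.0),  # normal under-covered
}
tumor_depth = {("chr1", 100): 20, ("chr1", 200): 15,
               ("chr2", 300): 30, ("chr2", 400): 12}
normal_calls = {("chr1", 200): VariantCall(qual=40.0, qd=3.0)}
normal_depth = {("chr1", 100): 10, ("chr1", 200): 18,
                ("chr2", 300): 25, ("chr2", 400): 5}

print(somatic_sites(tumor_calls, tumor_depth, normal_calls, normal_depth))
```

In this toy example only the site at chr1:100 is reported. The other three sites are excluded because of a matching call in the normal sample, a low variant quality, and insufficient coverage in the normal sample, respectively.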

p16 Immunohistochemistry and HPV Testing.
Genomic DNA was extracted from matching tumor and blood samples from two head and neck cancer patients: patient 1 was a 49-year-old nonsmoking and nondrinking male, and patient 2 was an 81-year-old female smoker. Patient demographics, treatment details, and histopathologic parameters are outlined in Table 2. Tumor sections from each patient were stained with hematoxylin and eosin (Figures 1(a) and 1(b)). The tumor tissue from patient 1 stained diffusely positive for p16 (Figure 1(c)), while the tumor tissue from patient 2 was negative for p16 (Figure 1(d)). In situ hybridization testing with the broad-spectrum HPV probe demonstrated strong punctate staining within nuclei of the tumor of patient 1, consistent with high-risk HPV infection (Figure 1(e)). HPV-specific, punctate nuclear staining was absent in the tumor of patient 2 (Figure 1(f)). We employed primers designed specifically against unique portions of the E6-E7 region of HPV type 16 and type 18 to confirm the HPV status of the patients in this study. The GAPDH control was amplified from both patients; as expected, only patient 1 was HPV type 16 positive (Figure 2). Patient 2 was negative for HPV type 16, and both patients were HPV type 18 negative (data not shown).

Exome Capture and Raw Sequencing Results.
The exomes from tumor tissue and matched blood samples from each patient were sequenced. For each tumor or blood sample, approximately 1.2 billion bases were sequenced, 86% of which were specific for exome sequences. The mean coverage of the exome targets was 28.1-fold, with 91.6% of the targets being sequenced at least once and 67.4% sequenced at least ten times. The exome capture and sequencing results were within the normal range of performance specified by the manufacturer and are comparable with published results [13]. shown to harbor mutations in large-scale sequencing studies (Tables S1 and S2) [6,7].
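As a rough illustration of how coverage figures of this kind (mean coverage, fraction sequenced at least once, fraction sequenced at least ten times) can be derived, the following Python sketch summarizes a list of read depths. It assumes one depth value per targeted position, which may differ from how the authors aggregated over targets; the function name and the toy depths are hypothetical and are not the study's data.

```python
def coverage_summary(depths, threshold=10):
    """Summarize read depths over a set of targeted positions.

    depths: list of non-negative integers, one per targeted position
    threshold: depth regarded as adequate coverage (10x in the text)
    """
    if not depths:
        raise ValueError("depths must not be empty")
    n = len(depths)
    return {
        "mean_coverage": sum(depths) / n,
        "fraction_covered": sum(1 for d in depths if d >= 1) / n,
        "fraction_at_threshold": sum(1 for d in depths if d >= threshold) / n,
    }


# Toy depths (invented for illustration only).
toy_depths = [0, 3, 12, 25, 40, 0, 10, 30]
summary = coverage_summary(toy_depths)
print(summary)
```

For the toy depths, the mean coverage is 15.0, three-quarters of the positions are covered at least once, and five of the eight positions reach the 10x threshold.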
Patient characteristics, PCR analysis, in situ hybridization testing, and immunohistochemistry all indicated that patient 1 was HPV type 16 positive (HPV type 18 negative by PCR) and that patient 2 was HPV negative (both type 16 and type 18 negative). When we used our four compiled exome sequences (blood and tumor from both patients) as queries against the HPV type 16 genomic sequence (RefSeq NC_001526.2), the tumor sequence from patient 1 matched numerous regions (39 hits) of the HPV 16 genome (Figure 3). Matches were identified to all the HPV type 16 genes (except E4), suggesting that the HPV type 16 genome had integrated into the tumor genome of patient 1. The tumor sample from patient 2 and the blood samples from both patients did not align to any specific HPV sequences.
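The idea of querying sequence data against a pathogen genome can be sketched in a few lines of Python. The text does not state which search tool was used, so this example substitutes a deliberately simple stand-in: it flags reads that share at least one exact k-mer with a reference sequence in either orientation. The function names, the value of k, and the toy reference (an invented string, not HPV 16 sequence) are all assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (unknown bases -> N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement.get(base, "N") for base in reversed(seq.upper()))


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def matching_reads(reads, reference, k=12):
    """Return indices of reads sharing an exact k-mer with the reference.

    Both the read and its reverse complement are checked, since a read
    may originate from either strand.
    """
    ref_kmers = kmers(reference.upper(), k)
    hits = []
    for n, read in enumerate(reads):
        read = read.upper()
        candidates = kmers(read, k) | kmers(reverse_complement(read), k)
        if candidates & ref_kmers:
            hits.append(n)
    return hits


# Toy reference and reads (invented; not real viral sequence).
toy_reference = "ATGCGTACCTGAAGTCCGATTGACCGTTAGCAAGGCTTACG"
toy_reads = [
    toy_reference[5:25],                       # forward-strand fragment
    reverse_complement(toy_reference[10:30]),  # reverse-strand fragment
    "TTTTTTTTTTTTTTTTTTTT",                    # unrelated read
]

print(matching_reads(toy_reads, toy_reference))
```

Here the first two toy reads are flagged and the third is not. A real screen would normally use an alignment tool that tolerates mismatches and reports match coordinates, which exact k-mer lookup does not.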

Discussion
HPV-positive head and neck squamous cell carcinoma (HNSCC) has been described as molecularly distinct from traditional head and neck cancer [5]. The human papillomavirus (HPV) oncoproteins E6 and E7 promote carcinogenesis by degrading the tumor suppressors p53 and retinoblastoma protein (Rb), respectively. In contrast, p53 is not degraded in HPV-negative HNSCC but is frequently mutated, and p16 is often lost through homozygous deletion, methylation, or, less frequently, point mutation [5,14]. This might lead one to believe that carcinogens like tobacco and alcohol would promote HNSCC comprised of a large number of mutations in many different pathways. In contrast, HPV-positive cancers, modulated by the activities of viral oncoproteins, might not accumulate a large number of cellular mutations. In our study, we provided quadruple confirmation of tumor HPV status with p16 immunohistochemistry, HPV in situ hybridization, HPV detection by PCR, and detection of the HPV 16 genome sequences within patient 1's sequenced exome. We observed more mutations in the HPV-negative tumor than in the HPV-positive tumor, although the absolute difference was not dramatic (73 versus 58, respectively). Two large-scale exome sequencing efforts characterizing HNSCC have been reported recently [6,7]. The study led by Stransky et al. reported twice as many mutations in the HPV-negative samples (4.83 mutations/Mb versus 2.28 mutations/Mb) [7]. The second group examined a set of 32 patients, four of whom were HPV positive, and reported on a subset of mutations that were identified by exome sequencing and confirmed by PCR. In this subset of genes, there were four times as many mutations in the HPV-negative tumors (20.6 ± 16.7 versus 4.8 ± 3 mutations in the HPV-positive tumors) [6]. Given the broad range of mutations seen in the HPV-negative cancers, our finding of slightly more mutations in the HPV-negative tumor is consistent with their results.
As expected we did not identify TP53 or p16 mutations in the HPV-positive tumor; however these two genes appeared as wild type in the HPV-negative tumor as well. The lack of a p16 mutation in the setting of low expression levels as evidenced by immunohistochemistry may reflect that it has been inactivated by promoter methylation, the second most common cause of p16 loss [14].
Only a single gene (MUC12) was mutated in both the HPV-positive and HPV-negative tumor samples. The cell surface associated MUC12 was the only mucin gene found to be mutated in the HPV-negative tumor. In contrast, the HPV-positive tumor had five mutations in four different mucin genes, including the secreted MUC6 and the transmembrane bound MUC4, MUC12, and MUC16. Stransky et al. reported mutations in all the above mucins except for MUC12 [7]. Mucins are known to be involved in the development of epithelial cancer, where they are often overexpressed, disrupting the epithelial cell polarity and promoting the epithelial to mesenchymal transition (EMT) phenotype [15]. Multiple damaging mutations within the mucins of the HPV-positive tumor may suggest another cellular difference between these two distinct tumor types.
We also found multiple mutations in the zinc finger (ZNF) family genes in both tumor types. The ZNF family represents a large group of molecules which are involved in various aspects of transcriptional regulation [16]. There were almost twice as many mutated ZNF genes in the HPV-positive sample. Although there were a total of 11 ZNF mutations between the two tumor types, there were no shared ZNF members mutated in both cancers. Stransky et al.   [7]. Not enough is known about the role of mucins and ZNF proteins in HNSCC. These molecules may warrant further study.
We confirmed that the sequence from the human pathogen HPV type 16 was identified within the exome sequence of an HNSCC tumor. In order for HPV to be oncogenic, the viral E2 protein, which represses the expression of E6 and E7, must be lost [17]. This only occurs during integration when the episomal HPV DNA breaks within the E2 gene. PCR detection of E6 and E7 can detect both episomal and integrated forms and thus cannot distinguish between a superficial HPV infection and integrated viral DNA causing the cancer [17]. An additional benefit of whole exome sequencing is the detection of integrated HPV DNA. In Stransky's study, exome sequencing appeared even more sensitive than PCR for detection of HPV, as it identified the presence of HPV 16 in 14 of 73 cases versus 11 for PCR [7]. Perhaps more interesting is the concept of screening human disease genomes against pathogen datasets. In fact, it was this exact strategy that led to the discovery of Merkel cell polyomavirus in 2008 [18]. It may be that a subset of other cancers has an as yet undiscovered viral etiology.

ISRN Oncology
This study represents a pilot effort to gain experience with this exciting new technology, which was instructive as our group moves forward with large-scale projects. In addition to the small sample size, the study was limited by the quality of the data generated on the ABI SOLiD platform: an average of 30-fold coverage with 50 base pair paired-end reads yielded only 10-fold coverage over approximately two-thirds of the coding sequence. Thus, approximately a third of the exome was not adequately evaluated, and important mutations could have been missed. We have recently completed characterizing a panel of head and neck cancer cell lines with 100-fold coverage with 100 base pair paired-end reads, and the results were vastly superior [19]. An average of 99% of the targeted exome had at least 10 reads, and 90% had 50-fold coverage. Perhaps more importantly, an expert bioinformatics team is critical to properly analyze the data. Although there are standard steps involved with aligning the sequencing data to the reference genome, false positive results can be frequent without adequate quality control measures. A carefully validated pipeline is necessary to filter spurious results in order to generate valid data.
It should be noted that tremendous insights can be gained by exome sequencing; however, whole genome sequencing offers the advantage of identifying other genetic changes that can lead to tumorigenesis, including copy number variation and translocations, in addition to point mutations, insertions, and deletions. Alterations in noncoding regions that may be important, such as promoters and miRNAs, would also be identified. The study by Stransky et al. reported whole genome sequencing in two patients and revealed markedly more translocations in the HPV-negative versus the HPV-positive tumors [7]. Ideally, future large-scale initiatives will be carried out using this more extensive but also more expensive technique to identify additional important genetic changes underlying HNSCC.

Conclusions
Whole exome sequencing of head and neck cancers can provide important insights into the molecular biology of the disease. HPV-positive and HPV-negative head and neck cancers are molecularly distinct, and HPV-negative cancers tend to harbor more mutations. Multiple integrated HPV 16 sequences were identified in the exome targets from the HPV-positive patient. These matches were restricted to the HPV-positive patient's tumor profile, providing evidence of the utility of screening exome sequences against pathogen databases.

Disclosure
None of the authors have financial relationships with any of the commercial entities mentioned in this paper.

Authors' Contribution
Anthony C. Nichols, Michelle Chan-Seng-Yue, John Yoo, Paul Boutros, and John W. Barrett all contributed equally to this work.