Next-generation sequencing in the clinical genetic screening of patients with pheochromocytoma and paraganglioma

Background Recent findings have shown that up to 60% of pheochromocytomas (PCCs) and paragangliomas (PGLs) are caused by germline or somatic mutations in one of the 11 hitherto known susceptibility genes: SDHA, SDHB, SDHC, SDHD, SDHAF2, VHL, HIF2A (EPAS1), RET, NF1, TMEM127 and MAX. This list of genes is constantly growing, and the 11 genes together comprise 144 exons. A comprehensive genetic screening test is therefore time consuming and expensive. Hence, we introduce next-generation sequencing (NGS) as a time-efficient and cost-effective alternative. Methods Tumour lesions from three patients with apparently sporadic PCC were subjected to whole exome sequencing using the Agilent SureSelect target enrichment system and the Illumina HiSeq platform. Bioinformatics analysis was performed in-house using commercially available software. Variants in PCC and PGL susceptibility genes were identified. Results We identified 16 unique genetic variants in PCC susceptibility loci in three different PCCs, with less than 30 min of hands-on, in-house time. Two patients each had one unique variant that was classified as probably or possibly pathogenic: NF1 Arg304Ter and RET Tyr791Phe, respectively. The RET variant was verified by Sanger sequencing. Conclusions NGS can serve as a fast and cost-effective method in the clinical genetic screening of PCC. The bioinformatics analysis may be performed without expert skills. We identified process optimization, characterization of unknown variants and determination of additive effects of multiple variants as key issues to be addressed by future studies.


Introduction
Pheochromocytomas (PCCs) and paragangliomas (PGLs) are rare tumours arising from chromaffin cells in the adrenal medulla and autonomic ganglia. The majority of these tumours have a low proliferation rate and seldom metastasize. The understanding of the molecular mechanisms underlying tumorigenesis in these diseases has increased dramatically during the last decade (1). Up to 60% of all PCC and PGL could have either germline or somatic mutations (2,3,4) in one of the 11 hitherto known susceptibility genes: SDHA, SDHB, SDHC, SDHD, SDHAF2, VHL, HIF2A (EPAS1), RET, NF1, TMEM127 and MAX (5,6,7,8,9,10,11). While there has been a constant flow of newly reported susceptibility loci, the capacity of instruments approved for diagnostic use has failed to keep up with the increasing demand. These 11 genes comprise 144 exons (~25 000 bases); consequently, a comprehensive PCC and PGL genetic screening test can be time consuming and is not regarded as cost effective (12). This has motivated the design of numerous screening algorithms to guide investigators in the selection of appropriate patients and tests (12,13). Sparse use of clinical genetic screening in patients with PCC and PGL, despite the introduction of such guidelines, has mainly been justified on cost-benefit grounds.
The introduction of novel sequencing techniques (denoted next-generation sequencing or NGS) has dramatically reduced the cost of DNA sequencing (14). The term NGS covers fundamentally different sequencing platforms that share a high output of sequenced bases relative to traditional methods. Recently, the focus of experiments using NGS has shifted from research settings to the use of NGS as a platform in clinical scenarios (15,16,17,18).
The NGS process is highly complex with multiple steps that may be divided into genomic enrichment (selected, all exons as in exome or none as in whole genome sequencing), sequencing (including library preparation), bioinformatics analysis and, in the clinical setting, genetic consultation (19).
Given the well-characterized genotype-phenotype correlations in PCC and PGL and the limitations imposed by existing technologies, there is a strong argument for investigating the potential use of NGS as a diagnostic test in the clinical genetic screening of these tumours.

Patients
Tumour tissues from three patients with PCC were selected for whole exome sequencing. Patient characteristics are summarized in Table 1. All three patients had a secretory unilateral PCC and no apparent signs, symptoms or history suggesting pathogenic germline variants in known susceptibility genes. The local ethics committee approved the study and written informed consent was obtained from all patients.

Exome capture and high-throughput sequencing
All samples were macro-dissected to achieve a neoplastic cellularity of >80%. DNA was prepared from cryosections using Genomic-tip 20/G (cat. no. 10223, Qiagen). Sequencing libraries were prepared from 3 µg gDNA using the SureSelect target enrichment system for Illumina paired-end sequencing libraries v2.2, October 2010 (Agilent Technologies, Santa Clara, CA, USA), according to the manufacturer's instructions. Briefly, the DNA was fragmented using the Covaris S2 system (Covaris, Woburn, MA, USA). The DNA fragments were end-repaired using T4 DNA polymerase, Klenow DNA polymerase and T4 polynucleotide kinase (PNK), followed by purification using AMPure XP beads (Beckman Coulter, Brea, CA, USA). An A-base was added to the blunt ends of the DNA fragments using Klenow DNA polymerase and the sample was purified using AMPure XP beads. Adapters for sequencing were ligated to the DNA fragments, followed by purification using AMPure XP beads. The adapter-ligated libraries were amplified for five PCR cycles, followed by a second purification using AMPure XP beads. The quality of the enriched libraries was evaluated using the 2100 Bioanalyzer and a DNA 1000 kit (Agilent). Exon capture was performed from 500 ng of each sequencing library using the SureSelect Human All Exon 50 Mb kit (Agilent). Briefly, the fragments in the library were hybridized to capture probes, unhybridized material was washed away and the captured fragments were amplified for ten PCR cycles, followed by purification using AMPure XP beads. The quality of the enriched libraries was evaluated using the 2100 Bioanalyzer and a

Bioinformatics
Sequencing generated a minimum of 125 × 10^6 reads in all three tumours, with an average read length of 100 bases (Table 2). Generated sequences were processed using commercially available software: CLC Genomics Workbench 4.9 (CLC Bio, Aarhus, Denmark). Paired-end reads were trimmed for low quality, and duplicate reads were removed (Fig. 1). Remaining sequences were mapped to the human reference sequence GRCh37.p5. A single-nucleotide variant (SNV) and insertion/deletion detection algorithm was used with low- and high-stringency settings: low stringency, coverage of >8 reads and a variant allele frequency of >25%; and high stringency, coverage of >30 reads and a variant allele frequency of >35%. Generated results were filtered for non-synonymous variants and/or variants with a probable splice site effect. The list was annotated with gene annotations and then filtered for variants in one of the 11 currently known PCC susceptibility genes. The remaining variants were annotated with overlapping information from selected genetic databases: the Single Nucleotide Polymorphism Database (dbSNP), the Catalogue of Somatic Mutations in Cancer (COSMIC), the Human Gene Mutation Database (HGMD) and the Leiden Open source Variation Databases (LOVD). The impact of non-synonymous amino acid substitutions was assessed in silico using Polyphen2 (20) and SIFT (21). Cross-references were manually gathered when available. Analysis of structural variants in data generated by exome sequencing was not adequately supported by the software and was excluded from this experiment.
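
The two stringency settings and the subsequent filters can be sketched in a few lines of Python. This is an illustrative sketch only, not the CLC Genomics Workbench workflow used in the study: the `Variant` record, the effect labels, the function names and all read counts are hypothetical, and only the thresholds and the gene list are taken from the text.

```python
from dataclasses import dataclass

# The 11 susceptibility genes named in the text (HIF2A is listed as EPAS1).
SUSCEPTIBILITY_GENES = {
    "SDHA", "SDHB", "SDHC", "SDHD", "SDHAF2", "VHL",
    "EPAS1", "RET", "NF1", "TMEM127", "MAX",
}

# (coverage threshold, allele-frequency threshold); both must be exceeded.
STRINGENCY = {"low": (8, 0.25), "high": (30, 0.35)}

# Hypothetical effect labels kept by the non-synonymous/splice-site filter.
KEPT_EFFECTS = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass(frozen=True)
class Variant:
    gene: str
    label: str
    coverage: int        # reads covering the position
    variant_reads: int   # reads supporting the variant allele
    effect: str

    @property
    def allele_frequency(self) -> float:
        # Guard against positions without any coverage.
        return self.variant_reads / self.coverage if self.coverage else 0.0


def passes_stringency(variant: Variant, level: str) -> bool:
    min_coverage, min_frequency = STRINGENCY[level]
    return (variant.coverage > min_coverage
            and variant.allele_frequency > min_frequency)


def screen(variants, level):
    """Apply the stringency, effect and gene filters in sequence."""
    return [
        v for v in variants
        if passes_stringency(v, level)
        and v.effect in KEPT_EFFECTS
        and v.gene in SUSCEPTIBILITY_GENES
    ]


# Invented calls for illustration; these are not data from the study.
calls = [
    Variant("RET", "example_1", coverage=40, variant_reads=18, effect="missense"),
    Variant("NF1", "example_2", coverage=12, variant_reads=4, effect="nonsense"),
    Variant("VHL", "example_3", coverage=50, variant_reads=25, effect="synonymous"),
    Variant("TP53", "example_4", coverage=60, variant_reads=30, effect="missense"),
]

low_hits = [v.label for v in screen(calls, "low")]
high_hits = [v.label for v in screen(calls, "high")]
print(low_hits, high_hits)
```

In this invented example, the low-coverage call passes only the low-stringency setting, while the synonymous call and the call outside the gene list are dropped at both settings.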

Sanger sequencing
DNA was prepared from peripheral blood and tumour cryosections using the DNeasy Blood and Tissue Kit (Qiagen). To serve as controls and to verify variants discovered by NGS, fragments corresponding to all exons and intron-exon junctions of the major susceptibility genes SDHB, SDHC, VHL, MAX and RET (exons 10, 11 and 13-16), as well as selected fragments in NF1 (exon 9), were amplified by PCR and sequenced using automated Sanger sequencing (Beckman Coulter, Takeley, UK). Primer sequences and PCR conditions are available on request.

Patient 1: SDHC variant of uncertain clinical significance
A 61-year-old woman was investigated due to therapy-resistant hypertension of unknown aetiology. The urine noradrenaline level was elevated. The patient underwent a laparoscopic left-sided adrenalectomy and the pathology report described a benign PCC, 25 × 20 mm in size with a weight of 4.5 g. Immunohistochemistry demonstrated expression of chromogranin A and a Ki67 index of 1%. Exome sequencing revealed seven SNVs; one was classified as benign and six as unknown. There was one missense variant in SDHC located at position 477C>T, resulting in the amino acid substitution Pro110Ser. This variant was not found in the HGMD, dbSNP, COSMIC or LOVD databases, nor could it be found in a PubMed search. In silico analysis using Polyphen2 and SIFT estimated SDHC Pro110Ser as benign (score 0.231) and tolerated (score 0.93), respectively.

Patient 2: RET variant of uncertain clinical significance
A 27-year-old woman was investigated post partum due to therapy-resistant hypertension during the second and third trimesters. The patient had elevated urine noradrenaline and adrenaline levels. She underwent a laparoscopic right-sided adrenalectomy and the pathology report described a PCC, 50 × 50 mm in size with a weight of 54 g. Immunohistochemistry showed strong staining for chromogranin A and a Ki67 index of <0.5%. Exome sequencing revealed 13 SNVs; three were classified as benign and nine as unknown. One missense variant was assessed as possibly pathogenic, located at position 2372A>T (rs77724903), resulting in the amino acid substitution Tyr791Phe, in the proto-oncogene tyrosine-protein kinase receptor (RET) gene (Fig. 3). The pathogenicity of RET Tyr791Phe is disputed (22,23,24).

Patient 3: NF1 variant
A 65-year-old woman with a two-decade history of hypertension and newly diagnosed adenocarcinoma of the breast was investigated due to abdominal discomfort. Computed tomography of the abdomen showed a lesion in the left adrenal gland and subsequent urine collection revealed high levels of noradrenaline. The patient underwent a left-sided adrenalectomy and the pathology report described a cystic PCC, 60 × 50 mm in size with a weight of 59 g. The immunoreactivity for chromogranin A was strong and the Ki67 index was <1%. Exome sequencing revealed ten SNVs; nine were classified as unknown. One variant was assessed as probably pathogenic: a nonsense variant located at position 910C>T (rs76015786), resulting in the premature stop codon Arg304Ter, in the neurofibromin (NF1) gene. The phenotype of Arg304Ter has been described in related tumours and we assessed the variant as probably pathogenic (25,26,27). However, this variant could not be confirmed by Sanger sequencing.
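
The codon numbers reported for patients 2 and 3 can be checked against the nucleotide positions with simple arithmetic. The sketch below assumes that the positions are numbered from the first base of the start codon in the coding sequence and that the standard genetic code applies; the function names are illustrative and are not part of any tool used in the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_position(cds_position: int) -> tuple[int, int]:
    """Return (codon number, base within codon), both 1-based."""
    if cds_position < 1:
        raise ValueError("coding-sequence positions start at 1")
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


def codons_consistent_with(ref_aa, alt_aa, offset, ref_base, alt_base):
    """List reference codons for which the substitution gives alt_aa."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa or codon[offset - 1] != ref_base:
            continue
        mutated = codon[:offset - 1] + alt_base + codon[offset:]
        if CODON_TABLE[mutated] == alt_aa:
            hits.append(codon)
    return hits


# NF1 910C>T reported as Arg304Ter ("*" denotes a stop codon).
nf1_codon, nf1_offset = codon_position(910)
nf1_codons = codons_consistent_with("R", "*", nf1_offset, "C", "T")

# RET 2372A>T reported as Tyr791Phe.
ret_codon, ret_offset = codon_position(2372)
ret_codons = codons_consistent_with("Y", "F", ret_offset, "A", "T")

print(nf1_codon, nf1_offset, nf1_codons)
print(ret_codon, ret_offset, ret_codons)
```

Under these assumptions, position 910 is the first base of codon 304 and position 2372 is the second base of codon 791, and CGA is the only arginine codon that a C>T change at its first base turns into a stop codon.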

Discussion
Genetic screening of PCC and PGL has been found to be beneficial in practising centres (28). Utilizing novel sequencing techniques has the potential to decrease costs and time consumption, thereby lowering the threshold for inclusion.
The finding of the clinically relevant allele RET Tyr791Phe exemplified the potential of NGS as a diagnostic tool, while SDHC Pro110Ser illustrated the complexity of possibly pathogenic, but previously unknown, variants. NF1 Arg304Ter displayed potential methodological conflicts; however, conflicting results from certified clinical genetic laboratory testing using Sanger sequencing have also been reported (29).

Price
A direct cost comparison between whole exome sequencing and traditional methods is complicated by the variability in how genetic screening is currently performed. The total cost for analysing the most frequently mutated genes (SDHB, SDHD, VHL and RET) is estimated to be 3500 USD (12,30) and if screening all ten susceptibility genes, we estimate the cost to be 10 000 USD. The use of genetic screening algorithms may reduce costs but can be time consuming, and such algorithms are designed for scenarios in which patient characteristics clearly indicate specific loci (31). The costs of exome enrichment and sequencing in this study were considerably lower than those of traditional screening, and as the techniques develop fast, further cost reductions are expected (Hayden EC, The $1000 genome: are we there yet?, 2012, NATURE NEWS BLOG).

Performance
Raw sequences generated by NGS require computational processing, mapping reads to a reference sequence and calling variants between the two. Results generated by NGS should be confirmed with a fundamentally different sequencing chemistry. The bioinformatics process should deliver a defined list of variants. Stochastic false positives occur at relatively high frequencies but may be filtered out, provided that the position is covered by an adequate sequence depth (about 30-fold). False negatives are more insidious and may be caused by incomplete enrichment, uneven sequencing coverage or faulty bioinformatics processing (17). Additionally, a high sequence depth allows NGS to detect alleles at frequencies below the detection threshold of Sanger sequencing. These characteristics predict built-in conflicts in which NGS may generate probably pathogenic variants that cannot be validated by Sanger sequencing (e.g. patient 3). Other validation methods (e.g. pyrosequencing) may detect alleles at a lower frequency but at a higher cost (32). A situation with multiple unknown variants had been expected and was confirmed by this study (1). Evaluating the significance of such 'genetic incidentalomas' may be highly resource demanding and clearly demonstrates the need to further expand and curate allele databases such as dbSNP and LOVD. Time constraints in a clinical setting are also a challenge. A diagnostic test must have a turnaround time measurable in weeks. In theory, the NGS process can be tuned to deliver results within 1 week (33). With a pre-defined bioinformatics assay, the necessary computational analysis for our experiments was completed in <24 h, including a total in-house hands-on time of <30 min.
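
How sequence depth affects the chance of calling a variant can be illustrated with a simple binomial model. This is a simplified sketch and not the variant caller used in the study: it assumes that each read samples the variant allele independently with a probability equal to the true allele fraction, and it ignores sequencing errors and capture bias. The function names are illustrative; the 35% threshold is the high-stringency value from the Bioinformatics section.

```python
from math import comb


def min_variant_reads(depth: int, min_allele_frequency: float) -> int:
    """Smallest read count whose frequency strictly exceeds the threshold."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    for reads in range(depth + 1):
        if reads / depth > min_allele_frequency:
            return reads
    raise ValueError("threshold cannot be exceeded at this depth")


def detection_probability(depth: int, true_fraction: float,
                          min_allele_frequency: float) -> float:
    """P(observed variant reads reach the calling threshold), binomial model."""
    needed = min_variant_reads(depth, min_allele_frequency)
    return sum(
        comb(depth, k)
        * true_fraction ** k
        * (1 - true_fraction) ** (depth - k)
        for k in range(needed, depth + 1)
    )


# Heterozygous variant (true fraction 0.5) at increasing depths.
for depth in (10, 30, 100):
    probability = detection_probability(depth, 0.5, 0.35)
    print(f"depth {depth}: P(call) = {probability:.3f}")
```

Under this model, a fixed allele-frequency threshold is exceeded more reliably as depth grows, which is one way to read the recommendation of about 30-fold coverage.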

Exome vs targeted enrichment
Sequencing of tumour tissue with complete exome coverage differs from the current diagnostic procedure, in which a limited number of loci in germline DNA are analysed. The theoretical potential is to provide improved prognostic and/or predictive information to individualize the care of the patient (34,35). Managing the surplus of genetic information that does not involve genes associated with the specific disease or its treatment is problematic (36). Ethical and financial frameworks regarding the rights and responsibilities of patients and providers need to be implemented (16). While the concept of personalized medicine based on whole genome or exome coverage needs to mature, there are immediate benefits of NGS in clinical situations such as in PCC and PGL patients. Examining the available sequencing apparatus and the upcoming pipeline, applications classified as medium capacity are closest to fulfilling the optimal specification of requirements for this situation: low costs, fast throughput, high accuracy and a capacity matching the size of the loci conferring susceptibility to PCC and PGL (35,37).

Figure 3 Screenshot of sequences as displayed in CLC Genomics Workbench 4.9 for patient 2 (RET Tyr791Phe) and patient 3 (NF1 Arg304*). From above: reference sequence, consensus sequence and mapped tumour reads (blue colour, intact read pairs; green colour, broken forward read; red colour, broken reverse read). Below: chromatograms of the corresponding sequences generated by Sanger sequencing.

Limitations of this study
Exome enrichment resulted in a coverage of above ten reads for more than 90% of targeted regions. However, detailed coverage analysis (Fig. 2) revealed PCC loci lacking 10× coverage (the VHL gene had 10× coverage at only ~50% of bases). Use of exome enrichment prevents analysis of structural variants (38), thus limiting the comparison of NGS results with current standards (Multiplex Ligation-dependent Probe Amplification).
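
The per-gene coverage figure quoted above can be computed from per-base depths as in the minimal sketch below. The depth values and gene labels are invented for illustration and are not data from this study, and treating exactly ten reads as covered is an assumption about how the 10× cut-off is applied.

```python
def fraction_covered(depths, min_depth=10):
    """Fraction of bases whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("no positions supplied")
    return sum(depth >= min_depth for depth in depths) / len(depths)


# Invented per-base depths for two hypothetical target regions.
per_base_depth = {
    "GENE_A": [35, 42, 18, 11, 27, 30, 12, 10],
    "GENE_B": [3, 14, 0, 22, 9, 10, 5, 31],
}

# Report each region's coverage and flag those below a 90% target.
for gene, depths in per_base_depth.items():
    covered = fraction_covered(depths)
    flag = "" if covered >= 0.9 else "  <- below 90% target"
    print(f"{gene}: {covered:.0%} of bases at >=10x{flag}")
```

A per-gene report of this kind makes under-covered loci visible before variant calling, rather than relying on the exome-wide average.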
As tumour tissue was sequenced without matched constitutional DNA, the bioinformatics process could not classify variants as somatic or constitutional. Therefore, future studies should include multiple cases with matched tumour and normal tissues from patients with characterized pathogenic variants. The method for target enrichment should be selected with regard to expected coverage at PCC and PGL disease-causing loci.
Sanger sequencing as a validation method for NGS results has been replaced by other, more sensitive methods (39); the finding of NF1 Arg304Ter by NGS, but not by Sanger sequencing, is an example of discordance between these two methods.

Conclusion
We conclude that utilizing NGS may serve as a fast and cost-effective method in the clinical genetic screening of patients with PCC and PGL. In order to facilitate the introduction of NGS as a diagnostic application, we identified process optimization, characterization of unknown variants and determination of additive effects of multiple variants as key issues to be addressed by future studies.

Supplementary data
This is linked to the online version of the paper at http://dx.doi.org/10.1530/EC-13-0009.

Declaration of interest
The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.

Funding
This study was supported by grants from the Swedish Cancer Society, the Swedish Society for Medical Research, the Åke Wiberg Foundation, the Selander Foundation, the Jeanssons Foundation and Lions Club Uppsala. P Björklund is a Swedish Cancer Society Investigator.