Exploration server on relations between France and Australia

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing

Internal identifier: 000748 (Pmc/Corpus); previous: 000747; next: 000749

Authors: Tyler S. Alioto ; Ivo Buchhalter ; Sophia Derdak ; Barbara Hutter ; Matthew D. Eldridge ; Eivind Hovig ; Lawrence E. Heisler ; Timothy A. Beck ; Jared T. Simpson ; Laurie Tonon ; Anne-Sophie Sertier ; Ann-Marie Patch ; Natalie Jäger ; Philip Ginsbach ; Ruben Drews ; Nagarajan Paramasivam ; Rolf Kabbe ; Sasithorn Chotewutmontri ; Nicolle Diessl ; Christopher Previti ; Sabine Schmidt ; Benedikt Brors ; Lars Feuerbach ; Michael Heinold ; Susanne Gröbner ; Andrey Korshunov ; Patrick S. Tarpey ; Adam P. Butler ; Jonathan Hinton ; David Jones ; Andrew Menzies ; Keiran Raine ; Rebecca Shepherd ; Lucy Stebbings ; Jon W. Teague ; Paolo Ribeca ; Francesc Castro Giner ; Sergi Beltran ; Emanuele Raineri ; Marc Dabad ; Simon C. Heath ; Marta Gut ; Robert E. Denroche ; Nicholas J. Harding ; Takafumi N. Yamaguchi ; Akihiro Fujimoto ; Hidewaki Nakagawa ; Víctor Quesada ; Rafael Valdés-Mas ; Sigve Nakken ; Daniel Vodák ; Lawrence Bower ; Andrew G. Lynch ; Charlotte L. Anderson ; Nicola Waddell ; John V. Pearson ; Sean M. Grimmond ; Myron Peto ; Paul Spellman ; Minghui He ; Cyriac Kandoth ; Semin Lee ; John Zhang ; Louis Létourneau ; Singer Ma ; Sahil Seth ; David Torrents ; Liu Xi ; David A. Wheeler ; Carlos López-Otín ; Elías Campo ; Peter J. Campbell ; Paul C. Boutros ; Xose S. Puente ; Daniela S. Gerhard ; Stefan M. Pfister ; John D. McPherson ; Thomas J. Hudson ; Matthias Schlesner ; Peter Lichter ; Roland Eils ; David T. W. Jones ; Ivo G. Gut

RBID : PMC:4682041

Abstract

As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here, using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.
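
The abstract speaks of "concordance among analysis pipelines" and of accuracy measured against a benchmark mutation set. As a rough illustration only, and not the authors' evaluation code, the Python sketch below shows one common way such figures could be computed: precision, recall and F1 of a call set against a benchmark set, and Jaccard overlap between two call sets. The variant tuples, the function names `score_calls` and `jaccard`, and the toy data are all hypothetical.

```python
from typing import Dict, Set, Tuple

# A variant is keyed here as (chromosome, position, ref allele, alt allele).
# This keying is an assumption made for the sketch.
Variant = Tuple[str, int, str, str]


def score_calls(calls: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Precision, recall and F1 of a call set against a benchmark set."""
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def jaccard(a: Set[Variant], b: Set[Variant]) -> float:
    """Jaccard index: shared calls divided by all calls made by either set."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy, made-up data for illustration only.
truth = {("1", 100, "A", "T"), ("1", 200, "G", "C"),
         ("2", 300, "C", "T"), ("3", 400, "T", "G")}
pipeline_a = {("1", 100, "A", "T"), ("1", 200, "G", "C"), ("5", 999, "A", "G")}
pipeline_b = {("1", 100, "A", "T"), ("2", 300, "C", "T")}

scores_a = score_calls(pipeline_a, truth)
overlap_ab = jaccard(pipeline_a, pipeline_b)
print(scores_a, overlap_ab)
```

In this toy case pipeline A recovers two of the four benchmark variants and makes one call that is absent from the benchmark, while the two pipelines share only one call; real comparisons would also need to normalise variant representation before set operations.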


DOI: 10.1038/ncomms10001
PubMed: 26647970
PubMed Central: 4682041

Links to Exploration step

PMC:4682041

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing</title>
<author>
<name sortKey="Alioto, Tyler S" sort="Alioto, Tyler S" uniqKey="Alioto T" first="Tyler S." last="Alioto">Tyler S. Alioto</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Buchhalter, Ivo" sort="Buchhalter, Ivo" uniqKey="Buchhalter I" first="Ivo" last="Buchhalter">Ivo Buchhalter</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Derdak, Sophia" sort="Derdak, Sophia" uniqKey="Derdak S" first="Sophia" last="Derdak">Sophia Derdak</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hutter, Barbara" sort="Hutter, Barbara" uniqKey="Hutter B" first="Barbara" last="Hutter">Barbara Hutter</name>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Eldridge, Matthew D" sort="Eldridge, Matthew D" uniqKey="Eldridge M" first="Matthew D." last="Eldridge">Matthew D. Eldridge</name>
<affiliation>
<nlm:aff id="a5">
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hovig, Eivind" sort="Hovig, Eivind" uniqKey="Hovig E" first="Eivind" last="Hovig">Eivind Hovig</name>
<affiliation>
<nlm:aff id="a6">
<institution>Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital</institution>
, 0424 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a7">
<institution>Department of Informatics, University of Oslo</institution>
, 0373 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Heisler, Lawrence E" sort="Heisler, Lawrence E" uniqKey="Heisler L" first="Lawrence E." last="Heisler">Lawrence E. Heisler</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Beck, Timothy A" sort="Beck, Timothy A" uniqKey="Beck T" first="Timothy A." last="Beck">Timothy A. Beck</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Simpson, Jared T" sort="Simpson, Jared T" uniqKey="Simpson J" first="Jared T." last="Simpson">Jared T. Simpson</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tonon, Laurie" sort="Tonon, Laurie" uniqKey="Tonon L" first="Laurie" last="Tonon">Laurie Tonon</name>
<affiliation>
<nlm:aff id="a9">
<institution>Synergie Lyon Cancer Foundation, Centre Léon Bérard, Cheney C</institution>
, 28 rue Laennec, Lyon 69373,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sertier, Anne Sophie" sort="Sertier, Anne Sophie" uniqKey="Sertier A" first="Anne-Sophie" last="Sertier">Anne-Sophie Sertier</name>
<affiliation>
<nlm:aff id="a9">
<institution>Synergie Lyon Cancer Foundation, Centre Léon Bérard, Cheney C</institution>
, 28 rue Laennec, Lyon 69373,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Patch, Ann Marie" sort="Patch, Ann Marie" uniqKey="Patch A" first="Ann-Marie" last="Patch">Ann-Marie Patch</name>
<affiliation>
<nlm:aff id="a10">
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a11">
<institution>QIMR Berghofer Medical Research Institute</institution>
, Brisbane, Queensland 4006,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="J Ger, Natalie" sort="J Ger, Natalie" uniqKey="J Ger N" first="Natalie" last="Jäger">Natalie Jäger</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a12">
<institution>Department of Genetics, Stanford University</institution>
, Mail Stop-5120, Stanford, California 94305-5120,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ginsbach, Philip" sort="Ginsbach, Philip" uniqKey="Ginsbach P" first="Philip" last="Ginsbach">Philip Ginsbach</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Drews, Ruben" sort="Drews, Ruben" uniqKey="Drews R" first="Ruben" last="Drews">Ruben Drews</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Paramasivam, Nagarajan" sort="Paramasivam, Nagarajan" uniqKey="Paramasivam N" first="Nagarajan" last="Paramasivam">Nagarajan Paramasivam</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kabbe, Rolf" sort="Kabbe, Rolf" uniqKey="Kabbe R" first="Rolf" last="Kabbe">Rolf Kabbe</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chotewutmontri, Sasithorn" sort="Chotewutmontri, Sasithorn" uniqKey="Chotewutmontri S" first="Sasithorn" last="Chotewutmontri">Sasithorn Chotewutmontri</name>
<affiliation>
<nlm:aff id="a13">
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg, 69120
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Diessl, Nicolle" sort="Diessl, Nicolle" uniqKey="Diessl N" first="Nicolle" last="Diessl">Nicolle Diessl</name>
<affiliation>
<nlm:aff id="a13">
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg, 69120
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Previti, Christopher" sort="Previti, Christopher" uniqKey="Previti C" first="Christopher" last="Previti">Christopher Previti</name>
<affiliation>
<nlm:aff id="a13">
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg, 69120
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Schmidt, Sabine" sort="Schmidt, Sabine" uniqKey="Schmidt S" first="Sabine" last="Schmidt">Sabine Schmidt</name>
<affiliation>
<nlm:aff id="a13">
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg, 69120
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Brors, Benedikt" sort="Brors, Benedikt" uniqKey="Brors B" first="Benedikt" last="Brors">Benedikt Brors</name>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Feuerbach, Lars" sort="Feuerbach, Lars" uniqKey="Feuerbach L" first="Lars" last="Feuerbach">Lars Feuerbach</name>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Heinold, Michael" sort="Heinold, Michael" uniqKey="Heinold M" first="Michael" last="Heinold">Michael Heinold</name>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Grobner, Susanne" sort="Grobner, Susanne" uniqKey="Grobner S" first="Susanne" last="Gröbner">Susanne Gröbner</name>
<affiliation>
<nlm:aff id="a14">
<institution>Department of Pediatric Hematology and Oncology, Heidelberg University Hospital</institution>
, Im Neuenheimer Feld 430, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Korshunov, Andrey" sort="Korshunov, Andrey" uniqKey="Korshunov A" first="Andrey" last="Korshunov">Andrey Korshunov</name>
<affiliation>
<nlm:aff id="a15">
<institution>Department of Neuropathology, Heidelberg University Hospital</institution>
, Im Neuenheimer Feld 224, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tarpey, Patrick S" sort="Tarpey, Patrick S" uniqKey="Tarpey P" first="Patrick S." last="Tarpey">Patrick S. Tarpey</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Butler, Adam P" sort="Butler, Adam P" uniqKey="Butler A" first="Adam P." last="Butler">Adam P. Butler</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hinton, Jonathan" sort="Hinton, Jonathan" uniqKey="Hinton J" first="Jonathan" last="Hinton">Jonathan Hinton</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Jones, David" sort="Jones, David" uniqKey="Jones D" first="David" last="Jones">David Jones</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Menzies, Andrew" sort="Menzies, Andrew" uniqKey="Menzies A" first="Andrew" last="Menzies">Andrew Menzies</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Raine, Keiran" sort="Raine, Keiran" uniqKey="Raine K" first="Keiran" last="Raine">Keiran Raine</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Shepherd, Rebecca" sort="Shepherd, Rebecca" uniqKey="Shepherd R" first="Rebecca" last="Shepherd">Rebecca Shepherd</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Stebbings, Lucy" sort="Stebbings, Lucy" uniqKey="Stebbings L" first="Lucy" last="Stebbings">Lucy Stebbings</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Teague, Jon W" sort="Teague, Jon W" uniqKey="Teague J" first="Jon W." last="Teague">Jon W. Teague</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ribeca, Paolo" sort="Ribeca, Paolo" uniqKey="Ribeca P" first="Paolo" last="Ribeca">Paolo Ribeca</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Giner, Francesc Castro" sort="Giner, Francesc Castro" uniqKey="Giner F" first="Francesc Castro" last="Giner">Francesc Castro Giner</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Beltran, Sergi" sort="Beltran, Sergi" uniqKey="Beltran S" first="Sergi" last="Beltran">Sergi Beltran</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Raineri, Emanuele" sort="Raineri, Emanuele" uniqKey="Raineri E" first="Emanuele" last="Raineri">Emanuele Raineri</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Dabad, Marc" sort="Dabad, Marc" uniqKey="Dabad M" first="Marc" last="Dabad">Marc Dabad</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Heath, Simon C" sort="Heath, Simon C" uniqKey="Heath S" first="Simon C." last="Heath">Simon C. Heath</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Gut, Marta" sort="Gut, Marta" uniqKey="Gut M" first="Marta" last="Gut">Marta Gut</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Denroche, Robert E" sort="Denroche, Robert E" uniqKey="Denroche R" first="Robert E." last="Denroche">Robert E. Denroche</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Harding, Nicholas J" sort="Harding, Nicholas J" uniqKey="Harding N" first="Nicholas J." last="Harding">Nicholas J. Harding</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Yamaguchi, Takafumi N" sort="Yamaguchi, Takafumi N" uniqKey="Yamaguchi T" first="Takafumi N." last="Yamaguchi">Takafumi N. Yamaguchi</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fujimoto, Akihiro" sort="Fujimoto, Akihiro" uniqKey="Fujimoto A" first="Akihiro" last="Fujimoto">Akihiro Fujimoto</name>
<affiliation>
<nlm:aff id="a17">
<institution>RIKEN Center for Integrative Medical Sciences</institution>
, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639,
<country>Japan</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Nakagawa, Hidewaki" sort="Nakagawa, Hidewaki" uniqKey="Nakagawa H" first="Hidewaki" last="Nakagawa">Hidewaki Nakagawa</name>
<affiliation>
<nlm:aff id="a17">
<institution>RIKEN Center for Integrative Medical Sciences</institution>
, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639,
<country>Japan</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Quesada, Victor" sort="Quesada, Victor" uniqKey="Quesada V" first="Víctor" last="Quesada">Víctor Quesada</name>
<affiliation>
<nlm:aff id="a18">
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Valdes Mas, Rafael" sort="Valdes Mas, Rafael" uniqKey="Valdes Mas R" first="Rafael" last="Valdés-Mas">Rafael Valdés-Mas</name>
<affiliation>
<nlm:aff id="a18">
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Nakken, Sigve" sort="Nakken, Sigve" uniqKey="Nakken S" first="Sigve" last="Nakken">Sigve Nakken</name>
<affiliation>
<nlm:aff id="a6">
<institution>Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital</institution>
, 0424 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vodak, Daniel" sort="Vodak, Daniel" uniqKey="Vodak D" first="Daniel" last="Vodák">Daniel Vodák</name>
<affiliation>
<nlm:aff id="a6">
<institution>Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital</institution>
, 0424 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a19">
<institution>The Bioinformatics Core Facility, Institute for Cancer Genetics and Informatics, Oslo University Hospital</institution>
, 0310 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bower, Lawrence" sort="Bower, Lawrence" uniqKey="Bower L" first="Lawrence" last="Bower">Lawrence Bower</name>
<affiliation>
<nlm:aff id="a5">
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lynch, Andrew G" sort="Lynch, Andrew G" uniqKey="Lynch A" first="Andrew G." last="Lynch">Andrew G. Lynch</name>
<affiliation>
<nlm:aff id="a5">
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Anderson, Charlotte L" sort="Anderson, Charlotte L" uniqKey="Anderson C" first="Charlotte L." last="Anderson">Charlotte L. Anderson</name>
<affiliation>
<nlm:aff id="a5">
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a20">
<institution>Victorian Life Sciences Computation Initiative, The University of Melbourne</institution>
, Melbourne, Victoria 3053,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Waddell, Nicola" sort="Waddell, Nicola" uniqKey="Waddell N" first="Nicola" last="Waddell">Nicola Waddell</name>
<affiliation>
<nlm:aff id="a10">
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a11">
<institution>QIMR Berghofer Medical Research Institute</institution>
, Brisbane, Queensland 4006,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Pearson, John V" sort="Pearson, John V" uniqKey="Pearson J" first="John V." last="Pearson">John V. Pearson</name>
<affiliation>
<nlm:aff id="a10">
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a11">
<institution>QIMR Berghofer Medical Research Institute</institution>
, Brisbane, Queensland 4006,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Grimmond, Sean M" sort="Grimmond, Sean M" uniqKey="Grimmond S" first="Sean M." last="Grimmond">Sean M. Grimmond</name>
<affiliation>
<nlm:aff id="a10">
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a21">
<institution>Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow</institution>
, Glasgow, Scotland G61 1QH,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Peto, Myron" sort="Peto, Myron" uniqKey="Peto M" first="Myron" last="Peto">Myron Peto</name>
<affiliation>
<nlm:aff id="a22">
<institution>Knight Cancer Institute, Oregon Health and Science University</institution>
, Portland, Oregon 97239-3098,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Spellman, Paul" sort="Spellman, Paul" uniqKey="Spellman P" first="Paul" last="Spellman">Paul Spellman</name>
<affiliation>
<nlm:aff id="a22">
<institution>Knight Cancer Institute, Oregon Health and Science University</institution>
, Portland, Oregon 97239-3098,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="He, Minghui" sort="He, Minghui" uniqKey="He M" first="Minghui" last="He">Minghui He</name>
<affiliation>
<nlm:aff id="a23">
<institution>BGI-Shenzhen</institution>
, Shenzhen 518083,
<country>China</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kandoth, Cyriac" sort="Kandoth, Cyriac" uniqKey="Kandoth C" first="Cyriac" last="Kandoth">Cyriac Kandoth</name>
<affiliation>
<nlm:aff id="a24">
<institution>The Genome Institute, Washington University</institution>
, St Louis, Missouri 63108,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lee, Semin" sort="Lee, Semin" uniqKey="Lee S" first="Semin" last="Lee">Semin Lee</name>
<affiliation>
<nlm:aff id="a25">
<institution>Harvard Medical School</institution>
, Boston, Massachusetts 02115,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhang, John" sort="Zhang, John" uniqKey="Zhang J" first="John" last="Zhang">John Zhang</name>
<affiliation>
<nlm:aff id="a25">
<institution>Harvard Medical School</institution>
, Boston, Massachusetts 02115,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a26">
<institution>MD Anderson Cancer Center</institution>
, Houston, Texas 77030,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Letourneau, Louis" sort="Letourneau, Louis" uniqKey="Letourneau L" first="Louis" last="Létourneau">Louis Létourneau</name>
<affiliation>
<nlm:aff id="a27">
<institution>McGill University</institution>
, Montreal, Quebec,
<country>Canada</country>
QC H3A 0G4</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ma, Singer" sort="Ma, Singer" uniqKey="Ma S" first="Singer" last="Ma">Singer Ma</name>
<affiliation>
<nlm:aff id="a28">
<institution>Center for Biomolecular Science and Engineering, University of California</institution>
, Santa Cruz, California 95064,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Seth, Sahil" sort="Seth, Sahil" uniqKey="Seth S" first="Sahil" last="Seth">Sahil Seth</name>
<affiliation>
<nlm:aff id="a26">
<institution>MD Anderson Cancer Center</institution>
, Houston, Texas 77030,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Torrents, David" sort="Torrents, David" uniqKey="Torrents D" first="David" last="Torrents">David Torrents</name>
<affiliation>
<nlm:aff id="a29">
<institution>IRB-BSC Joint Research Program on Computational Biology, Barcelona Supercomputing Center</institution>
, 08034 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Xi, Liu" sort="Xi, Liu" uniqKey="Xi L" first="Liu" last="Xi">Liu Xi</name>
<affiliation>
<nlm:aff id="a30">
<institution>Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza</institution>
, Houston, Texas 77030,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wheeler, David A" sort="Wheeler, David A" uniqKey="Wheeler D" first="David A." last="Wheeler">David A. Wheeler</name>
<affiliation>
<nlm:aff id="a30">
<institution>Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza</institution>
, Houston, Texas 77030,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="L Pez Otin, Carlos" sort="L Pez Otin, Carlos" uniqKey="L Pez Otin C" first="Carlos" last="López-Otín">Carlos López-Otín</name>
<affiliation>
<nlm:aff id="a18">
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Campo, Elias" sort="Campo, Elias" uniqKey="Campo E" first="Elías" last="Campo">Elías Campo</name>
<affiliation>
<nlm:aff id="a31">
<institution>Hematopathology Unit, Department of Pathology, Hospital Clinic, University of Barcelona, Institut d'Investigacions Biomèdiques August Pi i Sunyer</institution>
, 08036 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Campbell, Peter J" sort="Campbell, Peter J" uniqKey="Campbell P" first="Peter J." last="Campbell">Peter J. Campbell</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Boutros, Paul C" sort="Boutros, Paul C" uniqKey="Boutros P" first="Paul C." last="Boutros">Paul C. Boutros</name>
<affiliation>
<nlm:aff id="a9">
<institution>Synergie Lyon Cancer Foundation, Centre Léon Bérard, Cheney C</institution>
, 28 rue Laennec, Lyon 69373,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a32">
<institution>Department of Medical Biophysics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5G 1L7</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Puente, Xose S" sort="Puente, Xose S" uniqKey="Puente X" first="Xose S." last="Puente">Xose S. Puente</name>
<affiliation>
<nlm:aff id="a18">
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Gerhard, Daniela S" sort="Gerhard, Daniela S" uniqKey="Gerhard D" first="Daniela S." last="Gerhard">Daniela S. Gerhard</name>
<affiliation>
<nlm:aff id="a33">
<institution>National Cancer Institute, Office of Cancer Genomics</institution>
, 31 Center Drive, 10A07, Bethesda, Maryland 20892-2580,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Pfister, Stefan M" sort="Pfister, Stefan M" uniqKey="Pfister S" first="Stefan M." last="Pfister">Stefan M. Pfister</name>
<affiliation>
<nlm:aff id="a14">
<institution>Department of Pediatric Hematology and Oncology, Heidelberg University Hospital</institution>
, Im Neuenheimer Feld 430, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a34">
<institution>Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mcpherson, John D" sort="Mcpherson, John D" uniqKey="Mcpherson J" first="John D." last="McPherson">John D. McPherson</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a32">
<institution>Department of Medical Biophysics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5G 1L7</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hudson, Thomas J" sort="Hudson, Thomas J" uniqKey="Hudson T" first="Thomas J." last="Hudson">Thomas J. Hudson</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a32">
<institution>Department of Medical Biophysics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5G 1L7</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a35">
<institution>Department of Molecular Genetics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5S 1A8</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Schlesner, Matthias" sort="Schlesner, Matthias" uniqKey="Schlesner M" first="Matthias" last="Schlesner">Matthias Schlesner</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lichter, Peter" sort="Lichter, Peter" uniqKey="Lichter P" first="Peter" last="Lichter">Peter Lichter</name>
<affiliation>
<nlm:aff id="a36">
<institution>Division of Molecular Genetics, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a37">
<institution>Heidelberg Center for Personalised Oncology (DKFZ-HIPO), German Cancer Research Center (DKFZ)</institution>
, Heidelberg,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Eils, Roland" sort="Eils, Roland" uniqKey="Eils R" first="Roland" last="Eils">Roland Eils</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a37">
<institution>Heidelberg Center for Personalised Oncology (DKFZ-HIPO), German Cancer Research Center (DKFZ)</institution>
, Heidelberg,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a38">
<institution>Institute of Pharmacy and Molecular Biotechnology, University of Heidelberg</institution>
, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a39">
<institution>Bioquant Center, University of Heidelberg</institution>
, Im Neuenheimer Feld 267, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Jones, David T W" sort="Jones, David T W" uniqKey="Jones D" first="David T. W." last="Jones">David T. W. Jones</name>
<affiliation>
<nlm:aff id="a40">
<institution>Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Gut, Ivo G" sort="Gut, Ivo G" uniqKey="Gut I" first="Ivo G." last="Gut">Ivo G. Gut</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26647970</idno>
<idno type="pmc">4682041</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4682041</idno>
<idno type="RBID">PMC:4682041</idno>
<idno type="doi">10.1038/ncomms10001</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000748</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000748</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing</title>
<author>
<name sortKey="Alioto, Tyler S" sort="Alioto, Tyler S" uniqKey="Alioto T" first="Tyler S." last="Alioto">Tyler S. Alioto</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Buchhalter, Ivo" sort="Buchhalter, Ivo" uniqKey="Buchhalter I" first="Ivo" last="Buchhalter">Ivo Buchhalter</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Derdak, Sophia" sort="Derdak, Sophia" uniqKey="Derdak S" first="Sophia" last="Derdak">Sophia Derdak</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hutter, Barbara" sort="Hutter, Barbara" uniqKey="Hutter B" first="Barbara" last="Hutter">Barbara Hutter</name>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Eldridge, Matthew D" sort="Eldridge, Matthew D" uniqKey="Eldridge M" first="Matthew D." last="Eldridge">Matthew D. Eldridge</name>
<affiliation>
<nlm:aff id="a5">
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hovig, Eivind" sort="Hovig, Eivind" uniqKey="Hovig E" first="Eivind" last="Hovig">Eivind Hovig</name>
<affiliation>
<nlm:aff id="a6">
<institution>Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital</institution>
, 0424 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a7">
<institution>Department of Informatics, University of Oslo</institution>
, 0373 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Heisler, Lawrence E" sort="Heisler, Lawrence E" uniqKey="Heisler L" first="Lawrence E." last="Heisler">Lawrence E. Heisler</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Beck, Timothy A" sort="Beck, Timothy A" uniqKey="Beck T" first="Timothy A." last="Beck">Timothy A. Beck</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Simpson, Jared T" sort="Simpson, Jared T" uniqKey="Simpson J" first="Jared T." last="Simpson">Jared T. Simpson</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tonon, Laurie" sort="Tonon, Laurie" uniqKey="Tonon L" first="Laurie" last="Tonon">Laurie Tonon</name>
<affiliation>
<nlm:aff id="a9">
<institution>Synergie Lyon Cancer Foundation, Centre Léon Bérard, Cheney C</institution>
, 28 rue Laennec, Lyon 69373,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sertier, Anne Sophie" sort="Sertier, Anne Sophie" uniqKey="Sertier A" first="Anne-Sophie" last="Sertier">Anne-Sophie Sertier</name>
<affiliation>
<nlm:aff id="a9">
<institution>Synergie Lyon Cancer Foundation, Centre Léon Bérard, Cheney C</institution>
, 28 rue Laennec, Lyon 69373,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Patch, Ann Marie" sort="Patch, Ann Marie" uniqKey="Patch A" first="Ann-Marie" last="Patch">Ann-Marie Patch</name>
<affiliation>
<nlm:aff id="a10">
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a11">
<institution>QIMR Berghofer Medical Research Institute</institution>
, Brisbane, Queensland 4006,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="J Ger, Natalie" sort="J Ger, Natalie" uniqKey="J Ger N" first="Natalie" last="Jäger">Natalie Jäger</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a12">
<institution>Department of Genetics, Stanford University</institution>
, Mail Stop-5120, Stanford, California 94305-5120,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ginsbach, Philip" sort="Ginsbach, Philip" uniqKey="Ginsbach P" first="Philip" last="Ginsbach">Philip Ginsbach</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Drews, Ruben" sort="Drews, Ruben" uniqKey="Drews R" first="Ruben" last="Drews">Ruben Drews</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Paramasivam, Nagarajan" sort="Paramasivam, Nagarajan" uniqKey="Paramasivam N" first="Nagarajan" last="Paramasivam">Nagarajan Paramasivam</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kabbe, Rolf" sort="Kabbe, Rolf" uniqKey="Kabbe R" first="Rolf" last="Kabbe">Rolf Kabbe</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chotewutmontri, Sasithorn" sort="Chotewutmontri, Sasithorn" uniqKey="Chotewutmontri S" first="Sasithorn" last="Chotewutmontri">Sasithorn Chotewutmontri</name>
<affiliation>
<nlm:aff id="a13">
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Diessl, Nicolle" sort="Diessl, Nicolle" uniqKey="Diessl N" first="Nicolle" last="Diessl">Nicolle Diessl</name>
<affiliation>
<nlm:aff id="a13">
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Previti, Christopher" sort="Previti, Christopher" uniqKey="Previti C" first="Christopher" last="Previti">Christopher Previti</name>
<affiliation>
<nlm:aff id="a13">
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Schmidt, Sabine" sort="Schmidt, Sabine" uniqKey="Schmidt S" first="Sabine" last="Schmidt">Sabine Schmidt</name>
<affiliation>
<nlm:aff id="a13">
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Brors, Benedikt" sort="Brors, Benedikt" uniqKey="Brors B" first="Benedikt" last="Brors">Benedikt Brors</name>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Feuerbach, Lars" sort="Feuerbach, Lars" uniqKey="Feuerbach L" first="Lars" last="Feuerbach">Lars Feuerbach</name>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Heinold, Michael" sort="Heinold, Michael" uniqKey="Heinold M" first="Michael" last="Heinold">Michael Heinold</name>
<affiliation>
<nlm:aff id="a4">
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Grobner, Susanne" sort="Grobner, Susanne" uniqKey="Grobner S" first="Susanne" last="Gröbner">Susanne Gröbner</name>
<affiliation>
<nlm:aff id="a14">
<institution>Department of Pediatric Hematology and Oncology, Heidelberg University Hospital</institution>
, Im Neuenheimer Feld 430, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Korshunov, Andrey" sort="Korshunov, Andrey" uniqKey="Korshunov A" first="Andrey" last="Korshunov">Andrey Korshunov</name>
<affiliation>
<nlm:aff id="a15">
<institution>Department of Neuropathology, Heidelberg University Hospital</institution>
, Im Neuenheimer Feld 224, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tarpey, Patrick S" sort="Tarpey, Patrick S" uniqKey="Tarpey P" first="Patrick S." last="Tarpey">Patrick S. Tarpey</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Butler, Adam P" sort="Butler, Adam P" uniqKey="Butler A" first="Adam P." last="Butler">Adam P. Butler</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hinton, Jonathan" sort="Hinton, Jonathan" uniqKey="Hinton J" first="Jonathan" last="Hinton">Jonathan Hinton</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Jones, David" sort="Jones, David" uniqKey="Jones D" first="David" last="Jones">David Jones</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Menzies, Andrew" sort="Menzies, Andrew" uniqKey="Menzies A" first="Andrew" last="Menzies">Andrew Menzies</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Raine, Keiran" sort="Raine, Keiran" uniqKey="Raine K" first="Keiran" last="Raine">Keiran Raine</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Shepherd, Rebecca" sort="Shepherd, Rebecca" uniqKey="Shepherd R" first="Rebecca" last="Shepherd">Rebecca Shepherd</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Stebbings, Lucy" sort="Stebbings, Lucy" uniqKey="Stebbings L" first="Lucy" last="Stebbings">Lucy Stebbings</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Teague, Jon W" sort="Teague, Jon W" uniqKey="Teague J" first="Jon W." last="Teague">Jon W. Teague</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ribeca, Paolo" sort="Ribeca, Paolo" uniqKey="Ribeca P" first="Paolo" last="Ribeca">Paolo Ribeca</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Giner, Francesc Castro" sort="Giner, Francesc Castro" uniqKey="Giner F" first="Francesc Castro" last="Giner">Francesc Castro Giner</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Beltran, Sergi" sort="Beltran, Sergi" uniqKey="Beltran S" first="Sergi" last="Beltran">Sergi Beltran</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Raineri, Emanuele" sort="Raineri, Emanuele" uniqKey="Raineri E" first="Emanuele" last="Raineri">Emanuele Raineri</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Dabad, Marc" sort="Dabad, Marc" uniqKey="Dabad M" first="Marc" last="Dabad">Marc Dabad</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Heath, Simon C" sort="Heath, Simon C" uniqKey="Heath S" first="Simon C." last="Heath">Simon C. Heath</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Gut, Marta" sort="Gut, Marta" uniqKey="Gut M" first="Marta" last="Gut">Marta Gut</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Denroche, Robert E" sort="Denroche, Robert E" uniqKey="Denroche R" first="Robert E." last="Denroche">Robert E. Denroche</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Harding, Nicholas J" sort="Harding, Nicholas J" uniqKey="Harding N" first="Nicholas J." last="Harding">Nicholas J. Harding</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Yamaguchi, Takafumi N" sort="Yamaguchi, Takafumi N" uniqKey="Yamaguchi T" first="Takafumi N." last="Yamaguchi">Takafumi N. Yamaguchi</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fujimoto, Akihiro" sort="Fujimoto, Akihiro" uniqKey="Fujimoto A" first="Akihiro" last="Fujimoto">Akihiro Fujimoto</name>
<affiliation>
<nlm:aff id="a17">
<institution>RIKEN Center for Integrative Medical Sciences</institution>
, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639,
<country>Japan</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Nakagawa, Hidewaki" sort="Nakagawa, Hidewaki" uniqKey="Nakagawa H" first="Hidewaki" last="Nakagawa">Hidewaki Nakagawa</name>
<affiliation>
<nlm:aff id="a17">
<institution>RIKEN Center for Integrative Medical Sciences</institution>
, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639,
<country>Japan</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Quesada, Victor" sort="Quesada, Victor" uniqKey="Quesada V" first="Víctor" last="Quesada">Víctor Quesada</name>
<affiliation>
<nlm:aff id="a18">
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Valdes Mas, Rafael" sort="Valdes Mas, Rafael" uniqKey="Valdes Mas R" first="Rafael" last="Valdés-Mas">Rafael Valdés-Mas</name>
<affiliation>
<nlm:aff id="a18">
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Nakken, Sigve" sort="Nakken, Sigve" uniqKey="Nakken S" first="Sigve" last="Nakken">Sigve Nakken</name>
<affiliation>
<nlm:aff id="a6">
<institution>Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital</institution>
, 0424 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vodak, Daniel" sort="Vodak, Daniel" uniqKey="Vodak D" first="Daniel" last="Vodák">Daniel Vodák</name>
<affiliation>
<nlm:aff id="a6">
<institution>Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital</institution>
, 0424 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a19">
<institution>The Bioinformatics Core Facility, Institute for Cancer Genetics and Informatics, Oslo University Hospital</institution>
, 0310 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bower, Lawrence" sort="Bower, Lawrence" uniqKey="Bower L" first="Lawrence" last="Bower">Lawrence Bower</name>
<affiliation>
<nlm:aff id="a5">
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lynch, Andrew G" sort="Lynch, Andrew G" uniqKey="Lynch A" first="Andrew G." last="Lynch">Andrew G. Lynch</name>
<affiliation>
<nlm:aff id="a5">
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Anderson, Charlotte L" sort="Anderson, Charlotte L" uniqKey="Anderson C" first="Charlotte L." last="Anderson">Charlotte L. Anderson</name>
<affiliation>
<nlm:aff id="a5">
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a20">
<institution>Victorian Life Sciences Computation Initiative, The University of Melbourne</institution>
, Melbourne, Victoria 3053,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Waddell, Nicola" sort="Waddell, Nicola" uniqKey="Waddell N" first="Nicola" last="Waddell">Nicola Waddell</name>
<affiliation>
<nlm:aff id="a10">
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a11">
<institution>QIMR Berghofer Medical Research Institute</institution>
, Brisbane, Queensland 4006,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Pearson, John V" sort="Pearson, John V" uniqKey="Pearson J" first="John V." last="Pearson">John V. Pearson</name>
<affiliation>
<nlm:aff id="a10">
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a11">
<institution>QIMR Berghofer Medical Research Institute</institution>
, Brisbane, Queensland 4006,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Grimmond, Sean M" sort="Grimmond, Sean M" uniqKey="Grimmond S" first="Sean M." last="Grimmond">Sean M. Grimmond</name>
<affiliation>
<nlm:aff id="a10">
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a21">
<institution>Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow</institution>
, Glasgow, Scotland G61 1QH,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Peto, Myron" sort="Peto, Myron" uniqKey="Peto M" first="Myron" last="Peto">Myron Peto</name>
<affiliation>
<nlm:aff id="a22">
<institution>Knight Cancer Institute, Oregon Health and Science University</institution>
, Portland, Oregon 97239-3098,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Spellman, Paul" sort="Spellman, Paul" uniqKey="Spellman P" first="Paul" last="Spellman">Paul Spellman</name>
<affiliation>
<nlm:aff id="a22">
<institution>Knight Cancer Institute, Oregon Health and Science University</institution>
, Portland, Oregon 97239-3098,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="He, Minghui" sort="He, Minghui" uniqKey="He M" first="Minghui" last="He">Minghui He</name>
<affiliation>
<nlm:aff id="a23">
<institution>BGI-Shenzhen</institution>
, Shenzhen 518083,
<country>China</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kandoth, Cyriac" sort="Kandoth, Cyriac" uniqKey="Kandoth C" first="Cyriac" last="Kandoth">Cyriac Kandoth</name>
<affiliation>
<nlm:aff id="a24">
<institution>The Genome Institute, Washington University</institution>
, St Louis, Missouri 63108,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lee, Semin" sort="Lee, Semin" uniqKey="Lee S" first="Semin" last="Lee">Semin Lee</name>
<affiliation>
<nlm:aff id="a25">
<institution>Harvard Medical School</institution>
, Boston, Massachusetts 02115,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhang, John" sort="Zhang, John" uniqKey="Zhang J" first="John" last="Zhang">John Zhang</name>
<affiliation>
<nlm:aff id="a25">
<institution>Harvard Medical School</institution>
, Boston, Massachusetts 02115,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a26">
<institution>MD Anderson Cancer Center</institution>
, Houston, Texas 77030,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Letourneau, Louis" sort="Letourneau, Louis" uniqKey="Letourneau L" first="Louis" last="Létourneau">Louis Létourneau</name>
<affiliation>
<nlm:aff id="a27">
<institution>McGill University</institution>
, Montreal, Quebec,
<country>Canada</country>
QC H3A 0G4</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ma, Singer" sort="Ma, Singer" uniqKey="Ma S" first="Singer" last="Ma">Singer Ma</name>
<affiliation>
<nlm:aff id="a28">
<institution>Center for Biomolecular Science and Engineering, University of California</institution>
, Santa Cruz, California 95064,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Seth, Sahil" sort="Seth, Sahil" uniqKey="Seth S" first="Sahil" last="Seth">Sahil Seth</name>
<affiliation>
<nlm:aff id="a26">
<institution>MD Anderson Cancer Center</institution>
, Houston, Texas 77030,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Torrents, David" sort="Torrents, David" uniqKey="Torrents D" first="David" last="Torrents">David Torrents</name>
<affiliation>
<nlm:aff id="a29">
<institution>IRB-BSC Joint Research Program on Computational Biology, Barcelona Supercomputing Center</institution>
, 08034 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Xi, Liu" sort="Xi, Liu" uniqKey="Xi L" first="Liu" last="Xi">Liu Xi</name>
<affiliation>
<nlm:aff id="a30">
<institution>Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza</institution>
, Houston, Texas 77030,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wheeler, David A" sort="Wheeler, David A" uniqKey="Wheeler D" first="David A." last="Wheeler">David A. Wheeler</name>
<affiliation>
<nlm:aff id="a30">
<institution>Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza</institution>
, Houston, Texas 77030,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lopez Otin, Carlos" sort="Lopez Otin, Carlos" uniqKey="Lopez Otin C" first="Carlos" last="López-Otín">Carlos López-Otín</name>
<affiliation>
<nlm:aff id="a18">
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Campo, Elias" sort="Campo, Elias" uniqKey="Campo E" first="Elías" last="Campo">Elías Campo</name>
<affiliation>
<nlm:aff id="a31">
<institution>Hematopathology Unit, Department of Pathology, Hospital Clinic, University of Barcelona, Institut d'Investigacions Biomèdiques August Pi i Sunyer</institution>
, 08036 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Campbell, Peter J" sort="Campbell, Peter J" uniqKey="Campbell P" first="Peter J." last="Campbell">Peter J. Campbell</name>
<affiliation>
<nlm:aff id="a16">
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Boutros, Paul C" sort="Boutros, Paul C" uniqKey="Boutros P" first="Paul C." last="Boutros">Paul C. Boutros</name>
<affiliation>
<nlm:aff id="a9">
<institution>Synergie Lyon Cancer Foundation, Centre Léon Bérard, Cheney C</institution>
, 28 rue Laennec, Lyon 69373,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a32">
<institution>Department of Medical Biophysics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5G 1L7</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Puente, Xose S" sort="Puente, Xose S" uniqKey="Puente X" first="Xose S." last="Puente">Xose S. Puente</name>
<affiliation>
<nlm:aff id="a18">
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Gerhard, Daniela S" sort="Gerhard, Daniela S" uniqKey="Gerhard D" first="Daniela S." last="Gerhard">Daniela S. Gerhard</name>
<affiliation>
<nlm:aff id="a33">
<institution>National Cancer Institute, Office of Cancer Genomics</institution>
, 31 Center Drive, 10A07, Bethesda, Maryland 20892-2580,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Pfister, Stefan M" sort="Pfister, Stefan M" uniqKey="Pfister S" first="Stefan M." last="Pfister">Stefan M. Pfister</name>
<affiliation>
<nlm:aff id="a14">
<institution>Department of Pediatric Hematology and Oncology, Heidelberg University Hospital</institution>
, Im Neuenheimer Feld 430, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a34">
<institution>Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mcpherson, John D" sort="Mcpherson, John D" uniqKey="Mcpherson J" first="John D." last="McPherson">John D. McPherson</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a32">
<institution>Department of Medical Biophysics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5G 1L7</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hudson, Thomas J" sort="Hudson, Thomas J" uniqKey="Hudson T" first="Thomas J." last="Hudson">Thomas J. Hudson</name>
<affiliation>
<nlm:aff id="a8">
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a32">
<institution>Department of Medical Biophysics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5G 1L7</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a35">
<institution>Department of Molecular Genetics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5S 1A8</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Schlesner, Matthias" sort="Schlesner, Matthias" uniqKey="Schlesner M" first="Matthias" last="Schlesner">Matthias Schlesner</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lichter, Peter" sort="Lichter, Peter" uniqKey="Lichter P" first="Peter" last="Lichter">Peter Lichter</name>
<affiliation>
<nlm:aff id="a36">
<institution>Division of Molecular Genetics, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a37">
<institution>Heidelberg Center for Personalised Oncology (DKFZ-HIPO), German Cancer Research Center (DKFZ)</institution>
, Heidelberg,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Eils, Roland" sort="Eils, Roland" uniqKey="Eils R" first="Roland" last="Eils">Roland Eils</name>
<affiliation>
<nlm:aff id="a3">
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a37">
<institution>Heidelberg Center for Personalised Oncology (DKFZ-HIPO), German Cancer Research Center (DKFZ)</institution>
, Heidelberg,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a38">
<institution>Institute of Pharmacy and Molecular Biotechnology, University of Heidelberg</institution>
, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a39">
<institution>Bioquant Center, University of Heidelberg</institution>
, Im Neuenheimer Feld 267, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Jones, David T W" sort="Jones, David T W" uniqKey="Jones D" first="David T. W." last="Jones">David T. W. Jones</name>
<affiliation>
<nlm:aff id="a40">
<institution>Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Gut, Ivo G" sort="Gut, Ivo G" uniqKey="Gut I" first="Ivo G." last="Gut">Ivo G. Gut</name>
<affiliation>
<nlm:aff id="a1">
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Nature Communications</title>
<idno type="eISSN">2041-1723</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here, using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Hudson, T J" uniqKey="Hudson T">T. J. Hudson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mardis, E R" uniqKey="Mardis E">E. R. Mardis</name>
</author>
<author>
<name sortKey="Wilson, R K" uniqKey="Wilson R">R. K. Wilson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ley, T J" uniqKey="Ley T">T. J. Ley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Puente, X S" uniqKey="Puente X">X. S. Puente</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alkodsi, A" uniqKey="Alkodsi A">A. Alkodsi</name>
</author>
<author>
<name sortKey="Louhimo, R" uniqKey="Louhimo R">R. Louhimo</name>
</author>
<author>
<name sortKey="Hautaniemi, S" uniqKey="Hautaniemi S">S. Hautaniemi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dewey, F E" uniqKey="Dewey F">F. E. Dewey</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kandoth, C" uniqKey="Kandoth C">C. Kandoth</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jones, D T" uniqKey="Jones D">D. T. Jones</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, H" uniqKey="Li H">H. Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcginn, S" uniqKey="Mcginn S">S. McGinn</name>
</author>
<author>
<name sortKey="Gut, I G" uniqKey="Gut I">I. G. Gut</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Xu, H" uniqKey="Xu H">H. Xu</name>
</author>
<author>
<name sortKey="Dicarlo, J" uniqKey="Dicarlo J">J. DiCarlo</name>
</author>
<author>
<name sortKey="Satya, R V" uniqKey="Satya R">R. V. Satya</name>
</author>
<author>
<name sortKey="Peng, Q" uniqKey="Peng Q">Q. Peng</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y. Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Highnam, G" uniqKey="Highnam G">G. Highnam</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zook, J M" uniqKey="Zook J">J. M. Zook</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pabinger, S" uniqKey="Pabinger S">S. Pabinger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fang, H" uniqKey="Fang H">H. Fang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="O Rawe, J" uniqKey="O Rawe J">J. O'Rawe</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Q" uniqKey="Wang Q">Q. Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kim, S Y" uniqKey="Kim S">S. Y. Kim</name>
</author>
<author>
<name sortKey="Speed, T P" uniqKey="Speed T">T. P. Speed</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Louis, D N" uniqKey="Louis D">D. N. Louis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Taylor, M D" uniqKey="Taylor M">M. D. Taylor</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ewing, A D" uniqKey="Ewing A">A. D. Ewing</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kassahn, K S" uniqKey="Kassahn K">K. S. Kassahn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mckenna, A" uniqKey="Mckenna A">A. McKenna</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Simpson, J T" uniqKey="Simpson J">J. T. Simpson</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R. Durbin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Garrison, E" uniqKey="Garrison E">E. Garrison</name>
</author>
<author>
<name sortKey="Marth, G" uniqKey="Marth G">G. Marth</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Saunders, C T" uniqKey="Saunders C">C. T. Saunders</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rimmer, A" uniqKey="Rimmer A">A. Rimmer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Challis, D" uniqKey="Challis D">D. Challis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Moncunill, V" uniqKey="Moncunill V">V. Moncunill</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cibulskis, K" uniqKey="Cibulskis K">K. Cibulskis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goode, D L" uniqKey="Goode D">D. L. Goode</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rieber, N" uniqKey="Rieber N">N. Rieber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alexandrov, L B" uniqKey="Alexandrov L">L. B. Alexandrov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alexandrov, L B" uniqKey="Alexandrov L">L. B. Alexandrov</name>
</author>
<author>
<name sortKey="Nik Zainal, S" uniqKey="Nik Zainal S">S. Nik-Zainal</name>
</author>
<author>
<name sortKey="Wedge, D C" uniqKey="Wedge D">D. C. Wedge</name>
</author>
<author>
<name sortKey="Campbell, P J" uniqKey="Campbell P">P. J. Campbell</name>
</author>
<author>
<name sortKey="Stratton, M R" uniqKey="Stratton M">M. R. Stratton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, H" uniqKey="Li H">H. Li</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R. Durbin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marco Sola, S" uniqKey="Marco Sola S">S. Marco-Sola</name>
</author>
<author>
<name sortKey="Sammeth, M" uniqKey="Sammeth M">M. Sammeth</name>
</author>
<author>
<name sortKey="Guigo, R" uniqKey="Guigo R">R. Guigo</name>
</author>
<author>
<name sortKey="Ribeca, P" uniqKey="Ribeca P">P. Ribeca</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raineri, E" uniqKey="Raineri E">E. Raineri</name>
</author>
<author>
<name sortKey="Dabad, M" uniqKey="Dabad M">M. Dabad</name>
</author>
<author>
<name sortKey="Heath, S" uniqKey="Heath S">S. Heath</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benson, G" uniqKey="Benson G">G. Benson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Derrien, T" uniqKey="Derrien T">T. Derrien</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Nat Commun</journal-id>
<journal-id journal-id-type="iso-abbrev">Nat Commun</journal-id>
<journal-title-group>
<journal-title>Nature Communications</journal-title>
</journal-title-group>
<issn pub-type="epub">2041-1723</issn>
<publisher>
<publisher-name>Nature Publishing Group</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">26647970</article-id>
<article-id pub-id-type="pmc">4682041</article-id>
<article-id pub-id-type="pii">ncomms10001</article-id>
<article-id pub-id-type="doi">10.1038/ncomms10001</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Alioto</surname>
<given-names>Tyler S.</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
<xref ref-type="author-notes" rid="n1">*</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-2960-5420</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Buchhalter</surname>
<given-names>Ivo</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
<xref ref-type="aff" rid="a4">4</xref>
<xref ref-type="author-notes" rid="n1">*</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0003-0764-5832</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Derdak</surname>
<given-names>Sophia</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hutter</surname>
<given-names>Barbara</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-9034-0329</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Eldridge</surname>
<given-names>Matthew D.</given-names>
</name>
<xref ref-type="aff" rid="a5">5</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hovig</surname>
<given-names>Eivind</given-names>
</name>
<xref ref-type="aff" rid="a6">6</xref>
<xref ref-type="aff" rid="a7">7</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-9103-1077</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Heisler</surname>
<given-names>Lawrence E.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Beck</surname>
<given-names>Timothy A.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Simpson</surname>
<given-names>Jared T.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Tonon</surname>
<given-names>Laurie</given-names>
</name>
<xref ref-type="aff" rid="a9">9</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sertier</surname>
<given-names>Anne-Sophie</given-names>
</name>
<xref ref-type="aff" rid="a9">9</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Patch</surname>
<given-names>Ann-Marie</given-names>
</name>
<xref ref-type="aff" rid="a10">10</xref>
<xref ref-type="aff" rid="a11">11</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-6121-4019</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Jäger</surname>
<given-names>Natalie</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
<xref ref-type="aff" rid="a12">12</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ginsbach</surname>
<given-names>Philip</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Drews</surname>
<given-names>Ruben</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-7360-4970</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Paramasivam</surname>
<given-names>Nagarajan</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kabbe</surname>
<given-names>Rolf</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Chotewutmontri</surname>
<given-names>Sasithorn</given-names>
</name>
<xref ref-type="aff" rid="a13">13</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Diessl</surname>
<given-names>Nicolle</given-names>
</name>
<xref ref-type="aff" rid="a13">13</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Previti</surname>
<given-names>Christopher</given-names>
</name>
<xref ref-type="aff" rid="a13">13</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Schmidt</surname>
<given-names>Sabine</given-names>
</name>
<xref ref-type="aff" rid="a13">13</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Brors</surname>
<given-names>Benedikt</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Feuerbach</surname>
<given-names>Lars</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Heinold</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gröbner</surname>
<given-names>Susanne</given-names>
</name>
<xref ref-type="aff" rid="a14">14</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Korshunov</surname>
<given-names>Andrey</given-names>
</name>
<xref ref-type="aff" rid="a15">15</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Tarpey</surname>
<given-names>Patrick S.</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Butler</surname>
<given-names>Adam P.</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hinton</surname>
<given-names>Jonathan</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Jones</surname>
<given-names>David</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Menzies</surname>
<given-names>Andrew</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Raine</surname>
<given-names>Keiran</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-5634-1539</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Shepherd</surname>
<given-names>Rebecca</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Stebbings</surname>
<given-names>Lucy</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Teague</surname>
<given-names>Jon W.</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ribeca</surname>
<given-names>Paolo</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-5599-3933</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Giner</surname>
<given-names>Francesc Castro</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-6111-0754</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Beltran</surname>
<given-names>Sergi</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Raineri</surname>
<given-names>Emanuele</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Dabad</surname>
<given-names>Marc</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Heath</surname>
<given-names>Simon C.</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-9550-0897</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gut</surname>
<given-names>Marta</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Denroche</surname>
<given-names>Robert E.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0003-2197-7083</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Harding</surname>
<given-names>Nicholas J.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yamaguchi</surname>
<given-names>Takafumi N.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0003-1082-3871</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Fujimoto</surname>
<given-names>Akihiro</given-names>
</name>
<xref ref-type="aff" rid="a17">17</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Nakagawa</surname>
<given-names>Hidewaki</given-names>
</name>
<xref ref-type="aff" rid="a17">17</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Quesada</surname>
<given-names>Víctor</given-names>
</name>
<xref ref-type="aff" rid="a18">18</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Valdés-Mas</surname>
<given-names>Rafael</given-names>
</name>
<xref ref-type="aff" rid="a18">18</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Nakken</surname>
<given-names>Sigve</given-names>
</name>
<xref ref-type="aff" rid="a6">6</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Vodák</surname>
<given-names>Daniel</given-names>
</name>
<xref ref-type="aff" rid="a6">6</xref>
<xref ref-type="aff" rid="a19">19</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bower</surname>
<given-names>Lawrence</given-names>
</name>
<xref ref-type="aff" rid="a5">5</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lynch</surname>
<given-names>Andrew G.</given-names>
</name>
<xref ref-type="aff" rid="a5">5</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Anderson</surname>
<given-names>Charlotte L.</given-names>
</name>
<xref ref-type="aff" rid="a5">5</xref>
<xref ref-type="aff" rid="a20">20</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Waddell</surname>
<given-names>Nicola</given-names>
</name>
<xref ref-type="aff" rid="a10">10</xref>
<xref ref-type="aff" rid="a11">11</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pearson</surname>
<given-names>John V.</given-names>
</name>
<xref ref-type="aff" rid="a10">10</xref>
<xref ref-type="aff" rid="a11">11</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Grimmond</surname>
<given-names>Sean M.</given-names>
</name>
<xref ref-type="aff" rid="a10">10</xref>
<xref ref-type="aff" rid="a21">21</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Peto</surname>
<given-names>Myron</given-names>
</name>
<xref ref-type="aff" rid="a22">22</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Spellman</surname>
<given-names>Paul</given-names>
</name>
<xref ref-type="aff" rid="a22">22</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>He</surname>
<given-names>Minghui</given-names>
</name>
<xref ref-type="aff" rid="a23">23</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kandoth</surname>
<given-names>Cyriac</given-names>
</name>
<xref ref-type="aff" rid="a24">24</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lee</surname>
<given-names>Semin</given-names>
</name>
<xref ref-type="aff" rid="a25">25</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhang</surname>
<given-names>John</given-names>
</name>
<xref ref-type="aff" rid="a25">25</xref>
<xref ref-type="aff" rid="a26">26</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Létourneau</surname>
<given-names>Louis</given-names>
</name>
<xref ref-type="aff" rid="a27">27</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ma</surname>
<given-names>Singer</given-names>
</name>
<xref ref-type="aff" rid="a28">28</xref>
<xref ref-type="author-notes" rid="n3"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Seth</surname>
<given-names>Sahil</given-names>
</name>
<xref ref-type="aff" rid="a26">26</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0003-4579-3959</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Torrents</surname>
<given-names>David</given-names>
</name>
<xref ref-type="aff" rid="a29">29</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Xi</surname>
<given-names>Liu</given-names>
</name>
<xref ref-type="aff" rid="a30">30</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wheeler</surname>
<given-names>David A.</given-names>
</name>
<xref ref-type="aff" rid="a30">30</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>López-Otín</surname>
<given-names>Carlos</given-names>
</name>
<xref ref-type="aff" rid="a18">18</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Campo</surname>
<given-names>Elías</given-names>
</name>
<xref ref-type="aff" rid="a31">31</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Campbell</surname>
<given-names>Peter J.</given-names>
</name>
<xref ref-type="aff" rid="a16">16</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Boutros</surname>
<given-names>Paul C.</given-names>
</name>
<xref ref-type="aff" rid="a9">9</xref>
<xref ref-type="aff" rid="a32">32</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0003-0553-7520</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Puente</surname>
<given-names>Xose S.</given-names>
</name>
<xref ref-type="aff" rid="a18">18</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gerhard</surname>
<given-names>Daniela S.</given-names>
</name>
<xref ref-type="aff" rid="a33">33</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pfister</surname>
<given-names>Stefan M.</given-names>
</name>
<xref ref-type="aff" rid="a14">14</xref>
<xref ref-type="aff" rid="a34">34</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>McPherson</surname>
<given-names>John D.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
<xref ref-type="aff" rid="a32">32</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hudson</surname>
<given-names>Thomas J.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
<xref ref-type="aff" rid="a32">32</xref>
<xref ref-type="aff" rid="a35">35</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-1376-4849</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Schlesner</surname>
<given-names>Matthias</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lichter</surname>
<given-names>Peter</given-names>
</name>
<xref ref-type="aff" rid="a36">36</xref>
<xref ref-type="aff" rid="a37">37</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Eils</surname>
<given-names>Roland</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
<xref ref-type="aff" rid="a37">37</xref>
<xref ref-type="aff" rid="a38">38</xref>
<xref ref-type="aff" rid="a39">39</xref>
<xref ref-type="author-notes" rid="n2"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Jones</surname>
<given-names>David T. W.</given-names>
</name>
<xref ref-type="aff" rid="a40">40</xref>
<xref ref-type="author-notes" rid="n2"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gut</surname>
<given-names>Ivo G.</given-names>
</name>
<xref ref-type="corresp" rid="c1">a</xref>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
<xref ref-type="author-notes" rid="n2"></xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-7219-632X</contrib-id>
</contrib>
<aff id="a1">
<label>1</label>
<institution>CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST)</institution>
, Baldiri i Reixac 4, 08028 Barcelona,
<country>Spain</country>
</aff>
<aff id="a2">
<label>2</label>
<institution>Universitat Pompeu Fabra (UPF)</institution>
, 08002 Barcelona,
<country>Spain</country>
</aff>
<aff id="a3">
<label>3</label>
<institution>Division of Theoretical Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a4">
<label>4</label>
<institution>Division of Applied Bioinformatics, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a5">
<label>5</label>
<institution>Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre</institution>
, Robinson Way, Cambridge CB2 0RE,
<country>UK</country>
</aff>
<aff id="a6">
<label>6</label>
<institution>Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital</institution>
, 0424 Oslo,
<country>Norway</country>
</aff>
<aff id="a7">
<label>7</label>
<institution>Department of Informatics, University of Oslo</institution>
, 0373 Oslo,
<country>Norway</country>
</aff>
<aff id="a8">
<label>8</label>
<institution>Ontario Institute for Cancer Research</institution>
, 661 University Avenue, Suite 510, Toronto, Ontario,
<country>Canada</country>
M5G 0A3</aff>
<aff id="a9">
<label>9</label>
<institution>Synergie Lyon Cancer Foundation, Centre Léon Bérard, Cheney C</institution>
, 28 rue Laennec, Lyon 69373,
<country>France</country>
</aff>
<aff id="a10">
<label>10</label>
<institution>Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland</institution>
, St Lucia, Brisbane, Queensland 4072,
<country>Australia</country>
</aff>
<aff id="a11">
<label>11</label>
<institution>QIMR Berghofer Medical Research Institute</institution>
, Brisbane, Queensland 4006,
<country>Australia</country>
</aff>
<aff id="a12">
<label>12</label>
<institution>Department of Genetics, Stanford University</institution>
, Mail Stop-5120, Stanford, California 94305-5120,
<country>USA</country>
</aff>
<aff id="a13">
<label>13</label>
<institution>Genome and Proteome Core Facility, German Cancer Research Center</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a14">
<label>14</label>
<institution>Department of Pediatric Hematology and Oncology, Heidelberg University Hospital</institution>
, Im Neuenheimer Feld 430, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a15">
<label>15</label>
<institution>Department of Neuropathology, Heidelberg University Hospital</institution>
, Im Neuenheimer Feld 224, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a16">
<label>16</label>
<institution>Wellcome Trust Sanger Institute</institution>
, Hinxton, Cambridge CB10 1SA,
<country>UK</country>
</aff>
<aff id="a17">
<label>17</label>
<institution>RIKEN Center for Integrative Medical Sciences</institution>
, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639,
<country>Japan</country>
</aff>
<aff id="a18">
<label>18</label>
<institution>Universidad de Oviedo—IUOPA, C/Fernando Bongera s/n</institution>
, 33006 Oviedo,
<country>Spain</country>
</aff>
<aff id="a19">
<label>19</label>
<institution>The Bioinformatics Core Facility, Institute for Cancer Genetics and Informatics, Oslo University Hospital</institution>
, 0310 Oslo,
<country>Norway</country>
</aff>
<aff id="a20">
<label>20</label>
<institution>Victorian Life Sciences Computation Initiative, The University of Melbourne</institution>
, Melbourne, Victoria 3053,
<country>Australia</country>
</aff>
<aff id="a21">
<label>21</label>
<institution>Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow</institution>
, Glasgow, Scotland G61 1QH,
<country>UK</country>
</aff>
<aff id="a22">
<label>22</label>
<institution>Knight Cancer Institute, Oregon Health and Science University</institution>
, Portland, Oregon 97239-3098,
<country>USA</country>
</aff>
<aff id="a23">
<label>23</label>
<institution>BGI-Shenzhen</institution>
, Shenzhen 518083,
<country>China</country>
</aff>
<aff id="a24">
<label>24</label>
<institution>The Genome Institute, Washington University</institution>
, St Louis, Missouri 63108,
<country>USA</country>
</aff>
<aff id="a25">
<label>25</label>
<institution>Harvard Medical School</institution>
, Boston, Massachusetts 02115,
<country>USA</country>
</aff>
<aff id="a26">
<label>26</label>
<institution>MD Anderson Cancer Center</institution>
, Houston, Texas 77030,
<country>USA</country>
</aff>
<aff id="a27">
<label>27</label>
<institution>McGill University</institution>
, Montreal, Quebec,
<country>Canada</country>
QC H3A 0G4</aff>
<aff id="a28">
<label>28</label>
<institution>Center for Biomolecular Science and Engineering, University of California</institution>
, Santa Cruz, California 95064,
<country>USA</country>
</aff>
<aff id="a29">
<label>29</label>
<institution>IRB-BSC Joint Research Program on Computational Biology, Barcelona Supercomputing Center</institution>
, 08034 Barcelona,
<country>Spain</country>
</aff>
<aff id="a30">
<label>30</label>
<institution>Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza</institution>
, Houston, Texas 77030,
<country>USA</country>
</aff>
<aff id="a31">
<label>31</label>
<institution>Hematopathology Unit, Department of Pathology, Hospital Clinic, University of Barcelona, Institut d'Investigacions Biomèdiques August Pi i Sunyer</institution>
, 08036 Barcelona,
<country>Spain</country>
</aff>
<aff id="a32">
<label>32</label>
<institution>Department of Medical Biophysics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5G 1L7</aff>
<aff id="a33">
<label>33</label>
<institution>National Cancer Institute, Office of Cancer Genomics</institution>
, 31 Center Drive, 10A07, Bethesda, Maryland 20892-2580,
<country>USA</country>
</aff>
<aff id="a34">
<label>34</label>
<institution>Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a35">
<label>35</label>
<institution>Department of Molecular Genetics, University of Toronto</institution>
, Toronto, Ontario,
<country>Canada</country>
M5S 1A8</aff>
<aff id="a36">
<label>36</label>
<institution>Division of Molecular Genetics, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a37">
<label>37</label>
<institution>Heidelberg Center for Personalised Oncology (DKFZ-HIPO), German Cancer Research Center (DKFZ)</institution>
, Heidelberg,
<country>Germany</country>
</aff>
<aff id="a38">
<label>38</label>
<institution>Institute of Pharmacy and Molecular Biotechnology, University of Heidelberg</institution>
, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a39">
<label>39</label>
<institution>Bioquant Center, University of Heidelberg</institution>
, Im Neuenheimer Feld 267, Heidelberg 69120,
<country>Germany</country>
</aff>
<aff id="a40">
<label>40</label>
<institution>Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ)</institution>
, Im Neuenheimer Feld 280, Heidelberg 69120,
<country>Germany</country>
</aff>
</contrib-group>
<author-notes>
<corresp id="c1">
<label>a</label>
<email>ivo.gut@cnag.crg.eu</email>
</corresp>
<fn id="n1">
<label>*</label>
<p>These authors contributed equally to this work.</p>
</fn>
<fn id="n2">
<label></label>
<p>These authors jointly supervised this work.</p>
</fn>
<fn id="n3">
<label></label>
<p>Present address: DNAnexus, 1975 W El Camino Real, Suite 101, Mountain View, California 94040, USA.</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>09</day>
<month>12</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<volume>6</volume>
<elocation-id>10001</elocation-id>
<history>
<date date-type="received">
<day>16</day>
<month>06</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>23</day>
<month>10</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2015, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.</copyright-statement>
<copyright-year>2015</copyright-year>
<copyright-holder>Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<pmc-comment>author-paid</pmc-comment>
<license-p>This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
</license-p>
</license>
</permissions>
<abstract>
<p>As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here, using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100 × are both beneficial, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and the lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and that remedying them has an immediate positive impact on mutation detection accuracy.</p>
</abstract>
<abstract abstract-type="web-summary">
<p>
<inline-graphic id="i1" xlink:href="ncomms10001-i1.jpg"></inline-graphic>
Cancer genetics has benefited from the advent of next generation sequencing, yet a comparison of sequencing and analysis techniques is lacking. Here, the authors sequence a normal-tumour pair and perform data analysis at multiple institutes and highlight some of the pitfalls associated with the different methods.</p>
</abstract>
</article-meta>
</front>
<body>
<p>The International Cancer Genome Consortium (ICGC) is characterizing over 25,000 cancer cases from many forms of cancer
<xref ref-type="bibr" rid="b1">1</xref>
. Currently, there are 74 projects supported by different national and international funding agencies. As innovation and development of sequencing technologies have driven prices down and throughput up, projects have been transitioning from exome to whole-genome sequencing (WGS) of tumour and matched germline samples, supplemented by transcript and methylation analyses when possible, facilitating the discovery of new biology for many different forms of cancer
<xref ref-type="bibr" rid="b2">2</xref>
<xref ref-type="bibr" rid="b3">3</xref>
<xref ref-type="bibr" rid="b4">4</xref>
<xref ref-type="bibr" rid="b5">5</xref>
<xref ref-type="bibr" rid="b6">6</xref>
<xref ref-type="bibr" rid="b7">7</xref>
<xref ref-type="bibr" rid="b8">8</xref>
<xref ref-type="bibr" rid="b9">9</xref>
<xref ref-type="bibr" rid="b10">10</xref>
. However, as data from the different projects began to be collected and centralized (
<ext-link ext-link-type="uri" xlink:href="https://dcc.icgc.org/">https://dcc.icgc.org/</ext-link>
), it became apparent that there are marked differences in how teams generate and analyse WGS data. On the basis of cost, capacity and analytical experience, it was initially determined that comprehensive identification of tumour-specific somatic mutations requires WGS with a minimum of 30 × sequence coverage of each of the tumour and normal genomes
<xref ref-type="bibr" rid="b11">11</xref>
with paired reads on the order of 100–250 bp in length, depending on the platform. However, from project to project the sample preparation, coverage of tumour and normal samples and read lengths vary. Even more variability exists in the approaches to identify differences between tumour and normal genomes, evidenced by the many strategies developed to identify somatic single-base mutations (SSM)
<xref ref-type="bibr" rid="b12">12</xref>
, somatic insertion/deletion mutations (SIM) and larger structural changes (rearrangements and chromosome segment copy number changes)
<xref ref-type="bibr" rid="b5">5</xref>
.</p>
<p>This variation makes comparison of mutation calls across cancers challenging because of the unknown contributions of individual pipeline components and parameters to the accuracy of the calls. Benchmark data sets and analytical tools
<xref ref-type="bibr" rid="b13">13</xref>
<xref ref-type="bibr" rid="b14">14</xref>
<xref ref-type="bibr" rid="b15">15</xref>
<xref ref-type="bibr" rid="b16">16</xref>
<xref ref-type="bibr" rid="b17">17</xref>
have been developed for variant calling on normal genomes, while those for cancer have largely focused on SSM detection from exome sequencing
<xref ref-type="bibr" rid="b12">12</xref>
<xref ref-type="bibr" rid="b18">18</xref>
. Benchmarking of mutation calling from exome data from The Cancer Genome Atlas has raised concerns about biased inferences and highlights the need for benchmark data sets
<xref ref-type="bibr" rid="b19">19</xref>
. In our study we set out to investigate the factors that need to be considered to generate high-quality whole-genome sequence data and high-confidence variant calls from tumour-normal pairs, including new sources of bias and pitfalls not encountered in exome data.</p>
<p>We explored several benchmarking strategies. First, we evaluated somatic mutation calling pipelines using a common set of 40 × WGS reads of average quality corresponding to a case of chronic lymphocytic leukaemia (CLL). In a second benchmark, we evaluated both sequencing methods and somatic mutation calling pipelines using matched samples from a case of medulloblastoma (MB, a malignant pediatric brain tumour arising in the cerebellum
<xref ref-type="bibr" rid="b20">20</xref>
<xref ref-type="bibr" rid="b21">21</xref>
) from the ICGC PedBrain Tumor project. Both cancers exhibit a high degree of tumour purity (95–98%). For each case, we made available unaligned sequence reads of a tumour (∼40 × genome coverage) and its corresponding normal genome (∼30 × coverage) to members of the ICGC consortium, who then returned somatic mutation calls. In contrast to the approach taken in a recent benchmark of SSM calling using three simulated tumour genomes
<xref ref-type="bibr" rid="b22">22</xref>
, we have used the sequence from a real tumour-normal pair and made a concerted effort to manually curate both SSMs and SIMs detectable at a sequencing depth 8–10 times in excess of the standard amount (∼300 ×). We argue that real, not simulated, mutations are more useful for dissecting performance of mutation callers with respect to real genome-wide mutational signatures, and methods for detecting insertion–deletion mutations, an even bigger challenge to somatic mutation callers, must also be benchmarked. Our study has two main results: one, we identify outstanding issues in somatic mutation analysis from WGS data and begin to formulate a set of best practices to be adopted more widely by genome researchers, and, two, we provide two benchmark data sets for testing or developing new somatic mutation calling pipelines.</p>
<sec disp-level="1">
<title>Results</title>
<sec disp-level="2">
<title>WGS data generation</title>
<p>We conducted a first benchmark exercise using WGS data generated from a CLL tumour-normal pair and then a second using a case of MB. Both tumour types were expected to have relatively low mutational load and not very pervasive structural changes: CLL has a few known translocations and large copy number variants, and MB exhibits a large degree of tetraploidy but is otherwise typically free of complex rearrangements. The quality of the CLL data was below today's standards but typical for the time it was produced, while the MB library preparation and the resulting sequence were of high quality. The validation strategies also differed. For CLL we chose to validate submitted mutations by target capture and sequencing with two independent platforms (MiSeq and IonTorrent). This approach was limited by technical issues inherent to target capture and sequencing on these other platforms, which led to a low rate of independently validated mutations. Moreover, the real false-negative (FN) rates were underestimated because of the limited coverage provided to participants. For the MB data set, ∼300 × in sequence reads were generated by five different sequencing centres, which we combined and used to create a curated set of mutations. The MB tumour presented with a tetraploid background combined with other changes of ploidy in chromosomes 1, 4, 8, 15 and 17 (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 1</xref>
), giving us the opportunity to benchmark performance at lower mutant allele frequencies. Moreover, the high depth and relatively unbiased coverage of the genome enabled higher sensitivity in mutation detection leading to a more inclusive set of curated mutations. For these reasons, we present here only the results for the MB benchmark.</p>
</sec>
<sec disp-level="2">
<title>Evaluation of sequencing library construction methods</title>
<p>Several different protocols were used for generating sequencing libraries at the five contributing sequencing centres, which varied in their reagent supplier, methods for selecting the fragment size of library inserts and use of amplifying PCR steps (
<xref ref-type="table" rid="t1">Table 1</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 1</xref>
). Interestingly, these differences resulted in marked variation in the evenness of coverage genome-wide as well as in key regions of interest such as exons. PCR-free libraries were found to give the most even coverage, with very little effect of GC content on coverage levels (
<xref ref-type="fig" rid="f1">Fig. 1a</xref>
). Evenness is directly correlated with coverage of the genome: when we downsampled each data set to an average 30 × tumour and control coverage (libraries sequenced to less than 28 ×, L.G and L.H, were excluded from further analysis), we see that in the best-performing libraries (L.A and L.B controls), 73–74% of the genome was covered at or above 25 ×, while the worst-performing library (L.F tumour) had only 46% of the genome covered at this level (
<xref ref-type="fig" rid="f1">Fig. 1b</xref>
). In general, the coverage distribution was more even and the percentage of well-covered regions was higher in the control libraries compared with the tumour libraries, reflecting the different copy number states of the tumour. An unusual pattern of GC content distribution in control library E, however, meant that this was slightly worse than its tumour counterpart. The percentage of exonic regions covered at ≤10 × (that is, likely insufficient to accurately call mutations) also varied, with a range from less than 1% ‘missing' in the best-performing libraries to more than 10% in the worst (
<xref ref-type="fig" rid="f1">Fig. 1c</xref>
), demonstrating that sequencing library preparation performance can have a significant impact on the ability to identify coding variants in downstream analyses. Performance in other regions of interest, such as enhancers and untranslated repeats, was similarly variable (
<xref ref-type="fig" rid="f1">Fig. 1c</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 2</xref>
).</p>
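<p>The coverage summaries used above (fraction of the genome at or above 25 ×, fraction of exonic positions at or below 10 ×) can be illustrated with a minimal sketch. The function names and the toy depth values are hypothetical and are not taken from the study; real per-base depths would come from the aligned reads of each library.</p>
<preformat>
```python
def fraction_at_or_above(depths, threshold):
    """Fraction of positions whose read depth is at least the threshold."""
    depths = list(depths)
    if not depths:
        raise ValueError("no positions supplied")
    return sum(1 for d in depths if d >= threshold) / len(depths)


def fraction_at_or_below(depths, threshold):
    """Fraction of positions whose read depth is at most the threshold."""
    depths = list(depths)
    if not depths:
        raise ValueError("no positions supplied")
    return sum(1 for d in depths if threshold >= d) / len(depths)


# Hypothetical per-base depths for a tiny region (illustration only).
toy_depths = [31, 28, 25, 24, 9, 10, 40, 26]
well_covered = fraction_at_or_above(toy_depths, 25)    # analogue of Fig. 1b
poorly_covered = fraction_at_or_below(toy_depths, 10)  # analogue of Fig. 1c
```
</preformat>
<p>Applied to whole-genome or exon-restricted depth vectors after downsampling to a common mean coverage, the same two ratios would allow libraries to be compared on evenness rather than on total yield.</p>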
</sec>
<sec disp-level="2">
<title>Evaluation of sequencing depth</title>
<p>Combining the sequencing data generated from each participating centre gave us the opportunity to investigate a tumour-normal pair with very deep coverage. After merging each of the individual pairs, the combined tumour coverage was 314 ×, and the control 272 ×. To remove already identified artefacts (
<xref ref-type="supplementary-material" rid="S1">Supplementary Note 1</xref>
), we excluded the tumour library from centre E and the slightly contaminated control library from centre B. For comparison of mutation-calling metrics at a range of coverage levels, the combined tumour and normal sets were randomly serially downsampled to 250, 200, 150, 100, 50, 30 and 20 × coverage and then analysed using the standard DKFZ pipeline (MB.I,
<xref ref-type="supplementary-material" rid="S1">Supplementary Methods</xref>
). The total number of mutations increases when going from 30 × to 50 × and further to 100 × coverage; however, no striking increase is seen above this level (at 100 ×, 95% of the maximum number of mutations is detected, in contrast to only 77% at 30 ×;
<xref ref-type="fig" rid="f2">Fig. 2a</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 2</xref>
). While the majority of mutations were called at the 30 × level, there were some notable differences in the number and type of mutations detected as the coverage increased. The sensitivity for detecting mutations with lower mutant allele frequencies (that is, subclonal alterations and/or events happening after polysomic changes but also major somatic mutations in samples with low tumour cell content) was much greater with higher coverage, as seen from density plots of mutations versus allele frequency (AF,
<xref ref-type="fig" rid="f2">Fig. 2b</xref>
). This effect was even more striking when considering mutation calls per chromosome, which clearly shows the difference between low and high coverage when looking for late-occurring mutations after whole-chromosome copy number changes (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 3</xref>
).</p>
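<p>One way to picture random serial downsampling is as a chain of nested random subsets, each drawn from the previous, larger one. The sketch below is an illustration only: it assumes equal-length reads, so that coverage is proportional to read count, and the function name, the seed and the toy read identifiers are hypothetical rather than part of the pipeline used in the study.</p>
<preformat>
```python
import random


def serial_downsample(read_ids, start_coverage, targets, seed=0):
    """Return nested random subsets of reads, one per target coverage.

    Each subset is sampled from the previous (higher-coverage) subset,
    so lower-coverage sets are always contained in higher-coverage ones.
    """
    rng = random.Random(seed)
    current = list(read_ids)
    current_cov = float(start_coverage)
    subsets = {}
    for target in sorted(targets, reverse=True):
        if target > current_cov:
            raise ValueError("target exceeds available coverage")
        # Assumption: coverage scales linearly with the number of reads.
        keep = round(len(current) * target / current_cov)
        current = rng.sample(current, keep)
        current_cov = float(target)
        subsets[target] = current
    return subsets


# Toy example: 3,140 reads standing in for a 314 x merged tumour data set.
toy_reads = ["read%d" % i for i in range(3140)]
levels = serial_downsample(toy_reads, 314, [250, 200, 150, 100, 50, 30, 20])
```
</preformat>
<p>Because each level is drawn from the one above it, differences in mutation calls between levels reflect the loss of reads and not a fresh random draw from the full data set.</p>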
</sec>
<sec disp-level="2">
<title>Effect of tumour purity on mutation calling</title>
<p>Since MBs tend to show a very high tumour cell content (usually above 95%, and for this sample ∼98%, because of their nature as masses of small, round, tightly packed tumour cells), the high coverage data set also provided an opportunity to model the dynamics of mutation calling with increasing coverage and with increasing proportions of ‘contaminating' normal tissue. We found that the mutation calls with increasing coverage were accurately modelled with a Michaelis–Menten equation, reaching ‘saturation' (no or minimal additional mutations called as coverage increases) at around 100 × (
<xref ref-type="fig" rid="f2">Fig. 2c</xref>
). The impact of normal cells on SSM detection could be thought of as a ‘mixed-type inhibition' of mutation detection sensitivity, which we examined by mixing increasing proportions of normal sequence reads (17, 33 and 50%) into the tumour data set and re-calling mutations. Each curve displayed the same plateau after ∼100 × as the pure tumour sample; however, the addition of any normal content meant that the maximum mutation count from the pure tumour could not be reached, even at 250 × total coverage. At 100 ×, the detected proportions of mutation calls from the pure sample were 95%, 90% and 85%, respectively, for 17%, 33% and 50% ‘contamination' (
<xref ref-type="fig" rid="f2">Fig. 2c</xref>
). At lower coverage, the normal cell content had a proportionally larger impact. At 30 ×, only 92%, 83% or 68% of the calls from the 30 × pure sample were called when adding 17%, 33% or 50% normal reads, respectively (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 2</xref>
). For SIMs called using the DKFZ pipeline, a different picture was observed. SIM calling at present likely suffers at least as much from low specificity as from low sensitivity, as indicated by the fact that increasing coverage actually reduces the number of called variants (that is, the FP rate decreases;
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 4</xref>
).</p>
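<p>The saturation behaviour described above can be sketched with the Michaelis–Menten form y = vmax · c / (k + c), where c is coverage and y is the fraction of the maximum mutation count. The example below is not the fitting procedure used in the study: as a simplifying assumption it solves the curve exactly through the two fractions quoted in the text for the pure tumour (77% at 30 × and 95% at 100 ×), and all function and variable names are hypothetical.</p>
<preformat>
```python
def fit_two_points(c1, y1, c2, y2):
    """Solve y = vmax * c / (k + c) exactly through two points.

    Uses the double-reciprocal form 1/y = (k / vmax) * (1 / c) + 1 / vmax,
    which is linear in 1/c.
    """
    slope = (1 / y1 - 1 / y2) / (1 / c1 - 1 / c2)
    intercept = 1 / y1 - slope / c1
    vmax = 1 / intercept
    k = slope * vmax
    return vmax, k


def saturation(coverage, vmax, k):
    """Predicted fraction of the maximum mutation count at a given coverage."""
    return vmax * coverage / (k + coverage)


# Fractions of the maximum mutation number quoted in the text.
vmax, k = fit_two_points(30, 0.77, 100, 0.95)
predicted_250 = saturation(250, vmax, k)
```
</preformat>
<p>Under this two-point assumption the half-saturation constant k falls well below 30 ×, which is consistent with most mutations already being called at standard coverage and with the curve flattening at around 100 ×. A fit to all downsampling levels would be needed to estimate the parameters properly.</p>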
</sec>
<sec disp-level="2">
<title>Effect of tumour to normal sequencing depth ratio</title>
<p>We investigated the effect of tumour-normal coverage ratios on variant calling to assess whether increasing coverage of the tumour alone is sufficient to increase mutation detection sensitivity. The 250 × tumour genome was therefore compared with control genomes at 200, 150, 100, 50 and 30 × coverages. Down to the 150 × level, few differences are seen in the mutations called when compared with the 250 × /250 × standard (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 5a,b</xref>
). At lower control coverage levels, a notable increase is observed in the overall number of mutations reported because of a sharp rise in those called with a low allele fraction. Since these mutations are not called in the 250 × versus 250 × set, it is almost certain that they are sequencing artefacts arising in a very small proportion of calls, which appear to be somatic when the control coverage is insufficient to show the same phenomenon. These new calls are dominated by a T>G base change arising in a particular sequence context (GpTpG,
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 5c</xref>
). Indeed, performing a motif analysis on the wider context of these changes revealed that the majority occur at a thymine base within a homopolymer run of guanines (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 5d</xref>
). Keeping the ratio of tumour:normal coverage closer to one therefore appears to play a role in maintaining the accuracy of mutation calling with standard pipelines, since any systematic artefacts are then balanced out in both the tumour and control data sets. While it may be possible to apply additional filters to account for the new false positives (FPs) seen in unbalanced comparisons, this would potentially come at the cost of a reduced sensitivity for detecting true mutations with low allele frequencies (that is, tumour subpopulations), which are of particular interest when increasing sequencing coverage depth.</p>
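<p>The sequence-context observation above suggests a simple diagnostic: flag T>G calls whose immediate context is GpTpG. The sketch below is a hedged illustration, not a filter used by any of the pipelines in this study. It considers the forward strand only, uses 0-based positions, and its function names are hypothetical.</p>
<preformat>
```python
def trinucleotide_context(reference, pos):
    """Return the base at 0-based pos plus one flanking base on each side."""
    if pos >= 1 and len(reference) - 1 > pos:
        return reference[pos - 1:pos + 2].upper()
    raise ValueError("position has no flanking base on both sides")


def is_t_to_g_in_gtg(reference, pos, alt):
    """True for a T>G call whose forward-strand context is GpTpG."""
    return alt.upper() == "G" and trinucleotide_context(reference, pos) == "GTG"


# Toy reference with a thymine inside a run of guanines (illustration only).
toy_reference = "AGGTGGA"
flagged = is_t_to_g_in_gtg(toy_reference, 3, "G")
```
</preformat>
<p>Counting flagged calls at each control coverage level would show whether the excess of low-allele-fraction calls is concentrated in this context. As noted above, removing such calls outright could also discard true low-frequency mutations.</p>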
</sec>
<sec disp-level="2">
<title>Curation of a Gold Set of somatic mutations</title>
<p>We used the high-coverage (314 × :272 ×) data set from the sequencing benchmark to curate a Gold Set of verified somatic mutations (Methods). Gold Set mutations were classified (
<xref ref-type="table" rid="t2">Table 2</xref>
) according to the potential issues that may lead to an incorrect call: Tier 1 mutations have a mutant AF≥10%, Tier 2 mutations have AF≥5%, Tier 3 includes all verified mutations supported by unambiguous alignments, while Tier 4 includes additional mutations with more complicated or ambiguous local alignments and Tier 5 includes those with unusually high or low read depth.</p>
<p>The MB Gold Set had a total of 1,620 bona fide mutation calls across all tiers (
<xref ref-type="table" rid="t2">Table 2</xref>
), with 962, 1,101, 1,255 and 1,263 SSMs in Tiers 1, 2, 3 and 4, respectively, and 337 and 347 SIMs in Tiers 1 and 4, respectively. The mutational load of this tumour was ∼0.5 mutations per Mbp. Of these mutations, eleven were exonic SSMs (seven missense, three synonymous and one early stop) and one was a splice site mutation. We found that 32% of SSMs are in RepeatMasked sequence, 9% in tandem repeats (4% in homopolymer tracts) and 4.4% adjacent to tandem repeats. About a quarter of SSMs (27%) exhibit a mutant AF in the tumour of less than 10%, with 6% being very close to the alternate AF in the normal sample (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 3</xref>
). For curated SIMs, 83% fall in tandem repeats (71% in homopolymers;
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 4</xref>
).</p>
</sec>
<sec disp-level="2">
<title>Evaluation of somatic mutation calling pipelines</title>
<p>A submission and revision process was set up for the MB benchmark with guidelines for the mutation call format. Participating centres were provided with the best-quality sequence data set from the sequencing benchmark (L.A). Using these FASTQs, they produced SSM and SIM calls and submitted them for evaluation. We received 18 SSM and 16 SIM submissions.</p>
<p>Submissions were compared among themselves and to the Gold Set.
<xref ref-type="fig" rid="f3">Figure 3</xref>
shows the overlap of mutation call sets (private calls are shown in
<xref ref-type="supplementary-material" rid="S1">Supplementary Figs 6</xref>
and
<xref ref-type="supplementary-material" rid="S1">7</xref>
). We found that only 205 SSMs and one SIM were agreed upon by all submitters (
<xref ref-type="fig" rid="f3">Fig. 3</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Figs 6</xref>
and
<xref ref-type="supplementary-material" rid="S1">7</xref>
). Agreement among SSM sets was much greater than agreement among SIMs in general. In
<xref ref-type="fig" rid="f4">Fig. 4</xref>
we show the precision versus recall of the submitted mutation calls. Each letter corresponds to a submission compared with the Gold Set of Tiers 1, 2 or 3, with the comparison against Tier 1 (mutant AF≥10%) having the highest recall in the plot, Tier 2 (AF≥5%) the second highest and Tier 3 (AF>∼2%) the lowest. Precision is always calculated against Tier 4, which also includes mutations that are complex or have ambiguous positions.</p>
<p>We observed a cluster of well-performing SSM submissions with high values for both precision and recall. Those with the highest F1 scores (
<xref ref-type="table" rid="t3">Table 3</xref>
) were MB.Q and MB.J, pipelines that combine two different somatic mutation callers: qSNP
<xref ref-type="bibr" rid="b23">23</xref>
with GATK
<xref ref-type="bibr" rid="b24">24</xref>
and SGA
<xref ref-type="bibr" rid="b25">25</xref>
with FreeBayes
<xref ref-type="bibr" rid="b26">26</xref>
(
<xref ref-type="supplementary-material" rid="S1">Supplementary Methods</xref>
). Submissions with a high number of calls did not necessarily achieve higher recall; about two-thirds of all mutations (or >80% of Tier 1 mutations) can be detected without making many false-positive (FP) calls, after which increases in recall are accompanied by precipitous declines in precision. This is because, at 40 × depth, a fraction of the curated mutations is impossible to detect. Likewise, the submission with the highest precision (MB.L1) was not much more precise than MB.B or MB.Q, which both found over twice as many true mutations (
<xref ref-type="table" rid="t3">Table 3</xref>
). For SIMs (
<xref ref-type="fig" rid="f4">Fig. 4b</xref>
), some submissions achieved precisions greater than 0.9; however, their sensitivities were still low. The highest F1 score (0.65 for MB.I) is noticeably lower than that obtained for SSMs (0.79). Overall, SIM detection appears to be more challenging, with performance lagging behind that of SSM detection.</p>
</sec>
<sec disp-level="2">
<title>Correlation of pipeline components with shared mutations</title>
<p>Could those submissions that cluster together in terms of precision and recall be calling the same mutations or have similarities in terms of their pipelines? Using a measure of pairwise overlap, the Jaccard index, we clustered the submissions and display the results as both a heatmap and hierarchical clustering in
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 8</xref>
. Correspondence analysis gave similar results. We also broke them down by true positives (TPs) or FPs (
<xref ref-type="supplementary-material" rid="S1">Supplementary Figs 9–12</xref>
) and clustered the pipelines based on shared components or parameters (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 13</xref>
, input data in
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 1</xref>
). We find that when submissions agree, they tend to agree on true mutations, and when they disagree, these are more likely to be FPs or true negatives. Some notable exceptions can be observed among the FP SIMs, where MB.L1 and MB.L2 cluster as do MB.F and MB.N. These concordant FPs may indicate incompleteness of the Gold Set and/or similarities in the pipelines. In this case, MB.L1 is a filtered subset of MB.L2, explaining the high degree of overlap. MB.F and MB.N both use Strelka
<xref ref-type="bibr" rid="b27">27</xref>
to call SIMs, possibly explaining the overlap in FPs. Indeed, some overlap of MB.F and MB.N FP SIMs is seen with MB.L2, which also uses Strelka. For TP SIMs, pipelines that share components tend to have higher overlap, for example, among Strelka calls or among GATK SomaticIndelDetector calls. Sometimes, we observe higher Jaccard index values for pipelines using different software, for example Platypus
<xref ref-type="bibr" rid="b28">28</xref>
and Atlas-indel
<xref ref-type="bibr" rid="b29">29</xref>
, which are two of the most sensitive mutation callers. There is much more concordance among SSM calls; therefore, trends are harder to see among the FPs. Logically, SSM submissions with the highest F1 scores have the highest Jaccard indices.</p>
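<p>The pairwise overlap measure used for this clustering can be illustrated with a minimal sketch. It assumes each submission is reduced to a set of hashable call identifiers (for example, chromosome and position tuples); the function names and the toy data are hypothetical and are not taken from the pipelines described here.</p>

```python
from itertools import combinations


def jaccard(calls_a, calls_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(calls_a), set(calls_b)
    union = a.union(b)
    if not union:
        return 0.0  # two empty call sets: define the overlap as zero
    return len(a.intersection(b)) / len(union)


def pairwise_jaccard(submissions):
    """Return {(name_1, name_2): Jaccard index} for every pair of submissions."""
    return {
        (n1, n2): jaccard(submissions[n1], submissions[n2])
        for n1, n2 in combinations(sorted(submissions), 2)
    }


# Hypothetical toy call sets keyed by submission label.
toy = {
    "MB.X": {("1", 100), ("1", 200), ("2", 50)},
    "MB.Y": {("1", 200), ("2", 50), ("3", 75)},
}
print(pairwise_jaccard(toy))
```

<p>The resulting matrix of pairwise values is the kind of input that a heatmap or hierarchical clustering would then take.</p>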
</sec>
<sec disp-level="2">
<title>Genomic or alignment features affecting accuracy</title>
<p>We asked what genomic features or sample, sequence or alignment features might correlate with better or worse performance. We used ‘rainfall’ plots that show density of features along a genome. By plotting the distance from the previous mutation (
<xref ref-type="fig" rid="f5">Fig. 5</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 14</xref>
), we can observe clustering or hotspots. SSM and SIM calls are coloured according to TP, FP and FN status. The Gold Set exhibits no mutational hotspots; therefore, any deviation is likely to be caused by a feature of the pipeline. Indeed, we detect quite different patterns: MB.Q (
<xref ref-type="fig" rid="f5">Fig. 5a</xref>
) and MB.B (
<xref ref-type="fig" rid="f5">Fig. 5b</xref>
) do not display any notable hotspots, while MB.C (
<xref ref-type="fig" rid="f5">Fig. 5c</xref>
), for example, has many FP mutation calls in centromeric regions. MB.D and other call sets display varying degrees of this problem, which may arise if alignment of reads is performed to the GRCh37 reference without the d5 decoy sequence and/or no ultrahigh-signal blacklist is employed. MB.K overcalls (
<xref ref-type="fig" rid="f5">Fig. 5d</xref>
) but a more subtle pattern is also apparent: higher ploidy chromosomes (for example, 17) display a greater density of calls and lower ploidy chromosomes (8, 15, X and Y) demonstrate a lower density of calls, presumably because of coverage.</p>
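<p>The quantity plotted in a rainfall plot, the distance from each call to the previous call on the same chromosome, can be computed as in the following sketch. The input representation (chromosome, position) and the function name are assumptions made for illustration only.</p>

```python
from collections import defaultdict


def rainfall_distances(calls):
    """Return (chrom, pos, distance to the previous call on that chromosome).

    The first call on each chromosome has no predecessor and is skipped.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in calls:
        by_chrom[chrom].append(pos)

    points = []
    for chrom, positions in by_chrom.items():
        positions.sort()
        for previous, current in zip(positions, positions[1:]):
            points.append((chrom, current, current - previous))
    return points


# Hypothetical calls; clusters of small distances would indicate a hotspot.
example = [("1", 100), ("1", 350), ("2", 10), ("1", 150)]
print(rainfall_distances(example))
```

<p>Plotting these distances on a logarithmic axis against genomic position makes dense clusters of calls stand out as downward spikes.</p>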
<p>Other genomic features such as tandem or interspersed repeats, as well as some key sample/sequence/alignment features, also create problems for mutation callers but are not detectable at the chromosomal scale. We annotated the Gold Set and all submitted SSM and SIM calls for each feature, indicating membership (Boolean flags) or a score (for continuous characteristics;
<xref ref-type="supplementary-material" rid="S1">Supplementary Tables 3</xref>
and
<xref ref-type="supplementary-material" rid="S1">4</xref>
). The frequencies or mean scores, respectively, were computed for three subsets of each submission (TPs, FPs or FNs) and for the Gold Sets. To highlight problematic features for each submission, the differences with respect to the Gold Set were computed and multiplied by the FP or FN rate, accordingly. The problematic features of FPs in the MB SSM data set are shown as a heat map (
<xref ref-type="fig" rid="f6">Fig. 6</xref>
). While nearly all sets of FPs are enriched in low-frequency mutations, which are harder to discriminate from background noise (also reflected by the ‘same AF’ metric), some call sets (MB.K, MB.H, MB.C, MB.D and MB.M) do less well. MB.H also seems to have a problem with segmental duplications and multimappable regions, and MB.D with duplications only. MB.K and MB.M, to a lesser extent, are enriched in SSMs located in tandem repeats, simple repeats and homopolymers. MB.C has issues with FPs falling in blacklisted regions, specifically centromeric and simple repeats. The three submissions with fewer FPs immediately adjacent to tandem repeats than in the Gold Set (MB.H, MB.C and MB.D) do not use the Burrows-Wheeler Aligner (BWA) for the primary alignment step; instead, the mappers Novoalign or GEM are used, or the detection method is not based on mapping (SMUFIN
<xref ref-type="bibr" rid="b30">30</xref>
). The corresponding heatmaps for MB SSM FNs, and MB FP and FN SIMs are shown in
<xref ref-type="supplementary-material" rid="S1">Supplementary Figs 15–17</xref>
. In general, tandem repeats, segmental duplications and interspersed repeats all cause sensitivity issues, with some pipelines more affected than others. The results for SIMs show that SIMs in tandem repeats (the majority being homopolymers) are undercalled, being under-represented in FPs and over-represented in FNs. Interestingly, nested repeats and duplications show the opposite trend, indicating that many FPs likely arise from low mapping quality.</p>
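<p>The feature score described above (difference with respect to the Gold Set, multiplied by the FP or FN rate) can be sketched for a single Boolean feature as follows. This is an illustrative reading of the text, not the code used in the study; in particular, how the error rate itself is defined is left to the caller and is an assumption here.</p>

```python
def feature_score(subset_flags, gold_flags, error_rate):
    """Weighted enrichment of one Boolean feature in an FP (or FN) subset.

    subset_flags: True/False membership of the feature for each call in the
                  FP or FN subset of one submission.
    gold_flags:   the same membership flags for the Gold Set mutations.
    error_rate:   the FP or FN rate of the submission (assumed to be
                  supplied by the caller).
    """
    freq_subset = sum(subset_flags) / len(subset_flags)
    freq_gold = sum(gold_flags) / len(gold_flags)
    return (freq_subset - freq_gold) * error_rate


# Hypothetical example: half of the FPs fall in tandem repeats compared
# with a quarter of the Gold Set, for a submission with an FP rate of 0.5.
score = feature_score([True, True, False, False],
                      [True, False, False, False],
                      0.5)
print(score)
```

<p>For continuous features, the same calculation would use mean scores in place of frequencies.</p>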
<p>Correspondence analysis confirms some of the above findings for MB SSM FPs (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 18</xref>
). MB.C clusters with EncMap, dukeMap and centr, suggesting that MB.C FPs occur in some blacklisted regions. MB.H (and MB.G and MB.O to lesser extent) FPs are associated with segmental duplications. MB.K FPs are associated with tandem repeats (and microsatellites and simple repeats).</p>
</sec>
<sec disp-level="2">
<title>Effect of mapper on mutation calling</title>
<p>The differences between sets of mutations submitted by the participating groups raised questions about the impact of individual pipeline components on the results. The extent of observed pipeline customization (
<xref ref-type="supplementary-material" rid="S1">Supplementary Methods</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 1</xref>
) did not allow for exhaustive testing of all potentially important analysis steps; however, three pipeline components were selected for closer inspection because of their expected high impact: mapper, reference genome build and mutation caller. Four mappers (Novoalign2, BWA, BWA-mem and GEM), two SSM callers (MuTect
<xref ref-type="bibr" rid="b31">31</xref>
and Strelka) and three versions of the human reference genome (b37, b37+decoy and ‘hg19r’, a reduced version of hg19 with unplaced contigs and haplotypes removed) were selected for testing, based on their usage by the benchmarking groups (
<xref ref-type="supplementary-material" rid="S1">Supplementary Methods</xref>
for software versions and settings). To limit the effect of non-tested software on the produced mutation sets, a simple SSM-calling pipeline was established. First, we compared the effect of the mapper with each of the SSM callers. With a single SSM caller employed, a considerable fraction of unfiltered SSM calls for a given mapper (0.22–0.69, depending on the mapper–caller combination) is not reproducible by that caller with any other mapper (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 19</xref>
). When compared with the Gold Set (Tier 3 SSMs), calls supported by a single mapper are almost exclusively FPs (precision <0.02). On the other hand, a large majority of calls supported by all four mappers are TPs (with precision ranging from 0.87 for MuTect to 0.99 for Strelka).</p>
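<p>The stratification of calls by the number of supporting mappers can be sketched as below. The sketch assumes one call set per mapper for a fixed caller and a Gold Set given as a set of the same call identifiers; all names and data are hypothetical.</p>

```python
from collections import Counter


def precision_by_support(call_sets, gold):
    """Precision of calls grouped by how many call sets contain them.

    call_sets: {mapper name: iterable of call identifiers}
    gold:      set of call identifiers regarded as true mutations
    Returns {support level: fraction of calls at that level found in gold}.
    """
    support = Counter()
    for calls in call_sets.values():
        support.update(set(calls))  # count each call once per mapper

    totals, hits = Counter(), Counter()
    for call, n_mappers in support.items():
        totals[n_mappers] += 1
        if call in gold:
            hits[n_mappers] += 1
    return {n: hits[n] / totals[n] for n in sorted(totals)}


# Hypothetical toy data: integers stand in for (chromosome, position) keys.
toy_sets = {"mapper1": {1, 2, 3}, "mapper2": {2, 3, 4}, "mapper3": {3, 5}}
toy_gold = {2, 3}
print(precision_by_support(toy_sets, toy_gold))
```

<p>In this toy example, calls private to one mapper have a precision of zero, while calls shared by two or three mappers are all in the Gold Set, mirroring the direction of the trend reported above.</p>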
</sec>
<sec disp-level="2">
<title>Effect of primary mutation caller on mutation calling</title>
<p>Similar trends are observed when SSM callers are compared while holding the mapper constant (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 5</xref>
). A sizable fraction (0.22–0.87, depending on the mapper) of unfiltered SSM calls for any given mapper–caller combination is not reproducible by the other caller on the same alignment file. Remarkably, in the case of Novoalign2, the same alignment file leads to the most somatic calls and the lowest overall precision when used with MuTect, but the fewest somatic calls and highest overall precision when used with Strelka. When compared with the Gold Set, calls private to a single caller appear to be mostly FPs, with precision ranging from 0.01 to 0.05. Calls supported by both callers prove to be mostly correct (with precision between 0.89 and 0.93;
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 6</xref>
). The consensus sets seem to be robust—considerably improving the precision rates while only minimally lowering the sensitivity. The results of reference genome choice and a detailed examination of the alignment characteristics of the different aligners are presented in
<xref ref-type="supplementary-material" rid="S1">Supplementary Note 1</xref>
.</p>
</sec>
<sec disp-level="2">
<title>Improvement of pipelines using the benchmark data set</title>
<p>As a demonstration of the utility of the benchmark data set to improve pipelines, we set out to improve the MB.F pipeline further (the MB.E pipeline, which uses SomaticSniper, had already been replaced by the MB.F pipeline, which uses Strelka, based on the analysis of the CLL benchmark results). Using the Gold Set to devise and tune a set of filters for various metrics (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 7</xref>
), including mapping quality and distance from the end of the read alignment block (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 20</xref>
), we are able to outperform (in terms of F1) all other MB SSM submissions (
<xref ref-type="fig" rid="f7">Fig. 7a</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 8</xref>
). Despite choosing reasonably conservative thresholds, we were still worried about the possibility of overfitting; thus, we tested the adjusted pipeline on the CLL benchmark data set. We achieved similar results (
<xref ref-type="fig" rid="f7">Fig. 7b</xref>
), demonstrating that the filter settings work well on at least one other cancer type. Removal of the repeat copy filter in Strelka also improves both MB and CLL SIM sensitivity without greatly affecting precision (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 21</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 8</xref>
).</p>
<p>Additional information regarding the CLL benchmark is provided in
<xref ref-type="supplementary-material" rid="S1">Supplementary Note 1</xref>
,
<xref ref-type="supplementary-material" rid="S1">Supplementary Figs 22–36</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Tables 9–13</xref>
, which present information analogous to that presented here for MB. Additional sequencing analyses are described in
<xref ref-type="supplementary-material" rid="S1">Supplementary Note 1</xref>
,
<xref ref-type="supplementary-material" rid="S1">Supplementary Figs 37–40</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Tables 14</xref>
and
<xref ref-type="supplementary-material" rid="S1">15</xref>
. Controls for genome reference builds and effect of mapper choice are presented in
<xref ref-type="supplementary-material" rid="S1">Supplementary Note 1</xref>
,
<xref ref-type="supplementary-material" rid="S1">Supplementary Figs 41–47</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 16</xref>
. All pipeline details (as given by each submitter) are presented in
<xref ref-type="supplementary-material" rid="S1">Supplementary Methods</xref>
,
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 48</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Tables 17</xref>
and
<xref ref-type="supplementary-material" rid="S1">18</xref>
.</p>
</sec>
</sec>
<sec disp-level="1">
<title>Discussion</title>
<p>This benchmarking exercise has highlighted the importance of carefully considering all stages of the laboratory and analysis pipelines required to generate consistent and high-quality whole-genome data for cancer analysis. In this study we have isolated and tested individual library construction/sequencing methods and complete analysis pipelines. Analysis pipelines themselves are also multicomponent; therefore, we have also evaluated mappers and two popular mutation callers in isolation.</p>
<p>By preparing libraries and generating sequence from the same MB tumour-normal pair at five different sequencing centres, we obtained results that suggest that PCR-free library preparation protocols should be the method of choice to ensure evenness of coverage, and that a sequencing depth of close to 100 × for both tumour and normal ought to be aimed for (particularly in situations where subclonal mutations or noncoding alterations are suspected to be playing a role). With platforms such as the Illumina HiSeq X now coming online in more centres, such an increase in coverage may be feasible without dramatically increasing costs.</p>
<p>This exercise also afforded a unique opportunity to compare validation schemes for benchmark creation. We found that high-depth (∼300 ×) WGS, in contrast to targeted resequencing, allowed us to assess FN rates more accurately in addition to FP rates, and better enabled us to determine the sweet spots of pipelines.</p>
<p>We have found that, contrary to common perception, identifying somatic mutations, be they SSMs or SIMs, from WGS data is still a major challenge. Calling mutations with different pipelines on differently prepared sequence read sets resulted in a low level of consensus. Using a standard pipeline had the potential of improving on this but still suffered from inadequate controls for library preparation and sequencing artefacts. Using a common high-quality sequence data set yielded higher concordance, but still resulted in substantial discrepancies in somatic mutation call rates and the calls themselves in the hands of different analysis groups. In some cases we were able to identify the Achilles' heels of pipelines for failing to identify or for overcalling mutations as a function of genomic features or properties of sequencing depth, quality or alignment problems. We found that dominating features of both FP and FN SSMs were low coverage in the tumour, and aberrant coverage of tumour and normal. Underlying these artefacts are features such as segmental duplications and centromeric repeats. Use of an appropriate reference sequence (including decoy sequences) and/or good use of blacklists (problematic high-coverage regions or low mappability) can reduce FPs, while only an increase in overall sequencing depth or more sensitive algorithms (MuTect, for example) can address the FN rate. In contrast, we found that the vast majority of curated SIMs fall in simple/tandem repeats, and yet they are often filtered out because of a concern that they may be artefacts. We found little basis for this concern, at least in our data set that came from a no-PCR library. We found that adjustment of filters related to the number of copies of a repeat unit can increase sensitivity for this type of mutation.</p>
<p>Data analysis pipelines are constructed by integrating software from diverse sources. We found that the particular choice of pipeline software is not as critical as how each piece of software is applied and how the results are filtered. In many instances, the approaches used in the different software and the assumptions made therein are not evident to the user and many parts are black boxes. We found that certain combinations show much higher compatibility than others: for example, with Novoalign alignments as input, MuTect produces an SSM call set with the lowest overall precision, while the SSM call set produced by Strelka on the same alignment file has the highest overall precision. Using combinations of tools for the same process step, assuming that a result shared by two different approaches has a higher likelihood of being correct, results in higher accuracy
<xref ref-type="bibr" rid="b32">32</xref>
. Indeed, we found that some of the higher accuracy pipelines utilize consensus of more than one mutation caller. Our controlled experiment intersecting Strelka and MuTect calls bore this out as well.</p>
<p>Recommended checklist for WGS cancer studies:
<list id="l1" list-type="bullet">
<list-item>
<p>PCR-free library preparation</p>
</list-item>
<list-item>
<p>Tumour coverage >100 ×</p>
</list-item>
<list-item>
<p>Control coverage close to tumour coverage (±10%)</p>
</list-item>
<list-item>
<p>Reference genome hs37d5 (with decoy sequences) or GRCh38 (untested)</p>
</list-item>
<list-item>
<p>Optimize aligner/variant caller combination</p>
</list-item>
<list-item>
<p>Combine several mutation callers</p>
</list-item>
<list-item>
<p>Allow mutations in or near repeats (regions of the genome likely to be more prone to mutation)</p>
</list-item>
<list-item>
<p>Filter by mapping quality, strand bias, positional bias, presence of soft-clipping to minimize mapping artefacts</p>
</list-item>
</list>
</p>
<p>To account for many unknowns, variant calling pipeline developers often resort to calibrating their pipelines against known results, for example from a genotyping experiment performed on similar samples. This approach might only have limited validity genome-wide, as genotyping assays are typically biased towards less complex areas of the genome. We show that for cancer WGS experiments our benchmark set has the potential to be an unbiased and powerful calibration tool. The sequencing reads and curated Gold Set mutations described here are available to the research community through the ICGC DACO and the EGA to benchmark and calibrate their pipelines.</p>
<p>The issues that we have addressed in this study must be resolved (and we think they can, with the use of our benchmark data set) before WGS for cancer analysis can be wholly adopted for clinical use. However, we also suggest that further benchmarks be established to resolve even more difficult mutational features that tumour samples and genomes can present. These include, but are not limited to, low sample purity (contamination of tumour cells by normal cells and also the normal contaminated by tumour cells or viral components), subclonality, structural rearrangements and karyotype aberrations. Real cancers are complex and this complexity continues to challenge somatic mutation calling pipelines. In summary, this valuable resource can serve as a useful tool for the comparative assessment of sequencing pipelines, and gives important new insights into sequencing and analysis strategies as we move into the next big expansion phase of the high-throughput sequencing era.</p>
</sec>
<sec disp-level="1">
<title>Methods</title>
<sec disp-level="2">
<title>Patient material</title>
<p>An Institutional Review Board ethical vote (Medical Faculty of the University of Heidelberg) as well as informed consent were obtained according to the ICGC guidelines (
<ext-link ext-link-type="uri" xlink:href="http://www.icgc.org">www.icgc.org</ext-link>
). A limited amount of the original MB DNA can be made available on request.</p>
</sec>
<sec disp-level="2">
<title>Library preparation and sequencing</title>
<p>The libraries were prepared at the different sequencing centres: the National Center for Genome Analysis (CNAG), Barcelona, Spain; the German Cancer Research Center (DKFZ), Heidelberg, Germany; the RIKEN Institute, Tokyo, Japan; the Ontario Institute for Cancer Research (OICR), Toronto, Canada, and the Wellcome Trust Sanger Institute, Hinxton, UK. Some libraries actually comprise a mixture of different libraries (as per the centre's standard protocols); others comprise one library only. An overview of the composition of the different libraries and differences in the library preparation protocols is given in
<xref ref-type="table" rid="t1">Table 1</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 1</xref>
. All samples were sequenced using Illumina technology and chemistry. The majority of reads are 2 × 100 bp in length and are derived from HiSeq2000 or HiSeq2500 sequencers; however, one read set (L.A) also includes a low number of 2 × 250-bp MiSeq reads.</p>
</sec>
<sec disp-level="2">
<title>Comparison of SSM calls</title>
<p>Each of the participating centres performed mutation calling using its respective in-house pipeline (
<xref ref-type="supplementary-material" rid="S1">Supplementary Methods</xref>
). The raw SSM calls were provided in the form of customized Variant Call Format (VCF) files. To provide a fair comparison, only single base point mutations were considered. Two calls were considered identical when both the position and the exact substitution reported were the same. The calls were then sorted according to the number of centres that made each particular call using a custom Perl script. The resulting file was plotted using a custom R script (both available on request).</p>
</sec>
<sec disp-level="2">
<title>Merging of the bam files to get the 300 × files</title>
<p>To create the high coverage ∼300 × bam files, the raw fastq files were aligned using bwa 0.6.2-r126-tpx aln -t 12 -q 20, followed by bwa-0.6.2-tpx sampe -P -T -t 8 -a 1000 -r. The bam files for each centre/library were merged, and duplicates were marked using Picard tools MarkDuplicates Version 1.61. Finally, all merged per-centre bam files were merged using picard-1.95 MergeSamFiles and the header was adjusted using samtools-0.1.19 reheader. Since only reads from different libraries were merged at this step, duplicates were not marked. The coverage was calculated using an in-house tool, taking into account only non-N bases.</p>
</sec>
<sec disp-level="2">
<title>Downsampling of the 300 × files</title>
<p>The ∼300 × bam files were serially downsampled to different coverage levels (250 ×, 200 ×, 150 ×, 100 ×, 50 ×, 30 × and 20 ×) using picard-1.95 DownsampleSam, and the coverage was determined after each step.</p>
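<p>In serial downsampling, the fraction of reads to retain at each step is the target coverage divided by the coverage of the file produced by the previous step. The sketch below computes these fractions under the simplifying assumption that each step reaches its target exactly (in practice the coverage was measured after each step, as stated above). The function name is hypothetical, and the fractions are meant as the kind of value one would pass to a downsampling tool as its retention probability.</p>

```python
def serial_keep_fractions(start_coverage, targets):
    """Return [(target, fraction of reads to keep)] for serial downsampling.

    Each fraction is relative to the output of the previous step, not to
    the original file.
    """
    fractions = []
    current = start_coverage
    for target in targets:
        if target > current:
            raise ValueError("targets must not exceed the current coverage")
        fractions.append((target, target / current))
        current = target  # assumption: the step hit its target exactly
    return fractions


steps = serial_keep_fractions(300, [250, 200, 150, 100, 50, 30, 20])
for target, fraction in steps:
    print(f"{target}x: keep {fraction:.3f} of the reads from the previous step")
```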
</sec>
<sec disp-level="2">
<title>Determination of library GC bias</title>
<p>To determine the GC bias of the libraries, we first created 10 kb windows over the whole genome using bedtools (v2.16.2) makewindows. Then, the GC content for each window was calculated using bedtools (v2.16.2) nuc. Windows containing more than 100 ‘N’ bases were excluded (awk-3.1.6 'BEGIN{FS="\t"}{if ($10 <=100 && $11 <=100) print $1"\t"$2"\t"$3"\t"$5}'). Finally, the coverage for each of the remaining windows was calculated using bedtools (v2.16.2) multicov. Since the total coverage of the different libraries was not the same, the coverage was normalized by dividing the coverage for each window by the mean coverage across all windows for each of the samples. To visualize the GC bias, we then plotted the normalized coverage against the GC content.</p>
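<p>The normalization step can be sketched as follows, assuming the per-window GC fraction and read count have already been extracted into pairs for one sample. The function name and the toy values are hypothetical.</p>

```python
def normalised_gc_profile(windows):
    """Divide each window's coverage by the mean coverage over all windows.

    windows: list of (gc_fraction, read_count) pairs for one sample.
    Returns a list of (gc_fraction, normalised_coverage) pairs, so that a
    value of 1.0 means the window is covered at the sample-wide average.
    """
    mean_coverage = sum(count for _, count in windows) / len(windows)
    return [(gc, count / mean_coverage) for gc, count in windows]


# Hypothetical windows: (GC fraction, reads counted in the window).
toy_windows = [(0.30, 10), (0.45, 20), (0.60, 30)]
print(normalised_gc_profile(toy_windows))
```

<p>Because each sample is scaled by its own mean, profiles from libraries sequenced to different depths become directly comparable when plotted against GC content.</p>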
</sec>
<sec disp-level="2">
<title>Calculation of low coverage in special regions of interest</title>
<p>The regions of interest were defined as previously described
<xref ref-type="bibr" rid="b33">33</xref>
. To determine the percentage of bases covered with fewer than 10 reads, we first determined the coverage over the whole genome in per-base resolution using genomeCoverageBed (v2.16.2) -bga. The resulting coverage file was compressed using bgzip, and an index was produced with tabix-0.2.5 -p bed. We then extracted the coverage for our regions of interest using tabix-0.2.5. From the resulting extracted coverage files, we computed the number of bases covered by a certain number of reads using intersectBed and a custom perl script. This table was then used to determine the percentage of bases covered by ≤10 reads.</p>
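<p>The final percentage can be computed from interval-based coverage records as in this sketch. It assumes half-open intervals of the form (start, end, depth) restricted to the regions of interest; the threshold is a parameter, and the default of 10 with an inclusive comparison is an assumption. The function name is hypothetical.</p>

```python
def percent_low_coverage(intervals, max_depth=10):
    """Percentage of bases whose depth does not exceed max_depth.

    intervals: iterable of (start, end, depth) with half-open coordinates,
               as in BED-style coverage output.
    """
    total_bases = 0
    low_bases = 0
    for start, end, depth in intervals:
        length = end - start
        total_bases += length
        if not depth > max_depth:
            low_bases += length
    if total_bases == 0:
        raise ValueError("no bases in the supplied intervals")
    return 100.0 * low_bases / total_bases


# Hypothetical coverage records for one region of interest.
toy_intervals = [(0, 50, 0), (50, 150, 10), (150, 200, 35)]
print(percent_low_coverage(toy_intervals))
```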
</sec>
<sec disp-level="2">
<title>Extraction of mutation signatures</title>
<p>Mutational catalogues were generated based on the somatic mutations detected in the tumours. The 3′ and 5′ sequence context of all selected mutations was extracted, and the resulting trinucleotides were converted to the pyrimidine context of the substituted base. Considering six basic substitution types with surrounding sequence context, this results in a mutation-type vector of length 96. The mutational catalogue was set up by counting the occurrence of each of these 96 mutation types per sample.</p>
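<p>The construction of the 96-element mutation-type vector can be sketched as follows. The sketch assumes each mutation is supplied as the 5′ base, reference base, alternate base and 3′ base on the reference strand; the label format and function names are illustrative choices, not those of the original analysis.</p>

```python
from collections import Counter
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
# 6 substitution types x 16 flanking contexts = 96 mutation types.
ALL_TYPES = [
    f"{five}[{sub}]{three}"
    for sub in SUBSTITUTIONS
    for five, three in product("ACGT", repeat=2)
]


def mutation_type(five_prime, ref, alt, three_prime):
    """Label a substitution in the pyrimidine context of the mutated base."""
    if ref in ("A", "G"):
        # Purine reference: describe the event on the opposite strand, which
        # swaps and complements the flanking bases.
        five_prime, ref, alt, three_prime = (
            COMPLEMENT[three_prime],
            COMPLEMENT[ref],
            COMPLEMENT[alt],
            COMPLEMENT[five_prime],
        )
    return f"{five_prime}[{ref}>{alt}]{three_prime}"


def catalogue(mutations):
    """Count occurrences of each of the 96 types for one sample."""
    counts = Counter(mutation_type(*m) for m in mutations)
    return [counts[t] for t in ALL_TYPES]


# Hypothetical mutations: a G>A in context AGT is equivalent to A[C>T]T.
vector = catalogue([("A", "G", "A", "T"), ("A", "C", "T", "T")])
print(sum(vector), vector[ALL_TYPES.index("A[C>T]T")])
```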
<p>The proportions of the signatures published by Alexandrov
<italic>et al</italic>
.
<xref ref-type="bibr" rid="b34">34</xref>
<xref ref-type="bibr" rid="b35">35</xref>
contributing to the mutational profile of each sample were estimated based on the probabilities of point mutations with their trinucleotide context in the signatures. The respective exposures were extracted sample-wise by quadratic programming. Exposures were plotted if they accounted for at least 5% of the SSMs in a sample.</p>
</sec>
<sec disp-level="2">
<title>Somatic mutation calling benchmark data set</title>
<p>The sequencing reads provided to pipeline benchmark participants were produced at the CNAG using a no-PCR library preparation procedure that was adapted from the KAPA Library Preparation Kit protocol used together with Illumina TruSeq adaptors and omitting PCR amplification, each for the MB tumour and the corresponding normal DNA sample (L.A). For each sample two libraries were prepared with smaller (roughly 300 bases) and larger (roughly 450 bases) insert size. Sizing was performed using agarose gel separation and excision of corresponding size bands. The two tumour libraries were sequenced to 40.5 × and the two normal libraries to 29.6 × using a combination of Illumina HiSeq2000 (2 × 100 bp) and Illumina MiSeq (2 × 250 bp). MiSeq reads contributed about 2 × to each tumour and normal data. Reads in the FASTQ format were generated using the RTA software provided by Illumina.</p>
</sec>
<sec disp-level="2">
<title>Verification by 300 × coverage</title>
<p>All reads produced by the different sequencing centres on the MB tumour-normal pair (including the CNAG reads described above) were combined and analysed to generate a curated set of results (Gold Set). The combined sequences gave 314 × coverage of the tumour and 272 × in the normal. Six different teams carried out mutation calling using their pipelines (different combinations of aligners, mutation callers and filters). A consensus set was generated accepting all calls made by more than three submitters (a subset of 10% was reviewed manually to confirm the quality of these calls). All calls made by three or fewer submitters were reviewed manually. We generated Integrative Genomics Viewer (IGV) screenshots centred on the mutation positions, juxtaposing the normal and tumour BWA
<xref ref-type="bibr" rid="b36">36</xref>
alignments. The images were made available for visual inspection and reviewed manually and voted/commented on by the entire analysis team (more than eight researchers). For calls that did not achieve complete agreement with the reviewers, a final decision was reached as follows. Reads were aligned with GEM
<xref ref-type="bibr" rid="b37">37</xref>
(gem-mapper) and converted to the BAM format using gemtools scorereads. Alignments were filtered to retain only primary alignments with mapping quality ≥20. Duplicates were removed with Picard, indels were realigned at 1000 Genomes indel target locations and all indels were left-aligned using GATK. The pileups at SSM positions were extracted using samtools mpileup with base-quality threshold ≥13. Read depth and base counts were extracted using a custom script. Mutant allele and normal counts were compared using in-house software
<italic>snape-cmp-counts</italic>
<xref ref-type="bibr" rid="b38">38</xref>
, which compares alternate and reference allele counts in tumour and normal and then assigns a score according to the probability that they are derived from different beta distributions. Mappabilities with 0, 1 and 2% mismatches were computed for the reference genome (h37d5). The average mappabilities in 100-bp windows preceding each candidate mutation were stored as tracks for visualization in IGV. In addition, the segmental duplication annotation from the UCSC browser was loaded into IGV. Mutations were then classified as follows. Mutations with sufficient depth (≥20) and a
<italic>snape-cmp-counts</italic>
score ≥0.98, average mappability of one and no overlap with segmental duplications were automatically classified in the Gold Set according to their mutant AF (class 1: MAF≥0.1, class 2: 0.1>MAF≥0.05 or class 3: MAF<0.05). All other candidates with
<italic>snape-cmp-counts</italic>
score >0.9 were reviewed visually in IGV. At this point, extensive soft-clipping in BWA, obvious strand bias and positional bias were also taken into consideration. Mutations with ambiguous alignments were assigned to class 4. Abnormally low or high depth mutations (taking also into account large-scale copy number variant regions) were assigned to class 5. Somatic mutation Gold Set tiers were compiled by cumulative addition of classes so that Tier 1 only includes class 1, while Tier 2 includes class 1 and class 2, Tier 3 includes classes 1, 2 and 3 and so on. All other candidate mutations were rejected and assigned to class 0 (
<xref ref-type="table" rid="t1">Table 1</xref>
). The estimated mutation AF cutoff is 2%, below which we deemed a call unreliable. The Gold Set was made available to all participants to review why a somatic mutation was wrongly called or missed in their respective submission.</p>
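<p>The automatic part of this classification and the cumulative tier definition can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names ‘auto_class’ and ‘tier_members’ and all argument names are hypothetical, and the visual review that assigns classes 4, 5 and 0 is not modelled.</p>
<preformat>
```python
from typing import Optional


def auto_class(depth: int, score: float, mappability: float,
               in_segdup: bool, maf: float) -> Optional[int]:
    """Return class 1, 2 or 3 for a candidate that meets the automatic
    criteria described in the text, or None if it does not.

    score is the snape-cmp-counts score, mappability the average
    mappability and maf the mutant allele frequency (hypothetical names).
    """
    passes = (depth >= 20 and score >= 0.98
              and mappability == 1.0 and not in_segdup)
    if not passes:
        # Not accepted automatically; such candidates are reviewed
        # visually or rejected, which this sketch does not model.
        return None
    if maf >= 0.1:
        return 1
    if maf >= 0.05:
        return 2
    return 3


def tier_members(classes: dict, tier: int) -> set:
    """Tier n is the cumulative union of classes 1..n; class 0 never counts."""
    return {key for key, cls in classes.items() if cls in range(1, tier + 1)}
```
</preformat>
<p>With this definition, a candidate assigned to class 2 belongs to Tiers 2, 3, 4 and 5 but not to Tier 1.</p>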
</sec>
<sec disp-level="2">
<title>Evaluation of submissions</title>
<p>Automatic validation was performed on the submission server to minimize formatting problems. In addition, the submitted VCF files were sorted and filtered to restrict calls to chromosomes 1–22, X and Y, and SIMs were left-aligned. Submissions of both CLL and MB SSMs and SIMs were evaluated against their respective Gold Sets, whose derivation is described above. For calculation of recall, the curated mutations were classified into three tiers according to alternate (mutation) AF. Only positions were considered, not the genotypes reported. For calculation of precision, all Tier 3 mutations plus ambiguously aligned mutations (class 4) were included so as not to penalize difficult-to-align but otherwise convincing differences between tumour and normal samples. For SIMs, no stratification into tiers 2 or 3 was performed; for recall, Tier 1 SIMs were used while, for precision, Tier 4 SIMs were used. To compare overall performance, we used a balanced measure of accuracy, the F1 score, defined as 2 × (Precision × Recall)/(Precision+Recall).</p>
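<p>The position-based scoring described above can be illustrated with a short Python sketch. It is only a sketch under assumptions: the function name ‘evaluate’ and the representation of calls as (chromosome, position) tuples are hypothetical, and the two reference sets stand for a recall tier and the more permissive set used for precision.</p>
<preformat>
```python
def evaluate(calls: set, recall_set: set, precision_set: set) -> dict:
    """Score one submission by position only (genotypes are ignored).

    Recall is measured against recall_set (for example one tier), while
    precision is measured against the more permissive precision_set.
    """
    if recall_set:
        recall = len(calls.intersection(recall_set)) / len(recall_set)
    else:
        recall = 0.0
    if calls:
        precision = len(calls.intersection(precision_set)) / len(calls)
    else:
        precision = 0.0
    total = precision + recall
    # F1 = 2 x (Precision x Recall) / (Precision + Recall)
    f1 = 2 * (precision * recall) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```
</preformat>
<p>Because the two reference sets differ, a call that lies only in the permissive set counts towards precision without being required for recall.</p>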
<p>Overlap calculations for the purpose of clustering and heat map generation were performed using Tier 3 for SSMs and Tier 1 for SIMs. The Jaccard index is defined as the intersection divided by the union.</p>
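<p>As a minimal illustration of this overlap measure, the following Python sketch computes the Jaccard index for two call sets and for every pair of submissions. The names ‘jaccard’ and ‘pairwise_jaccard’ are hypothetical, and returning 0 for two empty sets is an assumption made here to avoid a division by zero.</p>
<preformat>
```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a.union(b)
    if not union:
        return 0.0  # assumption: two empty call sets share nothing
    return len(a.intersection(b)) / len(union)


def pairwise_jaccard(call_sets: dict) -> dict:
    """Jaccard index for every unordered pair of named call sets."""
    return {(x, y): jaccard(call_sets[x], call_sets[y])
            for x, y in combinations(sorted(call_sets), 2)}
```
</preformat>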
<p>Correspondence analysis was performed using the ‘ca’ package in R on a table in which each row corresponds to a genomic position at which at least one submission calls a somatic mutation in the Gold Set. The columns comprise the presence or absence of the call in each call set, Boolean values indicating whether certain genomic features such as repeats or presence in a blacklisted region apply, and sequence data such as AF or depth.</p>
<p>Rainfall plots represent the distance of each SSM call from the immediately preceding SSM on the reference genome. For each submission, the SSM set used comprised the called SSMs, classified as TP or FP, and the Gold Set Tier 4 SSMs that were absent from the submission, classified as FN.</p>
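<p>The distances underlying a rainfall plot can be sketched in Python as follows. This is an illustration only: the function name ‘rainfall_distances’ is hypothetical, and skipping the first SSM on each chromosome (which has no preceding SSM) is an assumption of this sketch.</p>
<preformat>
```python
from collections import defaultdict


def rainfall_distances(calls):
    """Return (chromosome, position, distance to preceding SSM) tuples.

    calls is an iterable of (chromosome, position) pairs. Distances are
    only computed within a chromosome; the first SSM on each chromosome
    has no predecessor and is skipped (assumption of this sketch).
    """
    by_chrom = defaultdict(list)
    for chrom, pos in calls:
        by_chrom[chrom].append(pos)
    distances = []
    for chrom, positions in by_chrom.items():
        positions.sort()
        for prev, cur in zip(positions, positions[1:]):
            distances.append((chrom, cur, cur - prev))
    return distances
```
</preformat>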
<p>Feature analysis was conducted as follows. Tandem repeats were annotated with two programmes. Tandem repeats finder
<xref ref-type="bibr" rid="b39">39</xref>
was run on 201-bp windows around each SSM and SIM call. Any repeat of six or more repeat units overlapping or immediately adjacent to the mutation position was annotated accordingly. SGA
<xref ref-type="bibr" rid="b25">25</xref>
was also used to annotate homopolymers specifically, giving a richer annotation of the repeat context and change induced by the mutation. Mappability was calculated using gem mappability
<xref ref-type="bibr" rid="b40">40</xref>
with one mismatch at both 100mer and 150mer lengths. The average 100mer and 150mer mappabilities for each mutation were calculated for a window of −90 to −10 or −140 to −10 with respect to its position, respectively. Mult100 and mult150 are defined as 1 − mappability at 100mer and 150mer lengths, respectively. Same AF is defined as 1−(2 × (SCORE
<sub>snape-cmp-counts</sub>
−0.5)) for SCORE≥0.5 else 0.</p>
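<p>The two derived features defined above can be written out as a short Python sketch. The function names ‘mult’ and ‘same_af’ are hypothetical; the formulas follow the definitions in the text, with the snape-cmp-counts score passed in as a plain number.</p>
<preformat>
```python
def mult(mappability: float) -> float:
    """Mult100 or mult150: one minus the average mappability."""
    return 1.0 - mappability


def same_af(score: float) -> float:
    """Same AF: 1 - (2 x (score - 0.5)) for score >= 0.5, else 0."""
    if score >= 0.5:
        return 1.0 - 2.0 * (score - 0.5)
    return 0.0
```
</preformat>
<p>Under this definition, a score of 1 gives a Same AF of 0 and a score of 0.5 gives a Same AF of 1.</p>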
</sec>
<sec disp-level="2">
<title>Control of pipeline components</title>
<p>The protocol consisted of choosing a genome reference, mapping, alignment processing, variant calling and then analysis of alignments or variant calls.</p>
<p>Genome references tested:
<list id="l2" list-type="bullet">
<list-item>
<p>‘b37d' (‘human_g1k_v37_decoy' from GATK bundle 2.8)</p>
</list-item>
<list-item>
<p>‘b37' (‘human_g1k_v37' from GATK bundle 2.8)</p>
</list-item>
<list-item>
<p>‘hg19r' (‘ucsc.hg19' from GATK bundle 2.3, with its unplaced contigs and haplotype chromosomes removed)</p>
</list-item>
</list>
</p>
<p>Mappers tested:
<list id="l3" list-type="bullet">
<list-item>
<p>Novoalign2 (v2.08.03) options: -i PE 360,60 -r All 10</p>
</list-item>
<list-item>
<p>BWA (0.6.2-r126-tpx)</p>
</list-item>
<list-item>
<p>BWA-mem (0.7.7-r441) options: -t 8 -M -B 3</p>
</list-item>
<list-item>
<p>GEM (1.828)</p>
</list-item>
</list>
</p>
<p>Mutation callers tested:
<list id="l5" list-type="bullet">
<list-item>
<p>MuTect
<xref ref-type="bibr" rid="b31">31</xref>
(v. 1.1.4; dbSNP v. 138; COSMIC v. 64)</p>
</list-item>
<list-item>
<p>Strelka
<xref ref-type="bibr" rid="b27">27</xref>
(1.0.13)</p>
</list-item>
</list>
</p>
<p>Alignment post-processing ensuring format compatibility with the downstream tools was performed as follows:
<list id="l6" list-type="bullet">
<list-item>
<p>Merge multiple SAM/BAM output files and coordinate-sort alignments with Picard tools (v. 1.84)</p>
</list-item>
<list-item>
<p>Add read group information with Picard tools</p>
</list-item>
<list-item>
<p>Discard secondary alignments (alignments with SAM FLAG 0x100 set) with samtools (v. 0.1.18) (‘samtools view -h -b -F 256’)</p>
</list-item>
<list-item>
<p>Mark duplicates with Picard tools' MarkDuplicates.</p>
</list-item>
<list-item>
<p>Realignment (RealignerTargetCreator, IndelRealigner) around indels with GATK (v. 2.3-9-ge5ebf34). The tumour and control were processed together.</p>
</list-item>
<list-item>
<p>Apply Picard tools' FixMateInformation</p>
</list-item>
</list>
</p>
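<p>The secondary-alignment filter in the list above relies on bit 0x100 (decimal 256) of the SAM FLAG field. The following Python sketch illustrates the same test on plain-text SAM lines; it is not a replacement for samtools, and the function names ‘is_secondary’ and ‘drop_secondary’ are hypothetical.</p>
<preformat>
```python
SECONDARY = 0x100  # same bit as decimal 256


def is_secondary(flag: int) -> bool:
    """True if the secondary-alignment bit (0x100) is set in a SAM FLAG."""
    return (flag // SECONDARY) % 2 == 1


def drop_secondary(sam_lines):
    """Keep header lines and every alignment without the 0x100 bit set."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)  # header lines are passed through unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not is_secondary(flag):
            kept.append(line)
    return kept
```
</preformat>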
<p>No base-quality recalibration or mutation filtration was applied. Reads of 251 nucleotides mapped by Novoalign were truncated to 150 nucleotides; this affected ∼4.4% of tumour reads (possibly causing FNs because of missing mutation support in the tumour) and ∼4.5% of control reads (possibly causing FPs because of missing mutation evidence in the control).</p>
<p>The programme qProfiler (
<ext-link ext-link-type="uri" xlink:href="http://sourceforge.net/p/adamajava/wiki/qProfiler/">http://sourceforge.net/p/adamajava/wiki/qProfiler/</ext-link>
) was run on each mapper's alignment files to investigate systematic mapping differences that potentially influenced subsequent mutation calling. Specifically, distributions of values in the SAM fields ‘RNAME’ (alignment reference sequence ID), ‘MAPQ’ (alignment mapping quality score), ‘TLEN’ (observed insert size), ‘CIGAR’ (alignment-level indel details together with ‘soft clippings’, that is, mapper-induced read trimming) and ‘MD’ (alignment mismatch details) were of interest. The same alignment files were used for mutation calling and qProfiler analysis. However, since mutation calling results were limited to chromosomes 1–22, X and Y, alignment files serving as qProfiler input were first filtered to contain only alignments to chromosomes 1–22, X and Y so that the statistics would be relevant (contig coverage statistics being the only exception, since mapping to the decoy contig appears to be important).</p>
</sec>
<sec disp-level="2">
<title>Availability of data</title>
<p>Sequence data for this study have been deposited in the European Genome-phenome Archive (EGA) under the accession number EGAS00001001539.</p>
</sec>
</sec>
<sec disp-level="1">
<title>Additional information</title>
<p>
<bold>How to cite this article:</bold>
Alioto, T. S.
<italic>et al</italic>
. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing.
<italic>Nat. Commun.</italic>
6:10001 doi: 10.1038/ncomms10001 (2015).</p>
</sec>
<sec sec-type="supplementary-material" id="S1">
<title>Supplementary Material</title>
<supplementary-material id="d33e18" content-type="local-data">
<caption>
<title>Supplementary Information</title>
<p>Supplementary Figures 1-48, Supplementary Tables 1-18, Supplementary Note 1, Supplementary Methods and Supplementary References</p>
</caption>
<media xlink:href="ncomms10001-s1.pdf"></media>
</supplementary-material>
<supplementary-material id="d33e24" content-type="local-data">
<caption>
<title>Supplementary Data 1</title>
<p>Gold Set mutations, accuracy of pipelines, pipeline parameter tables, and low coverage regions.</p>
</caption>
<media xlink:href="ncomms10001-s2.xlsx"></media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>We thank the DKFZ Genomics and Proteomics Core Facility and the OICR Genome Technologies Platform for provision of sequencing services. Financial support was provided by the consortium projects READNA under grant agreement FP7 Health-F4-2008-201418, ESGI under grant agreement 262055, GEUVADIS under grant agreement 261123 of the European Commission Framework Programme 7, ICGC-CLL through the Spanish Ministry of Science and Innovation (MICINN), the Instituto de Salud Carlos III (ISCIII) and the Generalitat de Catalunya. Additional financial support was provided by the PedBrain Tumor Project contributing to the ICGC, funded by German Cancer Aid (109252) and by the German Federal Ministry of Education and Research (BMBF, grants #01KU1201A, MedSys #0315416C and NGFNplus #01GS0883), the BMBF-funded ICGC projects on early-onset prostate cancer and malignant lymphoma (#01KU1001A, #01KU1002B) and via the BMBF-funded de.NBI HD-HuB network (#031A537A, #031A537C); the Ontario Institute for Cancer Research to PCB and JDM through funding provided by the Government of Ontario, Ministry of Research and Innovation; Genome Canada; the Canada Foundation for Innovation and Prostate Cancer Canada with funding from the Movember Foundation (PCB). P.C.B. was also supported by a Terry Fox Research Institute New Investigator Award, a CIHR New Investigator Award and a Genome Canada Large-Scale Applied Project Contract. The Synergie Lyon Cancer platform has received support from the French National Institute of Cancer (INCa) and from the ABS4NGS ANR project (ANR-11-BINF-0001-06). The ICGC RIKEN study was supported partially by RIKEN President's Fund 2011, and the supercomputing resource for the RIKEN study was provided by the Human Genome Center, University of Tokyo. M.D.E., L.B., A.G.L. and C.L.A. were supported by Cancer Research UK, the University of Cambridge and Hutchison-Whampoa Limited. S.D. is supported by the Torres Quevedo subprogramme (MICINN) under grant agreement PTQ-12-05391. 
E.H. is supported by the Research Council of Norway under grant agreements 221580 and 218241 and by the Norwegian Cancer Society under grant agreement 71220—PR-2006-0433. We especially thank Jennifer Jennings for administering the activity of the ICGC Verification Working Group and Anna Borrell for administrative support.</p>
</ack>
<ref-list>
<ref id="b1">
<mixed-citation publication-type="journal">
<name>
<surname>Hudson</surname>
<given-names>T. J.</given-names>
</name>
<etal></etal>
.
<article-title>International network of cancer genome projects</article-title>
.
<source>Nature</source>
<volume>464</volume>
,
<fpage>993</fpage>
<lpage>998</lpage>
(
<year>2010</year>
).
<pub-id pub-id-type="pmid">20393554</pub-id>
</mixed-citation>
</ref>
<ref id="b2">
<mixed-citation publication-type="journal">
<name>
<surname>Mardis</surname>
<given-names>E. R.</given-names>
</name>
&
<name>
<surname>Wilson</surname>
<given-names>R. K.</given-names>
</name>
<article-title>Cancer genome sequencing: a review</article-title>
.
<source>Hum. Mol. Genet.</source>
<volume>18</volume>
,
<fpage>R163</fpage>
<lpage>R168</lpage>
(
<year>2009</year>
).
<pub-id pub-id-type="pmid">19808792</pub-id>
</mixed-citation>
</ref>
<ref id="b3">
<mixed-citation publication-type="journal">
<name>
<surname>Ley</surname>
<given-names>T. J.</given-names>
</name>
<etal></etal>
.
<article-title>DNMT3A mutations in acute myeloid leukemia</article-title>
.
<source>N. Engl. J. Med.</source>
<volume>363</volume>
,
<fpage>2424</fpage>
<lpage>2433</lpage>
(
<year>2010</year>
).
<pub-id pub-id-type="pmid">21067377</pub-id>
</mixed-citation>
</ref>
<ref id="b4">
<mixed-citation publication-type="journal">
<name>
<surname>Puente</surname>
<given-names>X. S.</given-names>
</name>
<etal></etal>
.
<article-title>Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia</article-title>
.
<source>Nature</source>
<volume>475</volume>
,
<fpage>101</fpage>
<lpage>105</lpage>
(
<year>2011</year>
).
<pub-id pub-id-type="pmid">21642962</pub-id>
</mixed-citation>
</ref>
<ref id="b5">
<mixed-citation publication-type="journal">
<name>
<surname>Alkodsi</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Louhimo</surname>
<given-names>R.</given-names>
</name>
&
<name>
<surname>Hautaniemi</surname>
<given-names>S.</given-names>
</name>
<article-title>Comparative analysis of methods for identifying somatic copy number alterations from deep sequencing data</article-title>
.
<source>Brief Bioinform.</source>
<volume>16</volume>
,
<fpage>242</fpage>
<lpage>254</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24599115</pub-id>
</mixed-citation>
</ref>
<ref id="b6">
<mixed-citation publication-type="journal">
<name>
<surname>Dewey</surname>
<given-names>F. E.</given-names>
</name>
<etal></etal>
.
<article-title>Clinical interpretation and implications of whole-genome sequencing</article-title>
.
<source>JAMA</source>
<volume>311</volume>
,
<fpage>1035</fpage>
<lpage>1045</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24618965</pub-id>
</mixed-citation>
</ref>
<ref id="b7">
<mixed-citation publication-type="journal">
<name>
<surname>Kandoth</surname>
<given-names>C.</given-names>
</name>
<etal></etal>
.
<article-title>Mutational landscape and significance across 12 major cancer types</article-title>
.
<source>Nature</source>
<volume>502</volume>
,
<fpage>333</fpage>
<lpage>339</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">24132290</pub-id>
</mixed-citation>
</ref>
<ref id="b8">
<mixed-citation publication-type="journal">
<name>
<surname>Jones</surname>
<given-names>D. T.</given-names>
</name>
<etal></etal>
.
<article-title>Dissecting the genomic complexity underlying medulloblastoma</article-title>
.
<source>Nature</source>
<volume>488</volume>
,
<fpage>100</fpage>
<lpage>105</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22832583</pub-id>
</mixed-citation>
</ref>
<ref id="b9">
<mixed-citation publication-type="journal">Cancer Genome Atlas Research Network.
<article-title>Genomic and epigenomic landscapes of adult
<italic>de novo</italic>
acute myeloid leukemia</article-title>
.
<source>N. Engl. J. Med.</source>
<volume>368</volume>
,
<fpage>2059</fpage>
<lpage>2074</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23634996</pub-id>
</mixed-citation>
</ref>
<ref id="b10">
<mixed-citation publication-type="journal">
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
<article-title>Toward better understanding of artifacts in variant calling from high-coverage samples</article-title>
.
<source>Bioinformatics</source>
<volume>30</volume>
,
<fpage>2843</fpage>
<lpage>2851</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24974202</pub-id>
</mixed-citation>
</ref>
<ref id="b11">
<mixed-citation publication-type="journal">
<name>
<surname>McGinn</surname>
<given-names>S.</given-names>
</name>
&
<name>
<surname>Gut</surname>
<given-names>I. G.</given-names>
</name>
<article-title>DNA sequencing—spanning the generations</article-title>
.
<source>N. Biotechnol.</source>
<volume>30</volume>
,
<fpage>366</fpage>
<lpage>372</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23165096</pub-id>
</mixed-citation>
</ref>
<ref id="b12">
<mixed-citation publication-type="journal">
<name>
<surname>Xu</surname>
<given-names>H.</given-names>
</name>
,
<name>
<surname>DiCarlo</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Satya</surname>
<given-names>R. V.</given-names>
</name>
,
<name>
<surname>Peng</surname>
<given-names>Q.</given-names>
</name>
&
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<article-title>Comparison of somatic mutation calling methods in amplicon and whole exome sequence data</article-title>
.
<source>BMC Genomics</source>
<volume>15</volume>
,
<fpage>244</fpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24678773</pub-id>
</mixed-citation>
</ref>
<ref id="b13">
<mixed-citation publication-type="journal">
<name>
<surname>Highnam</surname>
<given-names>G.</given-names>
</name>
<etal></etal>
.
<article-title>An analytical framework for optimizing variant discovery from personal genomes</article-title>
.
<source>Nat. Commun.</source>
<volume>6</volume>
,
<fpage>6275</fpage>
(
<year>2015</year>
).
<pub-id pub-id-type="pmid">25711446</pub-id>
</mixed-citation>
</ref>
<ref id="b14">
<mixed-citation publication-type="journal">
<name>
<surname>Zook</surname>
<given-names>J. M.</given-names>
</name>
<etal></etal>
.
<article-title>Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls</article-title>
.
<source>Nat. Biotechnol.</source>
<volume>32</volume>
,
<fpage>246</fpage>
<lpage>251</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24531798</pub-id>
</mixed-citation>
</ref>
<ref id="b15">
<mixed-citation publication-type="journal">
<name>
<surname>Pabinger</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
.
<article-title>A survey of tools for variant analysis of next-generation genome sequencing data</article-title>
.
<source>Brief Bioinform.</source>
<volume>15</volume>
,
<fpage>256</fpage>
<lpage>278</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">23341494</pub-id>
</mixed-citation>
</ref>
<ref id="b16">
<mixed-citation publication-type="journal">
<name>
<surname>Fang</surname>
<given-names>H.</given-names>
</name>
<etal></etal>
.
<article-title>Reducing INDEL calling errors in whole genome and exome sequencing data</article-title>
.
<source>Genome Med.</source>
<volume>6</volume>
,
<fpage>89</fpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">25426171</pub-id>
</mixed-citation>
</ref>
<ref id="b17">
<mixed-citation publication-type="journal">
<name>
<surname>O'Rawe</surname>
<given-names>J.</given-names>
</name>
<etal></etal>
.
<article-title>Low concordance of multiple variant-calling pipelines: practical implications for exome and genome sequencing</article-title>
.
<source>Genome Med.</source>
<volume>5</volume>
,
<fpage>28</fpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23537139</pub-id>
</mixed-citation>
</ref>
<ref id="b18">
<mixed-citation publication-type="journal">
<name>
<surname>Wang</surname>
<given-names>Q.</given-names>
</name>
<etal></etal>
.
<article-title>Detecting somatic point mutations in cancer genome sequencing data: a comparison of mutation callers</article-title>
.
<source>Genome Med.</source>
<volume>5</volume>
,
<fpage>91</fpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">24112718</pub-id>
</mixed-citation>
</ref>
<ref id="b19">
<mixed-citation publication-type="journal">
<name>
<surname>Kim</surname>
<given-names>S. Y.</given-names>
</name>
&
<name>
<surname>Speed</surname>
<given-names>T. P.</given-names>
</name>
<article-title>Comparing somatic mutation-callers: beyond Venn diagrams</article-title>
.
<source>BMC Bioinformatics</source>
<volume>14</volume>
,
<fpage>189</fpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23758877</pub-id>
</mixed-citation>
</ref>
<ref id="b20">
<mixed-citation publication-type="journal">
<name>
<surname>Louis</surname>
<given-names>D. N.</given-names>
</name>
<etal></etal>
.
<article-title>The 2007 WHO classification of tumours of the central nervous system</article-title>
.
<source>Acta Neuropathol.</source>
<volume>114</volume>
,
<fpage>97</fpage>
<lpage>109</lpage>
(
<year>2007</year>
).
<pub-id pub-id-type="pmid">17618441</pub-id>
</mixed-citation>
</ref>
<ref id="b21">
<mixed-citation publication-type="journal">
<name>
<surname>Taylor</surname>
<given-names>M. D.</given-names>
</name>
<etal></etal>
.
<article-title>Molecular subgroups of medulloblastoma: the current consensus</article-title>
.
<source>Acta Neuropathol.</source>
<volume>123</volume>
,
<fpage>465</fpage>
<lpage>472</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22134537</pub-id>
</mixed-citation>
</ref>
<ref id="b22">
<mixed-citation publication-type="journal">
<name>
<surname>Ewing</surname>
<given-names>A. D.</given-names>
</name>
<etal></etal>
.
<article-title>Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection</article-title>
.
<source>Nat. Methods</source>
<volume>12</volume>
,
<fpage>623</fpage>
<lpage>630</lpage>
(
<year>2015</year>
).
<pub-id pub-id-type="pmid">25984700</pub-id>
</mixed-citation>
</ref>
<ref id="b23">
<mixed-citation publication-type="journal">
<name>
<surname>Kassahn</surname>
<given-names>K. S.</given-names>
</name>
<etal></etal>
.
<article-title>Somatic point mutation calling in low cellularity tumors</article-title>
.
<source>PLoS ONE</source>
<volume>8</volume>
,
<fpage>e74380</fpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">24250782</pub-id>
</mixed-citation>
</ref>
<ref id="b24">
<mixed-citation publication-type="journal">
<name>
<surname>McKenna</surname>
<given-names>A.</given-names>
</name>
<etal></etal>
.
<article-title>The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data</article-title>
.
<source>Genome Res.</source>
<volume>20</volume>
,
<fpage>1297</fpage>
<lpage>1303</lpage>
(
<year>2010</year>
).
<pub-id pub-id-type="pmid">20644199</pub-id>
</mixed-citation>
</ref>
<ref id="b25">
<mixed-citation publication-type="journal">
<name>
<surname>Simpson</surname>
<given-names>J. T.</given-names>
</name>
&
<name>
<surname>Durbin</surname>
<given-names>R.</given-names>
</name>
<article-title>Efficient construction of an assembly string graph using the FM-index</article-title>
.
<source>Bioinformatics</source>
<volume>26</volume>
,
<fpage>i367</fpage>
<lpage>i373</lpage>
(
<year>2010</year>
).
<pub-id pub-id-type="pmid">20529929</pub-id>
</mixed-citation>
</ref>
<ref id="b26">
<mixed-citation publication-type="other">
<name>
<surname>Garrison</surname>
<given-names>E.</given-names>
</name>
&
<name>
<surname>Marth</surname>
<given-names>G.</given-names>
</name>
Haplotype-based variant detection from short-read sequencing. Preprint at arXiv:1207.3907 (
<year>2012</year>
).</mixed-citation>
</ref>
<ref id="b27">
<mixed-citation publication-type="journal">
<name>
<surname>Saunders</surname>
<given-names>C. T.</given-names>
</name>
<etal></etal>
.
<article-title>Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs</article-title>
.
<source>Bioinformatics</source>
<volume>28</volume>
,
<fpage>1811</fpage>
<lpage>1817</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22581179</pub-id>
</mixed-citation>
</ref>
<ref id="b28">
<mixed-citation publication-type="journal">
<name>
<surname>Rimmer</surname>
<given-names>A.</given-names>
</name>
<etal></etal>
.
<article-title>Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications</article-title>
.
<source>Nat. Genet.</source>
<volume>46</volume>
,
<fpage>912</fpage>
<lpage>918</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">25017105</pub-id>
</mixed-citation>
</ref>
<ref id="b29">
<mixed-citation publication-type="journal">
<name>
<surname>Challis</surname>
<given-names>D.</given-names>
</name>
<etal></etal>
.
<article-title>An integrative variant analysis suite for whole exome next-generation sequencing data</article-title>
.
<source>BMC Bioinformatics</source>
<volume>13</volume>
,
<fpage>8</fpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22239737</pub-id>
</mixed-citation>
</ref>
<ref id="b30">
<mixed-citation publication-type="journal">
<name>
<surname>Moncunill</surname>
<given-names>V.</given-names>
</name>
<etal></etal>
.
<article-title>Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads</article-title>
.
<source>Nat. Biotechnol.</source>
<volume>32</volume>
,
<fpage>1106</fpage>
<lpage>1112</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">25344728</pub-id>
</mixed-citation>
</ref>
<ref id="b31">
<mixed-citation publication-type="journal">
<name>
<surname>Cibulskis</surname>
<given-names>K.</given-names>
</name>
<etal></etal>
.
<article-title>Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples</article-title>
.
<source>Nat. Biotechnol.</source>
<volume>31</volume>
,
<fpage>213</fpage>
<lpage>219</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23396013</pub-id>
</mixed-citation>
</ref>
<ref id="b32">
<mixed-citation publication-type="journal">
<name>
<surname>Goode</surname>
<given-names>D. L.</given-names>
</name>
<etal></etal>
.
<article-title>A simple consensus approach improves somatic mutation prediction accuracy</article-title>
.
<source>Genome Med.</source>
<volume>5</volume>
,
<fpage>90</fpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">24073752</pub-id>
</mixed-citation>
</ref>
<ref id="b33">
<mixed-citation publication-type="journal">
<name>
<surname>Rieber</surname>
<given-names>N.</given-names>
</name>
<etal></etal>
.
<article-title>Coverage bias and sensitivity of variant calling for four whole-genome sequencing technologies</article-title>
.
<source>PLoS ONE</source>
<volume>8</volume>
,
<fpage>e66621</fpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23776689</pub-id>
</mixed-citation>
</ref>
<ref id="b34">
<mixed-citation publication-type="journal">
<name>
<surname>Alexandrov</surname>
<given-names>L. B.</given-names>
</name>
<etal></etal>
.
<article-title>Signatures of mutational processes in human cancer</article-title>
.
<source>Nature</source>
<volume>500</volume>
,
<fpage>415</fpage>
<lpage>421</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23945592</pub-id>
</mixed-citation>
</ref>
<ref id="b35">
<mixed-citation publication-type="journal">
<name>
<surname>Alexandrov</surname>
<given-names>L. B.</given-names>
</name>
,
<name>
<surname>Nik-Zainal</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Wedge</surname>
<given-names>D. C.</given-names>
</name>
,
<name>
<surname>Campbell</surname>
<given-names>P. J.</given-names>
</name>
&
<name>
<surname>Stratton</surname>
<given-names>M. R.</given-names>
</name>
<article-title>Deciphering signatures of mutational processes operative in human cancer</article-title>
.
<source>Cell Rep.</source>
<volume>3</volume>
,
<fpage>246</fpage>
<lpage>259</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23318258</pub-id>
</mixed-citation>
</ref>
<ref id="b36">
<mixed-citation publication-type="journal">
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
&
<name>
<surname>Durbin</surname>
<given-names>R.</given-names>
</name>
<article-title>Fast and accurate short read alignment with Burrows-Wheeler transform</article-title>
.
<source>Bioinformatics</source>
<volume>25</volume>
,
<fpage>1754</fpage>
<lpage>1760</lpage>
(
<year>2009</year>
).
<pub-id pub-id-type="pmid">19451168</pub-id>
</mixed-citation>
</ref>
<ref id="b37">
<mixed-citation publication-type="journal">
<name>
<surname>Marco-Sola</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Sammeth</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Guigo</surname>
<given-names>R.</given-names>
</name>
&
<name>
<surname>Ribeca</surname>
<given-names>P.</given-names>
</name>
<article-title>The GEM mapper: fast, accurate and versatile alignment by filtration</article-title>
.
<source>Nat. Methods</source>
<volume>9</volume>
,
<fpage>1185</fpage>
<lpage>1188</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">23103880</pub-id>
</mixed-citation>
</ref>
<ref id="b38">
<mixed-citation publication-type="journal">
<name>
<surname>Raineri</surname>
<given-names>E.</given-names>
</name>
,
<name>
<surname>Dabad</surname>
<given-names>M.</given-names>
</name>
&
<name>
<surname>Heath</surname>
<given-names>S.</given-names>
</name>
<article-title>A note on exact differences between beta distributions in genomic (Methylation) studies</article-title>
.
<source>PLoS ONE</source>
<volume>9</volume>
,
<fpage>e97349</fpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24824426</pub-id>
</mixed-citation>
</ref>
<ref id="b39">
<mixed-citation publication-type="journal">
<name>
<surname>Benson</surname>
<given-names>G.</given-names>
</name>
<article-title>Tandem repeats finder: a program to analyze DNA sequences</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>27</volume>
,
<fpage>573</fpage>
<lpage>580</lpage>
(
<year>1999</year>
).
<pub-id pub-id-type="pmid">9862982</pub-id>
</mixed-citation>
</ref>
<ref id="b40">
<mixed-citation publication-type="journal">
<name>
<surname>Derrien</surname>
<given-names>T.</given-names>
</name>
<etal></etal>
.
<article-title>Fast computation and applications of genome mappability</article-title>
.
<source>PLoS ONE</source>
<volume>7</volume>
,
<fpage>e30377</fpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22276185</pub-id>
</mixed-citation>
</ref>
</ref-list>
<fn-group>
<fn>
<p>
<bold>Author contributions</bold>
T.S.A., I.B., D.T.W.J. and I.G.G. planned the study and wrote the paper. T.S.A. coordinated mutation call submissions, led the analysis team and performed the primary analysis. I.B. and B.H. coordinated the sequencing benchmark and analysis. S.D. coordinated metadata collection and analysis. T.S.A., S.D., M.D.E., E.H., I.B., B.H., P.G., D.T.W.J., L.E.H., T.A.B., J.T.S., L.T., A.-S.S., A.-M.P., P.R., V.Q., R.V.-M., S.N., D.V., A.G.L., R.E.D., E.R., M.D., S.C.H., P.S.T., P.J.C., P.C.B., X.S.P., J.D.M. and I.G.G. contributed to the analysis. S.D., M.D.E., E.H., I.B., B.H., B.B., R.D., R.K., S.G., A.K., D.T.W.J., L.E.H., T.A.B., J.T.S., L.T., A.-S.S., P.S.T., D.J., L.S., L.F., K.R., J.H., J.W.T., A.M., R.S., A.P.B., A.-M.P., P.R., V.Q., R.V.-M., S.N., D.V., L.B., A.G.L., C.L.A., N.J.H., T.N.Y., N.W., J.V.P., S.M.G., F.C.G., S.B., N.J., N.P., M.H., M.S., R.D., N.P., M.S., M.P., P.S., A.F., H.N., M.H., C.K., S.L., J.Z., L.L., S.M., S.S., D.T., L.X., D.A.W., C.L.-O., P.J.C., P.C.B. and X.S.P. contributed to pipeline development and formatting of submissions. E.C. and D.T.W.J. provided samples. S.C., S.S., N.D., C.P., H.N., A.F., X.S.P., R.E.D., J.D.M. and M.G. contributed to the sequencing effort. P.B., J.D.M., S.M.P., R.E., P.L., D.G. and T.H. provided organization and additional input on the manuscript.</p>
</fn>
</fn-group>
</back>
<floats-group>
<fig id="f1">
<label>Figure 1</label>
<caption>
<title>Differences between the different sample libraries.</title>
<p>Libraries A, E and G are PCR-free. (
<bold>a</bold>
) GC bias of the different libraries. The genome was segmented into 10-kb windows. For each window, the GC content was calculated and the coverage for the respective library was added. For better comparability, the coverage was normalized by dividing by the mean. The major band in normal corresponds to autosomes, while the lower band corresponds to sex chromosomes. The increased number of bands in the tumour is because of a higher number of ploidy states in the (largely) tetraploid tumour sample. (
<bold>b</bold>
) Cumulative coverage displayed for different libraries. Displayed are all libraries sequenced to at least 28 ×. To make the values comparable, we downsampled all samples to a coverage of 28 × (the lowest coverage of the initially sequenced libraries). The plot shows the percentage of the genome (
<italic>y</italic>
axis) covered with a given minimum coverage (
<italic>x</italic>
axis). (
<bold>c</bold>
) Percentage of certain regions of interest covered with less than 10 ×. Different colours are used to distinguish centres.</p>
</caption>
<graphic xlink:href="ncomms10001-f1"></graphic>
</fig>
<fig id="f2">
<label>Figure 2</label>
<caption>
<title>Effect of sequencing coverage on the ability to call SSMs.</title>
<p>(
<bold>a</bold>
) Overlap of SSMs called on different balanced coverages. (
<bold>b</bold>
) Density plots of the variant allele frequencies for different balanced coverages of tumour and control (tumour_versus_control) and number of SSMs called in total (calls were performed using the DKFZ calling pipeline, MB.I). (
<bold>c</bold>
) Plot of the number of SSMs (
<italic>y</italic>
axis) found for a given coverage (
<italic>x</italic>
axis). The different colours represent different levels of normal ‘contamination’ in the tumour (0% black, 17% blue, 33% green and 50% orange). Solid lines represent the real data and dashed lines are simulated. Lines are fitted against the Michaelis–Menten model using the ‘drc’ package in R. Solid lines are fitted to the data points and dashed lines are simulated using a mixed inhibition model for enzyme kinetics.</p>
</caption>
<graphic xlink:href="ncomms10001-f2"></graphic>
</fig>
<fig id="f3">
<label>Figure 3</label>
<caption>
<title>Overlap of somatic mutation calls for each level of concordance.</title>
<p>Shared sets of calls are vertically aligned. GOLD indicates the Gold Set. (
<bold>a</bold>
) Medulloblastoma SSM calls shared by at least two call sets. (
<bold>b</bold>
) Medulloblastoma SIM calls shared by at least two call sets.</p>
</caption>
<graphic xlink:href="ncomms10001-f3"></graphic>
</fig>
<fig id="f4">
<label>Figure 4</label>
<caption>
<title>Somatic mutation calling accuracy against Gold Sets.</title>
<p>Decreasing sensitivity on Tiers 1, 2 and 3 is shown as a series for each SSM call set, while precision remains the same. (
<bold>a</bold>
) Medulloblastoma SSMs. (
<bold>b</bold>
) Medulloblastoma SIMs.</p>
</caption>
<graphic xlink:href="ncomms10001-f4"></graphic>
</fig>
<fig id="f5">
<label>Figure 5</label>
<caption>
<title>Rainfall plot showing distribution of called mutations on the genome.</title>
<p>The distance between mutations is plotted on a log scale (
<italic>y</italic>
axis) versus the genomic position on the
<italic>x</italic>
axis. TPs (blue), FPs (green) and FNs (red). Four MB submissions representative of distinct patterns are shown. (
<bold>a</bold>
) MB.Q is one of the best balanced between FPs and FNs, with low positional bias. (
<bold>b</bold>
) MB.L1 has many FNs. (
<bold>c</bold>
) MB.C has clusters of FPs near centromeres and FNs on the X chromosome. (
<bold>d</bold>
) MB.K has a high FP rate with short-distance clustering of mutations.</p>
</caption>
<graphic xlink:href="ncomms10001-f5"></graphic>
</fig>
<fig id="f6">
<label>Figure 6</label>
<caption>
<title>Enrichment or depletion of genomic and alignment features in FP calls for each medulloblastoma SSM submission.</title>
<p>For each feature, the difference in frequency with respect to the Gold Set is multiplied by the FP rate. Blue indicates values less than zero, meaning that the proportion of variants or their score on that feature is lower in the FP set than in the true variants. Reddish colours correspond to a higher proportion of variants or higher scores for the feature in FP calls versus the Gold Set. Both features and submissions are clustered hierarchically. The features shown here include same AF (the probability that the AF in the tumour sample is not higher than that in the normal samples, derived from the snape-cmp-counts score), DacBL (in ENCODE DAC mappability blacklist region), DukeBL (in ENCODE Duke mappability blacklist region), centr (in centromere or centromeric repeat), mult100 (1 − mappability of 100mers with 1% mismatch), map150 (1 − mappability of 150mers with 1% mismatch), DPNhi (high depth in normal), DPNlo (low depth in normal), dups (in high-identity segmental duplication), nestRep (in nested repeat), sRep (in simple repeat), inTR (in tandem repeat), adjTR (immediately adjacent to tandem repeat), msat (in microsatellite), hp (in or next to homopolymer of length >6), AFN (mutant AF in normal) and AFTlo (mutant AF in tumour<10%).</p>
</caption>
<graphic xlink:href="ncomms10001-f6"></graphic>
</fig>
<fig id="f7">
<label>Figure 7</label>
<caption>
<title>Accuracy of re-filtered pipeline SSM calls.</title>
<p>Unfiltered calls (MB.F0 and CLL.F0) are shown as red squares, while the calls using the tuned filters (MB.F2 and CLL.F2) are shown as red circles for the medulloblastoma (
<bold>a</bold>
) and CLL (
<bold>b</bold>
) benchmark GOLD sets. For MB, only the recall versus Tier 3 is shown. Overall, 1,019 (81.2%) of the medulloblastoma SSMs (indicated by the dotted line) are considered callable at 40 × coverage; 236 MB SSMs (18.8%) were not called by any pipeline. For CLL, verification was carried out on SSMs originally called on the 40 × data, which explains the higher recall.</p>
</caption>
<graphic xlink:href="ncomms10001-f7"></graphic>
</fig>
<table-wrap position="float" id="t1">
<label>Table 1</label>
<caption>
<title>Summary of medulloblastoma tumour-normal pair library construction and sequencing.</title>
</caption>
<table frame="hsides" rules="groups" border="1">
<colgroup>
<col align="left"></col>
<col align="center"></col>
<col align="char" char="."></col>
<col align="left"></col>
<col align="left"></col>
<col align="char" char="."></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
</colgroup>
<thead valign="bottom">
<tr>
<th align="left" valign="top" charoff="50">Library</th>
<th align="center" valign="top" charoff="50">Starting DNA (μg)</th>
<th align="center" valign="top" char="." charoff="50">Fragment Size (bp)</th>
<th align="left" valign="top" charoff="50">Size selection</th>
<th align="left" valign="top" charoff="50">Library protocol</th>
<th align="center" valign="top" char="." charoff="50">PCR cycles</th>
<th align="center" valign="top" charoff="50">Sequencing machine</th>
<th align="center" valign="top" charoff="50">Chemistry (Illumina)</th>
<th align="center" valign="top" charoff="50">Depth (×) control:tumour</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left" valign="top" charoff="50">L.A</td>
<td align="center" valign="top" charoff="50">4</td>
<td align="char" valign="top" char="." charoff="50">∼400</td>
<td align="left" valign="top" charoff="50">2% Agarose gel</td>
<td align="left" valign="top" charoff="50">KapaBio</td>
<td align="char" valign="top" char="." charoff="50">0</td>
<td align="center" valign="top" charoff="50">HiSeq 2500, HiSeq 2000, MiSeq</td>
<td align="center" valign="top" charoff="50">V1 (RR), V3, V2</td>
<td align="center" valign="top" charoff="50">29.6 : 40.5</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">L.B</td>
<td align="center" valign="top" charoff="50">1</td>
<td align="char" valign="top" char="." charoff="50">∼400</td>
<td align="left" valign="top" charoff="50">2% Agarose gel, Invitrogen E-gel</td>
<td align="left" valign="top" charoff="50">TruSeq DNA</td>
<td align="char" valign="top" char="." charoff="50">10</td>
<td align="center" valign="top" charoff="50">HiSeq 2000</td>
<td align="center" valign="top" charoff="50">V3</td>
<td align="center" valign="top" charoff="50">44.9 : 62.8</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">L.C</td>
<td align="center" valign="top" charoff="50">2.5</td>
<td align="char" valign="top" char="." charoff="50">∼500</td>
<td align="left" valign="top" charoff="50">2% Agarose gel</td>
<td align="left" valign="top" charoff="50">NEBNext</td>
<td align="char" valign="top" char="." charoff="50">12</td>
<td align="center" valign="top" charoff="50">HiSeq 2500, HiSeq 2000</td>
<td align="center" valign="top" charoff="50">V1 (RR), V3</td>
<td align="center" valign="top" charoff="50">58.9 : 66.8</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">L.D</td>
<td align="center" valign="top" charoff="50">1</td>
<td align="char" valign="top" char="." charoff="50">∼550</td>
<td align="left" valign="top" charoff="50">Agarose gel</td>
<td align="left" valign="top" charoff="50">TruSeq DNA</td>
<td align="char" valign="top" char="." charoff="50">10</td>
<td align="center" valign="top" charoff="50">HiSeq 2000</td>
<td align="center" valign="top" charoff="50">V3</td>
<td align="center" valign="top" charoff="50">35.3 : 39.1</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">L.E</td>
<td align="center" valign="top" charoff="50">2.8</td>
<td align="char" valign="top" char="." charoff="50">∼620</td>
<td align="left" valign="top" charoff="50">1.5% Agarose gel pippin</td>
<td align="left" valign="top" charoff="50">NEBNext</td>
<td align="char" valign="top" char="." charoff="50">0</td>
<td align="center" valign="top" charoff="50">HiSeq 2000</td>
<td align="center" valign="top" charoff="50">V3</td>
<td align="center" valign="top" charoff="50">40.5 : 60.4</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">L.F</td>
<td align="center" valign="top" charoff="50">1</td>
<td align="char" valign="top" char="." charoff="50">∼400</td>
<td align="left" valign="top" charoff="50">AMPureXP beads</td>
<td align="left" valign="top" charoff="50">NEBDNA</td>
<td align="char" valign="top" char="." charoff="50">10</td>
<td align="center" valign="top" charoff="50">HiSeq 2000</td>
<td align="center" valign="top" charoff="50">V3</td>
<td align="center" valign="top" charoff="50">38.7 : 37.9</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">L.G</td>
<td align="center" valign="top" charoff="50">1</td>
<td align="char" valign="top" char="." charoff="50">∼350</td>
<td align="left" valign="top" charoff="50">AMPureXP beads</td>
<td align="left" valign="top" charoff="50">TruSeq DNA PCR-Free</td>
<td align="char" valign="top" char="." charoff="50">0</td>
<td align="center" valign="top" charoff="50">HiSeq 2000</td>
<td align="center" valign="top" charoff="50">V3</td>
<td align="center" valign="top" charoff="50">19.4 : 19.3</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">L.H</td>
<td align="center" valign="top" charoff="50">0.5</td>
<td align="char" valign="top" char="." charoff="50">∼175</td>
<td align="left" valign="top" charoff="50">AMPureXP beads</td>
<td align="left" valign="top" charoff="50">SureSelect WGS</td>
<td align="char" valign="top" char="." charoff="50">10</td>
<td align="center" valign="top" charoff="50">HiSeq 2500</td>
<td align="center" valign="top" charoff="50">V3</td>
<td align="center" valign="top" charoff="50">28.7 : 26.5</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="t2">
<label>Table 2</label>
<caption>
<title>Classification of SSM and SIM Gold Set mutations for the medulloblastoma benchmark.</title>
</caption>
<table frame="hsides" rules="groups" border="1">
<colgroup>
<col align="left"></col>
<col align="left"></col>
<col align="char" char="."></col>
<col align="char" char="."></col>
</colgroup>
<thead valign="bottom">
<tr>
<th align="left" valign="top" charoff="50"> </th>
<th align="left" valign="top" charoff="50">Definition</th>
<th align="center" valign="top" char="." charoff="50">MB SSM</th>
<th align="center" valign="top" char="." charoff="50">MB SIM</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left" valign="top" charoff="50">Class 1</td>
<td align="left" valign="top" charoff="50">Mutant AF≥0.10</td>
<td align="char" valign="top" char="." charoff="50">962</td>
<td align="char" valign="top" char="." charoff="50">337</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Class 2</td>
<td align="left" valign="top" charoff="50">0.05≤Mutant AF<0.10</td>
<td align="char" valign="top" char="." charoff="50">139</td>
<td align="char" valign="top" char="." charoff="50"> </td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Class 3</td>
<td align="left" valign="top" charoff="50">Mutant AF<0.05</td>
<td align="char" valign="top" char="." charoff="50">154</td>
<td align="char" valign="top" char="." charoff="50"> </td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Class 4</td>
<td align="left" valign="top" charoff="50">Ambiguous alignment</td>
<td align="char" valign="top" char="." charoff="50">8</td>
<td align="char" valign="top" char="." charoff="50">10</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Class 5</td>
<td align="left" valign="top" charoff="50">High or low depth</td>
<td align="char" valign="top" char="." charoff="50">29</td>
<td align="char" valign="top" char="." charoff="50"> </td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Tier 1</td>
<td align="left" valign="top" charoff="50">Class 1</td>
<td align="char" valign="top" char="." charoff="50">962</td>
<td align="char" valign="top" char="." charoff="50">337</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Tier 2</td>
<td align="left" valign="top" charoff="50">Classes 1 and 2</td>
<td align="char" valign="top" char="." charoff="50">1,101</td>
<td align="char" valign="top" char="." charoff="50"> </td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Tier 3</td>
<td align="left" valign="top" charoff="50">Classes 1, 2 and 3</td>
<td align="char" valign="top" char="." charoff="50">1,255</td>
<td align="char" valign="top" char="." charoff="50"> </td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Tier 4</td>
<td align="left" valign="top" charoff="50">Classes 1, 2, 3 and 4</td>
<td align="char" valign="top" char="." charoff="50">1,263</td>
<td align="char" valign="top" char="." charoff="50">347</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">Tier 5</td>
<td align="left" valign="top" charoff="50">Classes 1, 2, 3, 4 and 5</td>
<td align="char" valign="top" char="." charoff="50">1,292</td>
<td align="char" valign="top" char="." charoff="50"> </td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="t2-fn1">
<p>AF, allele frequency; MB, medulloblastoma; SIM, somatic insertion/deletion mutations; SNP, single-nucleotide polymorphisms; SNV, single-nucleotide variant; SSM, somatic single-base mutation.</p>
</fn>
<fn id="t2-fn2">
<p>Numbers of curated mutations falling in each class or tier are shown. Successive tiers represent cumulative addition of lower AF mutations, followed by those supported by ambiguous alignments, and finally those with either too low or too high a depth. SIMs were not subjected to such fine classification, with calls only assigned to classes 1 and 4. Note that we use the terms SSM and SIM for somatic mutations instead of more commonly used terms that ought to be reserved for germline variants such as SNP (refers to a single base variable position in the germline with a frequency of >1% in the general population) or SNV (refers to any single base variable position in the germline including those with a frequency <1% in the general population).</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="t3">
<label>Table 3</label>
<caption>
<title>Summary of accuracy measures.</title>
</caption>
<table frame="hsides" rules="groups" border="1">
<colgroup>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
<col align="char" char="("></col>
<col align="char" char="."></col>
<col align="char" char="."></col>
<col align="char" char="."></col>
<col align="char" char="."></col>
<col align="char" char="."></col>
</colgroup>
<thead valign="bottom">
<tr>
<th align="left" valign="top" charoff="50">SSM calls</th>
<th align="left" valign="top" charoff="50">Aligner</th>
<th align="left" valign="top" charoff="50">SSM Detection Software</th>
<th align="center" valign="top" char="(" charoff="50">TP</th>
<th align="center" valign="top" char="." charoff="50">FP</th>
<th align="center" valign="top" char="." charoff="50">FN</th>
<th align="center" valign="top" char="." charoff="50">P</th>
<th align="center" valign="top" char="." charoff="50">R</th>
<th align="center" valign="top" char="." charoff="50">F1</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left" valign="top" charoff="50">MB.GOLD</td>
<td align="left" valign="top" charoff="50">BWA, GEM</td>
<td align="left" valign="top" charoff="50">Curated</td>
<td align="char" valign="top" char="(" charoff="50">1,255 (8)</td>
<td align="char" valign="top" char="." charoff="50">0</td>
<td align="char" valign="top" char="." charoff="50">0</td>
<td align="char" valign="top" char="." charoff="50">1.00</td>
<td align="char" valign="top" char="." charoff="50">1.00</td>
<td align="char" valign="top" char="." charoff="50">1.00</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.A</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">In-house</td>
<td align="char" valign="top" char="(" charoff="50">775 (0)</td>
<td align="char" valign="top" char="." charoff="50">147</td>
<td align="char" valign="top" char="." charoff="50">480</td>
<td align="char" valign="top" char="." charoff="50">0.84</td>
<td align="char" valign="top" char="." charoff="50">0.62</td>
<td align="char" valign="top" char="." charoff="50">0.71</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.B</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">samtools, VarScan</td>
<td align="char" valign="top" char="(" charoff="50">788 (1)</td>
<td align="char" valign="top" char="." charoff="50">12</td>
<td align="char" valign="top" char="." charoff="50">467</td>
<td align="char" valign="top" char="." charoff="50">0.99</td>
<td align="char" valign="top" char="." charoff="50">0.63</td>
<td align="char" valign="top" char="." charoff="50">0.77</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.C</td>
<td align="left" valign="top" charoff="50">GEM</td>
<td align="left" valign="top" charoff="50">samtools, bcftools</td>
<td align="char" valign="top" char="(" charoff="50">766 (3)</td>
<td align="char" valign="top" char="." charoff="50">1,025</td>
<td align="char" valign="top" char="." charoff="50">489</td>
<td align="char" valign="top" char="." charoff="50">0.43</td>
<td align="char" valign="top" char="." charoff="50">0.61</td>
<td align="char" valign="top" char="." charoff="50">0.50</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.D</td>
<td align="left" valign="top" charoff="50">n.a.</td>
<td align="left" valign="top" charoff="50">SMuFin</td>
<td align="char" valign="top" char="(" charoff="50">737 (4)</td>
<td align="char" valign="top" char="." charoff="50">1,086</td>
<td align="char" valign="top" char="." charoff="50">518</td>
<td align="char" valign="top" char="." charoff="50">0.41</td>
<td align="char" valign="top" char="." charoff="50">0.59</td>
<td align="char" valign="top" char="." charoff="50">0.48</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.E</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">SomaticSniper</td>
<td align="char" valign="top" char="(" charoff="50">750 (4)</td>
<td align="char" valign="top" char="." charoff="50">229</td>
<td align="char" valign="top" char="." charoff="50">505</td>
<td align="char" valign="top" char="." charoff="50">0.77</td>
<td align="char" valign="top" char="." charoff="50">0.60</td>
<td align="char" valign="top" char="." charoff="50">0.67</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.F</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Strelka</td>
<td align="char" valign="top" char="(" charoff="50">884 (2)</td>
<td align="char" valign="top" char="." charoff="50">165</td>
<td align="char" valign="top" char="." charoff="50">371</td>
<td align="char" valign="top" char="." charoff="50">0.84</td>
<td align="char" valign="top" char="." charoff="50">0.70</td>
<td align="char" valign="top" char="." charoff="50">0.77</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.G</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Caveman, Picnic</td>
<td align="char" valign="top" char="(" charoff="50">899 (3)</td>
<td align="char" valign="top" char="." charoff="50">140</td>
<td align="char" valign="top" char="." charoff="50">356</td>
<td align="char" valign="top" char="." charoff="50">0.87</td>
<td align="char" valign="top" char="." charoff="50">0.72</td>
<td align="char" valign="top" char="." charoff="50">0.78</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.H</td>
<td align="left" valign="top" charoff="50">Novoalign</td>
<td align="left" valign="top" charoff="50">MuTect</td>
<td align="char" valign="top" char="(" charoff="50">947 (3)</td>
<td align="char" valign="top" char="." charoff="50">6,296</td>
<td align="char" valign="top" char="." charoff="50">308</td>
<td align="char" valign="top" char="." charoff="50">0.13</td>
<td align="char" valign="top" char="." charoff="50">
<bold>0.76</bold>
</td>
<td align="char" valign="top" char="." charoff="50">0.22</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.I</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">samtools</td>
<td align="char" valign="top" char="(" charoff="50">879 (7)</td>
<td align="char" valign="top" char="." charoff="50">129</td>
<td align="char" valign="top" char="." charoff="50">376</td>
<td align="char" valign="top" char="." charoff="50">0.87</td>
<td align="char" valign="top" char="." charoff="50">0.70</td>
<td align="char" valign="top" char="." charoff="50">0.78</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.J</td>
<td align="left" valign="top" charoff="50">None, BWA</td>
<td align="left" valign="top" charoff="50">SGA+freebayes</td>
<td align="char" valign="top" char="(" charoff="50">856 (1)</td>
<td align="char" valign="top" char="." charoff="50">62</td>
<td align="char" valign="top" char="." charoff="50">399</td>
<td align="char" valign="top" char="." charoff="50">0.93</td>
<td align="char" valign="top" char="." charoff="50">0.68</td>
<td align="char" valign="top" char="." charoff="50">
<bold>0.79</bold>
</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.K</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Atlas2-snp</td>
<td align="char" valign="top" char="(" charoff="50">945 (8)</td>
<td align="char" valign="top" char="." charoff="50">7,923</td>
<td align="char" valign="top" char="." charoff="50">310</td>
<td align="char" valign="top" char="." charoff="50">0.11</td>
<td align="char" valign="top" char="." charoff="50">0.75</td>
<td align="char" valign="top" char="." charoff="50">0.19</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.L1</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">MuTect, Strelka</td>
<td align="char" valign="top" char="(" charoff="50">385 (0)</td>
<td align="char" valign="top" char="." charoff="50">3</td>
<td align="char" valign="top" char="." charoff="50">870</td>
<td align="char" valign="top" char="." charoff="50">
<bold>0.99</bold>
</td>
<td align="char" valign="top" char="." charoff="50">0.31</td>
<td align="char" valign="top" char="." charoff="50">0.47</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.L2</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">MuTect, Strelka</td>
<td align="char" valign="top" char="(" charoff="50">900 (1)</td>
<td align="char" valign="top" char="." charoff="50">253</td>
<td align="char" valign="top" char="." charoff="50">355</td>
<td align="char" valign="top" char="." charoff="50">0.78</td>
<td align="char" valign="top" char="." charoff="50">0.72</td>
<td align="char" valign="top" char="." charoff="50">0.75</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.M</td>
<td align="left" valign="top" charoff="50">BWA mem</td>
<td align="left" valign="top" charoff="50">samtools, GATK+MuTect</td>
<td align="char" valign="top" char="(" charoff="50">937 (4)</td>
<td align="char" valign="top" char="." charoff="50">1,695</td>
<td align="char" valign="top" char="." charoff="50">318</td>
<td align="char" valign="top" char="." charoff="50">0.36</td>
<td align="char" valign="top" char="." charoff="50">0.75</td>
<td align="char" valign="top" char="." charoff="50">0.48</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.N</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Strelka</td>
<td align="char" valign="top" char="(" charoff="50">847 (1)</td>
<td align="char" valign="top" char="." charoff="50">289</td>
<td align="char" valign="top" char="." charoff="50">408</td>
<td align="char" valign="top" char="." charoff="50">0.75</td>
<td align="char" valign="top" char="." charoff="50">0.68</td>
<td align="char" valign="top" char="." charoff="50">0.71</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.O</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">MuTect</td>
<td align="char" valign="top" char="(" charoff="50">944 (3)</td>
<td align="char" valign="top" char="." charoff="50">272</td>
<td align="char" valign="top" char="." charoff="50">311</td>
<td align="char" valign="top" char="." charoff="50">0.78</td>
<td align="char" valign="top" char="." charoff="50">0.75</td>
<td align="char" valign="top" char="." charoff="50">0.76</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.P</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Sidron</td>
<td align="char" valign="top" char="(" charoff="50">833 (3)</td>
<td align="char" valign="top" char="." charoff="50">256</td>
<td align="char" valign="top" char="." charoff="50">422</td>
<td align="char" valign="top" char="." charoff="50">0.77</td>
<td align="char" valign="top" char="." charoff="50">0.66</td>
<td align="char" valign="top" char="." charoff="50">0.71</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">MB.Q</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">qSNP+GATK</td>
<td align="char" valign="top" char="(" charoff="50">842 (2)</td>
<td align="char" valign="top" char="." charoff="50">25</td>
<td align="char" valign="top" char="." charoff="50">413</td>
<td align="char" valign="top" char="." charoff="50">0.97</td>
<td align="char" valign="top" char="." charoff="50">0.67</td>
<td align="char" valign="top" char="." charoff="50">
<bold>0.79</bold>
</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
<tr>
<td colspan="9" align="left" valign="top" charoff="50">
<bold>SIM calls</bold>
</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.GOLD</td>
<td align="left" valign="top" charoff="50">BWA, GEM</td>
<td align="left" valign="top" charoff="50">Curated</td>
<td align="char" valign="top" char="(" charoff="50">337 (10)</td>
<td align="char" valign="top" char="." charoff="50">0</td>
<td align="char" valign="top" char="." charoff="50">0</td>
<td align="char" valign="top" char="." charoff="50">1.00</td>
<td align="char" valign="top" char="." charoff="50">1.00</td>
<td align="char" valign="top" char="." charoff="50">1.00</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.A</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">In-house</td>
<td align="char" valign="top" char="(" charoff="50">16 (0)</td>
<td align="char" valign="top" char="." charoff="50">63</td>
<td align="char" valign="top" char="." charoff="50">321</td>
<td align="char" valign="top" char="." charoff="50">0.20</td>
<td align="char" valign="top" char="." charoff="50">0.05</td>
<td align="char" valign="top" char="." charoff="50">0.08</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.B</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">GATK SomaticIndelDetector, VarScan</td>
<td align="char" valign="top" char="(" charoff="50">167 (0)</td>
<td align="char" valign="top" char="." charoff="50">20</td>
<td align="char" valign="top" char="." charoff="50">173</td>
<td align="char" valign="top" char="." charoff="50">0.89</td>
<td align="char" valign="top" char="." charoff="50">0.49</td>
<td align="char" valign="top" char="." charoff="50">0.63</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.C</td>
<td align="left" valign="top" charoff="50">GEM</td>
<td align="left" valign="top" charoff="50">samtools, bcftools</td>
<td align="char" valign="top" char="(" charoff="50">103 (0)</td>
<td align="char" valign="top" char="." charoff="50">26</td>
<td align="char" valign="top" char="." charoff="50">236</td>
<td align="char" valign="top" char="." charoff="50">0.80</td>
<td align="char" valign="top" char="." charoff="50">0.30</td>
<td align="char" valign="top" char="." charoff="50">0.44</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.D</td>
<td align="left" valign="top" charoff="50">none</td>
<td align="left" valign="top" charoff="50">SMuFin</td>
<td align="char" valign="top" char="(" charoff="50">29 (0)</td>
<td align="char" valign="top" char="." charoff="50">25</td>
<td align="char" valign="top" char="." charoff="50">308</td>
<td align="char" valign="top" char="." charoff="50">0.54</td>
<td align="char" valign="top" char="." charoff="50">0.09</td>
<td align="char" valign="top" char="." charoff="50">0.15</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.F</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Strelka</td>
<td align="char" valign="top" char="(" charoff="50">147 (8)</td>
<td align="char" valign="top" char="." charoff="50">12</td>
<td align="char" valign="top" char="." charoff="50">193</td>
<td align="char" valign="top" char="." charoff="50">0.93</td>
<td align="char" valign="top" char="." charoff="50">0.43</td>
<td align="char" valign="top" char="." charoff="50">0.58</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.G</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Pindel</td>
<td align="char" valign="top" char="(" charoff="50">189 (2)</td>
<td align="char" valign="top" char="." charoff="50">82</td>
<td align="char" valign="top" char="." charoff="50">152</td>
<td align="char" valign="top" char="." charoff="50">0.70</td>
<td align="char" valign="top" char="." charoff="50">0.55</td>
<td align="char" valign="top" char="." charoff="50">0.61</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.H</td>
<td align="left" valign="top" charoff="50">Novoalign</td>
<td align="left" valign="top" charoff="50">VarScan2</td>
<td align="char" valign="top" char="(" charoff="50">55 (0)</td>
<td align="char" valign="top" char="." charoff="50">248</td>
<td align="char" valign="top" char="." charoff="50">282</td>
<td align="char" valign="top" char="." charoff="50">0.18</td>
<td align="char" valign="top" char="." charoff="50">0.16</td>
<td align="char" valign="top" char="." charoff="50">0.17</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.I</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Platypus</td>
<td align="char" valign="top" char="(" charoff="50">271 (7)</td>
<td align="char" valign="top" char="." charoff="50">224</td>
<td align="char" valign="top" char="." charoff="50">70</td>
<td align="char" valign="top" char="." charoff="50">0.55</td>
<td align="char" valign="top" char="." charoff="50">
<bold>0.79</bold>
</td>
<td align="char" valign="top" char="." charoff="50">
<bold>0.65</bold>
</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.J</td>
<td align="left" valign="top" charoff="50">None</td>
<td align="left" valign="top" charoff="50">SGA</td>
<td align="char" valign="top" char="(" charoff="50">90 (1)</td>
<td align="char" valign="top" char="." charoff="50">34</td>
<td align="char" valign="top" char="." charoff="50">249</td>
<td align="char" valign="top" char="." charoff="50">0.72</td>
<td align="char" valign="top" char="." charoff="50">0.26</td>
<td align="char" valign="top" char="." charoff="50">0.38</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.K</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Atlas2-indel</td>
<td align="char" valign="top" char="(" charoff="50">268 (6)</td>
<td align="char" valign="top" char="." charoff="50">444</td>
<td align="char" valign="top" char="." charoff="50">72</td>
<td align="char" valign="top" char="." charoff="50">0.38</td>
<td align="char" valign="top" char="." charoff="50">0.79</td>
<td align="char" valign="top" char="." charoff="50">0.51</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.L1</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Strelka</td>
<td align="char" valign="top" char="(" charoff="50">64 (1)</td>
<td align="char" valign="top" char="." charoff="50">3</td>
<td align="char" valign="top" char="." charoff="50">273</td>
<td align="char" valign="top" char="." charoff="50">
<bold>0.96</bold>
</td>
<td align="char" valign="top" char="." charoff="50">0.19</td>
<td align="char" valign="top" char="." charoff="50">0.32</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.L2</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Strelka</td>
<td align="char" valign="top" char="(" charoff="50">130 (3)</td>
<td align="char" valign="top" char="." charoff="50">13</td>
<td align="char" valign="top" char="." charoff="50">210</td>
<td align="char" valign="top" char="." charoff="50">0.91</td>
<td align="char" valign="top" char="." charoff="50">0.38</td>
<td align="char" valign="top" char="." charoff="50">0.53</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.N</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Strelka</td>
<td align="char" valign="top" char="(" charoff="50">128 (6)</td>
<td align="char" valign="top" char="." charoff="50">16</td>
<td align="char" valign="top" char="." charoff="50">209</td>
<td align="char" valign="top" char="." charoff="50">0.89</td>
<td align="char" valign="top" char="." charoff="50">0.38</td>
<td align="char" valign="top" char="." charoff="50">0.53</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.O</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">GATK SomaticIndelDetector</td>
<td align="char" valign="top" char="(" charoff="50">140 (1)</td>
<td align="char" valign="top" char="." charoff="50">47</td>
<td align="char" valign="top" char="." charoff="50">197</td>
<td align="char" valign="top" char="." charoff="50">0.75</td>
<td align="char" valign="top" char="." charoff="50">0.42</td>
<td align="char" valign="top" char="." charoff="50">0.53</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.P</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">bcftools, PolyFilter</td>
<td align="char" valign="top" char="(" charoff="50">37 (0)</td>
<td align="char" valign="top" char="." charoff="50">57</td>
<td align="char" valign="top" char="." charoff="50">301</td>
<td align="char" valign="top" char="." charoff="50">0.39</td>
<td align="char" valign="top" char="." charoff="50">0.11</td>
<td align="char" valign="top" char="." charoff="50">0.17</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50"> MB.Q</td>
<td align="left" valign="top" charoff="50">BWA</td>
<td align="left" valign="top" charoff="50">Pindel</td>
<td align="char" valign="top" char="(" charoff="50">100 (2)</td>
<td align="char" valign="top" char="." charoff="50">61</td>
<td align="char" valign="top" char="." charoff="50">237</td>
<td align="char" valign="top" char="." charoff="50">0.63</td>
<td align="char" valign="top" char="." charoff="50">0.30</td>
<td align="char" valign="top" char="." charoff="50">0.40</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="t-fn1">
<p>F1, F1 score; FN, false negatives; FP, false positives; P, precision; R, recall; TP, true positives.</p>
</fn>
<fn id="t-fn2">
<p>Evaluation results are shown with respect to the medulloblastoma Gold Set (Tier 3). The columns give the number of true positive calls (TP), with additional Tier 4 calls in parentheses, followed by the number of FP, the number of FN, P, R and F1. The submissions with the best precision, recall and F1 score are in bold.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</floats-group>
</pmc>
</record>
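
The P, R and F1 columns in the table above can be related to the TP, FP and FN counts with the standard definitions. The following is a minimal sketch, not the authors' evaluation code: the names `CallCounts`, `precision`, `recall` and `f1_score` are hypothetical, and it assumes that the Tier 4 calls in parentheses are ignored. The source does not state how Tier 4 calls enter the published values, so some rows may differ from this sketch in the last digit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallCounts:
    """Counts for one submission (hypothetical container, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives


def precision(c: CallCounts) -> float:
    """Fraction of submitted calls that are in the Gold Set: TP / (TP + FP)."""
    called = c.tp + c.fp
    return c.tp / called if called else 0.0


def recall(c: CallCounts) -> float:
    """Fraction of Gold Set calls that were recovered: TP / (TP + FN)."""
    truth = c.tp + c.fn
    return c.tp / truth if truth else 0.0


def f1_score(c: CallCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Counts taken from the MB.D row (SMuFin): TP = 29, FP = 25, FN = 308.
    mb_d = CallCounts(tp=29, fp=25, fn=308)
    print(round(precision(mb_d), 2), round(recall(mb_d), 2), round(f1_score(mb_d), 2))
    # prints: 0.54 0.09 0.15
```

For the MB.D row this reproduces the tabulated values (0.54, 0.09, 0.15). The zero-denominator guards are a design choice of the sketch, so that a submission with no calls yields 0.0 rather than raising an error.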

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000748 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000748 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Wicri/Asie
   |area=    AustralieFrV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4682041
   |texte=   A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:26647970" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a AustralieFrV1 

This area was generated with Dilib version V0.6.33.
Data generation: Tue Dec 5 10:43:12 2017. Site generation: Tue Mar 5 14:07:20 2024