MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Reference genome-independent assessment of mutation density using restriction enzyme-phased sequencing

Internal identifier: 000A98 (Pmc/Corpus); previous: 000A97; next: 000A99

Authors: Jennifer Monson-Miller; Diana C. Sanchez-Mendez; Joseph Fass; Isabelle M. Henry; Thomas H. Tai; Luca Comai

Source: BMC Genomics, 2012

RBID : PMC:3305632

Abstract

Background

The availability of low-cost sequencing has spurred its application to the discovery and typing of variation, including variation induced by mutagenesis. Mutation discovery is challenging because it requires a substantial amount of sequencing and analysis to detect very rare changes and distinguish them from noise. Also challenging are cases in which the organism of interest has not been sequenced or is highly divergent from the reference.

Results

We describe the development of a simple method for reduced representation sequencing. Input DNA was digested with a single restriction enzyme and ligated to Y adapters modified to contain a sequence barcode and to provide a compatible overhang for ligation. We demonstrated the efficiency of this method for SNP discovery using rice and Arabidopsis. To test its suitability for the discovery of very rare SNPs, one control and three mutagenized rice individuals (1, 5 and 10 mM sodium azide) were used to prepare genomic libraries for Illumina sequencers by ligating barcoded adapters to NlaIII restriction sites. For genome-dependent discovery, 15-30 million 80-base reads per individual were aligned to the reference sequence, achieving individual sequencing coverage of 7 to 15×. We identified high-confidence base changes by comparing sequences across individuals and identified instances consistent with mutations, i.e. changes that were found in a single treated individual and were solely GC to AT transitions. For genome-independent discovery, 70-mers were extracted from the sequence of the control individual, and single-copy sequence was identified by comparing the 70-mers across samples to evaluate copy number and variation. This de novo "genome" was used to align the reads and identify mutations as above. Covering approximately 1/5 of the 380 Mb genome of rice, we detected mutation densities ranging from 0.6 to 4 per Mb of diploid DNA, depending on the mutagenic treatment.
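
The mutation-calling criteria in this paragraph can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the `calls` data layout, and the toy sample labels and counts are hypothetical. Only the two filtering criteria (a change found in a single treated individual, restricted to GC to AT transitions) and the assayed fraction (about 1/5 of the 380 Mb genome) are taken from the abstract.

```python
from collections import Counter


def is_gc_to_at(ref_base, alt_base):
    """Return True for G>A or C>T, the two strand-equivalent GC-to-AT transitions."""
    return (ref_base, alt_base) in {("G", "A"), ("C", "T")}


def private_transitions(calls, control):
    """Keep GC-to-AT changes that occur in exactly one sample.

    calls   : {sample_name: {(locus, position): (ref_base, alt_base)}}
              (hypothetical layout chosen for this sketch)
    control : name of the untreated sample; it is used for comparison
              but never reported as carrying induced mutations.
    """
    # Count in how many samples each variant site was called.
    seen = Counter(site for sample_calls in calls.values() for site in sample_calls)
    kept = {}
    for sample, sample_calls in calls.items():
        if sample == control:
            continue
        kept[sample] = [
            site
            for site, (ref, alt) in sample_calls.items()
            if seen[site] == 1 and is_gc_to_at(ref, alt)
        ]
    return kept


def density_per_mb(n_mutations, assayed_bp):
    """Mutations per megabase of assayed sequence (ploidy is not modelled here)."""
    return n_mutations / (assayed_bp / 1e6)


# Toy input: sample names, sites and bases are invented for illustration only.
toy_calls = {
    "control": {("locus1", 10): ("G", "A")},
    "azide_1mM": {
        ("locus1", 10): ("G", "A"),  # shared with control, so not induced
        ("locus2", 5): ("C", "T"),   # private GC-to-AT transition
        ("locus3", 7): ("A", "G"),   # private, but not a GC-to-AT change
    },
    "azide_5mM": {("locus4", 3): ("G", "A")},
}
candidates = private_transitions(toy_calls, control="control")

# About 1/5 of the 380 Mb rice genome is assayed, i.e. roughly 76 Mb.
assayed_bp = 380e6 / 5
```

With the assayed space fixed at roughly 76 Mb, `density_per_mb(count, assayed_bp)` converts a candidate count for one individual into a per-megabase figure comparable across treatments.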

Conclusions

The combination of a simple and cost-effective library construction method with Illumina sequencing and a bioinformatic pipeline allows practical SNP discovery regardless of whether a genomic reference is available.
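
The reference-free route mentioned in the Results, building a de novo "genome" from 70-mers of the control individual, might be sketched as follows. This is a rough illustration only. The abstract does not state how the 70-mers were taken from reads or which copy-number criteria were applied, so the assumptions here are hypothetical: each read contributes its first `k` bases (reads are phased by the restriction site), and a sequence is treated as putatively single-copy when its count falls inside an arbitrary window. The function name and thresholds are invented for the example.

```python
from collections import Counter


def putative_single_copy(reads, k=70, min_count=2, max_count=4):
    """Return the read prefixes of length k whose abundance looks single-copy.

    reads     : iterable of read sequences from the control sample
    k         : prefix length (70 in the abstract)
    min_count : hypothetical lower bound, discards likely sequencing errors
    max_count : hypothetical upper bound, discards likely repeats
    """
    # Reads shorter than k cannot supply a full-length prefix and are skipped.
    counts = Counter(read[:k] for read in reads if len(read) >= k)
    return {seq for seq, n in counts.items() if min_count <= n <= max_count}


# Toy reads with k=6 so the example stays readable; sequences are invented.
toy_reads = ["CATGAAGG", "CATGAATT", "CATGCCAA"] + ["CATGTTAC"] * 5 + ["CATG"]
pseudo_reference = putative_single_copy(toy_reads, k=6)
```

In a real analysis the count window would have to be derived from the observed coverage distribution, and the comparison across samples described in the abstract would be an additional step.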


Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3305632
DOI: 10.1186/1471-2164-13-72
PubMed: 22333298
PubMed Central: 3305632

Links to Exploration step

PMC:3305632

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Reference genome-independent assessment of mutation density using restriction enzyme-phased sequencing</title>
<author>
<name sortKey="Monson Miller, Jennifer" sort="Monson Miller, Jennifer" uniqKey="Monson Miller J" first="Jennifer" last="Monson-Miller">Jennifer Monson-Miller</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Biology and Genome Center, UC Davis, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sanchez Mendez, Diana C" sort="Sanchez Mendez, Diana C" uniqKey="Sanchez Mendez D" first="Diana C" last="Sanchez-Mendez">Diana C. Sanchez-Mendez</name>
<affiliation>
<nlm:aff id="I2">Crops Pathology and Genetics Research Unit, U.S. Department of Agriculture, Agricultural Research Service, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fass, Joseph" sort="Fass, Joseph" uniqKey="Fass J" first="Joseph" last="Fass">Joseph Fass</name>
<affiliation>
<nlm:aff id="I3">Bioinformatics Core, Genome Center, UC Davis, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Henry, Isabelle M" sort="Henry, Isabelle M" uniqKey="Henry I" first="Isabelle M" last="Henry">Isabelle M. Henry</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Biology and Genome Center, UC Davis, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tai, Thomas H" sort="Tai, Thomas H" uniqKey="Tai T" first="Thomas H" last="Tai">Thomas H. Tai</name>
<affiliation>
<nlm:aff id="I2">Crops Pathology and Genetics Research Unit, U.S. Department of Agriculture, Agricultural Research Service, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Comai, Luca" sort="Comai, Luca" uniqKey="Comai L" first="Luca" last="Comai">Luca Comai</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Biology and Genome Center, UC Davis, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">22333298</idno>
<idno type="pmc">3305632</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3305632</idno>
<idno type="RBID">PMC:3305632</idno>
<idno type="doi">10.1186/1471-2164-13-72</idno>
<date when="2012">2012</date>
<idno type="wicri:Area/Pmc/Corpus">000A98</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000A98</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Reference genome-independent assessment of mutation density using restriction enzyme-phased sequencing</title>
<author>
<name sortKey="Monson Miller, Jennifer" sort="Monson Miller, Jennifer" uniqKey="Monson Miller J" first="Jennifer" last="Monson-Miller">Jennifer Monson-Miller</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Biology and Genome Center, UC Davis, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sanchez Mendez, Diana C" sort="Sanchez Mendez, Diana C" uniqKey="Sanchez Mendez D" first="Diana C" last="Sanchez-Mendez">Diana C. Sanchez-Mendez</name>
<affiliation>
<nlm:aff id="I2">Crops Pathology and Genetics Research Unit, U.S. Department of Agriculture, Agricultural Research Service, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fass, Joseph" sort="Fass, Joseph" uniqKey="Fass J" first="Joseph" last="Fass">Joseph Fass</name>
<affiliation>
<nlm:aff id="I3">Bioinformatics Core, Genome Center, UC Davis, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Henry, Isabelle M" sort="Henry, Isabelle M" uniqKey="Henry I" first="Isabelle M" last="Henry">Isabelle M. Henry</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Biology and Genome Center, UC Davis, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tai, Thomas H" sort="Tai, Thomas H" uniqKey="Tai T" first="Thomas H" last="Tai">Thomas H. Tai</name>
<affiliation>
<nlm:aff id="I2">Crops Pathology and Genetics Research Unit, U.S. Department of Agriculture, Agricultural Research Service, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Comai, Luca" sort="Comai, Luca" uniqKey="Comai L" first="Luca" last="Comai">Luca Comai</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Biology and Genome Center, UC Davis, Davis, California 95616, USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Genomics</title>
<idno type="eISSN">1471-2164</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>The availability of low-cost sequencing has spurred its application to the discovery and typing of variation, including variation induced by mutagenesis. Mutation discovery is challenging because it requires a substantial amount of sequencing and analysis to detect very rare changes and distinguish them from noise. Also challenging are cases in which the organism of interest has not been sequenced or is highly divergent from the reference.</p>
</sec>
<sec>
<title>Results</title>
<p>We describe the development of a simple method for reduced representation sequencing. Input DNA was digested with a single restriction enzyme and ligated to Y adapters modified to contain a sequence barcode and to provide a compatible overhang for ligation. We demonstrated the efficiency of this method for SNP discovery using rice and Arabidopsis. To test its suitability for the discovery of very rare SNPs, one control and three mutagenized rice individuals (1, 5 and 10 mM sodium azide) were used to prepare genomic libraries for Illumina sequencers by ligating barcoded adapters to
<italic>NlaIII </italic>
restriction sites. For genome-dependent discovery, 15-30 million 80-base reads per individual were aligned to the reference sequence, achieving individual sequencing coverage of 7 to 15×. We identified high-confidence base changes by comparing sequences across individuals and identified instances consistent with mutations, i.e. changes that were found in a single treated individual and were solely GC to AT transitions. For genome-independent discovery, 70-mers were extracted from the sequence of the control individual, and single-copy sequence was identified by comparing the 70-mers across samples to evaluate copy number and variation. This
<italic>de novo </italic>
"genome" was used to align the reads and identify mutations as above. Covering approximately 1/5 of the 380 Mb genome of rice, we detected mutation densities ranging from 0.6 to 4 per Mb of diploid DNA, depending on the mutagenic treatment.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>The combination of a simple and cost-effective library construction method with Illumina sequencing and a bioinformatic pipeline allows practical SNP discovery regardless of whether a genomic reference is available.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Comai, L" uniqKey="Comai L">L Comai</name>
</author>
<author>
<name sortKey="Henikoff, S" uniqKey="Henikoff S">S Henikoff</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tsai, H" uniqKey="Tsai H">H Tsai</name>
</author>
<author>
<name sortKey="Howell, T" uniqKey="Howell T">T Howell</name>
</author>
<author>
<name sortKey="Nitcher, R" uniqKey="Nitcher R">R Nitcher</name>
</author>
<author>
<name sortKey="Missirian, V" uniqKey="Missirian V">V Missirian</name>
</author>
<author>
<name sortKey="Watson, B" uniqKey="Watson B">B Watson</name>
</author>
<author>
<name sortKey="Ngo, Kj" uniqKey="Ngo K">KJ Ngo</name>
</author>
<author>
<name sortKey="Lieberman, M" uniqKey="Lieberman M">M Lieberman</name>
</author>
<author>
<name sortKey="Fass, J" uniqKey="Fass J">J Fass</name>
</author>
<author>
<name sortKey="Uauy, C" uniqKey="Uauy C">C Uauy</name>
</author>
<author>
<name sortKey="Tran, Rk" uniqKey="Tran R">RK Tran</name>
</author>
<author>
<name sortKey="Khan, Aa" uniqKey="Khan A">AA Khan</name>
</author>
<author>
<name sortKey="Filkov, V" uniqKey="Filkov V">V Filkov</name>
</author>
<author>
<name sortKey="Tai, Th" uniqKey="Tai T">TH Tai</name>
</author>
<author>
<name sortKey="Dubcovsky, J" uniqKey="Dubcovsky J">J Dubcovsky</name>
</author>
<author>
<name sortKey="Comai, L" uniqKey="Comai L">L Comai</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Missirian, V" uniqKey="Missirian V">V Missirian</name>
</author>
<author>
<name sortKey="Comai, L" uniqKey="Comai L">L Comai</name>
</author>
<author>
<name sortKey="Filkov, V" uniqKey="Filkov V">V Filkov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ossowski, S" uniqKey="Ossowski S">S Ossowski</name>
</author>
<author>
<name sortKey="Schneeberger, K" uniqKey="Schneeberger K">K Schneeberger</name>
</author>
<author>
<name sortKey="Lucas Lledo, Ji" uniqKey="Lucas Lledo J">JI Lucas-Lledo</name>
</author>
<author>
<name sortKey="Warthmann, N" uniqKey="Warthmann N">N Warthmann</name>
</author>
<author>
<name sortKey="Clark, Rm" uniqKey="Clark R">RM Clark</name>
</author>
<author>
<name sortKey="Shaw, Rg" uniqKey="Shaw R">RG Shaw</name>
</author>
<author>
<name sortKey="Weigel, D" uniqKey="Weigel D">D Weigel</name>
</author>
<author>
<name sortKey="Lynch, M" uniqKey="Lynch M">M Lynch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ng, Sb" uniqKey="Ng S">SB Ng</name>
</author>
<author>
<name sortKey="Turner, Eh" uniqKey="Turner E">EH Turner</name>
</author>
<author>
<name sortKey="Robertson, Pd" uniqKey="Robertson P">PD Robertson</name>
</author>
<author>
<name sortKey="Flygare, Sd" uniqKey="Flygare S">SD Flygare</name>
</author>
<author>
<name sortKey="Bigham, Aw" uniqKey="Bigham A">AW Bigham</name>
</author>
<author>
<name sortKey="Lee, C" uniqKey="Lee C">C Lee</name>
</author>
<author>
<name sortKey="Shaffer, T" uniqKey="Shaffer T">T Shaffer</name>
</author>
<author>
<name sortKey="Wong, M" uniqKey="Wong M">M Wong</name>
</author>
<author>
<name sortKey="Bhattacharjee, A" uniqKey="Bhattacharjee A">A Bhattacharjee</name>
</author>
<author>
<name sortKey="Eichler, Ee" uniqKey="Eichler E">EE Eichler</name>
</author>
<author>
<name sortKey="Bamshad, M" uniqKey="Bamshad M">M Bamshad</name>
</author>
<author>
<name sortKey="Nickerson, Da" uniqKey="Nickerson D">DA Nickerson</name>
</author>
<author>
<name sortKey="Shendure, J" uniqKey="Shendure J">J Shendure</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altshuler, D" uniqKey="Altshuler D">D Altshuler</name>
</author>
<author>
<name sortKey="Pollara, Vj" uniqKey="Pollara V">VJ Pollara</name>
</author>
<author>
<name sortKey="Cowles, Cr" uniqKey="Cowles C">CR Cowles</name>
</author>
<author>
<name sortKey="Van Etten, Wj" uniqKey="Van Etten W">WJ Van Etten</name>
</author>
<author>
<name sortKey="Baldwin, J" uniqKey="Baldwin J">J Baldwin</name>
</author>
<author>
<name sortKey="Linton, L" uniqKey="Linton L">L Linton</name>
</author>
<author>
<name sortKey="Lander, Es" uniqKey="Lander E">ES Lander</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Baird, Na" uniqKey="Baird N">NA Baird</name>
</author>
<author>
<name sortKey="Etter, Pd" uniqKey="Etter P">PD Etter</name>
</author>
<author>
<name sortKey="Atwood, Ts" uniqKey="Atwood T">TS Atwood</name>
</author>
<author>
<name sortKey="Currey, Mc" uniqKey="Currey M">MC Currey</name>
</author>
<author>
<name sortKey="Shiver, Al" uniqKey="Shiver A">AL Shiver</name>
</author>
<author>
<name sortKey="Lewis, Za" uniqKey="Lewis Z">ZA Lewis</name>
</author>
<author>
<name sortKey="Selker, Eu" uniqKey="Selker E">EU Selker</name>
</author>
<author>
<name sortKey="Cresko, Wa" uniqKey="Cresko W">WA Cresko</name>
</author>
<author>
<name sortKey="Johnson, Ea" uniqKey="Johnson E">EA Johnson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Elshire, Rj" uniqKey="Elshire R">RJ Elshire</name>
</author>
<author>
<name sortKey="Glaubitz, Jc" uniqKey="Glaubitz J">JC Glaubitz</name>
</author>
<author>
<name sortKey="Sun, Q" uniqKey="Sun Q">Q Sun</name>
</author>
<author>
<name sortKey="Poland, Ja" uniqKey="Poland J">JA Poland</name>
</author>
<author>
<name sortKey="Kawamoto, K" uniqKey="Kawamoto K">K Kawamoto</name>
</author>
<author>
<name sortKey="Buckler, Es" uniqKey="Buckler E">ES Buckler</name>
</author>
<author>
<name sortKey="Mitchell, Se" uniqKey="Mitchell S">SE Mitchell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goff, Sa" uniqKey="Goff S">SA Goff</name>
</author>
<author>
<name sortKey="Ricke, D" uniqKey="Ricke D">D Ricke</name>
</author>
<author>
<name sortKey="Lan, Th" uniqKey="Lan T">TH Lan</name>
</author>
<author>
<name sortKey="Presting, G" uniqKey="Presting G">G Presting</name>
</author>
<author>
<name sortKey="Wang, R" uniqKey="Wang R">R Wang</name>
</author>
<author>
<name sortKey="Dunn, M" uniqKey="Dunn M">M Dunn</name>
</author>
<author>
<name sortKey="Glazebrook, J" uniqKey="Glazebrook J">J Glazebrook</name>
</author>
<author>
<name sortKey="Sessions, A" uniqKey="Sessions A">A Sessions</name>
</author>
<author>
<name sortKey="Oeller, P" uniqKey="Oeller P">P Oeller</name>
</author>
<author>
<name sortKey="Varma, H" uniqKey="Varma H">H Varma</name>
</author>
<author>
<name sortKey="Hadley, D" uniqKey="Hadley D">D Hadley</name>
</author>
<author>
<name sortKey="Hutchison, D" uniqKey="Hutchison D">D Hutchison</name>
</author>
<author>
<name sortKey="Martin, C" uniqKey="Martin C">C Martin</name>
</author>
<author>
<name sortKey="Katagiri, F" uniqKey="Katagiri F">F Katagiri</name>
</author>
<author>
<name sortKey="Lange, Bm" uniqKey="Lange B">BM Lange</name>
</author>
<author>
<name sortKey="Moughamer, T" uniqKey="Moughamer T">T Moughamer</name>
</author>
<author>
<name sortKey="Xia, Y" uniqKey="Xia Y">Y Xia</name>
</author>
<author>
<name sortKey="Budworth, P" uniqKey="Budworth P">P Budworth</name>
</author>
<author>
<name sortKey="Zhong, J" uniqKey="Zhong J">J Zhong</name>
</author>
<author>
<name sortKey="Miguel, T" uniqKey="Miguel T">T Miguel</name>
</author>
<author>
<name sortKey="Paszkowski, U" uniqKey="Paszkowski U">U Paszkowski</name>
</author>
<author>
<name sortKey="Zhang, S" uniqKey="Zhang S">S Zhang</name>
</author>
<author>
<name sortKey="Colbert, M" uniqKey="Colbert M">M Colbert</name>
</author>
<author>
<name sortKey="Sun, Wl" uniqKey="Sun W">WL Sun</name>
</author>
<author>
<name sortKey="Chen, L" uniqKey="Chen L">L Chen</name>
</author>
<author>
<name sortKey="Cooper, B" uniqKey="Cooper B">B Cooper</name>
</author>
<author>
<name sortKey="Park, S" uniqKey="Park S">S Park</name>
</author>
<author>
<name sortKey="Wood, Tc" uniqKey="Wood T">TC Wood</name>
</author>
<author>
<name sortKey="Mao, L" uniqKey="Mao L">L Mao</name>
</author>
<author>
<name sortKey="Quail, P" uniqKey="Quail P">P Quail</name>
</author>
<author>
<name sortKey="Wing, R" uniqKey="Wing R">R Wing</name>
</author>
<author>
<name sortKey="Dean, R" uniqKey="Dean R">R Dean</name>
</author>
<author>
<name sortKey="Yu, Y" uniqKey="Yu Y">Y Yu</name>
</author>
<author>
<name sortKey="Zharkikh, A" uniqKey="Zharkikh A">A Zharkikh</name>
</author>
<author>
<name sortKey="Shen, R" uniqKey="Shen R">R Shen</name>
</author>
<author>
<name sortKey="Sahasrabudhe, S" uniqKey="Sahasrabudhe S">S Sahasrabudhe</name>
</author>
<author>
<name sortKey="Thomas, A" uniqKey="Thomas A">A Thomas</name>
</author>
<author>
<name sortKey="Cannings, R" uniqKey="Cannings R">R Cannings</name>
</author>
<author>
<name sortKey="Gutin, A" uniqKey="Gutin A">A Gutin</name>
</author>
<author>
<name sortKey="Pruss, D" uniqKey="Pruss D">D Pruss</name>
</author>
<author>
<name sortKey="Reid, J" uniqKey="Reid J">J Reid</name>
</author>
<author>
<name sortKey="Tavtigian, S" uniqKey="Tavtigian S">S Tavtigian</name>
</author>
<author>
<name sortKey="Mitchell, J" uniqKey="Mitchell J">J Mitchell</name>
</author>
<author>
<name sortKey="Eldredge, G" uniqKey="Eldredge G">G Eldredge</name>
</author>
<author>
<name sortKey="Scholl, T" uniqKey="Scholl T">T Scholl</name>
</author>
<author>
<name sortKey="Miller, Rm" uniqKey="Miller R">RM Miller</name>
</author>
<author>
<name sortKey="Bhatnagar, S" uniqKey="Bhatnagar S">S Bhatnagar</name>
</author>
<author>
<name sortKey="Adey, N" uniqKey="Adey N">N Adey</name>
</author>
<author>
<name sortKey="Rubano, T" uniqKey="Rubano T">T Rubano</name>
</author>
<author>
<name sortKey="Tusneem, N" uniqKey="Tusneem N">N Tusneem</name>
</author>
<author>
<name sortKey="Robinson, R" uniqKey="Robinson R">R Robinson</name>
</author>
<author>
<name sortKey="Feldhaus, J" uniqKey="Feldhaus J">J Feldhaus</name>
</author>
<author>
<name sortKey="Macalma, T" uniqKey="Macalma T">T Macalma</name>
</author>
<author>
<name sortKey="Oliphant, A" uniqKey="Oliphant A">A Oliphant</name>
</author>
<author>
<name sortKey="Briggs, S" uniqKey="Briggs S">S Briggs</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Initiative, Ag" uniqKey="Initiative A">AG Initiative</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wei, H" uniqKey="Wei H">H Wei</name>
</author>
<author>
<name sortKey="Therrien, C" uniqKey="Therrien C">C Therrien</name>
</author>
<author>
<name sortKey="Blanchard, A" uniqKey="Blanchard A">A Blanchard</name>
</author>
<author>
<name sortKey="Guan, S" uniqKey="Guan S">S Guan</name>
</author>
<author>
<name sortKey="Zhu, Z" uniqKey="Zhu Z">Z Zhu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Quail, Ma" uniqKey="Quail M">MA Quail</name>
</author>
<author>
<name sortKey="Kozarewa, I" uniqKey="Kozarewa I">I Kozarewa</name>
</author>
<author>
<name sortKey="Smith, F" uniqKey="Smith F">F Smith</name>
</author>
<author>
<name sortKey="Scally, A" uniqKey="Scally A">A Scally</name>
</author>
<author>
<name sortKey="Stephens, Pj" uniqKey="Stephens P">PJ Stephens</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R Durbin</name>
</author>
<author>
<name sortKey="Swerdlow, H" uniqKey="Swerdlow H">H Swerdlow</name>
</author>
<author>
<name sortKey="Turner, Dj" uniqKey="Turner D">DJ Turner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Deangelis, Mm" uniqKey="Deangelis M">MM DeAngelis</name>
</author>
<author>
<name sortKey="Wang, Dg" uniqKey="Wang D">DG Wang</name>
</author>
<author>
<name sortKey="Hawkins, Tl" uniqKey="Hawkins T">TL Hawkins</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Till, Bj" uniqKey="Till B">BJ Till</name>
</author>
<author>
<name sortKey="Cooper, J" uniqKey="Cooper J">J Cooper</name>
</author>
<author>
<name sortKey="Tai, Th" uniqKey="Tai T">TH Tai</name>
</author>
<author>
<name sortKey="Colowit, P" uniqKey="Colowit P">P Colowit</name>
</author>
<author>
<name sortKey="Greene, Ea" uniqKey="Greene E">EA Greene</name>
</author>
<author>
<name sortKey="Henikoff, S" uniqKey="Henikoff S">S Henikoff</name>
</author>
<author>
<name sortKey="Comai, L" uniqKey="Comai L">L Comai</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barker, Gl" uniqKey="Barker G">GL Barker</name>
</author>
<author>
<name sortKey="Edwards, Kj" uniqKey="Edwards K">KJ Edwards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ouyang, S" uniqKey="Ouyang S">S Ouyang</name>
</author>
<author>
<name sortKey="Zhu, W" uniqKey="Zhu W">W Zhu</name>
</author>
<author>
<name sortKey="Hamilton, J" uniqKey="Hamilton J">J Hamilton</name>
</author>
<author>
<name sortKey="Lin, H" uniqKey="Lin H">H Lin</name>
</author>
<author>
<name sortKey="Campbell, M" uniqKey="Campbell M">M Campbell</name>
</author>
<author>
<name sortKey="Childs, K" uniqKey="Childs K">K Childs</name>
</author>
<author>
<name sortKey="Thibaud Nissen, F" uniqKey="Thibaud Nissen F">F Thibaud-Nissen</name>
</author>
<author>
<name sortKey="Malek, Rl" uniqKey="Malek R">RL Malek</name>
</author>
<author>
<name sortKey="Lee, Y" uniqKey="Lee Y">Y Lee</name>
</author>
<author>
<name sortKey="Zheng, L" uniqKey="Zheng L">L Zheng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, H" uniqKey="Li H">H Li</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R Durbin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hamming, Rw" uniqKey="Hamming R">RW Hamming</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Talame, V" uniqKey="Talame V">V Talame</name>
</author>
<author>
<name sortKey="Bovina, R" uniqKey="Bovina R">R Bovina</name>
</author>
<author>
<name sortKey="Sanguineti, Mc" uniqKey="Sanguineti M">MC Sanguineti</name>
</author>
<author>
<name sortKey="Tuberosa, R" uniqKey="Tuberosa R">R Tuberosa</name>
</author>
<author>
<name sortKey="Lundqvist, U" uniqKey="Lundqvist U">U Lundqvist</name>
</author>
<author>
<name sortKey="Salvi, S" uniqKey="Salvi S">S Salvi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Orsouw, Nj" uniqKey="Van Orsouw N">NJ van Orsouw</name>
</author>
<author>
<name sortKey="Hogers, Rc" uniqKey="Hogers R">RC Hogers</name>
</author>
<author>
<name sortKey="Janssen, A" uniqKey="Janssen A">A Janssen</name>
</author>
<author>
<name sortKey="Yalcin, F" uniqKey="Yalcin F">F Yalcin</name>
</author>
<author>
<name sortKey="Snoeijers, S" uniqKey="Snoeijers S">S Snoeijers</name>
</author>
<author>
<name sortKey="Verstege, E" uniqKey="Verstege E">E Verstege</name>
</author>
<author>
<name sortKey="Schneiders, H" uniqKey="Schneiders H">H Schneiders</name>
</author>
<author>
<name sortKey="Van Der Poel, H" uniqKey="Van Der Poel H">H van der Poel</name>
</author>
<author>
<name sortKey="Van Oeveren, J" uniqKey="Van Oeveren J">J van Oeveren</name>
</author>
<author>
<name sortKey="Verstegen, H" uniqKey="Verstegen H">H Verstegen</name>
</author>
<author>
<name sortKey="Van Eijk, Mj" uniqKey="Van Eijk M">MJ van Eijk</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Tassell, Cp" uniqKey="Van Tassell C">CP Van Tassell</name>
</author>
<author>
<name sortKey="Smith, Tp" uniqKey="Smith T">TP Smith</name>
</author>
<author>
<name sortKey="Matukumalli, Lk" uniqKey="Matukumalli L">LK Matukumalli</name>
</author>
<author>
<name sortKey="Taylor, Jf" uniqKey="Taylor J">JF Taylor</name>
</author>
<author>
<name sortKey="Schnabel, Rd" uniqKey="Schnabel R">RD Schnabel</name>
</author>
<author>
<name sortKey="Lawley, Ct" uniqKey="Lawley C">CT Lawley</name>
</author>
<author>
<name sortKey="Haudenschild, Cd" uniqKey="Haudenschild C">CD Haudenschild</name>
</author>
<author>
<name sortKey="Moore, Ss" uniqKey="Moore S">SS Moore</name>
</author>
<author>
<name sortKey="Warren, Wc" uniqKey="Warren W">WC Warren</name>
</author>
<author>
<name sortKey="Sonstegard, Ts" uniqKey="Sonstegard T">TS Sonstegard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Davey, Jw" uniqKey="Davey J">JW Davey</name>
</author>
<author>
<name sortKey="Hohenlohe, Pa" uniqKey="Hohenlohe P">PA Hohenlohe</name>
</author>
<author>
<name sortKey="Etter, Pd" uniqKey="Etter P">PD Etter</name>
</author>
<author>
<name sortKey="Boone, Jq" uniqKey="Boone J">JQ Boone</name>
</author>
<author>
<name sortKey="Catchen, Jm" uniqKey="Catchen J">JM Catchen</name>
</author>
<author>
<name sortKey="Blaxter, Ml" uniqKey="Blaxter M">ML Blaxter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Andolfatto, P" uniqKey="Andolfatto P">P Andolfatto</name>
</author>
<author>
<name sortKey="Davison, D" uniqKey="Davison D">D Davison</name>
</author>
<author>
<name sortKey="Erezyilmaz, D" uniqKey="Erezyilmaz D">D Erezyilmaz</name>
</author>
<author>
<name sortKey="Hu, Tt" uniqKey="Hu T">TT Hu</name>
</author>
<author>
<name sortKey="Mast, J" uniqKey="Mast J">J Mast</name>
</author>
<author>
<name sortKey="Sunayama Morita, T" uniqKey="Sunayama Morita T">T Sunayama-Morita</name>
</author>
<author>
<name sortKey="Stern, Dl" uniqKey="Stern D">DL Stern</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scaglione, D" uniqKey="Scaglione D">D Scaglione</name>
</author>
<author>
<name sortKey="Acquadro, A" uniqKey="Acquadro A">A Acquadro</name>
</author>
<author>
<name sortKey="Portis, E" uniqKey="Portis E">E Portis</name>
</author>
<author>
<name sortKey="Tirone, M" uniqKey="Tirone M">M Tirone</name>
</author>
<author>
<name sortKey="Knapp, Sj" uniqKey="Knapp S">SJ Knapp</name>
</author>
<author>
<name sortKey="Lanteri, S" uniqKey="Lanteri S">S Lanteri</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barchi, L" uniqKey="Barchi L">L Barchi</name>
</author>
<author>
<name sortKey="Lanteri, S" uniqKey="Lanteri S">S Lanteri</name>
</author>
<author>
<name sortKey="Portis, E" uniqKey="Portis E">E Portis</name>
</author>
<author>
<name sortKey="Acquadro, A" uniqKey="Acquadro A">A Acquadro</name>
</author>
<author>
<name sortKey="Vale, G" uniqKey="Vale G">G Vale</name>
</author>
<author>
<name sortKey="Toppino, L" uniqKey="Toppino L">L Toppino</name>
</author>
<author>
<name sortKey="Rotino, Gl" uniqKey="Rotino G">GL Rotino</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Etter, Pd" uniqKey="Etter P">PD Etter</name>
</author>
<author>
<name sortKey="Preston, Jl" uniqKey="Preston J">JL Preston</name>
</author>
<author>
<name sortKey="Bassham, S" uniqKey="Bassham S">S Bassham</name>
</author>
<author>
<name sortKey="Cresko, Wa" uniqKey="Cresko W">WA Cresko</name>
</author>
<author>
<name sortKey="Johnson, Ea" uniqKey="Johnson E">EA Johnson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Etter, Pd" uniqKey="Etter P">PD Etter</name>
</author>
<author>
<name sortKey="Bassham, S" uniqKey="Bassham S">S Bassham</name>
</author>
<author>
<name sortKey="Hohenlohe, Pa" uniqKey="Hohenlohe P">PA Hohenlohe</name>
</author>
<author>
<name sortKey="Johnson, Ea" uniqKey="Johnson E">EA Johnson</name>
</author>
<author>
<name sortKey="Cresko, Wa" uniqKey="Cresko W">WA Cresko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chutimanitsakun, Y" uniqKey="Chutimanitsakun Y">Y Chutimanitsakun</name>
</author>
<author>
<name sortKey="Nipper, Rw" uniqKey="Nipper R">RW Nipper</name>
</author>
<author>
<name sortKey="Cuesta Marcos, A" uniqKey="Cuesta Marcos A">A Cuesta-Marcos</name>
</author>
<author>
<name sortKey="Cistue, L" uniqKey="Cistue L">L Cistue</name>
</author>
<author>
<name sortKey="Corey, A" uniqKey="Corey A">A Corey</name>
</author>
<author>
<name sortKey="Filichkina, T" uniqKey="Filichkina T">T Filichkina</name>
</author>
<author>
<name sortKey="Johnson, Ea" uniqKey="Johnson E">EA Johnson</name>
</author>
<author>
<name sortKey="Hayes, Pm" uniqKey="Hayes P">PM Hayes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Willing, Em" uniqKey="Willing E">EM Willing</name>
</author>
<author>
<name sortKey="Hoffmann, M" uniqKey="Hoffmann M">M Hoffmann</name>
</author>
<author>
<name sortKey="Klein, Jd" uniqKey="Klein J">JD Klein</name>
</author>
<author>
<name sortKey="Weigel, D" uniqKey="Weigel D">D Weigel</name>
</author>
<author>
<name sortKey="Dreyer, C" uniqKey="Dreyer C">C Dreyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hohenlohe, Pa" uniqKey="Hohenlohe P">PA Hohenlohe</name>
</author>
<author>
<name sortKey="Amish, Sj" uniqKey="Amish S">SJ Amish</name>
</author>
<author>
<name sortKey="Catchen, Jm" uniqKey="Catchen J">JM Catchen</name>
</author>
<author>
<name sortKey="Allendorf, Fw" uniqKey="Allendorf F">FW Allendorf</name>
</author>
<author>
<name sortKey="Luikart, G" uniqKey="Luikart G">G Luikart</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pfender, Wf" uniqKey="Pfender W">WF Pfender</name>
</author>
<author>
<name sortKey="Saha, Mc" uniqKey="Saha M">MC Saha</name>
</author>
<author>
<name sortKey="Johnson, Ea" uniqKey="Johnson E">EA Johnson</name>
</author>
<author>
<name sortKey="Slabaugh, Mb" uniqKey="Slabaugh M">MB Slabaugh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhao, H" uniqKey="Zhao H">H Zhao</name>
</author>
<author>
<name sortKey="Li, Qz" uniqKey="Li Q">QZ Li</name>
</author>
<author>
<name sortKey="Zeng, Cq" uniqKey="Zeng C">CQ Zeng</name>
</author>
<author>
<name sortKey="Yang, Hm" uniqKey="Yang H">HM Yang</name>
</author>
<author>
<name sortKey="Yu, J" uniqKey="Yu J">J Yu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Prina, Ar" uniqKey="Prina A">AR Prina</name>
</author>
<author>
<name sortKey="Favret, Ea" uniqKey="Favret E">EA Favret</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Seymour, Dk" uniqKey="Seymour D">DK Seymour</name>
</author>
<author>
<name sortKey="Filiault, Dl" uniqKey="Filiault D">DL Filiault</name>
</author>
<author>
<name sortKey="Henry, Ih" uniqKey="Henry I">IH Henry</name>
</author>
<author>
<name sortKey="Monson Miller, J" uniqKey="Monson Miller J">J Monson-Miller</name>
</author>
<author>
<name sortKey="Ravi, M" uniqKey="Ravi M">M Ravi</name>
</author>
<author>
<name sortKey="Pang, A" uniqKey="Pang A">A Pang</name>
</author>
<author>
<name sortKey="Comai, L" uniqKey="Comai L">L Comai</name>
</author>
<author>
<name sortKey="Chan, Swl" uniqKey="Chan S">SWL Chan</name>
</author>
<author>
<name sortKey="Maloof, Jn" uniqKey="Maloof J">JN Maloof</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tai, Th" uniqKey="Tai T">TH Tai</name>
</author>
<author>
<name sortKey="Tanksley, Sd" uniqKey="Tanksley S">SD Tanksley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, H" uniqKey="Li H">H Li</name>
</author>
<author>
<name sortKey="Handsaker, B" uniqKey="Handsaker B">B Handsaker</name>
</author>
<author>
<name sortKey="Wysoker, A" uniqKey="Wysoker A">A Wysoker</name>
</author>
<author>
<name sortKey="Fennell, T" uniqKey="Fennell T">T Fennell</name>
</author>
<author>
<name sortKey="Ruan, J" uniqKey="Ruan J">J Ruan</name>
</author>
<author>
<name sortKey="Homer, N" uniqKey="Homer N">N Homer</name>
</author>
<author>
<name sortKey="Marth, G" uniqKey="Marth G">G Marth</name>
</author>
<author>
<name sortKey="Abecasis, G" uniqKey="Abecasis G">G Abecasis</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R Durbin</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Genomics</journal-id>
<journal-title-group>
<journal-title>BMC Genomics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2164</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">22333298</article-id>
<article-id pub-id-type="pmc">3305632</article-id>
<article-id pub-id-type="publisher-id">1471-2164-13-72</article-id>
<article-id pub-id-type="doi">10.1186/1471-2164-13-72</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Methodology Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Reference genome-independent assessment of mutation density using restriction enzyme-phased sequencing</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" id="A1">
<name>
<surname>Monson-Miller</surname>
<given-names>Jennifer</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>jmonsonmiller@ucdavis.edu</email>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Sanchez-Mendez</surname>
<given-names>Diana C</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>dianasanchezmendez@gmail.com</email>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Fass</surname>
<given-names>Joseph</given-names>
</name>
<xref ref-type="aff" rid="I3">3</xref>
<email>joseph.fass@gmail.com</email>
</contrib>
<contrib contrib-type="author" id="A4">
<name>
<surname>Henry</surname>
<given-names>Isabelle M</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>imhenry@ucdavis.edu</email>
</contrib>
<contrib contrib-type="author" id="A5">
<name>
<surname>Tai</surname>
<given-names>Thomas H</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>thomas.tai@ars.usda.gov</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A6">
<name>
<surname>Comai</surname>
<given-names>Luca</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>lcomai@ucdavis.edu</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Department of Plant Biology and Genome Center, UC Davis, Davis, California 95616, USA</aff>
<aff id="I2">
<label>2</label>
Crops Pathology and Genetics Research Unit, U.S. Department of Agriculture, Agricultural Research Service, Davis, California 95616, USA</aff>
<aff id="I3">
<label>3</label>
Bioinformatics Core, Genome Center, UC Davis, Davis, California 95616, USA</aff>
<pub-date pub-type="collection">
<year>2012</year>
</pub-date>
<pub-date pub-type="epub">
<day>14</day>
<month>2</month>
<year>2012</year>
</pub-date>
<volume>13</volume>
<fpage>72</fpage>
<lpage>72</lpage>
<history>
<date date-type="received">
<day>9</day>
<month>9</month>
<year>2011</year>
</date>
<date date-type="accepted">
<day>14</day>
<month>2</month>
<year>2012</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright ©2012 Monson-Miller et al; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2012</copyright-year>
<copyright-holder>Monson-Miller et al; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2164/13/72"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>The availability of low cost sequencing has spurred its application to discovery and typing of variation, including variation induced by mutagenesis. Mutation discovery is challenging as it requires a substantial amount of sequencing and analysis to detect very rare changes and distinguish them from noise. Also challenging are the cases when the organism of interest has not been sequenced or is highly divergent from the reference.</p>
</sec>
<sec>
<title>Results</title>
<p>We describe the development of a simple method for reduced representation sequencing. Input DNA was digested with a single restriction enzyme and ligated to Y adapters modified to contain a sequence barcode and to provide a compatible overhang for ligation. We demonstrated the efficiency of this method at SNP discovery using rice and arabidopsis. To test its suitability for the discovery of very rare SNP, one control and three mutagenized rice individuals (1, 5 and 10 mM sodium azide) were used to prepare genomic libraries for Illumina sequencers by ligating barcoded adapters to
<italic>NlaIII </italic>
restriction sites. For genome-dependent discovery, 15-30 million 80-base reads per individual were aligned to the reference sequence, achieving per-individual sequencing coverage of 7 to 15×. We identified high-confidence base changes by comparing sequences across individuals and identified instances consistent with mutations, i.e. changes that were found in a single treated individual and were solely GC to AT transitions. For genome-independent discovery, 70-mers were extracted from the sequence of the control individual and single-copy sequence was identified by comparing the 70-mers across samples to evaluate copy number and variation. This
<italic>de novo </italic>
"genome" was used to align the reads and identify mutations as above. Covering approximately 1/5 of the 380 Mb genome of rice we detected mutation densities ranging from 0.6 to 4 per Mb of diploid DNA depending on the mutagenic treatment.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>The combination of a simple and cost-effective library construction method, Illumina sequencing, and a bioinformatic pipeline allows practical SNP discovery regardless of whether a genomic reference is available.</p>
</sec>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>Mutations caused by base changes can occur spontaneously during mitosis or meiosis, or through alterations of mechanisms required for fidelity of replication and repair, or through exposure to mutagenic environments. Measuring the mutation rate is important for evolution, biochemistry, medicine and functional genomics. We are specifically interested in the functional genomic tool called TILLING (Targeting of Induced Local Lesions IN Genomes) [
<xref ref-type="bibr" rid="B1">1</xref>
]. The combination of efficient mutation discovery via high-throughput sequencing and the ability to generate allelic series (missense, nonsense mutations) enables reverse genetics in many species with limited genomics resources. However, populations with optimal mutation densities are necessary for screening efficiency, and optimizing mutagenic treatments requires measuring mutation densities. For this purpose, PCR amplicons representing selected loci can be screened for mutations by mismatch-detecting assays [
<xref ref-type="bibr" rid="B1">1</xref>
] or by high throughput sequencing [
<xref ref-type="bibr" rid="B2">2</xref>
,
<xref ref-type="bibr" rid="B3">3</xref>
]. Both approaches, however, require testing of several hundred individuals [
<xref ref-type="bibr" rid="B1">1</xref>
]. With the advances in sequencing throughput, the entire genome of an individual can be resequenced with sufficient coverage to call changes with high reliability [
<xref ref-type="bibr" rid="B4">4</xref>
], or sequencing can be targeted to the exome by capture with complementary oligonucleotides [
<xref ref-type="bibr" rid="B5">5</xref>
]. Both methods, however, are still relatively expensive or laborious and require prior knowledge of the genome of interest or the development of an oligonucleotide set suitable for exome capture.</p>
<p>A convenient approach to reduce genomic complexity for shotgun sequencing involves phasing the sequencing entry points at restriction enzyme sites to provide increased coverage of a subset of DNA regions [
<xref ref-type="bibr" rid="B6">6</xref>
]. Judicious selection of restriction enzyme and fragment size range can allow a coverage range that maximizes both discovery and economy [
<xref ref-type="bibr" rid="B7">7</xref>
,
<xref ref-type="bibr" rid="B8">8</xref>
], even for large genomes. Our variation of this method, RESCAN (Restriction Enzyme Sequence Comparative ANalysis), involves simple Illumina library construction using as little as 100 ng of input DNA and can be multiplexed (≤ 96 individuals) by employing custom barcoded adapters. The method allows genotyping using both the entry point restriction enzyme site and the adjacent sequenced DNA. If a reference sequence is not available, RESCAN read populations are intrinsically simpler than those derived from random fragmentation sequencing libraries and should be amenable to the construction of a reduced reference genome. Here, we describe development of this method and its application for discovery and detection of Single Nucleotide Polymorphisms (SNP) induced by mutagenesis, a type of variation much rarer and thus more difficult to detect than natural SNP. We demonstrate its capabilities in the characterization of mutation density in single individuals with or without the use of a reference genome. The method greatly facilitates the development of optimally mutagenized populations.</p>
</sec>
<sec>
<title>Results</title>
<sec>
<title>Method development</title>
<p>We devised a method (RESCAN) for the simple production of restriction enzyme-phased libraries for Illumina sequencers. The method entails digestion of the input DNA with a restriction enzyme, optional selection of a size range, ligation to modified Illumina Y adapters that feature sticky ends complementary to those produced by the enzyme (Figure
<xref ref-type="fig" rid="F1">1</xref>
), clean up of the ligation product, enrichment by PCR, and finally sequencing. To optimize this method, we used input genomic DNA from rice and arabidopsis, two model systems with well-characterized genomes [
<xref ref-type="bibr" rid="B9">9</xref>
,
<xref ref-type="bibr" rid="B10">10</xref>
]. Figure
<xref ref-type="fig" rid="F2">2</xref>
compares the effect of the order of size selection versus ligation to the adapters by comparing the yield and size of sequenced fragments to the total number predicted from the genome sequence of the target. The importance of choosing the right molecular weight fraction is further demonstrated in Figure
<xref ref-type="fig" rid="F3">3</xref>
. The libraries were processed as described in the Methods and sequenced on an Illumina GA.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption>
<p>
<bold>Structure of barcoded adapters used for RESCAN</bold>
. The RESCAN adapters leverage the Y-adapter system used for standard Illumina sequencing libraries in which random-sheared, A-tailed insert DNA (grey boxed regions or NNN) is ligated to the T-overhang formed by the paired adapters (top). The Y-adapter is formed by two oligonucleotides. A sequence barcode (lower case) is included adjacent to the end. For ligation to restriction enzyme-formed overhangs, the required extension is incorporated in the appropriate oligonucleotide of the adapter. Below each paired adapter sequence the beginning of the resulting sequence read is shown in blue, with the nucleotides that are not fixed, i.e. not part of the adapter, barcode and overhang, underlined in blue. The barcode length used in the early method-refining part of this work was four bases. Five bases is the preferred length at the time of writing this paper because the first five cycles of the Illumina HiSeq platform require random and similarly weighted base composition.</p>
</caption>
<graphic xlink:href="1471-2164-13-72-1"></graphic>
</fig>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption>
<p>
<bold>Size distribution of RESCAN is affected by library construction and genotype</bold>
. The size of the restriction fragment sequenced in the RESCAN was calculated from the aligned reference genome. A, B, C. Effect of the method used for the construction of the library on the sampling of fragments. In the left graphs, the blue and red datapoints report respectively the number of total restriction fragment ends available in the genome for the indicated size (before fractionation) and the number sampled by one or more RESCAN reads. The blue points represent the same distribution in A, B and C, but zoomed on different Y-axis values. The right graphs report the distribution of number of RESCAN reads by size. All size fractionation in these preliminary experiments was done by gel electrophoresis and extraction of DNA from a selected section of the gel. D. Effect of a divergent genotype on the range of fragment sizes. The sequencing libraries for
<italic>A. thaliana </italic>
Col-0, the accession from which the reference genome is derived, and Ler, a divergent accession, were prepared according to protocol in C. The count of each RESCAN read is plotted versus the reference-deduced size of the restriction fragment to which it mapped. Many high coverage RESCAN reads from the Ler genome occur for fragments whose sizes (according to the Col-0 reference sequence) are not in the correct coverage size range. These cases are assumed to correspond to restriction size polymorphisms.</p>
</caption>
<graphic xlink:href="1471-2164-13-72-2"></graphic>
</fig>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption>
<p>
<bold>Size fractionation of digested DNA by affinity beads</bold>
. A. Counts of restriction fragments by size after in silico digestion of the
<italic>Oryza sativa </italic>
Os6.1 genome with
<italic>NlaIII</italic>
. The Y-axis of the graph displays the count per 25 bp bins. The graph top axis displays the total count for in silico slices of 100 bp. The graph demonstrates how a size fraction from 100 to 200 bp would contain more than ten times the number of fragments found in the 600 to 700 bp fraction. B. Fractionation strategies with SPRI magnetic beads. On the left, a bottom-delimited size fraction of the digested input DNA can be taken in a single step (thicker arrows path), or a sliced size fraction in two steps (thinner arrows path). Slicing is demonstrated in a digital electrophoretogram on the right. In practice, bottom delimiting in a single step is the most practical solution since the larger size fragments contribute relatively less to the final library.</p>
</caption>
<graphic xlink:href="1471-2164-13-72-3"></graphic>
</fig>
<p>The reads were aligned to the reference genome with Eland (Illumina, Inc). The position at which each read initiated was then extracted and the size of each corresponding restriction fragment was tabulated. The distribution of observed hits is displayed in Figure
<xref ref-type="fig" rid="F2">2</xref>
for each library construction strategy and compared to the corresponding genomic total. Each library generated a nearly-normal distribution centered within the targeted range. Ligation followed by fractionation, however (Figure
<xref ref-type="fig" rid="F2">2-A, B</xref>
), produced a bimodal curve with a maximum corresponding to low molecular weight fragments and another maximum corresponding to the selected range. Contamination by the small fragments could be minimized by pre-selection of the target size range followed by ligation (Figure
<xref ref-type="fig" rid="F2">2-C</xref>
).</p>
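<p>As an illustration of the fragment-size tabulation described above, the following minimal Python sketch digests a sequence in silico and counts the resulting fragments by size. It is not the pipeline used in this study: the function names, the bin width and the half-open size ranges are assumptions made for the example. The cut offsets in the comments reflect the known cleavage positions of MseI (T^TAA) and NlaIII (CATG^).</p>
<preformat><![CDATA[
```python
import re
from collections import Counter


def digest(sequence, site, cut_offset):
    """Return fragment lengths from a complete digest of a linear sequence.

    cut_offset is the cut position within the recognition site, counted
    from the first base of the site (1 for MseI T^TAA, 4 for NlaIII CATG^).
    Both sites are palindromic, so searching one strand is sufficient.
    """
    seq = sequence.upper()
    # A lookahead pattern reports overlapping occurrences of the site too.
    pattern = "(?=" + re.escape(site) + ")"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b - a > 0]


def bin_counts(lengths, bin_width=25):
    """Count fragments per size bin, keyed by the lower bound of each bin."""
    return Counter((n // bin_width) * bin_width for n in lengths)


def in_range(lengths, low, high):
    """Count fragments whose length falls in the half-open range [low, high)."""
    return sum(1 for n in lengths if low <= n < high)
```
]]></preformat>
<p>Applied to a full genome sequence, comparing <monospace>in_range(lengths, 100, 200)</monospace> with <monospace>in_range(lengths, 600, 700)</monospace> would give the kind of contrast between size fractions that Figure 3 illustrates.</p>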
<p>Restriction fragment length polymorphisms are expected in different accessions and may shift a fragment into or outside of the optimally covered size range (Figure
<xref ref-type="fig" rid="F2">2-A, B, C</xref>
). The former instances should be manifest as outliers in a graph of coverage by size. This expectation was verified in the accession Ler (Figure
<xref ref-type="fig" rid="F2">2-D</xref>
) where frequent fragments with good coverage are observed outside the optimal size range.</p>
<p>Some RESCAN reads mapped on the reference genome appeared not to start at an
<italic>MseI </italic>
site, but at sites differing by one base that we called proto sites. These degenerate sites may be cut by restriction enzyme star activity or they can be diagnostic of a polymorphism between the reference genome and the sample [
<xref ref-type="bibr" rid="B11">11</xref>
]. In the case of MseI fragments, modifications of the fourth base (incidence of TTAB vs TTAA sites, where B is a base other than A), can be identified because the sequence read from these sites would start with, respectively, "...TTABNNN..." vs "...TTAANNN..." (Figure
<xref ref-type="fig" rid="F1">1</xref>
). We measured the frequency of star cutting by counting the two read types. The incidence of TTAB sites in 2.6 M reads was 0.22% indicating that star activity was very low.</p>
<p>Reads initiating at reference sequence proto sites should increase with phylogenetic distance. RESCAN reads from the reference variety Nipponbare, two California varieties of japonica rice, and two indica varieties displayed progressively higher proto read counts (Table
<xref ref-type="table" rid="T1">1</xref>
). These reads were highly predictive of polymorphism even at very low coverage: 17/19 proto sites tested were digested by
<italic>MseI </italic>
in IR64 and not in Nipponbare, confirming an
<italic>MseI</italic>
-associated polymorphism (Figure
<xref ref-type="fig" rid="F4">4</xref>
). Depending on the proto site context, the polymorphism could be called unequivocally (see Methods). The inferred SNP corresponding to the conversion of a proto site to a full site were called Type I, in contrast to Type II SNP, which are detected within the reads. For example, with TTAB (where B = T, G or C) proto sites, 1200 out of 3000 predicted B > A SNP matched SNP already present in the NCBI rice SNP database.</p>
<table-wrap id="T1" position="float">
<label>Table 1</label>
<caption>
<p>SNP discovery in rice from type I RESCAN</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Accession</th>
<th align="left">Type</th>
<th align="left">Total</th>
<th align="left">Off site</th>
<th align="left">Proto</th>
<th align="left">%</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Nipponbare</td>
<td align="left">Japonica</td>
<td align="left">273,959</td>
<td align="left">566</td>
<td align="left">400</td>
<td align="left">0.15</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">M-206</td>
<td align="left">Japonica</td>
<td align="left">156,710</td>
<td align="left">1543</td>
<td align="left">1100</td>
<td align="left">0.71</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">M-202</td>
<td align="left">Japonica</td>
<td align="left">375,081</td>
<td align="left">4483</td>
<td align="left">3194</td>
<td align="left">0.86</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">IR64</td>
<td align="left">Indica</td>
<td align="left">2,598,754</td>
<td align="left">55024</td>
<td align="left">37905</td>
<td align="left">3.31</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">IR50</td>
<td align="left">Indica</td>
<td align="left">264,618</td>
<td align="left">11450</td>
<td align="left">8490</td>
<td align="left">3.35</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption>
<p>
<bold>Confirmation of SNP detected by the RESCAN type I approach</bold>
. A. RESCAN type I SNP can be identified in sites that are found for the target restriction site in the query (in this case IR64) but are absent in the reference. In most cases, examination of the reference sequence reveals the presence of a proto sequence, i.e. a sequence that diverges by one base from the expected sequence TTAA: VTAA, TVAA, TTBA, TTAB, where V and B are, respectively, not T and not A. For a proto such as GTAA, a T > G SNP is inferred. A SNP cannot be inferred for a proto site such as TTTAG since either T3 > A or G5 > A could have produced the
<italic>MseI </italic>
site. B. We chose 20 type I sites that allowed inference and were detected through 1 or 2 RESCAN reads. The products amplified using flanking PCR primers from Nipponbare and IR64 are shown. C. The amplified products were subjected to digestion with
<italic>MseI </italic>
and analyzed by agarose gel electrophoresis. The presence of an extra restriction site in the amplified IR64 DNA and not in the control Nipponbare is evident in 17 of the 19 amplified products, confirming the presence of a SNP producing a restriction site in IR64.</p>
</caption>
<graphic xlink:href="1471-2164-13-72-4"></graphic>
</fig>
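<p>The inference rule described in Figure 4A can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' script: the function names and the idea of scanning a short reference window around the read start are hypothetical. The sketch lists every single-base substitution that would turn the reference window into one containing a full MseI site (TTAA) and accepts the call only when exactly one such substitution exists.</p>
<preformat><![CDATA[
```python
SITE = "TTAA"  # MseI recognition sequence


def site_creating_changes(ref_window, site=SITE):
    """List single-base substitutions that would create `site` in ref_window.

    Returns sorted tuples (index, reference_base, query_base), 0-based index.
    An empty list means the window already holds a full site or that no
    single substitution can create one.
    """
    ref = ref_window.upper()
    if site in ref:
        return []
    changes = set()
    for start in range(len(ref) - len(site) + 1):
        mismatches = [i for i in range(len(site)) if ref[start + i] != site[i]]
        if len(mismatches) == 1:
            i = mismatches[0]
            changes.add((start + i, ref[start + i], site[i]))
    return sorted(changes)


def infer_type1_snp(ref_window):
    """Return the unique site-creating substitution, or None if ambiguous."""
    changes = site_creating_changes(ref_window)
    return changes[0] if len(changes) == 1 else None
```
]]></preformat>
<p>With the two examples from the caption, the window GTAA yields a single candidate at its first base, whereas TTTAG yields two candidates (third and fifth base) and is therefore left uncalled.</p>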
<p>Gel-based electrophoretic fractionation (size selection) is cumbersome and not easy to scale up. We substituted it with Solid Phase Reversible Immobilization (SPRI) on magnetic beads [
<xref ref-type="bibr" rid="B12">12</xref>
,
<xref ref-type="bibr" rid="B13">13</xref>
]. Clean up of digested DNA with SPRI beads removed the bulk of the small restriction enzyme fragments. Different size cuts could also be carried out because the molecular weight of the bound DNA could be changed by the strength of the binding buffer (see Figure
<xref ref-type="fig" rid="F3">3</xref>
and Methods). The method was tested and worked well with another restriction enzyme,
<italic>NlaIII</italic>
. The ratio of adapters to input DNA required careful adjustment to avoid adapter dimers. The optimal ratio was considerably lower than that used for regular Illumina libraries [
<xref ref-type="bibr" rid="B12">12</xref>
]. The resulting protocol proved robust and scalable as we increased the number of barcoded adapters from the few used above to as many as 96. The amount of input DNA digested with either 4 bp cutter enzyme could be as little as 100 ng without loss in efficiency.</p>
</sec>
<sec>
<title>Discovery of rare polymorphisms</title>
<p>The method described above proved efficient at genotyping individuals in populations (Monson-Miller et al., unpublished results). A more challenging type of variation is the one resulting from mutagenesis because induced mutations occur at a lower density than natural polymorphisms. For example, a well mutagenized population of rice has one base change every 250 kb of diploid DNA [
<xref ref-type="bibr" rid="B14">14</xref>
], which is about a thousand times lower than the natural SNP density between japonica and indica rice [
<xref ref-type="bibr" rid="B15">15</xref>
].</p>
<p>To test the capabilities of the RESCAN system for discovery of induced mutation rates in plants we developed the experimental protocol described in Figure
<xref ref-type="fig" rid="F5">5</xref>
. We tested the effect of varying sodium azide (NaAzide) concentrations on mutation rate in rice (
<italic>Oryza sativa</italic>
) cv. Kitaake. Table
<xref ref-type="table" rid="T2">2</xref>
illustrates the progressively more deleterious effect of increasing NaAzide concentration. The highest treatment resulted in less than 2% survival to the M2 generation vs. ~20% for the next lower treatment. The genomic DNA of three M2 individuals derived from 1 mM (T1), 5 mM (T2) and 10 mM (T3) NaAzide treatments and of a single control individual (called "C") was used for RESCAN library preparation with the restriction enzyme
<italic>Nla</italic>
III. The four indexed libraries were each sequenced using 2 × 85-base paired-end reads on one and a half lanes of the Illumina GAII sequencer.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption>
<p>
<bold>Overview of experimental material and mutation discovery strategy</bold>
. The figure summarizes the steps undertaken in the mutation analysis. A. Plant mutagenesis, growth of M2 plants and production of RESCAN libraries. B. Informatic strategy for identification of mutations. The panel compares the bioinformatic process used with the genomic reference (left) and without (right). The table in the center bottom illustrates the strategy to identify mutations, which are expected to occur both as heterozygous and homozygous changes. T1, T2, T3 are mutagenized individuals. C is a control. For each position, calls concordant with the reference are dots, those discordant are base symbols. In the case of the second base A > G changes are found in multiple individuals and therefore cannot represent mutations (cross-out symbol is used). The fifth base G, however, displays changes unique to a single mutagenized individual. The G > A change is accepted. BWA and BLAST refer to the alignment programs used.</p>
</caption>
<graphic xlink:href="1471-2164-13-72-5"></graphic>
</fig>
<table-wrap id="T2" position="float">
<label>Table 2</label>
<caption>
<p>Mutagenesis of rice by NaAzide</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">NaAzide</th>
<th align="left">Survival to maturity</th>
<th></th>
<th align="left">M1 Fertile (at least 1 seed)</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">mM</td>
<td align="left">count/total</td>
<td align="left">%</td>
<td align="left">count/total</td>
<td align="left">%</td>
</tr>
<tr>
<td colspan="5">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">0</td>
<td align="left">133/150</td>
<td align="left">88.7</td>
<td align="left">133/133</td>
<td align="left">100</td>
</tr>
<tr>
<td colspan="5">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">1</td>
<td align="left">301/398</td>
<td align="left">75.6</td>
<td align="left">299/301</td>
<td align="left">99.3</td>
</tr>
<tr>
<td colspan="5">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">5</td>
<td align="left">152/400</td>
<td align="left">38.0</td>
<td align="left">82/152</td>
<td align="left">53.9</td>
</tr>
<tr>
<td colspan="5">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">10</td>
<td align="left">16/400</td>
<td align="left">4.0</td>
<td align="left">7/16</td>
<td align="left">43.8</td>
</tr>
</tbody>
</table>
</table-wrap>
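<p>The survival figures quoted in the text can be recovered from Table 2 with a short calculation. The sketch below assumes that survival to the M2 generation corresponds to the number of fertile M1 plants divided by the number of treated individuals (the "total" of the survival column); this reading of the table, and the names used, are assumptions of the example.</p>
<preformat><![CDATA[
```python
# NaAzide concentration (mM) -> (treated individuals, fertile M1 plants),
# copied from Table 2.
TABLE2 = {0: (150, 133), 1: (398, 299), 5: (400, 82), 10: (400, 7)}


def percent_reaching_m2(treated, fertile):
    """Percentage of treated individuals that yielded at least one seed."""
    return 100.0 * fertile / treated


if __name__ == "__main__":
    for conc, (treated, fertile) in sorted(TABLE2.items()):
        print(f"{conc} mM: {percent_reaching_m2(treated, fertile):.2f}%")
```
]]></preformat>
<p>Under this assumption the 10 mM treatment falls below 2% and the 5 mM treatment is close to 20%, in line with the values given in the text.</p>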
<sec>
<title>Reference-dependent detection</title>
<p>We used the published rice cv. Nipponbare genome [
<xref ref-type="bibr" rid="B9">9</xref>
,
<xref ref-type="bibr" rid="B16">16</xref>
] to align the RESCAN reads using the program BWA [
<xref ref-type="bibr" rid="B17">17</xref>
] with default settings. The analysis of expected and predicted restriction fragment sizes (Figure
<xref ref-type="fig" rid="F3">3</xref>
and
<xref ref-type="fig" rid="F6">6</xref>
) demonstrates that the SPRI bead-based method is comparable to size selection by gel electrophoresis. By selecting fragments in the 100 to 250 bp size range we probed a relatively large component of the genome (about 1/5). Using a custom parsing script, we searched for candidate SNPs in the resulting alignment. We filtered out SNPs corresponding to poorly mapped reads and those that occurred in repeated regions by setting a maximum allowable cumulative coverage of 200. We further required that candidates for homozygous mutations be unique to one individual, that they occur at sites where coverage was at least 2 in that individual with all calls identical, and that the position be covered at least once in each of the other 3 individuals (see Methods for details). For heterozygous mutations the general criteria were similar but, as expected, we encountered higher noise. Noise was evidenced by the high number of potential mutant calls in the control and, in all genotypes, by a high frequency of base change types that were not expected from the mutagenic action of NaAzide (see below). We found that requiring a minimum of 5 variant calls largely eliminated the noise.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption>
<p>
<bold>Distribution of the RESCAN reads used for mutation discovery</bold>
. Different views of the distribution of the RESCAN reads derived from the control individual "C". A. The red and blue datapoints report respectively the number of total restriction fragment ends available in the genome for the indicated size range (before fractionation) and the number covered by one or more RESCAN reads. The bar joining the two points highlights the difference. B. Exemplary data for chromosome 5 of rice. The top histogram displays the density distribution of the forward RESCAN reads. The bottom graph plots the count for each RESCAN read vs. the position on chr. 5. The schematic drawing below the chart illustrates the position of the centromere on the chromosome. C. The graph plots read counts for each of the forward RESCAN positions vs the predicted size of the restriction fragment involved. The RESCAN library for individual T1 has similar properties. Those for individuals T2 and T3 have about double the total number of reads.</p>
</caption>
<graphic xlink:href="1471-2164-13-72-6"></graphic>
</fig>
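<p>The criteria for homozygous candidates listed above can be expressed as a small filter. The following Python sketch is a hedged illustration rather than the custom parsing script used in this study: the data layout (a string of base calls per individual at one position) and all names are assumptions, while the thresholds (cumulative coverage of at most 200, coverage of at least 2 in the mutant, at least 1 in every other individual) are taken from the text.</p>
<preformat><![CDATA[
```python
from collections import Counter

MAX_TOTAL_COVERAGE = 200   # cumulative coverage cap across individuals
MIN_MUTANT_COVERAGE = 2    # minimum coverage in the candidate individual


def homozygous_candidate(ref_base, calls):
    """Evaluate one position.

    calls maps an individual name (e.g. "C", "T1") to a string of the base
    calls observed at this position. Returns (individual, ref_base, alt_base)
    if the position passes the filter, otherwise None.
    """
    ref_base = ref_base.upper()
    counts = {ind: Counter(bases.upper()) for ind, bases in calls.items()}
    depth = {ind: sum(c.values()) for ind, c in counts.items()}
    if sum(depth.values()) > MAX_TOTAL_COVERAGE:
        return None  # likely repeated region
    if any(d == 0 for d in depth.values()):
        return None  # every individual must be covered at least once
    variant = [ind for ind, c in counts.items() if any(b != ref_base for b in c)]
    if len(variant) != 1:
        return None  # change must be unique to one individual
    ind = variant[0]
    if len(counts[ind]) != 1 or depth[ind] < MIN_MUTANT_COVERAGE:
        return None  # all calls identical, coverage of at least 2
    alt_base = next(iter(counts[ind]))
    return (ind, ref_base, alt_base)


def is_gc_to_at(ref_base, alt_base):
    """True for G > A and C > T, the changes expected from NaAzide."""
    return (ref_base, alt_base) in {("G", "A"), ("C", "T")}
```
]]></preformat>
<p>A heterozygous filter would follow the same outline but, per the text, would require at least 5 variant calls instead of identical calls.</p>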
</sec>
<sec>
<title>Reference-independent detection</title>
<p>Assessment of mutation density can be difficult if a reference genome is unavailable or diverges too much from the query sequences. In a parallel experiment, the same raw sequence data were used for reference-independent SNP discovery. We used the first 70 bases of each read to construct a list of 70-mer substrings (termed k-mers). We curated this set by eliminating the k-mers that had higher-than-expected coverage (potential repeated regions) and by discarding the minority member in pairs that had a Hamming distance of 1 [
<xref ref-type="bibr" rid="B18">18</xref>
]. We then used the resulting k-mer set as a
<italic>de novo </italic>
reference. The first 65 bases of each RESCAN read were aligned to this reference using BWA with a maximum mismatch allowance of 1. The resulting alignment was parsed as for the reference genome alignment.</p>
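<p>The k-mer curation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function name <italic>build_kmer_reference</italic> and the cutoff <italic>max_count</italic> are hypothetical (the text does not give the coverage threshold that was used), and ties between two k-mers at Hamming distance 1 are kept rather than resolved.</p>
<preformat>
```python
from collections import Counter


def build_kmer_reference(reads, k=70, max_count=50):
    """Return a curated, sorted list of k-mers taken from the start of each read.

    max_count is a hypothetical cutoff standing in for "higher-than-expected
    coverage"; the value actually used is not stated in the text.
    """
    counts = Counter(read[:k] for read in reads if len(read) >= k)

    # Step 1: drop k-mers seen too often (potential repeated regions).
    kept = {kmer: n for kmer, n in counts.items() if not n > max_count}

    # Step 2: within pairs at Hamming distance 1, discard the minority member.
    minority = set()
    for kmer, n in kept.items():
        for i, original in enumerate(kmer):
            for base in "ACGT":
                if base == original:
                    continue
                neighbour = kmer[:i] + base + kmer[i + 1:]
                if kept.get(neighbour, 0) > n:
                    minority.add(kmer)
    return sorted(kmer for kmer in kept if kmer not in minority)


# Toy data with k=5: AAAAT is the minority partner of AAAAA,
# and GGGGG exceeds the (hypothetical) coverage cutoff.
toy_reads = ["AAAAA"] * 3 + ["AAAAT"] + ["CCCCC"] * 2 + ["GGGGG"] * 10
print(build_kmer_reference(toy_reads, k=5, max_count=5))
```
</preformat>
<p>Enumerating the single-base neighbours of each k-mer avoids an all-against-all comparison, which matters when the k-mer set contains millions of entries.</p>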
</sec>
</sec>
<sec>
<title>SNP types are consistent with a chemical mutagenesis mechanism</title>
<p>For each treatment, the
<italic>de novo </italic>
reference (Table
<xref ref-type="table" rid="T3">3</xref>
) was about 5% larger than the genome-aligned space (about 82 to 87 vs 78 to 82 Mb, see Table
<xref ref-type="table" rid="T3">3</xref>
for details). In both cases a set of SNPs consistent with mutation sites was identified. Treatment with NaAzide is expected to yield only or predominantly GC > AT changes (i.e., G > A and C > T transitions) [
<xref ref-type="bibr" rid="B19">19</xref>
]. For the mutagenized individuals, GC > AT SNPs were more frequent than other SNP types in both types of analysis (Figure
<xref ref-type="fig" rid="F7">7</xref>
). Candidates appeared to be randomly distributed throughout the genome. The fraction of GC > AT changes observed in all T2 and T3 measurements (Figure
<xref ref-type="fig" rid="F7">7</xref>
) was significantly different from the expectation of random sequencing error [
<xref ref-type="bibr" rid="B2">2</xref>
]. For T1, only the heterozygous changes were significant. Fewer putative homozygous mutations were identified using the
<italic>O. sativa </italic>
genomic reference than using the
<italic>de novo </italic>
reference (347 vs 623), whereas the counts of putative heterozygous mutations were similar (863 vs 813). Of the 347 GC > AT putative homozygous changes found with the
<italic>O. sativa </italic>
reference, 247 (71%) were shared with the
<italic>de novo</italic>
-referenced analysis. Of the other changes, only 21% were shared. For the heterozygous mutations, overall 75% of the positions present in the referenced analysis were also present in the
<italic>de novo</italic>
-referenced analysis. On average, the GC > AT base changes were confirmed much more often (80%) than the other types of changes (25%).</p>
<table-wrap id="T3" position="float">
<label>Table 3</label>
<caption>
<p>Sequencing coverage</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Reference</th>
<th align="left">Read Alignment</th>
<th align="left">Control</th>
<th align="left">T1</th>
<th align="left">T2</th>
<th align="left">T3</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">None</td>
<td align="left">Total number of reads (million)</td>
<td align="left">15.5</td>
<td align="left">13.7</td>
<td align="left">28.5</td>
<td align="left">29.9</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">
<italic>O.s</italic>
. 6.1
<sup>1</sup>
</td>
<td align="left">Number of mapped reads (million)</td>
<td align="left">13.4</td>
<td align="left">11.8</td>
<td align="left">24.6</td>
<td align="left">26.4</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">
<italic>O.s</italic>
. 6.1</td>
<td align="left">Complexity (Mb)</td>
<td align="left">131.8</td>
<td align="left">131.5</td>
<td align="left">142.8</td>
<td align="left">142.4</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">
<italic>O.s</italic>
. 6.1</td>
<td align="left">Complexity, homozygous changes
<sup>2 </sup>
(Mb)</td>
<td align="left">78.2</td>
<td align="left">77.7</td>
<td align="left">82.7</td>
<td align="left">82.6</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">D
<italic>e novo</italic>
<sup>3</sup>
</td>
<td align="left">Number of mapped reads (million)</td>
<td align="left">13.3</td>
<td align="left">11.7</td>
<td align="left">24.5</td>
<td align="left">26.3</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">D
<italic>e novo </italic>
</td>
<td align="left">Complexity (Mb)</td>
<td align="left">102.4</td>
<td align="left">102.1</td>
<td align="left">106.9</td>
<td align="left">106.7</td>
</tr>
<tr>
<td colspan="6">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">D
<italic>e novo </italic>
</td>
<td align="left">Complexity, homozygous changes
<sup>2 </sup>
(Mb)</td>
<td align="left">82.6</td>
<td align="left">82.1</td>
<td align="left">87.4</td>
<td align="left">87.3</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>
<sup>1</sup>
Reference complexity: 380 Mb.
<sup>2</sup>
Complexity for homozygous scoring is calculated according to the following requirements: i) covered at least once in each of the four libraries, ii) covered at least twice in the putative mutant, iii) covered less than 200 times cumulatively in all libraries.
<sup>3</sup>
Reference complexity: 117 Mb.</p>
</table-wrap-foot>
</table-wrap>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption>
<p>
<bold>Pattern of SNP frequency</bold>
. A. The graphs illustrate the relationship between counts and coverage for the surveyed positions in the four tested libraries. Approximately half as many reads were obtained from the control (C) and T1 libraries as from the T2 and T3 libraries (Table 3). B. The absolute SNP count is shown for the tested individuals, using four bars listed in the order (C, T1, T2, T3) for each change type. Mutagenized individuals (T1, T2, T3) display increased SNP types consistent with the mutagen action (GC > AT) while the untreated individual (the first bar of each group) displays only background changes in both de novo referenced and
<italic>O.s</italic>
. genome referenced analyses. Changes that differ statistically from the expectation of random sequencing errors are marked by the asterisk.</p>
</caption>
<graphic xlink:href="1471-2164-13-72-7"></graphic>
</fig>
</sec>
<sec>
<title>Measurement of mutation rate</title>
<p>To determine the mutation rate, the count of unique SNPs in each treatment was divided by the total number of bp effectively assayed (see Methods for details and Table
<xref ref-type="table" rid="T3">3</xref>
). The number of putative homozygous SNPs per Mb in the
<italic>O. sativa </italic>
genome-referenced analysis was 0.4 for the control, 0.58 for T1, 2.7 for T2, and 1.6 for T3. For the
<italic>de novo</italic>
-referenced analysis the number was 0.54 for the control, 1.1 for T1, 4.07 for T2, and 2.53 for T3. These values were fairly similar, suggesting that reference-free alignments can be as effective as reference-guided ones for the discovery of very rare polymorphisms. Deriving the total mutation rate can be complicated by the assumptions used and is expected to be sensitive to coverage (see Discussion). Since the above calculation did not include the heterozygous SNPs, it underestimates the real mutation rate. Depending on the number of heterozygous mutations that were classified as homozygous (see Discussion), the actual mutation density may be up to 3 times the one reported above. Nonetheless, the estimated mutation densities were generally consistent with previous work [
<xref ref-type="bibr" rid="B2">2</xref>
] and provide a guide in the design of future mutagenesis experiments.</p>
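<p>The density calculation above is a simple ratio, sketched here for clarity. The function name <italic>mutation_density</italic> is hypothetical, and the SNP count in the example is an illustrative value rather than a figure reported in this article; only the assayed space (77.7 Mb for T1, homozygous scoring, O. sativa reference) is taken from Table 3.</p>
<preformat>
```python
def mutation_density(snp_count, assayed_mb):
    """Unique SNPs per Mb of effectively assayed sequence."""
    if not assayed_mb > 0:
        raise ValueError("assayed_mb must be positive")
    return snp_count / assayed_mb


# 77.7 Mb: T1 complexity for homozygous scoring (Table 3).
# 45 SNPs: hypothetical count, chosen only to illustrate the arithmetic.
print(round(mutation_density(45, 77.7), 2))
```
</preformat>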
</sec>
</sec>
<sec>
<title>Discussion</title>
<p>We demonstrate the use of a simple method to identify extremely rare SNPs in the genomes of individuals and to estimate the associated mutation rate, independent of the availability of a reference genome. The method takes advantage of the facile construction of libraries for Illumina sequencers using restriction-digested genomes. Construction of reduced representation libraries using restriction enzymes was first described for Sanger sequencing [
<xref ref-type="bibr" rid="B6">6</xref>
]. The method has since been applied to high throughput sequencing library construction for the 454 platform [
<xref ref-type="bibr" rid="B20">20</xref>
], and for Illumina [
<xref ref-type="bibr" rid="B7">7</xref>
,
<xref ref-type="bibr" rid="B8">8</xref>
,
<xref ref-type="bibr" rid="B21">21</xref>
-
<xref ref-type="bibr" rid="B23">23</xref>
]. These approaches have been reviewed recently [
<xref ref-type="bibr" rid="B22">22</xref>
]. Restriction site associated DNA tags (RAD) sequencing, described four years ago, has found multiple successful applications [
<xref ref-type="bibr" rid="B22">22</xref>
,
<xref ref-type="bibr" rid="B24">24</xref>
-
<xref ref-type="bibr" rid="B31">31</xref>
]. Our method differs from RAD sequencing because a genomic fragment in the sequencing library is defined by two symmetric restriction sites instead of an asymmetric combination of a restriction site with a randomly fragmented and flushed end, thus being most similar to that of Andolfatto
<italic>et al. </italic>
[
<xref ref-type="bibr" rid="B23">23</xref>
] and of Elshire
<italic>et al. </italic>
[
<xref ref-type="bibr" rid="B8">8</xref>
]. The one-step ligation for library construction used by all three methods provides a simple, robust and easily implemented step. Our method differs from the first because the terminal portion of the Illumina adapter sequence is left intact allowing the use of the standard Illumina sequencing primers and combined sequencing with regular libraries. It differs from the second by the use of a single Y-adapter instead of two fully double-stranded ones and the placement of the same barcode on both sequencing reads derived from paired-end sequencing (Figure
<xref ref-type="fig" rid="F1">1</xref>
). Double barcoding identifies potential chimeric products in multiplexed libraries that are sequenced as paired reads. The method employs oligonucleotides with desalted purity and no phosphorothioate modification, enabling considerable savings in setting up 96-barcode multiplexing. Similarly to the two methods above, different types of overhangs can be used. When used with a two-base overhang producing enzyme such as
<italic>MseI </italic>
(T↓TAA), the adapter can be ligated to the insert in the presence of the restriction enzyme because adapter-insert ligation eliminates the restriction site. When used with a four-base overhang producing enzyme such as
<italic>NlaIII</italic>
, the restriction enzyme must be removed or inactivated.</p>
<p>The complexity of the sequencing library can be modified by the choice of restriction enzyme and by size fractionation of DNA. Six-base cutters reduce complexity compared to four-base cutters. For example, the six-base cutter
<italic>SphI </italic>
(GCATG↓C) can be used with the
<italic>NlaIII </italic>
(CATG↓) adapters achieving satisfactory reduction in complexity of large genomes (> 10 Gb; Monson-Miller and Comai, unpublished results). SPRI bead cleanup before and after ligation of the digested DNA can be tailored to produce different molecular size cuts. Although size selection can be achieved by excision of DNA from agarose gel after electrophoresis, we found that the latter method is less efficient, more laborious and not easily scalable. A drawback of the SPRI bead-based fractionation is that removal of the abundant smaller fragments can be incomplete, resulting in their capture through intrafragmental ligation and chimeric library products. These instances complicate analyses based on assumed contiguity of the paired end reads. Often, however, the two reads are queried independently and this is not a problem.</p>
<p>Analysis of reduced complexity libraries starts by mapping quality-filtered reads to a reference genome using software such as ELAND or BWA. Two file types are useful in this application: the SAM files output by BWA and the pileup files derived from them with SAMtools. The first provides the entry point of each read, identifying restriction sites common to the input and the reference as well as restriction sites unique to the input and not present in the reference. Interestingly, the latter sites can be highly predictive of SNPs even with very low coverage (one or two reads, Figure
<xref ref-type="fig" rid="F4">4</xref>
). For example, using
<italic>MseI </italic>
(T↓TAA) we demonstrated that sites where the fourth base (TTA
<underline>A</underline>
) is verified in the read and the corresponding reference proto site is TTAB, are confirmed in over 80% of the tested cases (Figure
<xref ref-type="fig" rid="F4">4</xref>
). Of 3000 inferred B > A SNPs, 1200 were confirmed in the databases. We believe that the remaining SNPs are likely to be real as well.</p>
<p>More commonly, SNP discovery employs the sequence of the read beyond the restriction site, a method that requires higher coverage but is more productive because it queries more sequence (currently 100 bases vs the 4 of a restriction site) and can provide codominant information for any SNP discovered. Detection of these SNPs is achieved by parsing the pileup table generated from the BWA alignment, where calls for each position are listed with the corresponding qualities [
<xref ref-type="bibr" rid="B17">17</xref>
]. We searched for changes induced by NaAzide, which compared to changes derived from natural variation represent a considerable challenge because they are much rarer than variation SNP [
<xref ref-type="bibr" rid="B1">1</xref>
,
<xref ref-type="bibr" rid="B15">15</xref>
]. Typical mutations are present in mutagenized diploid genomes with frequencies ranging from less than 0.5 to 10 per 10
<sup>6 </sup>
bp of diploid DNA. Furthermore, while natural variation SNP are shared by multiple individuals and can thus be confirmed through biological replication, most mutations affect a single individual. We were able to detect induced changes by applying a common sense strategy (Figure
<xref ref-type="fig" rid="F5">5</xref>
, see Methods). We were helped by the specificity of the induced changes: the mutations conformed to the NaAzide mutagenic spectrum detected in barley and consisted almost exclusively of G:C to A:T transitions [
<xref ref-type="bibr" rid="B19">19</xref>
] in contrast to the 28% G:C to A:T (accompanied by 28% A:T to G:C) transitions expected from natural variation [
<xref ref-type="bibr" rid="B32">32</xref>
].</p>
<p>Reference-independent discovery using k-mers found more SNP, including 70-80% of those found in the
<italic>O. sativa</italic>
-referenced discovery. That 20 to 30% of the potential mutations found in the referenced analysis were not found in the k-mer analysis can partially be explained by the trimming of the k-mers (from 75 bp to 70 bp) and the further trimming of the reads (from 75 bp to 65 bp, resulting in an effective loss of 14% of the sequence information). Another factor differentiating the two approaches is the potentially different treatment of repeated regions. SNPs that are in known repetitive regions would be excluded in the referenced search. However, if only one copy of a repeat was represented in the RESCAN library, it would behave as single copy and be scored in the reference-independent discovery. Such cases may contribute to the efficiency of
<italic>de novo</italic>
-referenced discovery.</p>
<p>A considerable challenge in the analysis is the presence of heterozygous mutations, which in M2 individuals are expected to account for 2/3 of the mutant sites. One difficulty lies in distinguishing homozygous from heterozygous sites: for example, a base position for which three calls are all variant could be homozygous mutant or heterozygous with associated probabilities of 0.875 vs 0.125 (0.5
<sup>3</sup>
), respectively. Similarly, a heterozygous site could yield three wild-type calls with a probability of 0.125. A second difficulty, connected to the first, lies in estimating the covered genome because each coverage level has an associated probability of detection. In order to reduce noise, we set our algorithm to call heterozygous sites only if they carry a minimum of 5 mutant calls. For example, if a 100-base DNA segment was sequenced to a coverage of ten, any heterozygous site would yield call ratios according to the binomial distribution, resulting in a probability of detection of 0.62 (X ≥ 5, p = 0.5). Because heterozygotes are detected with lower efficiency, estimating the mutation density under these conditions would require adjusting the number of bases effectively assayed to 62 instead of 100. In practical terms, this is laborious and may require the careful construction of an adequate statistical model. A simpler solution to the heterozygous problem would be to increase the coverage, which can be achieved as discussed above. For the purpose of our estimate, we derived a mutation rate using a simplified calculation with the homozygous mutants. We estimate that this might be between one half and one third of the real mutation rate, depending on the fraction of putative homozygous sites that are actually heterozygous.</p>
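<p>The detection probability quoted above follows directly from the binomial distribution and can be reproduced with a short sketch. The function name <italic>detection_probability</italic> is hypothetical; the threshold of 5 mutant calls and p = 0.5 are taken from the text.</p>
<preformat>
```python
from math import comb


def detection_probability(coverage, min_calls=5, p=0.5):
    """P(X >= min_calls) for X ~ Binomial(coverage, p).

    For a heterozygous site, p = 0.5 is the chance that any one read
    carries the mutant allele.
    """
    return sum(
        comb(coverage, x) * p ** x * (1 - p) ** (coverage - x)
        for x in range(min_calls, coverage + 1)
    )


p_detect = detection_probability(10)      # coverage of ten, at least 5 mutant calls
print(round(p_detect, 2))                 # probability of detection
print(round(100 * p_detect))              # effective bases assayed out of 100
print(detection_probability(3, min_calls=3))  # three calls, all variant
```
</preformat>
<p>Applying the same function to every observed coverage level and summing the results would give the effectively assayed space for heterozygous sites, which is the adjustment described in the text.</p>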
<p>The number of SNPs consistent with NaAzide mutagenesis was higher in all 3 treated individuals than in the control, and mutation density peaked in the intermediate treatment according to the homozygous calls. This is not the case when considering the heterozygous calls. If we consider the homozygous analysis to be more accurate, which is plausible, this behavior requires an explanation. The lethality of the mutagenic treatments increased from low to a very high 96% in the 10 mM NaAzide treatment. It is possible that the severity of the 10 mM treatment may be counterproductive and that, for example, survivors may have escaped the full treatment. A similar observation has been reported in barley using sterility as a proxy for mutation density [
<xref ref-type="bibr" rid="B33">33</xref>
]. Alternatively, variation may result from the limited sampling. It is possible, for example, that different cell types in the embryo may respond differently to the mutagen and subsequently enter stochastically the transient germ line that gives rise to plant gametes [
<xref ref-type="bibr" rid="B1">1</xref>
]. Therefore, the extent of individual variability remains to be assessed.</p>
<p>The method described here should be applicable to more studies than just those focusing on mutagenesis. For example, it will allow mapping and backcrossing of induced mutations in the background of the same accession used for mutagenesis by providing markers that allow discrimination of the mutagenized genome from the wild type. It should also allow comparison of substrains of the same variety and, if sufficient SNPs are found, genetic characterization of diverged traits. We have also applied RESCAN to natural SNP discovery and mapping in rice (Tai
<italic>et al</italic>
., unpublished results),
<italic>Arabidopsis suecica </italic>
(Henry and Comai, unpublished results) and
<italic>Arabidopsis thaliana </italic>
[
<xref ref-type="bibr" rid="B34">34</xref>
]. In all these systems, RESCAN proved robust in its application and analysis.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>We describe here the development and application of a simple and economical method for reduced representation sequencing. We demonstrated its effectiveness by measuring the mutation rate in multiple individuals. RESCAN libraries, made by direct annealing and ligation of adapters to digested fragments of genomic DNA, are easy to multiplex and analyze. Coupled to the ability to assay a genome without a reference, the method should facilitate genotyping as well as the measurement of mutation densities in many systems.</p>
</sec>
<sec sec-type="methods">
<title>Methods</title>
<sec>
<title>Mutagenesis with sodium azide</title>
<p>Seeds of
<italic>O. sativa </italic>
ssp.
<italic>japonica </italic>
(cv. Kitaake), a variety closely related to the reference cv. Nipponbare, were pre-soaked in ultrapure water for 20 hours at 25°C prior to sodium azide treatment. For mutagenesis, sodium azide solutions of 1 mM, 5 mM, and 10 mM were made in 0.1 M sodium phosphate buffer, pH 3. Batches of 100 seeds were treated with 27 ml of sodium azide solution in a 50 ml tube at RT (22-24°C) for 3 hours. Sodium azide was decanted and seeds were washed 3 times with 30 ml of ultrapure water for 5 minutes each time. Seeds were then transferred to germination paper in a standard plastic petri dish for germination at 25°C. Control Kitaake seeds (neither presoaked nor treated with sodium azide) were plated at the same time. After 7 days, germinated seeds were transplanted to UC soil mix C and grown in a greenhouse to produce M2 seeds. M2 seeds were planted directly in UC soil mix C and leaf tissue was harvested for DNA isolation.</p>
</sec>
<sec>
<title>DNA extraction</title>
<p>Total genomic DNA was isolated from frozen leaf tissues that were mechanically ground prior to extraction using a potassium acetate-SDS method [
<xref ref-type="bibr" rid="B35">35</xref>
].</p>
</sec>
<sec>
<title>RESCAN library construction</title>
<sec>
<title>Method development</title>
<p>Approximately 1000 ng of DNA was digested with the restriction enzyme
<italic>MseI </italic>
(T↓TAA, NEB, Ipswich, Massachusetts, cat. no. R0525) for 1 to 6 hours at 37°C. After the completion of digestion was verified by agarose gel electrophoresis, the DNA was purified with a Qiaquick PCR purification minicolumn (Qiagen, Germantown, Maryland, cat. no. 28104) and resuspended in 40 μl of 10 mM Tris buffer. Alternatively, the desired size range of genomic fragments was excised from agarose gel after electrophoretic separation, extracted with a Qiaquick gel extraction kit (cat. no. 28704) and resuspended in 20 μl of 10 mM Tris buffer. For ligation, 20 μl of genomic DNA, either from the total digestion or from the size cut, were combined in a final volume of 44 μl with the ligase buffer provided by the manufacturer, 0.5 μl of T4 DNA ligase (various manufacturers), 0.5 μl of
<italic>MseI </italic>
(except for agct barcode, see below), and 1 μl of 0.05 μM premixed adapter oligonucleotides to form the single end sequencing Illumina Y adapter. The use of
<italic>MseI </italic>
during ligation depended on the sequence of the adapter and was employed whenever possible to minimize ligation between genomic fragments. The sequence of the adapter oligonucleotides is shown below, with the barcode in lower case. Additional barcodes used were (shown as the adA2 oligonucleotide strand sequence): gata, cacc, tagc, agct, ctag. Note that all, except "agct", cause loss of the
<italic>MseI </italic>
site upon ligation to an
<italic>MseI </italic>
fragment. The adapter sequences are shown for the record, but are no longer suited for the 2012 and later Illumina sequencing platforms. The oligonucleotides, prepared at desalted quality, were obtained from Life Technologies (
<ext-link ext-link-type="uri" xlink:href="http://www.invitrogen.com">http://www.invitrogen.com</ext-link>
).</p>
<p>adA2_GGTG: P-TAggtgAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG</p>
<p>adB2_GGTG: ACACTCTTTCCCTACACGACGCTCTTCCGATCTcacc</p>
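<p>The effect of the barcode on the MseI site can be checked with a short sketch. It assumes that the adA2 strand (P-TA followed by the barcode) joins the 3' T left on an MseI-cut fragment, so that the junction reads T, TA, barcode; under that reading the TTAA site is re-formed only when the barcode starts with "a". The function names are hypothetical.</p>
<preformat>
```python
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq):
    """Reverse complement of a lower-case or upper-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def regenerates_msei_site(barcode):
    """True if fragment end (T) + adapter overhang (TA) + barcode re-forms TTAA."""
    junction = "t" + "ta" + barcode.lower()
    return junction.startswith("ttaa")


# Barcodes listed in the text, shown as the adA2 strand sequence.
barcodes = ["ggtg", "gata", "cacc", "tagc", "agct", "ctag"]
print([bc for bc in barcodes if regenerates_msei_site(bc)])

# The adB2 strand carries the reverse complement of the adA2 barcode.
print(reverse_complement("ggtg"))
```
</preformat>
<p>The same two checks can be run on any new barcode set before ordering oligonucleotides, to decide whether the restriction enzyme can be kept in the ligation reaction.</p>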
<p>Ligated DNA was purified on a Qiaquick column, enriched by PCR amplification with Illumina PCR amplification primers for 16 to 18 cycles and examined by analytical gel electrophoresis. A slightly diffuse band in the target range of molecular weight (insert size + ~100 b of adapters) was diagnostic of the desired outcome. The presence of excessive adapter dimer (a fragment in the 100-150 bp range) was undesirable. If found, it could be removed by preparative gel electrophoresis, or it could be minimized by repeating the procedure from the ligation step using a lower concentration of adapters.</p>
<p>Illumina primers:</p>
<p>pr1: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT</p>
<p>pr2: CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT</p>
<p>Sequencing of the RESCAN libraries started with 25b reads on the original (2007) Genome Analyzer and progressively employed the improvements in chemistry and apparatus.</p>
</sec>
<sec>
<title>Standard method</title>
<p>Approximately 500 ng of DNA was digested with the restriction enzyme
<italic>NlaIII </italic>
(NEB, Ipswich, Massachusetts, cat. no. R0125) for 1 to 6 hours at 37°C. After the completion of digestion was verified by agarose gel electrophoresis, the DNA was purified, selecting the desired molecular weight range. For this purpose AMPure SPRI Beads (Beckman Coulter Genomics, Danvers, Massachusetts, cat. no. A50850) were added in the suitable ratio to DNA (see Figure
<xref ref-type="fig" rid="F3">3</xref>
) and used according to the manufacturer's instructions. For example, to remove fragments smaller than 100 b the recommended ratio (1.8:1) was used. To remove fragments larger than a certain size (top cut), SPRI beads in the desired ratio were applied and the unbound fraction was saved. Because the bulk of the fragments are always found in the bottom fraction (Figure
<xref ref-type="fig" rid="F3">3</xref>
), we commonly used conditions (i.e. ratios) that would remove all fragments below a certain molecular weight and directly proceeded with ligation of the remaining fragments to the adapter. Although this represented a range of sizes, the higher frequency of smaller fragments and their advantage during the amplification steps resulted in a de facto enrichment of the near-bottom class of fragments. This simplification shortened the protocol and made the method cheaper, important for high level multiplexing. Oligonucleotides for the barcoded Illumina adapters used in the mutation detection experiment were as follows (barcode in lower case):</p>
<p>Control, adA: P-atcacAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG</p>
<p>adB: ACACTCTTTCCCTACACGACGCTCTTCCGATCTgtgatCATG</p>
<p>The oligonucleotides were ordered at desalted quality and were not modified except for phosphorylation of adA. The same oligonucleotides were used for the control, T1, T2 and T3 samples, except for the barcode (respectively: atcac, ctctc, cgaat, and gagca). The oligonucleotides were mixed in a 1:1 ratio to form the adapter, which was stored and used as needed without any annealing pre-incubation. One μl of a 0.05 μM dilution of the adapter was added to the SPRI bead-fractionated DNA in a 44 μl ligation reaction employing the Quick Ligation Kit (NEB, cat. no. M2200). After a 15 minute incubation at room temperature, the ligation reaction was cleaned with AMPure SPRI Beads, utilizing a 0.8:1 v/v ("bead in binding buffer":sample) ratio to remove smaller fragments (less than 250 bp) and unligated adapters. The libraries were enriched using a mix of 10 μl of template, 15 μl of Phusion 2x HF Master Mix (NEB, cat. no. F531), 1 μl of 5 μM premixed paired-end Illumina primers and 4 μl of water and the following amplification protocol: 30 sec at 98°C; 14 cycles of 10 sec at 98°C, 30 sec at 65°C, and 30 sec at 72°C; and a final extension of 5 min at 72°C. The PCR product was purified using AMPure SPRI Beads with a 0.8:1 v/v (bead in buffer:sample) ratio. Libraries were quantified using the Agilent 2100 Bioanalyzer, and were sequenced according to the manufacturer's instructions on one and one half lanes (3/8 lane per individual) of the Illumina GAII (Illumina, San Diego, California) with 85-bp paired-end reads.</p>
</sec>
</sec>
<sec>
<title>Computational analysis</title>
<p>Analysis during method development employed the Illumina alignment pipeline using the Eland program with the relevant updates of years 2007, 2008 and 2009. For the mutation analysis, the Illumina 1.5+ format (fastq) reads were filtered using a custom informatic pipeline (
<ext-link ext-link-type="uri" xlink:href="http://tinyurl.com/barcode-tool">http://tinyurl.com/barcode-tool</ext-link>
) that divided them based on barcode. Additionally, it removed the barcode sequences, adapter and primer sequences, reads shorter than 25-bp, and reads containing bases with Phred quality scores less than 20. Quality scores were converted to Sanger scale, which is compatible with most alignment programs.</p>
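<p>The read-filtering steps listed above can be sketched as follows. This is not the authors' pipeline: the function name <italic>process_read</italic>, the sample names and the barcode sequences are hypothetical, the barcode is assumed to sit at the very start of the read, and adapter and primer removal is omitted. The offsets used are the standard ones: Illumina 1.5+ qualities are encoded as Phred+64 and Sanger qualities as Phred+33.</p>
<preformat>
```python
def process_read(seq, qual, barcodes, min_len=25, min_q=20):
    """Assign a read to a sample, trim its barcode and filter on quality.

    seq, qual : sequence and Phred+64 (Illumina 1.5+) quality string.
    barcodes  : dict mapping sample name to the barcode as it appears at
                the start of the read (hypothetical layout).
    Returns (sample, trimmed_seq, sanger_qual), or None if the read is rejected.
    """
    for sample, barcode in barcodes.items():
        if seq.startswith(barcode):
            break
    else:
        return None  # no known barcode at the start of the read

    seq, qual = seq[len(barcode):], qual[len(barcode):]
    phred = [ord(ch) - 64 for ch in qual]

    # Reject reads that are too short or contain any low-quality base.
    if min_len > len(seq) or any(min_q > q for q in phred):
        return None

    # Re-encode the qualities on the Sanger (Phred+33) scale.
    sanger = "".join(chr(q + 33) for q in phred)
    return sample, seq, sanger


# Hypothetical barcodes and a toy 35-base read with uniform quality 40 ("h").
toy_barcodes = {"sampleA": "ACGTA", "sampleB": "TTGCA"}
print(process_read("ACGTA" + "C" * 30, "h" * 35, toy_barcodes))
```
</preformat>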
<p>
<bold>Mutation detection</bold>
: For referenced discovery: BWA (
<ext-link ext-link-type="uri" xlink:href="http://bio-bwa.sourceforge.net/">http://bio-bwa.sourceforge.net/</ext-link>
) was used to align reads to the reference (Os 6.1,
<ext-link ext-link-type="uri" xlink:href="http://rice.plantbiology.msu.edu/">http://rice.plantbiology.msu.edu/</ext-link>
) [
<xref ref-type="bibr" rid="B16">16</xref>
] genome with default mismatch allowance, producing an mpileup file with Samtools [
<xref ref-type="bibr" rid="B36">36</xref>
] (
<ext-link ext-link-type="uri" xlink:href="http://samtools.sourceforge.net/">http://samtools.sourceforge.net/</ext-link>
). The mpileup file contains the base calls at each position, for each library. Basecalls can be A, T, C, G, * (deleted base) or insertions (+AAT for example). This file was parsed in the following manner. First, any basecall with a sequence quality lower than 20, or from a read with a mapping quality lower than 20, was discarded. Next, the basecalls for the four libraries were pooled and only positions that were collectively covered not more than 200 times were retained (to avoid repeated sequences). Positions that were not covered at least once in each of the four libraries were further discarded. If all basecalls were the same, that position was classified as homozygous and further classified as "ref" if the basecall was the same as in the reference genome or "SNP" if it was not. If there was more than 1 basecall, the following criteria were applied: if the least frequent basecall was found in more than 1 library and accounted for more than 10% of the basecalls, the base was called heterozygous. If the least frequent basecall was found in more than 1 library and accounted for less than 10% of the basecalls, the base was called homozygous.</p>
<p>This latter subset of positions was further assayed for the presence of potential mutations. Potential mutations were detected as follows: i) if there were only 2 different basecalls (one dominant "WT" basecall and another): if the non-WT base was observed at least twice from a single library and never from the other three, and there were no other basecalls for that library, the mutation was classified as a potential homozygous mutation. If the non-WT base was observed at least 5 times but there were other basecalls for that library, it was classified as a potential heterozygous mutation. ii) Positions for which more than two different basecalls were observed were dealt with in the following manner: if the least frequent basecalls were each only found once, the position was ignored for mutation detection purposes. Similarly, if the least frequent basecall was only found once, it was ignored and that base was processed as if there were only 2 basecalls (see above). If all basecalls were each found at least twice, that base was classified as "ambiguous" and removed from further analysis. This analysis was performed on reads aligned to the reference genome and on reads aligned to the pseudo-reference.</p>
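<p>Case i) of the rules above (exactly two different basecalls at a position) can be sketched as follows. This is a simplified illustration, not the authors' parser: the function name <italic>classify_position</italic> and its return labels are hypothetical, positions with three or more basecalls are skipped rather than handled as in case ii), the "WT" base is taken to be the one seen in the other libraries, and more than two libraries are assumed.</p>
<preformat>
```python
def classify_position(calls, max_total=200, min_hom=2, min_het=5):
    """Classify one position from quality-filtered basecalls.

    calls maps library name to a string of basecalls, e.g. {"C": "GGG", ...}.
    Returns (label, library); library is None unless a candidate is found.
    """
    total = sum(len(c) for c in calls.values())
    if total > max_total or any(len(c) == 0 for c in calls.values()):
        return ("not_assayed", None)

    bases = set("".join(calls.values()))
    if len(bases) == 1:
        return ("no_variant", None)
    if len(bases) > 2:
        return ("skipped", None)  # case ii) is not covered by this sketch

    for alt in sorted(bases):
        carriers = [lib for lib, c in calls.items() if alt in c]
        if len(carriers) != 1:
            continue  # a mutation must be unique to one library
        lib = carriers[0]
        depth = len(calls[lib])
        n_alt = calls[lib].count(alt)
        if n_alt == depth and n_alt >= min_hom:
            return ("homozygous", lib)
        if depth > n_alt and n_alt >= min_het:
            return ("heterozygous", lib)
    return ("no_call", None)


print(classify_position({"C": "GGG", "T1": "GG", "T2": "AA", "T3": "G"}))
print(classify_position({"C": "GGG", "T1": "GG", "T2": "AAAAAGGGG", "T3": "G"}))
```
</preformat>
<p>Keeping the thresholds as parameters makes it easy to explore how the minimum number of variant calls trades sensitivity against noise.</p>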
<p>The numbers of positions obtained in each of the categories described above are summarized as follows for the
<italic>O. sativa</italic>
-based reference and for
<italic>de novo </italic>
reference, respectively: million bases in pileup table = 118.6, 101.9; total coverage not exceeding 200 and covered in all four samples (Mbases) = 84.7, 89.2; % SNP (as defined above) = 0.028, 0.0004; % heterozygotes = 0.019, 0.036.</p>
<p>To assess mutation rates, the number of putative mutations was divided by the number of assayed bases for each library. For homozygous scoring, these had to meet the following requirements: i) covered at least once in each of the four libraries, ii) covered at least twice in the putative mutant, iii) covered less than 200 times cumulatively in all libraries. For heterozygous scoring, the requirements were as follows: i) covered at least once in each of the four libraries, ii) covered less than 200 times cumulatively in all libraries and iii) mutant allele covered at least five times in the mutant library and never in the other three libraries. Positions covered less than 15 times were adjusted for random sampling effects of non-mutant bases in a heterozygous background (see Discussion).</p>
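<p>The homozygous-scoring requirements above define the assayed space for each library. A minimal sketch of that count is given below, assuming that per-position read depths are already available for the four libraries; the function name <italic>assayed_positions</italic> and the toy depths are hypothetical.</p>
<preformat>
```python
def assayed_positions(depths, mutant, max_total=200, min_mutant=2):
    """Count positions that qualify for homozygous scoring in one library.

    depths : iterable of dicts mapping library name to read depth at a position.
    mutant : name of the library treated as the putative mutant.
    """
    n = 0
    for row in depths:
        covered_everywhere = min(row.values()) >= 1      # requirement i
        enough_in_mutant = row[mutant] >= min_mutant     # requirement ii
        not_repeated = max_total > sum(row.values())     # requirement iii
        if covered_everywhere and enough_in_mutant and not_repeated:
            n += 1
    return n


# Toy depths for four positions (hypothetical values).
toy_depths = [
    {"C": 3, "T1": 2, "T2": 4, "T3": 1},
    {"C": 3, "T1": 1, "T2": 4, "T3": 1},
    {"C": 0, "T1": 5, "T2": 4, "T3": 2},
    {"C": 100, "T1": 50, "T2": 40, "T3": 20},
]
print(assayed_positions(toy_depths, "T1"), assayed_positions(toy_depths, "T2"))
```
</preformat>
<p>Because requirement ii) depends on the library being scored, the assayed space differs slightly between libraries, as seen in the "Complexity, homozygous changes" rows of Table 3.</p>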
<p>In order to determine how many potential mutations were found in both types of analysis, the k-mers containing potential mutations were aligned to the reference genome using BWA (as described above). The position of each potential mutation in the reference genome was derived from the alignment position of the k-mer and the position of the mutation within the k-mer. This set of positions was compared to the set of positions obtained from the referenced analysis.</p>
</sec>
<sec>
<title>Statistical tests</title>
<p>Fisher's exact test was used to compare the observed SNP type ratios with those expected from sequencing errors. We derived an expected fraction of GC > AT changes among sequencing errors of 0.59 based on Figure
<xref ref-type="fig" rid="F2">2c</xref>
of Tsai
<italic>et al</italic>
.[
<xref ref-type="bibr" rid="B2">2</xref>
]</p>
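<p>A hedged sketch of such a comparison is given below. It implements a one-sided Fisher's exact test from the hypergeometric distribution using only the Python standard library. The way the expected counts are built (applying the 0.59 fraction to the same number of changes as observed) and the example counts are assumptions made for illustration; the text does not specify how the contingency table was constructed.</p>

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric distribution with the
    row and column totals held fixed.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Hypothetical example: 20 observed changes, all GC > AT transitions.
observed_gc_at, observed_other = 20, 0
total = observed_gc_at + observed_other
expected_gc_at = round(0.59 * total)      # fraction expected from errors
expected_other = total - expected_gc_at

p_value = fisher_exact_greater(
    observed_gc_at, observed_other, expected_gc_at, expected_other
)
print(p_value)
```

<p>A small p-value here would indicate that the observed excess of GC > AT changes is unlikely to arise from sequencing errors alone. A two-sided test, if preferred, is available in standard statistics packages.</p>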
</sec>
<sec>
<title>Data, software and further information</title>
<p>Sequence reads used for the mutation analysis are available at the NCBI Sequence Read Archive under the following accession number: [Sequence Read Archive:SRA049884.2]. Software and additional information on the RESCAN method are available at
<ext-link ext-link-type="uri" xlink:href="http://comailab.genomecenter.ucdavis.edu/index.php/RESCAN">http://comailab.genomecenter.ucdavis.edu/index.php/RESCAN</ext-link>
.</p>
</sec>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec>
<title>Authors' contributions</title>
<p>LC, JMM and JF carried out preliminary method development; THT and LC conceived the mutation search project; THT supervised the mutagenesis and initiated the mutation search project; DCSM performed the mutagenesis and prepared the sequencing libraries; LC and IMH supervised the bioinformatic analyses; JMM and IMH performed the bioinformatic analyses; JMM, IMH, THT and LC wrote the manuscript. All authors read and approved the final manuscript.</p>
</sec>
</body>
<back>
<sec>
<title>Acknowledgements</title>
<p>We thank Meric Lieberman for help in the bioinformatic analysis. This research was supported by USDA-ARS CRIS Project 5306-21000-017-00D (THT), International Atomic Energy Agency 58-5306-0-112F (THT), International Atomic Energy Agency Fellowship (DCSM), and by NSF Plant Genome award DBI-0822383, TRPGR: Efficient identification of induced mutations in crop species by ultra-high-throughput DNA sequencing (JMM, IMH and LC).</p>
</sec>
<ref-list>
<ref id="B1">
<mixed-citation publication-type="journal">
<name>
<surname>Comai</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Henikoff</surname>
<given-names>S</given-names>
</name>
<article-title>TILLING: practical single-nucleotide mutation discovery</article-title>
<source>Plant J</source>
<year>2006</year>
<volume>45</volume>
<fpage>684</fpage>
<lpage>694</lpage>
<pub-id pub-id-type="doi">10.1111/j.1365-313X.2006.02670.x</pub-id>
<pub-id pub-id-type="pmid">16441355</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal">
<name>
<surname>Tsai</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Howell</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Nitcher</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Missirian</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Watson</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Ngo</surname>
<given-names>KJ</given-names>
</name>
<name>
<surname>Lieberman</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fass</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Uauy</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Tran</surname>
<given-names>RK</given-names>
</name>
<name>
<surname>Khan</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Filkov</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Tai</surname>
<given-names>TH</given-names>
</name>
<name>
<surname>Dubcovsky</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Comai</surname>
<given-names>L</given-names>
</name>
<article-title>Discovery of rare mutations in populations: TILLING by sequencing</article-title>
<source>Plant Physiol</source>
<year>2011</year>
<volume>156</volume>
<fpage>1257</fpage>
<lpage>1268</lpage>
<pub-id pub-id-type="doi">10.1104/pp.110.169748</pub-id>
<pub-id pub-id-type="pmid">21531898</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="journal">
<name>
<surname>Missirian</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Comai</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Filkov</surname>
<given-names>V</given-names>
</name>
<article-title>Statistical Mutation Calling from Sequenced Overlapping DNA Pools in TILLING Experiments</article-title>
<source>BMC Bioinformatics</source>
<year>2011</year>
<volume>12</volume>
<fpage>287</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-12-287</pub-id>
<pub-id pub-id-type="pmid">21756356</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal">
<name>
<surname>Ossowski</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Schneeberger</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Lucas-Lledo</surname>
<given-names>JI</given-names>
</name>
<name>
<surname>Warthmann</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Clark</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Shaw</surname>
<given-names>RG</given-names>
</name>
<name>
<surname>Weigel</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Lynch</surname>
<given-names>M</given-names>
</name>
<article-title>The rate and molecular spectrum of spontaneous mutations in Arabidopsis thaliana</article-title>
<source>Science</source>
<year>2010</year>
<volume>327</volume>
<fpage>92</fpage>
<lpage>94</lpage>
<pub-id pub-id-type="doi">10.1126/science.1180677</pub-id>
<pub-id pub-id-type="pmid">20044577</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="journal">
<name>
<surname>Ng</surname>
<given-names>SB</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>EH</given-names>
</name>
<name>
<surname>Robertson</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Flygare</surname>
<given-names>SD</given-names>
</name>
<name>
<surname>Bigham</surname>
<given-names>AW</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Shaffer</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bhattacharjee</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Eichler</surname>
<given-names>EE</given-names>
</name>
<name>
<surname>Bamshad</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Nickerson</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Shendure</surname>
<given-names>J</given-names>
</name>
<article-title>Targeted capture and massively parallel sequencing of 12 human exomes</article-title>
<source>Nature</source>
<year>2009</year>
<volume>461</volume>
<fpage>272</fpage>
<lpage>276</lpage>
<pub-id pub-id-type="doi">10.1038/nature08250</pub-id>
<pub-id pub-id-type="pmid">19684571</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal">
<name>
<surname>Altshuler</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Pollara</surname>
<given-names>VJ</given-names>
</name>
<name>
<surname>Cowles</surname>
<given-names>CR</given-names>
</name>
<name>
<surname>Van Etten</surname>
<given-names>WJ</given-names>
</name>
<name>
<surname>Baldwin</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Linton</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Lander</surname>
<given-names>ES</given-names>
</name>
<article-title>An SNP map of the human genome generated by reduced representation shotgun sequencing</article-title>
<source>Nature</source>
<year>2000</year>
<volume>407</volume>
<fpage>513</fpage>
<lpage>516</lpage>
<pub-id pub-id-type="doi">10.1038/35035083</pub-id>
<pub-id pub-id-type="pmid">11029002</pub-id>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="journal">
<name>
<surname>Baird</surname>
<given-names>NA</given-names>
</name>
<name>
<surname>Etter</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Atwood</surname>
<given-names>TS</given-names>
</name>
<name>
<surname>Currey</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Shiver</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Lewis</surname>
<given-names>ZA</given-names>
</name>
<name>
<surname>Selker</surname>
<given-names>EU</given-names>
</name>
<name>
<surname>Cresko</surname>
<given-names>WA</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>EA</given-names>
</name>
<article-title>Rapid SNP discovery and genetic mapping using sequenced RAD markers</article-title>
<source>PLoS One</source>
<year>2008</year>
<volume>3</volume>
<fpage>e3376</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0003376</pub-id>
<pub-id pub-id-type="pmid">18852878</pub-id>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="journal">
<name>
<surname>Elshire</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Glaubitz</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Poland</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Kawamoto</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Buckler</surname>
<given-names>ES</given-names>
</name>
<name>
<surname>Mitchell</surname>
<given-names>SE</given-names>
</name>
<article-title>A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<fpage>e19379</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0019379</pub-id>
<pub-id pub-id-type="pmid">21573248</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal">
<name>
<surname>Goff</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Ricke</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Lan</surname>
<given-names>TH</given-names>
</name>
<name>
<surname>Presting</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Dunn</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Glazebrook</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Sessions</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Oeller</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Varma</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Hadley</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Hutchison</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Katagiri</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Lange</surname>
<given-names>BM</given-names>
</name>
<name>
<surname>Moughamer</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Budworth</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Miguel</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Paszkowski</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Colbert</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>WL</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Wood</surname>
<given-names>TC</given-names>
</name>
<name>
<surname>Mao</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Quail</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Wing</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Dean</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zharkikh</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Sahasrabudhe</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Cannings</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Gutin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Pruss</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Reid</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Tavtigian</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Mitchell</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Eldredge</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Scholl</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Bhatnagar</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Adey</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Rubano</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Tusneem</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Robinson</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Feldhaus</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Macalma</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Oliphant</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Briggs</surname>
<given-names>S</given-names>
</name>
<article-title>A draft sequence of the rice genome (Oryza sativa L. ssp. japonica)</article-title>
<source>Science</source>
<year>2002</year>
<volume>296</volume>
<fpage>92</fpage>
<lpage>100</lpage>
<pub-id pub-id-type="doi">10.1126/science.1068275</pub-id>
<pub-id pub-id-type="pmid">11935018</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<name>
<surname>Arabidopsis Genome Initiative</surname>
</name>
<article-title>Analysis of the genome sequence of the flowering plant Arabidopsis thaliana</article-title>
<source>Nature</source>
<year>2000</year>
<volume>408</volume>
<fpage>796</fpage>
<lpage>815</lpage>
<pub-id pub-id-type="doi">10.1038/35048692</pub-id>
<pub-id pub-id-type="pmid">11130711</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal">
<name>
<surname>Wei</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Therrien</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Blanchard</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Guan</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Z</given-names>
</name>
<article-title>The Fidelity Index provides a systematic quantitation of star activity of DNA restriction endonucleases</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<fpage>e50</fpage>
<pub-id pub-id-type="doi">10.1093/nar/gkn182</pub-id>
<pub-id pub-id-type="pmid">18413342</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="journal">
<name>
<surname>Quail</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Kozarewa</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Scally</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stephens</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Swerdlow</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>DJ</given-names>
</name>
<article-title>A large genome center's improvements to the Illumina sequencing system</article-title>
<source>Nat Methods</source>
<year>2008</year>
<volume>5</volume>
<fpage>1005</fpage>
<lpage>1010</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.1270</pub-id>
<pub-id pub-id-type="pmid">19034268</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal">
<name>
<surname>DeAngelis</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Hawkins</surname>
<given-names>TL</given-names>
</name>
<article-title>Solid-phase reversible immobilization for the isolation of PCR products</article-title>
<source>Nucleic Acids Res</source>
<year>1995</year>
<volume>23</volume>
<fpage>4742</fpage>
<lpage>4743</lpage>
<pub-id pub-id-type="doi">10.1093/nar/23.22.4742</pub-id>
<pub-id pub-id-type="pmid">8524672</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal">
<name>
<surname>Till</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Tai</surname>
<given-names>TH</given-names>
</name>
<name>
<surname>Colowit</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Greene</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Henikoff</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Comai</surname>
<given-names>L</given-names>
</name>
<article-title>Discovery of chemically induced mutations in rice by TILLING</article-title>
<source>BMC Plant Biol</source>
<year>2007</year>
<volume>7</volume>
<fpage>19</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2229-7-19</pub-id>
<pub-id pub-id-type="pmid">17428339</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal">
<name>
<surname>Barker</surname>
<given-names>GL</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>KJ</given-names>
</name>
<article-title>A genome-wide analysis of single nucleotide polymorphism diversity in the world's major cereal crops</article-title>
<source>Plant Biotechnol J</source>
<year>2009</year>
<volume>7</volume>
<fpage>318</fpage>
<lpage>325</lpage>
<pub-id pub-id-type="doi">10.1111/j.1467-7652.2009.00412.x</pub-id>
<pub-id pub-id-type="pmid">19386040</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="journal">
<name>
<surname>Ouyang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Hamilton</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Campbell</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Childs</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Thibaud-Nissen</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Malek</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>L</given-names>
</name>
<article-title>The TIGR rice genome annotation resource: improvements and new features</article-title>
<source>Nucleic Acids Res</source>
<year>2007</year>
<volume>35</volume>
<fpage>D883</fpage>
<lpage>D887</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkl976</pub-id>
<pub-id pub-id-type="pmid">17145706</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<name>
<surname>Li</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
<article-title>Fast and accurate short read alignment with Burrows-Wheeler transform</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<fpage>1754</fpage>
<lpage>1760</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp324</pub-id>
<pub-id pub-id-type="pmid">19451168</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="journal">
<name>
<surname>Hamming</surname>
<given-names>RW</given-names>
</name>
<article-title>Error detecting and error correcting codes</article-title>
<source>Bell Syst Tech J</source>
<year>1950</year>
<volume>29</volume>
<fpage>147</fpage>
<lpage>160</lpage>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<name>
<surname>Talame</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Bovina</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Sanguineti</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Tuberosa</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Lundqvist</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Salvi</surname>
<given-names>S</given-names>
</name>
<article-title>TILLMore, a resource for the discovery of chemically induced mutants in barley</article-title>
<source>Plant Biotechnol J</source>
<year>2008</year>
<volume>6</volume>
<fpage>477</fpage>
<lpage>485</lpage>
<pub-id pub-id-type="doi">10.1111/j.1467-7652.2008.00341.x</pub-id>
<pub-id pub-id-type="pmid">18422888</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal">
<name>
<surname>van Orsouw</surname>
<given-names>NJ</given-names>
</name>
<name>
<surname>Hogers</surname>
<given-names>RC</given-names>
</name>
<name>
<surname>Janssen</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yalcin</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Snoeijers</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Verstege</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Schneiders</surname>
<given-names>H</given-names>
</name>
<name>
<surname>van der Poel</surname>
<given-names>H</given-names>
</name>
<name>
<surname>van Oeveren</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Verstegen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>van Eijk</surname>
<given-names>MJ</given-names>
</name>
<article-title>Complexity reduction of polymorphic sequences (CRoPS): a novel approach for large-scale polymorphism discovery in complex genomes</article-title>
<source>PLoS One</source>
<year>2007</year>
<volume>2</volume>
<fpage>e1172</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0001172</pub-id>
<pub-id pub-id-type="pmid">18000544</pub-id>
</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal">
<name>
<surname>Van Tassell</surname>
<given-names>CP</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>TP</given-names>
</name>
<name>
<surname>Matukumalli</surname>
<given-names>LK</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Schnabel</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Lawley</surname>
<given-names>CT</given-names>
</name>
<name>
<surname>Haudenschild</surname>
<given-names>CD</given-names>
</name>
<name>
<surname>Moore</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>Warren</surname>
<given-names>WC</given-names>
</name>
<name>
<surname>Sonstegard</surname>
<given-names>TS</given-names>
</name>
<article-title>SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries</article-title>
<source>Nat Methods</source>
<year>2008</year>
<volume>5</volume>
<fpage>247</fpage>
<lpage>252</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.1185</pub-id>
<pub-id pub-id-type="pmid">18297082</pub-id>
</mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="journal">
<name>
<surname>Davey</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Hohenlohe</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>Etter</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Boone</surname>
<given-names>JQ</given-names>
</name>
<name>
<surname>Catchen</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Blaxter</surname>
<given-names>ML</given-names>
</name>
<article-title>Genome-wide genetic marker discovery and genotyping using next-generation sequencing</article-title>
<source>Nat Rev Genet</source>
<year>2011</year>
<volume>12</volume>
<fpage>499</fpage>
<lpage>510</lpage>
<pub-id pub-id-type="doi">10.1038/nrg3012</pub-id>
<pub-id pub-id-type="pmid">21681211</pub-id>
</mixed-citation>
</ref>
<ref id="B23">
<mixed-citation publication-type="other">
<name>
<surname>Andolfatto</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Davison</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Erezyilmaz</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>TT</given-names>
</name>
<name>
<surname>Mast</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Sunayama-Morita</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Stern</surname>
<given-names>DL</given-names>
</name>
<article-title>Multiplexed shotgun genotyping for rapid and efficient genetic mapping</article-title>
<source>Genome Res</source>
<year>2011</year>
</mixed-citation>
</ref>
<ref id="B24">
<mixed-citation publication-type="journal">
<name>
<surname>Scaglione</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Acquadro</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Portis</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Tirone</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Knapp</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Lanteri</surname>
<given-names>S</given-names>
</name>
<article-title>RAD tag sequencing as a source of SNP markers in Cynara cardunculus L</article-title>
<source>BMC Genomics</source>
<year>2012</year>
<volume>13</volume>
<fpage>3</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-13-3</pub-id>
<pub-id pub-id-type="pmid">22214349</pub-id>
</mixed-citation>
</ref>
<ref id="B25">
<mixed-citation publication-type="journal">
<name>
<surname>Barchi</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Lanteri</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Portis</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Acquadro</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Vale</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Toppino</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Rotino</surname>
<given-names>GL</given-names>
</name>
<article-title>Identification of SNP and SSR markers in eggplant using RAD tag sequencing</article-title>
<source>BMC Genomics</source>
<year>2011</year>
<volume>12</volume>
<fpage>304</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-12-304</pub-id>
<pub-id pub-id-type="pmid">21663628</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<mixed-citation publication-type="journal">
<name>
<surname>Etter</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Preston</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Bassham</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Cresko</surname>
<given-names>WA</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>EA</given-names>
</name>
<article-title>Local de novo assembly of RAD paired-end contigs using short sequencing reads</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<fpage>e18561</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0018561</pub-id>
<pub-id pub-id-type="pmid">21541009</pub-id>
</mixed-citation>
</ref>
<ref id="B27">
<mixed-citation publication-type="journal">
<name>
<surname>Etter</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Bassham</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hohenlohe</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Cresko</surname>
<given-names>WA</given-names>
</name>
<article-title>SNP discovery and genotyping for evolutionary genetics using RAD sequencing</article-title>
<source>Methods Mol Biol</source>
<year>2011</year>
<volume>772</volume>
<fpage>157</fpage>
<lpage>178</lpage>
<pub-id pub-id-type="pmid">22065437</pub-id>
</mixed-citation>
</ref>
<ref id="B28">
<mixed-citation publication-type="journal">
<name>
<surname>Chutimanitsakun</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Nipper</surname>
<given-names>RW</given-names>
</name>
<name>
<surname>Cuesta-Marcos</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Cistue</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Corey</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Filichkina</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Hayes</surname>
<given-names>PM</given-names>
</name>
<article-title>Construction and application for QTL analysis of a Restriction Site Associated DNA (RAD) linkage map in barley</article-title>
<source>BMC Genomics</source>
<year>2011</year>
<volume>12</volume>
<fpage>4</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-12-4</pub-id>
<pub-id pub-id-type="pmid">21205322</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<mixed-citation publication-type="journal">
<name>
<surname>Willing</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Hoffmann</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Klein</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Weigel</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Dreyer</surname>
<given-names>C</given-names>
</name>
<article-title>Paired-end RAD-seq for de novo assembly and marker design without available reference</article-title>
<source>Bioinformatics</source>
<year>2011</year>
<volume>27</volume>
<fpage>2187</fpage>
<lpage>2193</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btr346</pub-id>
<pub-id pub-id-type="pmid">21712251</pub-id>
</mixed-citation>
</ref>
<ref id="B30">
<mixed-citation publication-type="journal">
<name>
<surname>Hohenlohe</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>Amish</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Catchen</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Allendorf</surname>
<given-names>FW</given-names>
</name>
<name>
<surname>Luikart</surname>
<given-names>G</given-names>
</name>
<article-title>Next-generation RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow and westslope cutthroat trout</article-title>
<source>Mol Ecol Resour</source>
<year>2011</year>
<volume>11</volume>
<issue>Suppl 1</issue>
<fpage>117</fpage>
<lpage>122</lpage>
<pub-id pub-id-type="pmid">21429168</pub-id>
</mixed-citation>
</ref>
<ref id="B31">
<mixed-citation publication-type="journal">
<name>
<surname>Pfender</surname>
<given-names>WF</given-names>
</name>
<name>
<surname>Saha</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Slabaugh</surname>
<given-names>MB</given-names>
</name>
<article-title>Mapping with RAD (restriction-site associated DNA) markers to rapidly identify QTL for stem rust resistance in Lolium perenne</article-title>
<source>Theor Appl Genet</source>
<year>2011</year>
<volume>122</volume>
<fpage>1467</fpage>
<lpage>1480</lpage>
<pub-id pub-id-type="doi">10.1007/s00122-011-1546-3</pub-id>
<pub-id pub-id-type="pmid">21344184</pub-id>
</mixed-citation>
</ref>
<ref id="B32">
<mixed-citation publication-type="journal">
<name>
<surname>Zhao</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>QZ</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>CQ</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>HM</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>J</given-names>
</name>
<article-title>Neighboring-nucleotide effects on the mutation patterns of the rice genome</article-title>
<source>Genomics Proteomics Bioinformatics</source>
<year>2005</year>
<volume>3</volume>
<fpage>158</fpage>
<lpage>168</lpage>
<pub-id pub-id-type="pmid">16487081</pub-id>
</mixed-citation>
</ref>
<ref id="B33">
<mixed-citation publication-type="journal">
<name>
<surname>Prina</surname>
<given-names>AR</given-names>
</name>
<name>
<surname>Favret</surname>
<given-names>EA</given-names>
</name>
<article-title>Parabolic effect in sodium azide mutagenesis in barley</article-title>
<source>Hereditas</source>
<year>1983</year>
<volume>98</volume>
<fpage>89</fpage>
<lpage>94</lpage>
</mixed-citation>
</ref>
<ref id="B34">
<mixed-citation publication-type="other">
<name>
<surname>Seymour</surname>
<given-names>DK</given-names>
</name>
<name>
<surname>Filiault</surname>
<given-names>DL</given-names>
</name>
<name>
<surname>Henry</surname>
<given-names>IM</given-names>
</name>
<name>
<surname>Monson-Miller</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ravi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Pang</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Comai</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Chan</surname>
<given-names>SWL</given-names>
</name>
<name>
<surname>Maloof</surname>
<given-names>JN</given-names>
</name>
<article-title>Arabidopsis doubled haploids - rapid homozygous lines for quantitative trait locus mapping</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2012</year>
<comment> in press </comment>
</mixed-citation>
</ref>
<ref id="B35">
<mixed-citation publication-type="journal">
<name>
<surname>Tai</surname>
<given-names>TH</given-names>
</name>
<name>
<surname>Tanksley</surname>
<given-names>SD</given-names>
</name>
<article-title>A rapid and inexpensive method for isolation of total DNA from dehydrated plant tissue</article-title>
<source>Plant Molecular Biology Reporter</source>
<year>1990</year>
<volume>8</volume>
<fpage>297</fpage>
<lpage>303</lpage>
<pub-id pub-id-type="doi">10.1007/BF02668766</pub-id>
</mixed-citation>
</ref>
<ref id="B36">
<mixed-citation publication-type="journal">
<name>
<surname>Li</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Handsaker</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Wysoker</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Fennell</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ruan</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Homer</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Marth</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Abecasis</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
<article-title>The Sequence Alignment/Map format and SAMtools</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<fpage>2078</fpage>
<lpage>2079</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp352</pub-id>
<pub-id pub-id-type="pmid">19505943</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A98 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000A98 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:3305632
   |texte=   Reference genome-independent assessment of mutation density using restriction enzyme-phased sequencing
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:22333298" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021