Tools and techniques for computational reproducibility

Internal identifier: 000183 (Pmc/Corpus); previous: 000182; next: 000184

Authors: Stephen R. Piccolo; Michael B. Frampton

Source:

GigaScience [2047-217X]; 2016.

RBID: PMC:4940747

Abstract

When reporting research findings, scientists document the steps they followed so that others can verify and build upon the research. When those steps have been described in sufficient detail that others can retrace the steps and obtain similar results, the research is said to be reproducible. Computers play a vital role in many research disciplines and present both opportunities and challenges for reproducibility. Computers can be programmed to execute analysis tasks, and those programs can be repeated and shared with others. The deterministic nature of most computer programs means that the same analysis tasks, applied to the same data, will often produce the same outputs. However, in practice, computational findings often cannot be reproduced because of complexities in how software is packaged, installed, and executed—and because of limitations associated with how scientists document analysis steps. Many tools and techniques are available to help overcome these challenges; here we describe seven such strategies. With a broad scientific audience in mind, we describe the strengths and limitations of each approach, as well as the circumstances under which each might be applied. No single strategy is sufficient for every scenario; thus we emphasize that it is often useful to combine approaches.

Electronic supplementary material

The online version of this article (doi:10.1186/s13742-016-0135-4) contains supplementary material, which is available to authorized users.

URL:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4940747

DOI: 10.1186/s13742-016-0135-4
PubMed: 27401684
PubMed Central: 4940747

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Tools and techniques for computational reproducibility</title>
<author><name sortKey="Piccolo, Stephen R" sort="Piccolo, Stephen R" uniqKey="Piccolo S" first="Stephen R." last="Piccolo">Stephen R. Piccolo</name>
<affiliation><nlm:aff id="Aff1">Department of Biology, Brigham Young University, Provo, UT 84602 USA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Frampton, Michael B" sort="Frampton, Michael B" uniqKey="Frampton M" first="Michael B." last="Frampton">Michael B. Frampton</name>
<affiliation><nlm:aff id="Aff2">Department of Computer Science, Brigham Young University, Provo, UT USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">27401684</idno>
<idno type="pmc">4940747</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4940747</idno>
<idno type="RBID">PMC:4940747</idno>
<idno type="doi">10.1186/s13742-016-0135-4</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000183</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Tools and techniques for computational reproducibility</title>
<author><name sortKey="Piccolo, Stephen R" sort="Piccolo, Stephen R" uniqKey="Piccolo S" first="Stephen R." last="Piccolo">Stephen R. Piccolo</name>
<affiliation><nlm:aff id="Aff1">Department of Biology, Brigham Young University, Provo, UT 84602 USA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Frampton, Michael B" sort="Frampton, Michael B" uniqKey="Frampton M" first="Michael B." last="Frampton">Michael B. Frampton</name>
<affiliation><nlm:aff id="Aff2">Department of Computer Science, Brigham Young University, Provo, UT USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series><title level="j">GigaScience</title>
<idno type="eISSN">2047-217X</idno>
<imprint><date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p>When reporting research findings, scientists document the steps they followed so that others can verify and build upon the research. When those steps have been described in sufficient detail that others can retrace the steps and obtain similar results, the research is said to be reproducible. Computers play a vital role in many research disciplines and present both opportunities and challenges for reproducibility. Computers can be programmed to execute analysis tasks, and those programs can be repeated and shared with others. The deterministic nature of most computer programs means that the same analysis tasks, applied to the same data, will often produce the same outputs. However, in practice, computational findings often cannot be reproduced because of complexities in how software is packaged, installed, and executed—and because of limitations associated with how scientists document analysis steps. Many tools and techniques are available to help overcome these challenges; here we describe seven such strategies. With a broad scientific audience in mind, we describe the strengths and limitations of each approach, as well as the circumstances under which each might be applied. No single strategy is sufficient for every scenario; thus we emphasize that it is often useful to combine approaches.</p>
<sec><title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s13742-016-0135-4) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Fisher, Ra" uniqKey="Fisher R">RA Fisher</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Popper, Kr" uniqKey="Popper K">KR Popper</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Peng, Rd" uniqKey="Peng R">RD Peng</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Russell, Jf" uniqKey="Russell J">JF Russell</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Wilson, G" uniqKey="Wilson G">G Wilson</name>
</author>
<author><name sortKey="Aruliah, Da" uniqKey="Aruliah D">DA Aruliah</name>
</author>
<author><name sortKey="Brown, Ct" uniqKey="Brown C">CT Brown</name>
</author>
<author><name sortKey="Chue Hong, Np" uniqKey="Chue Hong N">NP Chue Hong</name>
</author>
<author><name sortKey="Davis, M" uniqKey="Davis M">M Davis</name>
</author>
<author><name sortKey="Guy, Rt" uniqKey="Guy R">RT Guy</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Sacks, J" uniqKey="Sacks J">J Sacks</name>
</author>
<author><name sortKey="Welch, Wj" uniqKey="Welch W">WJ Welch</name>
</author>
<author><name sortKey="Mitchell, Tj" uniqKey="Mitchell T">TJ Mitchell</name>
</author>
<author><name sortKey="Wynn, Hp" uniqKey="Wynn H">HP Wynn</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Garijo, D" uniqKey="Garijo D">D Garijo</name>
</author>
<author><name sortKey="Kinnings, S" uniqKey="Kinnings S">S Kinnings</name>
</author>
<author><name sortKey="Xie, L" uniqKey="Xie L">L Xie</name>
</author>
<author><name sortKey="Xie, L" uniqKey="Xie L">L Xie</name>
</author>
<author><name sortKey="Zhang, Y" uniqKey="Zhang Y">Y Zhang</name>
</author>
<author><name sortKey="Bourne, Pe" uniqKey="Bourne P">PE Bourne</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Vandewalle, P" uniqKey="Vandewalle P">P Vandewalle</name>
</author>
<author><name sortKey="Barrenetxea, G" uniqKey="Barrenetxea G">G Barrenetxea</name>
</author>
<author><name sortKey="Jovanovic, I" uniqKey="Jovanovic I">I Jovanovic</name>
</author>
<author><name sortKey="Ridolfi, A" uniqKey="Ridolfi A">A Ridolfi</name>
</author>
<author><name sortKey="Vetterli, M" uniqKey="Vetterli M">M Vetterli</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Cassey, P" uniqKey="Cassey P">P Cassey</name>
</author>
<author><name sortKey="Cassey, P" uniqKey="Cassey P">P Cassey</name>
</author>
<author><name sortKey="Blackburn, T" uniqKey="Blackburn T">T Blackburn</name>
</author>
<author><name sortKey="Blackburn, T" uniqKey="Blackburn T">T Blackburn</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Murphy, Jm" uniqKey="Murphy J">JM Murphy</name>
</author>
<author><name sortKey="Sexton, Dmh" uniqKey="Sexton D">DMH Sexton</name>
</author>
<author><name sortKey="Barnett, Dn" uniqKey="Barnett D">DN Barnett</name>
</author>
<author><name sortKey="Jones, Gs" uniqKey="Jones G">GS Jones</name>
</author>
<author><name sortKey="Webb, Mj" uniqKey="Webb M">MJ Webb</name>
</author>
<author><name sortKey="Collins, M" uniqKey="Collins M">M Collins</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mccarthy, Dj" uniqKey="Mccarthy D">DJ McCarthy</name>
</author>
<author><name sortKey="Humburg, P" uniqKey="Humburg P">P Humburg</name>
</author>
<author><name sortKey="Kanapin, A" uniqKey="Kanapin A">A Kanapin</name>
</author>
<author><name sortKey="Rivas, Ma" uniqKey="Rivas M">MA Rivas</name>
</author>
<author><name sortKey="Gaulton, K" uniqKey="Gaulton K">K Gaulton</name>
</author>
<author><name sortKey="Cazier, J B" uniqKey="Cazier J">J-B Cazier</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Neuman, Ja" uniqKey="Neuman J">JA Neuman</name>
</author>
<author><name sortKey="Isakov, O" uniqKey="Isakov O">O Isakov</name>
</author>
<author><name sortKey="Shomron, N" uniqKey="Shomron N">N Shomron</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bradnam, Kr" uniqKey="Bradnam K">KR Bradnam</name>
</author>
<author><name sortKey="Fass, Jn" uniqKey="Fass J">JN Fass</name>
</author>
<author><name sortKey="Alexandrov, A" uniqKey="Alexandrov A">A Alexandrov</name>
</author>
<author><name sortKey="Baranay, P" uniqKey="Baranay P">P Baranay</name>
</author>
<author><name sortKey="Bechner, M" uniqKey="Bechner M">M Bechner</name>
</author>
<author><name sortKey="Birol, I" uniqKey="Birol I">I Birol</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bilal, E" uniqKey="Bilal E">E Bilal</name>
</author>
<author><name sortKey="Dutkowski, J" uniqKey="Dutkowski J">J Dutkowski</name>
</author>
<author><name sortKey="Guinney, J" uniqKey="Guinney J">J Guinney</name>
</author>
<author><name sortKey="Jang, Is" uniqKey="Jang I">IS Jang</name>
</author>
<author><name sortKey="Logsdon, Ba" uniqKey="Logsdon B">BA Logsdon</name>
</author>
<author><name sortKey="Pandey, G" uniqKey="Pandey G">G Pandey</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gronenschild, Ehbm" uniqKey="Gronenschild E">EHBM Gronenschild</name>
</author>
<author><name sortKey="Habets, P" uniqKey="Habets P">P Habets</name>
</author>
<author><name sortKey="Jacobs, Hil" uniqKey="Jacobs H">HIL Jacobs</name>
</author>
<author><name sortKey="Mengelers, R" uniqKey="Mengelers R">R Mengelers</name>
</author>
<author><name sortKey="Rozendaal, N" uniqKey="Rozendaal N">N Rozendaal</name>
</author>
<author><name sortKey="Van Os, J" uniqKey="Van Os J">J van Os</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Moskvin, Ov" uniqKey="Moskvin O">OV Moskvin</name>
</author>
<author><name sortKey="Mcilwain, S" uniqKey="Mcilwain S">S McIlwain</name>
</author>
<author><name sortKey="Ong, Im" uniqKey="Ong I">IM Ong</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Michael, Cm" uniqKey="Michael C">CM Michael</name>
</author>
<author><name sortKey="Nass, Sj" uniqKey="Nass S">SJ Nass</name>
</author>
<author><name sortKey="Omenn, Gs" uniqKey="Omenn G">GS Omenn</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Collins, Fs" uniqKey="Collins F">FS Collins</name>
</author>
<author><name sortKey="Tabak, L A" uniqKey="Tabak L">LA Tabak</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Leveque, Rj" uniqKey="Leveque R">RJ LeVeque</name>
</author>
<author><name sortKey="Mitchell, Im" uniqKey="Mitchell I">IM Mitchell</name>
</author>
<author><name sortKey="Stodden, V" uniqKey="Stodden V">V Stodden</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Stodden, V" uniqKey="Stodden V">V Stodden</name>
</author>
<author><name sortKey="Guo, P" uniqKey="Guo P">P Guo</name>
</author>
<author><name sortKey="Ma, Z" uniqKey="Ma Z">Z Ma</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Morin, A" uniqKey="Morin A">A Morin</name>
</author>
<author><name sortKey="Urban, J" uniqKey="Urban J">J Urban</name>
</author>
<author><name sortKey="Adams, Pd" uniqKey="Adams P">PD Adams</name>
</author>
<author><name sortKey="Foster, I" uniqKey="Foster I">I Foster</name>
</author>
<author><name sortKey="Sali, A" uniqKey="Sali A">A Sali</name>
</author>
<author><name sortKey="Baker, D" uniqKey="Baker D">D Baker</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Ioannidis, Jp A" uniqKey="Ioannidis J">JPA Ioannidis</name>
</author>
<author><name sortKey="Allison, Db" uniqKey="Allison D">DB Allison</name>
</author>
<author><name sortKey="Ball, C A" uniqKey="Ball C">CA Ball</name>
</author>
<author><name sortKey="Coulibaly, I" uniqKey="Coulibaly I">I Coulibaly</name>
</author>
<author><name sortKey="Cui, X" uniqKey="Cui X">X Cui</name>
</author>
<author><name sortKey="Culhane, Ac" uniqKey="Culhane A">AC Culhane</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Nekrutenko, A" uniqKey="Nekrutenko A">A Nekrutenko</name>
</author>
<author><name sortKey="Taylor, J" uniqKey="Taylor J">J Taylor</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Baggerly, K A" uniqKey="Baggerly K">KA Baggerly</name>
</author>
<author><name sortKey="Coombes, Kr" uniqKey="Coombes K">KR Coombes</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Decullier, E" uniqKey="Decullier E">E Decullier</name>
</author>
<author><name sortKey="Huot, L" uniqKey="Huot L">L Huot</name>
</author>
<author><name sortKey="Samson, G" uniqKey="Samson G">G Samson</name>
</author>
<author><name sortKey="Maisonneuve, H" uniqKey="Maisonneuve H">H Maisonneuve</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Stodden, V" uniqKey="Stodden V">V Stodden</name>
</author>
<author><name sortKey="Miguez, S" uniqKey="Miguez S">S Miguez</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ravel, J" uniqKey="Ravel J">J Ravel</name>
</author>
<author><name sortKey="Wommack, Ke" uniqKey="Wommack K">KE Wommack</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Birney, E" uniqKey="Birney E">E Birney</name>
</author>
<author><name sortKey="Hudson, Tj" uniqKey="Hudson T">TJ Hudson</name>
</author>
<author><name sortKey="Green, Ed" uniqKey="Green E">ED Green</name>
</author>
<author><name sortKey="Gunter, C" uniqKey="Gunter C">C Gunter</name>
</author>
<author><name sortKey="Eddy, S" uniqKey="Eddy S">S Eddy</name>
</author>
<author><name sortKey="Rogers, J" uniqKey="Rogers J">J Rogers</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hothorn, T" uniqKey="Hothorn T">T Hothorn</name>
</author>
<author><name sortKey="Leisch, F" uniqKey="Leisch F">F Leisch</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Schofield, Pn" uniqKey="Schofield P">PN Schofield</name>
</author>
<author><name sortKey="Bubela, T" uniqKey="Bubela T">T Bubela</name>
</author>
<author><name sortKey="Weaver, T" uniqKey="Weaver T">T Weaver</name>
</author>
<author><name sortKey="Portilla, L" uniqKey="Portilla L">L Portilla</name>
</author>
<author><name sortKey="Brown, Sd" uniqKey="Brown S">SD Brown</name>
</author>
<author><name sortKey="Hancock, Jm" uniqKey="Hancock J">JM Hancock</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Johnson, Ve" uniqKey="Johnson V">VE Johnson</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Halsey, Lg" uniqKey="Halsey L">LG Halsey</name>
</author>
<author><name sortKey="Curran Everett, D" uniqKey="Curran Everett D">D Curran-Everett</name>
</author>
<author><name sortKey="Vowler, Sl" uniqKey="Vowler S">SL Vowler</name>
</author>
<author><name sortKey="Drummond, Gb" uniqKey="Drummond G">GB Drummond</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Wilson, G" uniqKey="Wilson G">G Wilson</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sandve, Gk" uniqKey="Sandve G">GK Sandve</name>
</author>
<author><name sortKey="Nekrutenko, A" uniqKey="Nekrutenko A">A Nekrutenko</name>
</author>
<author><name sortKey="Taylor, J" uniqKey="Taylor J">J Taylor</name>
</author>
<author><name sortKey="Hovig, E" uniqKey="Hovig E">E Hovig</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Blischak, Jd" uniqKey="Blischak J">JD Blischak</name>
</author>
<author><name sortKey="Davenport, Er" uniqKey="Davenport E">ER Davenport</name>
</author>
<author><name sortKey="Wilson, G" uniqKey="Wilson G">G Wilson</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Huber, W" uniqKey="Huber W">W Huber</name>
</author>
<author><name sortKey="Carey, Vj" uniqKey="Carey V">VJ Carey</name>
</author>
<author><name sortKey="Gentleman, R" uniqKey="Gentleman R">R Gentleman</name>
</author>
<author><name sortKey="Anders, S" uniqKey="Anders S">S Anders</name>
</author>
<author><name sortKey="Carlson, M" uniqKey="Carlson M">M Carlson</name>
</author>
<author><name sortKey="Carvalho, Bs" uniqKey="Carvalho B">BS Carvalho</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="T Th, G" uniqKey="T Th G">G Tóth</name>
</author>
<author><name sortKey="Sokolov, Iv" uniqKey="Sokolov I">IV Sokolov</name>
</author>
<author><name sortKey="Gombosi, Ti" uniqKey="Gombosi T">TI Gombosi</name>
</author>
<author><name sortKey="Chesney, Dr" uniqKey="Chesney D">DR Chesney</name>
</author>
<author><name sortKey="Clauer, Cr" uniqKey="Clauer C">CR Clauer</name>
</author>
<author><name sortKey="De Zeeuw, Dl" uniqKey="De Zeeuw D">DL De Zeeuw</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Schneider, Ca" uniqKey="Schneider C">CA Schneider</name>
</author>
<author><name sortKey="Rasband, Ws" uniqKey="Rasband W">WS Rasband</name>
</author>
<author><name sortKey="Eliceiri, Kw" uniqKey="Eliceiri K">KW Eliceiri</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Schindelin, J" uniqKey="Schindelin J">J Schindelin</name>
</author>
<author><name sortKey="Arganda Carreras, I" uniqKey="Arganda Carreras I">I Arganda-Carreras</name>
</author>
<author><name sortKey="Frise, E" uniqKey="Frise E">E Frise</name>
</author>
<author><name sortKey="Kaynig, V" uniqKey="Kaynig V">V Kaynig</name>
</author>
<author><name sortKey="Longair, M" uniqKey="Longair M">M Longair</name>
</author>
<author><name sortKey="Pietzsch, T" uniqKey="Pietzsch T">T Pietzsch</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Biasini, M" uniqKey="Biasini M">M Biasini</name>
</author>
<author><name sortKey="Schmidt, T" uniqKey="Schmidt T">T Schmidt</name>
</author>
<author><name sortKey="Bienert, S" uniqKey="Bienert S">S Bienert</name>
</author>
<author><name sortKey="Mariani, V" uniqKey="Mariani V">V Mariani</name>
</author>
<author><name sortKey="Studer, G" uniqKey="Studer G">G Studer</name>
</author>
<author><name sortKey="Haas, J" uniqKey="Haas J">J Haas</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Martin, Rc" uniqKey="Martin R">RC Martin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Knuth, De" uniqKey="Knuth D">DE Knuth</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Perez, F" uniqKey="Perez F">F Pérez</name>
</author>
<author><name sortKey="Granger, Be" uniqKey="Granger B">BE Granger</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Shen, H" uniqKey="Shen H">H Shen</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Gross, Am" uniqKey="Gross A">AM Gross</name>
</author>
<author><name sortKey="Orosco, Rk" uniqKey="Orosco R">RK Orosco</name>
</author>
<author><name sortKey="Shen, Jp" uniqKey="Shen J">JP Shen</name>
</author>
<author><name sortKey="Egloff, Am" uniqKey="Egloff A">AM Egloff</name>
</author>
<author><name sortKey="Carter, H" uniqKey="Carter H">H Carter</name>
</author>
<author><name sortKey="Hofree, M" uniqKey="Hofree M">M Hofree</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ding, T" uniqKey="Ding T">T Ding</name>
</author>
<author><name sortKey="Schloss, Pd" uniqKey="Schloss P">PD Schloss</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ram, Y" uniqKey="Ram Y">Y Ram</name>
</author>
<author><name sortKey="Hadany, L" uniqKey="Hadany L">L Hadany</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Meadow, Jf" uniqKey="Meadow J">JF Meadow</name>
</author>
<author><name sortKey="Altrichter, Ae" uniqKey="Altrichter A">AE Altrichter</name>
</author>
<author><name sortKey="Kembel, Sw" uniqKey="Kembel S">SW Kembel</name>
</author>
<author><name sortKey="Moriyama, M" uniqKey="Moriyama M">M Moriyama</name>
</author>
<author><name sortKey="O Onnor, Tk" uniqKey="O Onnor T">TK O’Connor</name>
</author>
<author><name sortKey="Womack, Am" uniqKey="Womack A">AM Womack</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Gil, Y" uniqKey="Gil Y">Y Gil</name>
</author>
<author><name sortKey="Deelman, E" uniqKey="Deelman E">E Deelman</name>
</author>
<author><name sortKey="Ellisman, M" uniqKey="Ellisman M">M Ellisman</name>
</author>
<author><name sortKey="Fahringer, T" uniqKey="Fahringer T">T Fahringer</name>
</author>
<author><name sortKey="Fox, G" uniqKey="Fox G">G Fox</name>
</author>
<author><name sortKey="Gannon, D" uniqKey="Gannon D">D Gannon</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Giardine, B" uniqKey="Giardine B">B Giardine</name>
</author>
<author><name sortKey="Riemer, C" uniqKey="Riemer C">C Riemer</name>
</author>
<author><name sortKey="Hardison, Rc" uniqKey="Hardison R">RC Hardison</name>
</author>
<author><name sortKey="Burhans, R" uniqKey="Burhans R">R Burhans</name>
</author>
<author><name sortKey="Elnitski, L" uniqKey="Elnitski L">L Elnitski</name>
</author>
<author><name sortKey="Shah, P" uniqKey="Shah P">P Shah</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Goecks, J" uniqKey="Goecks J">J Goecks</name>
</author>
<author><name sortKey="Nekrutenko, A" uniqKey="Nekrutenko A">A Nekrutenko</name>
</author>
<author><name sortKey="Taylor, J" uniqKey="Taylor J">J Taylor</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Afgan, E" uniqKey="Afgan E">E Afgan</name>
</author>
<author><name sortKey="Baker, D" uniqKey="Baker D">D Baker</name>
</author>
<author><name sortKey="Coraor, N" uniqKey="Coraor N">N Coraor</name>
</author>
<author><name sortKey="Goto, H" uniqKey="Goto H">H Goto</name>
</author>
<author><name sortKey="Paul, Im" uniqKey="Paul I">IM Paul</name>
</author>
<author><name sortKey="Makova, Kd" uniqKey="Makova K">KD Makova</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Lazarus, R" uniqKey="Lazarus R">R Lazarus</name>
</author>
<author><name sortKey="Kaspi, A" uniqKey="Kaspi A">A Kaspi</name>
</author>
<author><name sortKey="Ziemann, M" uniqKey="Ziemann M">M Ziemann</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dudley, Jt" uniqKey="Dudley J">JT Dudley</name>
</author>
<author><name sortKey="Butte, Aj" uniqKey="Butte A">AJ Butte</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hurley, Dg" uniqKey="Hurley D">DG Hurley</name>
</author>
<author><name sortKey="Budden, Dm" uniqKey="Budden D">DM Budden</name>
</author>
<author><name sortKey="Crampin, Ej" uniqKey="Crampin E">EJ Crampin</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Howe, B" uniqKey="Howe B">B Howe</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Krampis, K" uniqKey="Krampis K">K Krampis</name>
</author>
<author><name sortKey="Booth, T" uniqKey="Booth T">T Booth</name>
</author>
<author><name sortKey="Chapman, B" uniqKey="Chapman B">B Chapman</name>
</author>
<author><name sortKey="Tiwari, B" uniqKey="Tiwari B">B Tiwari</name>
</author>
<author><name sortKey="Bicak, M" uniqKey="Bicak M">M Bicak</name>
</author>
<author><name sortKey="Field, D" uniqKey="Field D">D Field</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Eglen, Sj" uniqKey="Eglen S">SJ Eglen</name>
</author>
<author><name sortKey="Weeks, M" uniqKey="Weeks M">M Weeks</name>
</author>
<author><name sortKey="Jessop, M" uniqKey="Jessop M">M Jessop</name>
</author>
<author><name sortKey="Simonotto, J" uniqKey="Simonotto J">J Simonotto</name>
</author>
<author><name sortKey="Jackson, T" uniqKey="Jackson T">T Jackson</name>
</author>
<author><name sortKey="Sernagor, E" uniqKey="Sernagor E">E Sernagor</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Eglen, Sj" uniqKey="Eglen S">SJ Eglen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bremges, A" uniqKey="Bremges A">A Bremges</name>
</author>
<author><name sortKey="Maus, I" uniqKey="Maus I">I Maus</name>
</author>
<author><name sortKey="Belmann, P" uniqKey="Belmann P">P Belmann</name>
</author>
<author><name sortKey="Eikmeyer, F" uniqKey="Eikmeyer F">F Eikmeyer</name>
</author>
<author><name sortKey="Winkler, A" uniqKey="Winkler A">A Winkler</name>
</author>
<author><name sortKey="Albersmeier, A" uniqKey="Albersmeier A">A Albersmeier</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Belmann, P" uniqKey="Belmann P">P Belmann</name>
</author>
<author><name sortKey="Droge, J" uniqKey="Droge J">J Dröge</name>
</author>
<author><name sortKey="Bremges, A" uniqKey="Bremges A">A Bremges</name>
</author>
<author><name sortKey="Mchardy, Ac" uniqKey="Mchardy A">AC McHardy</name>
</author>
<author><name sortKey="Sczyrba, A" uniqKey="Sczyrba A">A Sczyrba</name>
</author>
<author><name sortKey="Barton, Md" uniqKey="Barton M">MD Barton</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Hones, Mj" uniqKey="Hones M">MJ Hones</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Donoho, Dl" uniqKey="Donoho D">DL Donoho</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Goldberg, D" uniqKey="Goldberg D">D Goldberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Shirts, M" uniqKey="Shirts M">M Shirts</name>
</author>
<author><name sortKey="Pande, Vs" uniqKey="Pande V">VS Pande</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bird, I" uniqKey="Bird I">I Bird</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Ransohoff, Df" uniqKey="Ransohoff D">DF Ransohoff</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bild, Ah" uniqKey="Bild A">AH Bild</name>
</author>
<author><name sortKey="Chang, Jt" uniqKey="Chang J">JT Chang</name>
</author>
<author><name sortKey="Johnson, We" uniqKey="Johnson W">WE Johnson</name>
</author>
<author><name sortKey="Piccolo, Sr" uniqKey="Piccolo S">SR Piccolo</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Koster, J" uniqKey="Koster J">J Köster</name>
</author>
<author><name sortKey="Rahmann, S" uniqKey="Rahmann S">S Rahmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sadedin, Sp" uniqKey="Sadedin S">SP Sadedin</name>
</author>
<author><name sortKey="Pope, B" uniqKey="Pope B">B Pope</name>
</author>
<author><name sortKey="Oshlack, A" uniqKey="Oshlack A">A Oshlack</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Goff, Sa" uniqKey="Goff S">SA Goff</name>
</author>
<author><name sortKey="Vaughn, M" uniqKey="Vaughn M">M Vaughn</name>
</author>
<author><name sortKey="Mckay, S" uniqKey="Mckay S">S McKay</name>
</author>
<author><name sortKey="Lyons, E" uniqKey="Lyons E">E Lyons</name>
</author>
<author><name sortKey="Stapleton, Ae" uniqKey="Stapleton A">AE Stapleton</name>
</author>
<author><name sortKey="Gessler, D" uniqKey="Gessler D">D Gessler</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Reich, M" uniqKey="Reich M">M Reich</name>
</author>
<author><name sortKey="Liefeld, T" uniqKey="Liefeld T">T Liefeld</name>
</author>
<author><name sortKey="Gould, J" uniqKey="Gould J">J Gould</name>
</author>
<author><name sortKey="Lerner, J" uniqKey="Lerner J">J Lerner</name>
</author>
<author><name sortKey="Tamayo, P" uniqKey="Tamayo P">P Tamayo</name>
</author>
<author><name sortKey="Mesirov, Jp" uniqKey="Mesirov J">JP Mesirov</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Reich, M" uniqKey="Reich M">M Reich</name>
</author>
<author><name sortKey="Liefeld, J" uniqKey="Liefeld J">J Liefeld</name>
</author>
<author><name sortKey="Thorvaldsdottir, H" uniqKey="Thorvaldsdottir H">H Thorvaldsdottir</name>
</author>
<author><name sortKey="Ocana, M" uniqKey="Ocana M">M Ocana</name>
</author>
<author><name sortKey="Polk, E" uniqKey="Polk E">E Polk</name>
</author>
<author><name sortKey="Jang, D" uniqKey="Jang D">D Jang</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Wolstencroft, K" uniqKey="Wolstencroft K">K Wolstencroft</name>
</author>
<author><name sortKey="Haines, R" uniqKey="Haines R">R Haines</name>
</author>
<author><name sortKey="Fellows, D" uniqKey="Fellows D">D Fellows</name>
</author>
<author><name sortKey="Williams, A" uniqKey="Williams A">A Williams</name>
</author>
<author><name sortKey="Withers, D" uniqKey="Withers D">D Withers</name>
</author>
<author><name sortKey="Owen, S" uniqKey="Owen S">S Owen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rex, De" uniqKey="Rex D">DE Rex</name>
</author>
<author><name sortKey="Ma, Jq" uniqKey="Ma J">JQ Ma</name>
</author>
<author><name sortKey="Toga, Aw" uniqKey="Toga A">AW Toga</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
  <front><journal-meta><journal-id journal-id-type="nlm-ta">Gigascience</journal-id>
<journal-id journal-id-type="iso-abbrev">Gigascience</journal-id>
<journal-title-group><journal-title>GigaScience</journal-title>
</journal-title-group>
<issn pub-type="epub">2047-217X</issn>
<publisher><publisher-name>BioMed Central</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">27401684</article-id>
<article-id pub-id-type="pmc">4940747</article-id>
<article-id pub-id-type="publisher-id">135</article-id>
<article-id pub-id-type="doi">10.1186/s13742-016-0135-4</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Review</subject>
</subj-group>
</article-categories>
<title-group><article-title>Tools and techniques for computational reproducibility</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" corresp="yes"><name><surname>Piccolo</surname>
<given-names>Stephen R.</given-names>
</name>
<address><phone>+1 801-422-7116</phone>
<email>stephen_piccolo@byu.edu</email>
</address>
<xref ref-type="aff" rid="Aff1">1</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Frampton</surname>
<given-names>Michael B.</given-names>
</name>
<xref ref-type="aff" rid="Aff2">2</xref>
</contrib>
<aff id="Aff1"><label>1</label>
Department of Biology, Brigham Young University, Provo, UT 84602 USA</aff>
<aff id="Aff2"><label>2</label>
Department of Computer Science, Brigham Young University, Provo, UT USA</aff>
</contrib-group>
<pub-date pub-type="epub"><day>11</day>
<month>7</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release"><day>11</day>
<month>7</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="collection"><year>2016</year>
</pub-date>
<volume>5</volume>
<elocation-id>30</elocation-id>
<permissions><copyright-statement>© The Author(s). 2016</copyright-statement>
<license license-type="OpenAccess"><license-p><bold>Open Access</bold>
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<abstract id="Abs1"><p>When reporting research findings, scientists document the steps they followed so that others can verify and build upon the research. When those steps have been described in sufficient detail that others can retrace the steps and obtain similar results, the research is said to be reproducible. Computers play a vital role in many research disciplines and present both opportunities and challenges for reproducibility. Computers can be programmed to execute analysis tasks, and those programs can be repeated and shared with others. The deterministic nature of most computer programs means that the same analysis tasks, applied to the same data, will often produce the same outputs. However, in practice, computational findings often cannot be reproduced because of complexities in how software is packaged, installed, and executed—and because of limitations associated with how scientists document analysis steps. Many tools and techniques are available to help overcome these challenges; here we describe seven such strategies. With a broad scientific audience in mind, we describe the strengths and limitations of each approach, as well as the circumstances under which each might be applied. No single strategy is sufficient for every scenario; thus we emphasize that it is often useful to combine approaches.</p>
<sec><title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s13742-016-0135-4) contains supplementary material, which is available to authorized users.</p>
</sec>
</abstract>
<kwd-group xml:lang="en"><title>Keywords</title>
<kwd>Computational reproducibility</kwd>
<kwd>Practice of science</kwd>
<kwd>Literate programming</kwd>
<kwd>Virtualization</kwd>
<kwd>Software containers</kwd>
<kwd>Software frameworks</kwd>
</kwd-group>
<custom-meta-group><custom-meta><meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2016</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body><sec id="Sec1"><title>Background</title>
<p>When reporting research, scientists document the steps they followed to obtain their results. If the description is comprehensive enough that they and others can repeat the procedures and obtain semantically consistent results, the findings are considered to be “reproducible” [<xref ref-type="bibr" rid="CR1">1</xref>
–<xref ref-type="bibr" rid="CR6">6</xref>
]. Reproducible research forms the basic building blocks of science, insofar as it allows researchers to verify and build on each other’s work with confidence.</p>
<p>Computers play an increasingly important role in many scientific disciplines [<xref ref-type="bibr" rid="CR7">7</xref>
–<xref ref-type="bibr" rid="CR10">10</xref>
]. For example, in the United Kingdom, 92 % of academic scientists use some type of software in their research, and 69 % of scientists say their research is feasible only with software tools [<xref ref-type="bibr" rid="CR11">11</xref>
]. Thus efforts to increase scientific reproducibility should consider the ubiquity of computers in research.</p>
<p>Computers present both opportunities and challenges for scientific reproducibility. On one hand, the deterministic nature of most computer programs means that identical results can be obtained from many computational analyses applied to the same input data [<xref ref-type="bibr" rid="CR12">12</xref>
]. Accordingly, computational research can be held to a high reproducibility standard. On the other hand, even when no technical barrier prevents reproducibility, scientists often cannot reproduce computational findings because of complexities in how software is packaged, installed, and executed—and because of limitations associated with how scientists document these steps [<xref ref-type="bibr" rid="CR13">13</xref>
]. This problem is acute in many disciplines, including genomics, signal processing, and ecological modeling [<xref ref-type="bibr" rid="CR14">14</xref>
–<xref ref-type="bibr" rid="CR16">16</xref>
], which have large data sets and rapidly evolving computational tools. However, the same problem can affect any scientific discipline requiring computers for research. Seemingly minor differences in computational approaches can have major influences on analytical outputs [<xref ref-type="bibr" rid="CR12">12</xref>
, <xref ref-type="bibr" rid="CR17">17</xref>
–<xref ref-type="bibr" rid="CR22">22</xref>
], and the effects of these differences may exceed those resulting from experimental factors [<xref ref-type="bibr" rid="CR23">23</xref>
].</p>
<p>Journal editors, funding agencies, governmental institutions, and individual scientists have increasingly made calls for the scientific community to embrace practices to support computational reproducibility [<xref ref-type="bibr" rid="CR24">24</xref>
–<xref ref-type="bibr" rid="CR31">31</xref>
]. This movement has been motivated, in part, by scientists’ failed efforts to reproduce previously published analyses. For example, Ioannidis et al. evaluated 18 published research studies that used computational methods to evaluate gene expression data, but they were able to reproduce only two of those studies [<xref ref-type="bibr" rid="CR32">32</xref>
]. In many cases, the culprit was a failure to share the study’s data; however, incomplete descriptions of software-based analyses were also common. Nekrutenko and Taylor examined 50 papers that analyzed next-generation sequencing data and observed that fewer than half provided any details about software versions or parameters [<xref ref-type="bibr" rid="CR33">33</xref>
]. Recreating analyses that lack such details can require hundreds of hours of effort [<xref ref-type="bibr" rid="CR34">34</xref>
] and may be impossible, even after consulting the original authors. Failure to reproduce research may also damage scientists’ careers, for example through retractions [<xref ref-type="bibr" rid="CR35">35</xref>
].</p>
<p>Noting such concerns, some journals have emphasized the value of placing computer code in open-access repositories. It is most useful when scientists provide direct access to an archived version of the code via a uniform resource locator (URL). For example, Zenodo.org and figshare.com provide permanent digital object identifiers (DOIs) that can link to software code (and other digital objects) used in publications. In addition, some journals have extended requirements for “Methods” sections, now asking researchers to provide detailed descriptions of 1) how to install software and its dependencies, and 2) what parameters and data preprocessing steps are used in analyses [<xref ref-type="bibr" rid="CR10">10</xref>
, <xref ref-type="bibr" rid="CR24">24</xref>
]. A 2012 Institute of Medicine report emphasized that, in addition to computer code and research data, “fully specified computational procedures” should be made available to the scientific community [<xref ref-type="bibr" rid="CR25">25</xref>
]. The report’s authors elaborated that such procedures should include “all of the steps of computational analysis”, and that “all aspects of the analysis need to be transparently reported” [<xref ref-type="bibr" rid="CR25">25</xref>
]. Such policies represent important progress. However, it is ultimately the responsibility of individual scientists to ensure that others can verify and build upon their analyses.</p>
<p>Describing a computational analysis sufficiently—such that others can re-execute, validate, and refine it—requires more than simply stating what software was used, what commands were executed, and where to find the source code [<xref ref-type="bibr" rid="CR13">13</xref>
, <xref ref-type="bibr" rid="CR27">27</xref>
, <xref ref-type="bibr" rid="CR36">36</xref>
–<xref ref-type="bibr" rid="CR38">38</xref>
]. Software is executed within the context of an operating system (for example, Windows, Mac OS, or Linux), which enables the software to interface with computer hardware. In addition, most software relies on a hierarchy of software dependencies, which perform complementary functions and must be installed alongside the main software tool. One version of a given software tool or dependency may behave differently or have a different interface than another version of the same software. In addition, most analytical software offers a range of parameters (or settings) that the user can specify. If any of these variables differs from those used by the original experimenter, the software may not execute properly or analytical outputs may differ considerably from those observed by the original experimenter.</p>
<p>Scientists can use various tools and techniques to overcome these challenges and to increase the likelihood that their computational analyses will be reproducible. These techniques range in complexity from simple (e.g., providing written documentation) to advanced (e.g., providing a virtual environment that includes an operating system and all the software necessary to execute the analysis). This review describes seven strategies across this spectrum. We describe many of the strengths and limitations of each approach, as well as the circumstances under which each might be applied. No single strategy will be sufficient for every scenario; therefore, in many cases, it will be most practical to combine multiple approaches. This review focuses primarily on the computational aspects of reproducibility. The related topics of empirical reproducibility, statistical reproducibility, data sharing, and education about reproducibility have been described elsewhere [<xref ref-type="bibr" rid="CR39">39</xref>
–<xref ref-type="bibr" rid="CR46">46</xref>
]. We believe that with greater awareness and understanding of computational reproducibility techniques, scientists—including those with limited computational experience—will be more apt to perform computational research in a reproducible manner.</p>
</sec>
<sec id="Sec2"><title>Narrative descriptions are a simple but valuable way to support computational reproducibility</title>
<p>The most fundamental strategy for enabling others to reproduce a computational analysis is to provide a detailed, written description of the process. For example, when reporting computational results in a research article, authors customarily provide a narrative that describes the software they used and the analytical steps they followed. Such narratives can be invaluable in enabling others to evaluate the scientific approach and to reproduce the findings. In many situations—for example, when software execution requires user interaction or when proprietary software is used—narratives are the only feasible option for documenting such steps. However, even when a computational analysis uses open-source software and can be fully automated, narratives help others understand how to re-execute an analysis.</p>
<p>Although most articles about research that uses computational methods provide some type of narrative, these descriptions often lack sufficient detail to enable others to retrace those steps [<xref ref-type="bibr" rid="CR32">32</xref>
, <xref ref-type="bibr" rid="CR33">33</xref>
]. Narrative descriptions should indicate the operating system(s), software dependencies, and analytical software that were used, and how to obtain them. In addition, narratives should indicate the exact software versions used, the order in which they were executed, and all non-default parameters that were specified. Such descriptions should account for the fact that computer configurations can differ vastly, even for computers with the same operating system. Because it can be difficult for scientists to remember such details after the fact, it is best to record this information throughout the research process, rather than at the time of manuscript preparation [<xref ref-type="bibr" rid="CR8">8</xref>
].</p>
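<p>As a minimal sketch of how such details might be recorded while the work is in progress, the following Python snippet collects the operating system, the interpreter version, and the versions of selected packages. The package names are placeholders chosen for illustration, and the function name is an assumption of this sketch rather than part of any tool described in this review.</p>
<preformat>
```python
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Return lines describing the OS, the Python version, and package versions."""
    lines = [
        f"Operating system: {platform.system()} {platform.release()}",
        f"Python: {sys.version.split()[0]}",
    ]
    for name in packages:
        try:
            lines.append(f"{name}: {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Record the absence explicitly rather than silently skipping it.
            lines.append(f"{name}: not installed")
    return lines


# Hypothetical package list; replace with the libraries the analysis imports.
report = describe_environment(["numpy", "pandas"])
print("\n".join(report))
```
</preformat>
<p>Saving this output alongside each set of results gives the eventual narrative a factual basis, rather than relying on memory at the time of manuscript preparation.</p>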
<p>The following sections describe techniques for automating computational analyses. These techniques can diminish the need for scientists to write narratives. However, because it is often impractical to automate all computational steps, we expect that, for the foreseeable future, narratives will play a vital role in enabling computational reproducibility.</p>
</sec>
<sec id="Sec3"><title>Custom scripts and code can automate research analysis</title>
<p>Scientific software can often be executed in an automated manner via text-based commands. Using such commands—via a command-line interface—scientists can indicate the software program(s) to be executed and which parameter(s) should be used. When multiple commands must be executed, they can be compiled into scripts specifying the order in which the commands should be executed (Fig. <xref rid="Fig1" ref-type="fig">1</xref>
; Additional file <xref rid="MOESM1" ref-type="media">1</xref>
). In many cases, scripts also include commands for installing and configuring software. Such scripts serve as valuable documentation not only for individuals who wish to re-execute the analysis, but also for the researcher who performed the original analysis [<xref ref-type="bibr" rid="CR47">47</xref>
]. In these cases, no amount of narrative is an adequate substitute for providing the actual commands that were used.<fig id="Fig1"><label>Fig. 1</label>
<caption><p>Example of a command line script. This script can be used to align DNA sequence data to a reference genome. First, it downloads the software and data files necessary for the analysis. Then, it extracts (“unzips”) these files, and aligns the data to a reference genome for Ebola virus. Finally, it converts, sorts, and indexes the aligned data. See Additional file <xref rid="MOESM1" ref-type="media">1</xref>
 for an executable version of this script</p>
</caption>
<graphic xlink:href="13742_2016_135_Fig1_HTML" id="MO1"></graphic>
</fig>
</p>
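<p>The idea of compiling commands into an ordered script can also be sketched in Python. The example below is an illustration only: the two steps are placeholders that print a message, whereas a real script would list the download, extraction, and alignment commands together with their exact parameters.</p>
<preformat>
```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order; stop at the first one that fails."""
    completed = []
    for argv in steps:
        # check=True raises CalledProcessError when a command exits non-zero,
        # so later steps never run on incomplete inputs.
        subprocess.run(argv, check=True)
        completed.append(argv)
    return completed


# Placeholder steps (assumptions of this sketch, not the commands in Fig. 1).
steps = [
    [sys.executable, "-c", "print('step 1: download inputs')"],
    [sys.executable, "-c", "print('step 2: align reads')"],
]
finished = run_steps(steps)
```
</preformat>
<p>Because the commands are stored as data, the same list documents the analysis and drives its execution.</p>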
<p>When writing command-line scripts, it is essential to explicitly document any software dependencies and input data that are required for each step in the analysis. The Make utility [<xref ref-type="bibr" rid="CR48">48</xref>
, <xref ref-type="bibr" rid="CR49">49</xref>
] provides one way to specify such requirements [<xref ref-type="bibr" rid="CR36">36</xref>
]. Before any command is executed, Make verifies that each documented dependency is available. Accordingly, researchers can use Make files (scripts) to specify a full hierarchy of operating system components and dependent software that must be present to perform the analysis (Fig. <xref rid="Fig2" ref-type="fig">2</xref>
; Additional file <xref rid="MOESM2" ref-type="media">2</xref>
). In addition, Make can automatically identify any commands that can be executed in parallel, potentially reducing the amount of time required for the analysis. Although Make was originally designed for UNIX-based operating systems (such as Mac OS or Linux), similar utilities have since been developed for Windows operating systems [<xref ref-type="bibr" rid="CR50">50</xref>
]. Table <xref rid="Tab1" ref-type="table">1</xref>
 lists various utilities that can be used to automate software execution.<fig id="Fig2"><label>Fig. 2</label>
<caption><p>Example of a Make file. This file performs the same function as the command line script shown in Fig. <xref rid="Fig1" ref-type="fig">1</xref>
, except that it is formatted for the Make utility. Accordingly, it is structured so that specific tasks must be executed before other tasks, in a hierarchical manner. See Additional file <xref rid="MOESM2" ref-type="media">2</xref>
 for an executable version of this file</p>
</caption>
<graphic xlink:href="13742_2016_135_Fig2_HTML" id="MO2"></graphic>
</fig>
<table-wrap id="Tab1"><label>Table 1</label>
<caption><p>Utilities that can be used to automate software execution</p>
</caption>
<table frame="hsides" rules="groups"><tbody><tr><td>• GNU Make and Make for Windows: tools for building software from source files and for ensuring that the software’s dependencies are met.</td>
</tr>
<tr><td>• Snakemake [<xref ref-type="bibr" rid="CR109">109</xref>
]: a workflow tool inspired by Make that provides a more flexible, Python-based syntax and makes it easier to execute tasks in parallel.</td>
</tr>
<tr><td>• BPipe [<xref ref-type="bibr" rid="CR110">110</xref>
]: a tool that provides a flexible syntax for users to specify commands to be executed; it maintains an audit trail of all commands that have been executed.</td>
</tr>
<tr><td>• GNU Parallel [<xref ref-type="bibr" rid="CR111">111</xref>
]: a tool for executing commands in parallel across one or more computers.</td>
</tr>
<tr><td>• Makeflow [<xref ref-type="bibr" rid="CR112">112</xref>
]: a tool that can execute commands simultaneously on various types of computer architectures, including computer clusters and cloud environments.</td>
</tr>
<tr><td>• SCONS [<xref ref-type="bibr" rid="CR113">113</xref>
]: an alternative to GNU Make that enables users to customize the process of building and executing software using scripts written in the Python programming language.</td>
</tr>
<tr><td>• CMake.org: a tool that generates build scripts (such as Make files) for multiple operating systems from a single configuration, making builds easier to port.</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
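<p>The dependency check that Make performs can be approximated in a few lines of Python. This is a simplified sketch of the timestamp rule (rebuild a target when it is missing or older than any of its dependencies), not a reimplementation of Make; the file names are hypothetical.</p>
<preformat>
```python
import os
import tempfile


def needs_rebuild(target, dependencies):
    """Return True when the target is missing or older than any dependency."""
    missing = [d for d in dependencies if not os.path.exists(d)]
    if missing:
        # Like Make, refuse to continue when a required input is absent.
        raise FileNotFoundError(f"missing dependencies: {missing}")
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(d) > target_time for d in dependencies)


with tempfile.TemporaryDirectory() as workdir:
    reads = os.path.join(workdir, "reads.txt")      # hypothetical input
    aligned = os.path.join(workdir, "aligned.txt")  # hypothetical output
    with open(reads, "w") as handle:
        handle.write("ACGT\n")
    first_check = needs_rebuild(aligned, [reads])   # target not built yet
    with open(aligned, "w") as handle:
        handle.write("aligned ACGT\n")
    os.utime(reads, (1, 1))  # make the input clearly older than the output
    second_check = needs_rebuild(aligned, [reads])

print(first_check, second_check)
```
</preformat>
<p>Skipping steps whose outputs are already up to date is what allows such utilities to resume a long analysis without repeating completed work.</p>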
<p>As well as creating scripts to execute existing software, many researchers also create new software by writing computer code in a programming language such as Python, C++, Java, or R. Such code may perform relatively simple tasks, such as reformatting data files or invoking third-party software. In other cases, computer code may constitute a manuscript’s key intellectual contribution.</p>
<p>Whether analysis steps are encoded in scripts or as computer code, scientists can support reproducibility by publishing these artifacts alongside research papers. By doing so, authors enable readers to evaluate the analytical approach in full detail and to extend the analysis more readily [<xref ref-type="bibr" rid="CR51">51</xref>
]. Although scripts and code may be included alongside a manuscript as supplementary material, a better alternative is to store them in a public repository with a permanent URL. It is often also useful to store code in a version control system (VCS) [<xref ref-type="bibr" rid="CR8">8</xref>
, <xref ref-type="bibr" rid="CR9">9</xref>
, <xref ref-type="bibr" rid="CR47">47</xref>
], and to share it via Web-based services like GitHub.com or Bitbucket.org [<xref ref-type="bibr" rid="CR52">52</xref>
]. With such a VCS repository, scientists can track the different versions of scripts and code that have been developed throughout the evolution of the research project. In addition, outside observers can see the full version history, contribute revisions to the code, and reuse the code for their own purposes [<xref ref-type="bibr" rid="CR53">53</xref>
]. When submitting a manuscript, the authors may “tag” a specific version of the repository that was used for the final analysis described in the manuscript.</p>
</sec>
<sec id="Sec4"><title>Software frameworks enable easier handling of software dependencies</title>
<p>Virtually all computer scripts and code rely on external software dependencies and operating system components. For example, suppose a research study required a scientist to apply Student’s <italic>t</italic>
-test. Rather than write code to implement this statistical test, the scientist would likely find an existing software library that implements the test and then invoke that library from their code. Much time can be saved with this approach, and a wide range of software libraries are freely available. However, software libraries change frequently; invoking the wrong version of a library may result in an error or an unexpected output. Thus, to enable others to reproduce an analysis, it is critical to indicate which dependencies (and versions thereof) must be installed.</p>
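<p>One way to make version requirements checkable, sketched here under the assumption that the analysis is written in Python, is to compare the installed version of each dependency against the version that was originally used. The package name and version below are hypothetical, and the name is deliberately one that should not be installed.</p>
<preformat>
```python
from importlib import metadata


def find_mismatches(pinned):
    """Compare installed package versions against the versions an analysis used."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the dependency is absent altogether
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# Hypothetical pin; a real analysis would list every library it imports.
pinned = {"hypothetical-analysis-lib-xyz": "1.2.0"}
problems = find_mismatches(pinned)
for name, (expected, installed) in problems.items():
    print(f"{name}: expected {expected}, found {installed}")
```
</preformat>
<p>Running such a check at the start of an analysis turns an undocumented assumption about library versions into an explicit, early failure.</p>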
<p>One way to address this challenge is to build on a pre-existing software framework, which makes it easier to access software libraries that are commonly used to perform specific types of analysis task. Typically, such frameworks also make it easier to download and install software dependencies, and to ensure that the versions of software libraries and their dependencies are compatible with each other. For example, Bioconductor [<xref ref-type="bibr" rid="CR54">54</xref>
], created for the R statistical programming language [<xref ref-type="bibr" rid="CR55">55</xref>
], is a popular framework that contains hundreds of software packages for analyzing biological data. The Bioconductor framework facilitates versioning, documenting, and distributing code. Once a software library has been incorporated into Bioconductor, other researchers can find, download, install, and configure it on most operating systems with relative ease. In addition, Bioconductor installs software dependencies automatically. These features ease the process of performing an analysis, and can help with reproducibility. Various software frameworks exist for other scientific disciplines [<xref ref-type="bibr" rid="CR56">56</xref>
–<xref ref-type="bibr" rid="CR61">61</xref>
]. General purpose tools for managing software dependencies also exist, for example, Apache Ivy [<xref ref-type="bibr" rid="CR62">62</xref>
] and Puppet [<xref ref-type="bibr" rid="CR50">50</xref>
].</p>
<p>To best support reproducibility, software frameworks should make it easy for scientists to download and install previous versions of a software tool, as well as previous versions of dependencies. Such a design enables other scientists to reproduce analyses that were conducted with previous versions of a software framework. In the case of Bioconductor, considerable extra work may be required to install specific versions of Bioconductor software and their dependencies. To overcome these limitations, scientists may use a software container or virtual machine to package together the specific versions they used in an analysis. Alternatively, they might use third-party solutions such as the aRchive project [<xref ref-type="bibr" rid="CR63">63</xref>
].</p>
</sec>
<sec id="Sec5"><title>Literate programming combines narratives with code</title>
<p>Although narratives, scripts, and computer code individually support reproducibility, there is additional value in combining these entities. Even though a researcher may provide computer code alongside a research paper, other scientists may have difficulty interpreting how the code accomplishes specific tasks. A longstanding way to address this problem is via code comments: human-readable annotations interspersed throughout computer code. However, code comments and other types of documentation often become outdated as code evolves throughout the analysis process [<xref ref-type="bibr" rid="CR64">64</xref>
]. One way to overcome this problem is to use a technique called literate programming [<xref ref-type="bibr" rid="CR65">65</xref>
]. In this approach, the scientist writes a narrative of the scientific analysis and intermingles code directly within the narrative. As the code is executed, a document is generated that includes the code, narratives, and any outputs (e.g., figures, tables) of the code. Accordingly, literate programming helps ensure that readers understand exactly how a particular research result was obtained. In addition, this approach motivates the scientist to keep the target audience in mind when performing a computational analysis, rather than simply to write code that a computer can parse [<xref ref-type="bibr" rid="CR65">65</xref>
]. Consequently, by reducing barriers of understanding among scientists, literate programming can help to engender greater trust in computational findings.</p>
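<p>The core mechanism can be illustrated with a toy sketch in Python: a document interleaves narrative with code chunks, each chunk is executed, and its output is placed directly after it. The chunk markers used here (%%code and %%end) are invented for this illustration and are not the syntax of any particular literate programming tool.</p>
<preformat>
```python
import contextlib
import io


def weave(source):
    """Run the code chunks in a document and place their output after each chunk."""
    woven, chunk, in_code = [], [], False
    namespace = {}  # shared, so later chunks can use earlier variables
    for line in source.splitlines():
        if line.strip() == "%%code":
            in_code, chunk = True, []
        elif line.strip() == "%%end" and in_code:
            in_code = False
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec("\n".join(chunk), namespace)
            woven.extend("    " + code_line for code_line in chunk)
            woven.extend("    #> " + out for out in buffer.getvalue().splitlines())
        elif in_code:
            chunk.append(line)
        else:
            woven.append(line)  # narrative passes through unchanged
    return "\n".join(woven)


document = """We summed the three measurements.
%%code
values = [2, 3, 5]
print(sum(values))
%%end
The total is reported above."""
report = weave(document)
print(report)
```
</preformat>
<p>Because the reported number is produced when the document is generated, the narrative and the result cannot drift apart in the way that hand-copied values can.</p>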
<p>One popular literate programming tool is Jupyter [<xref ref-type="bibr" rid="CR66">66</xref>
]. Using Jupyter.org’s Web-based interface, scientists can create interactive “notebooks” that combine code, data, mathematical equations, plots, and rich media [<xref ref-type="bibr" rid="CR67">67</xref>
]. Originally known as IPython, and previously designed exclusively for the Python programming language, Jupyter now makes it possible to execute code in many different programming languages. Such functionality may be important to scientists who prefer to combine the strengths of different programming languages.</p>
<p>knitr [<xref ref-type="bibr" rid="CR68">68</xref>
] has also gained considerable popularity as a literate programming tool. It is written in the R programming language, and thus can be integrated seamlessly with the array of statistical and plotting tools available in that environment. However, like Jupyter, knitr can execute code written in multiple programming languages. Commonly, knitr is applied to documents that have been authored using RStudio [<xref ref-type="bibr" rid="CR69">69</xref>
], an open-source tool with advanced editing and package management features.</p>
<p>Jupyter notebooks and knitr reports can be saved in various output formats, including hypertext markup language (HTML) and portable document format (PDF; see examples in Figs. <xref rid="Fig3" ref-type="fig">3</xref>
 and <xref rid="Fig4" ref-type="fig">4</xref>
; Additional files <xref rid="MOESM3" ref-type="media">3</xref>
 and <xref rid="MOESM4" ref-type="media">4</xref>
). Increasingly, scientists include such documents as supplementary materials to journal manuscripts, enabling others to repeat analysis steps and recreate manuscript figures [<xref ref-type="bibr" rid="CR70">70</xref>
–<xref ref-type="bibr" rid="CR73">73</xref>
].<fig id="Fig3"><label>Fig. 3</label>
<caption><p>Example of a Jupyter notebook. This example contains code (in the Python programming language) for generating random numbers and plotting them in a graph within a Jupyter notebook. Importantly, the code and output object (graph) are contained within the same document. See Additional file <xref rid="MOESM3" ref-type="media">3</xref>
 for an executable version of the notebook</p>
</caption>
<graphic xlink:href="13742_2016_135_Fig3_HTML" id="MO3"></graphic>
</fig>
<fig id="Fig4"><label>Fig. 4</label>
<caption><p>Example of a document created using knitr. This example contains code (in the R language) for generating random numbers and plotting them on a graph. The knitr tool was used to generate the document, which combines the code and the output object (figure). See Additional file <xref rid="MOESM4" ref-type="media">4</xref>
 for an executable version of this document</p>
</caption>
<graphic xlink:href="13742_2016_135_Fig4_HTML" id="MO4"></graphic>
</fig>
</p>
<p>Scientists typically use literate programming tools for data analysis tasks that can be executed interactively, in a modest amount of time (e.g., minutes or hours). However, it is possible to execute Jupyter or knitr at the command line; thus longer running tasks can be executed on high-performance computers.</p>
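<p>For instance, a notebook can be executed without the interactive interface by calling the nbconvert tool that ships with Jupyter. The sketch below builds that command in Python; the notebook name is hypothetical, and the helper function names are assumptions of this example.</p>
<preformat>
```python
import shutil
import subprocess


def nbconvert_command(notebook_path, output_format="html"):
    """Build the command that executes a notebook and saves the rendered result."""
    return ["jupyter", "nbconvert", "--to", output_format, "--execute", notebook_path]


def execute_notebook(notebook_path):
    """Run the notebook if Jupyter is installed; return False otherwise."""
    command = nbconvert_command(notebook_path)
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


command = nbconvert_command("analysis.ipynb")  # hypothetical notebook name
print(" ".join(command))
```
</preformat>
<p>A command of this form can be placed in a batch script or submitted to a job scheduler in the same way as any other command-line step.</p>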
<p>Literate programming notebooks are suitable for research analyses that require a modest amount of computer code. For analyses needing larger amounts of code, more advanced programming environments may be more suitable, perhaps in combination with a “literate documentation” tool such as Dexy.it.</p>
</sec>
<sec id="Sec6"><title>Workflow management systems enable software to be executed via a graphical user interface</title>
<p>Writing computer scripts and code seems daunting to many researchers. Although various courses and tutorials are helping to make this task less formidable [<xref ref-type="bibr" rid="CR46">46</xref>
, <xref ref-type="bibr" rid="CR74">74</xref>
–<xref ref-type="bibr" rid="CR76">76</xref>
], many scientists use “workflow management systems” to facilitate the execution of scientific software [<xref ref-type="bibr" rid="CR77">77</xref>
]. Typically managed via a graphical user interface, workflow management systems enable scientists to upload data and process them using existing tools. For multistep analyses, the output from one tool can be used as input to additional tools, resulting in a series of commands known as a workflow.</p>
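<p>Conceptually, a workflow is an ordered list of steps in which each step consumes the output of the previous one. The following Python sketch illustrates that idea and keeps a simple record of what each step did; the step functions and parameters are hypothetical and do not correspond to any specific workflow management system.</p>
<preformat>
```python
def run_workflow(data, steps):
    """Pass data through each step in order and record what every step did."""
    history = []
    for name, function, parameters in steps:
        data = function(data, **parameters)
        history.append({"step": name, "parameters": parameters, "output": data})
    return data, history


def drop_below(values, threshold):
    """Keep only the values at or above the threshold."""
    return [v for v in values if v >= threshold]


def scale(values, factor):
    """Multiply every value by a constant factor."""
    return [v * factor for v in values]


# Hypothetical two-step workflow: filter low values, then rescale the rest.
steps = [
    ("filter", drop_below, {"threshold": 3}),
    ("scale", scale, {"factor": 10}),
]
result, history = run_workflow([1, 4, 2, 7], steps)
print(result)
for entry in history:
    print(entry["step"], entry["parameters"])
```
</preformat>
<p>Graphical workflow systems present the same structure visually, with the recorded parameters serving as documentation of how each output was produced.</p>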
<p>Galaxy [<xref ref-type="bibr" rid="CR78">78</xref>
, <xref ref-type="bibr" rid="CR79">79</xref>
] has gained considerable popularity within the bioinformatics community, especially for performing next-generation sequencing analysis. As users construct workflows, Galaxy provides descriptions of how software parameters should be used, examples of how input files should be formatted, and links to relevant discussion forums. To help with processing large data sets and computationally complex algorithms, Galaxy also provides an option to execute workflows on cloud-computing services [<xref ref-type="bibr" rid="CR80">80</xref>
]. In addition, researchers can share workflows with each other at UseGalaxy.org; this feature has enabled the Galaxy team to build a community that encourages reproducibility, helps define best practices, and reduces the time required for novices to get started.</p>
<p>Various other workflow systems are freely available to the research community (see Table <xref rid="Tab2" ref-type="table">2</xref>
). For example, VisTrails.org is used by researchers from many disciplines, including climate science, microbial ecology, and quantum mechanics [<xref ref-type="bibr" rid="CR81">81</xref>
]. It enables scientists to design visual workflows, and connect data inputs with analytical modules and the resulting outputs. In addition, VisTrails tracks a full history of how each workflow was created. This capability, referred to as “retrospective provenance”, makes it possible for others to not only reproduce the final version of an analysis, but also to examine previous incarnations of the workflow and how each change influenced the analytical outputs [<xref ref-type="bibr" rid="CR82">82</xref>
].<table-wrap id="Tab2"><label>Table 2</label>
<caption><p>Workflow management tools freely available to the research community</p>
</caption>
<table frame="hsides" rules="groups"><tbody><tr><td>• Galaxy [<xref ref-type="bibr" rid="CR78">78</xref>
, <xref ref-type="bibr" rid="CR79">79</xref>
]</td>
</tr>
<tr><td>• VisTrails [<xref ref-type="bibr" rid="CR81">81</xref>
]</td>
</tr>
<tr><td>• Kepler-project.org [<xref ref-type="bibr" rid="CR114">114</xref>
]</td>
</tr>
<tr><td>• CyVerse.org (formerly known as The iPlant Collaborative) [<xref ref-type="bibr" rid="CR115">115</xref>
]</td>
</tr>
<tr><td>• GenePattern [<xref ref-type="bibr" rid="CR116">116</xref>
–<xref ref-type="bibr" rid="CR118">118</xref>
]</td>
</tr>
<tr><td>• Taverna.org.uk [<xref ref-type="bibr" rid="CR119">119</xref>
]</td>
</tr>
<tr><td>• LONI Pipeline [<xref ref-type="bibr" rid="CR120">120</xref>
, <xref ref-type="bibr" rid="CR121">121</xref>
]</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
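<p>The idea behind retrospective provenance, keeping every saved version of a workflow rather than only the latest one, can be illustrated with a short Python sketch. The class, tool names, and parameters below are hypothetical and are not drawn from the VisTrails implementation.</p>
<preformat>
```python
import copy


class WorkflowHistory:
    """Keep every saved version of a workflow, not just the latest one."""

    def __init__(self):
        self.versions = []  # list of (description, steps) tuples

    def save(self, description, steps):
        # Store a deep copy so that later edits do not alter earlier versions.
        self.versions.append((description, copy.deepcopy(steps)))

    def steps_at(self, version_number):
        return self.versions[version_number][1]


history = WorkflowHistory()
steps = [{"tool": "align", "mismatches": 2}]
history.save("initial alignment", steps)

steps[0]["mismatches"] = 1        # change a parameter
steps.append({"tool": "sort"})    # add a step
history.save("stricter alignment, then sort", steps)

for number, (description, saved_steps) in enumerate(history.versions):
    print(number, description, saved_steps)
```
</preformat>
<p>Because each version is stored as an independent copy, an earlier incarnation of the workflow can be inspected or re-executed after later changes have been made.</p>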
<p>Although workflow management systems offer many advantages, users must accept tradeoffs. For example, although the teams that develop these tools often provide public servers where users can execute workflows, many scientists share these resources, which limits the computational power or storage space available for executing large-scale analyses in a timely manner. As an alternative, many scientists install these systems on their own computers; however, configuring and supporting them requires time and expertise. In addition, if a workflow tool does not yet provide a module to support a given analysis, the scientist must create one. This task adds overhead; however, utilities such as the Galaxy Tool Shed [<xref ref-type="bibr" rid="CR83">83</xref>
] are helping to facilitate this process.</p>
</sec>
<sec id="Sec7"><title>Virtual machines encapsulate an entire operating system and software dependencies</title>
<p>Whether within a literate programming notebook, or via a workflow management system, an operating system and relevant software dependencies must be installed before an analysis is executed. The process of identifying, installing, and configuring such dependencies consumes a considerable amount of scientists’ time. Different operating systems (and versions thereof) may require different installation and configuration steps. Furthermore, earlier versions of software dependencies, which may currently be installed on a given computer, may be incompatible with—or produce different outputs than—newer versions.</p>
<p>One solution is to use virtual machines, which can encapsulate an entire operating system and all software, scripts, code, and data necessary to execute a computational analysis [<xref ref-type="bibr" rid="CR84">84</xref>
, <xref ref-type="bibr" rid="CR85">85</xref>
] (Fig. <xref rid="Fig5" ref-type="fig">5</xref>
). Using virtualization software such as VirtualBox or VMWare (see Table <xref rid="Tab3" ref-type="table">3</xref>
), a virtual machine can be executed on practically any desktop, laptop, or server, irrespective of the main (“host”) operating system on the computer. For example, even though a scientist’s computer may be running a Windows operating system, they may perform an analysis on a Linux operating system that is running concurrently—within a virtual machine—on the same computer. The scientist has full control over the virtual (“guest”) operating system, and thus can install software and modify configuration settings as necessary. In addition, a virtual machine can be constrained to use specific amounts of computational resources (e.g., computer memory, processing power), thus enabling system administrators to ensure that multiple virtual machines can be executed simultaneously on the same computer without impacting each other’s performance. After executing an analysis, the scientist can export the entire virtual machine to a single, binary file. Other scientists can then use this file to reconstitute the same computational environment that was used for the original analysis. With a few exceptions (see Discussion), these scientists will obtain exactly the same results as the original scientist. This process provides the added benefits that 1) the scientist must only document the installation and configuration steps for a single operating system, 2) other scientists need only install the virtualization software and not individual software components, and 3) analyses can be re-executed indefinitely, so long as the virtualization software remains compatible with current computer systems [<xref ref-type="bibr" rid="CR86">86</xref>
]. The fact that a team of scientists can employ virtual machines to ensure that each team member has the same computational environment is also useful because team members may have different configurations on their host operating systems.<fig id="Fig5"><label>Fig. 5</label>
<caption><p>Architecture of virtual machines. Virtual machines encapsulate analytical software and dependencies within a “guest” operating system, which may be different from the main (“host”) operating system. A virtual machine executes in the context of virtualization software, which runs alongside other software installed on the computer</p>
</caption>
<graphic xlink:href="13742_2016_135_Fig5_HTML" id="MO5"></graphic>
</fig>
<table-wrap id="Tab3"><label>Table 3</label>
<caption><p>Virtual machine software</p>
</caption>
<table frame="hsides" rules="groups"><tbody><tr><td>Virtualization hypervisors:</td>
</tr>
<tr><td> • VirtualBox.org (open source)</td>
</tr>
<tr><td> • XenProject.org (open source)</td>
</tr>
<tr><td> • VMWare.com (partially open source)</td>
</tr>
<tr><td>Virtual machine management tools:</td>
</tr>
<tr><td> • VagrantUP.com (open source)</td>
</tr>
<tr><td> • Vortex (open source) [<xref ref-type="bibr" rid="CR122">122</xref>
]</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p>One criticism of using virtual machines to support computational reproducibility is that virtual machine files are large (typically multiple gigabytes), especially if they include raw data files. This imposes a barrier for researchers to share virtual machines with the research community. One option is to use cloud-computing services (see Table <xref rid="Tab4" ref-type="table">4</xref>
). Scientists can execute an analysis in the cloud, take a “snapshot” of their virtual machine, and share it with others in that environment [<xref ref-type="bibr" rid="CR84">84</xref>
, <xref ref-type="bibr" rid="CR87">87</xref>
]. Cloud-based services typically provide repositories where virtual machine files can easily be stored and shared among users. Despite these advantages, some researchers may prefer their data to reside on local computers, rather than in the cloud—at least while the research is being performed. In addition, cloud-based services may use proprietary software, so virtual machines may only be executable within each provider’s infrastructure. Furthermore, to use a cloud service provider, scientists may need to activate a fee-based account.<table-wrap id="Tab4"><label>Table 4</label>
<caption><p>Commercial cloud-service providers</p>
</caption>
<table frame="hsides" rules="groups"><tbody><tr><td>• Amazon Web Services [<xref ref-type="bibr" rid="CR123">123</xref>
]</td>
</tr>
<tr><td>• Rackspace.com/Cloud</td>
</tr>
<tr><td>• Google Cloud Platform [<xref ref-type="bibr" rid="CR124">124</xref>
]</td>
</tr>
<tr><td>• Windows Azure [<xref ref-type="bibr" rid="CR125">125</xref>
]</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p>When using virtual machines to support reproducibility, it is important that other scientists can not only re-execute the analysis, but also examine the scripts and code used within the virtual machine [<xref ref-type="bibr" rid="CR88">88</xref>
]. Although it is possible for others to examine the contents of a virtual machine directly, it is preferable to store the scripts and code in public repositories—separately from the virtual machine—so others can examine and extend the analysis more easily [<xref ref-type="bibr" rid="CR89">89</xref>
]. In addition, scientists can use a virtual machine that has been prepackaged for a particular research discipline. For example, CloudBioLinux contains a variety of bioinformatics tools commonly used by genomics researchers [<xref ref-type="bibr" rid="CR90">90</xref>
]. The scripts for building this virtual machine are stored in a public repository [<xref ref-type="bibr" rid="CR91">91</xref>
].</p>
<p>Scientists can automate the process of building and configuring virtual machines using tools such as Vagrant or Vortex (see Table <xref rid="Tab3" ref-type="table">3</xref>
). For either tool, users can write text-based configuration files that provide instructions for building virtual machines and allocating computational resources to them. In addition, these configuration files can be used to specify analysis steps [<xref ref-type="bibr" rid="CR89">89</xref>
]. Because these files are text based and relatively small (usually a few kilobytes), scientists can share them easily and track different versions of the files via source control repositories. This approach also mitigates problems that might arise during the analysis stage. For example, even when a computer’s host operating system must be reinstalled because of a computer hardware failure, the virtual machine can be recreated with relative ease.</p>
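<p>To illustrate how small such a configuration file can be, the Python sketch below renders the text of a Vagrant configuration file from a handful of settings. The box name, memory size, processor count, and provisioning script are assumptions chosen for illustration; the resulting text would need to be saved as a file named Vagrantfile before Vagrant could use it.</p>
<preformat>
```python
# Template for a Vagrant configuration file (Vagrant files use Ruby syntax).
VAGRANTFILE_TEMPLATE = """\
Vagrant.configure("2") do |config|
  config.vm.box = "{box}"
  config.vm.provider "virtualbox" do |vb|
    vb.memory = {memory_mb}
    vb.cpus = {cpus}
  end
  config.vm.provision "shell", path: "{script}"
end
"""


def render_vagrantfile(box, memory_mb, cpus, script):
    """Fill in the template with the guest system and resource settings."""
    return VAGRANTFILE_TEMPLATE.format(
        box=box, memory_mb=memory_mb, cpus=cpus, script=script
    )


# Hypothetical settings: an Ubuntu guest with 2 GB of memory and 2 processors,
# provisioned by a shell script that carries out the analysis steps.
vagrantfile = render_vagrantfile("ubuntu/trusty64", 2048, 2, "analysis.sh")
print(vagrantfile)
```
</preformat>
<p>The rendered text occupies a few hundred bytes, which is why files of this kind are convenient to track in a source control repository.</p>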
</sec>
<sec id="Sec8"><title>Software containers ease the process of installing and configuring dependencies</title>
<p>Software containers are a lighter weight alternative to virtual machines. Like virtual machines, containers encapsulate operating system components, scripts, code, and data into a single package that can be shared with others. Thus, as with virtual machines, analyses executed within a software container should produce identical outputs, irrespective of the underlying operating system or the software that may be installed outside the container (see Discussion for caveats). As is true for virtual machines, multiple containers can be executed simultaneously on a single computer, and each container may contain different software versions and configurations. However, whereas virtual machines include an entire operating system, software containers interface directly with the computer’s main operating system and extend it as needed (Fig. <xref rid="Fig6" ref-type="fig">6</xref>
). This design provides less flexibility than virtual machines because containers are specific to a given type of operating system; however, containers require considerably less computational overhead than virtual machines, and can be initialized much more quickly [<xref ref-type="bibr" rid="CR92">92</xref>
].<fig id="Fig6"><label>Fig. 6</label>
<caption><p>Architecture of software containers. Software containers encapsulate analytical software and dependencies. In contrast to virtual machines, containers execute within the context of the computer’s main operating system</p>
</caption>
<graphic xlink:href="13742_2016_135_Fig6_HTML" id="MO6"></graphic>
</fig>
</p>
<p>The open source Docker.com utility, which has gained popularity among informaticians since its release in 2013, provides the ability to build, execute, and share software containers for Linux-based operating systems. Users specify a Docker container’s contents using text-based commands. These instructions can be placed in a “Dockerfile”, which other scientists can use to rebuild the container. As with virtual machine configuration files, Dockerfiles are text based, so they can be shared easily, and can be tracked and versioned in source control repositories. Once a Docker container has been built, its contents can be exported to a binary file; these files are generally smaller than virtual machine files, so they can be shared more easily—for example, via hub.Docker.com.</p>
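<p>As a hedged illustration of what such text-based instructions look like, the Python sketch below assembles the contents of a small Dockerfile. The base image, package version, and script name are hypothetical, and installing each package in its own RUN instruction is a simplification made for readability.</p>
<preformat>
```python
def render_dockerfile(base_image, packages, script):
    """Return Dockerfile text that installs packages and runs one script."""
    lines = ["FROM " + base_image]
    for package in packages:
        # Pinning an exact version helps others rebuild the same environment.
        lines.append("RUN pip install " + package)
    lines.append("COPY {0} /analysis/{0}".format(script))
    lines.append('CMD ["python", "/analysis/{0}"]'.format(script))
    return "\n".join(lines) + "\n"


# Hypothetical values, chosen only for illustration.
dockerfile = render_dockerfile("python:3.5", ["numpy==1.10.4"], "analysis.py")
print(dockerfile)
```
</preformat>
<p>Saving this text in a file named Dockerfile, next to the analysis script, would allow another scientist to rebuild the container with the docker build command.</p>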
<p>A key feature of Docker containers is that their contents can be stacked in distinct layers (or “images”). Each image includes software components to address a particular need. Within a given research lab, scientists might create general purpose images to support functionality for multiple projects, and specialized images to address the needs of specific projects. An advantage of Docker’s modular design is that when images within a container are updated, Docker only needs to track the specific components that have changed; users who wish to update to a newer version need only download a relatively small update. In contrast, even a minor change to a virtual machine would require users to export and reshare the entire virtual machine.</p>
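<p>The benefit of layering can be shown with a toy model in Python. Here each layer is identified by a hash of its content, and only the layers that are missing locally need to be transferred. This is a deliberate simplification rather than a description of Docker's internal storage format, and the layer names and version numbers are illustrative.</p>
<preformat>
```python
import hashlib


def layer_id(content):
    """Identify a layer by a hash of its content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def layers_to_download(local_layers, remote_layers):
    """Return the remote layers whose identifiers are not held locally."""
    local_ids = {layer_id(content) for content in local_layers}
    return [
        content for content in remote_layers
        if layer_id(content) not in local_ids
    ]


# Illustrative layer contents; only the top layer differs between versions.
old_image = ["operating system libraries", "R 3.2", "Bioconductor 3.2"]
new_image = ["operating system libraries", "R 3.2", "Bioconductor 3.3"]
print(layers_to_download(old_image, new_image))
```
</preformat>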
<p>Scientists have begun to share Docker images that enable others to execute analyses described in research papers [<xref ref-type="bibr" rid="CR93">93</xref>
–<xref ref-type="bibr" rid="CR95">95</xref>
], and to facilitate benchmarking efforts among researchers in a given subdiscipline. For example, nucleotid.es is a catalog of genome-assembly tools that have been encapsulated in Docker images [<xref ref-type="bibr" rid="CR96">96</xref>
, <xref ref-type="bibr" rid="CR97">97</xref>
]. Genome assembly tools differ considerably in the dependencies they require, and in the parameters they support. This project provides a means to standardize these assemblers, circumvent the need to install dependencies for each tool, and perform benchmarks across the tools. Such projects may help to reduce the reproducibility burden on individual scientists.</p>
<p>The use of Docker containers for reproducible research comes with caveats. Individual containers are stored and executed in isolation from other containers on the same computer; however, because all containers on a given machine share the same operating system, this isolation is not as complete as it is with virtual machines. This means, for example, that a given container is not guaranteed to have access to a specific amount of computer memory or processing power—multiple containers may have to compete for these resources [<xref ref-type="bibr" rid="CR92">92</xref>
]. In addition, containers may be more vulnerable to security breaches [<xref ref-type="bibr" rid="CR92">92</xref>
]. Because Docker containers can only be executed on Linux-based operating systems, they must be executed within a virtual machine on Windows and Mac operating systems. Docker provides installation packages to facilitate this integration; however, the overhead of using a virtual machine offsets some of the performance benefits of using containers.</p>
<p>Efforts are ongoing to develop and refine software container technologies. Table <xref rid="Tab5" ref-type="table">5</xref>
 lists various tools that are currently available. In the coming years, these technologies promise to play an influential role within the scientific community.<table-wrap id="Tab5"><label>Table 5</label>
<caption><p>Open-source containerization software</p>
</caption>
<table frame="hsides" rules="groups"><tbody><tr><td>• Docker.com</td>
</tr>
<tr><td>• LinuxContainers.org</td>
</tr>
<tr><td>• lmctfy [<xref ref-type="bibr" rid="CR126">126</xref>
]</td>
</tr>
<tr><td>• OpenVZ.org</td>
</tr>
<tr><td>• Warden [<xref ref-type="bibr" rid="CR127">127</xref>
]</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
</sec>
<sec id="Sec9"><title>Conclusions</title>
<p>Scientific advancement requires researchers to explicitly document the research steps they performed and to transparently share those steps with other researchers. This review provides a comprehensive, though not exhaustive, list of techniques that can help meet these requirements for computational analyses. Science philosopher Karl Popper contended that, “[w]e do not take even our own observations quite seriously, or accept them as scientific observations, until we have repeated and tested them” [<xref ref-type="bibr" rid="CR2">2</xref>
]. Indeed, in many cases, the individuals who benefit most from computational reproducibility are those who performed the original analysis, but reproducible and transparent practices can also increase the level at which a scientist’s work is accepted by other scientists [<xref ref-type="bibr" rid="CR47">47</xref>
, <xref ref-type="bibr" rid="CR98">98</xref>
]. When other scientists can reproduce an analysis and determine exactly how its conclusions were drawn, they may be more apt to cite and build upon the work. In contrast, when others fail to reproduce research findings, it can derail scientific progress and may lead to embarrassment, accusations, and retractions.</p>
<p>We have described seven tools and techniques for facilitating computational reproducibility. None of these approaches is sufficient for every scenario in isolation; rather, scientists will often find value in combining approaches. For example, a researcher who uses a literate programming notebook (that combines narratives with code) might incorporate the notebook into a software container so that others can execute it without needing to install specific software dependencies. The container might also include a workflow management system to ease the process of integrating multiple tools and incorporating best practices for the analysis. This container could be packaged within a virtual machine or cloud-computing environment to ensure that it can be executed consistently (see Fig. <xref rid="Fig7" ref-type="fig">7</xref>
). Binder [<xref ref-type="bibr" rid="CR99">99</xref>
] and Everware [<xref ref-type="bibr" rid="CR100">100</xref>
] are two services that allow researchers to execute Jupyter notebooks within a Web browser, using a Docker container to package the underlying software, and a cloud-computing environment to execute it. Although still under active development, such services may be harbingers of the future for computationally reproducible science.<fig id="Fig7"><label>Fig. 7</label>
<caption><p>Example of a Docker container for genomics research. This container would enable researchers to preprocess various types of molecular data, using tools from Bioconductor and Galaxy, and to analyze the resulting data within a Jupyter notebook. Each box within the container represents a distinct Docker image. These images are layered such that some images depend on others (for example, the Bioconductor image depends on R). At its base, the container includes operating system libraries, which may not be present (or may be configured differently) on the computer’s main operating system</p>
</caption>
<graphic xlink:href="13742_2016_135_Fig7_HTML" id="MO7"></graphic>
</fig>
</p>
<p>The call for computational reproducibility relies on the premise that reproducible science will bolster the efficiency of the overall scientific enterprise [<xref ref-type="bibr" rid="CR101">101</xref>
]. Although reproducible practices may require additional time and effort, these practices provide ancillary benefits that help offset those expenditures [<xref ref-type="bibr" rid="CR47">47</xref>
]. Primarily, scientists may experience increased efficiency in their research [<xref ref-type="bibr" rid="CR47">47</xref>
]. For example, before and after a manuscript is submitted for publication, it faces scrutiny from co-authors and peer reviewers who may suggest alterations to the analysis. Having a complete record of all the analysis steps, and being able to retrace those steps precisely, makes it faster and easier to implement the requested alterations [<xref ref-type="bibr" rid="CR47">47</xref>
, <xref ref-type="bibr" rid="CR102">102</xref>
]. Reproducible practices can also improve the efficiency of team science because colleagues can more easily communicate their research protocols and inspect each other’s work; one type of relationship where this is critical is that between academic advisors and mentees [<xref ref-type="bibr" rid="CR102">102</xref>
]. Finally, when research protocols are shared transparently with the broader community, scientific advancement increases because scientists can learn more easily from each other’s work and there is less duplication of effort [<xref ref-type="bibr" rid="CR102">102</xref>
].</p>
<p>Reproducible practices do not necessarily ensure that others can obtain identical results to those obtained by the original scientists. Indeed, this objective may be infeasible for some types of computational analysis, including those that use randomization procedures, floating-point operations, or specialized computer hardware [<xref ref-type="bibr" rid="CR85">85</xref>
, <xref ref-type="bibr" rid="CR103">103</xref>
]. In such cases, the goal may shift to ensuring that others can obtain results that are semantically consistent with the original findings [<xref ref-type="bibr" rid="CR5">5</xref>
, <xref ref-type="bibr" rid="CR6">6</xref>
]. In addition, in studies where vast computational resources are needed to perform an analysis, or where data sets are distributed geographically [<xref ref-type="bibr" rid="CR104">104</xref>
–<xref ref-type="bibr" rid="CR106">106</xref>
], full reproducibility may be infeasible. Alternatively, it may be infeasible to reallocate computational resources for highly computationally intensive analyses [<xref ref-type="bibr" rid="CR8">8</xref>
]. In these cases, researchers can provide relatively simple examples to demonstrate the methodology [<xref ref-type="bibr" rid="CR8">8</xref>
]. When legal restrictions prevent researchers from publicly sharing software or data, or when software is available only via a Web interface, researchers should document the analysis steps as well as possible and describe why such components cannot be shared [<xref ref-type="bibr" rid="CR25">25</xref>
].</p>
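<p>Two of these obstacles, randomization and floating-point arithmetic, can be demonstrated with Python's standard library. The sketch below shows that fixing a random seed makes a pseudorandom sequence repeatable, and that two mathematically equivalent sums can differ in their last digits, so that comparing results within a tolerance is more realistic than requiring exact equality. The seed and sample size are arbitrary choices.</p>
<preformat>
```python
import math
import random


def simulate(seed, n=5):
    """Draw n pseudorandom numbers from a generator with a fixed seed."""
    generator = random.Random(seed)
    return [generator.random() for _ in range(n)]


# The same seed gives the same sequence on repeated runs.
print(simulate(42) == simulate(42))

# Floating-point addition is not associative, so the order of operations
# can change the last digits of a result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)              # False with IEEE 754 double precision
print(math.isclose(left, right))  # True: the values agree within a tolerance
```
</preformat>
<p>A tolerance-based comparison of this kind is one simple way to check that reproduced results are consistent with the original findings even when they are not bit-for-bit identical.</p>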
<p>Computational reproducibility does not guarantee against analytical biases, or ensure that software produces scientifically valid results [<xref ref-type="bibr" rid="CR107">107</xref>
]. As with any research, a poor study design, confounding effects, or improper use of analytical software may plague even the most reproducible analyses [<xref ref-type="bibr" rid="CR107">107</xref>
, <xref ref-type="bibr" rid="CR108">108</xref>
]. On one hand, increased transparency puts scientists at a greater risk that such problems will be exposed. On the other hand, scientists who are fully transparent about their scientific approach may be more likely to avoid such pitfalls, knowing that they will be more vulnerable to such criticisms. Either way, the scientific community benefits.</p>
<p>Lastly, we emphasize that some reproducibility is better than none. Although some of the practices described in this review require more technical expertise than others, they are freely accessible to all scientists, and provide long-term benefits to the researcher and to the scientific community. Indeed, as scientists act in good faith to perform these practices, where feasible, the pace of scientific progress will surely increase.</p>
<sec id="Sec10"><title>Open Peer Review</title>
<p>The Open Peer Review files are available for this manuscript as Additional files – See Additional file <xref rid="MOESM5" ref-type="media">5</xref>
.</p>
</sec>
</sec>
</body>
<back><app-group><app id="App1"><sec id="Sec11"><title>Additional files</title>
<p><media position="anchor" xlink:href="13742_2016_135_MOESM1_ESM.sh" id="MOESM1"><label>Additional file 1:</label>
<caption><p>This script is supporting material for Fig. <xref rid="Fig1" ref-type="fig">1</xref>
. It can be used to align DNA sequence data to a reference genome. First, it downloads the software and data files necessary for the analysis. Then, it extracts (“unzips”) these files, and aligns the data to a reference genome for Ebola virus. Finally, it converts, sorts, and indexes the aligned data. (SH 865 bytes)</p>
</caption>
</media>
<media position="anchor" xlink:href="13742_2016_135_MOESM2_ESM.xls" id="MOESM2"><label>Additional file 2:</label>
<caption><p>This Make file is supporting material for Fig. <xref rid="Fig2" ref-type="fig">2</xref>
. It performs the same function as Additional file <xref rid="MOESM1" ref-type="media">1</xref>
, except that it is formatted for the Make utility. Accordingly, it is structured so that specific tasks must be executed before other tasks, in a hierarchical manner</p>
</caption>
</media>
<media position="anchor" xlink:href="13742_2016_135_MOESM3_ESM.ipynb" id="MOESM3"><label>Additional file 3:</label>
<caption><p>This Jupyter notebook is supporting material for Fig. <xref rid="Fig3" ref-type="fig">3</xref>
. It contains code (in the Python programming language) for generating random numbers and plotting them in a graph. (IPYNB 53 kb)</p>
</caption>
</media>
<media position="anchor" xlink:href="13742_2016_135_MOESM4_ESM.rmd" id="MOESM4"><label>Additional file 4:</label>
<caption><p>This document contains code (in the R language) for generating random numbers and plotting them on a graph. This document is in R Markdown format and can be compiled using knitr. (RMD 382 bytes)</p>
</caption>
</media>
<media position="anchor" xlink:href="13742_2016_135_MOESM5_ESM.pdf" id="MOESM5"><label>Additional file 5:</label>
<caption><p>Open Peer Review. (PDF 6134 kb)</p>
</caption>
</media>
</p>
</sec>
</app>
</app-group>
<glossary><title>Abbreviations</title>
<def-list><def-item><term>DOI</term>
<def><p>Digital object identifier</p>
</def>
</def-item>
<def-item><term>HTML</term>
<def><p>Hypertext markup language</p>
</def>
</def-item>
<def-item><term>PDF</term>
<def><p>Portable document format</p>
</def>
</def-item>
<def-item><term>URL</term>
<def><p>Uniform resource locator</p>
</def>
</def-item>
<def-item><term>VCS</term>
<def><p>Version control system</p>
</def>
</def-item>
</def-list>
</glossary>
<fn-group><fn><p>Twitter: @stevepiccolo</p>
</fn>
</fn-group>
<ack><title>Acknowledgements</title>
<p>SRP acknowledges startup funds provided by Brigham Young University. We thank research-community members and reviewers who provided valuable feedback on this manuscript.</p>
<sec id="FPar1"><title>Authors’ contributions</title>
<p>SRP wrote the manuscript and created figures. MBF created figures and helped to revise the manuscript. Both authors read and approved the final manuscript.</p>
</sec>
<sec id="FPar2"><title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
</ack>
<ref-list id="Bib1"><title>References</title>
<ref id="CR1"><label>1.</label>
<element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Fisher</surname>
<given-names>RA</given-names>
</name>
</person-group>
<source>The Design of Experiments</source>
<year>1935</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>Hafner Press</publisher-name>
</element-citation>
</ref>
<ref id="CR2"><label>2.</label>
<element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Popper</surname>
<given-names>KR</given-names>
</name>
</person-group>
<source>The logic of scientific discovery</source>
<year>1959</year>
<publisher-loc>London</publisher-loc>
<publisher-name>Routledge</publisher-name>
</element-citation>
</ref>
<ref id="CR3"><label>3.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Peng</surname>
<given-names>RD</given-names>
</name>
</person-group>
<article-title>Reproducible research in computational science</article-title>
<source>Science</source>
<year>2011</year>
<volume>334</volume>
<fpage>1226</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="doi">10.1126/science.1213847</pub-id>
<pub-id pub-id-type="pmid">22144613</pub-id>
</element-citation>
</ref>
<ref id="CR4"><label>4.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Russell</surname>
<given-names>JF</given-names>
</name>
</person-group>
<article-title>If a job is worth doing, it is worth doing twice</article-title>
<source>Nature</source>
<year>2013</year>
<volume>496</volume>
<fpage>7</fpage>
<pub-id pub-id-type="doi">10.1038/496007a</pub-id>
<pub-id pub-id-type="pmid">23552910</pub-id>
</element-citation>
</ref>
<ref id="CR5"><label>5.</label>
<mixed-citation publication-type="other">Feynman RP. Six Easy Pieces: Essentials of Physics Explained by Its Most Brilliant Teacher. Boston, MA: Addison-Wesley; 1995. p. 34–5.</mixed-citation>
</ref>
<ref id="CR6"><label>6.</label>
<mixed-citation publication-type="other">Murray-Rust P, Murray-Rust D. Reproducible Physical Science and the Declaratron. In: Stodden VC, Leisch F, Peng RD, editors. Implementing Reproducible Research. Boca Raton, FL: CRC Press; 2014. p. 113.</mixed-citation>
</ref>
<ref id="CR7"><label>7.</label>
<mixed-citation publication-type="other">Hey AJG, Tansley S, Tolle KM, et al. The fourth paradigm: data-intensive scientific discovery. Redmond, WA: Microsoft Research; 2009.</mixed-citation>
</ref>
<ref id="CR8"><label>8.</label>
<mixed-citation publication-type="other">Millman KJ, Pérez F. Developing Open-Source Scientific Practice. In: Stodden VC, Leisch F, Peng RD, editors. Implementing Reproducible Research. Boca Raton, FL: CRC Press; 2014. p. 149.</mixed-citation>
</ref>
<ref id="CR9"><label>9.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wilson</surname>
<given-names>G</given-names>
</name>
<name><surname>Aruliah</surname>
<given-names>DA</given-names>
</name>
<name><surname>Brown</surname>
<given-names>CT</given-names>
</name>
<name><surname>Chue Hong</surname>
<given-names>NP</given-names>
</name>
<name><surname>Davis</surname>
<given-names>M</given-names>
</name>
<name><surname>Guy</surname>
<given-names>RT</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Best practices for scientific computing</article-title>
<source>PLoS Biol</source>
<year>2014</year>
<volume>12</volume>
<fpage>e1001745</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pbio.1001745</pub-id>
<pub-id pub-id-type="pmid">24415924</pub-id>
</element-citation>
</ref>
<ref id="CR10"><label>10.</label>
<mixed-citation publication-type="other">Software with impact. Nat Methods. 2014;11:211.</mixed-citation>
</ref>
<ref id="CR11"><label>11.</label>
<mixed-citation publication-type="other">Hong NC. We are the 92% [Internet]. Figshare; 2014. Available from: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.6084/M9.FIGSHARE.1243288">http://dx.doi.org/10.6084/M9.FIGSHARE.1243288</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR12"><label>12.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sacks</surname>
<given-names>J</given-names>
</name>
<name><surname>Welch</surname>
<given-names>WJ</given-names>
</name>
<name><surname>Mitchell</surname>
<given-names>TJ</given-names>
</name>
<name><surname>Wynn</surname>
<given-names>HP</given-names>
</name>
</person-group>
<article-title>Design and analysis of computer experiments</article-title>
<source>Stat Sci</source>
<year>1989</year>
<volume>4</volume>
<fpage>409</fpage>
<lpage>23</lpage>
<pub-id pub-id-type="doi">10.1214/ss/1177012413</pub-id>
</element-citation>
</ref>
<ref id="CR13"><label>13.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Garijo</surname>
<given-names>D</given-names>
</name>
<name><surname>Kinnings</surname>
<given-names>S</given-names>
</name>
<name><surname>Xie</surname>
<given-names>L</given-names>
</name>
<name><surname>Xie</surname>
<given-names>L</given-names>
</name>
<name><surname>Zhang</surname>
<given-names>Y</given-names>
</name>
<name><surname>Bourne</surname>
<given-names>PE</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Quantifying reproducibility in computational biology: the case of the tuberculosis drugome</article-title>
<source>PLoS One</source>
<year>2013</year>
<volume>8</volume>
<fpage>e80278</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0080278</pub-id>
<pub-id pub-id-type="pmid">24312207</pub-id>
</element-citation>
</ref>
<ref id="CR14"><label>14.</label>
<mixed-citation publication-type="other">Error prone. Nature. 2012;487:406.</mixed-citation>
</ref>
<ref id="CR15"><label>15.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vandewalle</surname>
<given-names>P</given-names>
</name>
<name><surname>Barrenetxea</surname>
<given-names>G</given-names>
</name>
<name><surname>Jovanovic</surname>
<given-names>I</given-names>
</name>
<name><surname>Ridolfi</surname>
<given-names>A</given-names>
</name>
<name><surname>Vetterli</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Experiences with reproducible research in various facets of signal processing research</article-title>
<source>IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP’07</source>
<year>2007</year>
<volume>2007</volume>
<fpage>IV-1253</fpage>
<lpage>IV-1256</lpage>
</element-citation>
</ref>
<ref id="CR16"><label>16.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cassey</surname>
<given-names>P</given-names>
</name>
<name><surname>Blackburn</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Reproducibility and repeatability in ecology</article-title>
<source>Bioscience</source>
<year>2006</year>
<volume>56</volume>
<fpage>958</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1641/0006-3568(2006)56[958:RARIE]2.0.CO;2</pub-id>
</element-citation>
</ref>
<ref id="CR17"><label>17.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Murphy</surname>
<given-names>JM</given-names>
</name>
<name><surname>Sexton</surname>
<given-names>DMH</given-names>
</name>
<name><surname>Barnett</surname>
<given-names>DN</given-names>
</name>
<name><surname>Jones</surname>
<given-names>GS</given-names>
</name>
<name><surname>Webb</surname>
<given-names>MJ</given-names>
</name>
<name><surname>Collins</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Quantification of modelling uncertainties in a large ensemble of climate change simulations</article-title>
<source>Nature</source>
<year>2004</year>
<volume>430</volume>
<fpage>768</fpage>
<lpage>72</lpage>
<pub-id pub-id-type="doi">10.1038/nature02771</pub-id>
<pub-id pub-id-type="pmid">15306806</pub-id>
</element-citation>
</ref>
<ref id="CR18"><label>18.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>McCarthy</surname>
<given-names>DJ</given-names>
</name>
<name><surname>Humburg</surname>
<given-names>P</given-names>
</name>
<name><surname>Kanapin</surname>
<given-names>A</given-names>
</name>
<name><surname>Rivas</surname>
<given-names>MA</given-names>
</name>
<name><surname>Gaulton</surname>
<given-names>K</given-names>
</name>
<name><surname>Cazier</surname>
<given-names>J-B</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Choice of transcripts and software has a large effect on variant annotation</article-title>
<source>Genome Med</source>
<year>2014</year>
<volume>6</volume>
<fpage>26</fpage>
<pub-id pub-id-type="doi">10.1186/gm543</pub-id>
<pub-id pub-id-type="pmid">24944579</pub-id>
</element-citation>
</ref>
<ref id="CR19"><label>19.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Neuman</surname>
<given-names>JA</given-names>
</name>
<name><surname>Isakov</surname>
<given-names>O</given-names>
</name>
<name><surname>Shomron</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Analysis of insertion-deletion from deep-sequencing data: Software evaluation for optimal detection</article-title>
<source>Brief Bioinform</source>
<year>2013</year>
<volume>14</volume>
<fpage>46</fpage>
<lpage>55</lpage>
<pub-id pub-id-type="doi">10.1093/bib/bbs013</pub-id>
<pub-id pub-id-type="pmid">22707752</pub-id>
</element-citation>
</ref>
<ref id="CR20"><label>20.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bradnam</surname>
<given-names>KR</given-names>
</name>
<name><surname>Fass</surname>
<given-names>JN</given-names>
</name>
<name><surname>Alexandrov</surname>
<given-names>A</given-names>
</name>
<name><surname>Baranay</surname>
<given-names>P</given-names>
</name>
<name><surname>Bechner</surname>
<given-names>M</given-names>
</name>
<name><surname>Birol</surname>
<given-names>I</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species</article-title>
<source>GigaScience</source>
<year>2013</year>
<volume>2</volume>
<fpage>10</fpage>
<pub-id pub-id-type="doi">10.1186/2047-217X-2-10</pub-id>
<pub-id pub-id-type="pmid">23870653</pub-id>
</element-citation>
</ref>
<ref id="CR21"><label>21.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bilal</surname>
<given-names>E</given-names>
</name>
<name><surname>Dutkowski</surname>
<given-names>J</given-names>
</name>
<name><surname>Guinney</surname>
<given-names>J</given-names>
</name>
<name><surname>Jang</surname>
<given-names>IS</given-names>
</name>
<name><surname>Logsdon</surname>
<given-names>BA</given-names>
</name>
<name><surname>Pandey</surname>
<given-names>G</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Improving breast cancer survival analysis through competition-based multidimensional modeling</article-title>
<source>PLoS Comput Biol</source>
<year>2013</year>
<volume>9</volume>
<fpage>e1003047</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1003047</pub-id>
<pub-id pub-id-type="pmid">23671412</pub-id>
</element-citation>
</ref>
<ref id="CR22"><label>22.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gronenschild</surname>
<given-names>EHBM</given-names>
</name>
<name><surname>Habets</surname>
<given-names>P</given-names>
</name>
<name><surname>Jacobs</surname>
<given-names>HIL</given-names>
</name>
<name><surname>Mengelers</surname>
<given-names>R</given-names>
</name>
<name><surname>Rozendaal</surname>
<given-names>N</given-names>
</name>
<name><surname>van Os</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The effects of FreeSurfer version, workstation type, and Macintosh operating system version on anatomical volume and cortical thickness measurements</article-title>
<source>PLoS One</source>
<year>2012</year>
<volume>7</volume>
<fpage>e38234</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0038234</pub-id>
<pub-id pub-id-type="pmid">22675527</pub-id>
</element-citation>
</ref>
<ref id="CR23"><label>23.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Moskvin</surname>
<given-names>OV</given-names>
</name>
<name><surname>McIlwain</surname>
<given-names>S</given-names>
</name>
<name><surname>Ong</surname>
<given-names>IM</given-names>
</name>
</person-group>
<article-title>CAMDA 2014: Making sense of RNA-Seq data: From low-level processing to functional analysis</article-title>
<source>Systems Biomedicine</source>
<year>2014</year>
<volume>2</volume>
<fpage>31</fpage>
<lpage>40</lpage>
<pub-id pub-id-type="doi">10.1080/21628130.2015.1010923</pub-id>
</element-citation>
</ref>
<ref id="CR24"><label>24.</label>
<mixed-citation publication-type="other">Reducing our irreproducibility. Nature. 2013;496:398.</mixed-citation>
</ref>
<ref id="CR25"><label>25.</label>
<element-citation publication-type="book"><person-group person-group-type="editor"><name><surname>Micheel</surname>
<given-names>CM</given-names>
</name>
<name><surname>Nass</surname>
<given-names>SJ</given-names>
</name>
<name><surname>Omenn</surname>
<given-names>GS</given-names>
</name>
</person-group>
<source>Evolution of Translational Omics: Lessons Learned and the Path Forward</source>
<year>2012</year>
<publisher-loc>Washington, D.C.</publisher-loc>
<publisher-name>The National Academies Press</publisher-name>
</element-citation>
</ref>
<ref id="CR26"><label>26.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Collins</surname>
<given-names>FS</given-names>
</name>
<name><surname>Tabak</surname>
<given-names>LA</given-names>
</name>
</person-group>
<article-title>Policy: NIH plans to enhance reproducibility</article-title>
<source>Nature</source>
<year>2014</year>
<volume>505</volume>
<fpage>612</fpage>
<lpage>3</lpage>
<pub-id pub-id-type="doi">10.1038/505612a</pub-id>
<pub-id pub-id-type="pmid">24482835</pub-id>
</element-citation>
</ref>
<ref id="CR27"><label>27.</label>
<mixed-citation publication-type="other">Chambers JM. S as a Programming Environment for Data Analysis and Graphics. In: Problem Solving Environments for Scientific Computing, Proceedings 17th Symposium on the Interface of Statistics and Computing. North Holland; 1985. p. 211–4.</mixed-citation>
</ref>
<ref id="CR28"><label>28.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>LeVeque</surname>
<given-names>RJ</given-names>
</name>
<name><surname>Mitchell</surname>
<given-names>IM</given-names>
</name>
<name><surname>Stodden</surname>
<given-names>V</given-names>
</name>
</person-group>
<article-title>Reproducible research for scientific computing: Tools and strategies for changing the culture</article-title>
<source>Comput Sci Eng</source>
<year>2012</year>
<volume>14</volume>
<fpage>13</fpage>
<pub-id pub-id-type="doi">10.1109/MCSE.2012.38</pub-id>
</element-citation>
</ref>
<ref id="CR29"><label>29.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Stodden</surname>
<given-names>V</given-names>
</name>
<name><surname>Guo</surname>
<given-names>P</given-names>
</name>
<name><surname>Ma</surname>
<given-names>Z</given-names>
</name>
</person-group>
<article-title>Toward reproducible computational research: an empirical analysis of data and code policy adoption by journals</article-title>
<source>PLoS One</source>
<year>2013</year>
<volume>8</volume>
<fpage>2</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0067111</pub-id>
</element-citation>
</ref>
<ref id="CR30"><label>30.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Morin</surname>
<given-names>A</given-names>
</name>
<name><surname>Urban</surname>
<given-names>J</given-names>
</name>
<name><surname>Adams</surname>
<given-names>PD</given-names>
</name>
<name><surname>Foster</surname>
<given-names>I</given-names>
</name>
<name><surname>Sali</surname>
<given-names>A</given-names>
</name>
<name><surname>Baker</surname>
<given-names>D</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Research priorities. Shining light into black boxes</article-title>
<source>Science</source>
<year>2012</year>
<volume>336</volume>
<fpage>159</fpage>
<lpage>60</lpage>
<pub-id pub-id-type="doi">10.1126/science.1218263</pub-id>
<pub-id pub-id-type="pmid">22499926</pub-id>
</element-citation>
</ref>
<ref id="CR31"><label>31.</label>
<mixed-citation publication-type="other">Rebooting review. Nat Biotechnol. 2015;33:319.</mixed-citation>
</ref>
<ref id="CR32"><label>32.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ioannidis</surname>
<given-names>JPA</given-names>
</name>
<name><surname>Allison</surname>
<given-names>DB</given-names>
</name>
<name><surname>Ball</surname>
<given-names>CA</given-names>
</name>
<name><surname>Coulibaly</surname>
<given-names>I</given-names>
</name>
<name><surname>Cui</surname>
<given-names>X</given-names>
</name>
<name><surname>Culhane</surname>
<given-names>AC</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Repeatability of published microarray gene expression analyses</article-title>
<source>Nat Genet</source>
<year>2009</year>
<volume>41</volume>
<fpage>149</fpage>
<lpage>55</lpage>
<pub-id pub-id-type="doi">10.1038/ng.295</pub-id>
<pub-id pub-id-type="pmid">19174838</pub-id>
</element-citation>
</ref>
<ref id="CR33"><label>33.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nekrutenko</surname>
<given-names>A</given-names>
</name>
<name><surname>Taylor</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Next-generation sequencing data interpretation: enhancing reproducibility and accessibility</article-title>
<source>Nat Rev Genet</source>
<year>2012</year>
<volume>13</volume>
<fpage>667</fpage>
<lpage>72</lpage>
<pub-id pub-id-type="doi">10.1038/nrg3305</pub-id>
<pub-id pub-id-type="pmid">22898652</pub-id>
</element-citation>
</ref>
<ref id="CR34"><label>34.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Baggerly</surname>
<given-names>KA</given-names>
</name>
<name><surname>Coombes</surname>
<given-names>KR</given-names>
</name>
</person-group>
<article-title>Deriving chemosensitivity from cell lines: Forensic bioinformatics and reproducible research in high-throughput biology</article-title>
<source>Ann Appl Stat</source>
<year>2009</year>
<volume>3</volume>
<fpage>1309</fpage>
<lpage>34</lpage>
<pub-id pub-id-type="doi">10.1214/09-AOAS291</pub-id>
</element-citation>
</ref>
<ref id="CR35"><label>35.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Decullier</surname>
<given-names>E</given-names>
</name>
<name><surname>Huot</surname>
<given-names>L</given-names>
</name>
<name><surname>Samson</surname>
<given-names>G</given-names>
</name>
<name><surname>Maisonneuve</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Visibility of retractions: a cross-sectional one-year study</article-title>
<source>BMC Res Notes</source>
<year>2013</year>
<volume>6</volume>
<fpage>238</fpage>
<pub-id pub-id-type="doi">10.1186/1756-0500-6-238</pub-id>
<pub-id pub-id-type="pmid">23782596</pub-id>
</element-citation>
</ref>
<ref id="CR36"><label>36.</label>
<mixed-citation publication-type="other">Claerbout JF, Karrenbach M. Electronic Documents Give Reproducible Research a New Meaning. Meeting of the Society of Exploration Geophysicists. New Orleans, LA; 1992.</mixed-citation>
</ref>
<ref id="CR37"><label>37.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Stodden</surname>
<given-names>V</given-names>
</name>
<name><surname>Miguez</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Best practices for computational science: software infrastructure and environments for reproducible and extensible research</article-title>
<source>J Open Res Softw</source>
<year>2014</year>
<volume>2</volume>
<fpage>21</fpage>
<pub-id pub-id-type="doi">10.5334/jors.ay</pub-id>
</element-citation>
</ref>
<ref id="CR38"><label>38.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ravel</surname>
<given-names>J</given-names>
</name>
<name><surname>Wommack</surname>
<given-names>KE</given-names>
</name>
</person-group>
<article-title>All hail reproducibility in microbiome research</article-title>
<source>Microbiome</source>
<year>2014</year>
<volume>2</volume>
<fpage>8</fpage>
<pub-id pub-id-type="doi">10.1186/2049-2618-2-8</pub-id>
<pub-id pub-id-type="pmid">24602292</pub-id>
</element-citation>
</ref>
<ref id="CR39"><label>39.</label>
<mixed-citation publication-type="other">Stodden V. 2014: What scientific idea is ready for retirement? [Internet]. 2014. Available from: <ext-link ext-link-type="uri" xlink:href="http://edge.org/response-detail/25340">http://edge.org/response-detail/25340</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR40"><label>40.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Birney</surname>
<given-names>E</given-names>
</name>
<name><surname>Hudson</surname>
<given-names>TJ</given-names>
</name>
<name><surname>Green</surname>
<given-names>ED</given-names>
</name>
<name><surname>Gunter</surname>
<given-names>C</given-names>
</name>
<name><surname>Eddy</surname>
<given-names>S</given-names>
</name>
<name><surname>Rogers</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Prepublication data sharing</article-title>
<source>Nature</source>
<year>2009</year>
<volume>461</volume>
<fpage>168</fpage>
<lpage>70</lpage>
<pub-id pub-id-type="doi">10.1038/461168a</pub-id>
<pub-id pub-id-type="pmid">19741685</pub-id>
</element-citation>
</ref>
<ref id="CR41"><label>41.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hothorn</surname>
<given-names>T</given-names>
</name>
<name><surname>Leisch</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Case studies in reproducibility</article-title>
<source>Brief Bioinform</source>
<year>2011</year>
<volume>12</volume>
<fpage>288</fpage>
<lpage>300</lpage>
<pub-id pub-id-type="doi">10.1093/bib/bbq084</pub-id>
<pub-id pub-id-type="pmid">21278369</pub-id>
</element-citation>
</ref>
<ref id="CR42"><label>42.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schofield</surname>
<given-names>PN</given-names>
</name>
<name><surname>Bubela</surname>
<given-names>T</given-names>
</name>
<name><surname>Weaver</surname>
<given-names>T</given-names>
</name>
<name><surname>Portilla</surname>
<given-names>L</given-names>
</name>
<name><surname>Brown</surname>
<given-names>SD</given-names>
</name>
<name><surname>Hancock</surname>
<given-names>JM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Post-publication sharing of data and tools</article-title>
<source>Nature</source>
<year>2009</year>
<volume>461</volume>
<fpage>171</fpage>
<lpage>3</lpage>
<pub-id pub-id-type="doi">10.1038/461171a</pub-id>
<pub-id pub-id-type="pmid">19741686</pub-id>
</element-citation>
</ref>
<ref id="CR43"><label>43.</label>
<mixed-citation publication-type="other">Piwowar HA, Day RS, Fridsma DB. Sharing detailed research data is associated with increased citation rate. PLoS One. 2007;2.</mixed-citation>
</ref>
<ref id="CR44"><label>44.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Johnson</surname>
<given-names>VE</given-names>
</name>
</person-group>
<article-title>Revised standards for statistical evidence</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2013</year>
<volume>110</volume>
<fpage>19313</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.1313476110</pub-id>
<pub-id pub-id-type="pmid">24218581</pub-id>
</element-citation>
</ref>
<ref id="CR45"><label>45.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Halsey</surname>
<given-names>LG</given-names>
</name>
<name><surname>Curran-Everett</surname>
<given-names>D</given-names>
</name>
<name><surname>Vowler</surname>
<given-names>SL</given-names>
</name>
<name><surname>Drummond</surname>
<given-names>GB</given-names>
</name>
</person-group>
<article-title>The fickle P value generates irreproducible results</article-title>
<source>Nat Methods</source>
<year>2015</year>
<volume>12</volume>
<fpage>179</fpage>
<lpage>85</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.3288</pub-id>
<pub-id pub-id-type="pmid">25719825</pub-id>
</element-citation>
</ref>
<ref id="CR46"><label>46.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wilson</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Software Carpentry: lessons learned</article-title>
<source>F1000Res</source>
<year>2016</year>
<volume>3</volume>
<fpage>62</fpage>
<pub-id pub-id-type="pmid">24715981</pub-id>
</element-citation>
</ref>
<ref id="CR47"><label>47.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sandve</surname>
<given-names>GK</given-names>
</name>
<name><surname>Nekrutenko</surname>
<given-names>A</given-names>
</name>
<name><surname>Taylor</surname>
<given-names>J</given-names>
</name>
<name><surname>Hovig</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Ten simple rules for reproducible computational research</article-title>
<source>PLoS Comput Biol</source>
<year>2013</year>
<volume>9</volume>
<fpage>1</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1003285</pub-id>
</element-citation>
</ref>
<ref id="CR48"><label>48.</label>
<mixed-citation publication-type="other">GNU Make [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="https://www.gnu.org/software/make">https://www.gnu.org/software/make</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR49"><label>49.</label>
<mixed-citation publication-type="other">Make for Windows [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="http://gnuwin32.sourceforge.net/packages/make.htm">http://gnuwin32.sourceforge.net/packages/make.htm</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR50"><label>50.</label>
<mixed-citation publication-type="other">Puppet [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="https://puppetlabs.com">https://puppetlabs.com</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR51"><label>51.</label>
<mixed-citation publication-type="other">Code share. Nature. 2014;514:536.</mixed-citation>
</ref>
<ref id="CR52"><label>52.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Blischak</surname>
<given-names>JD</given-names>
</name>
<name><surname>Davenport</surname>
<given-names>ER</given-names>
</name>
<name><surname>Wilson</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>A quick introduction to version control with Git and GitHub</article-title>
<source>PLoS Comput Biol</source>
<year>2016</year>
<volume>12</volume>
<fpage>e1004668</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1004668</pub-id>
<pub-id pub-id-type="pmid">26785377</pub-id>
</element-citation>
</ref>
<ref id="CR53"><label>53.</label>
<mixed-citation publication-type="other">Loeliger J, McCullough M. Version Control with Git: Powerful Tools and Techniques for Collaborative Software Development. Sebastopol, California: O’Reilly Media, Inc.; 2012. p. 456.</mixed-citation>
</ref>
<ref id="CR54"><label>54.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Huber</surname>
<given-names>W</given-names>
</name>
<name><surname>Carey</surname>
<given-names>VJ</given-names>
</name>
<name><surname>Gentleman</surname>
<given-names>R</given-names>
</name>
<name><surname>Anders</surname>
<given-names>S</given-names>
</name>
<name><surname>Carlson</surname>
<given-names>M</given-names>
</name>
<name><surname>Carvalho</surname>
<given-names>BS</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Orchestrating high-throughput genomic analysis with Bioconductor</article-title>
<source>Nat Methods</source>
<year>2015</year>
<volume>12</volume>
<fpage>115</fpage>
<lpage>21</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.3252</pub-id>
<pub-id pub-id-type="pmid">25633503</pub-id>
</element-citation>
</ref>
<ref id="CR55"><label>55.</label>
<mixed-citation publication-type="other">R Core Team. R: A Language and Environment for Statistical Computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2014. Available from: <ext-link ext-link-type="uri" xlink:href="http://www.r-project.org">http://www.r-project.org</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR56"><label>56.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tóth</surname>
<given-names>G</given-names>
</name>
<name><surname>Sokolov</surname>
<given-names>IV</given-names>
</name>
<name><surname>Gombosi</surname>
<given-names>TI</given-names>
</name>
<name><surname>Chesney</surname>
<given-names>DR</given-names>
</name>
<name><surname>Clauer</surname>
<given-names>CR</given-names>
</name>
<name><surname>De Zeeuw</surname>
<given-names>DL</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Space weather modeling framework: a new tool for the space science community</article-title>
<source>J Geophys Res</source>
<year>2005</year>
<volume>110</volume>
<fpage>A12226</fpage>
<pub-id pub-id-type="doi">10.1029/2005JA011126</pub-id>
</element-citation>
</ref>
<ref id="CR57"><label>57.</label>
<mixed-citation publication-type="other">Tan E, Choi E, Thoutireddy P, Gurnis M, Aivazis M. GeoFramework: Coupling multiple models of mantle convection within a computational framework. Geochem Geophys Geosyst. [Internet]. 2006;7. Available from: <ext-link ext-link-type="uri" xlink:href="http://doi.wiley.com/10.1029/2005GC001155">http://doi.wiley.com/10.1029/2005GC001155</ext-link>
</mixed-citation>
</ref>
<ref id="CR58"><label>58.</label>
<mixed-citation publication-type="other">Heisen B, Boukhelef D, Esenov S, Hauf S, Kozlova I, Maia L, et al. Karabo: An Integrated Software Framework Combining Control, Data Management, and Scientific Computing Tasks. 14th International Conference on Accelerator &amp; Large Experimental Physics Control Systems, ICALEPCS2013. San Francisco, CA; 2013.</mixed-citation>
</ref>
<ref id="CR59"><label>59.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schneider</surname>
<given-names>CA</given-names>
</name>
<name><surname>Rasband</surname>
<given-names>WS</given-names>
</name>
<name><surname>Eliceiri</surname>
<given-names>KW</given-names>
</name>
</person-group>
<article-title>NIH Image to ImageJ: 25 years of image analysis</article-title>
<source>Nat Methods</source>
<year>2012</year>
<volume>9</volume>
<fpage>671</fpage>
<lpage>5</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.2089</pub-id>
<pub-id pub-id-type="pmid">22930834</pub-id>
</element-citation>
</ref>
<ref id="CR60"><label>60.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schindelin</surname>
<given-names>J</given-names>
</name>
<name><surname>Arganda-Carreras</surname>
<given-names>I</given-names>
</name>
<name><surname>Frise</surname>
<given-names>E</given-names>
</name>
<name><surname>Kaynig</surname>
<given-names>V</given-names>
</name>
<name><surname>Longair</surname>
<given-names>M</given-names>
</name>
<name><surname>Pietzsch</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Fiji: an open-source platform for biological-image analysis</article-title>
<source>Nat Methods</source>
<year>2012</year>
<volume>9</volume>
<fpage>676</fpage>
<lpage>82</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.2019</pub-id>
<pub-id pub-id-type="pmid">22743772</pub-id>
</element-citation>
</ref>
<ref id="CR61"><label>61.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Biasini</surname>
<given-names>M</given-names>
</name>
<name><surname>Schmidt</surname>
<given-names>T</given-names>
</name>
<name><surname>Bienert</surname>
<given-names>S</given-names>
</name>
<name><surname>Mariani</surname>
<given-names>V</given-names>
</name>
<name><surname>Studer</surname>
<given-names>G</given-names>
</name>
<name><surname>Haas</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>OpenStructure: an integrated software framework for computational structural biology</article-title>
<source>Acta Crystallogr D Biol Crystallogr</source>
<year>2013</year>
<volume>69</volume>
<fpage>701</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1107/S0907444913007051</pub-id>
<pub-id pub-id-type="pmid">23633579</pub-id>
</element-citation>
</ref>
<ref id="CR62"><label>62.</label>
<mixed-citation publication-type="other">Ivy, the agile dependency manager [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="http://ant.apache.org/ivy">http://ant.apache.org/ivy</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR63"><label>63.</label>
<mixed-citation publication-type="other">aRchive: Enabling reproducibility of Bioconductor package versions (for Galaxy) [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="http://bioarchive.github.io">http://bioarchive.github.io</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR64"><label>64.</label>
<element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Martin</surname>
<given-names>RC</given-names>
</name>
</person-group>
<source>Clean code: a handbook of agile software craftsmanship</source>
<year>2009</year>
<publisher-name>Pearson Education</publisher-name>
</element-citation>
</ref>
<ref id="CR65"><label>65.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Knuth</surname>
<given-names>DE</given-names>
</name>
</person-group>
<article-title>Literate programming</article-title>
<source>Comput J</source>
<year>1984</year>
<volume>27</volume>
<fpage>97</fpage>
<lpage>111</lpage>
<pub-id pub-id-type="doi">10.1093/comjnl/27.2.97</pub-id>
</element-citation>
</ref>
<ref id="CR66"><label>66.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pérez</surname>
<given-names>F</given-names>
</name>
<name><surname>Granger</surname>
<given-names>BE</given-names>
</name>
</person-group>
<article-title>IPython: a system for interactive scientific computing</article-title>
<source>Comput Sci Eng</source>
<year>2007</year>
<volume>9</volume>
<fpage>21</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1109/MCSE.2007.53</pub-id>
</element-citation>
</ref>
<ref id="CR67"><label>67.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shen</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Interactive notebooks: Sharing the code</article-title>
<source>Nature</source>
<year>2014</year>
<volume>515</volume>
<fpage>151</fpage>
<lpage>2</lpage>
<pub-id pub-id-type="doi">10.1038/515151a</pub-id>
<pub-id pub-id-type="pmid">25373681</pub-id>
</element-citation>
</ref>
<ref id="CR68"><label>68.</label>
<mixed-citation publication-type="other">Xie Y. Dynamic Documents with R and knitr. Boca Raton, FL: CRC Press; 2013. p. 216.</mixed-citation>
</ref>
<ref id="CR69"><label>69.</label>
<mixed-citation publication-type="other">RStudio Team. RStudio: Integrated Development for R [Internet]. [cited 2015 Nov 20]. Available from: <ext-link ext-link-type="uri" xlink:href="http://www.rstudio.com">http://www.rstudio.com</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR70"><label>70.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gross</surname>
<given-names>AM</given-names>
</name>
<name><surname>Orosco</surname>
<given-names>RK</given-names>
</name>
<name><surname>Shen</surname>
<given-names>JP</given-names>
</name>
<name><surname>Egloff</surname>
<given-names>AM</given-names>
</name>
<name><surname>Carter</surname>
<given-names>H</given-names>
</name>
<name><surname>Hofree</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss</article-title>
<source>Nat Genet</source>
<year>2014</year>
<volume>46</volume>
<fpage>1</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="doi">10.1038/ng.3051</pub-id>
<pub-id pub-id-type="pmid">24370738</pub-id>
</element-citation>
</ref>
<ref id="CR71"><label>71.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ding</surname>
<given-names>T</given-names>
</name>
<name><surname>Schloss</surname>
<given-names>PD</given-names>
</name>
</person-group>
<article-title>Dynamics and associations of microbial community types across the human body</article-title>
<source>Nature</source>
<year>2014</year>
<volume>509</volume>
<fpage>357</fpage>
<lpage>60</lpage>
<pub-id pub-id-type="doi">10.1038/nature13178</pub-id>
<pub-id pub-id-type="pmid">24739969</pub-id>
</element-citation>
</ref>
<ref id="CR72"><label>72.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ram</surname>
<given-names>Y</given-names>
</name>
<name><surname>Hadany</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>The probability of improvement in Fisher’s geometric model: A probabilistic approach</article-title>
<source>Theor Popul Biol</source>
<year>2015</year>
<volume>99</volume>
<fpage>1</fpage>
<lpage>6</lpage>
<pub-id pub-id-type="doi">10.1016/j.tpb.2014.10.004</pub-id>
<pub-id pub-id-type="pmid">25453607</pub-id>
</element-citation>
</ref>
<ref id="CR73"><label>73.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Meadow</surname>
<given-names>JF</given-names>
</name>
<name><surname>Altrichter</surname>
<given-names>AE</given-names>
</name>
<name><surname>Kembel</surname>
<given-names>SW</given-names>
</name>
<name><surname>Moriyama</surname>
<given-names>M</given-names>
</name>
<name><surname>O’Connor</surname>
<given-names>TK</given-names>
</name>
<name><surname>Womack</surname>
<given-names>AM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Bacterial communities on classroom surfaces vary with human contact</article-title>
<source>Microbiome</source>
<year>2014</year>
<volume>2</volume>
<fpage>7</fpage>
<pub-id pub-id-type="doi">10.1186/2049-2618-2-7</pub-id>
<pub-id pub-id-type="pmid">24602274</pub-id>
</element-citation>
</ref>
<ref id="CR74"><label>74.</label>
<mixed-citation publication-type="other">White E. Programming for Biologists [Internet]. Available from: <ext-link ext-link-type="uri" xlink:href="http://www.programmingforbiologists.org">http://www.programmingforbiologists.org</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR75"><label>75.</label>
<mixed-citation publication-type="other">Peng RD, Leek J, Caffo B. Coursera course: Exploratory Data Analysis [Internet]. Available from: <ext-link ext-link-type="uri" xlink:href="https://www.coursera.org/learn/exploratory-data-analysis">https://www.coursera.org/learn/exploratory-data-analysis</ext-link>
.</mixed-citation>
</ref>
<ref id="CR76"><label>76.</label>
<mixed-citation publication-type="other">Bioconductor - Courses and Conferences [Internet]. [cited 2015 Nov 20]. Available from: <ext-link ext-link-type="uri" xlink:href="http://master.bioconductor.org/help/course-materials">http://master.bioconductor.org/help/course-materials</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR77"><label>77.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gil</surname>
<given-names>Y</given-names>
</name>
<name><surname>Deelman</surname>
<given-names>E</given-names>
</name>
<name><surname>Ellisman</surname>
<given-names>M</given-names>
</name>
<name><surname>Fahringer</surname>
<given-names>T</given-names>
</name>
<name><surname>Fox</surname>
<given-names>G</given-names>
</name>
<name><surname>Gannon</surname>
<given-names>D</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Examining the challenges of scientific workflows</article-title>
<source>Computer</source>
<year>2007</year>
<volume>40</volume>
<fpage>24</fpage>
<lpage>32</lpage>
<pub-id pub-id-type="doi">10.1109/MC.2007.421</pub-id>
</element-citation>
</ref>
<ref id="CR78"><label>78.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Giardine</surname>
<given-names>B</given-names>
</name>
<name><surname>Riemer</surname>
<given-names>C</given-names>
</name>
<name><surname>Hardison</surname>
<given-names>RC</given-names>
</name>
<name><surname>Burhans</surname>
<given-names>R</given-names>
</name>
<name><surname>Elnitski</surname>
<given-names>L</given-names>
</name>
<name><surname>Shah</surname>
<given-names>P</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Galaxy: a platform for interactive large-scale genome analysis</article-title>
<source>Genome Res</source>
<year>2005</year>
<volume>15</volume>
<fpage>1451</fpage>
<lpage>5</lpage>
<pub-id pub-id-type="doi">10.1101/gr.4086505</pub-id>
<pub-id pub-id-type="pmid">16169926</pub-id>
</element-citation>
</ref>
<ref id="CR79"><label>79.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Goecks</surname>
<given-names>J</given-names>
</name>
<name><surname>Nekrutenko</surname>
<given-names>A</given-names>
</name>
<name><surname>Taylor</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences</article-title>
<source>Genome Biol</source>
<year>2010</year>
<volume>11</volume>
<fpage>R86</fpage>
<pub-id pub-id-type="doi">10.1186/gb-2010-11-8-r86</pub-id>
<pub-id pub-id-type="pmid">20738864</pub-id>
</element-citation>
</ref>
<ref id="CR80"><label>80.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Afgan</surname>
<given-names>E</given-names>
</name>
<name><surname>Baker</surname>
<given-names>D</given-names>
</name>
<name><surname>Coraor</surname>
<given-names>N</given-names>
</name>
<name><surname>Goto</surname>
<given-names>H</given-names>
</name>
<name><surname>Paul</surname>
<given-names>IM</given-names>
</name>
<name><surname>Makova</surname>
<given-names>KD</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Harnessing cloud computing with Galaxy Cloud</article-title>
<source>Nat Biotechnol</source>
<year>2011</year>
<volume>29</volume>
<fpage>972</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="doi">10.1038/nbt.2028</pub-id>
<pub-id pub-id-type="pmid">22068528</pub-id>
</element-citation>
</ref>
<ref id="CR81"><label>81.</label>
<mixed-citation publication-type="other">Callahan SP, Freire J, Santos E, Scheidegger CE, Silva CT, Vo HT. VisTrails: Visualization Meets Data Management. Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data. New York, NY, USA: ACM; 2006. p. 745–7.</mixed-citation>
</ref>
<ref id="CR82"><label>82.</label>
<mixed-citation publication-type="other">Davidson SB, Freire J. Provenance and scientific workflows. Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD’08. 2008. p. 1345.</mixed-citation>
</ref>
<ref id="CR83"><label>83.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lazarus</surname>
<given-names>R</given-names>
</name>
<name><surname>Kaspi</surname>
<given-names>A</given-names>
</name>
<name><surname>Ziemann</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Creating re-usable tools from scripts: The Galaxy Tool Factory</article-title>
<source>Bioinformatics</source>
<year>2012</year>
<volume>28</volume>
<fpage>3139</fpage>
<lpage>40</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bts573</pub-id>
<pub-id pub-id-type="pmid">23024011</pub-id>
</element-citation>
</ref>
<ref id="CR84"><label>84.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dudley</surname>
<given-names>JT</given-names>
</name>
<name><surname>Butte</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<article-title>In silico research in the era of cloud computing</article-title>
<source>Nat Biotechnol</source>
<year>2010</year>
<volume>28</volume>
<fpage>1181</fpage>
<lpage>5</lpage>
<pub-id pub-id-type="doi">10.1038/nbt1110-1181</pub-id>
<pub-id pub-id-type="pmid">21057489</pub-id>
</element-citation>
</ref>
<ref id="CR85"><label>85.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hurley</surname>
<given-names>DG</given-names>
</name>
<name><surname>Budden</surname>
<given-names>DM</given-names>
</name>
<name><surname>Crampin</surname>
<given-names>EJ</given-names>
</name>
</person-group>
<article-title>Virtual Reference Environments: a simple way to make research reproducible</article-title>
<source>Brief Bioinform</source>
<year>2015</year>
<volume>16</volume>
<issue>5</issue>
<fpage>901</fpage>
<lpage>903</lpage>
<pub-id pub-id-type="doi">10.1093/bib/bbu043</pub-id>
<pub-id pub-id-type="pmid">25433467</pub-id>
</element-citation>
</ref>
<ref id="CR86"><label>86.</label>
<mixed-citation publication-type="other">Gent IP. The Recomputation Manifesto. arXiv [Internet]. 2013; Available from: <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1304.3674">http://arxiv.org/abs/1304.3674</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR87"><label>87.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Howe</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Virtual appliances, cloud computing, and reproducible research</article-title>
<source>Comput Sci Eng</source>
<year>2012</year>
<volume>14</volume>
<fpage>36</fpage>
<lpage>41</lpage>
<pub-id pub-id-type="doi">10.1109/MCSE.2012.62</pub-id>
</element-citation>
</ref>
<ref id="CR88"><label>88.</label>
<mixed-citation publication-type="other">Brown CT. Virtual machines considered harmful for reproducibility [Internet]. 2012. Available from: <ext-link ext-link-type="uri" xlink:href="http://ivory.idyll.org/blog/vms-considered-harmful.html">http://ivory.idyll.org/blog/vms-considered-harmful.html</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR89"><label>89.</label>
<mixed-citation publication-type="other">Piccolo SR. Building portable analytical environments to improve sustainability of computational-analysis pipelines in the sciences [Internet]. 2014. Available from: <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.6084/m9.figshare.1112571">http://dx.doi.org/10.6084/m9.figshare.1112571</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR90"><label>90.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Krampis</surname>
<given-names>K</given-names>
</name>
<name><surname>Booth</surname>
<given-names>T</given-names>
</name>
<name><surname>Chapman</surname>
<given-names>B</given-names>
</name>
<name><surname>Tiwari</surname>
<given-names>B</given-names>
</name>
<name><surname>Bicak</surname>
<given-names>M</given-names>
</name>
<name><surname>Field</surname>
<given-names>D</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community</article-title>
<source>BMC Bioinformatics</source>
<year>2012</year>
<volume>13</volume>
<fpage>42</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-13-42</pub-id>
<pub-id pub-id-type="pmid">22429538</pub-id>
</element-citation>
</ref>
<ref id="CR91"><label>91.</label>
<mixed-citation publication-type="other">CloudBioLinux: configure virtual (or real) machines with tools for biological analyses [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="https://github.com/chapmanb/cloudbiolinux">https://github.com/chapmanb/cloudbiolinux</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR92"><label>92.</label>
<mixed-citation publication-type="other">Felter W, Ferreira A, Rajamony R, Rubio J. An Updated Performance Comparison of Virtual Machines and Linux Containers [Internet]. IBM Research Division; 2014. Available from: <ext-link ext-link-type="uri" xlink:href="http://domino.research.ibm.com/library/CyberDig.nsf/papers/0929052195DD819C85257D2300681E7B/$File/rc25482.pdf">http://domino.research.ibm.com/library/CyberDig.nsf/papers/0929052195DD819C85257D2300681E7B/$File/rc25482.pdf</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR93"><label>93.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Eglen</surname>
<given-names>SJ</given-names>
</name>
<name><surname>Weeks</surname>
<given-names>M</given-names>
</name>
<name><surname>Jessop</surname>
<given-names>M</given-names>
</name>
<name><surname>Simonotto</surname>
<given-names>J</given-names>
</name>
<name><surname>Jackson</surname>
<given-names>T</given-names>
</name>
<name><surname>Sernagor</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>A data repository and analysis framework for spontaneous neural activity recordings in developing retina</article-title>
<source>Gigascience</source>
<year>2014</year>
<volume>3</volume>
<fpage>3</fpage>
<pub-id pub-id-type="doi">10.1186/2047-217X-3-3</pub-id>
<pub-id pub-id-type="pmid">24666584</pub-id>
</element-citation>
</ref>
<ref id="CR94"><label>94.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Eglen</surname>
<given-names>SJ</given-names>
</name>
</person-group>
<article-title>Bivariate spatial point patterns in the retina: a reproducible review</article-title>
<source>Journal de la Société Française de Statistique</source>
<year>2016</year>
<volume>157</volume>
<fpage>33</fpage>
<lpage>48</lpage>
</element-citation>
</ref>
<ref id="CR95"><label>95.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bremges</surname>
<given-names>A</given-names>
</name>
<name><surname>Maus</surname>
<given-names>I</given-names>
</name>
<name><surname>Belmann</surname>
<given-names>P</given-names>
</name>
<name><surname>Eikmeyer</surname>
<given-names>F</given-names>
</name>
<name><surname>Winkler</surname>
<given-names>A</given-names>
</name>
<name><surname>Albersmeier</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Deeply sequenced metagenome and metatranscriptome of a biogas-producing microbial community from an agricultural production-scale biogas plant</article-title>
<source>Gigascience</source>
<year>2015</year>
<volume>4</volume>
<fpage>33</fpage>
<pub-id pub-id-type="doi">10.1186/s13742-015-0073-6</pub-id>
<pub-id pub-id-type="pmid">26229594</pub-id>
</element-citation>
</ref>
<ref id="CR96"><label>96.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Belmann</surname>
<given-names>P</given-names>
</name>
<name><surname>Dröge</surname>
<given-names>J</given-names>
</name>
<name><surname>Bremges</surname>
<given-names>A</given-names>
</name>
<name><surname>McHardy</surname>
<given-names>AC</given-names>
</name>
<name><surname>Sczyrba</surname>
<given-names>A</given-names>
</name>
<name><surname>Barton</surname>
<given-names>MD</given-names>
</name>
</person-group>
<article-title>Bioboxes: standardised containers for interchangeable bioinformatics software</article-title>
<source>Gigascience</source>
<year>2015</year>
<volume>4</volume>
<fpage>47</fpage>
<pub-id pub-id-type="doi">10.1186/s13742-015-0087-0</pub-id>
<pub-id pub-id-type="pmid">26473029</pub-id>
</element-citation>
</ref>
<ref id="CR97"><label>97.</label>
<mixed-citation publication-type="other">Barton M. nucleotides · genome assembler benchmarking [Internet]. [cited 2015 Nov 20]. Available from: <ext-link ext-link-type="uri" xlink:href="http://nucleotid.es">http://nucleotid.es</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR98"><label>98.</label>
<element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Hones</surname>
<given-names>MJ</given-names>
</name>
</person-group>
<source>Reproducibility as a Methodological Imperative in Experimental Research. PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association. Philosophy of Science Association</source>
<year>1990</year>
<fpage>585</fpage>
<lpage>99</lpage>
</element-citation>
</ref>
<ref id="CR99"><label>99.</label>
<mixed-citation publication-type="other">Rosenberg DM, Horn CC. Neurophysiological analytics for all! Free open-source software tools for documenting, analyzing, visualizing, and sharing using electronic notebooks. J Neurophysiol. American Physiological Society; Apr 2016; jn.00137.2016.</mixed-citation>
</ref>
<ref id="CR100"><label>100.</label>
<mixed-citation publication-type="other">everware [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="https://github.com/everware/everware">https://github.com/everware/everware</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR101"><label>101.</label>
<mixed-citation publication-type="other">Crick T. “Share and Enjoy”: Publishing Useful and Usable Scientific Models. Available from: <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1409.0367v2">http://arxiv.org/abs/1409.0367v2</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR102"><label>102.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Donoho</surname>
<given-names>DL</given-names>
</name>
</person-group>
<article-title>An invitation to reproducible computational research</article-title>
<source>Biostatistics</source>
<year>2010</year>
<volume>11</volume>
<fpage>385</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="doi">10.1093/biostatistics/kxq028</pub-id>
<pub-id pub-id-type="pmid">20538873</pub-id>
</element-citation>
</ref>
<ref id="CR103"><label>103.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Goldberg</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>What every computer scientist should know about floating-point arithmetic</article-title>
<source>ACM Comput Surv</source>
<year>1991</year>
<volume>23</volume>
<fpage>5</fpage>
<lpage>48</lpage>
<pub-id pub-id-type="doi">10.1145/103162.103163</pub-id>
</element-citation>
</ref>
<ref id="CR104"><label>104.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shirts</surname>
<given-names>M</given-names>
</name>
<name><surname>Pande</surname>
<given-names>VS</given-names>
</name>
</person-group>
<article-title>COMPUTING: screen savers of the world unite!</article-title>
<source>Science</source>
<year>2000</year>
<volume>290</volume>
<fpage>1903</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="doi">10.1126/science.290.5498.1903</pub-id>
<pub-id pub-id-type="pmid">17742054</pub-id>
</element-citation>
</ref>
<ref id="CR105"><label>105.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bird</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>Computing for the Large Hadron Collider</article-title>
<source>Annu Rev Nucl Part Sci</source>
<year>2011</year>
<volume>61</volume>
<fpage>99</fpage>
<lpage>118</lpage>
<pub-id pub-id-type="doi">10.1146/annurev-nucl-102010-130059</pub-id>
</element-citation>
</ref>
<ref id="CR106"><label>106.</label>
<mixed-citation publication-type="other">Anderson DP. BOINC: A System for Public Resource Computing and Storage. Proceedings of the Fifth IEEE/ACM International Workshop on Grid Computing (GRID’04). 2004.</mixed-citation>
</ref>
<ref id="CR107"><label>107.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ransohoff</surname>
<given-names>DF</given-names>
</name>
</person-group>
<article-title>Bias as a threat to the validity of cancer molecular-marker research</article-title>
<source>Nat Rev Cancer</source>
<year>2005</year>
<volume>5</volume>
<fpage>142</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1038/nrc1550</pub-id>
<pub-id pub-id-type="pmid">15685197</pub-id>
</element-citation>
</ref>
<ref id="CR108"><label>108.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bild</surname>
<given-names>AH</given-names>
</name>
<name><surname>Chang</surname>
<given-names>JT</given-names>
</name>
<name><surname>Johnson</surname>
<given-names>WE</given-names>
</name>
<name><surname>Piccolo</surname>
<given-names>SR</given-names>
</name>
</person-group>
<article-title>A field guide to genomics research</article-title>
<source>PLoS Biol</source>
<year>2014</year>
<volume>12</volume>
<fpage>e1001744</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pbio.1001744</pub-id>
<pub-id pub-id-type="pmid">24409093</pub-id>
</element-citation>
</ref>
<ref id="CR109"><label>109.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Köster</surname>
<given-names>J</given-names>
</name>
<name><surname>Rahmann</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Snakemake—a scalable bioinformatics workflow engine</article-title>
<source>Bioinformatics</source>
<year>2012</year>
<volume>28</volume>
<fpage>2520</fpage>
<lpage>2</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bts480</pub-id>
<pub-id pub-id-type="pmid">22908215</pub-id>
</element-citation>
</ref>
<ref id="CR110"><label>110.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sadedin</surname>
<given-names>SP</given-names>
</name>
<name><surname>Pope</surname>
<given-names>B</given-names>
</name>
<name><surname>Oshlack</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Bpipe: a tool for running and managing bioinformatics pipelines</article-title>
<source>Bioinformatics</source>
<year>2012</year>
<volume>28</volume>
<fpage>1525</fpage>
<lpage>6</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bts167</pub-id>
<pub-id pub-id-type="pmid">22500002</pub-id>
</element-citation>
</ref>
<ref id="CR111"><label>111.</label>
<mixed-citation publication-type="other">Tange O. GNU Parallel - The Command-Line Power Tool. ;login: The USENIX Magazine. Frederiksberg, Denmark; 2011;36:42–7.</mixed-citation>
</ref>
<ref id="CR112"><label>112.</label>
<mixed-citation publication-type="other">Albrecht M, Donnelly P, Bui P, Thain D. Makeflow: A portable abstraction for data intensive computing on clusters, clouds, and grids. Proceedings of the 1st ACM SIGMOD Workshop on Scalable Workflow Execution Engines and Technologies. 2012.</mixed-citation>
</ref>
<ref id="CR113"><label>113.</label>
<mixed-citation publication-type="other">Knight S, Austin C, Crain C, Leblanc S, Roach A. Scons software construction tool [Internet]. 2011. Available from: <ext-link ext-link-type="uri" xlink:href="http://www.scons.org">http://www.scons.org</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR114"><label>114.</label>
<mixed-citation publication-type="other">Altintas I, Berkley C, Jaeger E, Jones M, Ludascher B, Mock S. Kepler: an extensible system for design and execution of scientific workflows. Proceedings. 16th International Conference on Scientific and Statistical Database Management, 2004. IEEE; 2004. p. 423–4.</mixed-citation>
</ref>
<ref id="CR115"><label>115.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Goff</surname>
<given-names>SA</given-names>
</name>
<name><surname>Vaughn</surname>
<given-names>M</given-names>
</name>
<name><surname>McKay</surname>
<given-names>S</given-names>
</name>
<name><surname>Lyons</surname>
<given-names>E</given-names>
</name>
<name><surname>Stapleton</surname>
<given-names>AE</given-names>
</name>
<name><surname>Gessler</surname>
<given-names>D</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The iPlant collaborative: cyberinfrastructure for plant biology</article-title>
<source>Front Plant Sci</source>
<year>2011</year>
<volume>2</volume>
<fpage>34</fpage>
</element-citation>
</ref>
<ref id="CR116"><label>116.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Reich</surname>
<given-names>M</given-names>
</name>
<name><surname>Liefeld</surname>
<given-names>T</given-names>
</name>
<name><surname>Gould</surname>
<given-names>J</given-names>
</name>
<name><surname>Lerner</surname>
<given-names>J</given-names>
</name>
<name><surname>Tamayo</surname>
<given-names>P</given-names>
</name>
<name><surname>Mesirov</surname>
<given-names>JP</given-names>
</name>
</person-group>
<article-title>GenePattern 2.0</article-title>
<source>Nat Genet</source>
<year>2006</year>
<volume>38</volume>
<fpage>500</fpage>
<lpage>1</lpage>
<pub-id pub-id-type="doi">10.1038/ng0506-500</pub-id>
<pub-id pub-id-type="pmid">16642009</pub-id>
</element-citation>
</ref>
<ref id="CR117"><label>117.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Reich</surname>
<given-names>M</given-names>
</name>
<name><surname>Liefeld</surname>
<given-names>T</given-names>
</name>
<name><surname>Thorvaldsdottir</surname>
<given-names>H</given-names>
</name>
<name><surname>Ocana</surname>
<given-names>M</given-names>
</name>
<name><surname>Polk</surname>
<given-names>E</given-names>
</name>
<name><surname>Jang</surname>
<given-names>D</given-names>
</name>
<etal></etal>
</person-group>
<article-title>GenomeSpace: An environment for frictionless bioinformatics</article-title>
<source>Cancer Res</source>
<year>2012</year>
<volume>72</volume>
<fpage>3966</fpage>
<lpage>3966</lpage>
<pub-id pub-id-type="doi">10.1158/1538-7445.AM2012-3966</pub-id>
</element-citation>
</ref>
<ref id="CR118"><label>118.</label>
<mixed-citation publication-type="other">GenePattern: A platform for reproducible bioinformatics [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="http://www.broadinstitute.org/cancer/software/genepattern">http://www.broadinstitute.org/cancer/software/genepattern</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR119"><label>119.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wolstencroft</surname>
<given-names>K</given-names>
</name>
<name><surname>Haines</surname>
<given-names>R</given-names>
</name>
<name><surname>Fellows</surname>
<given-names>D</given-names>
</name>
<name><surname>Williams</surname>
<given-names>A</given-names>
</name>
<name><surname>Withers</surname>
<given-names>D</given-names>
</name>
<name><surname>Owen</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The Taverna workflow suite: designing and executing workflows of Web Services on the desktop, web or in the cloud</article-title>
<source>Nucleic Acids Res</source>
<year>2013</year>
<volume>41</volume>
<fpage>557</fpage>
<lpage>61</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkt328</pub-id>
</element-citation>
</ref>
<ref id="CR120"><label>120.</label>
<element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rex</surname>
<given-names>DE</given-names>
</name>
<name><surname>Ma</surname>
<given-names>JQ</given-names>
</name>
<name><surname>Toga</surname>
<given-names>AW</given-names>
</name>
</person-group>
<article-title>The LONI pipeline processing environment</article-title>
<source>Neuroimage</source>
<year>2003</year>
<volume>19</volume>
<fpage>1033</fpage>
<lpage>48</lpage>
<pub-id pub-id-type="doi">10.1016/S1053-8119(03)00185-X</pub-id>
<pub-id pub-id-type="pmid">12880830</pub-id>
</element-citation>
</ref>
<ref id="CR121"><label>121.</label>
<mixed-citation publication-type="other">LONI Pipeline Processing Environment [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="http://www.loni.usc.edu/Software/Pipeline">http://www.loni.usc.edu/Software/Pipeline</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR122"><label>122.</label>
<mixed-citation publication-type="other">Vortex [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="https://github.com/websecurify/node-vortex">https://github.com/websecurify/node-vortex</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR123"><label>123.</label>
<mixed-citation publication-type="other">Amazon Web Services [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="http://aws.amazon.com">http://aws.amazon.com</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR124"><label>124.</label>
<mixed-citation publication-type="other">Google Cloud Platform [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="https://cloud.google.com/compute">https://cloud.google.com/compute</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR125"><label>125.</label>
<mixed-citation publication-type="other">Microsoft Azure [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="https://azure.microsoft.com">https://azure.microsoft.com</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR126"><label>126.</label>
<mixed-citation publication-type="other">lmctfy - Let Me Contain That For You [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="https://github.com/google/lmctfy">https://github.com/google/lmctfy</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
<ref id="CR127"><label>127.</label>
<mixed-citation publication-type="other">Warden [Internet]. 2016. Available from <ext-link ext-link-type="uri" xlink:href="http://docs.cloudfoundry.org/concepts/architecture/warden.html">http://docs.cloudfoundry.org/concepts/architecture/warden.html</ext-link>
. Accessed 1 March 2016.</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000183 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000183 | SxmlIndent | more
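
As a rough illustration of post-processing the XML that the commands above print, the following Python sketch lists the DOI attached to each `<ref>` entry. It is only a sketch under stated assumptions: `SAMPLE` and `dois_by_ref` are hypothetical names introduced here, the sample is a two-entry fragment abridged from the reference list in this record, and the real output of HfdSelect may need its namespace prefixes (for example `xlink`) declared before a standard XML parser accepts it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two entries abridged from the reference list of this
# record, wrapped in a <ref-list> root so the fragment is well-formed.
SAMPLE = """<ref-list>
<ref id="CR65"><label>65.</label>
<element-citation publication-type="journal">
<article-title>Literate programming</article-title>
<source>Comput J</source>
<year>1984</year>
<pub-id pub-id-type="doi">10.1093/comjnl/27.2.97</pub-id>
</element-citation>
</ref>
<ref id="CR68"><label>68.</label>
<mixed-citation publication-type="other">Xie Y. Dynamic Documents with R and knitr. Boca Raton, FL: CRC Press; 2013. p. 216.</mixed-citation>
</ref>
</ref-list>"""


def dois_by_ref(xml_text):
    """Map each <ref> id to its DOI, or to None when the entry has no DOI."""
    root = ET.fromstring(xml_text)
    result = {}
    for ref in root.iter("ref"):
        doi = None
        # Walk every <pub-id> under this reference and keep the DOI one.
        for pub_id in ref.iter("pub-id"):
            if pub_id.get("pub-id-type") == "doi":
                doi = pub_id.text
                break
        result[ref.get("id")] = doi
    return result


if __name__ == "__main__":
    for ref_id, doi in dois_by_ref(SAMPLE).items():
        print(ref_id, doi or "(no DOI)")
```

One possible use, again as an assumption rather than a documented Dilib workflow, is to redirect the output of the HfdSelect pipeline to a file and pass that file's contents to `dois_by_ref` in place of `SAMPLE`.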

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4940747
   |texte=   Tools and techniques for computational reproducibility
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:27401684" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024

	Cyberinfrastructure exploration server
	Warning: this site is under development! It was generated by automated means from raw corpora, so the information has not been validated.
