Cyberinfrastructure Exploration Server

Internal identifier: 000160 (Pmc/Corpus)

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">ECL: an exhaustive search tool for the identification of cross-linked peptides using whole database</title>
<author>
<name sortKey="Yu, Fengchao" sort="Yu, Fengchao" uniqKey="Yu F" first="Fengchao" last="Yu">Fengchao Yu</name>
<affiliation>
<nlm:aff id="Aff1">Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Li, Ning" sort="Li, Ning" uniqKey="Li N" first="Ning" last="Li">Ning Li</name>
<affiliation>
<nlm:aff id="Aff2">Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Yu, Weichuan" sort="Yu, Weichuan" uniqKey="Yu W" first="Weichuan" last="Yu">Weichuan Yu</name>
<affiliation>
<nlm:aff id="Aff1">Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff3">Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27206479</idno>
<idno type="pmc">4874008</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4874008</idno>
<idno type="RBID">PMC:4874008</idno>
<idno type="doi">10.1186/s12859-016-1073-y</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000160</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">ECL: an exhaustive search tool for the identification of cross-linked peptides using whole database</title>
<author>
<name sortKey="Yu, Fengchao" sort="Yu, Fengchao" uniqKey="Yu F" first="Fengchao" last="Yu">Fengchao Yu</name>
<affiliation>
<nlm:aff id="Aff1">Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Li, Ning" sort="Li, Ning" uniqKey="Li N" first="Ning" last="Li">Ning Li</name>
<affiliation>
<nlm:aff id="Aff2">Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Yu, Weichuan" sort="Yu, Weichuan" uniqKey="Yu W" first="Weichuan" last="Yu">Weichuan Yu</name>
<affiliation>
<nlm:aff id="Aff1">Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff3">Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Chemical cross-linking combined with mass spectrometry (CX-MS) is a high-throughput approach to studying protein-protein interactions. The number of peptide-peptide combinations grows quadratically with the number of proteins, resulting in high computational complexity. Widely used methods including xQuest (Rinner et al., Nat Methods 5(4):315–8, 2008; Walzthoeni et al., Nat Methods 9(9):901–3, 2012), pLink (Yang et al., Nat Methods 9(9):904–6, 2012), ProteinProspector (Chu et al., Mol Cell Proteomics 9:25–31, 2010; Trnka et al., Mol Cell Proteomics 13(2):420–34, 2014) and Kojak (Hoopmann et al., J Proteome Res 14(5):2190–8, 2015) avoid searching all peptide-peptide combinations by pre-selecting peptides with heuristic approaches. However, pre-selection procedures may cause true matches to be missed. The most direct alternative is to search all possible candidates. A tool that can exhaustively search a whole database without any heuristic pre-selection procedure is therefore desirable.</p>
</sec>
<sec>
<title>Results</title>
<p>We have developed a cross-linked peptide identification tool named ECL. It can exhaustively search a whole database in a reasonable period of time without any heuristic pre-selection procedure. Tests showed that searching a database containing 5200 proteins took 7 h.</p>
<p>ECL identified more non-redundant cross-linked peptides than xQuest, pLink, and ProteinProspector. Experiments showed that about 30
<italic>%</italic>
of these additionally identified peptides were not pre-selected by Kojak. We used protein crystal structures from the Protein Data Bank to check the intra-protein cross-linked peptides. Most of the distances between cross-linking sites were smaller than 30 Å.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>To the best of our knowledge, ECL is the first tool that can exhaustively search all candidates in cross-linked peptide identification. The experiments showed that ECL could identify more peptides than xQuest, pLink, and ProteinProspector. A further analysis indicated that some of the additionally identified results were attributable to the exhaustive search.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s12859-016-1073-y) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Young, Mm" uniqKey="Young M">MM Young</name>
</author>
<author>
<name sortKey="Tang, N" uniqKey="Tang N">N Tang</name>
</author>
<author>
<name sortKey="Hempel, Jc" uniqKey="Hempel J">JC Hempel</name>
</author>
<author>
<name sortKey="Oshiro, Cm" uniqKey="Oshiro C">CM Oshiro</name>
</author>
<author>
<name sortKey="Taylor, Ew" uniqKey="Taylor E">EW Taylor</name>
</author>
<author>
<name sortKey="Kuntz, Id" uniqKey="Kuntz I">ID Kuntz</name>
</author>
<author>
<name sortKey="Gibson, Bw" uniqKey="Gibson B">BW Gibson</name>
</author>
<author>
<name sortKey="Dollinger, G" uniqKey="Dollinger G">G Dollinger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schilling, B" uniqKey="Schilling B">B Schilling</name>
</author>
<author>
<name sortKey="Row, Rh" uniqKey="Row R">RH Row</name>
</author>
<author>
<name sortKey="Gibsonb, Bw" uniqKey="Gibsonb B">BW Gibson</name>
</author>
<author>
<name sortKey="Guo, X" uniqKey="Guo X">X Guo</name>
</author>
<author>
<name sortKey="Young, Mm" uniqKey="Young M">MM Young</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chu, F" uniqKey="Chu F">F Chu</name>
</author>
<author>
<name sortKey="Shan, S O" uniqKey="Shan S">S-o Shan</name>
</author>
<author>
<name sortKey="Moustakas, Dt" uniqKey="Moustakas D">DT Moustakas</name>
</author>
<author>
<name sortKey="Alber, F" uniqKey="Alber F">F Alber</name>
</author>
<author>
<name sortKey="Egea, Pf" uniqKey="Egea P">PF Egea</name>
</author>
<author>
<name sortKey="Stroud, Rm" uniqKey="Stroud R">RM Stroud</name>
</author>
<author>
<name sortKey="Walter, P" uniqKey="Walter P">P Walter</name>
</author>
<author>
<name sortKey="Burlingame, Al" uniqKey="Burlingame A">AL Burlingame</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tang, Y" uniqKey="Tang Y">Y Tang</name>
</author>
<author>
<name sortKey="Chen, Y" uniqKey="Chen Y">Y Chen</name>
</author>
<author>
<name sortKey="Lichti, C" uniqKey="Lichti C">C Lichti</name>
</author>
<author>
<name sortKey="Hall, R" uniqKey="Hall R">R Hall</name>
</author>
<author>
<name sortKey="Raney, K" uniqKey="Raney K">K Raney</name>
</author>
<author>
<name sortKey="Jennings, S" uniqKey="Jennings S">S Jennings</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ihling, C" uniqKey="Ihling C">C Ihling</name>
</author>
<author>
<name sortKey="Schmidt, A" uniqKey="Schmidt A">A Schmidt</name>
</author>
<author>
<name sortKey="Kalkhof, S" uniqKey="Kalkhof S">S Kalkhof</name>
</author>
<author>
<name sortKey="Schulz, Dm" uniqKey="Schulz D">DM Schulz</name>
</author>
<author>
<name sortKey="Stingl, C" uniqKey="Stingl C">C Stingl</name>
</author>
<author>
<name sortKey="Mechtler, K" uniqKey="Mechtler K">K Mechtler</name>
</author>
<author>
<name sortKey="Haack, M" uniqKey="Haack M">M Haack</name>
</author>
<author>
<name sortKey="Beck Sickinger, Ag" uniqKey="Beck Sickinger A">AG Beck-Sickinger</name>
</author>
<author>
<name sortKey="Cooper, Dm" uniqKey="Cooper D">DM Cooper</name>
</author>
<author>
<name sortKey="Sinz, A" uniqKey="Sinz A">A Sinz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Koning, Lj" uniqKey="Koning L">LJ Koning</name>
</author>
<author>
<name sortKey="Kasper, Pt" uniqKey="Kasper P">PT Kasper</name>
</author>
<author>
<name sortKey="Back, Jw" uniqKey="Back J">JW Back</name>
</author>
<author>
<name sortKey="Nessen, Ma" uniqKey="Nessen M">MA Nessen</name>
</author>
<author>
<name sortKey="Vanrobaeys, F" uniqKey="Vanrobaeys F">F Vanrobaeys</name>
</author>
<author>
<name sortKey="Beeumen, J" uniqKey="Beeumen J">J Beeumen</name>
</author>
<author>
<name sortKey="Gherardi, E" uniqKey="Gherardi E">E Gherardi</name>
</author>
<author>
<name sortKey="Koster, Cg" uniqKey="Koster C">CG Koster</name>
</author>
<author>
<name sortKey="Jong, L" uniqKey="Jong L">L Jong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Maiolica, A" uniqKey="Maiolica A">A Maiolica</name>
</author>
<author>
<name sortKey="Cittaro, D" uniqKey="Cittaro D">D Cittaro</name>
</author>
<author>
<name sortKey="Borsotti, D" uniqKey="Borsotti D">D Borsotti</name>
</author>
<author>
<name sortKey="Sennels, L" uniqKey="Sennels L">L Sennels</name>
</author>
<author>
<name sortKey="Ciferri, C" uniqKey="Ciferri C">C Ciferri</name>
</author>
<author>
<name sortKey="Tarricone, C" uniqKey="Tarricone C">C Tarricone</name>
</author>
<author>
<name sortKey="Musacchio, A" uniqKey="Musacchio A">A Musacchio</name>
</author>
<author>
<name sortKey="Rappsilber, J" uniqKey="Rappsilber J">J Rappsilber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lee, Yj" uniqKey="Lee Y">YJ Lee</name>
</author>
<author>
<name sortKey="Lackner, Ll" uniqKey="Lackner L">LL Lackner</name>
</author>
<author>
<name sortKey="Nunnari, Jm" uniqKey="Nunnari J">JM Nunnari</name>
</author>
<author>
<name sortKey="Phinney, Bs" uniqKey="Phinney B">BS Phinney</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Singh, P" uniqKey="Singh P">P Singh</name>
</author>
<author>
<name sortKey="Shaffer, Sa" uniqKey="Shaffer S">SA Shaffer</name>
</author>
<author>
<name sortKey="Scherl, A" uniqKey="Scherl A">A Scherl</name>
</author>
<author>
<name sortKey="Holman, C" uniqKey="Holman C">C Holman</name>
</author>
<author>
<name sortKey="Pfuetzner, Ra" uniqKey="Pfuetzner R">RA Pfuetzner</name>
</author>
<author>
<name sortKey="Freeman, Tjl" uniqKey="Freeman T">TJL Freeman</name>
</author>
<author>
<name sortKey="Miller, Si" uniqKey="Miller S">SI Miller</name>
</author>
<author>
<name sortKey="Hernandez, P" uniqKey="Hernandez P">P Hernandez</name>
</author>
<author>
<name sortKey="Appel, Rd" uniqKey="Appel R">RD Appel</name>
</author>
<author>
<name sortKey="Goodlett, Dr" uniqKey="Goodlett D">DR Goodlett</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yu, Et" uniqKey="Yu E">ET Yu</name>
</author>
<author>
<name sortKey="Hawkins, A" uniqKey="Hawkins A">A Hawkins</name>
</author>
<author>
<name sortKey="Kuntz, Id" uniqKey="Kuntz I">ID Kuntz</name>
</author>
<author>
<name sortKey="Rahn, La" uniqKey="Rahn L">LA Rahn</name>
</author>
<author>
<name sortKey="Rothfuss, A" uniqKey="Rothfuss A">A Rothfuss</name>
</author>
<author>
<name sortKey="Sale, K" uniqKey="Sale K">K Sale</name>
</author>
<author>
<name sortKey="Young, Mm" uniqKey="Young M">MM Young</name>
</author>
<author>
<name sortKey="Yang, Cl" uniqKey="Yang C">CL Yang</name>
</author>
<author>
<name sortKey="Pancerella, Cm" uniqKey="Pancerella C">CM Pancerella</name>
</author>
<author>
<name sortKey="Fabris, D" uniqKey="Fabris D">D Fabris</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nadeau, Ow" uniqKey="Nadeau O">OW Nadeau</name>
</author>
<author>
<name sortKey="Wyckoff, Gj" uniqKey="Wyckoff G">GJ Wyckoff</name>
</author>
<author>
<name sortKey="Paschall, Je" uniqKey="Paschall J">JE Paschall</name>
</author>
<author>
<name sortKey="Artigues, A" uniqKey="Artigues A">A Artigues</name>
</author>
<author>
<name sortKey="Sage, J" uniqKey="Sage J">J Sage</name>
</author>
<author>
<name sortKey="Villar, Mt" uniqKey="Villar M">MT Villar</name>
</author>
<author>
<name sortKey="Carlson, Gm" uniqKey="Carlson G">GM Carlson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Panchaud, A" uniqKey="Panchaud A">A Panchaud</name>
</author>
<author>
<name sortKey="Singh, P" uniqKey="Singh P">P Singh</name>
</author>
<author>
<name sortKey="Shaffer, Sa" uniqKey="Shaffer S">SA Shaffer</name>
</author>
<author>
<name sortKey="Goodlett, Dr" uniqKey="Goodlett D">DR Goodlett</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcilwain, S" uniqKey="Mcilwain S">S McIlwain</name>
</author>
<author>
<name sortKey="Draghicescu, P" uniqKey="Draghicescu P">P Draghicescu</name>
</author>
<author>
<name sortKey="Singh, P" uniqKey="Singh P">P Singh</name>
</author>
<author>
<name sortKey="Goodlett, Dr" uniqKey="Goodlett D">DR Goodlett</name>
</author>
<author>
<name sortKey="Noble, Ws" uniqKey="Noble W">WS Noble</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Du, X" uniqKey="Du X">X Du</name>
</author>
<author>
<name sortKey="Chowdhury, Sm" uniqKey="Chowdhury S">SM Chowdhury</name>
</author>
<author>
<name sortKey="Manes, Np" uniqKey="Manes N">NP Manes</name>
</author>
<author>
<name sortKey="Wu, S" uniqKey="Wu S">S Wu</name>
</author>
<author>
<name sortKey="Mayer, Mu" uniqKey="Mayer M">MU Mayer</name>
</author>
<author>
<name sortKey="Adkins, Jn" uniqKey="Adkins J">JN Adkins</name>
</author>
<author>
<name sortKey="Anderson, Ga" uniqKey="Anderson G">GA Anderson</name>
</author>
<author>
<name sortKey="Smith, Rd" uniqKey="Smith R">RD Smith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Holding, An" uniqKey="Holding A">AN Holding</name>
</author>
<author>
<name sortKey="Lamers, Mh" uniqKey="Lamers M">MH Lamers</name>
</author>
<author>
<name sortKey="Stephens, E" uniqKey="Stephens E">E Stephens</name>
</author>
<author>
<name sortKey="Skehel, Jm" uniqKey="Skehel J">JM Skehel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mueller Planitz, F" uniqKey="Mueller Planitz F">F Mueller-Planitz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Petrotchenko, Ev" uniqKey="Petrotchenko E">EV Petrotchenko</name>
</author>
<author>
<name sortKey="Borchers, Ch" uniqKey="Borchers C">CH Borchers</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Petrotchenko, Ev" uniqKey="Petrotchenko E">EV Petrotchenko</name>
</author>
<author>
<name sortKey="Serpa, Jj" uniqKey="Serpa J">JJ Serpa</name>
</author>
<author>
<name sortKey="Borchers, Ch" uniqKey="Borchers C">CH Borchers</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kaake, Rm" uniqKey="Kaake R">RM Kaake</name>
</author>
<author>
<name sortKey="Wang, X" uniqKey="Wang X">X Wang</name>
</author>
<author>
<name sortKey="Burke, A" uniqKey="Burke A">A Burke</name>
</author>
<author>
<name sortKey="Yu, C" uniqKey="Yu C">C Yu</name>
</author>
<author>
<name sortKey="Kandur, W" uniqKey="Kandur W">W Kandur</name>
</author>
<author>
<name sortKey="Yang, Y" uniqKey="Yang Y">Y Yang</name>
</author>
<author>
<name sortKey="Novtisky, Ej" uniqKey="Novtisky E">EJ Novitsky</name>
</author>
<author>
<name sortKey="Second, T" uniqKey="Second T">T Second</name>
</author>
<author>
<name sortKey="Duan, J" uniqKey="Duan J">J Duan</name>
</author>
<author>
<name sortKey="Kao, A" uniqKey="Kao A">A Kao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Herzog, F" uniqKey="Herzog F">F Herzog</name>
</author>
<author>
<name sortKey="Kahraman, A" uniqKey="Kahraman A">A Kahraman</name>
</author>
<author>
<name sortKey="Boehringer, D" uniqKey="Boehringer D">D Boehringer</name>
</author>
<author>
<name sortKey="Mak, R" uniqKey="Mak R">R Mak</name>
</author>
<author>
<name sortKey="Bracher, A" uniqKey="Bracher A">A Bracher</name>
</author>
<author>
<name sortKey="Walzthoeni, T" uniqKey="Walzthoeni T">T Walzthoeni</name>
</author>
<author>
<name sortKey="Leitner, A" uniqKey="Leitner A">A Leitner</name>
</author>
<author>
<name sortKey="Beck, M" uniqKey="Beck M">M Beck</name>
</author>
<author>
<name sortKey="Hartl, Fu" uniqKey="Hartl F">FU Hartl</name>
</author>
<author>
<name sortKey="Ban, N" uniqKey="Ban N">N Ban</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nguyen, Vq" uniqKey="Nguyen V">VQ Nguyen</name>
</author>
<author>
<name sortKey="Ranjan, A" uniqKey="Ranjan A">A Ranjan</name>
</author>
<author>
<name sortKey="Stengel, F" uniqKey="Stengel F">F Stengel</name>
</author>
<author>
<name sortKey="Wei, D" uniqKey="Wei D">D Wei</name>
</author>
<author>
<name sortKey="Aebersold, R" uniqKey="Aebersold R">R Aebersold</name>
</author>
<author>
<name sortKey="Wu, C" uniqKey="Wu C">C Wu</name>
</author>
<author>
<name sortKey="Leschziner, Ae" uniqKey="Leschziner A">AE Leschziner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Politis, A" uniqKey="Politis A">A Politis</name>
</author>
<author>
<name sortKey="Stengel, F" uniqKey="Stengel F">F Stengel</name>
</author>
<author>
<name sortKey="Hall, Z" uniqKey="Hall Z">Z Hall</name>
</author>
<author>
<name sortKey="Hernandez, H" uniqKey="Hernandez H">H Hernández</name>
</author>
<author>
<name sortKey="Leitner, A" uniqKey="Leitner A">A Leitner</name>
</author>
<author>
<name sortKey="Walzthoeni, T" uniqKey="Walzthoeni T">T Walzthoeni</name>
</author>
<author>
<name sortKey="Robinson, Cv" uniqKey="Robinson C">CV Robinson</name>
</author>
<author>
<name sortKey="Aebersold, R" uniqKey="Aebersold R">R Aebersold</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Greber, Bj" uniqKey="Greber B">BJ Greber</name>
</author>
<author>
<name sortKey="Boehringer, D" uniqKey="Boehringer D">D Boehringer</name>
</author>
<author>
<name sortKey="Leitner, A" uniqKey="Leitner A">A Leitner</name>
</author>
<author>
<name sortKey="Bieri, P" uniqKey="Bieri P">P Bieri</name>
</author>
<author>
<name sortKey="Voigts Hoffmann, F" uniqKey="Voigts Hoffmann F">F Voigts-Hoffmann</name>
</author>
<author>
<name sortKey="Erzberger, Jp" uniqKey="Erzberger J">JP Erzberger</name>
</author>
<author>
<name sortKey="Leibundgut, M" uniqKey="Leibundgut M">M Leibundgut</name>
</author>
<author>
<name sortKey="Aebersold, R" uniqKey="Aebersold R">R Aebersold</name>
</author>
<author>
<name sortKey="Ban, N" uniqKey="Ban N">N Ban</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rinner, O" uniqKey="Rinner O">O Rinner</name>
</author>
<author>
<name sortKey="Seebacher, J" uniqKey="Seebacher J">J Seebacher</name>
</author>
<author>
<name sortKey="Walzthoeni, T" uniqKey="Walzthoeni T">T Walzthoeni</name>
</author>
<author>
<name sortKey="Mueller, L" uniqKey="Mueller L">L Mueller</name>
</author>
<author>
<name sortKey="Beck, M" uniqKey="Beck M">M Beck</name>
</author>
<author>
<name sortKey="Schmidt, A" uniqKey="Schmidt A">A Schmidt</name>
</author>
<author>
<name sortKey="Mueller, M" uniqKey="Mueller M">M Mueller</name>
</author>
<author>
<name sortKey="Aebersold, R" uniqKey="Aebersold R">R Aebersold</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Walzthoeni, T" uniqKey="Walzthoeni T">T Walzthoeni</name>
</author>
<author>
<name sortKey="Claassen, M" uniqKey="Claassen M">M Claassen</name>
</author>
<author>
<name sortKey="Leitner, A" uniqKey="Leitner A">A Leitner</name>
</author>
<author>
<name sortKey="Herzog, F" uniqKey="Herzog F">F Herzog</name>
</author>
<author>
<name sortKey="Bohn, S" uniqKey="Bohn S">S Bohn</name>
</author>
<author>
<name sortKey="Forster, F" uniqKey="Forster F">F Förster</name>
</author>
<author>
<name sortKey="Beck, M" uniqKey="Beck M">M Beck</name>
</author>
<author>
<name sortKey="Aebersold, R" uniqKey="Aebersold R">R Aebersold</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yang, B" uniqKey="Yang B">B Yang</name>
</author>
<author>
<name sortKey="Wu, Yj" uniqKey="Wu Y">YJ Wu</name>
</author>
<author>
<name sortKey="Zhu, M" uniqKey="Zhu M">M Zhu</name>
</author>
<author>
<name sortKey="Fan, Sb" uniqKey="Fan S">SB Fan</name>
</author>
<author>
<name sortKey="Lin, J" uniqKey="Lin J">J Lin</name>
</author>
<author>
<name sortKey="Zhang, K" uniqKey="Zhang K">K Zhang</name>
</author>
<author>
<name sortKey="Li, S" uniqKey="Li S">S Li</name>
</author>
<author>
<name sortKey="Chi, H" uniqKey="Chi H">H Chi</name>
</author>
<author>
<name sortKey="Li, Yx" uniqKey="Li Y">YX Li</name>
</author>
<author>
<name sortKey="Chen, Hf" uniqKey="Chen H">HF Chen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chu, F" uniqKey="Chu F">F Chu</name>
</author>
<author>
<name sortKey="Baker, Pr" uniqKey="Baker P">PR Baker</name>
</author>
<author>
<name sortKey="Burlingame, Al" uniqKey="Burlingame A">AL Burlingame</name>
</author>
<author>
<name sortKey="Chalkley, Rj" uniqKey="Chalkley R">RJ Chalkley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Trnka, Mj" uniqKey="Trnka M">MJ Trnka</name>
</author>
<author>
<name sortKey="Baker, Pr" uniqKey="Baker P">PR Baker</name>
</author>
<author>
<name sortKey="Robinson, Pj" uniqKey="Robinson P">PJ Robinson</name>
</author>
<author>
<name sortKey="Burlingame, A" uniqKey="Burlingame A">A Burlingame</name>
</author>
<author>
<name sortKey="Chalkley, Rj" uniqKey="Chalkley R">RJ Chalkley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoopmann, Mr" uniqKey="Hoopmann M">MR Hoopmann</name>
</author>
<author>
<name sortKey="Zelter, A" uniqKey="Zelter A">A Zelter</name>
</author>
<author>
<name sortKey="Johnson, Rs" uniqKey="Johnson R">RS Johnson</name>
</author>
<author>
<name sortKey="Riffle, M" uniqKey="Riffle M">M Riffle</name>
</author>
<author>
<name sortKey="Maccoss, Mj" uniqKey="Maccoss M">MJ MacCoss</name>
</author>
<author>
<name sortKey="Davis, Tn" uniqKey="Davis T">TN Davis</name>
</author>
<author>
<name sortKey="Moritz, Rl" uniqKey="Moritz R">RL Moritz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Storey, Jd" uniqKey="Storey J">JD Storey</name>
</author>
<author>
<name sortKey="Tibshirani, R" uniqKey="Tibshirani R">R Tibshirani</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bohn, S" uniqKey="Bohn S">S Bohn</name>
</author>
<author>
<name sortKey="Beck, F" uniqKey="Beck F">F Beck</name>
</author>
<author>
<name sortKey="Sakata, E" uniqKey="Sakata E">E Sakata</name>
</author>
<author>
<name sortKey="Walzthoeni, T" uniqKey="Walzthoeni T">T Walzthoeni</name>
</author>
<author>
<name sortKey="Beck, M" uniqKey="Beck M">M Beck</name>
</author>
<author>
<name sortKey="Aebersold, R" uniqKey="Aebersold R">R Aebersold</name>
</author>
<author>
<name sortKey="Frster, F" uniqKey="Frster F">F Förster</name>
</author>
<author>
<name sortKey="Baumeister, W" uniqKey="Baumeister W">W Baumeister</name>
</author>
<author>
<name sortKey="Nickell, S" uniqKey="Nickell S">S Nickell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nesvizhskii, Ai" uniqKey="Nesvizhskii A">AI Nesvizhskii</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kertesz Farkas, A" uniqKey="Kertesz Farkas A">A Kertesz-Farkas</name>
</author>
<author>
<name sortKey="Keich, U" uniqKey="Keich U">U Keich</name>
</author>
<author>
<name sortKey="Noble, Ws" uniqKey="Noble W">WS Noble</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">27206479</article-id>
<article-id pub-id-type="pmc">4874008</article-id>
<article-id pub-id-type="publisher-id">1073</article-id>
<article-id pub-id-type="doi">10.1186/s12859-016-1073-y</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Software</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>ECL: an exhaustive search tool for the identification of cross-linked peptides using whole database</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Yu</surname>
<given-names>Fengchao</given-names>
</name>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Li</surname>
<given-names>Ning</given-names>
</name>
<address>
<email>boningli@ust.hk</email>
</address>
<xref ref-type="aff" rid="Aff2"></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Yu</surname>
<given-names>Weichuan</given-names>
</name>
<address>
<email>eeyu@ust.hk</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
<xref ref-type="aff" rid="Aff3"></xref>
</contrib>
<aff id="Aff1">
<label></label>
Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China</aff>
<aff id="Aff2">
<label></label>
Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China</aff>
<aff id="Aff3">
<label></label>
Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>20</day>
<month>5</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>20</day>
<month>5</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="collection">
<year>2016</year>
</pub-date>
<volume>17</volume>
<elocation-id>217</elocation-id>
<history>
<date date-type="received">
<day>20</day>
<month>8</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>7</day>
<month>5</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>© Yu et al. 2016</copyright-statement>
<license license-type="OpenAccess">
<license-p>
<bold>Open Access</bold>
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<abstract id="Abs1">
<sec>
<title>Background</title>
<p>Chemical cross-linking combined with mass spectrometry (CX-MS) is a high-throughput approach to studying protein-protein interactions. The number of peptide-peptide combinations grows quadratically with the number of proteins, resulting in high computational complexity. Widely used methods including xQuest (Rinner et al., Nat Methods 5(4):315–8, 2008; Walzthoeni et al., Nat Methods 9(9):901–3, 2012), pLink (Yang et al., Nat Methods 9(9):904–6, 2012), ProteinProspector (Chu et al., Mol Cell Proteomics 9:25–31, 2010; Trnka et al., Mol Cell Proteomics 13(2):420–34, 2014) and Kojak (Hoopmann et al., J Proteome Res 14(5):2190–8, 2015) avoid searching all peptide-peptide combinations by pre-selecting peptides with heuristic approaches. However, pre-selection procedures may cause true matches to be missed. The most direct alternative is to search all possible candidates. A tool that can exhaustively search a whole database without any heuristic pre-selection procedure is therefore desirable.</p>
</sec>
<sec>
<title>Results</title>
<p>We have developed a cross-linked peptide identification tool named ECL. It can exhaustively search a whole database in a reasonable period of time without any heuristic pre-selection procedure. Tests showed that searching a database containing 5200 proteins took 7 h.</p>
<p>ECL identified more non-redundant cross-linked peptides than xQuest, pLink, and ProteinProspector. Experiments showed that about 30
<italic>%</italic>
of these additionally identified peptides were not pre-selected by Kojak. We used protein crystal structures from the Protein Data Bank to check the intra-protein cross-linked peptides. Most of the distances between cross-linking sites were smaller than 30 Å.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>To the best of our knowledge, ECL is the first tool that can exhaustively search all candidates in cross-linked peptide identification. The experiments showed that ECL could identify more peptides than xQuest, pLink, and ProteinProspector. A further analysis indicated that some of the additionally identified results were attributable to the exhaustive search.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s12859-016-1073-y) contains supplementary material, which is available to authorized users.</p>
</sec>
</abstract>
<kwd-group xml:lang="en">
<title>Keywords</title>
<kwd>Cross-linking</kwd>
<kwd>Peptide identification</kwd>
<kwd>Database searching</kwd>
</kwd-group>
<custom-meta-group>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2016</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="Sec1">
<title>Background</title>
<p>Chemical cross-linking combined with mass spectrometry (CX-MS) is becoming a powerful approach to studying protein-protein interactions. In the CX-MS protocol, proteins are linked before digestion. Digested products include cross-linked peptides and conventional linear peptides. In this paper, we refer to conventional linear peptides as peptides if there is no ambiguity. Cross-linked peptides are two peptides linked by a chemical compound. The two peptides are referred to as chains, and the chemical compound is referred to as the cross-linker. In the database-search-based identification framework, the number of all possible peptide-peptide combinations grows quadratically with the number of proteins, which results in a large search space.</p>
<p>Many tools have been developed to identify cross-linked peptides. An incomplete list includes ASAP [
<xref ref-type="bibr" rid="CR1">1</xref>
], MS2Assign [
<xref ref-type="bibr" rid="CR2">2</xref>
], MS-Bridge [
<xref ref-type="bibr" rid="CR3">3</xref>
], CLPM [
<xref ref-type="bibr" rid="CR4">4</xref>
], GPMAW [
<xref ref-type="bibr" rid="CR5">5</xref>
], Virtual-MSLab [
<xref ref-type="bibr" rid="CR6">6</xref>
], XDB [
<xref ref-type="bibr" rid="CR7">7</xref>
], X!Link [
<xref ref-type="bibr" rid="CR8">8</xref>
], Popitam [
<xref ref-type="bibr" rid="CR9">9</xref>
], MS3D [
<xref ref-type="bibr" rid="CR10">10</xref>
], CrossSearch [
<xref ref-type="bibr" rid="CR11">11</xref>
], xComb [
<xref ref-type="bibr" rid="CR12">12</xref>
], crux [
<xref ref-type="bibr" rid="CR13">13</xref>
], Xlink-Identifier [
<xref ref-type="bibr" rid="CR14">14</xref>
], pLink [
<xref ref-type="bibr" rid="CR27">27</xref>
], Hekate [
<xref ref-type="bibr" rid="CR15">15</xref>
], ProteinProspector [
<xref ref-type="bibr" rid="CR28">28</xref>
,
<xref ref-type="bibr" rid="CR29">29</xref>
], Crossfinder [
<xref ref-type="bibr" rid="CR16">16</xref>
], and Kojak [
<xref ref-type="bibr" rid="CR30">30</xref>
]. Most of these tools adapt the workflow of conventional peptide identification tools, and the corresponding score functions, to the properties of cross-linked peptides. Because the search space is large, most of them pre-select the most promising candidates before scoring PSMs (peptide-spectrum matches). In order to reduce the search space, cleavable cross-linkers [
<xref ref-type="bibr" rid="CR17">17</xref>
–
<xref ref-type="bibr" rid="CR20">20</xref>
] have been developed to avoid generating peptide-peptide combinations during database searching. Peptides linked by this kind of cross-linker can be broken into two peptides during dissociation. Thus, the cross-linked peptide identification problem is converted to the conventional peptide identification problem.</p>
<p>Owing to their favorable chemical and biological properties, noncleavable amine-reactive cross-linkers (e.g. DSS (disuccinimidyl suberate) and BS3 (bis(sulfosuccinimidyl) suberate)) have been widely used recently [
<xref ref-type="bibr" rid="CR21">21</xref>
–
<xref ref-type="bibr" rid="CR24">24</xref>
]. Tools including xQuest [
<xref ref-type="bibr" rid="CR25">25</xref>
,
<xref ref-type="bibr" rid="CR26">26</xref>
], pLink [
<xref ref-type="bibr" rid="CR27">27</xref>
], ProteinProspector [
<xref ref-type="bibr" rid="CR28">28</xref>
,
<xref ref-type="bibr" rid="CR29">29</xref>
], and Kojak [
<xref ref-type="bibr" rid="CR30">30</xref>
] were proposed to identify peptides linked by this kind of cross-linker. They use preprocessing procedures to eliminate low-probability candidates before scoring. Given a spectrum, they compare it with the theoretical spectra of individual peptides to estimate heuristically how likely each peptide is to yield a high score. Peptides with low chances are eliminated. Eliminating some of the peptides before PSM scoring may result in false negatives. The most intuitive way to avoid such false negatives is to search all candidates exhaustively.</p>
<p>In this paper, we propose a new tool, named ECL (exhaustive cross-linked peptides identification), that can exhaustively search a whole database within a reasonable period of time. Experiments showed that more cross-linked peptides were identified thanks to exhaustive searching. For the purpose of visualization, we developed another tool, named ECLAnnotator, that converts ECL results into webpages. These webpages show annotated tandem mass spectra and matched/unmatched theoretical ions clearly.</p>
</sec>
<sec id="Sec2">
<title>Implementation</title>
<p>ECL is designed to identify peptides linked by noncleavable amine-reactive cross-linkers such as DSS and BS3. In the current version, ECL only supports CID (collision-induced dissociation). Given a peptide-peptide combination, ECL
<italic>in silico</italic>
fragments it into b-ions and y-ions with different charges. These ions form a theoretical spectrum in which the intensity of each peak is the number of ions with the corresponding mass-to-charge ratio. The tandem mass spectra produced by a mass spectrometer are referred to as experimental spectra in this paper. ECL uses the normalized cross correlation coefficient to measure the similarity between a theoretical spectrum and an experimental spectrum:
<disp-formula id="Equ1">
<label>1</label>
<alternatives>
<tex-math id="M1">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ score = \frac{X^{T} Y}{||X|| ||Y||}, $$ \end{document}</tex-math>
<mml:math id="M2">
<mml:mtext mathvariant="italic">score</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>X</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ1.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>where
<italic>X</italic>
is the theoretical spectrum,
<italic>Y</italic>
is the experimental spectrum, and
<italic>T</italic>
stands for vector transpose.</p>
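<p>As a minimal sketch of Eq. (1), the following Python function computes the normalized cross correlation of two binned spectra. The function name xcorr_score, the dictionary representation (m/z bin index mapped to intensity), and the example values are illustrative assumptions and are not taken from the ECL source code.</p>

```python
import math


def xcorr_score(theoretical, experimental):
    """Normalized cross correlation (Eq. 1) of two binned spectra.

    Each spectrum is a dict mapping an m/z bin index to an intensity;
    bins that are absent are treated as zero.
    """
    # X^T Y: only bins present in both spectra contribute.
    dot = sum(v * experimental.get(b, 0.0) for b, v in theoretical.items())
    norm_x = math.sqrt(sum(v * v for v in theoretical.values()))
    norm_y = math.sqrt(sum(v * v for v in experimental.values()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # assumption: an empty spectrum scores zero
    return dot / (norm_x * norm_y)


# Hypothetical toy spectra (bin index -> intensity).
x = {100: 1.0, 250: 2.0, 400: 1.0}
y = {100: 3.0, 250: 4.0, 512: 5.0}
print(xcorr_score(x, y))
```

<p>Because the score is a cosine similarity, it does not change when either spectrum is multiplied by a positive constant.</p>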
<p>Because the search space is large, we developed an efficient algorithm with a low memory requirement to score PSMs. Concretely, Eq. (
<xref rid="Equ1" ref-type="">1</xref>
) can be rewritten as:
<disp-formula id="Equ2">
<label>2</label>
<alternatives>
<tex-math id="M3">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $${} score = \frac{(X_{1} + X_{2})^{T} Y}{||X|| ||Y||} = \frac{{X_{1}^{T}} Y + {X_{2}^{T}} Y}{||X|| ||Y||} = \frac{{X_{1}^{T}} \tilde{Y} + {X_{2}^{T}} \tilde{Y}}{||X||}, $$ \end{document}</tex-math>
<mml:math id="M4">
<mml:mtext mathvariant="italic">score</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>X</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mi>Y</mml:mi>
<mml:mo>+</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>X</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>~</mml:mo>
</mml:mover>
<mml:mo>+</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>~</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>X</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ2.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>where
<italic>X</italic>
<sub>1</sub>
is the vector whose elements are contributed by the first chain,
<italic>X</italic>
<sub>2</sub>
is the vector whose elements are contributed by the second chain,
<italic>X</italic>
<sub>1</sub>
+
<italic>X</italic>
<sub>2</sub>
=
<italic>X</italic>
, and
<inline-formula id="IEq1">
<alternatives>
<tex-math id="M5">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\tilde {Y} = Y/||Y||$\end{document}</tex-math>
<mml:math id="M6">
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>~</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo>/</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:math>
<inline-graphic xlink:href="12859_2016_1073_Article_IEq1.gif"></inline-graphic>
</alternatives>
</inline-formula>
. ECL calculates
<inline-formula id="IEq2">
<alternatives>
<tex-math id="M7">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\tilde {Y}$\end{document}</tex-math>
<mml:math id="M8">
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>~</mml:mo>
</mml:mover>
</mml:math>
<inline-graphic xlink:href="12859_2016_1073_Article_IEq2.gif"></inline-graphic>
</alternatives>
</inline-formula>
before scoring PSMs, which greatly reduces the computational cost. Both
<italic>X</italic>
<sub>1</sub>
and
<italic>X</italic>
<sub>2</sub>
have linear ions containing one chain’s amino acids and cross-linking ions containing both chains’ amino acids (Fig.
<xref rid="Fig1" ref-type="fig">1</xref>
). Given an experimental spectrum and a chain, ECL can obtain this chain’s ion masses as
<disp-formula id="Equ3">
<label>3</label>
<alternatives>
<tex-math id="M9">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ x_{i} = \left\{ \begin{array}{ll} p - c + l_{i}, & cross-linking\ ion \\ l_{i}, & linear\ ion \end{array} \right., $$ \end{document}</tex-math>
<mml:math id="M10">
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfenced close="" open="{" separators="">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>p</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd>
<mml:mtext mathvariant="italic">cross</mml:mtext>
<mml:mo>-</mml:mo>
<mml:mtext mathvariant="italic">linking</mml:mtext>
<mml:mspace width="1em"></mml:mspace>
<mml:mtext mathvariant="italic">ion</mml:mtext>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd>
<mml:mtext mathvariant="italic">linear</mml:mtext>
<mml:mspace width="1em"></mml:mspace>
<mml:mtext mathvariant="italic">ion</mml:mtext>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ3.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
<fig id="Fig1">
<label>Fig. 1</label>
<caption>
<p>An illustration of cross-linked peptides’ dissociation pattern. Two chains’ lysines are linked. Green markers indicate linear ions, and red markers indicate cross-linking ions. A chain’s linear ions only contain that chain’s amino acids. A chain’s cross-linking ions contain that chain’s amino acids, a cross-linker, and another whole chain</p>
</caption>
<graphic xlink:href="12859_2016_1073_Fig1_HTML" id="MO1"></graphic>
</fig>
</p>
<p>where
<italic>i</italic>
is the ion index starting from 0,
<italic>x</italic>
<sub>
<italic>i</italic>
</sub>
is the
<italic>i</italic>
th ion’s mass,
<italic>p</italic>
is the experimental spectrum’s precursor mass,
<italic>c</italic>
is the chain’s mass, and
<italic>l</italic>
<sub>
<italic>i</italic>
</sub>
is the corresponding linear ion’s mass. Taking the first chain in Fig.
<xref rid="Fig1" ref-type="fig">1</xref>
as an example, the 4th b-ion is a cross-linking ion containing “EAKE” and “EVRKELDDLR” linked by a cross-linker. Thus, its corresponding linear b-ion is “EAKE”. Clearly,
<italic>p</italic>
−
<italic>c</italic>
is equal to the sum of the other chain’s mass and the cross-linker’s mass. We do not consider the difference between the experimental spectrum’s precursor mass and the theoretical spectrum’s precursor mass because the precursor mass tolerance is smaller than or equal to the tandem mass tolerance for almost all mass spectrometers. Given each ion’s mass, ECL calculates its corresponding mass-to-charge ratios with different charges. After getting all ions’ mass-to-charge ratios for one chain, ECL generates
<italic>X</italic>
<sub>1</sub>
or
<italic>X</italic>
<sub>2</sub>
. Given an experimental spectrum,
<inline-formula id="IEq3">
<alternatives>
<tex-math id="M11">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}${X_{1}^{T}} \tilde {Y}$\end{document}</tex-math>
<mml:math id="M12">
<mml:msubsup>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>~</mml:mo>
</mml:mover>
</mml:math>
<inline-graphic xlink:href="12859_2016_1073_Article_IEq3.gif"></inline-graphic>
</alternatives>
</inline-formula>
only needs to be calculated once for different
<italic>X</italic>
<sub>2</sub>
, which greatly reduces the computational cost.</p>
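<p>Eq. (3) can be illustrated with a short Python sketch. The helper names, the proton-mass constant, the treatment of ion masses as neutral masses, and the example b-ion ladder are assumptions made for illustration only; they are not taken from the ECL implementation.</p>

```python
PROTON_MASS = 1.007276  # Da, approximate (assumed constant)


def shifted_ion_masses(linear_masses, cross_linking_flags, precursor_mass, chain_mass):
    """Apply Eq. (3): x_i = p - c + l_i for cross-linking ions, l_i otherwise."""
    shift = precursor_mass - chain_mass  # other chain's mass plus the cross-linker's mass
    return [mass + shift if flag else mass
            for mass, flag in zip(linear_masses, cross_linking_flags)]


def mass_to_charge(neutral_mass, charge):
    """m/z of an ion carrying `charge` protons (assumes neutral input mass)."""
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical b-ion ladder of a chain whose third residue carries the link:
# the third and fourth b-ions contain the link site, the first two do not.
linear_b = [100.0, 200.0, 300.0, 400.0]
flags = [False, False, True, True]
ions = shifted_ion_masses(linear_b, flags, precursor_mass=1500.0, chain_mass=500.0)
print(ions)
print([mass_to_charge(x, 2) for x in ions])
```

<p>Only the precursor mass and the chain's own mass are needed, so a chain's ion masses can be generated without knowing which chain it is paired with.</p>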
<p>With the above optimization, ECL’s workflow is described as follows:
<list list-type="order">
<list-item>
<p>Indexing chains based on their masses.</p>
</list-item>
<list-item>
<p>Calculating ions’ masses for each chain.</p>
</list-item>
<list-item>
<p>Indexing experimental spectra based on their precursor masses.</p>
</list-item>
<list-item>
<p>Peak de-noising: eliminating peaks whose intensity value occurs most frequently in the spectrum.</p>
</list-item>
<list-item>
<p>Calculating
<inline-formula id="IEq4">
<alternatives>
<tex-math id="M13">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\tilde {Y} = Y/||Y||$\end{document}</tex-math>
<mml:math id="M14">
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>~</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo>/</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:math>
<inline-graphic xlink:href="12859_2016_1073_Article_IEq4.gif"></inline-graphic>
</alternatives>
</inline-formula>
for each experimental spectrum.</p>
</list-item>
<list-item>
<p>Finding the largest precursor mass from all experimental spectra.</p>
</list-item>
<list-item>
<p>Looping over all chains whose masses are smaller than or equal to half of the largest precursor mass in ascending order:
<list list-type="bullet">
<list-item>
<label>7.1</label>
<p>Finding all spectra whose precursor masses are larger than or equal to 2×
<italic>c</italic>
+
<italic>r</italic>
−
<italic>o</italic>
, where
<italic>r</italic>
is the cross-linker’s mass and
<italic>o</italic>
is the precursor mass tolerance.</p>
</list-item>
<list-item>
<label>7.2</label>
<p>Calculating ions’ masses using Eq. (
<xref rid="Equ3" ref-type="">3</xref>
), and using these masses to generate
<italic>X</italic>
<sub>1</sub>
.</p>
</list-item>
<list-item>
<label>7.3</label>
<p>Calculating
<inline-formula id="IEq5">
<alternatives>
<tex-math id="M15">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}${X_{1}^{T}} \tilde {Y}$\end{document}</tex-math>
<mml:math id="M16">
<mml:msubsup>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>~</mml:mo>
</mml:mover>
</mml:math>
<inline-graphic xlink:href="12859_2016_1073_Article_IEq5.gif"></inline-graphic>
</alternatives>
</inline-formula>
for each corresponding spectrum.</p>
</list-item>
<list-item>
<label>7.4</label>
<p>Finding all chains whose masses are within the range [
<italic>p</italic>
−
<italic>o</italic>
−
<italic>c</italic>
−
<italic>r</italic>
,
<italic>p</italic>
+
<italic>o</italic>
−
<italic>c</italic>
−
<italic>r</italic>
).</p>
</list-item>
<list-item>
<label>7.5</label>
<p>Looping over all found chains:
<list list-type="simple">
<list-item>
<label>7.5.1</label>
<p>Calculating ions’ masses using Eq. (
<xref rid="Equ3" ref-type="">3</xref>
), and using these masses to generate
<italic>X</italic>
<sub>2</sub>
.</p>
</list-item>
<list-item>
<label>7.5.2</label>
<p>Calculating
<inline-formula id="IEq6">
<alternatives>
<tex-math id="M17">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}${X_{2}^{T}} \tilde {Y}$\end{document}</tex-math>
<mml:math id="M18">
<mml:msubsup>
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>~</mml:mo>
</mml:mover>
</mml:math>
<inline-graphic xlink:href="12859_2016_1073_Article_IEq6.gif"></inline-graphic>
</alternatives>
</inline-formula>
.</p>
</list-item>
<list-item>
<label>7.5.3</label>
<p>Calculating the final score using Eq. (
<xref rid="Equ2" ref-type="">2</xref>
).</p>
</list-item>
<list-item>
<label>7.5.4</label>
<p>Saving each spectrum’s top score result as a PSM.</p>
</list-item>
</list>
</p>
</list-item>
</list>
</p>
</list-item>
<list-item>
<p>Estimating FDR (false discovery rate) for each PSM.</p>
</list-item>
<list-item>
<p>Converting FDR to
<italic>q</italic>
-value.</p>
</list-item>
</list>
</p>
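<p>The loop in step 7 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the ECL implementation: chains and spectra are assumed to be pre-sorted by mass, and the callbacks partial_dot (returning the dot product of one chain's theoretical vector with the normalized experimental spectrum) and norm_x (returning the norm of the combined theoretical spectrum) are hypothetical placeholders.</p>

```python
from bisect import bisect_left


def exhaustive_search(chains, spectra, linker_mass, tol, partial_dot, norm_x):
    """Sketch of workflow step 7.

    chains:  list of (mass, chain_id), sorted by mass.
    spectra: list of (precursor_mass, spectrum_id), sorted by precursor mass.
    Returns {spectrum_id: (best_score, chain_a, chain_b)}.
    """
    chain_masses = [m for m, _ in chains]
    precursors = [p for p, _ in spectra]
    half_max = precursors[-1] / 2.0  # step 6: largest precursor mass
    best = {}
    for c_mass, c_id in chains:  # ascending chain mass
        if c_mass > half_max:
            break
        # Step 7.1: spectra with precursor mass at least 2c + r - o.
        first = bisect_left(precursors, 2 * c_mass + linker_mass - tol)
        for p, s_id in spectra[first:]:
            # Steps 7.2 and 7.3: computed once, reused for every partner chain.
            dot1 = partial_dot(c_id, s_id)
            # Step 7.4: partner chains with mass in [p - o - c - r, p + o - c - r).
            lo = bisect_left(chain_masses, p - tol - c_mass - linker_mass)
            hi = bisect_left(chain_masses, p + tol - c_mass - linker_mass)
            for _, other_id in chains[lo:hi]:  # step 7.5
                dot2 = partial_dot(other_id, s_id)
                score = (dot1 + dot2) / norm_x(c_id, other_id, s_id)  # Eq. (2)
                if s_id not in best or score > best[s_id][0]:
                    best[s_id] = (score, c_id, other_id)
    return best


# Toy data with made-up masses and made-up partial dot products.
chains = [(100.0, "A"), (200.0, "B"), (300.0, "C")]
spectra = [(438.0, "s1")]
partial = {"A": 0.25, "B": 0.5, "C": 0.75}
best = exhaustive_search(
    chains, spectra, linker_mass=138.0, tol=0.5,
    partial_dot=lambda chain_id, spectrum_id: partial[chain_id],
    norm_x=lambda a, b, spectrum_id: 2.0,
)
print(best)
```

<p>In this toy run only the pair (A, B) satisfies the precursor mass constraint for spectrum s1, so it is the only combination that is scored.</p>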
<p>ECL estimates FDR in the same way as xProphet [
<xref ref-type="bibr" rid="CR26">26</xref>
] and pLink [
<xref ref-type="bibr" rid="CR27">27</xref>
] do. Three kinds of PSMs are used:
<list list-type="order">
<list-item>
<p>Both chains are from the target database.</p>
</list-item>
<list-item>
<p>Both chains are from the decoy database.</p>
</list-item>
<list-item>
<p>One chain is from the target database and the other chain is from the decoy database.</p>
</list-item>
</list>
</p>
<p>FDR is estimated with
<disp-formula id="Equ4">
<label>4</label>
<alternatives>
<tex-math id="M19">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ FDR(s) = \frac{f(s) - d(s)}{t(s)}, $$ \end{document}</tex-math>
<mml:math id="M20">
<mml:mtext mathvariant="italic">FDR</mml:mtext>
<mml:mo>(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo>(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>−</mml:mo>
<mml:mi>d</mml:mi>
<mml:mo>(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ4.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>where
<italic>s</italic>
is a score,
<italic>t</italic>
(
<italic>s</italic>
) is the number of the first kind of PSMs whose scores are larger than or equal to
<italic>s</italic>
,
<italic>d</italic>
(
<italic>s</italic>
) is the number of the second kind of PSMs whose scores are larger than or equal to
<italic>s</italic>
, and
<italic>f</italic>
(
<italic>s</italic>
) is the number of the third kind of PSMs whose scores are larger than or equal to
<italic>s</italic>
. Finally, FDR is converted to
<italic>q</italic>
-value [
<xref ref-type="bibr" rid="CR31">31</xref>
]:
<disp-formula id="Equ5">
<label>5</label>
<alternatives>
<tex-math id="M21">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ q(t) = \min_{s \leq t} FDR(s), $$ \end{document}</tex-math>
<mml:math id="M22">
<mml:mi>q</mml:mi>
<mml:mo>(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mo>min</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mtext mathvariant="italic">FDR</mml:mtext>
<mml:mo>(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>,</mml:mo>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ5.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>where
<italic>t</italic>
is a threshold.</p>
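<p>Eqs. (4) and (5) can be illustrated with the following Python sketch. It assumes that higher scores are better, so the counts are taken over PSMs scoring at or above the threshold, and it clamps negative FDR estimates to zero; both choices, as well as the labels "TT", "DD", and "TD" and the toy scores, are assumptions for illustration rather than details confirmed by the text.</p>

```python
def fdr_and_q_values(psms):
    """psms: list of (score, kind), kind in {"TT", "DD", "TD"}.

    "TT": both chains from the target database (first kind),
    "DD": both chains from the decoy database (second kind),
    "TD": one target chain and one decoy chain (third kind).
    Returns two dicts keyed by target-target score: FDR (Eq. 4) and q-value (Eq. 5).
    """
    thresholds = sorted({s for s, kind in psms if kind == "TT"})
    fdr = {}
    for s in thresholds:
        t = sum(1 for x, k in psms if k == "TT" and x >= s)
        d = sum(1 for x, k in psms if k == "DD" and x >= s)
        f = sum(1 for x, k in psms if k == "TD" and x >= s)
        fdr[s] = max(0.0, (f - d) / t)  # clamping at zero is an assumption
    q = {}
    running_min = float("inf")
    for s in thresholds:  # ascending, so every lower threshold has been seen
        running_min = min(running_min, fdr[s])
        q[s] = running_min
    return fdr, q


# Made-up PSM scores for illustration.
psms = [(0.9, "TT"), (0.8, "TT"), (0.7, "TD"), (0.6, "TT"),
        (0.5, "DD"), (0.4, "TD"), (0.3, "TT")]
fdr, q = fdr_and_q_values(psms)
print(fdr)
print(q)
```

<p>The quadratic counting is chosen for readability; a production implementation would sort the PSMs once and use cumulative counts.</p>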
</sec>
<sec id="Sec3">
<title>Results and discussion</title>
<sec id="Sec4">
<title>Computational complexity analysis</title>
<p>ECL is closely related to the work of Chen et al. [
<xref ref-type="bibr" rid="CR32">32</xref>
] and Kojak [
<xref ref-type="bibr" rid="CR30">30</xref>
]. Chen et al. [
<xref ref-type="bibr" rid="CR32">32</xref>
] provided their algorithm’s computational complexity. Hoopmann et al. [
<xref ref-type="bibr" rid="CR30">30</xref>
] provided Kojak’s source code without computational complexity analysis, so we analyzed its computational complexity based on the source code. In this section, we will analyze ECL’s computational complexity in detail.</p>
<sec id="Sec5">
<title>Time complexity</title>
<p>We define the following variables:
<list list-type="bullet">
<list-item>
<p>
<italic>k</italic>
: number of proteins in a database.</p>
</list-item>
<list-item>
<p>
<italic>n</italic>
: average number of peptides in a protein.</p>
</list-item>
<list-item>
<p>
<italic>m</italic>
: average length of a chain.</p>
</list-item>
<list-item>
<p>
<italic>h</italic>
: average number of peaks in an experimental spectrum.</p>
</list-item>
<list-item>
<p>
<italic>s</italic>
: number of experimental spectra.</p>
</list-item>
<list-item>
<p>
<italic>L</italic>
: number of precursor mass tolerance ranges. This approximately equals the precursor mass range divided by the precursor mass tolerance.</p>
</list-item>
</list>
</p>
<p>The time complexity of the algorithm proposed by Chen et al. [
<xref ref-type="bibr" rid="CR32">32</xref>
] is
<disp-formula id="Equ6">
<label>6</label>
<alternatives>
<tex-math id="M23">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ O(skn^{2} \log (kn) + sk^{2}n^{2} \log (kn) / L + s k^{2} n^{2}(m + h) / L). $$ \end{document}</tex-math>
<mml:math id="M24">
<mml:mi>O</mml:mi>
<mml:mo>(</mml:mo>
<mml:mtext mathvariant="italic">sk</mml:mtext>
<mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>log</mml:mo>
<mml:mo>(</mml:mo>
<mml:mtext mathvariant="italic">kn</mml:mtext>
<mml:mo>)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>s</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>log</mml:mo>
<mml:mo>(</mml:mo>
<mml:mtext mathvariant="italic">kn</mml:mtext>
<mml:mo>)</mml:mo>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>s</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>)</mml:mo>
<mml:mi>.</mml:mi>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ6.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>For the first and second terms, the authors only considered one experimental spectrum. We multiply the terms by
<italic>s</italic>
because there are
<italic>s</italic>
experimental spectra. We also use
<italic>k</italic>
<sup>2</sup>
<italic>n</italic>
<sup>2</sup>
/
<italic>L</italic>
to replace
<italic>p</italic>
in the original paper. For the third term, the authors only considered one PSM. We multiply the term by
<italic>s</italic>
<italic>k</italic>
<sup>2</sup>
<italic>n</italic>
<sup>2</sup>
/
<italic>L</italic>
because there are
<italic>k</italic>
<sup>2</sup>
<italic>n</italic>
<sup>2</sup>
/
<italic>L</italic>
peptide-peptide combinations for each experimental spectrum and there are
<italic>s</italic>
experimental spectra. The time complexity of Kojak is
<disp-formula id="Equ7">
<label>7</label>
<alternatives>
<tex-math id="M25">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ O(kn \log(s) + kns (m + h + 1) + s t^{2}). $$ \end{document}</tex-math>
<mml:math id="M26">
<mml:mi>O</mml:mi>
<mml:mo>(</mml:mo>
<mml:mtext mathvariant="italic">kn</mml:mtext>
<mml:mo>log</mml:mo>
<mml:mo>(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">kns</mml:mtext>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>s</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>)</mml:mo>
<mml:mi>.</mml:mi>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ7.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>Please refer to the Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
for details.</p>
<p>For ECL, the computational complexity is dominated by step 7 in the workflow. The complexity of step 7.1 is
<italic>O</italic>
(log(
<italic>s</italic>
)). Steps 7.2 and 7.5.1 have the same time complexity,
<italic>O</italic>
(
<italic>m</italic>
). ECL stores theoretical and experimental spectra in sparse matrices. We developed an algorithm to match peaks between a theoretical spectrum and an experimental spectrum with
<italic>O</italic>
(
<italic>m</italic>
+
<italic>h</italic>
) complexity (Algorithm 1). Thus, both steps 7.3 and 7.5.2 have the time complexity,
<italic>O</italic>
(
<italic>m</italic>
+
<italic>h</italic>
). Moreover, for an experimental spectrum and a pair of chains, steps 7.2 and 7.3 only need to be executed once because ECL checks each chain whose mass is smaller than or equal to half of the largest precursor mass in ascending order. Steps 7.3 and 7.5.2 also only need to be executed once for the same reason. The time complexity of step 7.4 is
<italic>O</italic>
(log(
<italic>k</italic>
<italic>n</italic>
)). The time complexity of steps 7.5.3 and 7.5.4 is
<italic>O</italic>
(
<italic>k</italic>
<italic>n</italic>
<italic>s</italic>
/
<italic>L</italic>
). Thus, the time complexity of step 7 is
<disp-formula id="Equ8">
<label>8</label>
<alternatives>
<tex-math id="M27">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ O(kn(\log(s) + m + s(m + h) + \log(kn) + kns / L)). $$ \end{document}</tex-math>
<mml:math id="M28">
<mml:mi>O</mml:mi>
<mml:mo>(</mml:mo>
<mml:mtext mathvariant="italic">kn</mml:mtext>
<mml:mo>(</mml:mo>
<mml:mo>log</mml:mo>
<mml:mo>(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo>log</mml:mo>
<mml:mo>(</mml:mo>
<mml:mtext mathvariant="italic">kn</mml:mtext>
<mml:mo>)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">kns</mml:mtext>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>)</mml:mo>
<mml:mi>.</mml:mi>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ8.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>
<graphic xlink:href="12859_2016_1073_Figa_HTML.gif" id="MO2"></graphic>
</p>
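<p>Algorithm 1 is available here only as an image. As a generic illustration of how two sorted sparse spectra can be matched in linear time, the following Python sketch uses a two-pointer merge; it is not necessarily identical to the authors' Algorithm 1, and the function name and data layout are assumptions.</p>

```python
def sparse_dot(x, y):
    """Dot product of two sparse vectors in O(len(x) + len(y)) time.

    x, y: lists of (bin_index, value), each sorted by bin_index with
    unique indices (for example, a theoretical and an experimental spectrum).
    """
    i = j = 0
    total = 0.0
    while len(x) > i and len(y) > j:
        bin_x, val_x = x[i]
        bin_y, val_y = y[j]
        if bin_x == bin_y:      # matched peak: contributes to the dot product
            total += val_x * val_y
            i += 1
            j += 1
        elif bin_y > bin_x:     # advance whichever pointer lags behind
            i += 1
        else:
            j += 1
    return total


# Made-up sparse spectra for illustration.
theoretical = [(1, 1.0), (3, 2.0), (7, 1.0)]
experimental = [(3, 0.5), (4, 0.25), (7, 0.5)]
print(sparse_dot(theoretical, experimental))
```

<p>Each iteration advances at least one pointer, so the number of iterations is bounded by the total number of stored peaks in the two spectra.</p>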
<p>There are seven variables in the time complexity equations. Five of them can be fixed based on biological prior knowledge:
<list list-type="bullet">
<list-item>
<p>
<italic>n</italic>
≈100.</p>
</list-item>
<list-item>
<p>
<italic>m</italic>
≈20.</p>
</list-item>
<list-item>
<p>
<italic>h</italic>
≈10
<sup>2</sup>
.</p>
</list-item>
<list-item>
<p>
<italic>s</italic>
≈10
<sup>4</sup>
.</p>
</list-item>
<list-item>
<p>
<italic>L</italic>
≈10
<sup>5</sup>
.</p>
</list-item>
</list>
</p>
<p>We plotted curves of Eqs. (
<xref rid="Equ6" ref-type="">6</xref>
), (
<xref rid="Equ7" ref-type="">7</xref>
), and (
<xref rid="Equ8" ref-type="">8</xref>
) against different numbers of proteins (Fig.
<xref rid="Fig2" ref-type="fig">2</xref>
). Since Kojak selects
<italic>t</italic>
peptides for each spectrum, we plotted three curves corresponding to three different
<italic>t</italic>
values. We can see that Chen et al. [
<xref ref-type="bibr" rid="CR32">32</xref>
] has the highest time complexity. When the number of proteins is small, ECL has smaller time complexity compared to Kojak (leftmost of Fig.
<xref rid="Fig2" ref-type="fig">2</xref>
). This is because ECL does not need to select peptides beforehand. When the number of proteins is large, ECL has higher complexity than Kojak (rightmost of Fig.
<xref rid="Fig2" ref-type="fig">2</xref>
). This is because the number of peptide-peptide combinations searched by ECL grows quadratically with the number of proteins (Eq. (
<xref rid="Equ8" ref-type="">8</xref>
)). This is an unavoidable cost of exhaustive searching. On the other hand, the number of peptide-peptide combinations searched by Kojak is almost constant, and the total time complexity increases linearly (Eq. (
<xref rid="Equ7" ref-type="">7</xref>
)).
<fig id="Fig2">
<label>Fig. 2</label>
<caption>
<p>Computational complexity against different numbers of proteins. Three
<italic>t</italic>
values were used to plot Kojak’s computational complexity curves. Chen et al. [
<xref ref-type="bibr" rid="CR32">32</xref>
] has the highest time complexity. When the number of proteins is small, ECL has smaller time complexity compared to Kojak. When the number of proteins is large, ECL has higher complexity than Kojak</p>
</caption>
<graphic xlink:href="12859_2016_1073_Fig2_HTML" id="MO3"></graphic>
</fig>
</p>
<p>Even though ECL’s time complexity is high, it can still handle a large database. Given a data set containing thousands of tandem mass spectra, ECL needs about 7 h on average to search a database containing 5200 proteins.</p>
</sec>
<sec id="Sec6">
<title>Space complexity</title>
<p>
<list list-type="bullet">
<list-item>
<p>The space complexity of Chen et al. [
<xref ref-type="bibr" rid="CR32">32</xref>
] is
<disp-formula id="Equ9">
<label>9</label>
<alternatives>
<tex-math id="M29">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ O(kn + k^{2} n^{2}/L + knm + h). $$ \end{document}</tex-math>
<mml:math id="M30">
<mml:mi>O</mml:mi>
<mml:mo>(</mml:mo>
<mml:mtext mathvariant="italic">kn</mml:mtext>
<mml:mo>+</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">knm</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
<mml:mi>.</mml:mi>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ9.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>For the second term, we use
<italic>k</italic>
<sup>2</sup>
<italic>n</italic>
<sup>2</sup>
/
<italic>L</italic>
to replace
<italic>p</italic>
in the original paper. For the third term, the authors only considered one peptide-peptide combination for each experimental spectrum. We multiply the term by
<italic>kn</italic>
considering that there are
<italic>kn</italic>
peptides for each experimental spectrum.</p>
</list-item>
<list-item>
<p>There are two steps in Kojak. The space complexity of the first step is
<italic>O</italic>
(
<italic>m</italic>
+
<italic>s</italic>
<italic>h</italic>
), and the space complexity of the second step is
<italic>O</italic>
(
<italic>t</italic>
<italic>m</italic>
+
<italic>h</italic>
). Thus, the total space complexity is
<disp-formula id="Equ10">
<label>10</label>
<alternatives>
<tex-math id="M31">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ O(m + sh + tm + h). $$ \end{document}</tex-math>
<mml:math id="M32">
<mml:mi>O</mml:mi>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">sh</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">tm</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
<mml:mi>.</mml:mi>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ10.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
</list-item>
<list-item>
<p>The space complexity of ECL is
<disp-formula id="Equ11">
<label>11</label>
<alternatives>
<tex-math id="M33">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document} $$ O(knm + sh). $$ \end{document}</tex-math>
<mml:math id="M34">
<mml:mi>O</mml:mi>
<mml:mo>(</mml:mo>
<mml:mtext mathvariant="italic">knm</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">sh</mml:mtext>
<mml:mo>)</mml:mo>
<mml:mi>.</mml:mi>
</mml:math>
<graphic xlink:href="12859_2016_1073_Article_Equ11.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
</list-item>
</list>
</p>
<p>Clearly, Chen et al. [
<xref ref-type="bibr" rid="CR32">32</xref>
] has the highest space complexity, and Kojak has the lowest. Although ECL’s space complexity is higher than that of Kojak, in our experience a personal computer with 32 GB of memory is sufficient in most cases.</p>
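<p>As a rough illustration of the ordering stated above, the following Python sketch evaluates the expressions in Eqs. (9), (10), and (11) with the approximate values given earlier (n≈100, m≈20, h≈100, s≈10,000, L≈100,000). The function names and the choices k = 5200 and t = 250 are assumptions made for illustration only, and k is assumed to be the number of proteins. Big-O expressions omit constant factors, so only the relative orders of magnitude are meaningful.</p>
<preformat>
```python
# Illustrative evaluation of the space complexity expressions (Eqs. 9-11).
# Constant factors are ignored, so the outputs are orders of magnitude only.

N = 100      # n, approximate value quoted in the text
M = 20       # m
H = 10**2    # h
S = 10**4    # s
L = 10**5    # L


def space_chen(k):
    """Eq. 9: O(kn + k^2 n^2 / L + knm + h)."""
    return k * N + (k * N) ** 2 / L + k * N * M + H


def space_kojak(t):
    """Eq. 10: O(m + sh + tm + h)."""
    return M + S * H + t * M + H


def space_ecl(k):
    """Eq. 11: O(knm + sh)."""
    return k * N * M + S * H


if __name__ == "__main__":
    # Assumed inputs: 5200 proteins and Kojak's default of 250 peptides.
    k, t = 5200, 250
    print("Chen et al.:", space_chen(k))
    print("Kojak:", space_kojak(t))
    print("ECL:", space_ecl(k))
```
</preformat>
<p>Under these assumed inputs the three values keep the ordering described in the text: Chen et al. highest, ECL in between, and Kojak lowest.</p>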
</sec>
</sec>
<sec id="Sec7">
<title>Experiments</title>
<p>We present two sets of experiments. The first used a data set obtained by cross-linking two synthetic peptides. The second used four data sets from the 26S proteasome sample [
<xref ref-type="bibr" rid="CR33">33</xref>
] provided by xQuest [
<xref ref-type="bibr" rid="CR25">25</xref>
,
<xref ref-type="bibr" rid="CR26">26</xref>
]. Since our study did not involve any humans, animals, or clinical data, no ethics approval or consent was required.</p>
<sec id="Sec8">
<title>An experiment with synthetic peptides</title>
<p>This experiment used two synthetic peptides produced by GL Biochem (Shanghai) Ltd. The sequences were “EVRKELDDLR” and “EAKELIEGLPR”. The N-termini were protected by Fmoc. We used 1
<italic>μ</italic>
L peptides and 0.5
<italic>μ</italic>
L DSS. Their concentrations were 1 and 0.5 mM, respectively. We dissolved the peptides and DSS in DMSO (dimethyl sulfoxide) to a final concentration of 50 mM. The reaction was carried out at room temperature, and the reaction time was 2 h. After quenching, we added 12.5
<italic>μ</italic>
L piperidine to the above solution to remove the Fmoc protection. The reaction lasted for another 2 h. Finally, we freeze-dried the sample to obtain the cross-linked peptides.</p>
<p>LC-MS (liquid chromatography-mass spectrometry) analysis was carried out on a Thermo LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific Inc.) with a NanoLC system. The sample was loaded onto a trapping column (PepMap C18; 2 cm × 100
<italic>μ</italic>
m × 5
<italic>μ</italic>
m, 100 Å) using a flow rate of 4
<italic>μ</italic>
L/min of solvent A. The loading lasted for 10 min. Cross-linked peptides were separated at a flow rate of 200 nL/min on a 75
<italic>μ</italic>
m × 50 cm C18 column (Acclaim PepMap RSLC C18, 75
<italic>μ</italic>
m × 50 cm × 3
<italic>μ</italic>
m, 100 Å). The following gradient was used: 0–8 min 2 % B, 8–12 min 2–10 % B, 12–180 min 10–50 % B, 180–200 min 50–98 % B, 200–215 min 98 % B, and 215–240 min 98–2 % B, where solvent B was a mixture of acetonitrile and formic acid at a ratio of 100:0.1. The mass spectrometer selected up to five precursors for CID. The intensity threshold for triggering fragmentation was 150 counts. Only precursors with a charge of 2 or higher were considered. CID was performed for 30 ms using 35 % normalized collision energy and a 0.25 activation value. Dynamic exclusion was used with the following parameters: 1 repeat count, 60 s exclusion duration, 500 list size, and 10 ppm mass window. The ion target value was 1,000,000 (or 500 ms fill time) for full scans, and 1,000,000 (or 200 ms fill time) for tandem mass scans. Fragment ions were detected in a linear ion trap.</p>
<p>During the search, the precursor mass tolerance was 10 ppm, and the tandem mass tolerance was 0.5 Th. Up to 2 missed cleavages were allowed. The database contained 100 randomly selected proteins and the two synthetic peptides. The decoy database was generated by reversing peptides, with lysine and arginine fixed. Because there was only one linkable site in each synthetic peptide, all cross-linked peptides formed by the synthetic peptides were treated as inter-protein cross-linked peptides. The
<italic>q</italic>
-value cut-off threshold was 0.05.</p>
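<p>The decoy construction described above can be sketched as follows. This is only one possible reading of “reversing peptides, with lysine and arginine fixed”: K and R residues keep their positions, and the remaining residues are written in reverse order. The function name reverse_with_fixed_kr is hypothetical, and the code is not taken from ECL.</p>
<preformat>
```python
def reverse_with_fixed_kr(peptide):
    """Reverse a peptide while keeping K and R at their original positions.

    Assumption: "fixed" means the K/R residues stay in place and only the
    other residues are reversed. ECL's actual implementation may differ.
    """
    fixed = {"K", "R"}
    # Residues that are allowed to move, in reversed order.
    movable = [aa for aa in peptide if aa not in fixed][::-1]
    it = iter(movable)
    return "".join(aa if aa in fixed else next(it) for aa in peptide)


if __name__ == "__main__":
    for target in ("EVRKELDDLR", "EAKELIEGLPR"):
        print(target, reverse_with_fixed_kr(target))
```
</preformat>
<p>Keeping K and R in place preserves the positions of the tryptic cleavage sites, so a decoy peptide has the same length and mass as its target.</p>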
<p>The search was carried out on a personal computer with an Intel Core i5-4570 CPU (central processing unit) and 32 GB of memory. ECL needed about 100 s to finish the task. Since we knew the ground truth, we could calculate the false discovery proportion. Of the 149 PSMs, 4 were incorrect, which corresponds to a false discovery proportion of 0.03. This experiment indicated that ECL could provide reliable results. Details can be found in Additional file
<xref rid="MOESM2" ref-type="media">2</xref>
.</p>
</sec>
<sec id="Sec9">
<title>Experiments with 26S proteasome data</title>
<p>Four data sets from the 26S proteasome sample [
<xref ref-type="bibr" rid="CR25">25</xref>
,
<xref ref-type="bibr" rid="CR26">26</xref>
,
<xref ref-type="bibr" rid="CR33">33</xref>
] were used. We first searched the four data sets against the database released along with them, which contained 34 proteins. The latest versions of xQuest, pLink, ProteinProspector, Kojak, and ECL were used: xQuest 2.1.1, pLink 1.23, ProteinProspector 5.14.4, Kojak 1.4.2, and ECL 20160117. The precursor mass tolerance was 10 ppm, and the tandem mass tolerance was 0.2 Da. Other parameters were the same as those in the previous experiment. All the parameter files used by these tools are included in Additional file
<xref rid="MOESM3" ref-type="media">3</xref>
. We used xProphet [
<xref ref-type="bibr" rid="CR26">26</xref>
] to estimate the
<italic>q</italic>
-value for xQuest’s results by setting “qtransform” to 1 in the “xproph.def” file. Because ProteinProspector did not provide the
<italic>q</italic>
-value in its results, we estimated it following the procedure of Trnka et al. [
<xref ref-type="bibr" rid="CR29">29</xref>
]. We used Percolator to estimate the
<italic>q</italic>
-value for Kojak’s results, as Kojak requires. Intra-protein cross-linked peptides and inter-protein cross-linked peptides were analyzed separately. For a fair comparison, the same
<italic>q</italic>
-value threshold of 0.05 was used for all tools.</p>
<p>Table
<xref rid="Tab1" ref-type="table">1</xref>
shows the numbers of non-redundant cross-linked peptides identified by xQuest, pLink, ProteinProspector, Kojak, and ECL, respectively. Corresponding Venn diagrams can be found in the Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
. Summed over the four data sets, ECL identified more cross-linked peptides than xQuest, pLink, and ProteinProspector (417 versus 294, 81, and 393, respectively), although not in every individual data set. We used protein crystal structures from the protein data bank (PDB) to measure the distances between linking sites in intra-protein cross-linked peptides. Only three proteins had structural information. Their UniProt accessions were O94444, P06732, and P50524, and the corresponding PDB IDs were 2X5N, 1I0E, and 4B0Z, respectively. Of the 65 PSMs matched to these proteins, 60 had a distance smaller than 30 Å, which meant that they were within the distance tolerance. Details can be found in Additional file
<xref rid="MOESM4" ref-type="media">4</xref>
. We also used ECLAnnotator to generate annotated tandem mass spectra for ECL’s results. They can be found at
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.ust.hk/ecl.html">http://bioinformatics.ust.hk/ecl.html</ext-link>
. Then, we analyzed matched and unmatched peaks. Please refer to the Additional file
<xref rid="MOESM2" ref-type="media">2</xref>
for details.
<table-wrap id="Tab1">
<label>Table 1</label>
<caption>
<p>Numbers of non-redundant cross-linked peptides identified by xQuest, pLink, ProteinProspector, Kojak, and ECL, respectively. The database contains 34 proteins</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Data set</th>
<th align="left">xQuest</th>
<th align="left">pLink</th>
<th align="left">ProteinProspector</th>
<th align="left">Kojak</th>
<th align="left">ECL</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">1</td>
<td align="left">70 (56)</td>
<td align="left">5 (4)</td>
<td align="left">104 (69)</td>
<td align="left">102 (71)</td>
<td align="left">97</td>
</tr>
<tr>
<td align="left">2</td>
<td align="left">73 (41)</td>
<td align="left">28 (17)</td>
<td align="left">99 (45)</td>
<td align="left">120 (56)</td>
<td align="left">58</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">90 (62)</td>
<td align="left">28 (10)</td>
<td align="left">96 (64)</td>
<td align="left">139 (90)</td>
<td align="left">127</td>
</tr>
<tr>
<td align="left">4</td>
<td align="left">61 (47)</td>
<td align="left">20 (14)</td>
<td align="left">94 (68)</td>
<td align="left">110 (83)</td>
<td align="left">135</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Values in the brackets are the numbers of overlapping cross-linked peptides identified by both ECL and the corresponding method</p>
</table-wrap-foot>
</table-wrap>
</p>
<p>To find out whether the additionally identified cross-linked peptides were due to the exhaustive search, we let Kojak output the top 9999 pre-selected peptides for each cross-linked peptide’s highest-scoring spectrum. (The default number of pre-selected peptides is 250. To our knowledge, other tools cannot output their pre-selected peptides.) Then, we compared the cross-linked peptides identified by ECL with those pre-selected peptides in the corresponding spectra. We consider an additionally identified cross-linked peptide pair to be due to the exhaustive search if all of the following criteria are satisfied (we thank the anonymous reviewer for suggesting these criteria):
<list list-type="order">
<list-item>
<p>The precursor masses in Kojak and ECL are within the same tolerance range.</p>
</list-item>
<list-item>
<p>If both peptide chains are in the pre-selection list and at least one is ranked beyond 250, Kojak and ECL identify the same pair of peptide chains.</p>
</list-item>
<list-item>
<p>At least one peptide chain is not in the pre-selection list.</p>
</list-item>
</list>
</p>
<p>Table
<xref rid="Tab2" ref-type="table">2</xref>
shows the summarized results. About 30
<italic>%</italic>
of these peptides are not within the top 250 of Kojak’s pre-selected peptides, which means that the pre-selection procedure is one of the causes of missed identifications. Each spectrum’s pre-selected peptides and detailed comparison results can be found in Additional file
<xref rid="MOESM5" ref-type="media">5</xref>
.
<table-wrap id="Tab2">
<label>Table 2</label>
<caption>
<p>Whether Kojak’s pre-selection covered the peptides from cross-linked peptides identified by ECL but missed by Kojak</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Data set</th>
<th align="left">Number of peptides from the cross-linked peptides identified by ECL, but not by Kojak</th>
<th align="left">Number of peptides that don’t belong to Kojak’s pre-selected peptides</th>
<th align="left">Ratio</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">1</td>
<td align="justify">25</td>
<td align="justify">2</td>
<td align="left">0.08</td>
</tr>
<tr>
<td align="left">2</td>
<td align="justify">2</td>
<td align="justify">1</td>
<td align="left">0.50</td>
</tr>
<tr>
<td align="left">3</td>
<td align="justify">37</td>
<td align="justify">12</td>
<td align="left">0.32</td>
</tr>
<tr>
<td align="left">4</td>
<td align="justify">52</td>
<td align="justify">21</td>
<td align="left">0.40</td>
</tr>
<tr>
<td align="left">Total</td>
<td align="justify">116</td>
<td align="justify">36</td>
<td align="left">0.31</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The second column contains the total numbers of peptides from the cross-linked peptides identified by ECL, but not by Kojak. The third column contains the numbers of peptides that do not belong to Kojak’s pre-selected peptides. The fourth column contains the ratio of the number in the third column to the number in the second column</p>
</table-wrap-foot>
</table-wrap>
</p>
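<p>The ratios in Table 2 can be recomputed from the two count columns. The short Python sketch below does this; the names ROWS and summarize are hypothetical, and the counts are copied from the table.</p>
<preformat>
```python
# (data set, peptides from cross-links found by ECL but not Kojak,
#  peptides that are not among Kojak's pre-selected peptides)
ROWS = [("1", 25, 2), ("2", 2, 1), ("3", 37, 12), ("4", 52, 21)]


def summarize(rows):
    """Return per-data-set ratios plus the pooled total, rounded to 2 decimals."""
    out = {name: round(missing / total, 2) for name, total, missing in rows}
    total = sum(r[1] for r in rows)
    missing = sum(r[2] for r in rows)
    out["Total"] = round(missing / total, 2)
    return out


if __name__ == "__main__":
    for name, ratio in summarize(ROWS).items():
        print(name, ratio)
```
</preformat>
<p>The pooled ratio is computed from the summed counts (36 of 116), not as the mean of the four per-data-set ratios.</p>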
<p>Table
<xref rid="Tab3" ref-type="table">3</xref>
shows the corresponding running times of xQuest, pLink, Kojak, and ECL, respectively. ProteinProspector spent 1254 s on average analyzing one data set. Because it was run on its authors’ web server, we did not compare its running time with those of the other four tools. Since Kojak supports multi-threading, we ran it with 4 threads. xQuest, pLink, and ECL do not support multi-threading.
<table-wrap id="Tab3">
<label>Table 3</label>
<caption>
<p>Running times of xQuest, pLink, Kojak, and ECL, respectively. Times are in seconds</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Data set</th>
<th align="left">xQuest</th>
<th align="left">pLink</th>
<th align="left">Kojak (4 threads)</th>
<th align="left">ECL</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">1</td>
<td align="left">6349</td>
<td align="left">851</td>
<td align="left">46</td>
<td align="left">51</td>
</tr>
<tr>
<td align="left">2</td>
<td align="left">6741</td>
<td align="left">878</td>
<td align="left">48</td>
<td align="left">57</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">20419</td>
<td align="left">876</td>
<td align="left">49</td>
<td align="left">60</td>
</tr>
<tr>
<td align="left">4</td>
<td align="left">21757</td>
<td align="left">700</td>
<td align="left">47</td>
<td align="left">60</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p>Finally, we tested whether ECL could search a large database within a reasonable period of time. We searched the same data sets against the whole proteome of <italic>Schizosaccharomyces pombe</italic>, which contained 5200 proteins. We set the maximum number of allowed missed cleavages to 1. The rest of the parameters were the same as those in the last experiment. xQuest ran for a few days but still could not finish the search. pLink could not handle such a large database. ProteinProspector spent 1.7 h on average analyzing one data set on its authors’ web server. Kojak spent 0.25 h on average analyzing one data set. ECL spent 7 h on average analyzing one data set.</p>
<p>There were 4×10
<sup>10</sup>
peptide-peptide combinations including decoy peptides. The precursor mass tolerance was 10 ppm. Thus, there were about 4×10
<sup>5</sup>
peptide-peptide combinations for each spectrum. Kojak selected the top 250 peptides to generate peptide-peptide combinations for each spectrum, which covered about 8
<italic>%</italic>
of the whole search space. ProteinProspector used a similar pre-selection procedure to select the top 1000 peptides. Thus, the number of peptide-peptide combinations searched by ProteinProspector and Kojak remained almost constant as the database size increased. However, the number of peptide-peptide combinations searched by ECL increased quadratically. This is why ECL was slower than ProteinProspector and Kojak.</p>
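<p>One way to arrive at a figure close to the quoted 8 % is sketched below in Python. It assumes that the 250 pre-selected peptides are combined into unordered pairs (a peptide may pair with itself) and that this count is compared with the roughly 400,000 candidate combinations per spectrum mentioned above. The function name pair_count and this counting convention are assumptions, not a description of Kojak’s internals.</p>
<preformat>
```python
def pair_count(n_peptides):
    """Number of unordered peptide pairs, allowing a peptide to pair with itself."""
    return n_peptides * (n_peptides + 1) // 2


# Approximate number of candidate combinations per spectrum quoted in the text.
CANDIDATES_PER_SPECTRUM = 4 * 10**5


def coverage(top):
    """Fraction of the per-spectrum search space reachable from `top` peptides."""
    return pair_count(top) / CANDIDATES_PER_SPECTRUM


if __name__ == "__main__":
    print(pair_count(250), round(coverage(250), 3))
    # Doubling the number of peptides roughly quadruples the number of pairs,
    # which is the quadratic growth an exhaustive search has to pay for.
    print(pair_count(2000) / pair_count(1000))
```
</preformat>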
<p>ProteinProspector, Kojak, and ECL identified fewer cross-linked peptides compared with the previous experiment (Table
<xref rid="Tab4" ref-type="table">4</xref>
). It is a known issue [
<xref ref-type="bibr" rid="CR34">34</xref>
,
<xref ref-type="bibr" rid="CR35">35</xref>
] that larger databases lead to fewer identifications. A discussion of this issue is beyond the scope of this paper. ECL identified more non-redundant cross-linked peptides than ProteinProspector and Kojak. Note that no intra-protein cross-linked peptides were identified by Kojak because Percolator reported an error when estimating the
<italic>q</italic>
-value for Kojak. The error message was: “the input data has too good separation between target and decoy PSMs”. This error commonly occurs when there are only a few target or decoy PSMs. Please refer to Percolator’s documentation for more details.
<table-wrap id="Tab4">
<label>Table 4</label>
<caption>
<p>Numbers of non-redundant cross-linked peptides identified by ProteinProspector, Kojak, and ECL, respectively. The database contains 5200 proteins</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Data set</th>
<th align="left">ProteinProspector</th>
<th align="left">Kojak</th>
<th align="left">ECL</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">1</td>
<td align="left">20 (15)</td>
<td align="left">5 (0)</td>
<td align="left">36</td>
</tr>
<tr>
<td align="left">2</td>
<td align="left">32 (16)</td>
<td align="left">6 (0)</td>
<td align="left">39</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">24 (12)</td>
<td align="left">4 (0)</td>
<td align="left">39</td>
</tr>
<tr>
<td align="left">4</td>
<td align="left">23 (17)</td>
<td align="left">2 (0)</td>
<td align="left">57</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Values in the brackets are the numbers of overlapping cross-linked peptides identified by both ECL and the corresponding method. No results for intra-protein cross-linked peptides are reported for Kojak because Percolator reported an error when estimating the
<italic>q</italic>
-value</p>
</table-wrap-foot>
</table-wrap>
</p>
</sec>
</sec>
</sec>
<sec id="Sec10" sec-type="conclusion">
<title>Conclusions</title>
<p>High computational complexity is a major obstacle to exhaustive, large-scale identification of cross-linked peptides. To the best of our knowledge, ECL is the first tool that successfully addresses the computational complexity issue without any heuristic pre-selection procedure. Given thousands of tandem mass spectra and a database containing thousands of proteins, it can finish the task in a few hours. The experiments showed that ECL could identify more peptides than xQuest, pLink, and ProteinProspector. A further analysis of public data sets showed that exhaustive search helped identify more cross-linked peptides than existing methods.</p>
</sec>
<sec id="Sec11">
<title>Availability and requirements</title>
<p>
<bold>Project name:</bold>
ECL
<bold>Project home page:</bold>
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.ust.hk/ecl.html">http://bioinformatics.ust.hk/ecl.html</ext-link>
<bold>Operating systems:</bold>
Windows, Linux, OS X
<bold>Programming language:</bold>
Java, Python
<bold>Other requirements:</bold>
Java 1.7 or higher, Python 2.7
<bold>License:</bold>
Apache License 2</p>
</sec>
</body>
<back>
<app-group>
<app id="App1">
<sec id="Sec12">
<title>Additional files</title>
<p>
<media position="anchor" xlink:href="12859_2016_1073_MOESM1_ESM.pdf" id="MOESM1">
<label>Additional file 1</label>
<caption>
<p>A supplementary document containing the ECL user instructions, the computational complexity analysis of Kojak, the spectra analysis of the 26S proteasome results, and Venn diagrams of the 26S proteasome results. (PDF 434 kb)</p>
</caption>
</media>
</p>
<p>
<media position="anchor" xlink:href="12859_2016_1073_MOESM2_ESM.zip" id="MOESM2">
<label>Additional file 2</label>
<caption>
<p>Detailed results of synthetic peptides and 26S Proteasome samples. (ZIP 4432 kb)</p>
</caption>
</media>
</p>
<p>
<media position="anchor" xlink:href="12859_2016_1073_MOESM3_ESM.zip" id="MOESM3">
<label>Additional file 3</label>
<caption>
<p>Parameter files used by xQuest, ProteinProspector, Kojak, and ECL, respectively. (ZIP 17 kb)</p>
</caption>
</media>
</p>
<p>
<media position="anchor" xlink:href="12859_2016_1073_MOESM4_ESM.xlsx" id="MOESM4">
<label>Additional file 4</label>
<caption>
<p>Distances between linking sites in intra-protein cross-linked peptides identified by ECL. (XLSX 15 kb)</p>
</caption>
</media>
</p>
<p>
<media position="anchor" xlink:href="12859_2016_1073_MOESM5_ESM.zip" id="MOESM5">
<label>Additional file 5</label>
<caption>
<p>Kojak’s pre-selection list of PSMs only identified by ECL. (ZIP 5147 kb)</p>
</caption>
</media>
</p>
</sec>
</app>
</app-group>
<glossary>
<title>Abbreviations</title>
<def-list>
<def-item>
<term>BS3</term>
<def>
<p>Bis(sulfosuccinimidyl) suberate</p>
</def>
</def-item>
<def-item>
<term>CID</term>
<def>
<p>collision-induced dissociation</p>
</def>
</def-item>
<def-item>
<term>CPU</term>
<def>
<p>central processing unit</p>
</def>
</def-item>
<def-item>
<term>CX-MS</term>
<def>
<p>chemical cross-linking combined with mass spectrometry</p>
</def>
</def-item>
<def-item>
<term>DMSO</term>
<def>
<p>dimethyl sulfoxide</p>
</def>
</def-item>
<def-item>
<term>DSS</term>
<def>
<p>disuccinimidyl suberate</p>
</def>
</def-item>
<def-item>
<term>ECL</term>
<def>
<p>exhaustive cross-linked peptides identification tool</p>
</def>
</def-item>
<def-item>
<term>ETD</term>
<def>
<p>electron-transfer dissociation</p>
</def>
</def-item>
<def-item>
<term>FDR</term>
<def>
<p>false discovery rate</p>
</def>
</def-item>
<def-item>
<term>PDB</term>
<def>
<p>protein data bank</p>
</def>
</def-item>
<def-item>
<term>PSM</term>
<def>
<p>peptide spectrum match</p>
</def>
</def-item>
</def-list>
</glossary>
<ack>
<title>Acknowledgements</title>
<p>We would like to thank the anonymous reviewers for their critical challenges and excellent suggestions.</p>
<sec id="d30e2555">
<title>Funding</title>
<p>This work is partially supported by a theme-based project T12-402/13N from the research grant council (RGC) of the Hong Kong S.A.R. government, internal grant VPRGO15EG01 from HKUST, two grants, 16101114 and 661613, from the general research fund (GRF) of the Hong Kong S.A.R. government, and grant 31370315 from the National Natural Science Foundation of China (NSFC).</p>
</sec>
<sec id="d30e2560">
<title>Availability of data and materials</title>
<p>The mzXML file of the synthetic peptide sample can be downloaded at
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.ust.hk/ecl.html">http://bioinformatics.ust.hk/ecl.html</ext-link>
. Four mzXML files of the 26S proteasome sample can be found in the xQuest virtual machine from
<ext-link ext-link-type="uri" xlink:href="http://proteomics.ethz.ch/cgi-bin/xquest2_cgi/installation.cgi">http://proteomics.ethz.ch/cgi-bin/xquest2_cgi/installation.cgi</ext-link>
.</p>
</sec>
<sec id="d30e2575">
<title>Authors’ contributions</title>
<p>FY designed the algorithm, wrote the program, analyzed the computational complexity, did the benchmarks, and wrote the manuscript. NL conceived the study and provided the synthetic peptides sample. WY conceived the study and revised the manuscript. All authors read and approved the final manuscript.</p>
</sec>
<sec id="d30e2580">
<title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec id="d30e2585">
<title>Consent for publication</title>
<p>Not applicable.</p>
</sec>
<sec id="d30e2590">
<title>Ethics approval and consent to participate</title>
<p>Not applicable.</p>
</sec>
</ack>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<label>1</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Young</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Hempel</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Oshiro</surname>
<given-names>CM</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Kuntz</surname>
<given-names>ID</given-names>
</name>
<name>
<surname>Gibson</surname>
<given-names>BW</given-names>
</name>
<name>
<surname>Dollinger</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>High throughput protein fold identification by using experimental constraints derived from intramolecular cross-links and mass spectrometry</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2000</year>
<volume>97</volume>
<fpage>5802</fpage>
<lpage>5806</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.090099097</pub-id>
<pub-id pub-id-type="pmid">10811876</pub-id>
</element-citation>
</ref>
<ref id="CR2">
<label>2</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schilling</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Row</surname>
<given-names>RH</given-names>
</name>
<name>
<surname>Gibson</surname>
<given-names>BW</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>MM</given-names>
</name>
</person-group>
<article-title>MS2Assign: Automated assignment and nomenclature of tandem mass spectra of chemically crosslinked peptides</article-title>
<source>J Am Soc Mass Spectrom</source>
<year>2003</year>
<volume>14</volume>
<fpage>834</fpage>
<lpage>850</lpage>
<pub-id pub-id-type="doi">10.1016/S1044-0305(03)00327-1</pub-id>
<pub-id pub-id-type="pmid">12892908</pub-id>
</element-citation>
</ref>
<ref id="CR3">
<label>3</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chu</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Shan</surname>
<given-names>S-o</given-names>
</name>
<name>
<surname>Moustakas</surname>
<given-names>DT</given-names>
</name>
<name>
<surname>Alber</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Egea</surname>
<given-names>PF</given-names>
</name>
<name>
<surname>Stroud</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Walter</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Burlingame</surname>
<given-names>AL</given-names>
</name>
</person-group>
<article-title>Unraveling the interface of signal recognition particle and its receptor by using chemical cross-linking and tandem mass spectrometry</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2004</year>
<volume>101</volume>
<issue>47</issue>
<fpage>16454</fpage>
<lpage>16459</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0407456101</pub-id>
<pub-id pub-id-type="pmid">15546976</pub-id>
</element-citation>
</ref>
<ref id="CR4">
<label>4</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Lichti</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Hall</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Raney</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Jennings</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>CLPM: a cross-linked peptide mapping algorithm for mass spectrometric analysis</article-title>
<source>BMC Bioinforma</source>
<year>2005</year>
<volume>6</volume>
<fpage>9</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-6-S2-S9</pub-id>
</element-citation>
</ref>
<ref id="CR5">
<label>5</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ihling</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Kalkhof</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Schulz</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Stingl</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Mechtler</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Haack</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Beck-Sickinger</surname>
<given-names>AG</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Sinz</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Isotope-labeled cross-linkers and fourier transform ion cyclotron resonance mass spectrometry for structural analysis of a protein/peptide complex</article-title>
<source>J Am Soc Mass Spectrom</source>
<year>2006</year>
<volume>17</volume>
<issue>8</issue>
<fpage>1100</fpage>
<lpage>1113</lpage>
<pub-id pub-id-type="doi">10.1016/j.jasms.2006.04.020</pub-id>
<pub-id pub-id-type="pmid">16750914</pub-id>
</element-citation>
</ref>
<ref id="CR6">
<label>6</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koning</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Kasper</surname>
<given-names>PT</given-names>
</name>
<name>
<surname>Back</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Nessen</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Vanrobaeys</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Beeumen</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Gherardi</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Koster</surname>
<given-names>CG</given-names>
</name>
<name>
<surname>Jong</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Computer-assisted mass spectrometric analysis of naturally occurring and artificially introduced cross-links in proteins and protein complexes</article-title>
<source>FEBS J</source>
<year>2006</year>
<volume>273</volume>
<issue>2</issue>
<fpage>281</fpage>
<lpage>91</lpage>
<pub-id pub-id-type="doi">10.1111/j.1742-4658.2005.05053.x</pub-id>
<pub-id pub-id-type="pmid">16403016</pub-id>
</element-citation>
</ref>
<ref id="CR7">
<label>7</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maiolica</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Cittaro</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Borsotti</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Sennels</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ciferri</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Tarricone</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Musacchio</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rappsilber</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Structural analysis of multi-protein complexes by cross-linking, mass spectrometry and database searching</article-title>
<source>Mol Cell Proteomics</source>
<year>2007</year>
<volume>6</volume>
<fpage>2200</fpage>
<lpage>211</lpage>
<pub-id pub-id-type="doi">10.1074/mcp.M700274-MCP200</pub-id>
<pub-id pub-id-type="pmid">17921176</pub-id>
</element-citation>
</ref>
<ref id="CR8">
<label>8</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>YJ</given-names>
</name>
<name>
<surname>Lackner</surname>
<given-names>LL</given-names>
</name>
<name>
<surname>Nunnari</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Phinney</surname>
<given-names>BS</given-names>
</name>
</person-group>
<article-title>Shotgun cross-linking analysis for studying quaternary and tertiary protein structures</article-title>
<source>J Proteome Res</source>
<year>2007</year>
<volume>6</volume>
<issue>10</issue>
<fpage>3908</fpage>
<lpage>917</lpage>
<pub-id pub-id-type="doi">10.1021/pr070234i</pub-id>
<pub-id pub-id-type="pmid">17854217</pub-id>
</element-citation>
</ref>
<ref id="CR9">
<label>9</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Singh</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Shaffer</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Scherl</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Holman</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Pfuetzner</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Freeman</surname>
<given-names>TJL</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>SI</given-names>
</name>
<name>
<surname>Hernandez</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Appel</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Goodlett</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Characterization of protein cross-links via mass spectrometry and an open-modification search strategy</article-title>
<source>Anal Chem</source>
<year>2008</year>
<volume>80</volume>
<issue>22</issue>
<fpage>8799</fpage>
<lpage>806</lpage>
<pub-id pub-id-type="doi">10.1021/ac801646f</pub-id>
<pub-id pub-id-type="pmid">18947195</pub-id>
</element-citation>
</ref>
<ref id="CR10">
<label>10</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>ET</given-names>
</name>
<name>
<surname>Hawkins</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Kuntz</surname>
<given-names>ID</given-names>
</name>
<name>
<surname>Rahn</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Rothfuss</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sale</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Pancerella</surname>
<given-names>CM</given-names>
</name>
<name>
<surname>Fabris</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>The collaboratory for MS3D: a new cyberinfrastructure for the structural elucidation of biological macromolecules and their assemblies using mass spectrometry-based approaches</article-title>
<source>J Proteome Res</source>
<year>2008</year>
<volume>7</volume>
<issue>11</issue>
<fpage>4848</fpage>
<lpage>857</lpage>
<pub-id pub-id-type="doi">10.1021/pr800443f</pub-id>
<pub-id pub-id-type="pmid">18817429</pub-id>
</element-citation>
</ref>
<ref id="CR11">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nadeau</surname>
<given-names>OW</given-names>
</name>
<name>
<surname>Wyckoff</surname>
<given-names>GJ</given-names>
</name>
<name>
<surname>Paschall</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Artigues</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sage</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Villar</surname>
<given-names>MT</given-names>
</name>
<name>
<surname>Carlson</surname>
<given-names>GM</given-names>
</name>
</person-group>
<article-title>CrossSearch, a user-friendly search engine for detecting chemically cross-linked peptides in conjugated proteins</article-title>
<source>Mol Cell Proteomics</source>
<year>2008</year>
<volume>7</volume>
<issue>4</issue>
<fpage>739</fpage>
<lpage>49</lpage>
<pub-id pub-id-type="doi">10.1074/mcp.M800020-MCP200</pub-id>
<pub-id pub-id-type="pmid">18281724</pub-id>
</element-citation>
</ref>
<ref id="CR12">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Panchaud</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Shaffer</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Goodlett</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>xComb: a cross-linked peptide database approach to protein-protein interaction analysis</article-title>
<source>J Proteome Res</source>
<year>2010</year>
<volume>9</volume>
<issue>5</issue>
<fpage>2508</fpage>
<lpage>515</lpage>
<pub-id pub-id-type="doi">10.1021/pr9011816</pub-id>
<pub-id pub-id-type="pmid">20302351</pub-id>
</element-citation>
</ref>
<ref id="CR13">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McIlwain</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Draghicescu</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Goodlett</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Noble</surname>
<given-names>WS</given-names>
</name>
</person-group>
<article-title>Detecting cross-linked peptides by searching against a database of cross-linked peptide pairs</article-title>
<source>J Proteome Res</source>
<year>2010</year>
<volume>9</volume>
<issue>5</issue>
<fpage>2488</fpage>
<lpage>495</lpage>
<pub-id pub-id-type="doi">10.1021/pr901163d</pub-id>
<pub-id pub-id-type="pmid">20349954</pub-id>
</element-citation>
</ref>
<ref id="CR14">
<label>14</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Du</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Chowdhury</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Manes</surname>
<given-names>NP</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Mayer</surname>
<given-names>MU</given-names>
</name>
<name>
<surname>Adkins</surname>
<given-names>JN</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>GA</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>RD</given-names>
</name>
</person-group>
<article-title>Xlink-Identifier: an automated data analysis platform for confident identifications of chemically cross-linked peptides using tandem mass spectrometry</article-title>
<source>J Proteome Res</source>
<year>2011</year>
<volume>10</volume>
<issue>3</issue>
<fpage>923</fpage>
<lpage>31</lpage>
<pub-id pub-id-type="doi">10.1021/pr100848a</pub-id>
<pub-id pub-id-type="pmid">21175198</pub-id>
</element-citation>
</ref>
<ref id="CR15">
<label>15</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Holding</surname>
<given-names>AN</given-names>
</name>
<name>
<surname>Lamers</surname>
<given-names>MH</given-names>
</name>
<name>
<surname>Stephens</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Skehel</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Hekate: software suite for the mass spectrometric analysis and three-dimensional visualization of cross-linked protein samples</article-title>
<source>J Proteome Res</source>
<year>2013</year>
<volume>12</volume>
<issue>12</issue>
<fpage>5923</fpage>
<lpage>933</lpage>
<pub-id pub-id-type="doi">10.1021/pr4003867</pub-id>
<pub-id pub-id-type="pmid">24010795</pub-id>
</element-citation>
</ref>
<ref id="CR16">
<label>16</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mueller-Planitz</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Crossfinder-assisted mapping of protein crosslinks formed by site-specifically incorporated crosslinkers</article-title>
<source>Bioinformatics</source>
<year>2015</year>
<volume>31</volume>
<issue>12</issue>
<fpage>2043</fpage>
<lpage>5</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btv083</pub-id>
<pub-id pub-id-type="pmid">25788624</pub-id>
</element-citation>
</ref>
<ref id="CR17">
<label>17</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Petrotchenko</surname>
<given-names>EV</given-names>
</name>
<name>
<surname>Borchers</surname>
<given-names>CH</given-names>
</name>
</person-group>
<article-title>ICC-CLASS: isotopically-coded cleavable crosslinking analysis software suite</article-title>
<source>BMC Bioinforma</source>
<year>2010</year>
<volume>11</volume>
<issue>1</issue>
<fpage>64</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-11-64</pub-id>
</element-citation>
</ref>
<ref id="CR18">
<label>18</label>
<mixed-citation publication-type="other">Kao A, Chiu CL, Vellucci D, Yang Y, Patel VR, Guan S, Randall A, Baldi P, Rychnovsky SD, Huang L. Development of a novel cross-linking strategy for fast and accurate identification of cross-linked peptides of protein complexes. Mol Cell Proteomics. 2010;mcp-M110.</mixed-citation>
</ref>
<ref id="CR19">
<label>19</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Petrotchenko</surname>
<given-names>EV</given-names>
</name>
<name>
<surname>Serpa</surname>
<given-names>JJ</given-names>
</name>
<name>
<surname>Borchers</surname>
<given-names>CH</given-names>
</name>
</person-group>
<article-title>An isotopically coded CID-cleavable biotinylated cross-linker for structural proteomics</article-title>
<source>Mol Cell Proteomics</source>
<year>2011</year>
<volume>10</volume>
<issue>2</issue>
<elocation-id>M110.001420</elocation-id>
<pub-id pub-id-type="doi">10.1074/mcp.M110.001420</pub-id>
</element-citation>
</ref>
<ref id="CR20">
<label>20</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kaake</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Burke</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kandur</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Novtisky</surname>
<given-names>EJ</given-names>
</name>
<name>
<surname>Second</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Duan</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Kao</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A new in vivo cross-linking mass spectrometry platform to define protein–protein interactions in living cells</article-title>
<source>Mol Cell Proteomics</source>
<year>2014</year>
<volume>13</volume>
<issue>12</issue>
<fpage>3533</fpage>
<lpage>543</lpage>
<pub-id pub-id-type="doi">10.1074/mcp.M114.042630</pub-id>
<pub-id pub-id-type="pmid">25253489</pub-id>
</element-citation>
</ref>
<ref id="CR21">
<label>21</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Herzog</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Kahraman</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Boehringer</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Mak</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Bracher</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Walzthoeni</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Leitner</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Beck</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hartl</surname>
<given-names>FU</given-names>
</name>
<name>
<surname>Ban</surname>
<given-names>N</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Structural probing of a protein phosphatase 2A network by chemical cross-linking and mass spectrometry</article-title>
<source>Science</source>
<year>2012</year>
<volume>337</volume>
<issue>6100</issue>
<fpage>1348</fpage>
<lpage>1352</lpage>
<pub-id pub-id-type="doi">10.1126/science.1221483</pub-id>
<pub-id pub-id-type="pmid">22984071</pub-id>
</element-citation>
</ref>
<ref id="CR22">
<label>22</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nguyen</surname>
<given-names>VQ</given-names>
</name>
<name>
<surname>Ranjan</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stengel</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Aebersold</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Leschziner</surname>
<given-names>AE</given-names>
</name>
</person-group>
<article-title>Molecular architecture of the ATP-dependent chromatin-remodeling complex SWR1</article-title>
<source>Cell</source>
<year>2013</year>
<volume>154</volume>
<issue>6</issue>
<fpage>1220</fpage>
<lpage>1231</lpage>
<pub-id pub-id-type="doi">10.1016/j.cell.2013.08.018</pub-id>
<pub-id pub-id-type="pmid">24034246</pub-id>
</element-citation>
</ref>
<ref id="CR23">
<label>23</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Politis</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stengel</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Hall</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Hernández</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Leitner</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Walzthoeni</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Robinson</surname>
<given-names>CV</given-names>
</name>
<name>
<surname>Aebersold</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>A mass spectrometry-based hybrid method for structural modeling of protein complexes</article-title>
<source>Nat Methods</source>
<year>2014</year>
<volume>11</volume>
<issue>4</issue>
<fpage>403</fpage>
<lpage>6</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.2841</pub-id>
<pub-id pub-id-type="pmid">24509631</pub-id>
</element-citation>
</ref>
<ref id="CR24">
<label>24</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Greber</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Boehringer</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Leitner</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Bieri</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Voigts-Hoffmann</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Erzberger</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Leibundgut</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Aebersold</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Ban</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Architecture of the large subunit of the mammalian mitochondrial ribosome</article-title>
<source>Nature</source>
<year>2014</year>
<volume>505</volume>
<issue>7484</issue>
<fpage>515</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1038/nature12890</pub-id>
<pub-id pub-id-type="pmid">24362565</pub-id>
</element-citation>
</ref>
<ref id="CR25">
<label>25</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rinner</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Seebacher</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Walzthoeni</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Mueller</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Beck</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Mueller</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Aebersold</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Identification of cross-linked peptides from large sequence databases</article-title>
<source>Nat Methods</source>
<year>2008</year>
<volume>5</volume>
<issue>4</issue>
<fpage>315</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">18327264</pub-id>
</element-citation>
</ref>
<ref id="CR26">
<label>26</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Walzthoeni</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Claassen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Leitner</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Herzog</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Bohn</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Förster</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Beck</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Aebersold</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>False discovery rate estimation for cross-linked peptides identified by mass spectrometry</article-title>
<source>Nat Methods</source>
<year>2012</year>
<volume>9</volume>
<issue>9</issue>
<fpage>901</fpage>
<lpage>3</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.2103</pub-id>
<pub-id pub-id-type="pmid">22772729</pub-id>
</element-citation>
</ref>
<ref id="CR27">
<label>27</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>YJ</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fan</surname>
<given-names>SB</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chi</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>YX</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>HF</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Identification of cross-linked peptides from complex samples</article-title>
<source>Nat Methods</source>
<year>2012</year>
<volume>9</volume>
<issue>9</issue>
<fpage>904</fpage>
<lpage>6</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.2099</pub-id>
<pub-id pub-id-type="pmid">22772728</pub-id>
</element-citation>
</ref>
<ref id="CR28">
<label>28</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chu</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>PR</given-names>
</name>
<name>
<surname>Burlingame</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Chalkley</surname>
<given-names>RJ</given-names>
</name>
</person-group>
<article-title>Finding chimeras: a bioinformatics strategy for identification of cross-linked peptides</article-title>
<source>Mol Cell Proteomics</source>
<year>2010</year>
<volume>9</volume>
<fpage>25</fpage>
<lpage>31</lpage>
<pub-id pub-id-type="doi">10.1074/mcp.M800555-MCP200</pub-id>
<pub-id pub-id-type="pmid">19809093</pub-id>
</element-citation>
</ref>
<ref id="CR29">
<label>29</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Trnka</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>PR</given-names>
</name>
<name>
<surname>Robinson</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Burlingame</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Chalkley</surname>
<given-names>RJ</given-names>
</name>
</person-group>
<article-title>Matching cross-linked peptide spectra: only as good as the worse identification</article-title>
<source>Mol Cell Proteomics</source>
<year>2014</year>
<volume>13</volume>
<issue>2</issue>
<fpage>420</fpage>
<lpage>34</lpage>
<pub-id pub-id-type="doi">10.1074/mcp.M113.034009</pub-id>
<pub-id pub-id-type="pmid">24335475</pub-id>
</element-citation>
</ref>
<ref id="CR30">
<label>30</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hoopmann</surname>
<given-names>MR</given-names>
</name>
<name>
<surname>Zelter</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>RS</given-names>
</name>
<name>
<surname>Riffle</surname>
<given-names>M</given-names>
</name>
<name>
<surname>MacCoss</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>TN</given-names>
</name>
<name>
<surname>Moritz</surname>
<given-names>RL</given-names>
</name>
</person-group>
<article-title>Kojak: efficient analysis of chemically cross-linked protein complexes</article-title>
<source>J Proteome Res</source>
<year>2015</year>
<volume>14</volume>
<issue>5</issue>
<fpage>2190</fpage>
<lpage>198</lpage>
<pub-id pub-id-type="doi">10.1021/pr501321h</pub-id>
<pub-id pub-id-type="pmid">25812159</pub-id>
</element-citation>
</ref>
<ref id="CR31">
<label>31</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Storey</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Tibshirani</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Statistical significance for genomewide studies</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2003</year>
<volume>100</volume>
<issue>16</issue>
<fpage>9440</fpage>
<lpage>445</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.1530509100</pub-id>
<pub-id pub-id-type="pmid">12883005</pub-id>
</element-citation>
</ref>
<ref id="CR32">
<label>32</label>
<mixed-citation publication-type="other">Chen T, Jaffe JD, Church GM. Algorithms for identifying protein cross-links via tandem mass spectrometry. In: Proceedings of the fifth annual international conference on Computational biology. ACM: 2001. p. 95–102.</mixed-citation>
</ref>
<ref id="CR33">
<label>33</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bohn</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Beck</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Sakata</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Walzthoeni</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Beck</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Aebersold</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Förster</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Baumeister</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Nickell</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Structure of the 26S proteasome from Schizosaccharomyces pombe at subnanometer resolution</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2010</year>
<volume>107</volume>
<issue>49</issue>
<fpage>20992</fpage>
<lpage>0997</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.1015530107</pub-id>
<pub-id pub-id-type="pmid">21098295</pub-id>
</element-citation>
</ref>
<ref id="CR34">
<label>34</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nesvizhskii</surname>
<given-names>AI</given-names>
</name>
</person-group>
<article-title>A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics</article-title>
<source>J Proteomics</source>
<year>2010</year>
<volume>73</volume>
<issue>11</issue>
<fpage>2092</fpage>
<lpage>123</lpage>
<pub-id pub-id-type="doi">10.1016/j.jprot.2010.08.009</pub-id>
</element-citation>
</ref>
<ref id="CR35">
<label>35</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kertesz-Farkas</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Keich</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Noble</surname>
<given-names>WS</given-names>
</name>
</person-group>
<article-title>Tandem mass spectrum identification via cascaded search</article-title>
<source>J Proteome Res</source>
<year>2015</year>
<volume>14</volume>
<issue>8</issue>
<fpage>3027</fpage>
<lpage>38</lpage>
<pub-id pub-id-type="doi">10.1021/pr501173s</pub-id>
<pub-id pub-id-type="pmid">26084232</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000160  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000160  | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024