MERS Exploration Server

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction</title>
<author>
<name sortKey="Larsen, Mette V" sort="Larsen, Mette V" uniqKey="Larsen M" first="Mette V" last="Larsen">Mette V. Larsen</name>
<affiliation>
<nlm:aff id="I1">Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lundegaard, Claus" sort="Lundegaard, Claus" uniqKey="Lundegaard C" first="Claus" last="Lundegaard">Claus Lundegaard</name>
<affiliation>
<nlm:aff id="I1">Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lamberth, Kasper" sort="Lamberth, Kasper" uniqKey="Lamberth K" first="Kasper" last="Lamberth">Kasper Lamberth</name>
<affiliation>
<nlm:aff id="I2">Institute for Medical Microbiology and Immunology, Panum Institute 18.3.12, Blegdamsvej 3, DK-2200 Copenhagen N, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Buus, Soren" sort="Buus, Soren" uniqKey="Buus S" first="Soren" last="Buus">Soren Buus</name>
<affiliation>
<nlm:aff id="I2">Institute for Medical Microbiology and Immunology, Panum Institute 18.3.12, Blegdamsvej 3, DK-2200 Copenhagen N, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lund, Ole" sort="Lund, Ole" uniqKey="Lund O" first="Ole" last="Lund">Ole Lund</name>
<affiliation>
<nlm:aff id="I1">Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Nielsen, Morten" sort="Nielsen, Morten" uniqKey="Nielsen M" first="Morten" last="Nielsen">Morten Nielsen</name>
<affiliation>
<nlm:aff id="I1">Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">17973982</idno>
<idno type="pmc">2194739</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2194739</idno>
<idno type="RBID">PMC:2194739</idno>
<idno type="doi">10.1186/1471-2105-8-424</idno>
<date when="2007">2007</date>
<idno type="wicri:Area/Pmc/Corpus">000542</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000542</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction</title>
<author>
<name sortKey="Larsen, Mette V" sort="Larsen, Mette V" uniqKey="Larsen M" first="Mette V" last="Larsen">Mette V. Larsen</name>
<affiliation>
<nlm:aff id="I1">Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lundegaard, Claus" sort="Lundegaard, Claus" uniqKey="Lundegaard C" first="Claus" last="Lundegaard">Claus Lundegaard</name>
<affiliation>
<nlm:aff id="I1">Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lamberth, Kasper" sort="Lamberth, Kasper" uniqKey="Lamberth K" first="Kasper" last="Lamberth">Kasper Lamberth</name>
<affiliation>
<nlm:aff id="I2">Institute for Medical Microbiology and Immunology, Panum Institute 18.3.12, Blegdamsvej 3, DK-2200 Copenhagen N, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Buus, Soren" sort="Buus, Soren" uniqKey="Buus S" first="Soren" last="Buus">Soren Buus</name>
<affiliation>
<nlm:aff id="I2">Institute for Medical Microbiology and Immunology, Panum Institute 18.3.12, Blegdamsvej 3, DK-2200 Copenhagen N, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lund, Ole" sort="Lund, Ole" uniqKey="Lund O" first="Ole" last="Lund">Ole Lund</name>
<affiliation>
<nlm:aff id="I1">Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Nielsen, Morten" sort="Nielsen, Morten" uniqKey="Nielsen M" first="Morten" last="Nielsen">Morten Nielsen</name>
<affiliation>
<nlm:aff id="I1">Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2007">2007</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Reliable predictions of cytotoxic T-lymphocyte (CTL) epitopes are essential for rational vaccine design. Most importantly, they can minimize the experimental effort needed to identify epitopes. NetCTL is a web-based tool designed for predicting human CTL epitopes in any given protein. It does so by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. At least four other methods have recently been developed that likewise attempt to predict CTL epitopes: EpiJen, MAPPP, MHC-pathway, and WAPP. To compare the performance of prediction methods, objective benchmarks and standardized performance measures are needed. Here, we develop such a large-scale benchmark and corresponding performance measures, and report the performance of an updated version of NetCTL, version 1.2, in comparison with the four other methods.</p>
</sec>
<sec>
<title>Results</title>
<p>We define a number of performance measures that can handle the different types of output data from the five methods. We use two evaluation datasets consisting of known HIV CTL epitopes and their source proteins. The source proteins are split into all possible 9-mers, and all 9-mers other than the annotated epitopes are considered non-epitopes. In the RANK measure, we compare two methods at a time and count how often each method ranks the epitope higher. In another measure, we find the specificity of the methods at three predefined sensitivity values. Lastly, for each method, we calculate the percentage of known epitopes that rank among the 5% of peptides with the highest predicted scores.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>NetCTL-1.2 is demonstrated to have a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all performance measures. The higher performance of NetCTL-1.2 compared with EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity above 0.72 among the 5% top-scoring peptides. On this dataset, the best of the other methods achieved a sensitivity of 0.64. The NetCTL-1.2 method is available at
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/NetCTL"></ext-link>
.</p>
<p>All datasets used are available at
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php"></ext-link>
.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-title>BMC Bioinformatics</journal-title>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">17973982</article-id>
<article-id pub-id-type="pmc">2194739</article-id>
<article-id pub-id-type="publisher-id">1471-2105-8-424</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-8-424</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction</article-title>
</title-group>
<contrib-group>
<contrib id="A1" corresp="yes" contrib-type="author">
<name>
<surname>Larsen</surname>
<given-names>Mette V</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>metteb@cbs.dtu.dk</email>
</contrib>
<contrib id="A2" contrib-type="author">
<name>
<surname>Lundegaard</surname>
<given-names>Claus</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>lunde@cbs.dtu.dk</email>
</contrib>
<contrib id="A3" contrib-type="author">
<name>
<surname>Lamberth</surname>
<given-names>Kasper</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>k.lamberth@immi.ku.dk</email>
</contrib>
<contrib id="A4" contrib-type="author">
<name>
<surname>Buus</surname>
<given-names>Soren</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>s.buus@immi.ku.dk</email>
</contrib>
<contrib id="A5" contrib-type="author">
<name>
<surname>Lund</surname>
<given-names>Ole</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>lund@cbs.dtu.dk</email>
</contrib>
<contrib id="A6" contrib-type="author">
<name>
<surname>Nielsen</surname>
<given-names>Morten</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>mniel@cbs.dtu.dk</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark</aff>
<aff id="I2">
<label>2</label>
Institute for Medical Microbiology and Immunology, Panum Institute 18.3.12, Blegdamsvej 3, DK-2200 Copenhagen N, Denmark</aff>
<pub-date pub-type="collection">
<year>2007</year>
</pub-date>
<pub-date pub-type="epub">
<day>31</day>
<month>10</month>
<year>2007</year>
</pub-date>
<volume>8</volume>
<fpage>424</fpage>
<lpage>424</lpage>
<ext-link ext-link-type="uri" xlink:href="http://www.biomedcentral.com/1471-2105/8/424"></ext-link>
<history>
<date date-type="received">
<day>21</day>
<month>2</month>
<year>2007</year>
</date>
<date date-type="accepted">
<day>31</day>
<month>10</month>
<year>2007</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2007 Larsen et al; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2007</copyright-year>
<copyright-holder>Larsen et al; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0"></ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</p>
<pmc-comment> Larsen V Mette metteb@cbs.dtu.dk Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction 2007BMC Bioinformatics 8(1): 424-. (2007)1471-2105(2007)8:1&lt;424&gt;urn:ISSN:1471-2105</pmc-comment>
</license>
</permissions>
<abstract>
<sec>
<title>Background</title>
<p>Reliable predictions of cytotoxic T-lymphocyte (CTL) epitopes are essential for rational vaccine design. Most importantly, they can minimize the experimental effort needed to identify epitopes. NetCTL is a web-based tool designed for predicting human CTL epitopes in any given protein. It does so by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. At least four other methods have recently been developed that likewise attempt to predict CTL epitopes: EpiJen, MAPPP, MHC-pathway, and WAPP. To compare the performance of prediction methods, objective benchmarks and standardized performance measures are needed. Here, we develop such a large-scale benchmark and corresponding performance measures, and report the performance of an updated version of NetCTL, version 1.2, in comparison with the four other methods.</p>
</sec>
<sec>
<title>Results</title>
<p>We define a number of performance measures that can handle the different types of output data from the five methods. We use two evaluation datasets consisting of known HIV CTL epitopes and their source proteins. The source proteins are split into all possible 9-mers, and all 9-mers other than the annotated epitopes are considered non-epitopes. In the RANK measure, we compare two methods at a time and count how often each method ranks the epitope higher. In another measure, we find the specificity of the methods at three predefined sensitivity values. Lastly, for each method, we calculate the percentage of known epitopes that rank among the 5% of peptides with the highest predicted scores.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>NetCTL-1.2 is demonstrated to have a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all performance measures. The higher performance of NetCTL-1.2 compared with EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity above 0.72 among the 5% top-scoring peptides. On this dataset, the best of the other methods achieved a sensitivity of 0.64. The NetCTL-1.2 method is available at
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/NetCTL"></ext-link>
.</p>
<p>All datasets used are available at
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php"></ext-link>
.</p>
</sec>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>The CTLs of the immune system must be able to discriminate between healthy and infected cells, since only the infected cells are to be eliminated. To facilitate the discrimination, all nucleated cells present a selection of the peptides contained in their proteins on the cell surface in complex with Major Histocompatibility Complex class I (MHC class I) molecules. The course of events leading to MHC class I presentation includes the ongoing degradation of the cell's proteins by the proteasome [
<xref ref-type="bibr" rid="B1">1</xref>
-
<xref ref-type="bibr" rid="B5">5</xref>
]. A subset of the generated peptides is then transported into the endoplasmic reticulum (ER) by transporter associated with antigen processing (TAP) molecules [
<xref ref-type="bibr" rid="B6">6</xref>
-
<xref ref-type="bibr" rid="B8">8</xref>
]. Once inside the ER, the peptides may bind to MHC class I molecules, which are subsequently transported to the cell surface, where the complexes may be recognized by passing CTLs. The most restrictive step involved in antigen presentation is binding of the peptide to MHC class I. It is estimated that only 1 out of 200 peptides will bind a given MHC class I allele with sufficient strength to elicit a CTL response [
<xref ref-type="bibr" rid="B9">9</xref>
]. However, proteasomal cleavage and TAP transport efficiency also show some degree of specificity [
<xref ref-type="bibr" rid="B4">4</xref>
,
<xref ref-type="bibr" rid="B9">9</xref>
].</p>
<p>Reliable predictions of immunogenic peptides can minimize the experimental effort needed to identify new epitopes to be used in, for example, vaccine design or for diagnostic purposes. We have previously described a method, NetCTL (hereafter referred to as NetCTL-1.0), which integrates the predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity into an overall prediction of CTL epitopes [
<xref ref-type="bibr" rid="B10">10</xref>
]. In the following, we describe an improved version of NetCTL, version 1.2. Several other groups have likewise attempted to generate methods that enable CTL epitope identification. On an independent evaluation dataset of known HIV CTL epitopes, NetCTL-1.0 has previously been shown to have a higher predictive performance than the publicly available SYFPEITHI Epitope Prediction method [
<xref ref-type="bibr" rid="B11">11</xref>
,
<xref ref-type="bibr" rid="B12">12</xref>
] and the BIMAS HLA Peptide Binding Prediction method [
<xref ref-type="bibr" rid="B13">13</xref>
,
<xref ref-type="bibr" rid="B14">14</xref>
]. Here, we compare the performance of NetCTL-1.2 to four other publicly available methods, which have been described within the last few years: MAPPP [
<xref ref-type="bibr" rid="B15">15</xref>
], which combines proteasomal cleavage predictions with MHC class I affinity predictions, and EpiJen [
<xref ref-type="bibr" rid="B16">16</xref>
], MHC-pathway [
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B18">18</xref>
], and WAPP [
<xref ref-type="bibr" rid="B19">19</xref>
], which combine predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. Even for skilled scientists within the field, it is not straightforward to compare the performance of the various methods, since they do not necessarily have the same output format and do not cover the same output range. In addition, many different performance measures can be applied, but not all are equally well suited for every method. It is also important to keep in mind that some performance measures are not meaningful on their own. An example of the latter is the performance measure
<italic>sensitivity</italic>
. When searching for CTL epitopes among a large number of peptides, sensitivity is defined as the number of peptides correctly predicted to be CTL epitopes (the number of True Positives, TP) divided by the total number of CTL epitopes in the dataset (the Actual Positives, AP). A method that finds all CTL epitopes has a sensitivity of 1. This performance can, however, easily be achieved by a method that predicts every peptide to be a CTL epitope. Such a method is clearly useless, because its specificity is 0. In this study, we have defined a number of performance measures that together give an objective assessment of the methods. On all measures, we find that NetCTL-1.2 has a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP, although when comparing NetCTL-1.2 with EpiJen and MHC-pathway, the higher predictive performance of NetCTL-1.2 is not statistically significant on all measures.</p>
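<p>The point about sensitivity can be illustrated with a minimal Python sketch. It assumes the standard definitions of sensitivity (TP divided by AP) and specificity (true negatives divided by actual negatives); the peptide names and helper functions are hypothetical and are not part of any of the benchmarked methods.</p>
<preformat>
```python
def sensitivity(predicted, actual):
    """TP / AP: fraction of the actual epitopes that were predicted."""
    true_positives = len(predicted.intersection(actual))
    return true_positives / len(actual)


def specificity(predicted, actual, all_peptides):
    """TN / AN: fraction of the non-epitopes that were left unpredicted."""
    negatives = all_peptides - actual
    true_negatives = len(negatives - predicted)
    return true_negatives / len(negatives)


# Hypothetical toy data: ten peptides, two of which are epitopes.
all_peptides = {"pep%d" % i for i in range(10)}
actual = {"pep0", "pep1"}

# A predictor that calls every peptide an epitope.
predict_all = set(all_peptides)
print(sensitivity(predict_all, actual))                # 1.0
print(specificity(predict_all, actual, all_peptides))  # 0.0
```
</preformat>
<p>In this toy case the predictor that labels everything an epitope reaches a sensitivity of 1 only at the cost of a specificity of 0, which is why the two measures need to be read together.</p>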
</sec>
<sec>
<title>Results</title>
<sec>
<title>NetCTL-1.2</title>
<p>NetCTL predicts CTL epitopes by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I binding [
<xref ref-type="bibr" rid="B10">10</xref>
]. Version 1.2 is an improvement in several respects. First, it also predicts epitopes restricted to the A26 and B39 supertypes, thus completing the list of 12 recognized supertypes [
<xref ref-type="bibr" rid="B20">20</xref>
]. Second, it performs better than the older version 1.0. This is partly due to the use of newer methods for predicting MHC class I affinity and proteasomal cleavage. Furthermore, a larger dataset has been used to deduce the optimal weights on proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. When testing the performance of NetCTL-1.0 versus NetCTL-1.2 on the independent HIV evaluation dataset consisting of 216 known CTL epitopes, NetCTL-1.0 has an average AUC (Area Under the ROC Curve) per epitope-protein pair of 0.931, while NetCTL-1.2 has an average AUC per epitope-protein pair of 0.941. This difference in predictive performance between NetCTL-1.0 and NetCTL-1.2 is significant at P = 0.02 (paired t-test). For comparison, NetMHC-3.0
<sup>NO_HIV</sup>
, which is the MHC class I affinity predictor used in NetCTL-1.2, has an average AUC per epitope-protein pair of 0.922. The difference in predictive performance between NetCTL-1.2 and NetMHC-3.0
<sup>NO_HIV </sup>
is significant at P = 0.004 (paired t-test).</p>
</sec>
<sec>
<title>Comparing different methods for CTL epitope prediction by using the AUC value</title>
<p>We wanted to compare the performance of NetCTL-1.2 to that of four other publicly available CTL epitope prediction methods: EpiJen [
<xref ref-type="bibr" rid="B16">16</xref>
], MAPPP [
<xref ref-type="bibr" rid="B15">15</xref>
], MHC-pathway [
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B18">18</xref>
], and WAPP [
<xref ref-type="bibr" rid="B19">19</xref>
]. For the comparisons, we use two evaluation sets containing experimentally verified HIV CTL epitopes and their source proteins. The HIV dataset, which we compiled ourselves, contains 216 epitope-protein pairs covering all 12 recognized supertypes. When comparing the performance of NetCTL-1.2 to that of any of the other four methods, only the subset of supertypes also covered by the test method is included. The other dataset is called HIV
<sup>EpiJen</sup>
. It was taken almost in its entirety from [
<xref ref-type="bibr" rid="B16">16</xref>
] and contains 87 epitopes restricted to the A1, A2, or A3 supertypes. All five methods can perform predictions for these three supertypes.</p>
<p>In the above section, we used the AUC value to compare NetCTL-1.2 to NetCTL-1.0 and NetMHC-3.0
<sup>NO_HIV</sup>
. This measure is, however, not appropriate for the EpiJen and WAPP methods. These methods do not produce a single, combined score for each peptide in the dataset. Instead, the proteasomal cleavage and TAP transport predictors act as filters that reduce the number of possible epitopes. In addition, the EpiJen server outputs at most the 5% of peptides that have the highest predicted MHC class I affinity and also pass the proteasomal cleavage and TAP transport filters. The problem is exemplified in the ROC (Receiver Operating Characteristic) curve shown in Figure
<xref ref-type="fig" rid="F1">1</xref>
. For NetCTL-1.2, MAPPP, and MHC-pathway, the combined score is used as the predicted value. For EpiJen and WAPP, we used the predicted MHC class I affinity as the predicted value. The ROC curves for the latter two methods stop abruptly, since there are no predicted values for peptides that do not pass the proteasomal cleavage and TAP transport filters. The ROC curves also highlight the need to extract sensitivity at comparable specificity levels, and vice versa, in order to achieve objective benchmark comparisons between different methods: any of the methods can appear to have the highest sensitivity if the specificity is not set at a comparable level.</p>
<fig position="float" id="F1">
<label>Figure 1</label>
<caption>
<p>
<bold>ROC curves</bold>
. The analysis has been performed on 41 A3-restricted epitope-protein pairs from the HIV dataset.</p>
</caption>
<graphic xlink:href="1471-2105-8-424-1"></graphic>
</fig>
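<p>As a rough illustration of the AUC per epitope-protein pair, the following Python sketch assumes that each pair contains exactly one known epitope and that every peptide of the source protein has a score. Under those assumptions the AUC reduces to the fraction of non-epitope peptides scored below the epitope. The function names and toy scores are hypothetical, and the tie handling (half credit) is an assumption rather than something stated in the text.</p>
<preformat>
```python
def pair_auc(scores, epitope):
    """AUC for one epitope-protein pair that has a single known epitope.

    scores maps each peptide of the source protein to its predicted score.
    With one positive, the AUC equals the fraction of non-epitope peptides
    scored below the epitope (ties count one half).
    """
    epitope_score = scores[epitope]
    others = [s for pep, s in scores.items() if pep != epitope]
    wins = sum(1.0 for s in others if epitope_score > s)
    ties = sum(0.5 for s in others if epitope_score == s)
    return (wins + ties) / len(others)


def average_auc(pairs):
    """Mean AUC over a list of (scores, epitope) pairs."""
    return sum(pair_auc(scores, ep) for scores, ep in pairs) / len(pairs)


# Hypothetical toy scores for two short proteins.
pair_1 = ({"A": 0.9, "B": 0.5, "C": 0.7, "D": 0.1, "E": 0.3}, "C")
pair_2 = ({"F": 0.2, "G": 0.8, "H": 0.4}, "G")

print(pair_auc(*pair_1))              # 0.75
print(average_auc([pair_1, pair_2]))  # 0.875
```
</preformat>
<p>The sketch needs a score for every peptide, which is exactly what the filter-based EpiJen and WAPP outputs do not provide.</p>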
</sec>
<sec>
<title>The RANK measure</title>
<p>Since the AUC measure is not applicable to all methods, we designed a new measure, which we call the RANK measure. Looking at each epitope-protein pair separately for either the HIV or HIV
<sup>EpiJen </sup>
dataset, we rank all possible 9-mers according to the prediction score of a given method. Next, we compare two methods at a time: NetCTL-1.2 and one of the four test methods (EpiJen, MAPPP, MHC-pathway, or WAPP). Again, we use the combined score as the predicted value for NetCTL-1.2, MAPPP, and MHC-pathway, and the predicted MHC class I affinity for EpiJen and WAPP. We then count how often NetCTL-1.2 ranks the epitope higher than the test method, and vice versa. To facilitate a fair comparison to the EpiJen and WAPP methods, where predictions are limited to a subset of the peptides, only the top N of the NetCTL-1.2 predictions are included, where N is the number of peptides assigned a prediction score by the test method (EpiJen or WAPP). All peptides without a predicted value are assigned the rank 9999 to put them at the bottom of the rank list. In this way, all methods are compared on an equal number of peptides. Figure
<xref ref-type="fig" rid="F2">2</xref>
shows the results. In Figure
<xref ref-type="fig" rid="F2">2A</xref>
, it is seen that, in all comparisons on the HIV dataset, NetCTL-1.2 ranks the epitope higher more frequently than the test method does. The difference is significant at P &lt; 0.01 (binomial test). In Figure
<xref ref-type="fig" rid="F2">2B</xref>
, the results are shown for the HIV
<sup>EpiJen </sup>
dataset. Here too, NetCTL-1.2 more frequently ranks the epitope higher than the test method. For WAPP the difference is significant at P &lt; 0.01, while for EpiJen, MAPPP, and MHC-pathway the difference is significant at P &lt; 0.05 (binomial test).</p>
<fig position="float" id="F2">
<label>Figure 2</label>
<caption>
<p>
<bold>Performance on the RANK measure</bold>
. For each epitope-protein pair, the rank that is assigned to the epitope when using NetCTL-1.2 is compared to the rank assigned when using the test method (EpiJen, MAPPP, MHC-pathway, or WAPP). The height of the bars indicates how often NetCTL-1.2 or the test method, respectively, ranks the epitope higher.
<bold>A: </bold>
The HIV dataset has been used for the analysis. When comparing NetCTL-1.2 to each of the test methods, only predictions for supertypes that the test method covers are included.
<bold>B: </bold>
The HIV
<sup>EpiJen </sup>
dataset has been used for the analysis. ** The difference is significant at P &lt; 0.01. * The difference is significant at P &lt; 0.05.</p>
</caption>
<graphic xlink:href="1471-2105-8-424-2"></graphic>
</fig>
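<p>The RANK comparison described in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names and toy scores are hypothetical, ties are reported separately and dropped from the test, and the two-sided exact binomial test against a probability of 0.5 is one plausible reading of the "binomial test" mentioned in the text, not a confirmed detail of the original analysis.</p>
<preformat>
```python
from math import comb

UNSCORED_RANK = 9999  # rank given to peptides without a predicted value


def epitope_rank(scores, epitope, top_n=None):
    """Rank (1 = best) of the epitope among the scored peptides.

    scores maps peptide to score; peptides missing from it are unscored.
    If top_n is given, only the top_n best-scoring peptides are kept.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    if top_n is not None:
        ordered = ordered[:top_n]
    if epitope in ordered:
        return ordered.index(epitope) + 1
    return UNSCORED_RANK


def compare_pair(netctl_scores, test_scores, epitope, filtered=False):
    """Return 'netctl', 'test', or 'tie' for one epitope-protein pair.

    For filter-based test methods, NetCTL is cut to the same number N of
    peptides that the test method scored.
    """
    top_n = len(test_scores) if filtered else None
    rank_a = epitope_rank(netctl_scores, epitope, top_n)
    rank_b = epitope_rank(test_scores, epitope)
    if rank_a == rank_b:
        return "tie"
    return "netctl" if rank_b > rank_a else "test"


def binomial_two_sided(wins_a, wins_b):
    """Two-sided exact binomial test against p = 0.5 (ties dropped)."""
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy example: the test method filters out the epitope "C".
netctl = {"A": 0.2, "B": 0.9, "C": 0.8, "D": 0.1, "E": 0.5}
test_method = {"B": 0.7, "E": 0.6, "A": 0.4}
print(compare_pair(netctl, test_method, "C", filtered=True))  # netctl
print(round(binomial_two_sided(15, 5), 4))                    # 0.0414
```
</preformat>
<p>In the toy example the epitope is ranked second by the truncated NetCTL list and receives rank 9999 from the test method, so the pair counts in favour of NetCTL.</p>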
</sec>
<sec>
<title>Specificity at a predefined sensitivity</title>
<p>When using the default settings at the NetCTL-1.2, MAPPP, and WAPP servers, thresholds are defined that separate the predicted epitopes from the predicted non-epitopes. At the EpiJen server, one can choose between defining the top-scoring 5%, 4%, 3%, or 2% of peptides as epitopes. MHC-pathway does not yet offer any thresholds for separating predicted epitopes from non-epitopes. These differences pose a challenge when comparing the performance of the methods with regard to sensitivity and specificity, since calculating these measures requires that the predicted epitopes can be separated from the non-epitopes. Furthermore, as mentioned earlier, it is generally difficult to determine which method has the highest predictive performance if one method has the highest sensitivity while the other has the highest specificity. To overcome these problems, we chose to compare the specificity of the methods at a series of predefined sensitivity values. We chose three predefined sensitivities: 0.3, 0.5, and 0.8. For the HIV dataset, we again compared two methods at a time, NetCTL-1.2 and one of the four test methods, in order to include epitopes restricted to as many supertypes as possible. For the HIV
<sup>EpiJen </sup>
dataset, all methods can be compared simultaneously, since all methods can predict epitopes restricted to the A1, A2, and A3 supertypes. We first identified the prediction threshold values that result in the desired sensitivity when averaging over all epitope-protein pairs. We then used the same thresholds to find the average specificity. Figure
<xref ref-type="fig" rid="F3">3</xref>
shows the results for the HIV dataset. NetCTL-1.2 has a significantly higher specificity than EpiJen, MAPPP, and WAPP at all sensitivities (P &lt; 0.01, unpaired Student's t-test). At average sensitivities of 0.3 and 0.5, NetCTL-1.2 has a higher specificity than MHC-pathway, although this difference is not statistically significant. At an average sensitivity of 0.8, NetCTL-1.2 has a significantly higher specificity than MHC-pathway (P &lt; 0.05, unpaired Student's t-test).</p>
<fig position="float" id="F3">
<label>Figure 3</label>
<caption>
<p>
<bold>Comparing specificities</bold>
. The HIV dataset has been used for the analysis. In order to include epitopes restricted to as many supertypes as possible, NetCTL-1.2 is compared to each of the other methods separately. For each comparison, only predictions for supertypes that the test method covers are included. The average specificity is found at a predefined average sensitivity using either NetCTL-1.2 or one of the four test methods (EpiJen, MAPPP, MHC-pathway, WAPP).
<bold>A: </bold>
Average sensitivity = 0.3,
<bold>B: </bold>
Average sensitivity = 0.5,
<bold>C: </bold>
Average sensitivity = 0.8. Only NetCTL-1.2, MAPPP, and MHC-pathway provide enough predicted scores to obtain a sensitivity of 0.8. The error bars show the standard error. ** The difference is significant at P &lt; 0.01. * The difference is significant at P &lt; 0.05.</p>
</caption>
<graphic xlink:href="1471-2105-8-424-3"></graphic>
</fig>
<p>When using the HIV
<sup>EpiJen </sup>
dataset for the analysis, NetCTL-1.2 has a higher specificity than all the test methods at all sensitivities, although for EpiJen and MHC-pathway the difference is not statistically significant at all sensitivities (the results are available as supplementary material at [
<xref ref-type="bibr" rid="B21">21</xref>
]).</p>
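<p>The procedure of fixing the average sensitivity and then reading off the average specificity can be sketched in Python as follows. The sketch assumes one known epitope per epitope-protein pair, a score for every peptide, and a single score threshold shared by all pairs. The function names and toy scores are hypothetical, and choosing the highest threshold that reaches the target is an assumption about a detail the text leaves open.</p>
<preformat>
```python
def average_sensitivity(pairs, threshold):
    """Fraction of pairs whose epitope scores at or above the threshold."""
    hits = sum(1 for scores, ep in pairs if scores[ep] >= threshold)
    return hits / len(pairs)


def average_specificity(pairs, threshold):
    """Mean per-pair fraction of non-epitopes scoring below the threshold."""
    total = 0.0
    for scores, ep in pairs:
        others = [s for pep, s in scores.items() if pep != ep]
        rejected = sum(1 for s in others if threshold > s)
        total += rejected / len(others)
    return total / len(pairs)


def specificity_at_sensitivity(pairs, target):
    """Return (threshold, average specificity) at the target sensitivity.

    Epitope scores are tried from highest to lowest, so the first threshold
    that reaches the target is the strictest one. Returns None if the target
    cannot be reached.
    """
    candidates = sorted((scores[ep] for scores, ep in pairs), reverse=True)
    for threshold in candidates:
        if average_sensitivity(pairs, threshold) >= target:
            return threshold, average_specificity(pairs, threshold)
    return None


# Hypothetical toy data: two epitope-protein pairs.
pairs = [
    ({"A": 0.9, "B": 0.4, "C": 0.2}, "A"),
    ({"D": 0.6, "E": 0.7, "F": 0.1}, "D"),
]
print(specificity_at_sensitivity(pairs, 0.5))  # (0.9, 1.0)
print(specificity_at_sensitivity(pairs, 1.0))  # (0.6, 0.75)
```
</preformat>
<p>Raising the required sensitivity from 0.5 to 1.0 lowers the threshold and with it the average specificity, which is the trade-off the comparison at fixed sensitivity is designed to control.</p>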
</sec>
<sec>
<title>Sensitivity among the 5% top-scoring peptides</title>
<p>For an experimentalist who wants to find epitopes in a specific protein, it is useful to know how many of the actual epitopes one can expect to find when testing a certain top fraction of the peptides. To this end, we calculate the sensitivity among the 5% top-scoring peptides. For the HIV dataset, we made the calculations for NetCTL-1.2 and one of the four test methods at a time. For the HIV
<sup>EpiJen </sup>
dataset, all methods could be compared using the same dataset, since all methods can predict epitopes restricted to the A1, A2, and A3 supertypes. Tables
<xref ref-type="table" rid="T1">1</xref>
and
<xref ref-type="table" rid="T2">2</xref>
show the results. Table
<xref ref-type="table" rid="T1">1</xref>
shows that when NetCTL-1.2 is compared separately to each of the test methods using the HIV dataset, NetCTL-1.2 has the higher sensitivity among the 5% top-scoring peptides in every comparison, with sensitivity values in the range of 0.70–0.78 depending on the subset of the dataset used. When evaluating on the HIV
<sup>EpiJen </sup>
dataset (Table
<xref ref-type="table" rid="T2">2</xref>
), NetCTL-1.2 also achieves the highest sensitivity (0.75). On this dataset, MAPPP achieves the second highest sensitivity (0.64), closely followed by MHC-pathway (0.63). EpiJen achieves a sensitivity of 0.60, while WAPP only achieves a sensitivity of 0.44 among the 5% top-scoring peptides.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption>
<p>Determining the sensitivity among the 5% top-scoring peptides on the HIV dataset</p>
</caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td></td>
<td align="left">NetCTL-1.2</td>
<td align="left">EpiJen</td>
<td align="left">NetCTL-1.2</td>
<td align="left">MAPPP</td>
<td align="left">NetCTL-1.2</td>
<td align="left">MHC-pathway</td>
<td align="left">NetCTL-1.2</td>
<td align="left">WAPP</td>
</tr>
<tr>
<td colspan="3">
<hr></hr>
</td>
<td colspan="2">
<hr></hr>
</td>
<td colspan="2">
<hr></hr>
</td>
<td colspan="2">
<hr></hr>
</td>
</tr>
<tr>
<td align="left">HIV</td>
<td align="left">0.72</td>
<td align="left">0.63</td>
<td align="left">0.70</td>
<td align="left">0.57</td>
<td align="left">0.70</td>
<td align="left">0.64</td>
<td align="left">0.78</td>
<td align="left">0.44</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The HIV dataset has been used for the analysis. To be able to include epitopes restricted to as many supertypes as possible, NetCTL-1.2 is compared to each of the other methods separately. For each comparison, only predictions for supertypes covered by the test method are included.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption>
<p>Determining the sensitivity among the 5% top-scoring peptides on the HIV
<sup>EpiJen </sup>
dataset</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td></td>
<td align="left">NetCTL-1.2</td>
<td align="left">EpiJen</td>
<td align="left">MAPPP</td>
<td align="left">MHC-pathway</td>
<td align="left">WAPP</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left">HIV
<sup>EpiJen</sup>
</td>
<td align="left">0.75</td>
<td align="left">0.60</td>
<td align="left">0.64</td>
<td align="left">0.63</td>
<td align="left">0.44</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The HIV
<sup>EpiJen </sup>
dataset has been used for the analysis. All methods can be compared simultaneously since this dataset only contains epitopes restricted to the A1, A2, or A3 supertypes, which all methods cover.</p>
</table-wrap-foot>
</table-wrap>
</sec>
</sec>
<sec>
<title>Discussion</title>
<p>Reliable CTL epitope predictions can minimize the experimental effort needed to identify new CTL epitopes for use in, for example, vaccine design or diagnostics. Tong et al. [
<xref ref-type="bibr" rid="B22">22</xref>
] comment on the reports of algorithms that integrate MHC class I predictions with TAP and proteasomal cleavage specificities: "These techniques are still in their infancy and need to be further developed and thoroughly tested". Here, we make a first attempt to test the performance of five of these methods on two evaluation sets of experimentally verified HIV CTL epitopes. Designing an objective benchmark turned out to be a highly non-trivial task, mainly because each prediction method generates epitope predictions in its own format, potentially with different mechanisms that limit the number of prediction scores made available to the user. Our performance measures consist firstly of a RANK measure that allows an objective comparison of accuracy between the different prediction methods. Secondly, for comparing prediction specificity, we define three levels of prediction sensitivity, so that comparisons can be performed at equal levels. Finally, we compare the sensitivity among the 5% top-scoring peptides as obtained by each method.</p>
<p>Using the defined performance measures, we performed a large-scale benchmark calculation comparing the predictive performance of a series of publicly available methods for CTL epitope prediction. The benchmark included the EpiJen, MAPPP, WAPP, and MHC-pathway methods, and an updated version of the NetCTL method. The updated version of NetCTL, version 1.2, can make predictions for the A26 and B39 HLA supertypes thus completing the list of 12 recognized supertypes, and was shown to have a higher predictive performance than the old version 1.0. We find that NetCTL-1.2 has a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all measures. When comparing NetCTL-1.2 with MAPPP and WAPP, the higher performance of NetCTL-1.2 is statistically significant on all measures. When comparing NetCTL-1.2 with EpiJen, the higher performance of NetCTL-1.2 is statistically significant for all measures except when comparing the specificities at the sensitivity values of 0.3 and 0.5 on the HIV
<sup>EpiJen </sup>
dataset. When comparing NetCTL-1.2 with MHC-pathway, the higher performance of NetCTL-1.2 is statistically significant for all measures, except when comparing the specificities at the sensitivity values of 0.3 and 0.5 on either evaluation dataset. It is not surprising that MHC-pathway reaches almost as high predictive performance as NetCTL-1.2 on some of the performance measures. These two methods have several features in common. Firstly, the MHC binding prediction methods included in MHC-pathway and NetCTL were recently shown in a large-scale benchmark to have comparable performance [
<xref ref-type="bibr" rid="B18">18</xref>
]. Secondly, they use identical methods for predicting TAP transport efficiency; namely the matrix method developed by Peters et al. [
<xref ref-type="bibr" rid="B23">23</xref>
]. Thirdly, they integrate the predicted values obtained from the separate proteasomal cleavage, TAP transport efficiency, and MHC class I affinity predictors into one combined score. One difference is that the proteasomal cleavage predictor used for MHC-pathway is trained on
<italic>in vitro </italic>
data, while NetCTL-1.2's proteasomal cleavage predictor, NetChop-3.0, is trained on natural MHC class I ligands.</p>
<p>NetCTL-1.2, MAPPP, and MHC-pathway integrate the predicted values into one overall score, while EpiJen and WAPP use a number of successive filters that step by step reduce the number of possible epitopes. Doytchinova et al. [
<xref ref-type="bibr" rid="B16">16</xref>
] have stated that the "combined score as used by SMM (MHC-pathway) and NetCTL, obscures the final result, because a low (or even negative) TAP and/or proteasomal score could be compensated for by a high MHC score." We would here like to offer our interpretation of how the combined score can be understood in a biologically meaningful manner. First of all, we see the predicted values as probabilities. Secondly, one has to keep in mind that there is not just one copy of a given protein in the cell. This means that if, for example, a certain peptide has a low predicted cleavage score and will only be generated in 1 out of 100 cleavage events, the peptide can still survive all the way to the cell surface and become a CTL epitope, if the TAP transport efficiency and MHC class I affinity are sufficiently high.</p>
<p>Throughout the analysis on the HIV dataset, we have compared NetCTL-1.2 to each of the other test methods separately. This was done in order to include epitopes restricted to as many supertypes as possible. Had we chosen only to include epitopes restricted to supertypes that all methods had in common, we could only have included the A1, A2, and A3 supertypes. The shortcoming of this approach is that the test methods cannot be compared directly with each other. For comparisons between the test methods, we refer to calculations done on the HIV
<sup>EpiJen </sup>
dataset, which only contains epitopes restricted to the A1, A2, and A3 supertypes.</p>
<p>Lastly, we would like to note that the NetCTL method predicts CTL epitopes that are presented via a pathway that utilizes TAP for peptide entry into the ER. Additional pathways also exist, as reviewed in [
<xref ref-type="bibr" rid="B24">24</xref>
]. Their contribution to the total presentation of MHC class I ligands is, however, thought to be minor [
<xref ref-type="bibr" rid="B25">25</xref>
-
<xref ref-type="bibr" rid="B27">27</xref>
].</p>
</sec>
<sec>
<title>Conclusion</title>
<p>Using objective benchmarks and standardized performance measures, we have demonstrated that NetCTL-1.2 has a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP, although when comparing NetCTL-1.2 with EpiJen and MHC-pathway, the higher predictive performance of NetCTL-1.2 is not statistically significant on some of the measures.</p>
<p>The benchmark datasets are all available and downloadable from the Internet. Together with the detailed description of how to perform the calculations and extract the different performance measures presented here, it is our hope that other researchers can readily repeat the benchmark analysis and, in an objective manner, compare novel methods for CTL epitope discovery to the five methods included here.</p>
</sec>
<sec sec-type="methods">
<title>Methods</title>
<sec>
<title>Datasets</title>
<sec>
<title>Training set</title>
<p>In August 2006, 1730 9-mer peptides present in the SYFPEITHI database [
<xref ref-type="bibr" rid="B12">12</xref>
,
<xref ref-type="bibr" rid="B28">28</xref>
] and listed as either "Example for Ligand" or "T-cell epitope" were extracted. The peptides were grouped according to MHC class I allele and one of the 12 supertypes as defined in [
<xref ref-type="bibr" rid="B20">20</xref>
]. Peptides that had been used to develop one or more of the methods for predicting proteasomal cleavage, TAP transport efficiency, or MHC class I affinity for NetCTL-1.2 were removed. Then, for every peptide, the source protein was found in the SwissProt database. If more than one source protein was possible, the longest human protein was chosen, and if there were no possible human proteins, the longest other protein was chosen. The final SYFPEITHI dataset contained a total of 863 epitope-protein pairs. All 9-mer peptides contained in the source protein sequences, except those annotated as epitopes in either the complete SYFPEITHI or Los Alamos HIV databases [
<xref ref-type="bibr" rid="B29">29</xref>
], were taken as negative peptides (non-epitopes). When using this definition of epitope/non-epitope one has to take into account that some epitopes will falsely be classified as non-epitopes because the SYFPEITHI and Los Alamos HIV databases are incomplete. Since the MHC class I molecules are very specific, binding only a highly limited repertoire of peptides, this misclassified proportion will, however, be very small. A given MHC class I molecule has a specificity of ~1% [
<xref ref-type="bibr" rid="B9">9</xref>
]. In a protein of 100 amino acids, one thus expects roughly one binding and 99 non-binding peptides. The potential number of false classifications is hence orders of magnitude smaller than the actual number of negatives. Furthermore, the measured performance of all the methods should be equally affected by the false negatives in the dataset. Thus, while the reported absolute performance of the methods might be underestimated, we do not expect the false negatives to affect the relative ranking of the different methods.</p>
<p>This dataset will hereafter be referred to as the SYFPEITHI dataset.</p>
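<p>The construction of the negative set described above can be sketched as follows. This is a minimal illustration, not the code used in the study; the function names, the toy sequence, and the toy epitope annotation are hypothetical.</p>

```python
def nonamers(protein: str, k: int = 9) -> list[str]:
    """Return all overlapping k-mers of a protein sequence, in order."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def negative_peptides(protein: str, known_epitopes: set[str], k: int = 9) -> list[str]:
    """Return the k-mers of the protein that are not annotated as epitopes."""
    return [pep for pep in nonamers(protein, k) if pep not in known_epitopes]


# Hypothetical 12-residue sequence and epitope annotation, for illustration only.
protein = "ACDEFGHIKLMN"
known = {"CDEFGHIKL"}
negatives = negative_peptides(protein, known)
```

<p>A protein of length L yields L - 8 overlapping 9-mers, so the toy sequence above gives four 9-mers, of which three are kept as negatives.</p>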
</sec>
<sec>
<title>Evaluation sets</title>
<p>In December 2005, 342 9-mer peptides present in the HIV Immunology CTL database of the Los Alamos HIV Database [
<xref ref-type="bibr" rid="B29">29</xref>
] were extracted. The peptides were subsequently sorted as for the SYFPEITHI dataset. In all, the epitopes in the final dataset are restricted to 29 MHC class I alleles, each of which belongs to one of the 12 recognized supertypes [
<xref ref-type="bibr" rid="B20">20</xref>
]. Subsequently, the source proteins were found. If more than one protein was the possible origin of a given peptide, the longest protein was chosen. The final Los Alamos HIV dataset contained 216 epitope-protein pairs. All 9-mer peptides contained in the source protein sequences, except those annotated as epitopes in either the complete SYFPEITHI or Los Alamos HIV databases, were taken as negative peptides (non-epitopes). The dataset will hereafter be referred to as the HIV dataset. Another evaluation set was taken from [
<xref ref-type="bibr" rid="B16">16</xref>
]. This dataset was compiled from the Los Alamos HIV database in June 2005, but contained only epitopes restricted to the A1, A2, or A3 supertypes. Originally it contained 99 epitopes, but we removed 12 of them, since they had been used previously to train NetCTL-1.2. For the 87 remaining peptides, the source proteins were subsequently found. If more than one protein was the possible origin of a given peptide, the longest protein was chosen. The final dataset, which is called HIV
<sup>EpiJen</sup>
, thus contains 87 epitope-protein pairs. Note that this approach differs from the one used in [
<xref ref-type="bibr" rid="B16">16</xref>
], where all epitopes are mapped to the HXB2 consensus protein sequence. All 9-mer peptides contained in the source protein sequences, except those annotated as epitopes in either the complete SYFPEITHI or Los Alamos HIV databases, were taken as negative peptides (non-epitopes). In summary, the HIV and HIV
<sup>EpiJen </sup>
datasets are both compiled from the Los Alamos HIV database, but whereas the HIV dataset contains 216 epitopes restricted to all 12 recognized supertypes, the HIV
<sup>EpiJen </sup>
dataset contains 87 epitopes restricted to only the A1, A2, or A3 supertype. The HIV dataset was compiled by ourselves, while the HIV
<sup>EpiJen </sup>
dataset was taken from [
<xref ref-type="bibr" rid="B16">16</xref>
]. Of the 87 epitopes in the HIV
<sup>EpiJen </sup>
dataset, 59 are also present in the HIV dataset [
<xref ref-type="bibr" rid="B21">21</xref>
].</p>
<p>All of the above-mentioned datasets are available as supplementary material at [
<xref ref-type="bibr" rid="B21">21</xref>
].</p>
</sec>
</sec>
<sec>
<title>Prediction methods</title>
<sec>
<title>NetCTL-1.2</title>
<p>Prediction of proteasomal cleavage patterns was done by the NetChop 3.0 method [
<xref ref-type="bibr" rid="B30">30</xref>
,
<xref ref-type="bibr" rid="B31">31</xref>
], which is an artificial neural network (ANN) trained on natural MHC class I ligand data. Prediction of TAP transport efficiency was performed using the matrix method described by Peters et al. [
<xref ref-type="bibr" rid="B23">23</xref>
]. For MHC class I affinity predictions, we use an updated version of the method described by Nielsen et al. [
<xref ref-type="bibr" rid="B32">32</xref>
] (NetMHC-3.0
<sup>NO_HIV</sup>
) and include all 12 supertypes: A1, A2, A3, A24, A26, B7, B8, B27, B39, B44, B58, and B62 [
<xref ref-type="bibr" rid="B20">20</xref>
]. For training of NetMHC-3.0
<sup>NO_HIV </sup>
HIV data were excluded, but otherwise the method is identical to the method available at [
<xref ref-type="bibr" rid="B33">33</xref>
]. This was done in order to retain the maximal size of the evaluation sets, which only consist of HIV data. Note that the NetCTL-1.2 method available at [
<xref ref-type="bibr" rid="B34">34</xref>
] integrates the complete NetMHC-3.0 method. Table
<xref ref-type="table" rid="T3">3</xref>
shows which alleles are used to represent each of the 12 supertypes. The weights on proteasomal cleavage and TAP predictions are determined in a five-fold cross-validated manner, optimizing the predictive performance on the SYFPEITHI dataset. For each of the epitope/protein pairs in the training set, the AUC is calculated, and the set of weights on proteasomal cleavage and TAP prediction that achieves the optimal mean AUC value is identified. Optimal weights on cleavage and TAP transport were found to be 0.15 ± 0.01 and 0.05 ± 0.01, respectively, with an average AUC performance of 0.975 over the 863 epitope/protein pairs. The output from NetCTL-1.2 is a single, combined score for every possible peptide in a given protein.</p>
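<p>To make the role of the two weights concrete, the following minimal sketch assumes that the combined score is a weighted sum in which the MHC class I affinity term has weight 1. This functional form, the function name, and the example input values are assumptions made for illustration; only the two weights are taken from the text above.</p>

```python
# Weights reported in the text for proteasomal cleavage and TAP transport.
W_CLEAVAGE = 0.15
W_TAP = 0.05


def combined_score(mhc: float, cleavage: float, tap: float,
                   w_cleavage: float = W_CLEAVAGE, w_tap: float = W_TAP) -> float:
    """Weighted sum of the three predictions (assumed form, MHC weight fixed at 1)."""
    return mhc + w_cleavage * cleavage + w_tap * tap


# Hypothetical prediction values for a single peptide.
example = combined_score(mhc=0.5, cleavage=1.0, tap=2.0)
```

<p>Under this assumed form, a peptide with a low cleavage or TAP score can still receive a high combined score if its predicted MHC class I affinity is high, because the three terms are added rather than applied as successive filters.</p>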
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption>
<p>Representative alleles</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="left">Supertype</td>
<td align="left">NetCTL</td>
<td align="left">EpiJen</td>
<td align="left">MAPPP</td>
<td align="left">MHC-pathway</td>
<td align="left">WAPP</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left">A1</td>
<td align="left">HLA-A*0101</td>
<td align="left">HLA-A*0101</td>
<td align="left">HLA-A1</td>
<td align="left">HLA-A*0101</td>
<td align="left">HLA-A*01</td>
</tr>
<tr>
<td align="left">A2</td>
<td align="left">HLA-A*0201</td>
<td align="left">HLA-A*0201</td>
<td align="left">HLA-A*0201</td>
<td align="left">HLA-A*0201</td>
<td align="left">HLA-A*0201</td>
</tr>
<tr>
<td align="left">A3</td>
<td align="left">HLA-A*0301</td>
<td align="left">HLA-A*0301</td>
<td align="left">HLA-A3</td>
<td align="left">HLA-A*0301</td>
<td align="left">HLA-A*03</td>
</tr>
<tr>
<td align="left">A24</td>
<td align="left">HLA-A*2402</td>
<td align="left">HLA-A*24</td>
<td align="left">HLA-A24</td>
<td align="left">HLA-A*2402</td>
<td align="left">N/A</td>
</tr>
<tr>
<td align="left">A26</td>
<td align="left">HLA-A*2601</td>
<td align="left">N/A</td>
<td align="left">N/A*</td>
<td align="left">HLA-A*2601</td>
<td align="left">N/A</td>
</tr>
<tr>
<td align="left">B7</td>
<td align="left">HLA-B*0702</td>
<td align="left">HLA-B*07</td>
<td align="left">HLA-B7</td>
<td align="left">HLA-B*0702</td>
<td align="left">N/A</td>
</tr>
<tr>
<td align="left">B8</td>
<td align="left">HLA-B*0801</td>
<td align="left">N/A</td>
<td align="left">HLA-B8</td>
<td align="left">HLA-B*0801</td>
<td align="left">N/A</td>
</tr>
<tr>
<td align="left">B27</td>
<td align="left">HLA-B*2705</td>
<td align="left">HLA-B*27</td>
<td align="left">HLA-B*2705</td>
<td align="left">HLA-B*2705</td>
<td align="left">HLA-B*2705</td>
</tr>
<tr>
<td align="left">B39</td>
<td align="left">HLA-B*3901</td>
<td align="left">N/A</td>
<td align="left">HLA-B*3901</td>
<td align="left">N/A</td>
<td align="left">N/A</td>
</tr>
<tr>
<td align="left">B44</td>
<td align="left">HLA-B*4001</td>
<td align="left">HLA-B*40</td>
<td align="left">HLA-B40</td>
<td align="left">HLA-B*4002</td>
<td align="left">N/A</td>
</tr>
<tr>
<td align="left">B58</td>
<td align="left">HLA-B*5801</td>
<td align="left">N/A</td>
<td align="left">HLA-B*5801</td>
<td align="left">HLA-B*5801</td>
<td align="left">N/A</td>
</tr>
<tr>
<td align="left">B62</td>
<td align="left">HLA-B*1501</td>
<td align="left">N/A</td>
<td align="left">HLA-B62</td>
<td align="left">HLA-B*1501</td>
<td align="left">N/A</td>
</tr>
<tr>
<td align="left"># epitope-protein pairs</td>
<td align="left">216</td>
<td align="left">188</td>
<td align="left">214</td>
<td align="left">215</td>
<td align="left">131</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The table shows which alleles are used for representing the supertypes in the HIV and HIV
<sup>EpiJen </sup>
datasets. The first column gives the HLA supertype; the next five columns give the alleles used as supertype representatives for each of the five prediction methods: NetCTL-1.2, EpiJen, MAPPP, MHC-pathway, and WAPP, respectively. The bottom row gives the total number of epitope-protein pairs in the HIV dataset covered by each of the five prediction methods. *An MHC type termed HLA-A26 was listed, but did not produce any results.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>EpiJen [
<xref ref-type="bibr" rid="B16">16</xref>
]</title>
<p>Like NetCTL-1.2, MHC-pathway, and WAPP, this algorithm operates with three steps in order to predict CTL epitopes: Proteasomal cleavage, TAP transport, and MHC class I binding. Each step is based on a quantitative matrix and acts as a filter that reduces the number of potential epitopes. The method is available at [
<xref ref-type="bibr" rid="B35">35</xref>
] and includes CTL epitope predictions for 18 different alleles. Table
<xref ref-type="table" rid="T3">3</xref>
shows which alleles we use to represent the supertypes in the HIV and HIV
<sup>EpiJen </sup>
datasets. No alleles can represent the A26, B39, B58, or B62 supertypes. When calculating the performance measures for EpiJen on the HIV dataset, we therefore only have a total of 188 epitope-protein pairs as compared to 216 epitope-protein pairs for all 12 supertypes. Different cut-offs can be chosen for the proteasomal cleavage and TAP transport filters. In each case, we used the recommended cut-offs. The final scores are the predicted MHC class I affinities in the form of -logIC
<sub>50 </sub>
and IC
<sub>50 </sub>
values. It is not possible to retrieve scores for all possible peptides in a given protein; at most, the EpiJen server outputs the 5% of peptides that have the highest predicted MHC class I affinity and at the same time pass the proteasomal cleavage and TAP transport filters.</p>
</sec>
<sec>
<title>MAPPP [
<xref ref-type="bibr" rid="B15">15</xref>
,
<xref ref-type="bibr" rid="B36">36</xref>
]</title>
<p>Unlike the other four methods, MAPPP only operates with proteasomal cleavage and MHC class I binding. Proteasomal cleavage prediction can be done by either the FRAGPREDICT [
<xref ref-type="bibr" rid="B37">37</xref>
,
<xref ref-type="bibr" rid="B38">38</xref>
] or PAProC [
<xref ref-type="bibr" rid="B39">39</xref>
,
<xref ref-type="bibr" rid="B40">40</xref>
] method. For this study we chose FRAGPREDICT, since it is the default choice. MHC class I binding prediction can be done by either the SYFPEITHI Epitope Prediction method [
<xref ref-type="bibr" rid="B11">11</xref>
] or the BIMAS HLA Peptide Binding Prediction method [
<xref ref-type="bibr" rid="B13">13</xref>
]. We used the BIMAS HLA Peptide Binding Prediction method, since we have previously found this method to be superior [
<xref ref-type="bibr" rid="B10">10</xref>
]. Binding prediction for the A26 supertype was listed as available only through the SYFPEITHI Epitope Prediction method, but in spite of several attempts, we never received any results for this supertype. Consequently, we left out this supertype when doing calculations for the MAPPP method on the HIV dataset. Table
<xref ref-type="table" rid="T3">3</xref>
shows which alleles we use to represent the remaining supertypes in the HIV and HIV
<sup>EpiJen </sup>
datasets. Excluding the A26 supertype, we have a total of 214 epitope-protein pairs for 11 supertypes in the HIV dataset. The output is a combined score from the proteasomal cleavage and MHC class I binding predictions. It is possible to retrieve scores for all peptides in a given protein.</p>
</sec>
<sec>
<title>MHC-pathway [
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B18">18</xref>
]</title>
<p>Like NetCTL-1.2, MHC-pathway integrates the scores obtained from three methods predicting, respectively, proteasomal cleavage, TAP transport, and MHC class I affinity into one final score. The method for predicting proteasomal cleavage is a matrix-based algorithm called the Stabilized Matrix Method (SMM) trained on
<italic>in vitro </italic>
cleavage data. The method for predicting TAP transport efficiency is identical to the one used for NetCTL-1.2 and is described in [
<xref ref-type="bibr" rid="B23">23</xref>
]. The method for predicting MHC class I affinity is also based on an SMM. The original MHC-pathway method [
<xref ref-type="bibr" rid="B17">17</xref>
] is available from [
<xref ref-type="bibr" rid="B41">41</xref>
], while an updated version of the method is located at [
<xref ref-type="bibr" rid="B42">42</xref>
]. In this study we have used the updated version. It is possible to predict CTL epitopes restricted to 34 different human alleles. Table
<xref ref-type="table" rid="T3">3</xref>
shows which alleles we use to represent the supertypes in the HIV and HIV
<sup>EpiJen </sup>
datasets. No alleles can represent the B39 supertype. When calculating the performance measures for MHC-pathway on the HIV dataset, we therefore only have a total of 215 epitope-protein pairs as compared to 216 epitope-protein pairs for all 12 supertypes. We used default settings for proteasomal cleavage (immunoproteasome) and TAP transport predictions. In the final output, MHC-pathway provides a single, combined score for all possible peptides in a given protein.</p>
</sec>
<sec>
<title>WAPP [
<xref ref-type="bibr" rid="B19">19</xref>
]</title>
<p>Like NetCTL-1.2, EpiJen, and MHC-pathway, this algorithm operates with predictions for proteasomal cleavage, TAP transport, and MHC class I affinity. The proteasomal cleavage predictor employs a matrix-based method trained on experimentally verified proteasomal cleavage sites. Support vector regression is used for predicting peptides transported by TAP. MHC class I affinity is predicted using a support vector machine. Each step acts as a filter that reduces the number of potential epitopes. The method is available at [
<xref ref-type="bibr" rid="B43">43</xref>
] and includes predictions for HLA-A*01, HLA-A*0201, HLA-A*03, and HLA-B*2705. Table
<xref ref-type="table" rid="T3">3</xref>
shows which alleles we use to represent the supertypes in the HIV and HIV
<sup>EpiJen </sup>
datasets. No alleles can represent the A24, A26, B7, B8, B39, B44, B58, or B62 supertypes. When calculating the performance measures for WAPP on the HIV dataset, we therefore only have a total of 131 epitope-protein pairs as compared to 216 epitope-protein pairs for all 12 supertypes. It is possible to retrieve predicted values for proteasomal cleavage, TAP transport, and MHC class I affinity for all possible peptides in a protein. The proteasomal cleavage and TAP transport filters can be set at different levels between 1 and 5. We used the default level, which is 3 for both filters. These levels correspond to a predicted proteasomal cleavage value above -1.2 and a predicted TAP transport value below -37.5 (as kindly informed by Pierre Dönnes). Prediction scores for all methods and for all nonamers are available as supplementary material [
<xref ref-type="bibr" rid="B21">21</xref>
].</p>
</sec>
</sec>
<sec>
<title>Performance measures</title>
<sec>
<title>Sensitivity and specificity</title>
<p>The formulas for calculating sensitivity and specificity are listed below:</p>
<p>Sensitivity = TP/AP</p>
<p>Specificity = TN/AN</p>
<p>where</p>
<p>TP = true positives, the number of correctly predicted epitopes in the dataset; AP = actual positives, the total number of epitopes in the dataset; TN = true negatives, the number of correctly predicted non-epitopes in the dataset; AN = actual negatives, the total number of non-epitopes in the dataset.</p>
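<p>The two formulas above can be written out as a short sketch. The function names and the toy scores are hypothetical, and we assume here that a peptide is predicted to be an epitope when its score is at or above the threshold.</p>

```python
def sensitivity(true_positives: int, actual_positives: int) -> float:
    """Sensitivity = TP / AP."""
    return true_positives / actual_positives


def specificity(true_negatives: int, actual_negatives: int) -> float:
    """Specificity = TN / AN."""
    return true_negatives / actual_negatives


def evaluate(scores: list[float], is_epitope: list[bool], threshold: float) -> tuple[float, float]:
    """Return (sensitivity, specificity) when scores at or above the threshold are called epitopes."""
    pairs = list(zip(scores, is_epitope))
    tp = sum(1 for s, e in pairs if e and s >= threshold)
    tn = sum(1 for s, e in pairs if not e and threshold > s)
    ap = sum(1 for _, e in pairs if e)
    an = len(pairs) - ap
    return sensitivity(tp, ap), specificity(tn, an)


# Toy example: five peptides, two of which are epitopes.
sens, spec = evaluate([0.9, 0.8, 0.4, 0.2, 0.1], [True, False, True, False, False], 0.5)
```

<p>In this toy example, one of the two epitopes and two of the three non-epitopes are classified correctly at the chosen threshold.</p>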
</sec>
<sec>
<title>AUC</title>
<p>The AUC value (the Area Under the ROC Curve) is calculated per epitope-protein pair. All overlapping 9-mer peptides in the protein are sorted according to the predicted score. For NetCTL-1.0, NetCTL-1.2, MAPPP, and MHC-pathway, the predicted score is combined from the predicted proteasomal cleavage, TAP transport, and MHC class I affinity values. For WAPP it is the predicted MHC class I affinity for peptides that pass the proteasomal cleavage and TAP transport filters. For EpiJen, the predicted score is also the predicted MHC class I affinity, but is only available for the 5% of peptides that have the highest predicted MHC class I affinity, and which at the same time pass the proteasomal cleavage and TAP transport filters. The epitopes in the epitope-protein pairs define the positive set, whereas the negative set is made up from all other 9-mers in the source proteins excluding 9-mers found in the complete SYFPEITHI or Los Alamos HIV databases. The ROC curve is plotted from the sensitivity and 1-specificity values calculated by varying the cut-off value (separating the predicted positive from the predicted negative) from high to low. The area under this curve gives the AUC value. The AUC value is 0.5 for a random prediction method and 1.0 for a perfect method. When comparing the predictive performance (measured by AUC) of two prediction methods, a paired t-test is applied to test whether the observed difference in average AUC values differs significantly from zero.</p>
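<p>As a minimal sketch of the calculation described above, the AUC can be computed as the fraction of positive-negative pairs in which the positive peptide scores higher, which equals the area under the ROC curve. Counting ties as one half is our assumption, as are the function names and toy values. The second function returns only the paired t statistic; obtaining a P value would additionally require the t distribution with n - 1 degrees of freedom.</p>

```python
import math
import statistics


def auc(positive_scores: list[float], negative_scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def paired_t_statistic(auc_a: list[float], auc_b: list[float]) -> float:
    """Paired t statistic for per-pair AUC values of two methods."""
    diffs = [a - b for a, b in zip(auc_a, auc_b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# One epitope (score 0.9) against four non-epitopes of the same protein.
example_auc = auc([0.9], [0.1, 0.95, 0.5, 0.9])
```

<p>With a single epitope per protein, the AUC reduces to the fraction of non-epitopes that score below the epitope.</p>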
</sec>
<sec>
<title>RANK</title>
<p>Two methods at a time are compared by this performance measure. The two methods are NetCTL-1.2 and one of the four test methods (EpiJen, MAPPP, MHC-pathway, or WAPP). Calculations are done on the HIV and HIV
<sup>EpiJen </sup>
datasets separately. For comparison on the HIV dataset, we only include epitope-protein pairs where the epitope is restricted to a supertype covered by the test method. To facilitate comparison to the EpiJen and WAPP methods, where only a subset of the peptides is assigned a predicted value, only the top N of the NetCTL-1.2 predictions were included, where N is the number of peptides predicted by the test method (EpiJen or WAPP). All peptides without a predicted value are assigned the rank 9999 to put them at the bottom of the rank list. In this way, all methods are compared on an equal number of peptides. For MAPPP and MHC-pathway, all peptides are included. We next count how often NetCTL-1.2 ranks the epitope higher than the test method, and vice versa. For all comparisons, all epitopes in either the complete SYFPEITHI or Los Alamos HIV databases are disregarded, except for the particular epitope belonging to the epitope-protein pair in question. When comparing the predictive performance (as measured by RANK) of NetCTL-1.2 and the test method, we examine whether the observed higher proportion of proteins for which NetCTL-1.2 ranks the epitope highest deviates significantly from what is expected under a binomial distribution, where both methods have a probability of 0.5 for ranking the epitope highest. Proteins for which the methods rank the epitope equally high are omitted from the analysis.</p>
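<p>The RANK comparison can be sketched as follows. The function names and toy ranks are hypothetical, and the use of a one-sided upper-tail probability is our assumption; the text above only states that a binomial distribution with probability 0.5 is used.</p>

```python
from math import comb

MISSING_RANK = 9999  # rank given to peptides without a predicted value


def rank_of(epitope: str, ranked_peptides: list[str]) -> int:
    """1-based rank of the epitope in a list sorted from best to worst score."""
    if epitope in ranked_peptides:
        return ranked_peptides.index(epitope) + 1
    return MISSING_RANK


def count_wins(ranks_a: list[int], ranks_b: list[int]) -> tuple[int, int]:
    """Count the proteins where each method ranks the epitope higher; ties are omitted."""
    a_wins = sum(1 for a, b in zip(ranks_a, ranks_b) if b > a)
    b_wins = sum(1 for a, b in zip(ranks_a, ranks_b) if a > b)
    return a_wins, b_wins


def binomial_upper_tail(k: int, n: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


# Toy example: method A wins on 8 of 10 untied proteins.
p_value = binomial_upper_tail(8, 10)
```

<p>In the toy example, the probability of 8 or more wins out of 10 under the null hypothesis is 56/1024, or about 0.055.</p>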
</sec>
<sec>
<title>Specificity at a predefined sensitivity</title>
<p>When using the HIV dataset for the analysis, two methods at a time are compared by this measure: NetCTL-1.2 and one of the four test methods (EpiJen, MAPPP, MHC-pathway, or WAPP). We only include epitope-protein pairs where the epitope is restricted to supertypes covered by the test method. All calculations are made per epitope-protein pair, which means that for a given epitope-protein pair the sensitivity will either be 1 (the epitope is identified at the given threshold) or 0 (the epitope is not identified at the given threshold). First, for every method, three threshold values in the form of combined scores (NetCTL-1.2, MAPPP, and MHC-pathway) or predicted MHC class I affinities (EpiJen and WAPP) are identified, which achieve a sensitivity of 0.3, 0.5, or 0.8 when averaging over all epitope-protein pairs. Notice that EpiJen and WAPP do not provide enough predicted scores to achieve a sensitivity of 0.8. Because the size of the HIV dataset depends on the test method in question, three different threshold values are found for NetCTL-1.2 for each comparison with a test method. Next, the specificity is calculated per epitope-protein pair using the same threshold values. For the HIV
<sup>EpiJen </sup>
dataset, all methods cover all epitopes. Again, three threshold values are found for each method and the specificity is calculated per epitope-protein pair using the same threshold values. An unpaired Student's t-test [
<xref ref-type="bibr" rid="B44">44</xref>
] is applied to test whether the average specificity of NetCTL-1.2 at a given sensitivity is significantly different from the average specificity at the same sensitivity for a test method.</p>
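<p>The procedure above can be sketched in two steps: find the score threshold at which the desired fraction of epitopes is identified, then average the per-pair specificity at that threshold. The function names, the rounding rule used to pick the threshold, and the toy data are assumptions made for illustration.</p>

```python
import math


def threshold_for_sensitivity(epitope_scores: list[float], target: float) -> float:
    """Score of the k-th best epitope, where k is the number needed to reach the target."""
    ranked = sorted(epitope_scores, reverse=True)
    # Small tolerance guards against floating-point error in target * n.
    k = max(1, math.ceil(target * len(ranked) - 1e-9))
    return ranked[k - 1]


def mean_specificity(pairs: list[tuple[float, list[float]]], threshold: float) -> float:
    """Average, over epitope-protein pairs, of the fraction of negatives below the threshold."""
    per_pair = [sum(1 for s in negatives if threshold > s) / len(negatives)
                for _, negatives in pairs]
    return sum(per_pair) / len(per_pair)


# Toy data: (epitope score, scores of the non-epitopes in the same protein).
pairs = [
    (0.9, [0.1, 0.2, 0.95, 0.3]),
    (0.7, [0.8, 0.1, 0.2, 0.3]),
    (0.4, [0.5, 0.1, 0.2, 0.3]),
    (0.2, [0.1, 0.1, 0.1, 0.9]),
]
cutoff = threshold_for_sensitivity([epitope for epitope, _ in pairs], 0.5)
average_specificity = mean_specificity(pairs, cutoff)
```

<p>Here two of the four epitopes score at or above the resulting threshold, giving an average sensitivity of 0.5, and the average specificity at that threshold is 0.8125.</p>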
</sec>
<sec>
<title>Sensitivity among the 5% top-scoring peptides</title>
<p>When using the HIV dataset, two methods at a time are compared by this measure: NetCTL-1.2 and one of the four test methods. We only include epitope-protein pairs, where the epitope is restricted to supertypes covered by the test method. For the HIV
<sup>EpiJen </sup>
datasets, all methods cover all epitopes. To calculate the sensitivity among the top 5% of peptides, we rank all possible 9-mers for the proteins in the dataset in question according to the combined score (NetCTL-1.2, MAPPP, and MHC-pathway) or according to the predicted MHC class I affinity (EpiJen and WAPP). We only operate with one epitope per protein and accordingly remove from the protein in question all other epitopes known from the SYFPEITHI or Los Alamos HIV databases (all such known epitopes are listed per supertype as supplementary material [
<xref ref-type="bibr" rid="B21">21</xref>
]). Finally, we calculate the sensitivity among the 5% top-scoring peptides.</p>
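<p>The ranking step described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study. It assumes that 9-mers are ranked within each protein, that a higher score means a stronger prediction, and that the 5% cutoff is rounded down with a minimum of one peptide; the text does not specify these details. The function names and the score callable are hypothetical.</p>
<preformat><![CDATA[
```python
from statistics import mean


def nonamers(protein):
    """All overlapping 9-mers of a protein sequence, in order."""
    return [protein[i:i + 9] for i in range(len(protein) - 8)]


def epitope_in_top_fraction(protein, epitope, score,
                            other_epitopes=frozenset(), fraction=0.05):
    """True if the epitope ranks within the top fraction of the protein's 9-mers.

    Other known epitopes are removed before ranking, so that each
    protein is evaluated with a single epitope.
    """
    candidates = [
        p for p in nonamers(protein)
        if p == epitope or p not in other_epitopes
    ]
    ranked = sorted(candidates, key=score, reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))  # assumed rounding rule
    return epitope in ranked[:cutoff]


def top_fraction_sensitivity(pairs, score, known_epitopes, fraction=0.05):
    """Average over (protein, epitope) pairs of the 1/0 per-pair outcome."""
    outcomes = [
        1.0 if epitope_in_top_fraction(
            protein, epitope, score, known_epitopes - {epitope}, fraction
        ) else 0.0
        for protein, epitope in pairs
    ]
    return mean(outcomes)
```
]]></preformat>
<p>Here score stands in for whichever value a method reports for a 9-mer, that is, a combined score or a predicted MHC class I affinity converted so that larger values indicate stronger predictions.</p>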
</sec>
</sec>
</sec>
<sec>
<title>Authors' contributions</title>
<p>MVL contributed to the design of the study, compiled the datasets, obtained the data for the MAPPP, MHC-pathway, and WAPP methods, analysed the data for the EpiJen, MAPPP, MHC-pathway, and WAPP methods, participated in the design of the NetCTL method, and drafted the manuscript. CL contributed to the design of the study, participated in the design of the NetCTL method, and obtained the data for the EpiJen method. KL generated data used for the NetCTL method. SB contributed to the design of the study and generated data used for the NetCTL method. OL contributed to the design of the study and participated in the design of the NetCTL method. MN contributed to the design of the study, participated in the design of the NetCTL method, implemented the NetCTL method, analysed the data for the EpiJen, MAPPP, MHC-pathway, and WAPP methods, and helped draft the manuscript. All authors read and approved the manuscript.</p>
</sec>
</body>
<back>
<ack>
<sec>
<title>Acknowledgements</title>
<p>This project was in part funded by Genomes2Vaccines (STREP), FP6, contract no.: LSHB-CT-2003-503231, NIH Contract #HHSN266200400083C, and NIH Contract #HHSN266200400025C.</p>
</sec>
</ack>
<ref-list>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stoltze</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Dick</surname>
<given-names>TP</given-names>
</name>
<name>
<surname>Deeg</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Pommerl</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Rammensee</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Schild</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Generation of the vesicular stomatitis virus nucleoprotein cytotoxic T lymphocyte epitope requires proteasome-dependent and -independent proteolytic activities</article-title>
<source>Eur J Immunol</source>
<year>1998</year>
<volume>28</volume>
<fpage>4029</fpage>
<lpage>4036</lpage>
<pub-id pub-id-type="pmid">9862339</pub-id>
<pub-id pub-id-type="doi">10.1002/(SICI)1521-4141(199812)28:12&lt;4029::AID-IMMU4029&gt;3.0.CO;2-N</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mo</surname>
<given-names>XY</given-names>
</name>
<name>
<surname>Cascio</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lemerise</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Goldberg</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Rock</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Distinct proteolytic processes generate the C and N termini of MHC class I-binding peptides</article-title>
<source>J Immunol</source>
<year>1999</year>
<volume>163</volume>
<fpage>5851</fpage>
<lpage>5859</lpage>
<pub-id pub-id-type="pmid">10570269</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altuvia</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Margalit</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Sequence signals for generation of antigenic peptides by the proteasome: implications for proteasomal cleavage mechanism</article-title>
<source>J Mol Biol</source>
<year>2000</year>
<volume>295</volume>
<fpage>879</fpage>
<lpage>890</lpage>
<pub-id pub-id-type="pmid">10656797</pub-id>
<pub-id pub-id-type="doi">10.1006/jmbi.1999.3392</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Craiu</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Akopian</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Goldberg</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rock</surname>
<given-names>KL</given-names>
</name>
</person-group>
<article-title>Two distinct proteolytic processes in the generation of a major histocompatibility complex class I-presented peptide</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>1997</year>
<volume>94</volume>
<fpage>10850</fpage>
<lpage>10855</lpage>
<pub-id pub-id-type="pmid">9380723</pub-id>
<pub-id pub-id-type="doi">10.1073/pnas.94.20.10850</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Brouwenstijn</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Perry</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Shastri</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Discrete proteolytic intermediates in the MHC class I antigen processing pathway and MHC I-dependent peptide trimming in the ER</article-title>
<source>Immunity</source>
<year>1999</year>
<volume>11</volume>
<fpage>241</fpage>
<lpage>251</lpage>
<pub-id pub-id-type="pmid">10485659</pub-id>
<pub-id pub-id-type="doi">10.1016/S1074-7613(00)80099-0</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ritz</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Seliger</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>The transporter associated with antigen processing (TAP): structural integrity, expression, function, and its clinical relevance</article-title>
<source>Mol Med</source>
<year>2001</year>
<volume>7</volume>
<fpage>149</fpage>
<lpage>158</lpage>
<pub-id pub-id-type="pmid">11471551</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koch</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Guntrum</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Heintke</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kyritsis</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Tampe</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Functional dissection of the transmembrane domains of the transporter associated with antigen processing (TAP)</article-title>
<source>J Biol Chem</source>
<year>2004</year>
<volume>279</volume>
<fpage>10142</fpage>
<lpage>10147</lpage>
<pub-id pub-id-type="pmid">14679198</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.M312816200</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Endert</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Tampe</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>TH</given-names>
</name>
<name>
<surname>Tisch</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Bach</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>McDevitt</surname>
<given-names>HO</given-names>
</name>
</person-group>
<article-title>A sequential model for peptide binding and transport by the transporters associated with antigen processing</article-title>
<source>Immunity</source>
<year>1994</year>
<volume>1</volume>
<fpage>491</fpage>
<lpage>500</lpage>
<pub-id pub-id-type="pmid">7895159</pub-id>
<pub-id pub-id-type="doi">10.1016/1074-7613(94)90091-4</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yewdell</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Bennink</surname>
<given-names>JR</given-names>
</name>
</person-group>
<article-title>Immunodominance in major histocompatibility complex class I-restricted T lymphocyte responses</article-title>
<source>Annu Rev Immunol</source>
<year>1999</year>
<volume>17</volume>
<fpage>51</fpage>
<lpage>88</lpage>
<pub-id pub-id-type="pmid">10358753</pub-id>
<pub-id pub-id-type="doi">10.1146/annurev.immunol.17.1.51</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Larsen</surname>
<given-names>MV</given-names>
</name>
<name>
<surname>Lundegaard</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lamberth</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Buus</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Brunak</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lund</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>An integrative approach to CTL epitope prediction: a combined algorithm integrating MHC class I binding, TAP transport efficiency, and proteasomal cleavage predictions</article-title>
<source>Eur J Immunol</source>
<year>2005</year>
<volume>35</volume>
<fpage>2295</fpage>
<lpage>2303</lpage>
<pub-id pub-id-type="pmid">15997466</pub-id>
<pub-id pub-id-type="doi">10.1002/eji.200425811</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="other">
<article-title>SYFPEITHI Epitope Prediction</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.syfpeithi.de/Scripts/MHCServer.dll/EpitopePrediction.htm"></ext-link>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rammensee</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Bachmann</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Emmerich</surname>
<given-names>NP</given-names>
</name>
<name>
<surname>Bachor</surname>
<given-names>OA</given-names>
</name>
<name>
<surname>Stevanovic</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>SYFPEITHI: database for MHC ligands and peptide motifs</article-title>
<source>Immunogenetics</source>
<year>1999</year>
<volume>50</volume>
<fpage>213</fpage>
<lpage>219</lpage>
<pub-id pub-id-type="pmid">10602881</pub-id>
<pub-id pub-id-type="doi">10.1007/s002510050595</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="other">
<article-title>BIMAS HLA Peptide Binding Prediction</article-title>
<ext-link ext-link-type="uri" xlink:href="http://bimas.dcrt.nih.gov/molbio/hla_bind/"></ext-link>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Parker</surname>
<given-names>KC</given-names>
</name>
<name>
<surname>Bednarek</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Coligan</surname>
<given-names>JE</given-names>
</name>
</person-group>
<article-title>Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains</article-title>
<source>J Immunol</source>
<year>1994</year>
<volume>152</volume>
<fpage>163</fpage>
<lpage>175</lpage>
<pub-id pub-id-type="pmid">8254189</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hakenberg</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Nussbaum</surname>
<given-names>AK</given-names>
</name>
<name>
<surname>Schild</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Rammensee</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Kuttler</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Holzhutter</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Kloetzel</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Kaufmann</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Mollenkopf</surname>
<given-names>HJ</given-names>
</name>
</person-group>
<article-title>MAPPP: MHC class I antigenic peptide processing prediction</article-title>
<source>Appl Bioinformatics</source>
<year>2003</year>
<volume>2</volume>
<fpage>155</fpage>
<lpage>158</lpage>
<pub-id pub-id-type="pmid">15130801</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Doytchinova</surname>
<given-names>IA</given-names>
</name>
<name>
<surname>Guan</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Flower</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>EpiJen: a server for multistep T cell epitope prediction</article-title>
<source>BMC Bioinformatics</source>
<year>2006</year>
<volume>7</volume>
<fpage>131</fpage>
<pub-id pub-id-type="pmid">16533401</pub-id>
<pub-id pub-id-type="doi">10.1186/1471-2105-7-131</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tenzer</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Peters</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Bulik</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Schoor</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Lemmel</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Schatz</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Kloetzel</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Rammensee</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Schild</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Holzhutter</surname>
<given-names>HG</given-names>
</name>
</person-group>
<article-title>Modeling the MHC class I pathway by combining predictions of proteasomal cleavage, TAP transport and MHC class I binding</article-title>
<source>Cell Mol Life Sci</source>
<year>2005</year>
<volume>62</volume>
<fpage>1025</fpage>
<lpage>1037</lpage>
<pub-id pub-id-type="pmid">15868101</pub-id>
<pub-id pub-id-type="doi">10.1007/s00018-005-4528-2</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peters</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Bui</surname>
<given-names>HH</given-names>
</name>
<name>
<surname>Frankild</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Nielson</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lundegaard</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kostem</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Basch</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Lamberth</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Harndahl</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fleri</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Wilson</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>Sidney</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lund</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Buus</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Sette</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>A community resource benchmarking predictions of peptide binding to MHC-I molecules</article-title>
<source>PLoS Comput Biol</source>
<year>2006</year>
<volume>2</volume>
<fpage>e65</fpage>
<pub-id pub-id-type="pmid">16789818</pub-id>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.0020065</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Donnes</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Kohlbacher</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>Integrated modeling of the major events in the MHC class I antigen processing pathway</article-title>
<source>Protein Sci</source>
<year>2005</year>
<volume>14</volume>
<fpage>2132</fpage>
<lpage>2140</lpage>
<pub-id pub-id-type="pmid">15987883</pub-id>
<pub-id pub-id-type="doi">10.1110/ps.051352405</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lund</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kesmir</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Petersen</surname>
<given-names>AG</given-names>
</name>
<name>
<surname>Lundegaard</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Worning</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Sylvester-Hvid</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lamberth</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Roder</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Justesen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Buus</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Brunak</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Definition of supertypes for HLA molecules using clustering of specificity matrices</article-title>
<source>Immunogenetics</source>
<year>2004</year>
<volume>55</volume>
<fpage>797</fpage>
<lpage>810</lpage>
<pub-id pub-id-type="pmid">14963618</pub-id>
<pub-id pub-id-type="doi">10.1007/s00251-004-0647-4</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="other">
<article-title>Supplementary material for NetCTL-1.2</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php"></ext-link>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tong</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>TW</given-names>
</name>
<name>
<surname>Ranganathan</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Methods and protocols for prediction of immunogenic epitopes</article-title>
<source>Brief Bioinform</source>
<year>2006</year>
<pub-id pub-id-type="pmid">17077136</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peters</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Bulik</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Tampe</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Van Endert</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Holzhutter</surname>
<given-names>HG</given-names>
</name>
</person-group>
<article-title>Identifying MHC class I epitopes by predicting the TAP transport efficiency of epitope precursors</article-title>
<source>J Immunol</source>
<year>2003</year>
<volume>171</volume>
<fpage>1741</fpage>
<lpage>1749</lpage>
<pub-id pub-id-type="pmid">12902473</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Larsen</surname>
<given-names>MV</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Weinzierl</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lund</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>TAP-Independent MHC Class I Presentation</article-title>
<source>Current Immunological Reviews</source>
<year>2006</year>
<volume>2</volume>
<fpage>233</fpage>
<lpage>245</lpage>
<pub-id pub-id-type="doi">10.2174/157339506778018550</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Henderson</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Michel</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Sakaguchi</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Shabanowitz</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Appella</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Hunt</surname>
<given-names>DF</given-names>
</name>
<name>
<surname>Engelhard</surname>
<given-names>VH</given-names>
</name>
</person-group>
<article-title>HLA-A2.1-associated peptides from a mutant cell line: a second pathway of antigen presentation</article-title>
<source>Science</source>
<year>1992</year>
<volume>255</volume>
<fpage>1264</fpage>
<lpage>1266</lpage>
<pub-id pub-id-type="pmid">1546329</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1546329</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>KD</given-names>
</name>
<name>
<surname>Lutz</surname>
<given-names>CT</given-names>
</name>
</person-group>
<article-title>Peptide-dependent expression of HLA-B7 on antigen processing-deficient T2 cells</article-title>
<source>J Immunol</source>
<year>1996</year>
<volume>156</volume>
<fpage>3755</fpage>
<lpage>3764</lpage>
<pub-id pub-id-type="pmid">8621911</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wei</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Cresswell</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>HLA-A2 molecules in an antigen-processing mutant cell contain signal sequence-derived peptides</article-title>
<source>Nature</source>
<year>1992</year>
<volume>356</volume>
<fpage>443</fpage>
<lpage>446</lpage>
<pub-id pub-id-type="pmid">1557127</pub-id>
<pub-id pub-id-type="doi">10.1038/356443a0</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="other">
<article-title>SYFPEITHI database</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.syfpeithi.de"></ext-link>
</citation>
</ref>
<ref id="B29">
<citation citation-type="other">
<article-title>HIV Immunology CTL Database</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.hiv.lanl.gov"></ext-link>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kesmir</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Nussbaum</surname>
<given-names>AK</given-names>
</name>
<name>
<surname>Schild</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Detours</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Brunak</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Prediction of proteasome cleavage motifs by neural networks</article-title>
<source>Protein Eng</source>
<year>2002</year>
<volume>15</volume>
<fpage>287</fpage>
<lpage>296</lpage>
<pub-id pub-id-type="pmid">11983929</pub-id>
<pub-id pub-id-type="doi">10.1093/protein/15.4.287</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nielsen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lundegaard</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lund</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Kesmir</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>The role of the proteasome in generating cytotoxic T-cell epitopes: insights obtained from improved predictions of proteasomal cleavage</article-title>
<source>Immunogenetics</source>
<year>2005</year>
<volume>57</volume>
<fpage>33</fpage>
<lpage>41</lpage>
<pub-id pub-id-type="pmid">15744535</pub-id>
<pub-id pub-id-type="doi">10.1007/s00251-005-0781-7</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nielsen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lundegaard</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Worning</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lauemoller</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Lamberth</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Buus</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Brunak</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lund</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>Reliable prediction of T-cell epitopes using neural networks with novel sequence representations</article-title>
<source>Protein Sci</source>
<year>2003</year>
<volume>12</volume>
<fpage>1007</fpage>
<lpage>1017</lpage>
<pub-id pub-id-type="pmid">12717023</pub-id>
<pub-id pub-id-type="doi">10.1110/ps.0239403</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="other">
<article-title>NetMHC-3.0</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/NetMHC-3.0"></ext-link>
</citation>
</ref>
<ref id="B34">
<citation citation-type="other">
<article-title>NetCTL</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/NetCTL"></ext-link>
</citation>
</ref>
<ref id="B35">
<citation citation-type="other">
<article-title>EpiJen</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.jenner.ac.uk/EpiJen"></ext-link>
</citation>
</ref>
<ref id="B36">
<citation citation-type="other">
<article-title>MHC-I Antigenic Peptide Processing Prediction - MAPPP</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.mpiib-berlin.mpg.de/MAPPP/"></ext-link>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Holzhutter</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Frommel</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kloetzel</surname>
<given-names>PM</given-names>
</name>
</person-group>
<article-title>A theoretical approach towards the identification of cleavage-determining amino acid motifs of the 20 S proteasome</article-title>
<source>J Mol Biol</source>
<year>1999</year>
<volume>286</volume>
<fpage>1251</fpage>
<lpage>1265</lpage>
<pub-id pub-id-type="pmid">10047495</pub-id>
<pub-id pub-id-type="doi">10.1006/jmbi.1998.2530</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Holzhutter</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Kloetzel</surname>
<given-names>PM</given-names>
</name>
</person-group>
<article-title>A kinetic model of vertebrate 20S proteasome accounting for the generation of major proteolytic fragments from oligomeric peptide substrates</article-title>
<source>Biophys J</source>
<year>2000</year>
<volume>79</volume>
<fpage>1196</fpage>
<lpage>1205</lpage>
<pub-id pub-id-type="pmid">10968984</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kuttler</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Nussbaum</surname>
<given-names>AK</given-names>
</name>
<name>
<surname>Dick</surname>
<given-names>TP</given-names>
</name>
<name>
<surname>Rammensee</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Schild</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Hadeler</surname>
<given-names>KP</given-names>
</name>
</person-group>
<article-title>An algorithm for the prediction of proteasomal cleavages</article-title>
<source>J Mol Biol</source>
<year>2000</year>
<volume>298</volume>
<fpage>417</fpage>
<lpage>429</lpage>
<pub-id pub-id-type="pmid">10772860</pub-id>
<pub-id pub-id-type="doi">10.1006/jmbi.2000.3683</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nussbaum</surname>
<given-names>AK</given-names>
</name>
<name>
<surname>Kuttler</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Hadeler</surname>
<given-names>KP</given-names>
</name>
<name>
<surname>Rammensee</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Schild</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>PAProC: a prediction algorithm for proteasomal cleavages available on the WWW</article-title>
<source>Immunogenetics</source>
<year>2001</year>
<volume>53</volume>
<fpage>87</fpage>
<lpage>94</lpage>
<pub-id pub-id-type="pmid">11345595</pub-id>
<pub-id pub-id-type="doi">10.1007/s002510100300</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="other">
<article-title>MHC pathway</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.mhc-pathway.net"></ext-link>
</citation>
</ref>
<ref id="B42">
<citation citation-type="other">
<article-title>IEDB</article-title>
<ext-link ext-link-type="uri" xlink:href="http://tools-int-01.liai.org/analyze/html/mhc_processing.html"></ext-link>
</citation>
</ref>
<ref id="B43">
<citation citation-type="other">
<article-title>WAPP</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www-bs.informatik.uni-tuebingen.de/WAPP"></ext-link>
</citation>
</ref>
<ref id="B44">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Armitage</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Berry</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Matthews</surname>
<given-names>JNS</given-names>
</name>
</person-group>
<source>Statistical Methods in Medical Research</source>
<year>2002</year>
<edition>4th</edition>
<publisher-name>Blackwell Science Ltd</publisher-name>
</citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000542  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000542  | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021