Quality Assessment of MAGE-ML Genomic Datasets Using DescribeX
Internal identifier: 001291 (Main/Curation); previous: 001290; next: 001292
Authors: Lorena Etcheverry [Uruguay]; Shahan Khatchadourian [Canada, United States]; Mariano Consens [Canada, United States]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2010.
Abstract
The functional genomics and informatics community has made extensive microarray experimental data available online, facilitating independent evaluation of experiment conclusions and enabling researchers to access and reuse a growing body of gene expression knowledge. While there are several data-exchange standards, numerous microarray experiment datasets are published using the MAGE-ML XML schema. Assessing the quality of published experiments is a challenging task, and there is no consensus among microarray users on a framework to measure dataset quality. In this paper, we develop techniques based on DescribeX (a summary-based visualization tool for XML) that quantitatively and qualitatively analyze MAGE-ML public collections, gaining insights about schema usage. We address specific questions such as detection of common instance patterns and coverage, precision of the experiment descriptions, and usage of controlled vocabularies. Our case study shows that DescribeX is a useful tool for the evaluation of microarray experiment data quality that enhances the understanding of the instance-level structure of MAGE-ML datasets.
DOI: 10.1007/978-3-642-15120-0_15
Links toward previous steps (curation, corpus...)
- to stream Main, to step Corpus: 001516
Links to Exploration step
ISTEX:A54568780D1B1D763EFA6EF758DC50E82472A1C5Curation
No country items
Lorena Etcheverry<affiliation><mods:affiliation>Instituto de Computación, Facultad de Ingeniería, Universidad de la República</mods:affiliation>
<wicri:noCountry code="subField">Universidad de la República</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: lorenae@fing.edu.uy</mods:affiliation>
<country wicri:rule="url">Uruguay</country>
</affiliation>
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Quality Assessment of MAGE-ML Genomic Datasets Using DescribeX</title>
<author><name sortKey="Etcheverry, Lorena" sort="Etcheverry, Lorena" uniqKey="Etcheverry L" first="Lorena" last="Etcheverry">Lorena Etcheverry</name>
<affiliation><mods:affiliation>Instituto de Computación, Facultad de Ingeniería, Universidad de la República</mods:affiliation>
<wicri:noCountry code="subField">Universidad de la República</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: lorenae@fing.edu.uy</mods:affiliation>
<country wicri:rule="url">Uruguay</country>
</affiliation>
</author>
<author><name sortKey="Khatchadourian, Shahan" sort="Khatchadourian, Shahan" uniqKey="Khatchadourian S" first="Shahan" last="Khatchadourian">Shahan Khatchadourian</name>
<affiliation wicri:level="4"><mods:affiliation>University of Toronto</mods:affiliation>
<country>Canada</country>
<placeName><settlement type="city">Toronto</settlement>
<region type="state">Ontario</region>
</placeName>
<orgName type="university">Université de Toronto</orgName>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: shahan@cs.toronto.edu</mods:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Consens, Mariano" sort="Consens, Mariano" uniqKey="Consens M" first="Mariano" last="Consens">Mariano Consens</name>
<affiliation wicri:level="4"><mods:affiliation>University of Toronto</mods:affiliation>
<country>Canada</country>
<placeName><settlement type="city">Toronto</settlement>
<region type="state">Ontario</region>
</placeName>
<orgName type="university">Université de Toronto</orgName>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: consens@cs.toronto.edu</mods:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:A54568780D1B1D763EFA6EF758DC50E82472A1C5</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-15120-0_15</idno>
<idno type="url">https://api.istex.fr/document/A54568780D1B1D763EFA6EF758DC50E82472A1C5/fulltext/pdf</idno>
<idno type="wicri:Area/Main/Corpus">001516</idno>
<idno type="wicri:Area/Main/Curation">001291</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Quality Assessment of MAGE-ML Genomic Datasets Using DescribeX</title>
<author><name sortKey="Etcheverry, Lorena" sort="Etcheverry, Lorena" uniqKey="Etcheverry L" first="Lorena" last="Etcheverry">Lorena Etcheverry</name>
<affiliation><mods:affiliation>Instituto de Computación, Facultad de Ingeniería, Universidad de la República</mods:affiliation>
<wicri:noCountry code="subField">Universidad de la República</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: lorenae@fing.edu.uy</mods:affiliation>
<country wicri:rule="url">Uruguay</country>
</affiliation>
</author>
<author><name sortKey="Khatchadourian, Shahan" sort="Khatchadourian, Shahan" uniqKey="Khatchadourian S" first="Shahan" last="Khatchadourian">Shahan Khatchadourian</name>
<affiliation wicri:level="4"><mods:affiliation>University of Toronto</mods:affiliation>
<country>Canada</country>
<placeName><settlement type="city">Toronto</settlement>
<region type="state">Ontario</region>
</placeName>
<orgName type="university">Université de Toronto</orgName>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: shahan@cs.toronto.edu</mods:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Consens, Mariano" sort="Consens, Mariano" uniqKey="Consens M" first="Mariano" last="Consens">Mariano Consens</name>
<affiliation wicri:level="4"><mods:affiliation>University of Toronto</mods:affiliation>
<country>Canada</country>
<placeName><settlement type="city">Toronto</settlement>
<region type="state">Ontario</region>
</placeName>
<orgName type="university">Université de Toronto</orgName>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: consens@cs.toronto.edu</mods:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">A54568780D1B1D763EFA6EF758DC50E82472A1C5</idno>
<idno type="DOI">10.1007/978-3-642-15120-0_15</idno>
<idno type="ChapterID">Chap15</idno>
<idno type="ChapterID">15</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: The functional genomics and informatics community has made extensive microarray experimental data available online, facilitating independent evaluation of experiment conclusions and enabling researchers to access and reuse a growing body of gene expression knowledge. While there are several data-exchange standards, numerous microarray experiment datasets are published using the MAGE-ML XML schema. Assessing the quality of published experiments is a challenging task, and there is no consensus among microarray users on a framework to measure dataset quality. In this paper, we develop techniques based on DescribeX (a summary-based visualization tool for XML) that quantitatively and qualitatively analyze MAGE-ML public collections, gaining insights about schema usage. We address specific questions such as detection of common instance patterns and coverage, precision of the experiment descriptions, and usage of controlled vocabularies. Our case study shows that DescribeX is a useful tool for the evaluation of microarray experiment data quality that enhances the understanding of the instance-level structure of MAGE-ML datasets.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Sante/explor/ParkinsonV1/Data/Main/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001291 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Curation/biblio.hfd -nk 001291 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Sante |area= ParkinsonV1 |flux= Main |étape= Curation |type= RBID |clé= ISTEX:A54568780D1B1D763EFA6EF758DC50E82472A1C5 |texte= Quality Assessment of MAGE-ML Genomic Datasets Using DescribeX }}
This area was generated with Dilib version V0.6.23.