MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Super Paramagnetic Clustering of DNA Sequences

Internal identifier: 000688 (Istex/Corpus); previous: 000687; next: 000689

Authors: Sugiarto Radjiman; Han Lianyi; Wang Jian-Sheng; Chen Yu Zong

Source :

RBID : ISTEX:26E3AE23957EF24AC44D870CB1866BB4F49C3740

English descriptors

Abstract

Abstract: An unsupervised clustering of 4541 DNA sequences containing active promoter regions from vertebrate and arthropod classes (including their viral genes) was performed. All necessary information was solely gathered a priori from the DNA sequences by measuring frequencies of tri-nucleotides and tetra-nucleotides. We employed Super Paramagnetic Clustering, a novel clustering algorithm based on physical properties of an inhomogeneous granular ferromagnet. This method utilizes Swendsen-Wang cluster Monte Carlo simulations to distinguish clusters by measuring pairs of correlation functions from different resolutions. We identified two strongly separated clusters of human viral genes corresponding to the Epstein-Barr virus and the Herpes Simplex virus type 1. In addition, vertebrate and arthropod sequences were successfully separated into two different classes with merely 9.25% of arthropod sequences being misclassified. From a functional perspective, these sequences have high gene function correlations with sequences from the vertebrate cluster. By tuning a clustering parameter, Super Paramagnetic Clustering was able to classify vertebrate class further into two major clusters, from where a large number of housekeeping genes and tissue-specific genes were found respectively. The indications came from observation of gene expression function and consensus transcription factors which were found grouped together in specific positions of the DNA sequences.
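The abstract states that the clustering features were frequencies of tri-nucleotides and tetra-nucleotides measured on each sequence. The following is a minimal sketch of how such a feature vector could be computed. It is not the authors' implementation: the function names, the use of overlapping windows, the normalization to relative frequencies, and the concatenation of the two frequency tables into one vector are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k):
    """Return relative frequencies of all 4**k DNA k-mers in `sequence`.

    Assumption: overlapping windows are counted, and windows containing
    symbols other than A, C, G, T are ignored.
    """
    seq = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def feature_vector(sequence):
    """Concatenate tri- and tetra-nucleotide frequencies (64 + 256 values).

    Assumption: the two tables are simply joined, in alphabetical k-mer order.
    """
    tri = kmer_frequencies(sequence, 3)
    tetra = kmer_frequencies(sequence, 4)
    return [tri[m] for m in sorted(tri)] + [tetra[m] for m in sorted(tetra)]
```

Under these assumptions every sequence maps to a vector of fixed length 320, regardless of sequence length, which is what a distance-based clustering method needs as input.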

Url:
DOI: 10.1007/s10867-006-2120-0

Links to Exploration step

ISTEX:26E3AE23957EF24AC44D870CB1866BB4F49C3740

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Super Paramagnetic Clustering of DNA Sequences</title>
<author>
<name sortKey="Radjiman, Sugiarto" sort="Radjiman, Sugiarto" uniqKey="Radjiman S" first="Sugiarto" last="Radjiman">Sugiarto Radjiman</name>
<affiliation>
<mods:affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: sugiarto@cz3.nus.edu.sg</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Lianyi, Han" sort="Lianyi, Han" uniqKey="Lianyi H" first="Han" last="Lianyi">Han Lianyi</name>
<affiliation>
<mods:affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Jian Sheng, Wang" sort="Jian Sheng, Wang" uniqKey="Jian Sheng W" first="Wang" last="Jian-Sheng">Wang Jian-Sheng</name>
<affiliation>
<mods:affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Zong, Chen Yu" sort="Zong, Chen Yu" uniqKey="Zong C" first="Chen Yu" last="Zong">Chen Yu Zong</name>
<affiliation>
<mods:affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:26E3AE23957EF24AC44D870CB1866BB4F49C3740</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/s10867-006-2120-0</idno>
<idno type="url">https://api.istex.fr/ark:/67375/VQC-BLW8RPP5-1/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000688</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000688</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Super Paramagnetic Clustering of DNA Sequences</title>
<author>
<name sortKey="Radjiman, Sugiarto" sort="Radjiman, Sugiarto" uniqKey="Radjiman S" first="Sugiarto" last="Radjiman">Sugiarto Radjiman</name>
<affiliation>
<mods:affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: sugiarto@cz3.nus.edu.sg</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Lianyi, Han" sort="Lianyi, Han" uniqKey="Lianyi H" first="Han" last="Lianyi">Han Lianyi</name>
<affiliation>
<mods:affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Jian Sheng, Wang" sort="Jian Sheng, Wang" uniqKey="Jian Sheng W" first="Wang" last="Jian-Sheng">Wang Jian-Sheng</name>
<affiliation>
<mods:affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Zong, Chen Yu" sort="Zong, Chen Yu" uniqKey="Zong C" first="Chen Yu" last="Zong">Chen Yu Zong</name>
<affiliation>
<mods:affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Journal of Biological Physics</title>
<title level="j" type="abbrev">J Biol Phys</title>
<idno type="ISSN">0092-0606</idno>
<idno type="eISSN">1573-0689</idno>
<imprint>
<publisher>Kluwer Academic Publishers</publisher>
<pubPlace>Dordrecht</pubPlace>
<date type="published" when="2006-01-01">2006-01-01</date>
<biblScope unit="volume">32</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="11">11</biblScope>
<biblScope unit="page" to="25">25</biblScope>
</imprint>
<idno type="ISSN">0092-0606</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0092-0606</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>DNA sequence</term>
<term>cluster algorithm</term>
<term>data clustering</term>
<term>promoters</term>
<term>statistical physics</term>
<term>transcription factor binding sites</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: An unsupervised clustering of 4541 DNA sequences containing active promoter regions from vertebrate and arthropod classes (including their viral genes) was performed. All necessary information was solely gathered a priori from the DNA sequences by measuring frequencies of tri-nucleotides and tetra-nucleotides. We employed Super Paramagnetic Clustering, a novel clustering algorithm based on physical properties of an inhomogeneous granular ferromagnet. This method utilizes Swendsen-Wang cluster Monte Carlo simulations to distinguish clusters by measuring pairs of correlation functions from different resolutions. We identified two strongly separated clusters of human viral genes corresponding to the Epstein-Barr virus and the Herpes Simplex virus type 1. In addition, vertebrate and arthropod sequences were successfully separated into two different classes with merely 9.25% of arthropod sequences being misclassified. From a functional perspective, these sequences have high gene function correlations with sequences from the vertebrate cluster. By tuning a clustering parameter, Super Paramagnetic Clustering was able to classify vertebrate class further into two major clusters, from where a large number of housekeeping genes and tissue-specific genes were found respectively. The indications came from observation of gene expression function and consensus transcription factors which were found grouped together in specific positions of the DNA sequences.</div>
</front>
</TEI>
<istex>
<corpusName>springer-journals</corpusName>
<author>
<json:item>
<name>Sugiarto Radjiman</name>
<affiliations>
<json:string>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</json:string>
<json:string>E-mail: sugiarto@cz3.nus.edu.sg</json:string>
</affiliations>
</json:item>
<json:item>
<name>Han Lianyi</name>
<affiliations>
<json:string>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</json:string>
</affiliations>
</json:item>
<json:item>
<name>Wang Jian-Sheng</name>
<affiliations>
<json:string>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</json:string>
</affiliations>
</json:item>
<json:item>
<name>Chen Yu Zong</name>
<affiliations>
<json:string>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>DNA sequence</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>promoters</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>transcription factor binding sites</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>cluster algorithm</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>data clustering</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>statistical physics</value>
</json:item>
</subject>
<articleId>
<json:string>2120</json:string>
<json:string>DO00022120</json:string>
</articleId>
<arkIstex>ark:/67375/VQC-BLW8RPP5-1</arkIstex>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>OriginalPaper</json:string>
</originalGenre>
<abstract>Abstract: An unsupervised clustering of 4541 DNA sequences containing active promoter regions from vertebrate and arthropod classes (including their viral genes) was performed. All necessary information was solely gathered a priori from the DNA sequences by measuring frequencies of tri-nucleotides and tetra-nucleotides. We employed Super Paramagnetic Clustering, a novel clustering algorithm based on physical properties of an inhomogeneous granular ferromagnet. This method utilizes Swendsen-Wang cluster Monte Carlo simulations to distinguish clusters by measuring pairs of correlation functions from different resolutions. We identified two strongly separated clusters of human viral genes corresponding to the Epstein-Barr virus and the Herpes Simplex virus type 1. In addition, vertebrate and arthropod sequences were successfully separated into two different classes with merely 9.25% of arthropod sequences being misclassified. From a functional perspective, these sequences have high gene function correlations with sequences from the vertebrate cluster. By tuning a clustering parameter, Super Paramagnetic Clustering was able to classify vertebrate class further into two major clusters, from where a large number of housekeeping genes and tissue-specific genes were found respectively. The indications came from observation of gene expression function and consensus transcription factors which were found grouped together in specific positions of the DNA sequences.</abstract>
<qualityIndicators>
<refBibsNative>false</refBibsNative>
<abstractWordCount>198</abstractWordCount>
<abstractCharCount>1478</abstractCharCount>
<keywordCount>6</keywordCount>
<score>9.113</score>
<pdfWordCount>4737</pdfWordCount>
<pdfCharCount>29062</pdfCharCount>
<pdfVersion>1.3</pdfVersion>
<pdfPageCount>15</pdfPageCount>
<pdfPageSize>595 x 842 pts (A4)</pdfPageSize>
</qualityIndicators>
<title>Super Paramagnetic Clustering of DNA Sequences</title>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<title>Journal of Biological Physics</title>
<language>
<json:string>unknown</json:string>
</language>
<publicationDate>2006</publicationDate>
<copyrightDate>2006</copyrightDate>
<issn>
<json:string>0092-0606</json:string>
</issn>
<eissn>
<json:string>1573-0689</json:string>
</eissn>
<journalId>
<json:string>10867</json:string>
</journalId>
<volume>32</volume>
<issue>1</issue>
<pages>
<first>11</first>
<last>25</last>
</pages>
<genre>
<json:string>journal</json:string>
</genre>
<subject>
<json:item>
<value>Neurosciences</value>
</json:item>
<json:item>
<value>Polymer Sciences</value>
</json:item>
<json:item>
<value>Bioinformatics</value>
</json:item>
<json:item>
<value>Statistical Physics</value>
</json:item>
<json:item>
<value>Condensed Matter</value>
</json:item>
<json:item>
<value>Biophysics/Biomedical Physics</value>
</json:item>
</subject>
</host>
<ark>
<json:string>ark:/67375/VQC-BLW8RPP5-1</json:string>
</ark>
<publicationDate>2006</publicationDate>
<copyrightDate>2006</copyrightDate>
<doi>
<json:string>10.1007/s10867-006-2120-0</json:string>
</doi>
<id>26E3AE23957EF24AC44D870CB1866BB4F49C3740</id>
<score>1</score>
<fulltext>
<json:item>
<extension>pdf</extension>
<original>true</original>
<mimetype>application/pdf</mimetype>
<uri>https://api.istex.fr/ark:/67375/VQC-BLW8RPP5-1/fulltext.pdf</uri>
</json:item>
<json:item>
<extension>zip</extension>
<original>false</original>
<mimetype>application/zip</mimetype>
<uri>https://api.istex.fr/ark:/67375/VQC-BLW8RPP5-1/bundle.zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/ark:/67375/VQC-BLW8RPP5-1/fulltext.tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">Super Paramagnetic Clustering of DNA Sequences</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher scheme="https://scientific-publisher.data.istex.fr">Kluwer Academic Publishers</publisher>
<pubPlace>Dordrecht</pubPlace>
<availability>
<licence>
<p>Springer Science + Business Media, Inc., 2006</p>
</licence>
<p scheme="https://loaded-corpus.data.istex.fr/ark:/67375/XBH-3XSW68JL-F">springer</p>
</availability>
<date>2006</date>
</publicationStmt>
<notesStmt>
<note type="research-article" scheme="https://content-type.data.istex.fr/ark:/67375/XTP-1JC4F85T-7">research-article</note>
<note type="journal" scheme="https://publication-type.data.istex.fr/ark:/67375/JMC-0GLKJH51-B">journal</note>
</notesStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">Super Paramagnetic Clustering of DNA Sequences</title>
<author xml:id="author-0000" corresp="yes">
<persName>
<forename type="first">Sugiarto</forename>
<surname>Radjiman</surname>
</persName>
<email>sugiarto@cz3.nus.edu.sg</email>
<affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</affiliation>
</author>
<author xml:id="author-0001">
<persName>
<forename type="first">Han</forename>
<surname>Lianyi</surname>
</persName>
<affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</affiliation>
</author>
<author xml:id="author-0002">
<persName>
<forename type="first">Wang</forename>
<surname>Jian-Sheng</surname>
</persName>
<affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</affiliation>
</author>
<author xml:id="author-0003">
<persName>
<forename type="first">Chen</forename>
<surname>Zong</surname>
</persName>
<affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</affiliation>
</author>
<idno type="istex">26E3AE23957EF24AC44D870CB1866BB4F49C3740</idno>
<idno type="ark">ark:/67375/VQC-BLW8RPP5-1</idno>
<idno type="DOI">10.1007/s10867-006-2120-0</idno>
<idno type="article-id">2120</idno>
<idno type="article-id">DO00022120</idno>
</analytic>
<monogr>
<title level="j">Journal of Biological Physics</title>
<title level="j" type="abbrev">J Biol Phys</title>
<idno type="pISSN">0092-0606</idno>
<idno type="eISSN">1573-0689</idno>
<idno type="journal-ID">true</idno>
<idno type="journal-SPIN">JOBP</idno>
<idno type="issue-article-count">5</idno>
<idno type="volume-issue-count">4</idno>
<imprint>
<publisher>Kluwer Academic Publishers</publisher>
<pubPlace>Dordrecht</pubPlace>
<date type="published" when="2006-01-01"></date>
<biblScope unit="volume">32</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="11">11</biblScope>
<biblScope unit="page" to="25">25</biblScope>
</imprint>
</monogr>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2006</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>Abstract: An unsupervised clustering of 4541 DNA sequences containing active promoter regions from vertebrate and arthropod classes (including their viral genes) was performed. All necessary information was solely gathered a priori from the DNA sequences by measuring frequencies of tri-nucleotides and tetra-nucleotides. We employed Super Paramagnetic Clustering, a novel clustering algorithm based on physical properties of an inhomogeneous granular ferromagnet. This method utilizes Swendsen-Wang cluster Monte Carlo simulations to distinguish clusters by measuring pairs of correlation functions from different resolutions. We identified two strongly separated clusters of human viral genes corresponding to the Epstein-Barr virus and the Herpes Simplex virus type 1. In addition, vertebrate and arthropod sequences were successfully separated into two different classes with merely 9.25% of arthropod sequences being misclassified. From a functional perspective, these sequences have high gene function correlations with sequences from the vertebrate cluster. By tuning a clustering parameter, Super Paramagnetic Clustering was able to classify vertebrate class further into two major clusters, from where a large number of housekeeping genes and tissue-specific genes were found respectively. The indications came from observation of gene expression function and consensus transcription factors which were found grouped together in specific positions of the DNA sequences.</p>
</abstract>
<textClass xml:lang="en">
<keywords scheme="keyword">
<list>
<head>Key words</head>
<item>
<term>DNA sequence</term>
</item>
<item>
<term>promoters</term>
</item>
<item>
<term>transcription factor binding sites</term>
</item>
<item>
<term>cluster algorithm</term>
</item>
<item>
<term>data clustering</term>
</item>
<item>
<term>statistical physics</term>
</item>
</list>
</keywords>
</textClass>
<textClass>
<keywords scheme="Journal Subject">
<list>
<head>Physics</head>
<item>
<term>Neurosciences</term>
</item>
<item>
<term>Polymer Sciences</term>
</item>
<item>
<term>Bioinformatics</term>
</item>
<item>
<term>Statistical Physics</term>
</item>
<item>
<term>Condensed Matter</term>
</item>
<item>
<term>Biophysics/Biomedical Physics</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="2006-01-01">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<extension>txt</extension>
<original>false</original>
<mimetype>text/plain</mimetype>
<uri>https://api.istex.fr/ark:/67375/VQC-BLW8RPP5-1/fulltext.txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus springer-journals not found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document>
<Publisher>
<PublisherInfo>
<PublisherName>Kluwer Academic Publishers</PublisherName>
<PublisherLocation>Dordrecht</PublisherLocation>
</PublisherInfo>
<Journal>
<JournalInfo JournalProductType="ArchiveJournal" NumberingStyle="Unnumbered">
<JournalID>10867</JournalID>
<JournalPrintISSN>0092-0606</JournalPrintISSN>
<JournalElectronicISSN>1573-0689</JournalElectronicISSN>
<JournalSPIN>JOBP</JournalSPIN>
<JournalTitle>Journal of Biological Physics</JournalTitle>
<JournalAbbreviatedTitle>J Biol Phys</JournalAbbreviatedTitle>
<JournalSubjectGroup>
<JournalSubject Type="Primary">Physics</JournalSubject>
<JournalSubject Type="Secondary">Neurosciences</JournalSubject>
<JournalSubject Type="Secondary">Polymer Sciences</JournalSubject>
<JournalSubject Type="Secondary">Bioinformatics</JournalSubject>
<JournalSubject Type="Secondary">Statistical Physics</JournalSubject>
<JournalSubject Type="Secondary">Condensed Matter</JournalSubject>
<JournalSubject Type="Secondary">Biophysics/Biomedical Physics</JournalSubject>
</JournalSubjectGroup>
</JournalInfo>
<Volume>
<VolumeInfo TocLevels="0" VolumeType="Regular">
<VolumeIDStart>32</VolumeIDStart>
<VolumeIDEnd>32</VolumeIDEnd>
<VolumeIssueCount>4</VolumeIssueCount>
</VolumeInfo>
<Issue IssueType="Regular">
<IssueInfo TocLevels="0">
<IssueIDStart>1</IssueIDStart>
<IssueIDEnd>1</IssueIDEnd>
<IssueArticleCount>5</IssueArticleCount>
<IssueHistory>
<CoverDate>
<Year>2006</Year>
<Month>1</Month>
</CoverDate>
</IssueHistory>
<IssueCopyright>
<CopyrightHolderName>Springer Science + Business Media, Inc.</CopyrightHolderName>
<CopyrightYear>2006</CopyrightYear>
</IssueCopyright>
</IssueInfo>
<Article ID="DO00022120">
<ArticleInfo ArticleType="OriginalPaper" ContainsESM="No" Language="En" NumberingStyle="Unnumbered" TocLevels="0">
<ArticleID>2120</ArticleID>
<ArticleDOI>10.1007/s10867-006-2120-0</ArticleDOI>
<ArticleSequenceNumber>3</ArticleSequenceNumber>
<ArticleTitle Language="En">Super Paramagnetic Clustering of DNA Sequences</ArticleTitle>
<ArticleFirstPage>11</ArticleFirstPage>
<ArticleLastPage>25</ArticleLastPage>
<ArticleHistory>
<RegistrationDate>
<Year>2006</Year>
<Month>1</Month>
<Day>1</Day>
</RegistrationDate>
</ArticleHistory>
<ArticleCopyright>
<CopyrightHolderName>Springer Science + Business Media, Inc.</CopyrightHolderName>
<CopyrightYear>2006</CopyrightYear>
</ArticleCopyright>
<ArticleGrants Type="Regular">
<MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ArticleGrants>
</ArticleInfo>
<ArticleHeader>
<AuthorGroup>
<Author AffiliationIDS="Aff1" CorrespondingAffiliationID="Aff1">
<AuthorName DisplayOrder="Western">
<GivenName>Sugiarto</GivenName>
<FamilyName>Radjiman</FamilyName>
</AuthorName>
<Contact>
<Email>sugiarto@cz3.nus.edu.sg</Email>
</Contact>
</Author>
<Author AffiliationIDS="Aff1">
<AuthorName DisplayOrder="Western">
<GivenName>Han</GivenName>
<FamilyName>Lianyi</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff1">
<AuthorName DisplayOrder="Western">
<GivenName>Wang</GivenName>
<FamilyName>Jian-Sheng</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff1">
<AuthorName DisplayOrder="Western">
<GivenName>Chen</GivenName>
<GivenName>Yu</GivenName>
<FamilyName>Zong</FamilyName>
</AuthorName>
</Author>
<Affiliation ID="Aff1">
<OrgDivision>Department of Computational Science</OrgDivision>
<OrgName>National University of Singapore</OrgName>
<OrgAddress>
<Postcode>117543</Postcode>
<City>Singapore</City>
<Country Code="SG">Republic of Singapore</Country>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En">
<Heading>Abstract</Heading>
<Para>An unsupervised clustering of 4541 DNA sequences containing active promoter regions from vertebrate and arthropod classes (including their viral genes) was performed. All necessary information was solely gathered a priori from the DNA sequences by measuring frequencies of tri-nucleotides and tetra-nucleotides. We employed Super Paramagnetic Clustering, a novel clustering algorithm based on physical properties of an inhomogeneous granular ferromagnet. This method utilizes Swendsen-Wang cluster Monte Carlo simulations to distinguish clusters by measuring pairs of correlation functions from different resolutions. We identified two strongly separated clusters of human viral genes corresponding to the Epstein-Barr virus and the Herpes Simplex virus type 1. In addition, vertebrate and arthropod sequences were successfully separated into two different classes with merely 9.25% of arthropod sequences being misclassified. From a functional perspective, these sequences have high gene function correlations with sequences from the vertebrate cluster. By tuning a clustering parameter, Super Paramagnetic Clustering was able to classify vertebrate class further into two major clusters, from where a large number of housekeeping genes and tissue-specific genes were found respectively. The indications came from observation of gene expression function and consensus transcription factors which were found grouped together in specific positions of the DNA sequences.</Para>
</Abstract>
<KeywordGroup Language="En">
<Heading>Key words</Heading>
<Keyword>DNA sequence</Keyword>
<Keyword>promoters</Keyword>
<Keyword>transcription factor binding sites</Keyword>
<Keyword>cluster algorithm</Keyword>
<Keyword>data clustering</Keyword>
<Keyword>statistical physics</Keyword>
</KeywordGroup>
</ArticleHeader>
<NoBody></NoBody>
</Article>
</Issue>
</Volume>
</Journal>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>Super Paramagnetic Clustering of DNA Sequences</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA">
<title>Super Paramagnetic Clustering of DNA Sequences</title>
</titleInfo>
<name type="personal" displayLabel="corresp">
<namePart type="given">Sugiarto</namePart>
<namePart type="family">Radjiman</namePart>
<affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</affiliation>
<affiliation>E-mail: sugiarto@cz3.nus.edu.sg</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Han</namePart>
<namePart type="family">Lianyi</namePart>
<affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Wang</namePart>
<namePart type="family">Jian-Sheng</namePart>
<affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Chen</namePart>
<namePart type="given">Yu</namePart>
<namePart type="family">Zong</namePart>
<affiliation>Department of Computational Science, National University of Singapore, 117543, Singapore, Republic of Singapore</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="OriginalPaper" authority="ISTEX" authorityURI="https://content-type.data.istex.fr" valueURI="https://content-type.data.istex.fr/ark:/67375/XTP-1JC4F85T-7">research-article</genre>
<originInfo>
<publisher>Kluwer Academic Publishers</publisher>
<place>
<placeTerm type="text">Dordrecht</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2006-01-01</dateIssued>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<abstract lang="en">Abstract: An unsupervised clustering of 4541 DNA sequences containing active promoter regions from vertebrate and arthropod classes (including their viral genes) was performed. All necessary information was solely gathered a priori from the DNA sequences by measuring frequencies of tri-nucleotides and tetra-nucleotides. We employed Super Paramagnetic Clustering, a novel clustering algorithm based on physical properties of an inhomogeneous granular ferromagnet. This method utilizes Swendsen-Wang cluster Monte Carlo simulations to distinguish clusters by measuring pairs of correlation functions from different resolutions. We identified two strongly separated clusters of human viral genes corresponding to the Epstein-Barr virus and the Herpes Simplex virus type 1. In addition, vertebrate and arthropod sequences were successfully separated into two different classes with merely 9.25% of arthropod sequences being misclassified. From a functional perspective, these sequences have high gene function correlations with sequences from the vertebrate cluster. By tuning a clustering parameter, Super Paramagnetic Clustering was able to classify vertebrate class further into two major clusters, from where a large number of housekeeping genes and tissue-specific genes were found respectively. The indications came from observation of gene expression function and consensus transcription factors which were found grouped together in specific positions of the DNA sequences.</abstract>
<subject lang="en">
<genre>Key words</genre>
<topic>DNA sequence</topic>
<topic>promoters</topic>
<topic>transcription factor binding sites</topic>
<topic>cluster algorithm</topic>
<topic>data clustering</topic>
<topic>statistical physics</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>Journal of Biological Physics</title>
</titleInfo>
<titleInfo type="abbreviated">
<title>J Biol Phys</title>
</titleInfo>
<genre type="journal" authority="ISTEX" authorityURI="https://publication-type.data.istex.fr" valueURI="https://publication-type.data.istex.fr/ark:/67375/JMC-0GLKJH51-B">journal</genre>
<originInfo>
<publisher>Springer</publisher>
<dateIssued encoding="w3cdtf">2006-01-01</dateIssued>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
</originInfo>
<subject>
<genre>Physics</genre>
<topic>Neurosciences</topic>
<topic>Polymer Sciences</topic>
<topic>Bioinformatics</topic>
<topic>Statistical Physics</topic>
<topic>Condensed Matter</topic>
<topic>Biophysics/Biomedical Physics</topic>
</subject>
<identifier type="ISSN">0092-0606</identifier>
<identifier type="eISSN">1573-0689</identifier>
<identifier type="JournalID">10867</identifier>
<identifier type="JournalSPIN">JOBP</identifier>
<identifier type="IssueArticleCount">5</identifier>
<identifier type="VolumeIssueCount">4</identifier>
<part>
<date>2006</date>
<detail type="volume">
<number>32</number>
<caption>vol.</caption>
</detail>
<detail type="issue">
<number>1</number>
<caption>no.</caption>
</detail>
<extent unit="pages">
<start>11</start>
<end>25</end>
</extent>
</part>
<recordInfo>
<recordOrigin>Springer Science + Business Media, Inc., 2006</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">26E3AE23957EF24AC44D870CB1866BB4F49C3740</identifier>
<identifier type="ark">ark:/67375/VQC-BLW8RPP5-1</identifier>
<identifier type="DOI">10.1007/s10867-006-2120-0</identifier>
<identifier type="ArticleID">2120</identifier>
<identifier type="ArticleID">DO00022120</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer Science + Business Media, Inc., 2006</accessCondition>
<recordInfo>
<recordContentSource authority="ISTEX" authorityURI="https://loaded-corpus.data.istex.fr" valueURI="https://loaded-corpus.data.istex.fr/ark:/67375/XBH-3XSW68JL-F">springer</recordContentSource>
<recordOrigin>Springer Science + Business Media, Inc., 2006</recordOrigin>
</recordInfo>
</mods>
<json:item>
<extension>json</extension>
<original>false</original>
<mimetype>application/json</mimetype>
<uri>https://api.istex.fr/ark:/67375/VQC-BLW8RPP5-1/record.json</uri>
</json:item>
</metadata>
<serie></serie>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000688 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000688 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:26E3AE23957EF24AC44D870CB1866BB4F49C3740
   |texte=   Super Paramagnetic Clustering of DNA Sequences
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021