Regularization in finite mixture of regression models with diverging number of parameters.
Internal identifier: 000906 (PubMed/Curation); previous: 000905; next: 000907
Authors: Abbas Khalili [Canada]; Shili Lin
Source:
- Biometrics [ 1541-0420 ] ; 2013.
English descriptors
- KwdEn :
- Algorithms, Biometry (methods), Breast Neoplasms (mortality), Breast Neoplasms (pathology), Cell Nucleus (pathology), Computer Simulation, Disease Progression, Female, Humans, Likelihood Functions, Models, Statistical, Parkinson Disease (diagnosis), Parkinson Disease (physiopathology), Phonetics, Regression Analysis, Telemedicine (statistics & numerical data).
- MESH :
- diagnosis : Parkinson Disease.
- methods : Biometry.
- mortality : Breast Neoplasms.
- pathology : Breast Neoplasms, Cell Nucleus.
- physiopathology : Parkinson Disease.
- statistics & numerical data : Telemedicine.
- Algorithms, Computer Simulation, Disease Progression, Female, Humans, Likelihood Functions, Models, Statistical, Phonetics, Regression Analysis.
Abstract
Feature (variable) selection has become a fundamentally important problem in recent statistical literature. Sometimes, in applications, many variables are introduced to reduce possible modeling biases, but the number of variables a model can accommodate is often limited by the amount of data available. In other words, the number of variables considered depends on the sample size, which reflects the estimability of the parametric model. In this article, we consider the problem of feature selection in finite mixture of regression models when the number of parameters in the model can increase with the sample size. We propose a penalized likelihood approach for feature selection in these models. Under certain regularity conditions, our approach leads to consistent variable selection. We carry out extensive simulation studies to evaluate the performance of the proposed approach under controlled settings. We also apply the proposed method to two real data sets. The first is on telemonitoring of Parkinson's disease (PD), where the problem concerns whether dysphonic features extracted from the patients' speech signals recorded at home can be used as surrogates to study PD severity and progression. The second is on breast cancer prognosis, in which one is interested in assessing whether cell nuclear features may offer prognostic value for long-term survival of breast cancer patients. Our analysis in each of the applications revealed a mixture structure in the study population and uncovered a unique relationship between the features and the response variable in each of the mixture components.
DOI: 10.1111/biom.12020
PubMed: 23556535
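The penalized likelihood idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the penalty, so everything below is an assumption made for illustration only: Gaussian components, no intercept, and an L1 penalty weighted by the mixing proportions and scaled by the sample size. The function names (`fmr_log_likelihood`, `l1_penalty`, `penalized_log_likelihood`) and the tuning parameter `lam` are hypothetical and are not taken from the article.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution N(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fmr_log_likelihood(X, y, pis, betas, sigmas):
    """Log-likelihood of a K-component Gaussian mixture of linear regressions.

    X: list of feature rows, y: list of responses,
    pis: mixing proportions, betas: one coefficient list per component,
    sigmas: one error standard deviation per component.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        density = 0.0
        for pi_k, beta_k, sigma_k in zip(pis, betas, sigmas):
            mean = sum(b * v for b, v in zip(beta_k, x_i))
            density += pi_k * normal_pdf(y_i, mean, sigma_k)
        total += math.log(density)
    return total


def l1_penalty(n, pis, betas, lam):
    """Assumed penalty: n * lam * sum_k pi_k * sum_j |beta_kj| (illustrative)."""
    weighted = sum(
        pi_k * sum(abs(b) for b in beta_k) for pi_k, beta_k in zip(pis, betas)
    )
    return n * lam * weighted


def penalized_log_likelihood(X, y, pis, betas, sigmas, lam):
    """Log-likelihood minus the assumed L1 penalty; larger values are better."""
    return fmr_log_likelihood(X, y, pis, betas, sigmas) - l1_penalty(
        len(y), pis, betas, lam
    )


# Toy data, invented for this sketch: four observations, two features.
X = [[1.0, 0.5], [0.2, 1.0], [1.5, -0.3], [-0.7, 0.8]]
y = [1.1, 0.1, -2.9, 1.6]
pis = [0.5, 0.5]
betas = [[1.0, 0.0], [-2.0, 0.5]]
sigmas = [1.0, 1.0]

unpenalized = penalized_log_likelihood(X, y, pis, betas, sigmas, lam=0.0)
penalized = penalized_log_likelihood(X, y, pis, betas, sigmas, lam=0.1)
```

In a sketch like this, maximizing the penalized objective over the coefficients tends to shrink small coefficients toward zero, which is the mechanism by which a penalty can perform feature selection within each mixture component. The article's actual penalty and estimation algorithm may differ.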
Links toward previous steps (curation, corpus...)
- to stream PubMed, to step Corpus: to go to this record in the Curation step: 000906
Links to Exploration step
- pubmed:23556535
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Regularization in finite mixture of regression models with diverging number of parameters.</title>
<author><name sortKey="Khalili, Abbas" sort="Khalili, Abbas" uniqKey="Khalili A" first="Abbas" last="Khalili">Abbas Khalili</name>
<affiliation wicri:level="1"><nlm:affiliation>Department of Mathematics and Statistics, McGill University, Montreal, Quebec, Canada H3A 2K6. khalili@math.mcgill.ca</nlm:affiliation>
<country>Canada</country>
<wicri:regionArea>Department of Mathematics and Statistics, McGill University, Montreal, Quebec</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Lin, Shili" sort="Lin, Shili" uniqKey="Lin S" first="Shili" last="Lin">Shili Lin</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2013">2013</date>
<idno type="RBID">pubmed:23556535</idno>
<idno type="pmid">23556535</idno>
<idno type="doi">10.1111/biom.12020</idno>
<idno type="wicri:Area/PubMed/Corpus">000906</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000906</idno>
<idno type="wicri:Area/PubMed/Curation">000906</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000906</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Regularization in finite mixture of regression models with diverging number of parameters.</title>
<author><name sortKey="Khalili, Abbas" sort="Khalili, Abbas" uniqKey="Khalili A" first="Abbas" last="Khalili">Abbas Khalili</name>
<affiliation wicri:level="1"><nlm:affiliation>Department of Mathematics and Statistics, McGill University, Montreal, Quebec, Canada H3A 2K6. khalili@math.mcgill.ca</nlm:affiliation>
<country>Canada</country>
<wicri:regionArea>Department of Mathematics and Statistics, McGill University, Montreal, Quebec</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Lin, Shili" sort="Lin, Shili" uniqKey="Lin S" first="Shili" last="Lin">Shili Lin</name>
</author>
</analytic>
<series><title level="j">Biometrics</title>
<idno type="eISSN">1541-0420</idno>
<imprint><date when="2013" type="published">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Biometry (methods)</term>
<term>Breast Neoplasms (mortality)</term>
<term>Breast Neoplasms (pathology)</term>
<term>Cell Nucleus (pathology)</term>
<term>Computer Simulation</term>
<term>Disease Progression</term>
<term>Female</term>
<term>Humans</term>
<term>Likelihood Functions</term>
<term>Models, Statistical</term>
<term>Parkinson Disease (diagnosis)</term>
<term>Parkinson Disease (physiopathology)</term>
<term>Phonetics</term>
<term>Regression Analysis</term>
<term>Telemedicine (statistics &amp; numerical data)</term>
</keywords>
<keywords scheme="MESH" qualifier="diagnosis" xml:lang="en"><term>Parkinson Disease</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Biometry</term>
</keywords>
<keywords scheme="MESH" qualifier="mortality" xml:lang="en"><term>Breast Neoplasms</term>
</keywords>
<keywords scheme="MESH" qualifier="pathology" xml:lang="en"><term>Breast Neoplasms</term>
<term>Cell Nucleus</term>
</keywords>
<keywords scheme="MESH" qualifier="physiopathology" xml:lang="en"><term>Parkinson Disease</term>
</keywords>
<keywords scheme="MESH" qualifier="statistics &amp; numerical data" xml:lang="en"><term>Telemedicine</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Computer Simulation</term>
<term>Disease Progression</term>
<term>Female</term>
<term>Humans</term>
<term>Likelihood Functions</term>
<term>Models, Statistical</term>
<term>Phonetics</term>
<term>Regression Analysis</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Feature (variable) selection has become a fundamentally important problem in recent statistical literature. Sometimes, in applications, many variables are introduced to reduce possible modeling biases, but the number of variables a model can accommodate is often limited by the amount of data available. In other words, the number of variables considered depends on the sample size, which reflects the estimability of the parametric model. In this article, we consider the problem of feature selection in finite mixture of regression models when the number of parameters in the model can increase with the sample size. We propose a penalized likelihood approach for feature selection in these models. Under certain regularity conditions, our approach leads to consistent variable selection. We carry out extensive simulation studies to evaluate the performance of the proposed approach under controlled settings. We also applied the proposed method to two real data. The first is on telemonitoring of Parkinson's disease (PD), where the problem concerns whether dysphonic features extracted from the patients' speech signals recorded at home can be used as surrogates to study PD severity and progression. The second is on breast cancer prognosis, in which one is interested in assessing whether cell nuclear features may offer prognostic values on long-term survival of breast cancer patients. Our analysis in each of the application revealed a mixture structure in the study population and uncovered a unique relationship between the features and the response variable in each of the mixture component.</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="MEDLINE" Owner="NLM"><PMID Version="1">23556535</PMID>
<DateCreated><Year>2013</Year>
<Month>06</Month>
<Day>26</Day>
</DateCreated>
<DateCompleted><Year>2014</Year>
<Month>01</Month>
<Day>27</Day>
</DateCompleted>
<DateRevised><Year>2013</Year>
<Month>06</Month>
<Day>26</Day>
</DateRevised>
<Article PubModel="Print-Electronic"><Journal><ISSN IssnType="Electronic">1541-0420</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>69</Volume>
<Issue>2</Issue>
<PubDate><Year>2013</Year>
<Month>Jun</Month>
</PubDate>
</JournalIssue>
<Title>Biometrics</Title>
<ISOAbbreviation>Biometrics</ISOAbbreviation>
</Journal>
<ArticleTitle>Regularization in finite mixture of regression models with diverging number of parameters.</ArticleTitle>
<Pagination><MedlinePgn>436-46</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1111/biom.12020</ELocationID>
<Abstract><AbstractText>Feature (variable) selection has become a fundamentally important problem in recent statistical literature. Sometimes, in applications, many variables are introduced to reduce possible modeling biases, but the number of variables a model can accommodate is often limited by the amount of data available. In other words, the number of variables considered depends on the sample size, which reflects the estimability of the parametric model. In this article, we consider the problem of feature selection in finite mixture of regression models when the number of parameters in the model can increase with the sample size. We propose a penalized likelihood approach for feature selection in these models. Under certain regularity conditions, our approach leads to consistent variable selection. We carry out extensive simulation studies to evaluate the performance of the proposed approach under controlled settings. We also applied the proposed method to two real data. The first is on telemonitoring of Parkinson's disease (PD), where the problem concerns whether dysphonic features extracted from the patients' speech signals recorded at home can be used as surrogates to study PD severity and progression. The second is on breast cancer prognosis, in which one is interested in assessing whether cell nuclear features may offer prognostic values on long-term survival of breast cancer patients. Our analysis in each of the application revealed a mixture structure in the study population and uncovered a unique relationship between the features and the response variable in each of the mixture component.</AbstractText>
<CopyrightInformation>© 2013, The International Biometric Society.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Khalili</LastName>
<ForeName>Abbas</ForeName>
<Initials>A</Initials>
<AffiliationInfo><Affiliation>Department of Mathematics and Statistics, McGill University, Montreal, Quebec, Canada H3A 2K6. khalili@math.mcgill.ca</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Lin</LastName>
<ForeName>Shili</ForeName>
<Initials>S</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
<PublicationType UI="D013486">Research Support, U.S. Gov't, Non-P.H.S.</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic"><Year>2013</Year>
<Month>04</Month>
<Day>04</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo><Country>United States</Country>
<MedlineTA>Biometrics</MedlineTA>
<NlmUniqueID>0370625</NlmUniqueID>
<ISSNLinking>0006-341X</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList><MeshHeading><DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D001699" MajorTopicYN="N">Biometry</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D001943" MajorTopicYN="N">Breast Neoplasms</DescriptorName>
<QualifierName UI="Q000401" MajorTopicYN="N">mortality</QualifierName>
<QualifierName UI="Q000473" MajorTopicYN="N">pathology</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D002467" MajorTopicYN="N">Cell Nucleus</DescriptorName>
<QualifierName UI="Q000473" MajorTopicYN="N">pathology</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D003198" MajorTopicYN="N">Computer Simulation</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D018450" MajorTopicYN="N">Disease Progression</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D005260" MajorTopicYN="N">Female</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D016013" MajorTopicYN="N">Likelihood Functions</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D015233" MajorTopicYN="Y">Models, Statistical</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D010300" MajorTopicYN="N">Parkinson Disease</DescriptorName>
<QualifierName UI="Q000175" MajorTopicYN="N">diagnosis</QualifierName>
<QualifierName UI="Q000503" MajorTopicYN="N">physiopathology</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D010700" MajorTopicYN="N">Phonetics</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D012044" MajorTopicYN="Y">Regression Analysis</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D017216" MajorTopicYN="N">Telemedicine</DescriptorName>
<QualifierName UI="Q000706" MajorTopicYN="N">statistics &amp; numerical data</QualifierName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="received"><Year>2011</Year>
<Month>05</Month>
<Day>01</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="revised"><Year>2012</Year>
<Month>12</Month>
<Day>01</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted"><Year>2013</Year>
<Month>01</Month>
<Day>01</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2013</Year>
<Month>4</Month>
<Day>6</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed"><Year>2013</Year>
<Month>4</Month>
<Day>6</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2014</Year>
<Month>1</Month>
<Day>28</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">23556535</ArticleId>
<ArticleId IdType="doi">10.1111/biom.12020</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Canada/explor/ParkinsonCanadaV1/Data/PubMed/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000906 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd -nk 000906 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Canada |area= ParkinsonCanadaV1 |flux= PubMed |étape= Curation |type= RBID |clé= pubmed:23556535 |texte= Regularization in finite mixture of regression models with diverging number of parameters. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Curation/RBID.i -Sk "pubmed:23556535" \
    | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd \
    | NlmPubMed2Wicri -a ParkinsonCanadaV1
This area was generated with Dilib version V0.6.29.