Exploration server on relations between France and Australia

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Using Benford’s law to investigate Natural Hazard dataset homogeneity

Internal identifier: 000805 (Pmc/Corpus); previous: 000804; next: 000806

Authors: Renaud Joannes-Boyau; Thomas Bodin; Anja Scheffers; Malcolm Sambridge; Simon Matthias May

Source:

RBID : PMC:4496784

Abstract

Working with a large temporal dataset spanning several decades often represents a challenging task, especially when the record is heterogeneous and incomplete. The use of statistical laws could potentially overcome these problems. Here we apply Benford’s Law (also called the “First-Digit Law”) to the traveled distances of tropical cyclones since 1842. The record of tropical cyclones has been extensively impacted by improvements in detection capabilities over the past decades. We have found that, while the first-digit distribution for the entire record follows Benford’s Law prediction, specific changes such as satellite detection have had serious impacts on the dataset. The least-square misfit measure is used as a proxy to observe temporal variations, allowing us to assess data quality and homogeneity over the entire record, and at the same time over specific periods. Such information is crucial when running climatic models and Benford’s Law could potentially be used to overcome and correct for data heterogeneity and/or to select the most appropriate part of the record for detailed studies.


Url:
DOI: 10.1038/srep12046
PubMed: 26156060
PubMed Central: 4496784

Links to Exploration step

PMC:4496784

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Using Benford’s law to investigate Natural Hazard dataset homogeneity</title>
<author>
<name sortKey="Joannes Boyau, Renaud" sort="Joannes Boyau, Renaud" uniqKey="Joannes Boyau R" first="Renaud" last="Joannes-Boyau">Renaud Joannes-Boyau</name>
<affiliation>
<nlm:aff id="a1">
<institution>Southern Cross GeoScience, Southern Cross University</institution>
, Lismore, NSW, 2480,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bodin, Thomas" sort="Bodin, Thomas" uniqKey="Bodin T" first="Thomas" last="Bodin">Thomas Bodin</name>
<affiliation>
<nlm:aff id="a2">
<institution>Earth and Planetary Science, University of California Berkeley</institution>
, CA,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a5">
<institution>Laboratoire de Géologie de Lyon, Ecole Normale Supérieure de Lyon, Université de Lyon-1, CNRS</institution>
, 69364 Lyon Cedex 07,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Scheffers, Anja" sort="Scheffers, Anja" uniqKey="Scheffers A" first="Anja" last="Scheffers">Anja Scheffers</name>
<affiliation>
<nlm:aff id="a1">
<institution>Southern Cross GeoScience, Southern Cross University</institution>
, Lismore, NSW, 2480,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sambridge, Malcolm" sort="Sambridge, Malcolm" uniqKey="Sambridge M" first="Malcolm" last="Sambridge">Malcolm Sambridge</name>
<affiliation>
<nlm:aff id="a3">
<institution>Research School of Earth Sciences, Australian National University</institution>
, Canberra, ACT, 0200,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="May, Simon Matthias" sort="May, Simon Matthias" uniqKey="May S" first="Simon Matthias" last="May">Simon Matthias May</name>
<affiliation>
<nlm:aff id="a4">
<institution>Institute of Geography, University of Cologne</institution>
, 50923, Cologne,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26156060</idno>
<idno type="pmc">4496784</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4496784</idno>
<idno type="RBID">PMC:4496784</idno>
<idno type="doi">10.1038/srep12046</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000805</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000805</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Using Benford’s law to investigate Natural Hazard dataset homogeneity</title>
<author>
<name sortKey="Joannes Boyau, Renaud" sort="Joannes Boyau, Renaud" uniqKey="Joannes Boyau R" first="Renaud" last="Joannes-Boyau">Renaud Joannes-Boyau</name>
<affiliation>
<nlm:aff id="a1">
<institution>Southern Cross GeoScience, Southern Cross University</institution>
, Lismore, NSW, 2480,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bodin, Thomas" sort="Bodin, Thomas" uniqKey="Bodin T" first="Thomas" last="Bodin">Thomas Bodin</name>
<affiliation>
<nlm:aff id="a2">
<institution>Earth and Planetary Science, University of California Berkeley</institution>
, CA,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a5">
<institution>Laboratoire de Géologie de Lyon, Ecole Normale Supérieure de Lyon, Université de Lyon-1, CNRS</institution>
, 69364 Lyon Cedex 07,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Scheffers, Anja" sort="Scheffers, Anja" uniqKey="Scheffers A" first="Anja" last="Scheffers">Anja Scheffers</name>
<affiliation>
<nlm:aff id="a1">
<institution>Southern Cross GeoScience, Southern Cross University</institution>
, Lismore, NSW, 2480,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sambridge, Malcolm" sort="Sambridge, Malcolm" uniqKey="Sambridge M" first="Malcolm" last="Sambridge">Malcolm Sambridge</name>
<affiliation>
<nlm:aff id="a3">
<institution>Research School of Earth Sciences, Australian National University</institution>
, Canberra, ACT, 0200,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="May, Simon Matthias" sort="May, Simon Matthias" uniqKey="May S" first="Simon Matthias" last="May">Simon Matthias May</name>
<affiliation>
<nlm:aff id="a4">
<institution>Institute of Geography, University of Cologne</institution>
, 50923, Cologne,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Scientific Reports</title>
<idno type="eISSN">2045-2322</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Working with a large temporal dataset spanning several decades often represents a challenging task, especially when the record is heterogeneous and incomplete. The use of statistical laws could potentially overcome these problems. Here we apply Benford’s Law (also called the “First-Digit Law”) to the traveled distances of tropical cyclones since 1842. The record of tropical cyclones has been extensively impacted by improvements in detection capabilities over the past decades. We have found that, while the first-digit distribution for the entire record follows Benford’s Law prediction, specific changes such as satellite detection have had serious impacts on the dataset. The
<italic>least-square misfit measure</italic>
is used as a proxy to observe temporal variations, allowing us to assess data quality and homogeneity over the entire record, and at the same time over specific periods. Such information is crucial when running climatic models and Benford’s Law could potentially be used to overcome and correct for data heterogeneity and/or to select the most appropriate part of the record for detailed studies.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Benford, F" uniqKey="Benford F">F. Benford</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Newcomb, S" uniqKey="Newcomb S">S. Newcomb</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hill, T" uniqKey="Hill T">T. Hill</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fewster, R" uniqKey="Fewster R">R. Fewster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scott, P" uniqKey="Scott P">P. Scott</name>
</author>
<author>
<name sortKey="Fasli, M" uniqKey="Fasli M">M. Fasli</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nigrini, M" uniqKey="Nigrini M">M. Nigrini</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sambridge, M" uniqKey="Sambridge M">M. Sambridge</name>
</author>
<author>
<name sortKey="Tkalcic, H" uniqKey="Tkalcic H">H. Tkalcic</name>
</author>
<author>
<name sortKey="Jackson, A" uniqKey="Jackson A">A. Jackson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sambridge, M" uniqKey="Sambridge M">M. Sambridge</name>
</author>
<author>
<name sortKey="Tkal I, H" uniqKey="Tkal I H">H. Tkalčić</name>
</author>
<author>
<name sortKey="Arroucau, P" uniqKey="Arroucau P">P. Arroucau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="West, G B" uniqKey="West G">G. B. West</name>
</author>
<author>
<name sortKey="Brown, J H" uniqKey="Brown J">J. H. Brown</name>
</author>
<author>
<name sortKey="Enquist, B J A" uniqKey="Enquist B">B. J. A. Enquist</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raimi, R" uniqKey="Raimi R">R. Raimi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Emanuel, K" uniqKey="Emanuel K">K. Emanuel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Emanuel, K" uniqKey="Emanuel K">K. Emanuel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoyos, C D" uniqKey="Hoyos C">C. D. Hoyos</name>
</author>
<author>
<name sortKey="Agudelo, P A" uniqKey="Agudelo P">P. A. Agudelo</name>
</author>
<author>
<name sortKey="Webster, P J" uniqKey="Webster P">P. J. Webster</name>
</author>
<author>
<name sortKey="Curry, J A" uniqKey="Curry J">J. A. Curry</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Landsea, C W" uniqKey="Landsea C">C. W. Landsea</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pielke, R A" uniqKey="Pielke R">R. A. Pielke</name>
</author>
<author>
<name sortKey="Landsea, C W" uniqKey="Landsea C">C. W. Landsea</name>
</author>
<author>
<name sortKey="Downton, M" uniqKey="Downton M">M. Downton</name>
</author>
<author>
<name sortKey="Musulin, R" uniqKey="Musulin R">R. Musulin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barabasi, A L" uniqKey="Barabasi A">A. L. Barabási</name>
</author>
<author>
<name sortKey="Albert, R" uniqKey="Albert R">R. Albert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bettencourt, L" uniqKey="Bettencourt L">L. Bettencourt</name>
</author>
<author>
<name sortKey="West, G" uniqKey="West G">G. West</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Diekmann, A" uniqKey="Diekmann A">A. Diekmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Judge, G" uniqKey="Judge G">G. Judge</name>
</author>
<author>
<name sortKey="Schechter, L" uniqKey="Schechter L">L. Schechter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Landsea, C W" uniqKey="Landsea C">C. W. Landsea</name>
</author>
<author>
<name sortKey="Murnane, R J" uniqKey="Murnane R">R. J. Murnane</name>
</author>
<author>
<name sortKey="Liu, K B" uniqKey="Liu K">K.-B. Liu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, C" uniqKey="Wang C">C. Wang</name>
</author>
<author>
<name sortKey="Deser, C" uniqKey="Deser C">C. Deser</name>
</author>
<author>
<name sortKey="Yu, J Y" uniqKey="Yu J">J. Y. Yu</name>
</author>
<author>
<name sortKey="Dinezio, P" uniqKey="Dinezio P">P. DiNezio</name>
</author>
<author>
<name sortKey="Clement, A" uniqKey="Clement A">A. Clement</name>
</author>
<author>
<name sortKey="Glynn, P" uniqKey="Glynn P">P. Glynn</name>
</author>
<author>
<name sortKey="Manzella, D" uniqKey="Manzella D">D. Manzella</name>
</author>
<author>
<name sortKey="Enochs, I" uniqKey="Enochs I">I. Enochs</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jarvinen, B R" uniqKey="Jarvinen B">B. R. Jarvinen</name>
</author>
<author>
<name sortKey="Neumann, C J" uniqKey="Neumann C">C. J. Neumann</name>
</author>
<author>
<name sortKey="Davis, M A" uniqKey="Davis M">M. A. Davis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Landsea, C W" uniqKey="Landsea C">C. W. Landsea</name>
</author>
<author>
<name sortKey="Nicholls, N" uniqKey="Nicholls N">N. Nicholls</name>
</author>
<author>
<name sortKey="Gray, W M" uniqKey="Gray W">W. M. Gray</name>
</author>
<author>
<name sortKey="Avila, L A" uniqKey="Avila L">L. A. Avila</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Landsea, C W" uniqKey="Landsea C">C. W. Landsea</name>
</author>
<author>
<name sortKey="Harper, B A" uniqKey="Harper B">B. A. Harper</name>
</author>
<author>
<name sortKey="Hoarau, K" uniqKey="Hoarau K">K. Hoarau</name>
</author>
<author>
<name sortKey="Knaff, J A" uniqKey="Knaff J">J. A. Knaff</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Landsea, C W" uniqKey="Landsea C">C. W. Landsea</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Burpee, R W" uniqKey="Burpee R">R. W. Burpee</name>
</author>
<author>
<name sortKey="Franklin, J L" uniqKey="Franklin J">J. L. Franklin</name>
</author>
<author>
<name sortKey="Tuleya, S J" uniqKey="Tuleya S">S. J. Tuleya</name>
</author>
<author>
<name sortKey="Aberson, S D" uniqKey="Aberson S">S. D. Aberson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Holland, G J" uniqKey="Holland G">G. J. Holland</name>
</author>
<author>
<name sortKey="Mcgeer, T" uniqKey="Mcgeer T">T. McGeer</name>
</author>
<author>
<name sortKey="Youngren, H H" uniqKey="Youngren H">H. H. Youngren</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, Z" uniqKey="Zhang Z">Z. Zhang</name>
</author>
<author>
<name sortKey="Krishnamurti, T N" uniqKey="Krishnamurti T">T. N. Krishnamurti</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Landsea, C W" uniqKey="Landsea C">C. W. Landsea</name>
</author>
<author>
<name sortKey="Vecchi, G A" uniqKey="Vecchi G">G. A. Vecchi</name>
</author>
<author>
<name sortKey="Bengtsson, L" uniqKey="Bengtsson L">L. Bengtsson</name>
</author>
<author>
<name sortKey="Knutson, T R" uniqKey="Knutson T">T. R. Knutson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Villarini, G" uniqKey="Villarini G">G. Villarini</name>
</author>
<author>
<name sortKey="Vecchi, G A" uniqKey="Vecchi G">G. A. Vecchi</name>
</author>
<author>
<name sortKey="Knutson, T R" uniqKey="Knutson T">T. R. Knutson</name>
</author>
<author>
<name sortKey="Smith, J A" uniqKey="Smith J">J. A. Smith</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sci Rep</journal-id>
<journal-id journal-id-type="iso-abbrev">Sci Rep</journal-id>
<journal-title-group>
<journal-title>Scientific Reports</journal-title>
</journal-title-group>
<issn pub-type="epub">2045-2322</issn>
<publisher>
<publisher-name>Nature Publishing Group</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">26156060</article-id>
<article-id pub-id-type="pmc">4496784</article-id>
<article-id pub-id-type="pii">srep12046</article-id>
<article-id pub-id-type="doi">10.1038/srep12046</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Using Benford’s law to investigate Natural Hazard dataset homogeneity</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Joannes-Boyau</surname>
<given-names>Renaud</given-names>
</name>
<xref ref-type="corresp" rid="c1">a</xref>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="author-notes" rid="n1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bodin</surname>
<given-names>Thomas</given-names>
</name>
<xref ref-type="aff" rid="a2">2</xref>
<xref ref-type="aff" rid="a5">5</xref>
<xref ref-type="author-notes" rid="n1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Scheffers</surname>
<given-names>Anja</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sambridge</surname>
<given-names>Malcolm</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>May</surname>
<given-names>Simon Matthias</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
</contrib>
<aff id="a1">
<label>1</label>
<institution>Southern Cross GeoScience, Southern Cross University</institution>
, Lismore, NSW, 2480,
<country>Australia</country>
</aff>
<aff id="a2">
<label>2</label>
<institution>Earth and Planetary Science, University of California Berkeley</institution>
, CA,
<country>USA</country>
</aff>
<aff id="a3">
<label>3</label>
<institution>Research School of Earth Sciences, Australian National University</institution>
, Canberra, ACT, 0200,
<country>Australia</country>
</aff>
<aff id="a4">
<label>4</label>
<institution>Institute of Geography, University of Cologne</institution>
, 50923, Cologne,
<country>Germany</country>
</aff>
<aff id="a5">
<label>5</label>
<institution>Laboratoire de Géologie de Lyon, Ecole Normale Supérieure de Lyon, Université de Lyon-1, CNRS</institution>
, 69364 Lyon Cedex 07,
<country>France</country>
</aff>
</contrib-group>
<author-notes>
<corresp id="c1">
<label>a</label>
<email>renaud.joannes-boyau@scu.edu.au</email>
</corresp>
<fn id="n1">
<label>*</label>
<p>These authors contributed equally to this work.</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>09</day>
<month>07</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<volume>5</volume>
<elocation-id>12046</elocation-id>
<history>
<date date-type="received">
<day>29</day>
<month>09</month>
<year>2014</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>06</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2015, Macmillan Publishers Limited</copyright-statement>
<copyright-year>2015</copyright-year>
<copyright-holder>Macmillan Publishers Limited</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<pmc-comment>author-paid</pmc-comment>
<license-p>This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
</license-p>
</license>
</permissions>
<abstract>
<p>Working with a large temporal dataset spanning several decades often represents a challenging task, especially when the record is heterogeneous and incomplete. The use of statistical laws could potentially overcome these problems. Here we apply Benford’s Law (also called the “First-Digit Law”) to the traveled distances of tropical cyclones since 1842. The record of tropical cyclones has been extensively impacted by improvements in detection capabilities over the past decades. We have found that, while the first-digit distribution for the entire record follows Benford’s Law prediction, specific changes such as satellite detection have had serious impacts on the dataset. The
<italic>least-square misfit measure</italic>
is used as a proxy to observe temporal variations, allowing us to assess data quality and homogeneity over the entire record, and at the same time over specific periods. Such information is crucial when running climatic models and Benford’s Law could potentially be used to overcome and correct for data heterogeneity and/or to select the most appropriate part of the record for detailed studies.</p>
</abstract>
</article-meta>
</front>
<body>
<p>Benford’s Law (BL) is an empirically discovered property related to the frequency of first digits (
<italic>sensu stricto</italic>
numerals from 1 to 9 forming numbers and values) occurring in “real-world” datasets
<xref ref-type="bibr" rid="b1">1</xref>
. It states that in certain datasets the leading digit is distributed in a predictable but non-uniform manner. That is, observations with a lower first digit (1, 2, …) occur more often than those with a higher first digit (… 8, 9). This property arises in many situations but is known to occur when the underlying measurements have a log-uniform distribution:
<disp-formula id="eq1">
<inline-graphic id="d33e178" xlink:href="srep12046-m1.jpg"></inline-graphic>
</disp-formula>
</p>
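<p>The equation above is available in this record only as an image (srep12046-m1.jpg). Assuming that Eq. (1) is the standard first-digit form of Benford’s Law, which is consistent with the log-uniform description in the text, it can be written as follows, where the symbols P and D are our notation for the probability and the leading digit:</p>

```latex
P(D) = \log_{10}\!\left(1 + \frac{1}{D}\right), \qquad D = 1, 2, \ldots, 9
```

<p>Under this form, the expected frequencies fall from about 30.1% for a leading 1, through 17.6% for a leading 2, to about 4.6% for a leading 9, and the nine probabilities sum to 1.</p>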
<p>Such datasets are often associated with a power-law distribution with a “heavy tail,” making extreme events far more likely than they would be, for example, in a Gaussian distribution.</p>
<p>Since the initial discovery more than 100 years ago by Newcomb, many studies have emerged that either theorize the mathematical aspect of the law or seek new applications for it (e.g., size of 335 rivers, molecular weights of several thousand chemical compounds, or the first digits of the street addresses for the first 342 persons listed in
<italic>American Men of Science</italic>)
<xref ref-type="bibr" rid="b2">2</xref>
<xref ref-type="bibr" rid="b3">3</xref>
. The distributions obtained from these datasets were remarkably similar to the predicted frequencies in (1), and those frequencies came to be known as Benford’s Law.</p>
<p>There is ongoing debate about the fundamental origins of BL, but it is clear that it can only be applied to data that fall somewhere between being entirely random (e.g., lottery results) and overly constrained (e.g., the size of newborn babies). For many years, little was known about Benford’s Law, and its unusual, empirical applications were seen as a numerical curiosity rather than as useful information. In fact, Benford himself called his research paper “The Law of Anomalous Numbers.” More recently it has been established that for real-valued continuous datasets Benford’s Law arises naturally if the data are distributed according to a log-uniform modulo 10 distribution
<xref ref-type="bibr" rid="b4">4</xref>
.</p>
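<p>The contrast between log-uniform and overly constrained data can be illustrated with a minimal sketch. Both samples below are hypothetical and are not taken from the article: one is evenly spaced in log10 across four decades, the other is confined to a narrow range. The helper names are ours.</p>

```python
import math
from collections import Counter

def first_digit(value):
    """Leading non-zero digit (1-9) of a positive number."""
    # In scientific notation the first character is the leading digit.
    return int(f"{value:.15e}"[0])

def first_digit_frequencies(values):
    """Relative frequency of each leading digit, ordered 1 to 9."""
    counts = Counter(first_digit(v) for v in values)
    return [counts[d] / len(values) for d in range(1, 10)]

# Benford's expected frequencies for digits 1 to 9.
benford = [math.log10(1 + 1 / d) for d in range(1, 10)]

n = 90_000
# Hypothetical log-uniform data: evenly spaced in log10 from 1 to 10,000.
log_uniform = [10 ** (4 * (i + 0.5) / n) for i in range(n)]
# Hypothetical tightly constrained data: evenly spaced between 45 and 55.
constrained = [45 + 10 * (i + 0.5) / n for i in range(n)]

gap_log_uniform = max(
    abs(f - p) for f, p in zip(first_digit_frequencies(log_uniform), benford)
)
gap_constrained = max(
    abs(f - p) for f, p in zip(first_digit_frequencies(constrained), benford)
)
print(f"largest deviation, log-uniform: {gap_log_uniform:.4f}")
print(f"largest deviation, constrained: {gap_constrained:.4f}")
```

<p>In the constrained sample only the digits 4 and 5 ever lead, so its deviation from the expected frequencies is large, whereas the log-uniform sample stays close to them.</p>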
<p>Based on simulation evidence and measured datasets, studies showed that large classes of naturally occurring quantities (preferentially log-uniform distributed data) are expected to conform to BL
<xref ref-type="bibr" rid="b5">5</xref>
. BL has been used in forensic accounting for fraud detection and for change detection in phenomena studied in the physical and natural sciences
<xref ref-type="bibr" rid="b6">6</xref>
<xref ref-type="bibr" rid="b7">7</xref>
<xref ref-type="bibr" rid="b8">8</xref>
<xref ref-type="bibr" rid="b9">9</xref>
. Several articles have summarized most of the known datasets that follow BL prediction, including river length, population distribution, atomic weight, x-ray volts, American League baseball statistics, black-body radiation, the mass of exoplanets, postal codes, and death rates
<xref ref-type="bibr" rid="b5">5</xref>
<xref ref-type="bibr" rid="b6">6</xref>
<xref ref-type="bibr" rid="b7">7</xref>
<xref ref-type="bibr" rid="b8">8</xref>
<xref ref-type="bibr" rid="b9">9</xref>
. Large datasets of variables that span many orders of magnitude are often seen to follow the distribution
<xref ref-type="bibr" rid="b3">3</xref>
<xref ref-type="bibr" rid="b7">7</xref>
<xref ref-type="bibr" rid="b10">10</xref>
.</p>
<p>In this study, we test the validity of BL (1) on a natural climatic process. For this purpose we have chosen the traveled distance of tropical cyclones (TC) (
<xref ref-type="fig" rid="f1">Fig. 1</xref>
), a large dataset available freely online via the International Best Track Archive for Climate Stewardship (IBTrACS).</p>
<p>Describing and understanding a climatic and natural hazard such as tropical cyclone occurrence is a complex task that often requires elaborate mathematical models, especially because of the multifactorial input and intrinsic heterogeneity of the data
<xref ref-type="bibr" rid="b11">11</xref>
<xref ref-type="bibr" rid="b12">12</xref>
<xref ref-type="bibr" rid="b13">13</xref>
<xref ref-type="bibr" rid="b14">14</xref>
<xref ref-type="bibr" rid="b15">15</xref>
. Thus, the quality and homogeneity of the dataset continue to spark heated debate and are frequently used as evidence against newly proposed models. Given the abundance of information that can be extracted from the TC dataset, it is necessary to enhance our ability to understand and separate natural trends from effects due to incomplete and heterogeneous records
<xref ref-type="bibr" rid="b11">11</xref>
<xref ref-type="bibr" rid="b14">14</xref>
. Therefore, in this study we investigate distribution anomalies from Benford’s Law in order to detect temporal changes in the record and monitor the heterogeneity of the TC global dataset.</p>
<sec disp-level="1" sec-type="results">
<title>Results</title>
<p>The distance traveled by each TC was plotted against the year of occurrence (
<xref ref-type="fig" rid="f2">Fig. 2</xref>
). The number of events increases with time, most likely due to improvements in scientific communications and observational capabilities. For example, only one cyclone track appears in the dataset for 1842 compared to 92 in 1900 and 297 in 1970. We also note that no TC tracks were reported along the Western Pacific coast in the early records (
<xref ref-type="fig" rid="f1">Figs 1</xref>
and
<xref ref-type="fig" rid="f2">2</xref>
). The minimum and maximum distances traveled in the dataset are 1.2 km and 18,947 km, respectively, spanning more than four orders of magnitude. The average distance traveled by cyclones over the complete dataset is 2,560 km, but it rises from 1,796 km before 1931 to 2,866 km afterwards.</p>
<p>In
<xref ref-type="fig" rid="f3">Fig. 3</xref>
we have plotted the evolution of TC tracks over time in three categories of distance traveled, x: (i) short (<1,000 km), (ii) medium (1,000 km < x < 5,000 km), and (iii) long (>5,000 km). Over time, there has been a change in traveled distances, with a continuous increase in long distances traveled by TCs between 1930 and 2010. Most importantly, a severe and sudden shift occurred in the 1970s between short and medium distances. The overall shift after 1970 is also visible in
<xref ref-type="fig" rid="f2">Fig. 2</xref>
as a clear increase in the average distance traveled.</p>
<p>Following the calculation of the first-digit occurrence in the TC records, we have compared the distribution with the theoretical values of BL (
<xref ref-type="fig" rid="f4">Fig. 4</xref>
). One has to keep in mind that BL is scale invariant and that the comparison would not differ in feet, kilometers, or miles. The values in
<xref ref-type="fig" rid="f4">Fig. 4</xref>
reveal very little deviation from the theoretical values. In fact, they are in exceptional agreement (for other comparisons see
<xref ref-type="bibr" rid="b9">9</xref>
<xref ref-type="bibr" rid="b16">16</xref>
<xref ref-type="bibr" rid="b17">17</xref>
), with minimum and maximum absolute differences between the empirical and theoretical values of 0 and 1.2%, respectively, and with an average of 0.51% (see
<xref ref-type="table" rid="t1">Table 1</xref>
).</p>
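<p>The comparison described above, and the scale-invariance remark, can be sketched as follows. The distances are a hypothetical log-spaced stand-in, not the IBTrACS record, and the function name <italic>benford_table</italic> and the conversion constants are our own choices for illustration.</p>

```python
import math
from collections import Counter

def first_digit(value):
    """Leading non-zero digit (1-9) of a positive number."""
    return int(f"{value:.15e}"[0])

def benford_table(values):
    """Rows of (digit, observed %, expected %, absolute difference)."""
    counts = Counter(first_digit(v) for v in values)
    total = len(values)
    rows = []
    for digit in range(1, 10):
        observed = 100 * counts[digit] / total
        expected = 100 * math.log10(1 + 1 / digit)
        rows.append((digit, observed, expected, abs(observed - expected)))
    return rows

# Hypothetical stand-in for track lengths: log-spaced from 1 to 10,000 km.
n = 40_000
distances_km = [10 ** (4 * (i + 0.5) / n) for i in range(n)]

KM_TO_MILES = 0.621371
KM_TO_FEET = 3280.84

# Repeat the comparison in three units to illustrate scale invariance.
for label, factor in [("km", 1.0), ("miles", KM_TO_MILES), ("feet", KM_TO_FEET)]:
    rows = benford_table([d * factor for d in distances_km])
    gaps = [row[3] for row in rows]
    print(
        f"{label:6s} min={min(gaps):.3f} max={max(gaps):.3f} "
        f"mean={sum(gaps) / len(gaps):.3f} (percentage points)"
    )
```

<p>For a sample that conforms to BL, the minimum, maximum, and mean absolute differences should stay small in every unit. The real record would replace <italic>distances_km</italic>.</p>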
<p>The typical decay of first-digit occurrence can be observed for the complete TC best track data, establishing BL for the TC dataset. This allows the investigation of temporal variation such as potential change-points in the system at specific periods. Similar to the work of Diekmann
<xref ref-type="bibr" rid="b18">18</xref>
as well as Judge and Schechter
<xref ref-type="bibr" rid="b19">19</xref>
, we used BL to describe the homogeneity and integrity of the dataset. Once BL is established for a particular dataset, temporary deviations from the theoretical values can potentially indicate additional control processes in the system. The more sudden and the stronger the modifications to the system, the more intensely and abruptly the BL misfit will vary. For example, the magnitude of, and timing between, earthquakes is in agreement with BL estimates, whereas human activities such as nuclear tests of constant magnitude lead to deviations from the “natural” pattern
<xref ref-type="bibr" rid="b7">7</xref>
.</p>
<p>Temporal variations in BL estimates between 1842 and 2010 were determined for high-resolution observations by plotting the least-squares misfit between observed and theoretical first-digit distributions over 5-year and 10-year running windows (
<xref ref-type="fig" rid="f5">Fig. 5</xref>
). This allows the observation of potential episodes within the record that differ from the predicted BL distribution. Those fluctuations are linked to dynamic variations of the system (e.g., incomplete data, changes in recording protocols, unusual activity) that could go undetected if the dataset is observed in full (see
<xref ref-type="fig" rid="f2">Figs 2</xref>
and
<xref ref-type="fig" rid="f3">3</xref>
).</p>
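<p>A running-window misfit of this kind can be sketched as follows. The article does not give the exact formula in this section, so the sum of squared differences between observed and expected first-digit frequencies is an assumption, as are the function names and the synthetic records, which switch from evenly spaced to log-spaced distances in 1920.</p>

```python
import math
from collections import Counter

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit(value):
    """Leading non-zero digit (1-9) of a positive number."""
    return int(f"{value:.15e}"[0])

def misfit(values):
    """Sum of squared differences between observed and Benford frequencies."""
    counts = Counter(first_digit(v) for v in values)
    total = len(values)
    return sum((counts[d] / total - p) ** 2 for d, p in zip(range(1, 10), BENFORD))

def running_misfit(records, window):
    """Misfit over a window of `window` years around each year with data.

    `records` is an iterable of (year, distance) pairs.
    """
    by_year = {}
    for year, distance in records:
        by_year.setdefault(year, []).append(distance)
    half = window // 2
    result = {}
    for year in sorted(by_year):
        # Pool every distance recorded in the window of `window` years.
        years = range(year - half, year - half + window)
        pooled = [d for y in years for d in by_year.get(y, [])]
        result[year] = misfit(pooled)
    return result

# Synthetic, hypothetical records: evenly spaced distances before 1920,
# log-spaced distances (10 to 10,000 km) from 1920 onwards.
records = []
for year in range(1900, 1920):
    records += [(year, 100 + 899 * i / 299) for i in range(300)]
for year in range(1920, 1940):
    records += [(year, 10 ** (1 + 3 * (i + 0.5) / 300)) for i in range(300)]

five_year = running_misfit(records, window=5)
print(f"1905: {five_year[1905]:.4f}   1935: {five_year[1935]:.4f}")
```

<p>In this toy record the misfit drops once the log-spaced regime begins, which mimics how a change in recording practice would show up as a step in the curve. Windows that straddle 1920 mix both regimes and give intermediate values.</p>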
</sec>
<sec disp-level="1" sec-type="discussion">
<title>Discussion</title>
<p>Between 1842 and 2010, two distinct periods, P
<sub>1</sub>
(1842-1960) and P
<sub>2</sub>
(1960-2010) can be identified (
<xref ref-type="fig" rid="f5">Fig. 5</xref>
). The data were smoothed with 5- to 10-year running means, which makes the period boundaries relatively vague. It is not surprising that the largest deviation from the expected first-digit distribution is observed in the early record, not only because the record is incomplete (most TCs that did not make landfall went undetected prior to satellite observation), but also because the TC tracks were built from ship records, sediment archives, and/or observed landfall damage, which induces large errors in the reported distances and trajectories. Landsea and colleagues estimated an undercount bias of 0–5 TCs/yr during 1851–1910 and 0–2 TCs/yr during 1910–1960 by taking into account information on the coastline, TCs, and ship density in the Atlantic basin
<xref ref-type="bibr" rid="b20">20</xref>
.</p>
<p>Within the two phases, short- and long-term trends appear to have occurred at different periods, especially in the first 30 to 40 years of the twentieth century. Four specific episodes, A to D, can be observed at 1915, 1925, 1955, and 2000 (±5 years), respectively. Episode A in 1915, which spans over a decade, most likely relates to the constant improvement in data coordination and in the recording of events, for example, with the increased use of telegraph lines in the early 1900s
<xref ref-type="bibr" rid="b20">20</xref>
. Nevertheless, a sudden increase around the mid-1920s (episode B), which represents the most striking feature, cannot be directly explained by data recording or technological improvement. This indicates that the BL misfit could potentially reflect climatic variation within the record. For example, the misfit could be related to a sudden and strong inversion in the El Niño Southern Oscillation (ENSO) record around the mid-1920s
<xref ref-type="bibr" rid="b21">21</xref>
. Unfortunately, the comparison of the BL misfit with the climatic record is not sufficient to prove a causal relationship, especially when considering the scatter of the data. Furthermore, the agreement between the BL misfit and the ENSO record could be a mere coincidence, and any attempted interpretation is at best speculative.</p>
<p>Following the sudden increase of BL misfits during episode B in the late 1920s, BL misfits decrease until the 1960s (episode C). This period coincides with the introduction of the aircraft (1944)
<xref ref-type="bibr" rid="b22">22</xref>
, especially in the Atlantic basin, and the use of radiosondes in the late 1930s to early 1940s. One has to keep in mind that this was a critical time in history, with WWII followed by the Cold War, which led to: (i) technological developments being used for TC detection, such as radar in the mid-1950s; (ii) increased military movement and presence, and therefore better observation of TC events; and (iii) improved centralization of the data (i.e. development of computer and military networks, precursors of the internet). With all this technological development, it is not surprising to see the BL misfit values decreasing drastically and consistently over time as the overall quality (e.g. homogeneity, precision, completeness) of the dataset improves.</p>
<p>The two periods pre- and post-1960 can be clearly separated, as misfit values after the mid-1960s are much smaller than those seen in the early record. This change is already present in the 1930s and was reinforced in the 1940s, but a divide undoubtedly occurs in the mid-1960s, when the smallest BL misfits in the record (apart from 1842, which had only one TC track) are observed. This very abrupt change in the mid-1960s is related to profound changes in the recording system and is indicative of serious effects on the homogeneity, quality, and precision of the TC record. The period matches the introduction of satellite technology described by Landsea and colleagues
<xref ref-type="bibr" rid="b23">23</xref>
<xref ref-type="bibr" rid="b24">24</xref>
. Measurements of greater precision had a clear impact on the recording of a TC’s distance traveled; it is likely that the early detection of the phenomena increased the overall distance traveled (
<xref ref-type="fig" rid="f2">Fig. 2</xref>
). It was expected that new technologies would help keep the BL misfit at its lowest values; however,
<xref ref-type="fig" rid="f5">Fig. 5</xref>
shows the opposite. One potential reason could be that satellites not only offered more detections but also allowed events to be classified more precisely into extratropical and subtropical cyclone categories
<xref ref-type="bibr" rid="b25">25</xref>
. This would have a direct impact on the number of TCs in the record.</p>
<p>In the late 1990s (episode D,
<xref ref-type="fig" rid="f5">Fig. 5</xref>
), the BL misfit again peaks to values similar to those observed prior to the introduction of satellites. The advent of new technology can be correlated to the sudden changes, including the deployment of mobile platforms, flight-level instrumentation, Doppler radar systems, aerosondes, and microwave imagery, which have reduced the error of TC tracks and parameter measurements up to 30% in the last two decades
<xref ref-type="bibr" rid="b26">26</xref>
<xref ref-type="bibr" rid="b27">27</xref>
<xref ref-type="bibr" rid="b28">28</xref>
. However, it appears that not all of the variations observed within the BL misfit relate to technological improvements, but also to the definition of TCs and their data handling. For example, there has been a doubling in the number of TCs in the Atlantic basin over the last century. This increase in the storm count from the original Atlantic basin data has been shown to be mainly due to an increase in short duration (<2-day: “shorties”) tropical storms
<xref ref-type="bibr" rid="b29">29</xref>
, which has been attributed to changes in observation capabilities
<xref ref-type="bibr" rid="b30">30</xref>
. The appearance of “shorties” in the TC data record has become even more pronounced in recent decades owing to the aforementioned technological improvements. This artificial increase in short-lived Atlantic basin TCs reduces the mean track length, as seen in the decrease from 1995 onward (
<xref ref-type="fig" rid="f2">Figs 2</xref>
and
<xref ref-type="fig" rid="f3">3</xref>
) and leads to strong variation in the BL misfit (
<xref ref-type="fig" rid="f5">Fig. 5</xref>
, where BL variations increase steadily until peaking around 2000 +/−5 yr). Interestingly, this strong variation also coincides with the “super El Niño” episodes of 1998 that started in the early 1990s. Again, the BL misfit peaks or trends could be influenced by both technological improvements and climatic variations; however, it is difficult to tease them apart.</p>
<p>It is most likely that technological improvement and climatic variation will have different impacts that yield significant variation within the pattern of BL misfits. Human-induced climatic variations potentially provoke a deviation significantly different from what we could name “natural” variation. For instance, the fact that improvements in measurement methods and technology happen very suddenly could influence BL misfits differently than a gradual anthropogenic influence on the climate. A non-uniform variation (e.g., the uninterrupted increase of medium and long distance TCs while short distance TCs have for the most part remained constant
<xref ref-type="fig" rid="f3">Fig. 3</xref>
) could be due to changes in climatic processes, such as an increase in sea surface temperature (SST) or prolonged atmospheric pressure anomalies, although one could just as easily argue the opposite: that recording precision is responsible for such a change. In this study, the TC record appears too complex and heterogeneous to be corrected by directly using the BL misfit. Nevertheless, BL offers an innovative approach to assess the TC dataset, and clearly helps to identify the influence of technological improvement in measurement capabilities. Furthermore, using BL misfits we can quantify which technology had the most influence on dataset integrity. According to
<xref ref-type="fig" rid="f5">Fig. 5</xref>
, satellite introduction was clearly the most remarkable change in the TC record.</p>
<p>Given that climatic indicators are likely to be reflected in the TC record, it appears that the key limitation lies in our ability to understand and extract climatically influenced data from the record. A mathematical system such as BL offers a new approach to assess errors and discrepancies. Most of all, it enables us to map variations in the record not observable by classical statistical approaches, allowing us to define the most suitable part of the dataset for detailed studies. In some circumstances, BL could offer the ability to identify specific periods of biased records that need to be excluded or corrected in order to accurately model the dataset. For example, BL shows that the early part of the record (before 1915) should be used with extreme caution in climatic models and TC statistics. The early records are known to be heterogeneous and are incomplete due to limited measurement capabilities, while the record from 1915 onwards is much more stable. The ability to identify biased parts of the record (and perhaps to correct for them) using BL could enhance the capability of climatic models to extract crucial information about TC occurrences. Examining datasets for instrumentation artifacts, especially in the case of TCs, could potentially allow us to isolate climatic influences such as ENSO variation, anthropogenic activities, and other climatic variation more accurately.</p>
</sec>
<sec disp-level="1" sec-type="conclusions">
<title>Conclusion</title>
<p>Benford’s Law presents a new way to investigate and assess the homogeneity and quality of natural hazard datasets. While we have shown that the BL prediction holds over the complete dataset, large deviations within the temporal record can clearly be attributed to technological improvements. The introduction of new instrumentation, such as satellite observations, has had a large impact on the dataset quality and is clearly reflected in the form of strong variations in the BL misfit. Furthermore, the quality of discrete timespans within the dataset can be evaluated using BL, as demonstrated by the fact that the clustering of TC events resulting from measurement precision was also clearly observable in the BL misfit. To conclude, the use of mathematical laws, such as Benford’s Law, has the potential to identify changes in natural systems and could possibly offer the ability to correct for a heterogeneous dataset. Thus BL can be used to select the appropriate part of a large dataset to run climatic models or to account for the impact of a known transition in the system. While strong natural climatic variation and anthropogenic impact on TC occurrences were not clearly observable at this point in the BL misfit, long deviation trends and/or sudden peaks could potentially be linked to climatic processes in the future.</p>
<p>This type of analysis enables one to observe temporal or spatial variations in large data sets with sufficient dynamic range. However, it does not have any predictive power, and does not tell us anything about future climate change. It is a tool for detection, rather than prediction. Benford’s law has already been exploited to detect signals hidden in background noise in other time series data. For example, Sambridge and colleagues
<xref ref-type="bibr" rid="b7">7</xref>
showed how seismic energy from an earthquake can be detected from just the first digit distribution of displacement counts on a seismometer. We therefore expect this approach to be a powerful tool used to detect unknown anomalies or abrupt changes in climate data, and we anticipate new applications to appear in this ever-growing field of research.</p>
</sec>
<sec disp-level="1" sec-type="methods">
<title>Methods</title>
<sec disp-level="2">
<title>TC Records</title>
<p>Global TC tracking information from 1842 to 2010 was obtained from the International Best Track Archive for Climate Stewardship. The geometric path of each event can be downloaded from the IBTrACS website, resulting in a dataset consisting of more than 350,000 data points that describe the geometry of each path (
<xref ref-type="fig" rid="f1">Fig. 1</xref>
). Using the averaged radius for the Earth of 6,371 km, we have calculated the distance along the great circle between each consecutive point and have computed the total distance traveled by cyclones. TCs defined by only one point were excluded from the calculation in order to avoid introducing artificial values that would offset the first-digit count. The total number of independent occurrences available to us was n = 11,863 at the time of the study.</p>
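<p>As an illustration of the distance calculation described above, the following minimal sketch sums great-circle distances between consecutive track points. It assumes the haversine formula (the text only specifies a great-circle distance with a 6,371 km radius); the function names and the (lat, lon) tuple layout are hypothetical and are not taken from IBTrACS.</p>

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius quoted in the text


def great_circle_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


def track_length_km(points):
    """Total distance along a track given as a list of (lat, lon) tuples.

    Returns None for empty or single-point tracks, mirroring the
    exclusion of one-point TCs described in the text.
    """
    if not points or len(points) == 1:
        return None
    return sum(great_circle_km(*p, *q) for p, q in zip(points, points[1:]))
```

<p>Tracks for which the function returns None would simply be left out of the first-digit count.</p>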
</sec>
<sec disp-level="2">
<title>BL Calculation</title>
<p>Following BL, the distribution of first-digit values is defined by the probability function (1), P(d) = log_b(1 + 1/d), where
<italic>b</italic>
is the base (here 10) and
<italic>d</italic>
the leading digit. The theoretical distribution gives a frequency of occurrence of 30.1% for digit 1, 17.6% for 2, 12.5% for 3, and so on, until reaching 4.6% for the ninth digit. This law is scale and base invariant; thus it is independent of the units used (e.g., miles or kilometers).</p>
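<p>The theoretical percentages quoted above can be checked with a short sketch, assuming the standard form of Benford’s Law, P(d) = log_b(1 + 1/d). The function names are hypothetical.</p>

```python
import math


def benford_probability(d, base=10):
    """Expected proportion of values whose leading digit is d."""
    return math.log(1 + 1 / d, base)


def benford_distribution(base=10):
    """Mapping from each possible leading digit to its expected proportion."""
    return {d: benford_probability(d, base) for d in range(1, base)}
```

<p>Multiplying the base-10 values by 100 and rounding to one decimal place gives the theoretical column of Table 1.</p>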
<p>A least-squares misfit measure (2) is used to quantify the discrepancy between the observed and predicted first-digit proportions:
<disp-formula id="eq2">
<inline-graphic id="d33e369" xlink:href="srep12046-m2.jpg"></inline-graphic>
</disp-formula>
where
<italic>P</italic>
<sub>
<italic>d</italic>
</sub>
is the expected proportion of data with first digit
<italic>d</italic>
as given by BL theoretical values,
<italic>n</italic>
<sub>
<italic>d</italic>
</sub>
is the number of observed data with first digit
<italic>d</italic>
, and
<italic>n</italic>
is the total number of data. We acknowledge that this measure of goodness-of-fit is arbitrary. Here it is solely intended to quantify the relative distance to Benford’s law between different subsets of data. Little is known about data uncertainties, and hence this quantity cannot be used to measure an absolute goodness-of-fit, through for example a statistical significance test (e.g. chi-square test).</p>
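<p>A minimal sketch of such a misfit calculation is given below. Equation (2) is only available as an image in this record, so the exact normalization is an assumption: the sketch uses the sum over d of (n_d − n·P_d)² / (n·P_d), built from the symbols defined above. The function names are hypothetical.</p>

```python
import math
from collections import Counter


def first_digit(x):
    """Leading non-zero decimal digit of a non-zero finite number."""
    for ch in repr(abs(float(x))):
        if ch in "123456789":
            return int(ch)
    raise ValueError("value has no leading non-zero digit")


def benford_misfit(values):
    """Assumed misfit: sum over d of (n_d - n*P_d)**2 / (n*P_d).

    n_d is the observed count of leading digit d, n the total count,
    and P_d the Benford proportion log10(1 + 1/d).
    """
    counts = Counter(first_digit(v) for v in values)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no data")
    misfit = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        misfit += (counts.get(d, 0) - expected) ** 2 / expected
    return misfit
```

<p>Under this assumed form, a smaller value means the subset is closer to the Benford distribution. As the text notes, the measure is only meaningful for comparing subsets with each other, for example the travelled distances of one year against those of another.</p>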
</sec>
</sec>
<sec disp-level="1">
<title>Additional Information</title>
<p>
<bold>How to cite this article</bold>
: Joannes-Boyau, R.
<italic>et al</italic>
. Using Benford's law to investigate Natural Hazard dataset homogeneity.
<italic>Sci. Rep</italic>
.
<bold>5</bold>
, 12046; doi: 10.1038/srep12046 (2015).</p>
</sec>
</body>
<back>
<ack>
<p>Part of this research was financially supported by the Australian Research Council (FT0990910, DP140100919 and DP110102098) and by the Southern Cross University Postdoctoral Research Fellowship. We thank Prof. K. Emanuel, Prof. A. Rose, A/Professor S. Johnston, Dr Tyler Cyronak and Dr Peter Kraal for their valuable advice at an early stage of this paper. We would also like to thank Prof Rachel Fewster for her valuable comments on the paper.</p>
</ack>
<ref-list>
<ref id="b1">
<mixed-citation publication-type="journal">
<name>
<surname>Benford</surname>
<given-names>F.</given-names>
</name>
<article-title>The Law of anomalous numbers</article-title>
.
<source>Proc. Am. Philos. Soc.</source>
<volume>78</volume>
,
<fpage>551</fpage>
<lpage>572</lpage>
(
<year>1938</year>
).</mixed-citation>
</ref>
<ref id="b2">
<mixed-citation publication-type="journal">
<name>
<surname>Newcomb</surname>
<given-names>S.</given-names>
</name>
<article-title>Note on the frequency of use of different digits in natural numbers</article-title>
.
<source>Am. J. Math.</source>
<volume>4</volume>
,
<fpage>39</fpage>
<lpage>40</lpage>
(
<year>1881</year>
).</mixed-citation>
</ref>
<ref id="b3">
<mixed-citation publication-type="journal">
<name>
<surname>Hill</surname>
<given-names>T.</given-names>
</name>
<article-title>Base-invariance implies Benford’s law</article-title>
.
<source>Proc. Am. Math. Soc.</source>
<volume>123</volume>
,
<fpage>887</fpage>
<lpage>895</lpage>
(
<year>1995</year>
).</mixed-citation>
</ref>
<ref id="b4">
<mixed-citation publication-type="journal">
<name>
<surname>Fewster</surname>
<given-names>R.</given-names>
</name>
<article-title>A simple explanation of Benford’s Law</article-title>
.
<source>Am. Stat.</source>
<volume>63</volume>
,
<fpage>26</fpage>
<lpage>32</lpage>
(
<year>2009</year>
).</mixed-citation>
</ref>
<ref id="b5">
<mixed-citation publication-type="journal">
<name>
<surname>Scott</surname>
<given-names>P.</given-names>
</name>
&
<name>
<surname>Fasli</surname>
<given-names>M.</given-names>
</name>
<article-title>Benford’s Law: An Empirical Investigation and a Novel Explanation</article-title>
.
<source>CSM Technical Report</source>
<volume>349</volume>
, (University Essex, UK,
<year>2001</year>
).</mixed-citation>
</ref>
<ref id="b6">
<mixed-citation publication-type="journal">
<name>
<surname>Nigrini</surname>
<given-names>M.</given-names>
</name>
<article-title>A taxpayer compliance application of Benford’s Law</article-title>
.
<source>J. Amer. Tax Assoc.</source>
<volume>18</volume>
,
<fpage>72</fpage>
<lpage>91</lpage>
(
<year>1996</year>
).</mixed-citation>
</ref>
<ref id="b7">
<mixed-citation publication-type="journal">
<name>
<surname>Sambridge</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Tkalcic</surname>
<given-names>H.</given-names>
</name>
&
<name>
<surname>Jackson</surname>
<given-names>A.</given-names>
</name>
<article-title>Benford’s law in the natural sciences</article-title>
.
<source>Geophys. Res. Lett.</source>
<volume>37</volume>
,
<fpage>L22301</fpage>
(
<year>2010</year>
).</mixed-citation>
</ref>
<ref id="b8">
<mixed-citation publication-type="journal">
<name>
<surname>Sambridge</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Tkalčić</surname>
<given-names>H.</given-names>
</name>
&
<name>
<surname>Arroucau</surname>
<given-names>P.</given-names>
</name>
<article-title>Benford’s Law of First Digits: From Mathematical Curiosity to Change Detector</article-title>
.
<source>Asia Pacific Mathematics Newsletter</source>
<volume>1</volume>
,
<fpage>1</fpage>
<lpage>5</lpage>
(
<year>2011</year>
).</mixed-citation>
</ref>
<ref id="b9">
<mixed-citation publication-type="journal">
<name>
<surname>West</surname>
<given-names>G. B.</given-names>
</name>
,
<name>
<surname>Brown</surname>
<given-names>J. H.</given-names>
</name>
&
<name>
<surname>Enquist</surname>
<given-names>B. J.</given-names>
</name>
<article-title>A General Model for the Origin of Allometric Scaling Laws in Biology</article-title>
.
<source>Science</source>
<volume>276</volume>
,
<fpage>122</fpage>
(
<year>1997</year>
).
<pub-id pub-id-type="pmid">9082983</pub-id>
</mixed-citation>
</ref>
<ref id="b10">
<mixed-citation publication-type="journal">
<name>
<surname>Raimi</surname>
<given-names>R.</given-names>
</name>
<article-title>The peculiar distribution of first significant digits</article-title>
.
<source>Sci. Amer.</source>
<volume>221</volume>
,
<fpage>109</fpage>
<lpage>120</lpage>
(
<year>1969</year>
).</mixed-citation>
</ref>
<ref id="b11">
<mixed-citation publication-type="journal">
<name>
<surname>Emanuel</surname>
<given-names>K.</given-names>
</name>
<article-title>Increasing destructiveness of tropical cyclones over the past 30 years</article-title>
.
<source>Nature</source>
<volume>436</volume>
,
<fpage>686</fpage>
<lpage>688</lpage>
(
<year>2005</year>
).
<pub-id pub-id-type="pmid">16056221</pub-id>
</mixed-citation>
</ref>
<ref id="b12">
<mixed-citation publication-type="journal">
<name>
<surname>Emanuel</surname>
<given-names>K.</given-names>
</name>
<article-title>Environmental factors affecting tropical cyclone power dissipation</article-title>
.
<source>J. Climate</source>
<volume>20</volume>
,
<fpage>5497</fpage>
<lpage>5509</lpage>
(
<year>2007</year>
).</mixed-citation>
</ref>
<ref id="b13">
<mixed-citation publication-type="journal">
<name>
<surname>Hoyos</surname>
<given-names>C. D.</given-names>
</name>
,
<name>
<surname>Agudelo</surname>
<given-names>P. A.</given-names>
</name>
,
<name>
<surname>Webster</surname>
<given-names>P. J.</given-names>
</name>
&
<name>
<surname>Curry</surname>
<given-names>J. A.</given-names>
</name>
<article-title>Deconvolution of the Factors Contributing to the Increase in Global Hurricane Intensity</article-title>
.
<source>Science</source>
<volume>312</volume>
,
<fpage>94</fpage>
<lpage>97</lpage>
(
<year>2006</year>
).
<pub-id pub-id-type="pmid">16543416</pub-id>
</mixed-citation>
</ref>
<ref id="b14">
<mixed-citation publication-type="journal">
<name>
<surname>Landsea</surname>
<given-names>C. W.</given-names>
</name>
<article-title>Hurricanes and global warming</article-title>
.
<source>Nature</source>
<volume>438</volume>
,
<fpage>11</fpage>
<lpage>13</lpage>
(
<year>2005</year>
).
<pub-id pub-id-type="pmid">16267520</pub-id>
</mixed-citation>
</ref>
<ref id="b15">
<mixed-citation publication-type="journal">
<name>
<surname>Pielke</surname>
<given-names>R. A.</given-names>
</name>
,
<name>
<surname>Landsea</surname>
<given-names>C. W.</given-names>
</name>
,
<name>
<surname>Downton</surname>
<given-names>M.</given-names>
</name>
&
<name>
<surname>Musulin</surname>
<given-names>R.</given-names>
</name>
<article-title>Evaluation of catastrophe models using a normalized historical record: Why it is needed and how to do it</article-title>
.
<source>J. Insur. Reg.</source>
<volume>18</volume>
,
<fpage>177</fpage>
<lpage>194</lpage>
(
<year>1999</year>
).</mixed-citation>
</ref>
<ref id="b16">
<mixed-citation publication-type="journal">
<name>
<surname>Barabási</surname>
<given-names>A. L.</given-names>
</name>
&
<name>
<surname>Albert</surname>
<given-names>R.</given-names>
</name>
<article-title>Emergence of Scaling in Random Networks</article-title>
.
<source>Science</source>
<volume>286</volume>
,
<fpage>509</fpage>
(
<year>1999</year>
).
<pub-id pub-id-type="pmid">10521342</pub-id>
</mixed-citation>
</ref>
<ref id="b17">
<mixed-citation publication-type="journal">
<name>
<surname>Bettencourt</surname>
<given-names>L.</given-names>
</name>
&
<name>
<surname>West</surname>
<given-names>G.</given-names>
</name>
<article-title>A unified theory of urban living</article-title>
.
<source>Nature</source>
<volume>467</volume>
,
<fpage>912</fpage>
(
<year>2010</year>
).
<pub-id pub-id-type="pmid">20962823</pub-id>
</mixed-citation>
</ref>
<ref id="b18">
<mixed-citation publication-type="journal">
<name>
<surname>Diekmann</surname>
<given-names>A.</given-names>
</name>
<article-title>Not the First Digit! Using Benford’s Law to Detect Fraudulent Scientific Data</article-title>
.
<source>J. App. Stat.</source>
<volume>34</volume>
,
<fpage>321</fpage>
<lpage>329</lpage>
(
<year>2007</year>
).</mixed-citation>
</ref>
<ref id="b19">
<mixed-citation publication-type="journal">
<name>
<surname>Judge</surname>
<given-names>G.</given-names>
</name>
&
<name>
<surname>Schechter</surname>
<given-names>L.</given-names>
</name>
<article-title>Detecting Problems in Survey Data using Benford’s Law</article-title>
.
<source>J. Hum. Res.</source>
<volume>44</volume>
:
<fpage>1</fpage>
<lpage>24</lpage>
(
<year>2007</year>
).</mixed-citation>
</ref>
<ref id="b20">
<mixed-citation publication-type="journal">
<name>
<surname>Landsea</surname>
<given-names>C. W.</given-names>
</name>
<etal></etal>
.
<source>In Hurricanes and Typhoons : Past, Present, and Future</source>
1st edn, (Eds
<name>
<surname>Murnane</surname>
<given-names>R. J.</given-names>
</name>
&
<name>
<surname>Liu</surname>
<given-names>K.-B.</given-names>
</name>
),
<volume>Ch.7</volume>
,
<fpage>177</fpage>
<lpage>221</lpage>
(Columbia University Press, New York,
<year>2003</year>
).</mixed-citation>
</ref>
<ref id="b21">
<mixed-citation publication-type="journal">
<name>
<surname>Wang</surname>
<given-names>C.</given-names>
</name>
,
<name>
<surname>Deser</surname>
<given-names>C.</given-names>
</name>
,
<name>
<surname>Yu</surname>
<given-names>J. Y.</given-names>
</name>
,
<name>
<surname>DiNezio</surname>
<given-names>P.</given-names>
</name>
&
<name>
<surname>Clement</surname>
<given-names>A.</given-names>
</name>
In
<source>Coral Reefs of the Eastern Pacific</source>
, (Eds
<name>
<surname>Glynn</surname>
<given-names>P.</given-names>
</name>
,
<name>
<surname>Manzella</surname>
<given-names>D.</given-names>
</name>
&
<name>
<surname>Enochs</surname>
<given-names>I.</given-names>
</name>
),
<volume>Ch.1</volume>
,
<fpage>2</fpage>
<lpage>19</lpage>
, (Springer Science Publisher,
<year>2012</year>
).</mixed-citation>
</ref>
<ref id="b22">
<mixed-citation publication-type="journal">
<name>
<surname>Jarvinen</surname>
<given-names>B. R.</given-names>
</name>
,
<name>
<surname>Neumann</surname>
<given-names>C. J.</given-names>
</name>
&
<name>
<surname>Davis</surname>
<given-names>M. A.</given-names>
</name>
<article-title>A Tropical Cyclone Data Tape for the North Atlantic Basin 1886-1983: Contents, Limitations, and Uses</article-title>
.
<source>NOAA Tech. Memo. NWS-NHC 22.</source>
<fpage>21</fpage>
pp. (Coral Gables, Fla.
<year>1984</year>
).</mixed-citation>
</ref>
<ref id="b23">
<mixed-citation publication-type="journal">
<name>
<surname>Landsea</surname>
<given-names>C. W.</given-names>
</name>
,
<name>
<surname>Nicholls</surname>
<given-names>N.</given-names>
</name>
,
<name>
<surname>Gray</surname>
<given-names>W. M.</given-names>
</name>
&
<name>
<surname>Avila</surname>
<given-names>L. A.</given-names>
</name>
<article-title>Downward trends in the frequency of intense Atlantic hurricanes during the past five decades</article-title>
.
<source>Geophys. Res. Lett.</source>
<volume>23</volume>
,
<fpage>1697</fpage>
<lpage>1700</lpage>
(
<year>1996</year>
).</mixed-citation>
</ref>
<ref id="b24">
<mixed-citation publication-type="journal">
<name>
<surname>Landsea</surname>
<given-names>C. W.</given-names>
</name>
,
<name>
<surname>Harper</surname>
<given-names>B. A.</given-names>
</name>
,
<name>
<surname>Hoarau</surname>
<given-names>K.</given-names>
</name>
&
<name>
<surname>Knaff</surname>
<given-names>J. A.</given-names>
</name>
<article-title>Can we detect trends in extreme tropical cyclones?</article-title>
<source>Science</source>
<volume>313</volume>
,
<fpage>452</fpage>
<lpage>454</lpage>
(
<year>2006</year>
).
<pub-id pub-id-type="pmid">16873634</pub-id>
</mixed-citation>
</ref>
<ref id="b25">
<mixed-citation publication-type="journal">
<name>
<surname>Landsea</surname>
<given-names>C. W.</given-names>
</name>
<article-title>Counting Atlantic Tropical Cyclones back to 1900</article-title>
.
<source>Eos</source>
<volume>88</volume>
,
<fpage>197</fpage>
(
<year>2007</year>
).</mixed-citation>
</ref>
<ref id="b26">
<mixed-citation publication-type="journal">
<name>
<surname>Burpee</surname>
<given-names>R. W.</given-names>
</name>
,
<name>
<surname>Franklin</surname>
<given-names>J. L.</given-names>
</name>
,
<name>
<surname>Tuleya</surname>
<given-names>S. J.</given-names>
</name>
&
<name>
<surname>Aberson</surname>
<given-names>S. D.</given-names>
</name>
<article-title>The impact of omega dropwindsondes on operational hurricane track forecast models</article-title>
.
<source>Bull. Amer. Meteor. Soc.</source>
<volume>77</volume>
,
<fpage>925</fpage>
<lpage>933</lpage>
(
<year>1996</year>
).</mixed-citation>
</ref>
<ref id="b27">
<mixed-citation publication-type="journal">
<name>
<surname>Holland</surname>
<given-names>G. J.</given-names>
</name>
,
<name>
<surname>McGeer</surname>
<given-names>T.</given-names>
</name>
&
<name>
<surname>Youngren</surname>
<given-names>H. H.</given-names>
</name>
<article-title>Autonomous aerosondes for economical atmospheric soundings anywhere on the globe</article-title>
.
<source>Bull. Amer. Meteor. Soc.</source>
<volume>73</volume>
,
<fpage>1987</fpage>
<lpage>1999</lpage>
(
<year>1992</year>
).</mixed-citation>
</ref>
<ref id="b28">
<mixed-citation publication-type="journal">
<name>
<surname>Zhang</surname>
<given-names>Z.</given-names>
</name>
&
<name>
<surname>Krishnamurti</surname>
<given-names>T. N.</given-names>
</name>
<article-title>Ensemble forecasting of hurricane tracks</article-title>
.
<source>Bull. Amer. Meteor. Soc.</source>
<volume>78</volume>
,
<fpage>2785</fpage>
<lpage>95</lpage>
(
<year>1997</year>
).</mixed-citation>
</ref>
<ref id="b29">
<mixed-citation publication-type="journal">
<name>
<surname>Landsea</surname>
<given-names>C. W.</given-names>
</name>
,
<name>
<surname>Vecchi</surname>
<given-names>G. A.</given-names>
</name>
,
<name>
<surname>Bengtsson</surname>
<given-names>L.</given-names>
</name>
&
<name>
<surname>Knutson</surname>
<given-names>T. R.</given-names>
</name>
<article-title>Impact of Duration Thresholds on Atlantic Tropical Cyclone Counts</article-title>
.
<source>J. of Clim.</source>
<volume>23</volume>
,
<fpage>2508</fpage>
<lpage>2519</lpage>
(
<year>2010</year>
).</mixed-citation>
</ref>
<ref id="b30">
<mixed-citation publication-type="journal">
<name>
<surname>Villarini</surname>
<given-names>G.</given-names>
</name>
,
<name>
<surname>Vecchi</surname>
<given-names>G. A.</given-names>
</name>
,
<name>
<surname>Knutson</surname>
<given-names>T. R.</given-names>
</name>
&
<name>
<surname>Smith</surname>
<given-names>J. A.</given-names>
</name>
<article-title>Is the Recorded Increase in Short Duration North Atlantic Tropical Storms Spurious?</article-title>
<source>J. Geophys. Res.</source>
<volume>116</volume>
,
<fpage>D10114</fpage>
(
<year>2011</year>
).</mixed-citation>
</ref>
</ref-list>
<fn-group>
<fn>
<p>
<bold>Author Contributions</bold>
R.J.-B. and T.B. designed the study. R.J.-B., T.B., A.S., M.S. and S.M.M. interpreted the data and wrote the manuscript.</p>
</fn>
</fn-group>
</back>
<floats-group>
<fig id="f1">
<label>Figure 1</label>
<caption>
<title>Map of the world TC tracks from International Best Track Archive for Climate Stewardship (IBTrACS).</title>
<p>(Top) from 1931 to the present day; (bottom) from 1841 to 1930. TC records prior to 1931 were based on only a single position estimate per day, while at the same time many parts of the globe were poorly sampled (Jarvinen
<italic>et al.</italic>
, 1984). There are no data available prior to 1930 for the western Pacific Ocean along the North and Central American coast, while this region is prone to TC activities, especially due to the direct influence of El Niño/La Niña; (Figure made with ArcGIS
<sup>®</sup>
software and Corel Draw X5).</p>
</caption>
<graphic xlink:href="srep12046-f1"></graphic>
</fig>
<fig id="f2">
<label>Figure 2</label>
<caption>
<title>Distribution of TC travelled distances and frequency from 1842 to present.</title>
<p>The 1930s mark a significant improvement in the recording and measurement of tropical cyclone occurrences. Tropical cyclone travelled distances (km) from 1841 to 2010; the red curve is the 5-year running mean distance (km).</p>
</caption>
<graphic xlink:href="srep12046-f2"></graphic>
</fig>
<fig id="f3">
<label>Figure 3</label>
<caption>
<title>Temporal variations of TC travelled-distance categories.</title>
<p>Frequency of TC occurrences relative to the category of distance traveled: (i) short (<1000 km), (ii) medium (1000 km < x < 5000 km) and (iii) long (>5000 km); solid curves correspond to the 5-year running mean (5YRM).</p>
</caption>
<graphic xlink:href="srep12046-f3"></graphic>
</fig>
<fig id="f4">
<label>Figure 4</label>
<caption>
<title>Relationship between BL first digit theoretical values and the first digit TC track distribution.</title>
<p>Comparison of the distribution of first digit occurrence in the global TC record with the theoretical Benford’s Law estimates.</p>
</caption>
<graphic xlink:href="srep12046-f4"></graphic>
</fig>
<fig id="f5">
<label>Figure 5</label>
<caption>
<title>Calculation of misfit from BL estimate over the entire TC dataset.</title>
<p>The scatter of the data is in itself an indication of the dataset quality and homogeneity. The graph shows the BL misfit for individual years (black with empty circles), the 5YRM (red) and the 10YRM (blue).
<bold>P</bold>
<sub>
<bold>1</bold>
</sub>
and
<bold>P</bold>
<sub>
<bold>2</bold>
</sub>
divide the dataset into two large periods relating to technological improvements. A, B, C and D correspond to notable shifts or fluctuations within the BL misfit.</p>
</caption>
<graphic xlink:href="srep12046-f5"></graphic>
</fig>
<table-wrap position="float" id="t1">
<label>Table 1</label>
<caption>
<title>First-digit values obtained over the entire TC dataset (empirical values) against the theoretical values described by Benford’s law.</title>
</caption>
<table frame="hsides" rules="groups" border="1">
<colgroup>
<col align="left"></col>
<col align="char" char="."></col>
<col align="char" char="."></col>
</colgroup>
<thead valign="bottom">
<tr>
<th align="left" valign="top" charoff="50">First digit</th>
<th align="center" valign="top" char="." charoff="50">Theoretical values %</th>
<th align="center" valign="top" char="." charoff="50">Empirical values %</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left" valign="top" charoff="50">1</td>
<td align="char" valign="top" char="." charoff="50">30.1</td>
<td align="char" valign="top" char="." charoff="50">29.8</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">2</td>
<td align="char" valign="top" char="." charoff="50">17.6</td>
<td align="char" valign="top" char="." charoff="50">18.8</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">3</td>
<td align="char" valign="top" char="." charoff="50">12.5</td>
<td align="char" valign="top" char="." charoff="50">13.5</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">4</td>
<td align="char" valign="top" char="." charoff="50">9.7</td>
<td align="char" valign="top" char="." charoff="50">9.8</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">5</td>
<td align="char" valign="top" char="." charoff="50">7.9</td>
<td align="char" valign="top" char="." charoff="50">7.9</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">6</td>
<td align="char" valign="top" char="." charoff="50">6.7</td>
<td align="char" valign="top" char="." charoff="50">6.3</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">7</td>
<td align="char" valign="top" char="." charoff="50">5.8</td>
<td align="char" valign="top" char="." charoff="50">5.2</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">8</td>
<td align="char" valign="top" char="." charoff="50">5.1</td>
<td align="char" valign="top" char="." charoff="50">4.4</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">9</td>
<td align="char" valign="top" char="." charoff="50">4.6</td>
<td align="char" valign="top" char="." charoff="50">4.3</td>
</tr>
</tbody>
</table>
</table-wrap>
</floats-group>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000805 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000805 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Asie
   |area=    AustralieFrV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4496784
   |texte=   Using Benford’s law to investigate Natural Hazard dataset homogeneity
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:26156060" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a AustralieFrV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Dec 5 10:43:12 2017. Site generation: Tue Mar 5 14:07:20 2024