Telematics exploration server

Supersampling and Network Reconstruction of Urban Mobility

Internal identifier: 000033 (Pmc/Corpus); previous: 000032; next: 000034

Authors: Oleguer Sagarra; Michael Szell; Paolo Santi; Albert Díaz-Guilera; Carlo Ratti

RBID : PMC:4537279

Abstract

Understanding human mobility is of vital importance for urban planning, epidemiology, and many other fields that draw policies from the activities of humans in space. Despite the recent availability of large-scale data sets of GPS traces or mobile phone records capturing human mobility, typically only a subsample of the population of interest is represented, giving a possibly incomplete picture of the entire system under study. Methods to reliably extract mobility information from such reduced data and to assess their sampling biases are lacking. To that end, we analyzed a data set of millions of taxi movements in New York City. We first show that, once they are appropriately transformed, mobility patterns are highly stable over long time scales. Based on this observation, we develop a supersampling methodology to reliably extrapolate mobility records from a reduced sample based on an entropy maximization procedure, and we propose a number of network-based metrics to assess the accuracy of the predicted vehicle flows. Our approach provides a well-founded way to exploit temporal patterns to save effort in recording mobility data, and opens the possibility of scaling up data from limited records when information on the full system is required.


DOI: 10.1371/journal.pone.0134508
PubMed: 26275237
PubMed Central: 4537279

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Supersampling and Network Reconstruction of Urban Mobility</title>
<author>
<name sortKey="Sagarra, Oleguer" sort="Sagarra, Oleguer" uniqKey="Sagarra O" first="Oleguer" last="Sagarra">Oleguer Sagarra</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>Departament de Física Fonamental, Universitat de Barcelona, Barcelona, Spain</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Szell, Michael" sort="Szell, Michael" uniqKey="Szell M" first="Michael" last="Szell">Michael Szell</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff003">
<addr-line>Center for Complex Network Research, Northeastern University, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Santi, Paolo" sort="Santi, Paolo" uniqKey="Santi P" first="Paolo" last="Santi">Paolo Santi</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff004">
<addr-line>Istituto di Informatica e Telematica del CNR, Pisa, Italy</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Diaz Guilera, Albert" sort="Diaz Guilera, Albert" uniqKey="Diaz Guilera A" first="Albert" last="Díaz-Guilera">Albert Díaz-Guilera</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>Departament de Física Fonamental, Universitat de Barcelona, Barcelona, Spain</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ratti, Carlo" sort="Ratti, Carlo" uniqKey="Ratti C" first="Carlo" last="Ratti">Carlo Ratti</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26275237</idno>
<idno type="pmc">4537279</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4537279</idno>
<idno type="RBID">PMC:4537279</idno>
<idno type="doi">10.1371/journal.pone.0134508</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000033</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000033</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Supersampling and Network Reconstruction of Urban Mobility</title>
<author>
<name sortKey="Sagarra, Oleguer" sort="Sagarra, Oleguer" uniqKey="Sagarra O" first="Oleguer" last="Sagarra">Oleguer Sagarra</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>Departament de Física Fonamental, Universitat de Barcelona, Barcelona, Spain</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Szell, Michael" sort="Szell, Michael" uniqKey="Szell M" first="Michael" last="Szell">Michael Szell</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff003">
<addr-line>Center for Complex Network Research, Northeastern University, Boston, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Santi, Paolo" sort="Santi, Paolo" uniqKey="Santi P" first="Paolo" last="Santi">Paolo Santi</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff004">
<addr-line>Istituto di Informatica e Telematica del CNR, Pisa, Italy</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Diaz Guilera, Albert" sort="Diaz Guilera, Albert" uniqKey="Diaz Guilera A" first="Albert" last="Díaz-Guilera">Albert Díaz-Guilera</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>Departament de Física Fonamental, Universitat de Barcelona, Barcelona, Spain</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ratti, Carlo" sort="Ratti, Carlo" uniqKey="Ratti C" first="Carlo" last="Ratti">Carlo Ratti</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Understanding human mobility is of vital importance for urban planning, epidemiology, and many other fields that draw policies from the activities of humans in space. Despite the recent availability of large-scale data sets of GPS traces or mobile phone records capturing human mobility, typically only a subsample of the population of interest is represented, giving a possibly incomplete picture of the entire system under study. Methods to reliably extract mobility information from such reduced data and to assess their sampling biases are lacking. To that end, we analyzed a data set of millions of taxi movements in New York City. We first show that, once they are appropriately transformed, mobility patterns are highly stable over long time scales. Based on this observation, we develop a
<italic>supersampling</italic>
methodology to reliably extrapolate mobility records from a reduced sample based on an entropy maximization procedure, and we propose a number of network-based metrics to assess the accuracy of the predicted vehicle flows. Our approach provides a well-founded way to exploit temporal patterns to save effort in recording mobility data, and opens the possibility of scaling up data from limited records when information on the full system is required.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Gonzalez, Mc" uniqKey="Gonzalez M">MC González</name>
</author>
<author>
<name sortKey="Hidalgo, Ca" uniqKey="Hidalgo C">CA Hidalgo</name>
</author>
<author>
<name sortKey="Barabasi, Al" uniqKey="Barabasi A">AL Barabási</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bazzani, A" uniqKey="Bazzani A">A Bazzani</name>
</author>
<author>
<name sortKey="Giorgini, B" uniqKey="Giorgini B">B Giorgini</name>
</author>
<author>
<name sortKey="Rambaldi, S" uniqKey="Rambaldi S">S Rambaldi</name>
</author>
<author>
<name sortKey="Gallotti, R" uniqKey="Gallotti R">R Gallotti</name>
</author>
<author>
<name sortKey="Giovannini, L" uniqKey="Giovannini L">L Giovannini</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brockmann, D" uniqKey="Brockmann D">D Brockmann</name>
</author>
<author>
<name sortKey="Hufnagel, L" uniqKey="Hufnagel L">L Hufnagel</name>
</author>
<author>
<name sortKey="Geisel, T" uniqKey="Geisel T">T Geisel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thiemann, C" uniqKey="Thiemann C">C Thiemann</name>
</author>
<author>
<name sortKey="Theis, F" uniqKey="Theis F">F Theis</name>
</author>
<author>
<name sortKey="Grady, D" uniqKey="Grady D">D Grady</name>
</author>
<author>
<name sortKey="Brune, R" uniqKey="Brune R">R Brune</name>
</author>
<author>
<name sortKey="Dirk Brockmann, D" uniqKey="Dirk Brockmann D">D Dirk Brockmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scellato, S" uniqKey="Scellato S">S Scellato</name>
</author>
<author>
<name sortKey="Noulas, A" uniqKey="Noulas A">A Noulas</name>
</author>
<author>
<name sortKey="Lambiotte, R" uniqKey="Lambiotte R">R Lambiotte</name>
</author>
<author>
<name sortKey="Mascolo, C" uniqKey="Mascolo C">C Mascolo</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barthelemy, M" uniqKey="Barthelemy M">M Barthélemy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cattuto, C" uniqKey="Cattuto C">C Cattuto</name>
</author>
<author>
<name sortKey="Van Den Broeck, W" uniqKey="Van Den Broeck W">W Van den Broeck</name>
</author>
<author>
<name sortKey="Barrat, A" uniqKey="Barrat A">A Barrat</name>
</author>
<author>
<name sortKey="Colizza, V" uniqKey="Colizza V">V Colizza</name>
</author>
<author>
<name sortKey="Pinton, Jf" uniqKey="Pinton J">JF Pinton</name>
</author>
<author>
<name sortKey="Vespignani, A" uniqKey="Vespignani A">A Vespignani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Roth, C" uniqKey="Roth C">C Roth</name>
</author>
<author>
<name sortKey="Kang, Sm" uniqKey="Kang S">SM Kang</name>
</author>
<author>
<name sortKey="Batty, M" uniqKey="Batty M">M Batty</name>
</author>
<author>
<name sortKey="Barthelemy, M" uniqKey="Barthelemy M">M Barthélemy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Szell, M" uniqKey="Szell M">M Szell</name>
</author>
<author>
<name sortKey="Sinatra, R" uniqKey="Sinatra R">R Sinatra</name>
</author>
<author>
<name sortKey="Petri, G" uniqKey="Petri G">G Petri</name>
</author>
<author>
<name sortKey="Thurner, S" uniqKey="Thurner S">S Thurner</name>
</author>
<author>
<name sortKey="Latora, V" uniqKey="Latora V">V Latora</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Song, C" uniqKey="Song C">C Song</name>
</author>
<author>
<name sortKey="Koren, T" uniqKey="Koren T">T Koren</name>
</author>
<author>
<name sortKey="Wang, P" uniqKey="Wang P">P Wang</name>
</author>
<author>
<name sortKey="Barabasi, Al" uniqKey="Barabasi A">AL Barabási</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Song, C" uniqKey="Song C">C Song</name>
</author>
<author>
<name sortKey="Qu, Z" uniqKey="Qu Z">Z Qu</name>
</author>
<author>
<name sortKey="Blumm, N" uniqKey="Blumm N">N Blumm</name>
</author>
<author>
<name sortKey="Barabasi, Al" uniqKey="Barabasi A">AL Barabási</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hufnagel, L" uniqKey="Hufnagel L">L Hufnagel</name>
</author>
<author>
<name sortKey="Brockmann, D" uniqKey="Brockmann D">D Brockmann</name>
</author>
<author>
<name sortKey="Geisel, T" uniqKey="Geisel T">T Geisel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Belik, V" uniqKey="Belik V">V Belik</name>
</author>
<author>
<name sortKey="Geisel, T" uniqKey="Geisel T">T Geisel</name>
</author>
<author>
<name sortKey="Brockmann, D" uniqKey="Brockmann D">D Brockmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Colizza, V" uniqKey="Colizza V">V Colizza</name>
</author>
<author>
<name sortKey="Barrat, A" uniqKey="Barrat A">A Barrat</name>
</author>
<author>
<name sortKey="Barthelemy, M" uniqKey="Barthelemy M">M Barthélemy</name>
</author>
<author>
<name sortKey="Vespignani, A" uniqKey="Vespignani A">A Vespignani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Balcan, D" uniqKey="Balcan D">D Balcan</name>
</author>
<author>
<name sortKey="Colizza, V" uniqKey="Colizza V">V Colizza</name>
</author>
<author>
<name sortKey="Goncalves, B" uniqKey="Goncalves B">B Goncalves</name>
</author>
<author>
<name sortKey="Hu, H" uniqKey="Hu H">H Hu</name>
</author>
<author>
<name sortKey="Ramasco, Jj" uniqKey="Ramasco J">JJ Ramasco</name>
</author>
<author>
<name sortKey="Vespignani, A" uniqKey="Vespignani A">A Vespignani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Batty, M" uniqKey="Batty M">M Batty</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zipf, Gk" uniqKey="Zipf G">GK Zipf</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Simini, F" uniqKey="Simini F">F Simini</name>
</author>
<author>
<name sortKey="Gonzalez, Mc" uniqKey="Gonzalez M">MC González</name>
</author>
<author>
<name sortKey="Maritan, A" uniqKey="Maritan A">A Maritan</name>
</author>
<author>
<name sortKey="Barabasi, Al" uniqKey="Barabasi A">AL Barabási</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stouffer, Sa" uniqKey="Stouffer S">SA Stouffer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Clauset, A" uniqKey="Clauset A">A Clauset</name>
</author>
<author>
<name sortKey="Shalizi, Cr" uniqKey="Shalizi C">CR Shalizi</name>
</author>
<author>
<name sortKey="Newman, Mej" uniqKey="Newman M">MEJ Newman</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Santi, P" uniqKey="Santi P">P Santi</name>
</author>
<author>
<name sortKey="Resta, G" uniqKey="Resta G">G Resta</name>
</author>
<author>
<name sortKey="Szell, M" uniqKey="Szell M">M Szell</name>
</author>
<author>
<name sortKey="Sobolevsky, S" uniqKey="Sobolevsky S">S Sobolevsky</name>
</author>
<author>
<name sortKey="Strogatz, Sh" uniqKey="Strogatz S">SH Strogatz</name>
</author>
<author>
<name sortKey="Ratti, C" uniqKey="Ratti C">C Ratti</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wilson, A" uniqKey="Wilson A">A Wilson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Erlander, S" uniqKey="Erlander S">S Erlander</name>
</author>
<author>
<name sortKey="Stewart, Nf" uniqKey="Stewart N">NF Stewart</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sagarra, O" uniqKey="Sagarra O">O Sagarra</name>
</author>
<author>
<name sortKey="Perez Vicente, C" uniqKey="Perez Vicente C">C Pérez-Vicente</name>
</author>
<author>
<name sortKey="Diaz Guilera, A" uniqKey="Diaz Guilera A">A Díaz Guilera</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sagarra, O" uniqKey="Sagarra O">O Sagarra</name>
</author>
<author>
<name sortKey="Font Clos, F" uniqKey="Font Clos F">F Font-Clos</name>
</author>
<author>
<name sortKey="Perez Vicente, Cj" uniqKey="Perez Vicente C">CJ Pérez-Vicente</name>
</author>
<author>
<name sortKey="Diaz Guilera, A" uniqKey="Diaz Guilera A">A Díaz-Guilera</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Peng, C" uniqKey="Peng C">C Peng</name>
</author>
<author>
<name sortKey="Jin, X" uniqKey="Jin X">X Jin</name>
</author>
<author>
<name sortKey="Wong, K" uniqKey="Wong K">K Wong</name>
</author>
<author>
<name sortKey="Shi, M" uniqKey="Shi M">M Shi</name>
</author>
<author>
<name sortKey="Lio, P" uniqKey="Lio P">P Lio</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yang, Y" uniqKey="Yang Y">Y Yang</name>
</author>
<author>
<name sortKey="Herrera, C" uniqKey="Herrera C">C Herrera</name>
</author>
<author>
<name sortKey="Eagle, N" uniqKey="Eagle N">N Eagle</name>
</author>
<author>
<name sortKey="Gonzalez, Mc" uniqKey="Gonzalez M">MC González</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bianconi, G" uniqKey="Bianconi G">G Bianconi</name>
</author>
<author>
<name sortKey="Pin, P" uniqKey="Pin P">P Pin</name>
</author>
<author>
<name sortKey="Marsili, M" uniqKey="Marsili M">M Marsili</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lenormand, M" uniqKey="Lenormand M">M Lenormand</name>
</author>
<author>
<name sortKey="Picornell, M" uniqKey="Picornell M">M Picornell</name>
</author>
<author>
<name sortKey="Cantu Ros, Og" uniqKey="Cantu Ros O">OG Cantú-Ros</name>
</author>
<author>
<name sortKey="Tugores, A" uniqKey="Tugores A">A Tugores</name>
</author>
<author>
<name sortKey="Louail, T" uniqKey="Louail T">T Louail</name>
</author>
<author>
<name sortKey="Herranz, R" uniqKey="Herranz R">R Herranz</name>
</author>
<author>
<name sortKey="Barthelemy, M" uniqKey="Barthelemy M">M Barthelemy</name>
</author>
<author>
<name sortKey="Frias Martinez, E" uniqKey="Frias Martinez E">E Frías-Martinez</name>
</author>
<author>
<name sortKey="Ramasco, J J" uniqKey="Ramasco J">J.J. Ramasco</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lenormand, M" uniqKey="Lenormand M">M Lenormand</name>
</author>
<author>
<name sortKey="Huet, S" uniqKey="Huet S">S Huet</name>
</author>
<author>
<name sortKey="Gargiulo, F" uniqKey="Gargiulo F">F Gargiulo</name>
</author>
<author>
<name sortKey="Deffuant, G" uniqKey="Deffuant G">G Deffuant</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Robillard, P" uniqKey="Robillard P">P Robillard</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS One</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS ONE</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group>
<journal-title>PLoS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, CA USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">26275237</article-id>
<article-id pub-id-type="pmc">4537279</article-id>
<article-id pub-id-type="publisher-id">PONE-D-15-17090</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0134508</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Supersampling and Network Reconstruction of Urban Mobility</article-title>
<alt-title alt-title-type="running-head">Supersampling and Network Reconstruction of Urban Mobility</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Sagarra</surname>
<given-names>Oleguer</given-names>
</name>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff002">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="cor001">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Szell</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="aff" rid="aff002">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff003">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Santi</surname>
<given-names>Paolo</given-names>
</name>
<xref ref-type="aff" rid="aff002">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff004">
<sup>4</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Díaz-Guilera</surname>
<given-names>Albert</given-names>
</name>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ratti</surname>
<given-names>Carlo</given-names>
</name>
<xref ref-type="aff" rid="aff002">
<sup>2</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff001">
<label>1</label>
<addr-line>Departament de Física Fonamental, Universitat de Barcelona, Barcelona, Spain</addr-line>
</aff>
<aff id="aff002">
<label>2</label>
<addr-line>Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America</addr-line>
</aff>
<aff id="aff003">
<label>3</label>
<addr-line>Center for Complex Network Research, Northeastern University, Boston, Massachusetts, United States of America</addr-line>
</aff>
<aff id="aff004">
<label>4</label>
<addr-line>Istituto di Informatica e Telematica del CNR, Pisa, Italy</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Perc</surname>
<given-names>Matjaz</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">
<addr-line>University of Maribor, SLOVENIA</addr-line>
</aff>
<author-notes>
<fn fn-type="conflict" id="coi001">
<p>
<bold>Competing Interests: </bold>
Audi Volkswagen partly funded this study. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.</p>
</fn>
<fn fn-type="con" id="contrib001">
<p>Conceived and designed the experiments: CR AD PS OS. Performed the experiments: OS. Analyzed the data: MS OS. Contributed reagents/materials/analysis tools: OS. Wrote the paper: AD PS MS OS.</p>
</fn>
<corresp id="cor001">* E-mail:
<email>osagarra@ub.edu</email>
</corresp>
</author-notes>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<pub-date pub-type="epub">
<day>14</day>
<month>8</month>
<year>2015</year>
</pub-date>
<volume>10</volume>
<issue>8</issue>
<elocation-id>e0134508</elocation-id>
<history>
<date date-type="received">
<day>20</day>
<month>4</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>9</day>
<month>7</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-year>2015</copyright-year>
<copyright-holder>Sagarra et al</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open access article distributed under the terms of the
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>
, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:type="simple" xlink:href="pone.0134508.pdf"></self-uri>
<abstract>
<p>Understanding human mobility is of vital importance for urban planning, epidemiology, and many other fields that draw policies from the activities of humans in space. Despite the recent availability of large-scale data sets of GPS traces or mobile phone records capturing human mobility, typically only a subsample of the population of interest is represented, giving a possibly incomplete picture of the entire system under study. Methods to reliably extract mobility information from such reduced data and to assess their sampling biases are lacking. To that end, we analyzed a data set of millions of taxi movements in New York City. We first show that, once they are appropriately transformed, mobility patterns are highly stable over long time scales. Based on this observation, we develop a
<italic>supersampling</italic>
methodology to reliably extrapolate mobility records from a reduced sample based on an entropy maximization procedure, and we propose a number of network-based metrics to assess the accuracy of the predicted vehicle flows. Our approach provides a well-founded way to exploit temporal patterns to save effort in recording mobility data, and opens the possibility of scaling up data from limited records when information on the full system is required.</p>
</abstract>
<funding-group>
<funding-statement>The work of O.S. has been partially supported by the EU-LASAGNE Project, Contract No. 318132 (STREP), the Spanish MINECO Grant FIS2012-38266-C02-02 and by the Generalitat de Catalunya through the FI Program and 2014-SGR-00608. P.S. and C.R. thank the Enel Foundation, Audi Volkswagen, and all the members of the MIT Senseable City Lab Consortium for supporting this research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</funding-statement>
</funding-group>
<counts>
<fig-count count="4"></fig-count>
<table-count count="4"></table-count>
<page-count count="15"></page-count>
</counts>
<custom-meta-group>
<custom-meta id="data-availability">
<meta-name>Data Availability</meta-name>
<meta-value>Data are freely available from the New York Taxi and Limousine Commission via Freedom of Information Law request. The dataset is composed of all the taxi trips recorded in NY for the year 2011 by the NYTLM. Data requests may be sent to Jason Gonzalez, Records Access Officer:
<email>foil@tlc.nyc.gov</email>
.</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
<notes>
<title>Data Availability</title>
<p>Data are freely available from the New York Taxi and Limousine Commission via Freedom of Information Law request. The dataset is composed of all the taxi trips recorded in NY for the year 2011 by the NYTLM. Data requests may be sent to Jason Gonzalez, Records Access Officer:
<email>foil@tlc.nyc.gov</email>
.</p>
</notes>
</front>
<body>
<sec sec-type="intro" id="sec001">
<title>Introduction</title>
<p>The increased pervasiveness of information and communication technologies is enabling the tracking of human mobility at an unprecedented scale. Massive call detail records from mobile phone activities [
<xref rid="pone.0134508.ref001" ref-type="bibr">1</xref>
,
<xref rid="pone.0134508.ref002" ref-type="bibr">2</xref>
] and the use of global positioning systems (GPS) in large vehicle fleets [
<xref rid="pone.0134508.ref003" ref-type="bibr">3</xref>
], for instance, are generating extraordinary quantities of positional and movement data available to researchers who aim to understand human activity in space. Other data sources, such as observations of banknote circulation [
<xref rid="pone.0134508.ref004" ref-type="bibr">4</xref>
,
<xref rid="pone.0134508.ref005" ref-type="bibr">5</xref>
], online location-based social networks [
<xref rid="pone.0134508.ref006" ref-type="bibr">6</xref>
,
<xref rid="pone.0134508.ref007" ref-type="bibr">7</xref>
], radio frequency identification traces [
<xref rid="pone.0134508.ref008" ref-type="bibr">8</xref>
–
<xref rid="pone.0134508.ref010" ref-type="bibr">10</xref>
], or even virtual movements of avatars in online games [
<xref rid="pone.0134508.ref011" ref-type="bibr">11</xref>
] have also been used as proxies for human movements. These studies have provided valuable insights into several aspects of human mobility, uncovering distinct features of human travel behavior such as scaling laws [
<xref rid="pone.0134508.ref004" ref-type="bibr">4</xref>
,
<xref rid="pone.0134508.ref012" ref-type="bibr">12</xref>
] or predictability of trajectories [
<xref rid="pone.0134508.ref013" ref-type="bibr">13</xref>
] among others. Besides empirical studies, the surge of available data on human mobility has also evoked interest in developing new theoretical models of mobility at several scales. Such models have deep implications for various subjects ranging from epidemiology to urbanism [
<xref rid="pone.0134508.ref014" ref-type="bibr">14</xref>
–
<xref rid="pone.0134508.ref017" ref-type="bibr">17</xref>
], with special importance in city planning and policy action [
<xref rid="pone.0134508.ref018" ref-type="bibr">18</xref>
].</p>
<p>Despite these first success stories, the theoretical development of tools and techniques for handling massive data sets of human mobility and for assessing their possible biases is still a road full of obstacles. Existing models based on gravity [
<xref rid="pone.0134508.ref019" ref-type="bibr">19</xref>
], radiation [
<xref rid="pone.0134508.ref020" ref-type="bibr">20</xref>
], intervening opportunities [
<xref rid="pone.0134508.ref021" ref-type="bibr">21</xref>
], etc. represent a first step towards an accurate proxy for mobility at medium and large scales, but they have not always proven satisfactory for describing short-scale movements such as intra-city displacements. The size of the data analyzed, the multiple scales involved, the highly skewed statistical nature of human activities [
<xref rid="pone.0134508.ref022" ref-type="bibr">22</xref>
] and the lack of strict control on the reliability of the data are just some of the multiple challenges this exciting new era poses.</p>
<p>Although the data sets used in empirically driven urban-scale mobility research are often extensive, one of their main limitations is their limited coverage of the entire population under study. For instance, cell phone data records are typically obtained from a single operator. Similarly, data from taxis or other vehicle fleets are typically obtained from a single company, which usually represents only a small fraction of the actual number of vehicles circulating in a city [
<xref rid="pone.0134508.ref003" ref-type="bibr">3</xref>
,
<xref rid="pone.0134508.ref023" ref-type="bibr">23</xref>
]. In some scenarios, fully grasping a certain mobility-related phenomenon may require modelling the entire population of interest. For instance, it was shown that the fraction of taxi trips that can be shared in the city of New York is an increasing (albeit not simple) function of the number of daily taxi trips [
<xref rid="pone.0134508.ref024" ref-type="bibr">24</xref>
]. Hence, if a certain data set covers only a fraction of the daily taxi trips performed in a city, the taxi sharing potential cannot be fully unveiled.</p>
<p>The above discussion motivates the need for extrapolating urban mobility data starting from a subset of the population of interest. Although a number of urban mobility studies have applied such methods [
<xref rid="pone.0134508.ref025" ref-type="bibr">25</xref>
,
<xref rid="pone.0134508.ref026" ref-type="bibr">26</xref>
], a statistically rigorous extrapolation methodology has so far been neither defined nor assessed. Even the sub-problem of assessing the quality of urban movement models remains open, since the skewness of the underlying statistical distributions [
<xref rid="pone.0134508.ref010" ref-type="bibr">10</xref>
] makes a set of consistent, quantitative indicators hard to develop. In this paper, we fill these gaps by introducing a rigorous methodology to tackle the problem of obtaining an accurate picture of a mobility process when only a limited observation of such a process is available, both in time and volume. We first propose a simple rescaling rule which allows us to quantify the strong temporal regularity of urban mobility patterns, even at very fine scales such as trips between particular intersections. Exploiting this regularity, we use a maximum entropy approach that combines empirical data, used to model the occurrence of the core of frequent trips, with an exponential gravity model [
<xref rid="pone.0134508.ref027" ref-type="bibr">27</xref>
–
<xref rid="pone.0134508.ref029" ref-type="bibr">29</xref>
] accounting for the variation observed in the least-frequent trips. We apply our method to accurately reconstruct the data set of all taxi trips performed in the city of New York in the year 2011 using small fractions sub-sampled from only a month of recorded data. By analysing the temporal patterns and the topological properties of the yearly mobility of taxis represented as a multi-edge network, we can finally assess the statistical accuracy of the proposed
<italic>supersampling</italic>
methodology using a number of both information-theoretical and network-based performance metrics.</p>
<p>The remainder of the paper is structured as follows: We first present the study of both the temporal and topological patterns observed in the data, which then allows us to construct a maximum entropy method that exploits these features to solve the
<italic>supersampling</italic>
problem. Finally, we systematically test our reconstruction model on a very large data set and conclude by discussing some insights about the structure of urban mobility that the present study draws.</p>
</sec>
<sec sec-type="results" id="sec002">
<title>Results</title>
<p>Mobility data are typically formalized as so-called Origin-Destination (OD) matrices, which are particular examples of weighted, or multi-edge, networks [
<xref rid="pone.0134508.ref030" ref-type="bibr">30</xref>
]. OD matrices represent the number of observed trips
<inline-formula id="pone.0134508.e001">
<alternatives>
<graphic xlink:href="pone.0134508.e001.jpg" id="pone.0134508.e001g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M1">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
between the
<italic>L</italic>
=
<italic>N</italic>
<sup>2</sup>
pairs of
<italic>N</italic>
locations or nodes
<italic>i</italic>
,
<italic>j</italic>
over a given observation period
<italic>τ</italic>
. A location can be defined based on a spatial partitioning of the urban area, on points of interest [
<xref rid="pone.0134508.ref031" ref-type="bibr">31</xref>
], or on road intersections [
<xref rid="pone.0134508.ref024" ref-type="bibr">24</xref>
], as is the case in the NY taxi data set at hand (see
<xref ref-type="sec" rid="sec009">methods</xref>
). Given this network representation, one can compute the total incoming
<inline-formula id="pone.0134508.e002">
<alternatives>
<graphic xlink:href="pone.0134508.e002.jpg" id="pone.0134508.e002g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M2">
<mml:mrow>
<mml:msubsup>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
and outgoing strength
<inline-formula id="pone.0134508.e003">
<alternatives>
<graphic xlink:href="pone.0134508.e003.jpg" id="pone.0134508.e003g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M3">
<mml:mrow>
<mml:msubsup>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
of a node
<italic>i</italic>
. Throughout this paper, we define
<italic>active nodes</italic>
as the subset of nodes which are either origin or destination of at least one trip in the set of all recorded trips
<inline-formula id="pone.0134508.e004">
<alternatives>
<graphic xlink:href="pone.0134508.e004.jpg" id="pone.0134508.e004g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M4">
<mml:mrow>
<mml:mover>
<mml:mi>T</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
and similarly
<italic>active edges</italic>
as the pairs of locations with at least one trip (
<inline-formula id="pone.0134508.e005">
<alternatives>
<graphic xlink:href="pone.0134508.e005.jpg" id="pone.0134508.e005g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M5">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>≥</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
) recorded between them. The notation
<inline-formula id="pone.0134508.e006">
<alternatives>
<graphic xlink:href="pone.0134508.e006.jpg" id="pone.0134508.e006g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M6">
<mml:mrow>
<mml:mover>
<mml:mi>x</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
shall refer to the observed value of the random variable
<italic>x</italic>
as derived from the data set, ⟨
<italic>x</italic>
⟩ to its expected value over independent realizations of a given model, while
<inline-formula id="pone.0134508.e007">
<alternatives>
<graphic xlink:href="pone.0134508.e007.jpg" id="pone.0134508.e007g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M7">
<mml:mrow>
<mml:mover>
<mml:mi>x</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
denotes the matrix or network average of the variable
<inline-formula id="pone.0134508.e008">
<alternatives>
<graphic xlink:href="pone.0134508.e008.jpg" id="pone.0134508.e008g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M8">
<mml:mrow>
<mml:mover>
<mml:mi>x</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
across the full empirical OD matrix (for example, average graph-degree
<inline-formula id="pone.0134508.e009">
<alternatives>
<graphic xlink:href="pone.0134508.e009.jpg" id="pone.0134508.e009g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M9">
<mml:mrow>
<mml:mover>
<mml:mi>k</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
). Finally, the symbol ⟨
<italic>x</italic>
⟩
<sub>
<italic>τ</italic>
</sub>
is used to express averages over time of variable
<italic>x</italic>
using bins of temporal length
<italic>τ</italic>
.</p>
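<p>The definitions above can be illustrated with a minimal Python sketch. It assumes that trips are available as a plain list of (origin, destination) labels; the function name, the toy labels and the data are hypothetical and are not taken from the data set used in this paper.</p>

```python
from collections import Counter

def od_statistics(trips):
    """Build OD counts and basic network statistics from (origin, destination) pairs."""
    t = Counter(trips)                      # t[(i, j)]: observed trips from i to j
    s_out, s_in = Counter(), Counter()
    for (i, j), count in t.items():
        s_out[i] += count                   # outgoing strength of i: sum over j of t_ij
        s_in[j] += count                    # incoming strength of j: sum over i of t_ij
    total = sum(t.values())                 # T: total number of recorded trips
    active_nodes = set(s_out) | set(s_in)   # origin or destination of at least one trip
    active_edges = set(t)                   # pairs with at least one recorded trip
    return t, s_out, s_in, total, active_nodes, active_edges

# Hypothetical toy data: each tuple is one trip between intersection labels.
trips = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A"), ("A", "C")]
t, s_out, s_in, total, nodes, edges = od_statistics(trips)
```

<p>A sparse dictionary of counts is used here rather than a dense N by N array because, as discussed later in the text, only a small fraction of all possible pairs is typically observed.</p>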
<sec id="sec003">
<title>Stability of temporal urban-mobility patterns</title>
<p>While the built structure of cities evolves slowly in time, many dynamic, behavioral processes that take place within a city unfold relatively fast and could, in principle, vary strongly across time. However, human activity in cities exhibits highly regular patterns when observed over well-defined periods of time, such as circadian or weekly rhythms. Intra-urban mobility is a good example of such activities: as movement data are gathered over longer time spans or from larger samples, the picture of the underlying mobility network becomes progressively clearer, but stable patterns should already emerge with relatively few data points, as we can see in
<xref ref-type="fig" rid="pone.0134508.g001">Fig 1A</xref>
. To systematically test this hypothesis, we make use of a fleet of taxis acting as probes, sampling from the total traffic of all vehicles in a city. The total number of recorded trips, or sampling size
<inline-formula id="pone.0134508.e010">
<alternatives>
<graphic xlink:href="pone.0134508.e010.jpg" id="pone.0134508.e010g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M10">
<mml:mrow>
<mml:mover>
<mml:mi>T</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, depends on the total observation time
<italic>τ</italic>
and the number of probes, i.e., the size of the sub-population that is being monitored.</p>
<fig id="pone.0134508.g001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0134508.g001</object-id>
<label>Fig 1</label>
<caption>
<p>
<bold>(a)</bold>
Circadian clocks in the city: Time-series of the number of observed trips
<inline-formula id="pone.0134508.e011">
<alternatives>
<graphic id="pone.0134508.e011g" xlink:href="pone.0134508.e011"></graphic>
<mml:math id="M11">
<mml:mrow>
<mml:mover>
<mml:mi>T</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, active edges
<inline-formula id="pone.0134508.e012">
<alternatives>
<graphic id="pone.0134508.e012g" xlink:href="pone.0134508.e012"></graphic>
<mml:math id="M12">
<mml:mrow>
<mml:mover>
<mml:mi>E</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
and active nodes in the city of NY per day over the year 2011. The influence of seasonal fluctuations, major events and stable weekly patterns is clearly visible.
<bold>(b)</bold>
Aggregated fraction of active nodes and total trips as days of data are accumulated, normalized by the total number of recorded nodes and trips at the end of the observation period. The evolution in time of the accumulated graph density
<italic>E</italic>
/
<italic>L</italic>
=
<italic>E</italic>
/
<italic>N</italic>
<sup>2</sup>
(observed binary edges over total possible pairs of intersections) is also reported. Nodes are almost fully sampled within the first days analyzed, while edges are sampled sub-linearly in time.
<bold>(c)</bold>
Quantile-quantile plot comparing the number of trips per day observed in the data set, restricted to days within two standard deviations of the mean (more than 95% of the data), to the theoretical quantiles of a normal distribution, with a linear fit (dashed line) showing their proportionality (similar results, not shown, are obtained for the number of edges and nodes).</p>
</caption>
<graphic xlink:href="pone.0134508.g001"></graphic>
</fig>
<p>The evolution of the sampling size of trips as a function of the observation period,
<xref ref-type="fig" rid="pone.0134508.g001">Fig 1B</xref>
, can be extremely well approximated by a linear relation (
<italic>R</italic>
<sup>2</sup>
> 0.999), indicating that the total number of trips generated daily in the city can be described as a random variable strongly concentrated around its mean value ⟨
<italic>T</italic>
⟩
<sub>
<italic>τ</italic>
= 1 day</sub>
≃ 403,000 ± 61,000 (confidence bounds reported as standard deviation).</p>
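<p>The linear growth of the accumulated number of trips can be checked with an ordinary least-squares fit. The following Python sketch is only illustrative: the daily counts are invented toy values and the helper name is hypothetical. It fits the cumulative number of trips against the number of observed days and reports the slope (an estimate of the mean number of trips per day) together with the coefficient of determination.</p>

```python
from itertools import accumulate
from statistics import mean

def linear_fit_r2(x, y):
    """Ordinary least-squares line y = slope * x + intercept and its R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical daily trip counts (arbitrary units), not values from the paper.
daily_trips = [400, 410, 395, 405, 390, 415, 385]
days = list(range(1, len(daily_trips) + 1))
cumulative = list(accumulate(daily_trips))   # accumulated sampling size after each day
slope, intercept, r2 = linear_fit_r2(days, cumulative)
```

<p>When the daily counts fluctuate only mildly around their mean, the fitted slope stays close to that mean and the coefficient of determination is close to one, which is the behaviour described above.</p>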
<p>On a yearly scale, the distribution of trips per day
<italic>T</italic>
<sub>
<italic>τ</italic>
= 1 day</sub>
is not statistically compatible with a Gaussian distribution, mainly due to seasonal effects, see
<xref ref-type="fig" rid="pone.0134508.g001">Fig 1A</xref>
. The effects of the summer and winter holidays, and of Hurricane Irene, which hit New York City towards the end of August, are apparent. If we disregard such outliers, which correspond to the roughly 5% of the data lying more than two standard deviations away from the mean, the quantile-quantile plot shows acceptable agreement with a normal distribution, see
<xref ref-type="fig" rid="pone.0134508.g001">Fig 1C</xref>
.</p>
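<p>The outlier filtering and the quantile-quantile comparison can be sketched as follows in Python. This is a minimal illustration under stated assumptions: the population standard deviation and the plotting positions (k - 0.5)/n are conventions chosen here, not necessarily those used to produce Fig 1C, and the daily counts are invented.</p>

```python
from statistics import NormalDist, mean, pstdev

def trim_outliers(values, n_std=2.0):
    """Keep only values within n_std standard deviations of the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if n_std * sigma >= abs(v - mu)]

def normal_qq_pairs(values):
    """Pair each sorted observation with the matching standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (k - 0.5) / n are one common convention (an assumption here).
    theoretical = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical daily counts; the last value mimics a day disrupted by a storm.
daily_trips = [400, 410, 395, 405, 390, 415, 385, 402, 398, 120]
kept = trim_outliers(daily_trips)
pairs = normal_qq_pairs(kept)
```

<p>If the retained counts are approximately normal, the pairs fall close to a straight line whose slope and intercept estimate the standard deviation and the mean, respectively.</p>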
<p>To observe whether this strong regularity is also present in the finer structure of mobility, we must focus on each of the
<italic>N</italic>
nodes and
<italic>L</italic>
intersection pairs. First, however, we must find a suitable scaling of the data: the accumulated observed strength (both incoming and outgoing)
<inline-formula id="pone.0134508.e013">
<alternatives>
<graphic xlink:href="pone.0134508.e013.jpg" id="pone.0134508.e013g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M13">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
of each node and the weight
<inline-formula id="pone.0134508.e014">
<alternatives>
<graphic xlink:href="pone.0134508.e014.jpg" id="pone.0134508.e014g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M14">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
of each intersection pair will increase as more and more data is gathered, but if we normalize their (in or out) strength and weight by the total number of observed trips in the period
<italic>τ</italic>
, a strong regularity is recovered as we show in the following. The quantities
<inline-formula id="pone.0134508.e015">
<alternatives>
<graphic xlink:href="pone.0134508.e015.jpg" id="pone.0134508.e015g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M15">
<mml:mrow>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
and
<inline-formula id="pone.0134508.e016">
<alternatives>
<graphic xlink:href="pone.0134508.e016.jpg" id="pone.0134508.e016g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M16">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, which quantify the
<italic>relative importance</italic>
of a given node and intersection pair compared to the overall network,
<disp-formula id="pone.0134508.e017">
<alternatives>
<graphic xlink:href="pone.0134508.e017.jpg" id="pone.0134508.e017g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M17">
<mml:mrow>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mspace width="2.em"></mml:mspace>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</alternatives>
<label>(1)</label>
</disp-formula>
are extremely stable as shown in
<xref ref-type="table" rid="pone.0134508.t001">Table 1</xref>
. We have split our data set into
<italic>n</italic>
<sub>
<italic>τ</italic>
</sub>
equal time intervals (on daily, weekly and four-week bases) and computed the relative dispersion of the values accumulated over the entire data set
<inline-formula id="pone.0134508.e018">
<alternatives>
<graphic xlink:href="pone.0134508.e018.jpg" id="pone.0134508.e018g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M18">
<mml:mrow>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>τ</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mspace width="0.167em"></mml:mspace>
<mml:mtext mathvariant="normal">year</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
around the measured values
<inline-formula id="pone.0134508.e019">
<alternatives>
<graphic xlink:href="pone.0134508.e019.jpg" id="pone.0134508.e019g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M19">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mi>τ</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
,
<disp-formula id="pone.0134508.e020">
<alternatives>
<graphic xlink:href="pone.0134508.e020.jpg" id="pone.0134508.e020g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M20">
<mml:mrow>
<mml:mi>ε</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>τ</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>⟩</mml:mo>
</mml:mrow>
<mml:mi>τ</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo>⟩</mml:mo>
</mml:mrow>
<mml:mi>τ</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(2)</label>
</disp-formula>
where
<italic>τ</italic>
<sub>
<italic>max</italic>
</sub>
is the time at the end of the full observation period and the averages are performed over all the time slices of length
<italic>τ</italic>
. The graph-average of
<italic>ɛ</italic>
is very close to zero and highly concentrated around this value for all the time windows considered (with a standard deviation of 13% in the worst case, decaying as sampling time is increased).</p>
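<p>Eqs (1) and (2) can be made concrete with a short Python sketch. It assumes that the data set has already been split into non-empty time slices, each given as a list of (origin, destination) trips; the function names and the toy slices are hypothetical. Treating a pair that is absent from a slice as contributing zero to the time average is also an assumption of this sketch.</p>

```python
from collections import Counter
from statistics import mean

def relative_weights(trips):
    """Eq. (1): p_ij = t_ij / T for a non-empty list of (origin, destination) trips."""
    counts = Counter(trips)
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

def relative_dispersion(slices):
    """Eq. (2): relative deviation of the fully accumulated p_ij from its time average."""
    full = relative_weights([trip for s in slices for trip in s])
    per_slice = [relative_weights(s) for s in slices]
    eps = {}
    for pair, p_full in full.items():
        # Pairs absent from a slice contribute p = 0 to the average (an assumption).
        p_avg = mean(p.get(pair, 0.0) for p in per_slice)
        eps[pair] = (p_full - p_avg) / p_avg
    return eps

# Hypothetical toy slices of unequal size.
slices = [
    [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A")],
    [("A", "B"), ("B", "C")],
]
eps = relative_dispersion(slices)
```

<p>Only pairs observed at least once enter the result, so the time average in the denominator is never zero. The same two functions apply to node strengths if each trip is replaced by its origin (or destination) label alone.</p>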
<table-wrap id="pone.0134508.t001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0134508.t001</object-id>
<label>Table 1</label>
<caption>
<title>Variability of node and node-pair statistics (incoming
<inline-formula id="pone.0134508.e021">
<alternatives>
<graphic id="pone.0134508.e021g" xlink:href="pone.0134508.e021"></graphic>
<mml:math id="M21">
<mml:mrow>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
and outgoing
<inline-formula id="pone.0134508.e022">
<alternatives>
<graphic id="pone.0134508.e022g" xlink:href="pone.0134508.e022"></graphic>
<mml:math id="M22">
<mml:mrow>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
relative strength, and relative number of trips between intersections
<inline-formula id="pone.0134508.e023">
<alternatives>
<graphic id="pone.0134508.e023g" xlink:href="pone.0134508.e023"></graphic>
<mml:math id="M23">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
) using different temporal granularity of the data set averaged over the full network (nodes and pairs of nodes respectively) compared to final yearly values.</title>
<p>Time units whose total number of trips lies at least two standard deviations away from the adjusted yearly mean have been excluded from the average to account for seasonal variations (less than 5% of the data in the worst case). For the pairs of intersections
<italic>ij</italic>
, only pairs appearing in at least one time slice have been considered for the average. The fraction of data with absolute relative error larger than two standard deviations is also reported as
<italic>Outliers</italic>
.</p>
</caption>
<alternatives>
<graphic id="pone.0134508.t001g" xlink:href="pone.0134508.t001"></graphic>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="center" rowspan="1" colspan="1">Time windows
<italic>n</italic>
<sub>
<italic>τ</italic>
</sub>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e024">
<alternatives>
<graphic id="pone.0134508.e024g" xlink:href="pone.0134508.e024"></graphic>
<mml:math id="M24">
<mml:mrow>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
:
<inline-formula id="pone.0134508.e025">
<alternatives>
<graphic id="pone.0134508.e025g" xlink:href="pone.0134508.e025"></graphic>
<mml:math id="M25">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>ɛ</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
(± std)</th>
<th align="center" rowspan="1" colspan="1">Outliers</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e026">
<alternatives>
<graphic id="pone.0134508.e026g" xlink:href="pone.0134508.e026"></graphic>
<mml:math id="M26">
<mml:mrow>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
:
<inline-formula id="pone.0134508.e027">
<alternatives>
<graphic id="pone.0134508.e027g" xlink:href="pone.0134508.e027"></graphic>
<mml:math id="M27">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>ɛ</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
(± std)</th>
<th align="center" rowspan="1" colspan="1">Outliers</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e028">
<alternatives>
<graphic id="pone.0134508.e028g" xlink:href="pone.0134508.e028"></graphic>
<mml:math id="M28">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
:
<inline-formula id="pone.0134508.e029">
<alternatives>
<graphic id="pone.0134508.e029g" xlink:href="pone.0134508.e029"></graphic>
<mml:math id="M29">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>ɛ</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
(± std)</th>
<th align="center" rowspan="1" colspan="1">Outliers</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">347 (1 day period)</td>
<td align="center" rowspan="1" colspan="1">−0.009 ± 0.056</td>
<td align="char" char="." rowspan="1" colspan="1">0.039</td>
<td align="center" rowspan="1" colspan="1">−0.037 ± 0.095</td>
<td align="char" char="." rowspan="1" colspan="1">0.083</td>
<td align="center" rowspan="1" colspan="1">0.018 ± 0.127</td>
<td align="char" char="." rowspan="1" colspan="1">0.025</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">51 (1 week period)</td>
<td align="center" rowspan="1" colspan="1">−0.011 ± 0.054</td>
<td align="char" char="." rowspan="1" colspan="1">0.041</td>
<td align="center" rowspan="1" colspan="1">−0.040 ± 0.095</td>
<td align="char" char="." rowspan="1" colspan="1">0.088</td>
<td align="center" rowspan="1" colspan="1">0.014 ± 0.100</td>
<td align="char" char="." rowspan="1" colspan="1">0.026</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">13 (4 weeks period)</td>
<td align="center" rowspan="1" colspan="1">−0.011 ± 0.054</td>
<td align="char" char="." rowspan="1" colspan="1">0.041</td>
<td align="center" rowspan="1" colspan="1">−0.040 ± 0.095</td>
<td align="char" char="." rowspan="1" colspan="1">0.087</td>
<td align="center" rowspan="1" colspan="1">0.012 ± 0.061</td>
<td align="char" char="." rowspan="1" colspan="1">0.025</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
<p>
<xref ref-type="fig" rid="pone.0134508.g002">Fig 2</xref>
shows the correlation between the relative error and the relative importance of nodes and links. The fact that
<inline-formula id="pone.0134508.e030">
<alternatives>
<graphic xlink:href="pone.0134508.e030.jpg" id="pone.0134508.e030g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M30">
<mml:mrow>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, coupled with second-order seasonality effects, induces an uneven distribution of errors: an overestimation of some values in the collection {⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩} will necessarily induce an underestimation of some other values in the collection. Despite this issue, we can clearly see that the vast majority of the mass of relative errors is concentrated around zero (see the background points in
<xref ref-type="fig" rid="pone.0134508.g002">Fig 2</xref>
).</p>
<fig id="pone.0134508.g002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0134508.g002</object-id>
<label>Fig 2</label>
<caption>
<title>The effect of sampling on node and intersection pair temporal stability.</title>
<p>Correlation between measured values of intersection pair
<inline-formula id="pone.0134508.e031">
<alternatives>
<graphic id="pone.0134508.e031g" xlink:href="pone.0134508.e031"></graphic>
<mml:math id="M31">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
(a) and node statistics
<inline-formula id="pone.0134508.e032">
<alternatives>
<graphic id="pone.0134508.e032g" xlink:href="pone.0134508.e032"></graphic>
<mml:math id="M32">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
(b),
<inline-formula id="pone.0134508.e033">
<alternatives>
<graphic id="pone.0134508.e033g" xlink:href="pone.0134508.e033"></graphic>
<mml:math id="M33">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msubsup>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
(c) for different time slices and relative dispersion around the mean (
<xref ref-type="disp-formula" rid="pone.0134508.e020">Eq (2)</xref>
) for the yearly aggregated data. Error bars represent standard deviations on the log-binned data. Raw data for the daily case are shown in the background. For visual clarity, panel (a) shows only a random subsample of 1/1000 of the original points.</p>
</caption>
<graphic xlink:href="pone.0134508.g002"></graphic>
</fig>
<p>To some extent, we would expect the node strength to be stable over time, since the number of trips received and generated at each location depends on parameters such as population density, number of points of interest present in a given location, etc. [
<xref rid="pone.0134508.ref032" ref-type="bibr">32</xref>
], whose evolution is governed by much slower dynamics than the mobility process studied here. In addition, time stability is also observed at the trip level between intersections, although in this case the analysis displays a higher variability around the mean (see
<xref ref-type="fig" rid="pone.0134508.g002">Fig 2A</xref>
and
<xref ref-type="table" rid="pone.0134508.t001">Table 1</xref>
). This higher variability can be explained by the different sampling processes: while the percentage of active nodes becomes extremely stable when only a very small number of days is considered (
<xref ref-type="fig" rid="pone.0134508.g001">Fig 1B</xref>
), this is not the case for the total number of active edges, because the sampling of edge-specific attributes needs to grow as
<italic>L</italic>
∼
<italic>N</italic>
<sup>2</sup>
to achieve a comparable level of accuracy.</p>
</sec>
<sec id="sec004">
<title>The anatomy of urban flows</title>
<p>The results on temporal stability indicate that any model aiming to reproduce human mobility at urban scales should exhibit the regularities reported above. Having seen that the mobility patterns are stable across time, we further need to understand the main topological aspects of the aggregated static picture in order to select the main features our methodology should aim to reproduce. The most relevant topological aspect of the mobility network is the highly skewed concentration of taxi pick-ups and drop-offs across the city, which gives rise to heavy-tailed node-strength distributions,
<xref ref-type="fig" rid="pone.0134508.g003">Fig 3A</xref>
(other general metrics are reported in the
<xref ref-type="sec" rid="sec009">methods</xref>
section). Therefore, to test whether this property alone already captures the essential features of the mobility network, we consider a null model that randomizes the network while keeping the strength of each node constant: the multi-edge configuration model (see
<xref ref-type="sec" rid="sec009">methods</xref>
section or [
<xref rid="pone.0134508.ref030" ref-type="bibr">30</xref>
] for more details). When comparing the null model with the data, we observe significant deviations showing the importance of certain places or nodes in the network. In other words, even the strong heterogeneity of the distribution of strengths cannot account for the skewness of the weight distribution: additional factors add many more trips between some connections than would be expected under random conditions. The empirical link weight distribution, blue line in
<xref ref-type="fig" rid="pone.0134508.g003">Fig 3B</xref>
, is more skewed than under a random allocation of trips (Configuration model), dashed line, and the connections at the node level show a clear assortative correlation instead of a flat profile, see
<xref ref-type="fig" rid="pone.0134508.g003">Fig 3D</xref>
. This occurs despite the fact that the average number of trips between the busiest locations can be characterized by the relation
<inline-formula id="pone.0134508.e034">
<alternatives>
<graphic xlink:href="pone.0134508.e034.jpg" id="pone.0134508.e034g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M34">
<mml:mrow>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
<mml:mo>∼</mml:mo>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
with a reasonable accuracy, see
<xref ref-type="fig" rid="pone.0134508.g003">Fig 3C</xref>
, coinciding with the configuration model. These insights indicate that the distribution of node strengths across the city has a strong influence on the topology of the network and must be taken into account in any model, but that extra ingredients are needed to fully account for the observed pattern of connections.</p>
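<p>As a minimal illustration of the null model (a sketch, not the implementation used in the analysis), the expected number of trips under a multi-edge configuration model can be taken as the product of the origin out-strength and the destination in-strength divided by the total number of trips. The function name, the toy trip counts, and the choice to keep self-loops are assumptions made for this example.</p>

```python
def configuration_model_expectation(trips):
    """Expected trips per ordered pair when only node strengths are preserved."""
    total = sum(trips.values())
    s_out, s_in = {}, {}
    for (origin, destination), count in trips.items():
        # Out-strength: trips leaving a node; in-strength: trips arriving at it.
        s_out[origin] = s_out.get(origin, 0) + count
        s_in[destination] = s_in.get(destination, 0) + count
    # Self-loops are kept here for simplicity (an assumption of this sketch).
    return {(i, j): s_out[i] * s_in[j] / total for i in s_out for j in s_in}


# Hypothetical toy network with three intersections.
trips = {("a", "b"): 8, ("b", "a"): 1, ("a", "c"): 1}
expected = configuration_model_expectation(trips)
```

<p>In this toy case the pair (a, b) carries 8 observed trips against an expectation of 7.2, the kind of excess over a random allocation that the comparison with the null model is designed to expose.</p>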
<fig id="pone.0134508.g003" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0134508.g003</object-id>
<label>Fig 3</label>
<caption>
<title>Empirical network features of the taxi multi-edge mobility network.</title>
<p>
<bold>(a)</bold>
Distribution of incoming and outgoing strengths
<italic>s</italic>
.
<bold>(b)</bold>
Existing edge weight complementary cumulative distribution function compared with a configuration model [
<xref rid="pone.0134508.ref030" ref-type="bibr">30</xref>
] and the supersampled model with
<italic>f</italic>
= 0.1.
<bold>(c)</bold>
Weighted average neighbor out-strength
<inline-formula id="pone.0134508.e035">
<alternatives>
<graphic id="pone.0134508.e035g" xlink:href="pone.0134508.e035"></graphic>
<mml:math id="M35">
<mml:mrow>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mi>w</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
as function of node out-strength (similar results for incoming strengths not shown).
<bold>(d)</bold>
Graph-average existing weight as a function of the product of outgoing and incoming strengths of origin and destination for a single instance of the network. Points represent mean values and error bars represent standard deviations, both computed using log-binning. Distributions are also shown using log-binning, and ensemble results are averaged over 100 repetitions of the two models.</p>
</caption>
<graphic xlink:href="pone.0134508.g003"></graphic>
</fig>
</sec>
<sec id="sec005">
<title>A flexible model to reproduce human mobility</title>
<p>A general maximum entropy based theory for model generation (see [
<xref rid="pone.0134508.ref027" ref-type="bibr">27</xref>
,
<xref rid="pone.0134508.ref029" ref-type="bibr">29</xref>
,
<xref rid="pone.0134508.ref033" ref-type="bibr">33</xref>
] for extended discussion and references) allows us to efficiently exploit both the observed temporal stability features and the heterogeneous topological properties of the network to solve the
<italic>supersampling</italic>
problem at hand. It starts with the assumption that each intersection pair in the network is allocated a constant fraction
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
of the total sampling (
<xref ref-type="disp-formula" rid="pone.0134508.e017">Eq (1)</xref>
). Under this condition, and assuming that the mobility process is driven by some general constraints, such as population density or budget, it can be proved that for any desired level of sampling
<italic>T</italic>
<sub>
<italic>d</italic>
</sub>
the statistics of trips for each pair of nodes can be well described by a set of
<italic>L</italic>
independent Poisson processes with mean ⟨
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
⟩ =
<italic>T</italic>
<sub>
<italic>d</italic>
</sub>
⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩, with
<disp-formula id="pone.0134508.e036">
<alternatives>
<graphic xlink:href="pone.0134508.e036.jpg" id="pone.0134508.e036g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M36">
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mo>≡</mml:mo>
<mml:munder>
<mml:mo movablelimits="true" form="prefix">lim</mml:mo>
<mml:mrow>
<mml:mi>τ</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi>∞</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mo>⟨</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>⟩</mml:mo>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo movablelimits="true" form="prefix">lim</mml:mo>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi>∞</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mo>⟨</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>T</mml:mi>
</mml:mfrac>
<mml:mo>⟩</mml:mo>
<mml:mspace width="2.em"></mml:mspace>
<mml:mi>T</mml:mi>
<mml:mo>≡</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(3)</label>
</disp-formula>
</p>
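<p>The Poisson picture above can be sketched in a few lines, assuming the probabilities of all intersection pairs were known. This is only an illustration: the helper names, the toy probabilities and the target sampling of 50 trips are hypothetical, and the sampler uses Knuth's multiplication method, which is adequate only for small means.</p>

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (small means only)."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def supersample(p, total_trips, rng):
    """One independent Poisson draw per pair, with mean total_trips * p[pair]."""
    return {pair: poisson_draw(total_trips * prob, rng) for pair, prob in p.items()}


# Hypothetical probabilities for three intersection pairs (they sum to 1).
p = {("a", "b"): 0.6, ("b", "c"): 0.3, ("c", "a"): 0.1}
rng = random.Random(0)
trips = supersample(p, 50, rng)
```

<p>Because the draws are independent, the total number of generated trips fluctuates around the requested sampling level rather than matching it exactly.</p>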
<p>Following the theory, if we knew the
<italic>real</italic>
values of the collection {⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩},
<italic>supersampling</italic>
a mobility data set would reduce to the trivial operation of generating
<italic>L</italic>
independent Poisson processes using the provided proportionality rule. Therefore, the problem now reduces to inferring the collection of values {
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
} from an available data set. We shall assume that only one snapshot of the aggregated mobility network is available to this end (thus assuming no temporal information is available on the trip data) as is usually the case in mobility studies. The maximum likelihood estimation of such values corresponds to
<disp-formula id="pone.0134508.e037">
<alternatives>
<graphic xlink:href="pone.0134508.e037.jpg" id="pone.0134508.e037g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M37">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(4)</label>
</disp-formula>
There is, however, a practical issue with this formula related to the normalization condition for the random variables {
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
} and the presence of empty intersection pairs in the available observed data. For such intersections, using the formulas above, we have that
<inline-formula id="pone.0134508.e038">
<alternatives>
<graphic xlink:href="pone.0134508.e038.jpg" id="pone.0134508.e038g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M38">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, whereas their
<italic>real</italic>
⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩ value is unknown but fulfils
<inline-formula id="pone.0134508.e039">
<alternatives>
<graphic xlink:href="pone.0134508.e039.jpg" id="pone.0134508.e039g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M39">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>∼</mml:mo>
<mml:msup>
<mml:mover>
<mml:mi>T</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
. Since by definition both collections {
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
} and
<inline-formula id="pone.0134508.e040">
<alternatives>
<graphic xlink:href="pone.0134508.e040.jpg" id="pone.0134508.e040g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M40">
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
need to be normalized, and denoting the set of active edges as
<inline-formula id="pone.0134508.e041">
<alternatives>
<graphic xlink:href="pone.0134508.e041.jpg" id="pone.0134508.e041g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M41">
<mml:mrow>
<mml:mo>ℰ</mml:mo>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="true">{</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>></mml:mo>
<mml:mn>0</mml:mn>
<mml:mo stretchy="true">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, we have,
<disp-formula id="pone.0134508.e042">
<alternatives>
<graphic xlink:href="pone.0134508.e042.jpg" id="pone.0134508.e042g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M42">
<mml:mrow>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo>ℰ</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mo>+</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∉</mml:mo>
<mml:mo>ℰ</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mspace width="2.em"></mml:mspace>
<mml:mspace width="2.em"></mml:mspace>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo>ℰ</mml:mo>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(5)</label>
</disp-formula>
from which we see that
<inline-formula id="pone.0134508.e043">
<alternatives>
<graphic xlink:href="pone.0134508.e043.jpg" id="pone.0134508.e043g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M43">
<mml:mrow>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:mo>∈</mml:mo>
<mml:mo>ℰ</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo>≤</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:mo>∈</mml:mo>
<mml:mo>ℰ</mml:mo>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
.</p>
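<p>The normalization issue can be made concrete with a small sketch. It assumes a hypothetical three-node network without self-loops and hypothetical trip counts; none of these values come from the taxi data set.</p>

```python
def max_likelihood_probabilities(trips, all_pairs):
    """Proportionality rule: observed count of a pair divided by the total count."""
    total = sum(trips.values())
    return {pair: trips.get(pair, 0) / total for pair in all_pairs}


# Hypothetical observation: only 3 of the 6 possible ordered pairs are active.
observed = {("a", "b"): 5, ("b", "a"): 3, ("a", "c"): 2}
all_pairs = [(i, j) for i in "abc" for j in "abc" if i != j]

p_hat = max_likelihood_probabilities(observed, all_pairs)
active = [pair for pair in all_pairs if observed.get(pair, 0) > 0]
empty = [pair for pair in all_pairs if pair not in active]

# All probability mass sits on the active pairs; empty pairs are assigned zero.
mass_on_active = sum(p_hat[pair] for pair in active)
mass_on_empty = sum(p_hat[pair] for pair in empty)
```

<p>Any trip that would occur on an empty pair in a larger sample is therefore impossible under the plain proportionality rule, and the active pairs are overestimated to compensate.</p>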
<p>Hence, in general we cannot consider the empirically observed probabilities
<inline-formula id="pone.0134508.e044">
<alternatives>
<graphic xlink:href="pone.0134508.e044.jpg" id="pone.0134508.e044g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M44">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
as a good proxy for the
<italic>real</italic>
values of
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
unless the number of empty intersection pairs is very small. Given that the percentage of active edges (pairs of nodes for which
<inline-formula id="pone.0134508.e045">
<alternatives>
<graphic xlink:href="pone.0134508.e045.jpg" id="pone.0134508.e045g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M45">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>></mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
) is a very slowly increasing function of the sampling, see
<xref ref-type="fig" rid="pone.0134508.g001">Fig 1B</xref>
, inferring directly the set of probabilities {
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
} empirically would require an enormous data set: even with over a year of data, only roughly 40% of edges are covered.</p>
<p>For the reasons given above, a simple proportionality rule using
<xref ref-type="disp-formula" rid="pone.0134508.e037">Eq (4)</xref>
is not a good
<italic>supersampling</italic>
strategy, especially for skewed and sparse data sets.</p>
</sec>
<sec id="sec006">
<title>Supersampling urban trips</title>
<p>Based on the previous discussion, we now present the methodology for
<italic>supersampling</italic>
an urban mobility data set that consists in inferring the collection of {⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩} values from a set of aggregated empirical trips
<inline-formula id="pone.0134508.e046">
<alternatives>
<graphic xlink:href="pone.0134508.e046.jpg" id="pone.0134508.e046g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M46">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">{</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
. We should do so bearing in mind that, although the maximum likelihood formula in
<xref ref-type="disp-formula" rid="pone.0134508.e037">Eq (4)</xref>
cannot be directly used for the empty intersection pairs in the data, it does perform well for non-empty intersections (see
<xref ref-type="fig" rid="pone.0134508.g002">Fig 2</xref>
).</p>
<p>The maximum entropy based framework naturally allows such a procedure, since it can combine any constraint-driven model with the rich information encoded in the trip sample. We propose a method to predict trips based on the theory mentioned earlier: taking the
<italic>L</italic>
intersection pairs (whether or not they are active edges in the data set), we split them into two parts: the subgroup of
<italic>trusted</italic>
trips defined as
<inline-formula id="pone.0134508.e047">
<alternatives>
<graphic xlink:href="pone.0134508.e047.jpg" id="pone.0134508.e047g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M47">
<mml:mrow>
<mml:mo>𝓠</mml:mo>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="true">{</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mtext>min</mml:mtext>
</mml:msub>
<mml:mo stretchy="true">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
and its complementary part 𝓠
<sup>
<italic>C</italic>
</sup>
. The value
<italic>t</italic>
<sub>min</sub>
is a threshold modelling a minimal statistical accuracy that depends on the amount of data available, and which may be set to 1 in practical applications. We keep the proportionality rule
<inline-formula id="pone.0134508.e048">
<alternatives>
<graphic xlink:href="pone.0134508.e048.jpg" id="pone.0134508.e048g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M48">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>∝</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
for the
<italic>trusted</italic>
trips, while for the remaining trips we apply a doubly constrained exponential gravity model (other maximum entropy models [
<xref rid="pone.0134508.ref029" ref-type="bibr">29</xref>
] could also perform well as long as they preserve the outgoing and incoming strengths). In other words, we generate a collection of {⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩} values,
<disp-formula id="pone.0134508.e049">
<alternatives>
<graphic xlink:href="pone.0134508.e049.jpg" id="pone.0134508.e049g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M49">
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo>{</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
</mml:mfrac>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo>𝓠</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>γ</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msup>
<mml:mo>𝓠</mml:mo>
<mml:mi>C</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo></mml:mo>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(6)</label>
</disp-formula>
The values
<italic>γ</italic>
and {
<italic>x</italic>
<sub>
<italic>i</italic>
</sub>
,
<italic>y</italic>
<sub>
<italic>j</italic>
</sub>
} are the 2
<italic>N</italic>
+ 1 Lagrange multipliers satisfying the following equations
<disp-formula id="pone.0134508.e050">
<alternatives>
<graphic xlink:href="pone.0134508.e050.jpg" id="pone.0134508.e050g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M50">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>-</mml:mo>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo>𝓠</mml:mo>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mo>𝓠</mml:mo>
<mml:mi>C</mml:mi>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>γ</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>-</mml:mo>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo>𝓠</mml:mo>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mo>𝓠</mml:mo>
<mml:mi>C</mml:mi>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>γ</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>C</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>-</mml:mo>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo>𝓠</mml:mo>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>T</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mo>𝓠</mml:mo>
<mml:mi>C</mml:mi>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>γ</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(7)</label>
</disp-formula>
where
<inline-formula id="pone.0134508.e051">
<alternatives>
<graphic xlink:href="pone.0134508.e051.jpg" id="pone.0134508.e051g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M51">
<mml:mrow>
<mml:mover>
<mml:mi>C</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
is the total Euclidean distance of the observed trips (
<italic>c</italic>
<sub>
<italic>ij</italic>
</sub>
stands for the distance between intersections
<italic>i</italic>
and
<italic>j</italic>
). Note that, by construction, the values are properly normalized, i.e.,
<inline-formula id="pone.0134508.e052">
<alternatives>
<graphic xlink:href="pone.0134508.e052.jpg" id="pone.0134508.e052g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M52">
<mml:mrow>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo>𝓠</mml:mo>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msup>
<mml:mo>𝓠</mml:mo>
<mml:mi>C</mml:mi>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>γ</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
.</p>
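<p>As an illustration of this normalization, the following minimal Python sketch assembles the pair probabilities from the trusted empirical values and the exponential form used for the remaining pairs, and reports their total mass. The function name <italic>assemble_probabilities</italic> and the toy numbers are assumptions made for this example only; in the method itself the factors <italic>x</italic><sub><italic>i</italic></sub>, <italic>y</italic><sub><italic>j</italic></sub> and <italic>γ</italic> come from solving the constraint equations, a step that is not shown here.</p>
<preformat>
```python
import numpy as np


def assemble_probabilities(p_hat, trusted, x, y, cost, gamma):
    """Return the pair probabilities and their total mass.

    p_hat   : empirical probabilities (used only where trusted is True)
    trusted : boolean mask of the trusted pairs (the set Q)
    x, y    : per-node origin and destination factors (assumed given)
    cost    : matrix of distances c_ij
    gamma   : distance decay parameter (assumed given)
    """
    p_hat = np.asarray(p_hat, dtype=float)
    trusted = np.asarray(trusted, dtype=bool)
    # Exponential form x_i * y_j * exp(-gamma * c_ij) for every pair
    model = np.outer(x, y) * np.exp(-gamma * np.asarray(cost, dtype=float))
    # Trusted pairs keep the empirical value, the rest use the model form
    p = np.where(trusted, p_hat, model)
    return p, float(p.sum())


# Toy example with two nodes and one trusted pair (hypothetical numbers)
p_hat = [[0.5, 0.0], [0.0, 0.0]]
trusted = [[True, False], [False, False]]
cost = [[0.0, 1.0], [1.0, 0.0]]
p, mass = assemble_probabilities(p_hat, trusted, x=[1.0, 1.0],
                                 y=[1.0 / 6, 1.0 / 6], cost=cost, gamma=0.0)
# The mass is close to 1 only because these toy parameters were chosen to normalize
print(mass)
```
</preformat>
<p>With parameters that do not solve the constraint equations the reported mass differs from one, which makes the sketch a quick consistency check on a fitted parameter set.</p>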
<p>The model presented earlier needs to deal with the issue of inactive nodes that do not appear in the original data due to poor sampling, i.e., nodes for which
<inline-formula id="pone.0134508.e053">
<alternatives>
<graphic xlink:href="pone.0134508.e053.jpg" id="pone.0134508.e053g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M53">
<mml:mrow>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
either incoming or outgoing for some observation period
<italic>τ</italic>
. This issue has a minor impact in our case due to the previously observed rapid coverage of the number of active nodes (the number of inactive nodes is negligible after accumulating very few days of data, see
<xref ref-type="fig" rid="pone.0134508.g001">Fig 1B</xref>
). In any case, it can be solved easily: given that the geographic positions of the nodes are available, we could always artificially assign a certain relative strength to the nodes not present in the data using complementary call detail records [
<xref rid="pone.0134508.ref034" ref-type="bibr">34</xref>
], census data or points of interest (POI) data, or assign them some values according to a chosen distribution depending on the data at hand. For simplicity, in our case we have chosen to keep only the nodes present in the original data.</p>
</sec>
<sec id="sec007">
<title>Assessing the quality of the supersampling methodology</title>
<p>To test the
<italic>supersampling</italic>
methodology, we select a timespan of our data set corresponding to an observation period of
<italic>τ</italic>
= 1 month (February 2011), from which we further randomly subsample different fractions
<italic>f</italic>
used as training sets to compute {⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩} applying Eqs (
<xref ref-type="disp-formula" rid="pone.0134508.e049">6</xref>
) and (
<xref ref-type="disp-formula" rid="pone.0134508.e050">7</xref>
). We then reconstruct the OD using the proportionality rule {⟨
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
⟩ =
<italic>T</italic>
<sub>
<italic>d</italic>
</sub>
⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩} for both the complete and reduced data set,
<italic>T</italic>
<sub>
<italic>d</italic>
</sub>
=
<italic>T</italic>
(
<italic>τ</italic>
′ = 1 year) and
<italic>T</italic>
<sub>
<italic>d</italic>
</sub>
=
<italic>T</italic>
(
<italic>τ</italic>
= 1 month). Finally, we compare the model predictions with the set of empirically observed trips in these periods.</p>
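<p>The validation loop described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the helper names (<italic>subsample_trips</italic>, <italic>od_counts</italic>, <italic>scale_up</italic>) are hypothetical, the trips are synthetic, and the plain empirical frequencies of the subsample stand in for the model probabilities that the method obtains from Eqs (6) and (7).</p>
<preformat>
```python
import numpy as np


def subsample_trips(trips, f, rng):
    """Draw a random fraction f of the trips without replacement."""
    trips = np.asarray(trips)
    n_keep = int(round(f * len(trips)))
    idx = rng.choice(len(trips), size=n_keep, replace=False)
    return trips[idx]


def od_counts(trips, n_nodes):
    """Count trips per (origin, destination) pair."""
    counts = np.zeros((n_nodes, n_nodes))
    trips = np.asarray(trips)
    # np.add.at accumulates correctly when the same pair appears several times
    np.add.at(counts, (trips[:, 0], trips[:, 1]), 1)
    return counts


def scale_up(p, total_trips):
    """Proportionality rule: expected trips = T_d * p_ij."""
    return total_trips * np.asarray(p, dtype=float)


# Synthetic trips between 5 nodes (hypothetical data, not the taxi records)
rng = np.random.default_rng(0)
trips = rng.integers(0, 5, size=(1000, 2))
sample = subsample_trips(trips, f=0.1, rng=rng)
counts = od_counts(sample, n_nodes=5)
p = counts / counts.sum()  # stand-in for the model probabilities
t_expected = scale_up(p, total_trips=len(trips))
```
</preformat>
<p>The reconstructed matrix <italic>t_expected</italic> can then be compared entry by entry with the counts of the full set of trips.</p>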
<p>To do so, we need metrics that quantify the resemblance between model predictions and recorded values. Commonly used indicators are the Sørensen-Dice common part of commuters (CPC) value [
<xref rid="pone.0134508.ref035" ref-type="bibr">35</xref>
] or the linear fit of ⟨
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
⟩
<sup>
<italic>model</italic>
</sup>
vs
<inline-formula id="pone.0134508.e054">
<alternatives>
<graphic xlink:href="pone.0134508.e054.jpg" id="pone.0134508.e054g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M54">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
[
<xref rid="pone.0134508.ref020" ref-type="bibr">20</xref>
]. However, the skewness of the observed trip distribution poses a challenge to these indicators: the variability of low-valued trips induces notable instabilities in both, and, being single numbers, they provide only a limited picture of how precisely a given model reproduces empirical results.</p>
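<p>For reference, the standard form of the CPC indicator can be sketched in a few lines of Python. The function name is an assumption made for this example, and the code implements the usual Sørensen-Dice definition, twice the sum of the element-wise minima divided by the sum of both totals, not the modified index proposed in this paper.</p>
<preformat>
```python
import numpy as np


def common_part_of_commuters(t_model, t_obs):
    """Standard Sorensen-Dice CPC between two trip matrices."""
    t_model = np.asarray(t_model, dtype=float)
    t_obs = np.asarray(t_obs, dtype=float)
    total = t_model.sum() + t_obs.sum()
    if total == 0:
        # Two empty matrices share no trips; return 0 by convention
        return 0.0
    # Trips common to both matrices, counted pair by pair
    shared = np.minimum(t_model, t_obs).sum()
    return float(2.0 * shared / total)


# Hypothetical 2-node example
cpc = common_part_of_commuters([[0, 3], [1, 0]], [[0, 1], [3, 0]])
```
</preformat>
<p>The value ranges from 0 (no trips in common) to 1 (identical matrices); because it collapses the whole matrix into one number, it is sensitive to the issues discussed above.</p>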
<p>To overcome these issues, we propose a slight modification to both the Sørensen CPC index and the coefficient of determination
<italic>R</italic>
<sup>2</sup>
from the fit ⟨
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
⟩
<sup>
<italic>model</italic>
</sup>
vs
<inline-formula id="pone.0134508.e055">
<alternatives>
<graphic xlink:href="pone.0134508.e055.jpg" id="pone.0134508.e055g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M55">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
(see
<xref ref-type="sec" rid="sec009">methods</xref>
). We additionally propose a number of network metrics to assess the quality of human mobility models at the finer, topological scale: the unweighted degrees of the nodes
<inline-formula id="pone.0134508.e056">
<alternatives>
<graphic xlink:href="pone.0134508.e056.jpg" id="pone.0134508.e056g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M56">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, the weighted neighbor strength correlation
<inline-formula id="pone.0134508.e057">
<alternatives>
<graphic xlink:href="pone.0134508.e057.jpg" id="pone.0134508.e057g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M57">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mi>w</mml:mi>
</mml:msubsup>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, the existing trip distribution
<italic>P</italic>
(
<italic>t</italic>
), and the number of existing trips as a function of the origin-destination strength product
<inline-formula id="pone.0134508.e058">
<alternatives>
<graphic xlink:href="pone.0134508.e058.jpg" id="pone.0134508.e058g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M58">
<mml:mrow>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
.</p>
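<p>The node-level quantities behind these metrics can be computed directly from an OD matrix, as in the following Python sketch. The function names are hypothetical, and the weighted neighbor strength is written here under an assumed definition (the trip-weighted mean of the incoming strength of the destinations reached from each origin); the exact definitions used in this paper are those given in its methods.</p>
<preformat>
```python
import numpy as np


def out_degree(t):
    """Number of distinct destinations reached from each origin."""
    return (np.asarray(t) > 0).sum(axis=1)


def out_strength(t):
    """Total number of trips leaving each origin."""
    return np.asarray(t, dtype=float).sum(axis=1)


def weighted_neighbor_strength(t):
    """Trip-weighted mean incoming strength of each origin's destinations.

    Assumed definition: sum_j t_ij * s_in_j / s_out_i, set to 0 for
    origins without outgoing trips.
    """
    t = np.asarray(t, dtype=float)
    s_out = t.sum(axis=1)
    s_in = t.sum(axis=0)
    num = t @ s_in
    return np.divide(num, s_out, out=np.zeros_like(num), where=s_out > 0)


# Hypothetical 2-node OD matrix
t = [[0, 2], [1, 0]]
k_out = out_degree(t)
s_out = out_strength(t)
s_nn = weighted_neighbor_strength(t)
```
</preformat>
<p>Averaging such node values within bins of the observed strength gives curves of the kind compared between model and data.</p>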
<p>The results for the supersampling method are summarized in
<xref ref-type="table" rid="pone.0134508.t002">Table 2</xref>
and a specific example for
<italic>f</italic>
= 0.1 (reconstruction using only 10% of the original data of the monthly data set compared to yearly data) is shown in
<xref ref-type="fig" rid="pone.0134508.g004">Fig 4</xref>
for the different indicators proposed. For comparison, results using both a configuration model and the empirical values
<inline-formula id="pone.0134508.e059">
<alternatives>
<graphic xlink:href="pone.0134508.e059.jpg" id="pone.0134508.e059g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M59">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">{</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
obtained from the complete data set are also shown.</p>
<table-wrap id="pone.0134508.t002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0134508.t002</object-id>
<label>Table 2</label>
<caption>
<title>Parameters for the validation of the methodology.</title>
<p>See details on each indicator in the main text and in the methods section. The number of
<italic>trusted</italic>
trips fed to the model relative to the entire number of generated trips
<inline-formula id="pone.0134508.e060">
<alternatives>
<graphic id="pone.0134508.e060g" xlink:href="pone.0134508.e060"></graphic>
<mml:math id="M60">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mo>𝓠</mml:mo>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo>𝓠</mml:mo>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mi>d</mml:mi>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>τ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
is reported in column 3. The
<italic>Supersampled</italic>
models with different fractions
<italic>f</italic>
are only generated using subsamples of the training set (1 month observation period).
<italic>Empirical</italic>
stands for the model generated using the empirical probabilities
<inline-formula id="pone.0134508.e061">
<alternatives>
<graphic id="pone.0134508.e061g" xlink:href="pone.0134508.e061"></graphic>
<mml:math id="M61">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>p</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
(
<xref ref-type="disp-formula" rid="pone.0134508.e017">Eq (1)</xref>
) of the full data set and
<italic>Configuration</italic>
stands for the multi-edge configuration model applied to the full data set.</p>
</caption>
<alternatives>
<graphic id="pone.0134508.t002g" xlink:href="pone.0134508.t002"></graphic>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="center" rowspan="1" colspan="1"></th>
<th align="center" colspan="4" rowspan="1">
<italic>τ</italic>
= 1 month</th>
<th align="center" colspan="3" rowspan="1">
<italic>τ</italic>
= 1 year</th>
</tr>
<tr>
<th align="center" rowspan="1" colspan="1">
<italic>f</italic>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>f</italic>
<sub>𝓠</sub>
</th>
<th align="center" rowspan="1" colspan="1">ℒ/ℒ
<sub>
<italic>Emp</italic>
</sub>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>CPC</italic>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e062">
<alternatives>
<graphic id="pone.0134508.e062g" xlink:href="pone.0134508.e062"></graphic>
<mml:math id="M62">
<mml:mrow>
<mml:msubsup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>f</italic>
<sub>𝓠</sub>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>CPC</italic>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e063">
<alternatives>
<graphic id="pone.0134508.e063g" xlink:href="pone.0134508.e063"></graphic>
<mml:math id="M63">
<mml:mrow>
<mml:msubsup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.8855</td>
<td align="center" rowspan="1" colspan="1">1.46</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.07090</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.75</td>
<td align="center" rowspan="1" colspan="1">0.6417</td>
<td align="center" rowspan="1" colspan="1">1.64</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.05138</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.50</td>
<td align="center" rowspan="1" colspan="1">0.4014</td>
<td align="center" rowspan="1" colspan="1">1.88</td>
<td align="center" rowspan="1" colspan="1">0.77</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.03214</td>
<td align="center" rowspan="1" colspan="1">0.74</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.25</td>
<td align="center" rowspan="1" colspan="1">0.1711</td>
<td align="center" rowspan="1" colspan="1">2.26</td>
<td align="center" rowspan="1" colspan="1">0.68</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.01370</td>
<td align="center" rowspan="1" colspan="1">0.69</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.10</td>
<td align="center" rowspan="1" colspan="1">0.0492</td>
<td align="center" rowspan="1" colspan="1">2.64</td>
<td align="center" rowspan="1" colspan="1">0.60</td>
<td align="center" rowspan="1" colspan="1">0.65</td>
<td align="center" rowspan="1" colspan="1">0.00394</td>
<td align="center" rowspan="1" colspan="1">0.65</td>
<td align="center" rowspan="1" colspan="1">0.63</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.01</td>
<td align="center" rowspan="1" colspan="1">0.0012</td>
<td align="center" rowspan="1" colspan="1">-</td>
<td align="center" rowspan="1" colspan="1">0.58</td>
<td align="center" rowspan="1" colspan="1">0.21</td>
<td align="center" rowspan="1" colspan="1">0.00010</td>
<td align="center" rowspan="1" colspan="1">0.66</td>
<td align="center" rowspan="1" colspan="1">0.26</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.005</td>
<td align="center" rowspan="1" colspan="1">0.0003</td>
<td align="center" rowspan="1" colspan="1">-</td>
<td align="center" rowspan="1" colspan="1">0.59</td>
<td align="center" rowspan="1" colspan="1">0.12</td>
<td align="center" rowspan="1" colspan="1">0.00002</td>
<td align="center" rowspan="1" colspan="1">0.66</td>
<td align="center" rowspan="1" colspan="1">0.18</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Configuration</td>
<td align="center" rowspan="1" colspan="1">-</td>
<td align="center" rowspan="1" colspan="1">-</td>
<td align="center" rowspan="1" colspan="1">0.57</td>
<td align="center" rowspan="1" colspan="1">-0.87</td>
<td align="center" rowspan="1" colspan="1">-</td>
<td align="center" rowspan="1" colspan="1">0.64</td>
<td align="center" rowspan="1" colspan="1">-0.22</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Empirical</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
<fig id="pone.0134508.g004" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0134508.g004</object-id>
<label>Fig 4</label>
<caption>
<title>Main network differences between real data accumulated over a year and the supersampled model built from one month of real data with
<italic>f</italic>
= 0.1 subsampling.</title>
<p>
<bold>(a)-(d)</bold>
Relative error (
<inline-formula id="pone.0134508.e064">
<alternatives>
<graphic id="pone.0134508.e064g" xlink:href="pone.0134508.e064"></graphic>
<mml:math id="M64">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mover>
<mml:mi>x</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>/</mml:mo>
<mml:mover>
<mml:mi>x</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
with
<inline-formula id="pone.0134508.e065">
<alternatives>
<graphic id="pone.0134508.e065g" xlink:href="pone.0134508.e065"></graphic>
<mml:math id="M65">
<mml:mrow>
<mml:mover>
<mml:mi>x</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
being a magnitude measured from the aggregated yearly network) between the network reconstructed using supersampling and the original data for outgoing degrees (a), strengths (b) and average neighbor strength (d) (similar results for the incoming direction are not displayed). The complementary cumulative distribution function of both edge lengths and trip lengths (c) is also shown.
<bold>(e)</bold>
Comparison between empirical
<inline-formula id="pone.0134508.e066">
<alternatives>
<graphic id="pone.0134508.e066g" xlink:href="pone.0134508.e066"></graphic>
<mml:math id="M66">
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
values and model prediction over a single run. The configuration model expectation from a single run using the full-year data set is also shown for comparison. All results are averaged over 100 repetitions of the model; error bars represent standard deviations of the log-binned data, and raw data are shown in the background. For visual clarity, panel (e) shows only a random subsample of 1% of the raw data in the background.</p>
</caption>
<graphic xlink:href="pone.0134508.g004"></graphic>
</fig>
<p>We observe an accurate reconstruction of the mobility network for a wide range of values of
<italic>f</italic>
, which shows the validity of our proposed supersampling methodology. At the global scale, even at extreme levels of subsampling, our model is successful at reconstructing the original dataset. Also at the topological scale, despite the heterogeneities in the underlying distributions, the methodology generates very accurate predictions. The predictions for the least frequently visited nodes display higher relative errors due to the presence of inactive nodes in the training dataset (1.6% of total nodes for
<italic>f</italic>
= 0.1).</p>
<p>Upon close inspection, our inferred values {⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩} slightly overestimate low-valued weights and underestimate large-valued weights and strengths, as expected from the temporal stability analysis (see
<xref ref-type="fig" rid="pone.0134508.g002">Fig 2</xref>
), yet the errors are greatly mitigated as we can see in
<xref ref-type="fig" rid="pone.0134508.g003">Fig 3B</xref>
. See Figs
<xref ref-type="fig" rid="pone.0134508.g003">3B</xref>
(green line) and
<xref ref-type="fig" rid="pone.0134508.g004">4E</xref>
(green dots) where we can observe a gap around
<italic>t</italic>
∼ 100 and
<inline-formula id="pone.0134508.e067">
<alternatives>
<graphic xlink:href="pone.0134508.e067.jpg" id="pone.0134508.e067g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M67">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
</mml:msubsup>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo>∼</mml:mo>
<mml:mn>100</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, respectively, which corresponds to the separation between the trusted empirical data (separated points in the background belonging to the group of trusted trips 𝓠) and the reconstructed trips (clustered cloud of points). The minor seasonal fluctuations first detected in our temporal analysis, together with these over- and under-estimations, explain the minor limitations of the model in reproducing the entire yearly data set perfectly.</p>
<p>The second-order effects induced by the seasonality of the recorded data can also be seen in the performance of our methodology under extreme levels of subsampling (feeding the model around 1% of the monthly sample data or less). In these circumstances, the model is still able to produce a good prediction of the empirical data, yet it reproduces the accumulated yearly mobility better than the monthly one, since the inherent seasonal variations of traffic between certain intersection pairs are smoothed by the aggregation procedure.</p>
<p>Furthermore, in the event that enough historical data were available, we could achieve even better results by computing the collection {⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩
<sub>
<italic>τ</italic>
</sub>
} with an appropriate
<italic>τ</italic>
period (depending on the granularity of the data) and approximating ⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩ ≃ ⟨
<italic>p</italic>
<sub>
<italic>ij</italic>
</sub>
⟩
<sub>
<italic>τ</italic>
</sub>
for the group of
<italic>trusted</italic>
trips (
<xref ref-type="disp-formula" rid="pone.0134508.e050">Eq (7)</xref>
). Such a procedure could be extended to overcome the minor limitations imposed by the seasonality of the data, and further improvements related to the presence of non-active nodes could be derived to perfect the method.</p>
</sec>
</sec>
<sec sec-type="conclusions" id="sec008">
<title>Discussion</title>
<p>The stationarity of the temporal statistics of trips between different locations, together with a suitable scaling for the data observed over different timespans, has allowed us to develop a general model that is highly effective at reconstructing a general mobility scenario from very limited aggregated data, without the need for fine-grained temporal information, and that provides insights into the structure of real taxi trips. The success of our reconstruction method, even using very small amounts of data, points to the composite structure of the network of urban mobility: taxi displacements are characterized by a small core of very frequent trips coupled with trips generated at random but conditioned by the structural constraints of the city, such as population distribution and mobility costs.</p>
<p>Ultimately, the results presented in this paper could be used to answer questions that are of fundamental importance in the field of human mobility modelling, such as:
<italic>i</italic>
) Can an accurate picture of urban mobility patterns be obtained from an incomplete sample of the population?, and
<italic>ii</italic>
) are existing metrics sufficient to assess the quality of model predictions?</p>
<p>We pointed out the importance of data sampling and of the correct assessment of mobility models, and introduced network-based tools to evaluate such models. The implications of these findings are twofold: on the one hand, the stationarity of the temporal patterns could be exploited to save space and effort in recording mobility data. On the other hand, our method opens the possibility of efficiently scaling up data from a reduced fleet of vehicles in cases where full knowledge of the system is needed.</p>
<p>Our study provides a first step in showing that incomplete samples can indeed be scaled up adequately with the appropriate models, and that network metrics are required to comprehensively assess mobility model predictions.</p>
</sec>
<sec sec-type="materials|methods" id="sec009">
<title>Materials and Methods</title>
<sec id="sec010">
<title>Data set</title>
<p>OD matrices are typically inferred from either census/survey data or alternative means such as social media data or Call Detail Records (CDR) [
<xref rid="pone.0134508.ref036" ref-type="bibr">36</xref>
]. They must be constructed in terms of
<italic>trips</italic>
, i.e. well defined trajectories between a starting and ending point. For this reason, we use the taxi data for its completeness, since all trips are recorded, and we consider it a good proxy for general urban mobility, with the caveat that commuting patterns and areas with little taxi demand are covered less. For the present paper we have used the full set of taxi trips (with customers) with starting and ending points within Manhattan obtained from the New York Taxi and Limousine Commission for the year 2011 via Freedom of Information Law request [
<xref rid="pone.0134508.ref024" ref-type="bibr">24</xref>
]. We have aggregated the data at the node level taking into account the grid of roads up to secondary level and adding trips to the nearest node (intersection) in a radius of 200
<italic>m</italic>
. We have decided to keep the self-loops present in the data for simplicity (although their fraction is negligible). In the analysis, all trips, on both weekends and weekdays, are considered, since the pattern for weekly trips shows a continuous increase in the number of trips peaking on Friday and followed by a sudden drop on Sundays.</p>
<p>The reduced data sets used as the basis for reconstructing the networks are obtained by random subsampling. The parameters of the obtained mobility networks for two different observation periods
<italic>τ</italic>
are reported in
<xref ref-type="table" rid="pone.0134508.t003">Table 3</xref>
.</p>
<table-wrap id="pone.0134508.t003" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0134508.t003</object-id>
<label>Table 3</label>
<caption>
<title>Empirical network parameters.</title>
<p>Symbol
<inline-formula id="pone.0134508.e068">
<alternatives>
<graphic id="pone.0134508.e068g" xlink:href="pone.0134508.e068"></graphic>
<mml:math id="M68">
<mml:mrow>
<mml:mover>
<mml:mi>x</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
stands for graph averaged magnitudes. See main text and methods for a definition of each magnitude.</p>
</caption>
<alternatives>
<graphic id="pone.0134508.t003g" xlink:href="pone.0134508.t003"></graphic>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="center" rowspan="1" colspan="1">
<italic>τ</italic>
</th>
<th align="center" rowspan="1" colspan="1">N</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e069">
<alternatives>
<graphic id="pone.0134508.e069g" xlink:href="pone.0134508.e069"></graphic>
<mml:math id="M69">
<mml:mrow>
<mml:mover>
<mml:mi>E</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e070">
<alternatives>
<graphic id="pone.0134508.e070g" xlink:href="pone.0134508.e070"></graphic>
<mml:math id="M70">
<mml:mrow>
<mml:mover>
<mml:mi>T</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e071">
<alternatives>
<graphic id="pone.0134508.e071g" xlink:href="pone.0134508.e071"></graphic>
<mml:math id="M71">
<mml:mrow>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e072">
<alternatives>
<graphic id="pone.0134508.e072g" xlink:href="pone.0134508.e072"></graphic>
<mml:math id="M72">
<mml:mrow>
<mml:msubsup>
<mml:mi>σ</mml:mi>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e073">
<alternatives>
<graphic id="pone.0134508.e073g" xlink:href="pone.0134508.e073"></graphic>
<mml:math id="M73">
<mml:mrow>
<mml:msubsup>
<mml:mi>σ</mml:mi>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e074">
<alternatives>
<graphic id="pone.0134508.e074g" xlink:href="pone.0134508.e074"></graphic>
<mml:math id="M74">
<mml:mrow>
<mml:mover>
<mml:mi>k</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e075">
<alternatives>
<graphic id="pone.0134508.e075g" xlink:href="pone.0134508.e075"></graphic>
<mml:math id="M75">
<mml:mrow>
<mml:msubsup>
<mml:mi>σ</mml:mi>
<mml:mi>k</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e076">
<alternatives>
<graphic id="pone.0134508.e076g" xlink:href="pone.0134508.e076"></graphic>
<mml:math id="M76">
<mml:mrow>
<mml:msubsup>
<mml:mi>σ</mml:mi>
<mml:mi>k</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e077">
<alternatives>
<graphic id="pone.0134508.e077g" xlink:href="pone.0134508.e077"></graphic>
<mml:math id="M77">
<mml:mrow>
<mml:mover>
<mml:mi>T</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e078">
<alternatives>
<graphic id="pone.0134508.e078g" xlink:href="pone.0134508.e078"></graphic>
<mml:math id="M78">
<mml:mrow>
<mml:mover>
<mml:mi>E</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e079">
<alternatives>
<graphic id="pone.0134508.e079g" xlink:href="pone.0134508.e079"></graphic>
<mml:math id="M79">
<mml:mrow>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>σ</italic>
<sub>
<italic>t</italic>
</sub>
</th>
<th align="center" rowspan="1" colspan="1">
<inline-formula id="pone.0134508.e080">
<alternatives>
<graphic id="pone.0134508.e080g" xlink:href="pone.0134508.e080"></graphic>
<mml:math id="M80">
<mml:mrow>
<mml:mover>
<mml:mi>c</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:mo>∑</mml:mo>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:mover>
<mml:mi>T</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">February 2011</td>
<td align="center" rowspan="1" colspan="1">4085</td>
<td align="center" rowspan="1" colspan="1">3099271</td>
<td align="center" rowspan="1" colspan="1">11768911</td>
<td align="center" rowspan="1" colspan="1">2881</td>
<td align="center" rowspan="1" colspan="1">4423</td>
<td align="center" rowspan="1" colspan="1">4112</td>
<td align="center" rowspan="1" colspan="1">759</td>
<td align="center" rowspan="1" colspan="1">730</td>
<td align="center" rowspan="1" colspan="1">597</td>
<td align="char" char="." rowspan="1" colspan="1">0.7</td>
<td align="char" char="." rowspan="1" colspan="1">0.19</td>
<td align="char" char="." rowspan="1" colspan="1">3.8</td>
<td align="char" char="." rowspan="1" colspan="1">6.9</td>
<td align="center" rowspan="1" colspan="1">2434 ± 1801 m</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Year</td>
<td align="center" rowspan="1" colspan="1">4091</td>
<td align="center" rowspan="1" colspan="1">7251605</td>
<td align="center" rowspan="1" colspan="1">146986835</td>
<td align="center" rowspan="1" colspan="1">35929</td>
<td align="center" rowspan="1" colspan="1">51325</td>
<td align="center" rowspan="1" colspan="1">54877</td>
<td align="center" rowspan="1" colspan="1">1773</td>
<td align="center" rowspan="1" colspan="1">799</td>
<td align="center" rowspan="1" colspan="1">1206</td>
<td align="char" char="." rowspan="1" colspan="1">8.8</td>
<td align="char" char="." rowspan="1" colspan="1">0.43</td>
<td align="char" char="." rowspan="1" colspan="1">60.84</td>
<td align="char" char="." rowspan="1" colspan="1">20.27</td>
<td align="center" rowspan="1" colspan="1">2458 ± 1831 m</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
<p>The analyzed data show that most taxis perform similarly. This, coupled with the fact [
<xref rid="pone.0134508.ref025" ref-type="bibr">25</xref>
] that individual taxi mobility traces are largely statistically indistinguishable from those of the overall population, justifies aggregating the individual traces (sets of trips performed by different customers, which can be considered independent events) for the analysis.</p>
</sec>
<sec id="sec011">
<title>Null model: The Multi-Edge Configuration model</title>
<p>Cities usually display high variability in activity across their different locations: city centers concentrate busy locations, while the outskirts usually display less traffic and fewer retail areas. At the network level, this translates into important topological heterogeneities that need to be accounted for using a suitable null model. The Multi-Edge Configuration model [
<xref rid="pone.0134508.ref030" ref-type="bibr">30</xref>
] is a maximum entropy model that is used to generate maximally random network instances of graphs with a prescribed strength sequence {
<italic>s</italic>
<sup>
<italic>in</italic>
</sup>
,
<italic>s</italic>
<sup>
<italic>out</italic>
</sup>
} (number of trips
<italic>entering</italic>
and
<italic>leaving</italic>
each node). The expected number of trips between two nodes
<italic>i</italic>
and
<italic>j</italic>
with respective strengths
<inline-formula id="pone.0134508.e081">
<alternatives>
<graphic xlink:href="pone.0134508.e081.jpg" id="pone.0134508.e081g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M81">
<mml:mrow>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
and
<inline-formula id="pone.0134508.e082">
<alternatives>
<graphic xlink:href="pone.0134508.e082.jpg" id="pone.0134508.e082g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M82">
<mml:mrow>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
reads
<inline-formula id="pone.0134508.e083">
<alternatives>
<graphic xlink:href="pone.0134508.e083.jpg" id="pone.0134508.e083g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M83">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mover>
<mml:mi>s</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>/</mml:mo>
<mml:mover>
<mml:mi>T</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
and the assortativity profile is flat (nodes are uncorrelated at the level of strengths). Throughout this paper, we use this null model as a benchmark, preserving the strength sequence obtained by aggregating the complete yearly observation period.</p>
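<p>As a minimal illustrative sketch of the expected-value formula above (not the ODME implementation), the following Python snippet computes the expected number of trips of the configuration model from a strength sequence. The function name and the toy strengths are hypothetical, and self-loops are kept for simplicity, which is an assumption of this sketch.</p>

```python
import numpy as np

def expected_trips_configuration(s_out, s_in):
    """Expected trips s_i^out * s_j^in / T for every ordered node pair (i, j)."""
    s_out = np.asarray(s_out, dtype=float)
    s_in = np.asarray(s_in, dtype=float)
    total = s_out.sum()  # T, the total number of trips
    if not np.isclose(total, s_in.sum()):
        raise ValueError("outgoing and incoming strengths must sum to the same T")
    # The outer product gives one entry per origin-destination pair.
    return np.outer(s_out, s_in) / total

# Toy strength sequence for three nodes (made-up numbers).
expected = expected_trips_configuration([4, 1, 1], [2, 3, 1])
```

<p>By construction, the row and column sums of the resulting matrix reproduce the prescribed outgoing and incoming strengths.</p>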
</sec>
<sec id="sec012">
<title>Indicators for the quality of the reconstruction</title>
<sec id="sec013">
<title>Distance based measures</title>
<list list-type="bullet">
<list-item>
<p>
<bold>Sorensen-Dice common part of commuters index:</bold>
This indicator was proposed in [
<xref rid="pone.0134508.ref035" ref-type="bibr">35</xref>
] and, in its original formulation, is defined as
<disp-formula id="pone.0134508.e084">
<alternatives>
<graphic xlink:href="pone.0134508.e084.jpg" id="pone.0134508.e084g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M84">
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mo></mml:mo>
</mml:mrow>
</mml:msub>
<mml:mo movablelimits="true" form="prefix">min</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mo></mml:mo>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(8)</label>
</disp-formula>
We propose an alternative version, formulated in terms of averages, which reads
<disp-formula id="pone.0134508.e085">
<alternatives>
<graphic xlink:href="pone.0134508.e085.jpg" id="pone.0134508.e085g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M85">
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mi>P</mml:mi>
<mml:mi>C</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mo></mml:mo>
</mml:mrow>
</mml:msub>
<mml:mo movablelimits="true" form="prefix">min</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mo></mml:mo>
</mml:mrow>
</mml:msub>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(9)</label>
</disp-formula>
The different versions of this indicator have values in the range [0, 1], where
<italic>CPC</italic>
= 1 indicates total coincidence between data and model and
<italic>CPC</italic>
= 0 total disagreement. However, for sparse data sets with a skewed distribution of
<inline-formula id="pone.0134508.e086">
<alternatives>
<graphic xlink:href="pone.0134508.e086.jpg" id="pone.0134508.e086g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M86">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">{</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
values,
<xref ref-type="disp-formula" rid="pone.0134508.e084">Eq (8)</xref>
may return values well below 1, even for models very close to reality. To illustrate this,
<xref ref-type="table" rid="pone.0134508.t004">Table 4</xref>
shows a comparison of the performance of the two indicators for the models presented in
<xref ref-type="table" rid="pone.0134508.t002">Table 2</xref>
. For the
<italic>Empirical</italic>
model, we can see that the second version of the indicator recovers values close to 1 as would be expected. Furthermore, both indicators converge to very similar values as sampling is increased (the yearly data set contains roughly 12 times more trips than the monthly one).</p>
</list-item>
<list-item>
<p>
<bold>Linear correlation ⟨
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
<sup>
<italic>model</italic>
</sup>
⟩ vs
<inline-formula id="pone.0134508.e087">
<alternatives>
<graphic xlink:href="pone.0134508.e087.jpg" id="pone.0134508.e087g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M87">
<mml:mrow>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
</bold>
: This method is widely used [
<xref rid="pone.0134508.ref020" ref-type="bibr">20</xref>
]. We report the coefficient of determination
<inline-formula id="pone.0134508.e088">
<alternatives>
<graphic xlink:href="pone.0134508.e088.jpg" id="pone.0134508.e088g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M88">
<mml:mrow>
<mml:msubsup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
in all tables, based on the comparison between the real data and the model values conditioned on the existing edges (since we use a biased statistic based only on the trips observed in the original data, not on the entire set of intersection pairs). For Poisson-distributed variables with mean ⟨
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
⟩, this conditional average reads
<disp-formula id="pone.0134508.e089">
<alternatives>
<graphic xlink:href="pone.0134508.e089.jpg" id="pone.0134508.e089g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M89">
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
</mml:msubsup>
<mml:mo>⟩</mml:mo>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mo>></mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mn>0</mml:mn>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mrow>
<mml:mo></mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(10)</label>
</disp-formula>
which converges to the average value for highly used trips. Hence, the coefficient of determination
<inline-formula id="pone.0134508.e090">
<alternatives>
<graphic xlink:href="pone.0134508.e090.jpg" id="pone.0134508.e090g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M90">
<mml:mrow>
<mml:msubsup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
assuming an identity relation
<inline-formula id="pone.0134508.e091">
<alternatives>
<graphic xlink:href="pone.0134508.e091.jpg" id="pone.0134508.e091g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M91">
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
</mml:msubsup>
<mml:mo>⟩</mml:mo>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
is explicitly,
<disp-formula id="pone.0134508.e092">
<alternatives>
<graphic xlink:href="pone.0134508.e092.jpg" id="pone.0134508.e092g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M92">
<mml:mrow>
<mml:msubsup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mi></mml:mi>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mo>⟨</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>⟩</mml:mo>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mi></mml:mi>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mo>⟨</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>⟩</mml:mo>
<mml:mo>−</mml:mo>
<mml:mover accent="true">
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msubsup>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>⟩</mml:mo>
</mml:mrow>
<mml:mo stretchy="true">¯</mml:mo>
</mml:mover>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(11)</label>
</disp-formula>
</p>
</list-item>
</list>
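<p>The two distance-based indicators above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: the function names are hypothetical, trips are stored as dense origin-destination matrices, and the sums in Eq (11) are assumed to run over the pairs with at least one observed trip, following the description of the conditioned statistic in the text.</p>

```python
import numpy as np

def cpc(t_obs, t_model):
    """Common part of commuters: Eq (8) with a model sample, Eq (9) with model averages."""
    t_obs = np.asarray(t_obs, dtype=float)
    t_model = np.asarray(t_model, dtype=float)
    return 2.0 * np.minimum(t_obs, t_model).sum() / (t_obs.sum() + t_model.sum())

def conditional_mean(t_mean):
    """Eq (10): mean of a Poisson variable conditioned on being positive."""
    t_mean = np.asarray(t_mean, dtype=float)
    out = np.zeros_like(t_mean)
    pos = t_mean > 0
    # -expm1(-m) equals 1 - exp(-m) and stays accurate for small m.
    out[pos] = t_mean[pos] / (-np.expm1(-t_mean[pos]))
    return out

def r2_cond(t_obs, t_mean):
    """Eq (11), evaluated on the observed edges only (assumption of this sketch)."""
    t_obs = np.asarray(t_obs, dtype=float)
    observed = t_obs > 0
    model = conditional_mean(t_mean)[observed]
    data = t_obs[observed]
    ss_res = ((model - data) ** 2).sum()
    ss_tot = ((model - model.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

# Toy observed trips and model averages (made-up numbers).
t_hat = np.array([[0.0, 3.0], [2.0, 0.0]])
t_avg = np.array([[0.5, 2.5], [2.0, 0.0]])
cpc_value = cpc(t_hat, t_avg)
r2_value = r2_cond(t_hat, t_avg)
```

<p>Passing a single model realization as the second argument of the hypothetical cpc function corresponds to Eq (8), while passing the model averages corresponds to Eq (9).</p>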
<table-wrap id="pone.0134508.t004" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0134508.t004</object-id>
<label>Table 4</label>
<caption>
<title>Values for the different versions of the common part of commuters index for the reconstructed models.</title>
<p>
⟨<italic>CPC</italic>
<sub>
<italic>sample</italic>
</sub>
⟩ and averaged trip values computed over 1000 repetitions of each model; standard deviations are lower than 10
<sup>−3</sup>
for all cases. Note how differences between indicators disappear with increased sampling.</p>
</caption>
<alternatives>
<graphic id="pone.0134508.t004g" xlink:href="pone.0134508.t004"></graphic>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="center" rowspan="1" colspan="1"></th>
<th align="center" colspan="2" rowspan="1">
<italic>τ</italic>
= 1 month</th>
<th align="center" colspan="2" rowspan="1">
<italic>τ</italic>
= 1 year</th>
</tr>
<tr>
<th align="center" rowspan="1" colspan="1">
<italic>f</italic>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>CPC</italic>
<sub>
<italic>sample</italic>
</sub>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>CPC</italic>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>CPC</italic>
<sub>
<italic>sample</italic>
</sub>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>CPC</italic>
</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="char" char="." rowspan="1" colspan="1">0.786</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="char" char="." rowspan="1" colspan="1">0.777</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.75</td>
<td align="char" char="." rowspan="1" colspan="1">0.757</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="char" char="." rowspan="1" colspan="1">0.758</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.50</td>
<td align="char" char="." rowspan="1" colspan="1">0.713</td>
<td align="center" rowspan="1" colspan="1">0.77</td>
<td align="char" char="." rowspan="1" colspan="1">0.731</td>
<td align="center" rowspan="1" colspan="1">0.74</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.25</td>
<td align="char" char="." rowspan="1" colspan="1">0.639</td>
<td align="center" rowspan="1" colspan="1">0.68</td>
<td align="char" char="." rowspan="1" colspan="1">0.686</td>
<td align="center" rowspan="1" colspan="1">0.69</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.10</td>
<td align="char" char="." rowspan="1" colspan="1">0.562</td>
<td align="center" rowspan="1" colspan="1">0.60</td>
<td align="char" char="." rowspan="1" colspan="1">0.647</td>
<td align="center" rowspan="1" colspan="1">0.65</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.01</td>
<td align="char" char="." rowspan="1" colspan="1">0.539</td>
<td align="center" rowspan="1" colspan="1">0.58</td>
<td align="char" char="." rowspan="1" colspan="1">0.655</td>
<td align="center" rowspan="1" colspan="1">0.66</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">0.005</td>
<td align="char" char="." rowspan="1" colspan="1">0.543</td>
<td align="center" rowspan="1" colspan="1">0.59</td>
<td align="char" char="." rowspan="1" colspan="1">0.655</td>
<td align="center" rowspan="1" colspan="1">0.66</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Configuration</td>
<td align="char" char="." rowspan="1" colspan="1">0.525</td>
<td align="center" rowspan="1" colspan="1">0.57</td>
<td align="char" char="." rowspan="1" colspan="1">0.641</td>
<td align="center" rowspan="1" colspan="1">0.64</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Empirical</td>
<td align="char" char="." rowspan="1" colspan="1">0.829</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="char" char="." rowspan="1" colspan="1">0.936</td>
<td align="center" rowspan="1" colspan="1">1</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
</sec>
<sec id="sec014">
<title>Network measures</title>
<p>To better grasp the quality of the models, we also propose comparing some of their multi-edge network quantities:
<list list-type="bullet">
<list-item>
<p>
<bold>Degree:</bold>
The unweighted degree of the nodes is the sum of their incoming/outgoing active edges (edges for which
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
> 0),
<italic>k</italic>
<sup>
<italic>x</italic>
</sup>
= ∑
<sub>
<italic>x</italic>
</sub>
Θ(
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
), where
<italic>E</italic>
= ∑
<sub>
<italic>ij</italic>
</sub>
Θ(
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
) is the total number of active edges and
<italic>x</italic>
refers to the outgoing (
<italic>i</italic>
) or incoming (
<italic>j</italic>
) direction.</p>
</list-item>
<list-item>
<p>
<bold>Average weighted neighbor strength:</bold>
This metric is widely used in the literature. It indicates the level of correlations at the node level and is defined as
<inline-formula id="pone.0134508.e093">
<alternatives>
<graphic xlink:href="pone.0134508.e093.jpg" id="pone.0134508.e093g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M93">
<mml:mrow>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mi>w</mml:mi>
</mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>x</mml:mi>
</mml:msubsup>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mi>x</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mi>x</mml:mi>
<mml:mi>y</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>x</mml:mi>
</mml:msubsup>
</mml:mfrac>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, where
<italic>y</italic>
is the complement of
<italic>x</italic>
(if
<italic>x</italic>
=
<italic>i</italic>
then
<italic>y</italic>
=
<italic>j</italic>
and vice versa).</p>
</list-item>
<list-item>
<p>
<bold>Distribution of weights on existing edges:</bold>
This commonly used measure is computed as
<italic>P</italic>
(
<italic>t</italic>
) = ∑
<sub>
<italic>ij</italic>
</sub>
<italic>δ</italic>
<sub>
<italic>t</italic>
,
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
</sub>
/
<italic>E</italic>
and indicates the collection of weight values present in the network, where
<italic>δ</italic>
<sub>
<italic>x</italic>
,
<italic>y</italic>
</sub>
corresponds to the Kronecker delta.</p>
</list-item>
<list-item>
<p>
<bold>Graph average existing weight of trips as a function of the product of incoming and outgoing strength</bold>
: To quantify the deviation from a completely randomized configuration model, we also compute this metric, which is the average weight of existing trips as a function of the product of the out- (in-) strengths of their origin (destination) nodes:
<inline-formula id="pone.0134508.e094">
<alternatives>
<graphic xlink:href="pone.0134508.e094.jpg" id="pone.0134508.e094g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M94">
<mml:mrow>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">¯</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
, where
<italic>n</italic>
<sub>
<italic>ss′</italic>
</sub>
is the number of terms in the sum. For the configuration model, this quantity is equivalent to
<inline-formula id="pone.0134508.e095">
<alternatives>
<graphic xlink:href="pone.0134508.e095.jpg" id="pone.0134508.e095g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M95">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msup>
<mml:mi>t</mml:mi>
<mml:mo>+</mml:mo>
</mml:msup>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo></mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>s</mml:mi>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
<mml:mo>/</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
.</p>
</list-item>
</list>
</p>
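<p>As a rough sketch of the first three network measures listed above, the snippet below computes degrees, the average weighted neighbor strength, and the distribution of weights on existing edges from a dense trip matrix. The function name, the dictionary keys, and the toy matrix are hypothetical; the neighbor-strength formula is read here as the sum over neighbors of t_ij times the complementary strength of the neighbor, divided by the strength of the node, which is an interpretation of the definition in the text.</p>

```python
import numpy as np

def network_measures(t):
    """Basic multi-edge network quantities for a trip matrix t[i, j] (origin i, destination j)."""
    t = np.asarray(t, dtype=float)
    active = t > 0                      # Theta(t_ij): edges with at least one trip
    n_edges = int(active.sum())         # E, the total number of active edges
    s_out, s_in = t.sum(axis=1), t.sum(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Outgoing: destinations weighted by trips, using their incoming strength.
        snn_out = np.where(s_out > 0, t @ s_in / s_out, 0.0)
        # Incoming: origins weighted by trips, using their outgoing strength.
        snn_in = np.where(s_in > 0, t.T @ s_out / s_in, 0.0)
    # P(t): fraction of active edges carrying exactly t trips.
    weights, counts = np.unique(t[active], return_counts=True)
    weight_dist = dict(zip(weights.tolist(), (counts / n_edges).tolist()))
    return {
        "k_out": active.sum(axis=1),
        "k_in": active.sum(axis=0),
        "n_edges": n_edges,
        "snn_out": snn_out,
        "snn_in": snn_in,
        "weight_dist": weight_dist,
    }

# Toy trip matrix for three nodes (made-up numbers).
measures = network_measures([[0, 2, 1], [1, 0, 0], [0, 3, 0]])
```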
</sec>
<sec id="sec015">
<title>Information values</title>
<p>We also assess the quality of our models using their log-likelihood values, assuming a set of independent Poisson random variables with known means ⟨
<italic>t</italic>
<sub>
<italic>ij</italic>
</sub>
⟩ for each intersection pair,
<disp-formula id="pone.0134508.e096">
<alternatives>
<graphic xlink:href="pone.0134508.e096.jpg" id="pone.0134508.e096g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M96">
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>=</mml:mo>
<mml:mo form="prefix">ln</mml:mo>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mo>{</mml:mo>
<mml:mover accent="true">
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>}</mml:mo>
<mml:mo>|</mml:mo>
<mml:mo>{</mml:mo>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
<mml:mo>}</mml:mo>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mo form="prefix">ln</mml:mo>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>⟩</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>t</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>!</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</alternatives>
<label>(12)</label>
</disp-formula>
Incompatible log-likelihood values are not reported in the tables (such as cases where
<inline-formula id="pone.0134508.e097">
<alternatives>
<graphic xlink:href="pone.0134508.e097.jpg" id="pone.0134508.e097g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M97">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>&lt;</mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
or
<inline-formula id="pone.0134508.e098">
<alternatives>
<graphic xlink:href="pone.0134508.e098.jpg" id="pone.0134508.e098g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M98">
<mml:mrow>
<mml:mo stretchy="false"></mml:mo>
<mml:msub>
<mml:mover>
<mml:mi>t</mml:mi>
<mml:mo accent="true">^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo stretchy="true">⟨</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">⟩</mml:mo>
</mml:mrow>
<mml:mo stretchy="false"></mml:mo>
<mml:mo></mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</alternatives>
</inline-formula>
).</p>
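<p>As an illustration of Eq (12), the following Python sketch sums the Poisson log-probabilities over all intersection pairs. It is not taken from the ODME package: the function name poisson_log_likelihood, the dictionary-based inputs, and the choice to return None for an incompatible value are assumptions made for this example. Only the first incompatible case (a zero mean paired with a nonzero observed count) is detected, and means are assumed to be non-negative.</p>
<preformat>
```python
import math


def poisson_log_likelihood(observed, expected):
    """Log-likelihood of observed counts under independent Poisson variables.

    observed: dict mapping (i, j) pairs to integer trip counts (t_hat_ij).
    expected: dict mapping (i, j) pairs to model means (the averages of t_ij).
    Returns None when the value is incompatible, i.e. some pair has a zero
    mean but a nonzero observed count (its probability is zero).
    """
    total = 0.0
    # The sum runs over every pair present in either input.
    for pair in sorted(set(observed) | set(expected)):
        count = observed.get(pair, 0)
        mean = expected.get(pair, 0.0)
        if mean == 0.0:
            if count != 0:
                return None  # ln(0): incompatible, not reported
            continue  # zero mean and zero count contribute ln(1) = 0
        # ln(e^(-mean) * mean^count / count!), using lgamma(count + 1) = ln(count!)
        total += -mean + count * math.log(mean) - math.lgamma(count + 1)
    return total


# Hypothetical toy data: two intersection pairs.
observed = {(0, 1): 2, (1, 0): 0}
expected = {(0, 1): 2.0, (1, 0): 1.0}
print(poisson_log_likelihood(observed, expected))
```
</preformat>
<p>Using math.lgamma instead of an explicit factorial keeps the computation stable for large trip counts, which is one reasonable design choice among several.</p>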
</sec>
</sec>
<sec id="sec016">
<title>Simulations</title>
<p>All simulations, the solution of the saddle-point equations, and the analysis of the multi-edge networks were performed using the freely available, open-source package Origin-Destination Multi-Edge analysis (ODME) [
<xref rid="pone.0134508.ref037" ref-type="bibr">37</xref>
].</p>
</sec>
</sec>
</body>
<back>
<ack>
<p>We thank P. Colomer-de-Simon for useful comments and suggestions.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="pone.0134508.ref001">
<label>1</label>
<mixed-citation publication-type="journal">
<name>
<surname>González</surname>
<given-names>MC</given-names>
</name>
,
<name>
<surname>Hidalgo</surname>
<given-names>CA</given-names>
</name>
,
<name>
<surname>Barabási</surname>
<given-names>AL</given-names>
</name>
.
<article-title>Understanding individual human mobility patterns</article-title>
.
<source>Nature</source>
.
<year>2008</year>
;
<volume>453</volume>
(
<issue>7196</issue>
):
<fpage>779</fpage>
<lpage>782</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature06958">10.1038/nature06958</ext-link>
</comment>
<pub-id pub-id-type="pmid">18528393</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref002">
<label>2</label>
<mixed-citation publication-type="other">Blondel VD, Decuyper A, Krings G. A survey of results on mobile phone datasets analysis. arXiv preprint arXiv:1502.03406. 2015.</mixed-citation>
</ref>
<ref id="pone.0134508.ref003">
<label>3</label>
<mixed-citation publication-type="journal">
<name>
<surname>Bazzani</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Giorgini</surname>
<given-names>B</given-names>
</name>
,
<name>
<surname>Rambaldi</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Gallotti</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Giovannini</surname>
<given-names>L</given-names>
</name>
.
<article-title>Statistical laws in urban mobility from microscopic GPS data in the area of Florence</article-title>
.
<source>J Stat Mech</source>
.
<year>2010</year>
;
<volume>2010</volume>
:
<fpage>P05001</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1088/1742-5468/2010/05/P05001">10.1088/1742-5468/2010/05/P05001</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref004">
<label>4</label>
<mixed-citation publication-type="journal">
<name>
<surname>Brockmann</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Hufnagel</surname>
<given-names>L</given-names>
</name>
,
<name>
<surname>Geisel</surname>
<given-names>T</given-names>
</name>
.
<article-title>The scaling laws of human travel</article-title>
.
<source>Nature</source>
.
<year>2006</year>
;
<volume>439</volume>
(
<issue>7075</issue>
):
<fpage>462</fpage>
<lpage>465</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature04292">10.1038/nature04292</ext-link>
</comment>
<pub-id pub-id-type="pmid">16437114</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref005">
<label>5</label>
<mixed-citation publication-type="journal">
<name>
<surname>Thiemann</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Theis</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Grady</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Brune</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Brockmann</surname>
<given-names>D</given-names>
</name>
.
<article-title>The Structure of Borders in a Small World</article-title>
.
<source>PLoS ONE</source>
.
<year>2010</year>
;
<volume>5</volume>
:
<fpage>e15422</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0015422">10.1371/journal.pone.0015422</ext-link>
</comment>
<pub-id pub-id-type="pmid">21124970</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref006">
<label>6</label>
<mixed-citation publication-type="journal">
<name>
<surname>Scellato</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Noulas</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Lambiotte</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Mascolo</surname>
<given-names>C</given-names>
</name>
.
<article-title>Socio-spatial Properties of Online Location-based Social Networks</article-title>
.
<source>Proc Int AAAI Conf Weblogs Soc Media</source>
.
<year>2011</year>
;
<volume>11</volume>
.</mixed-citation>
</ref>
<ref id="pone.0134508.ref007">
<label>7</label>
<mixed-citation publication-type="other">Scellato S, Musolesi M, Mascolo C, Latora V, Campbell A. NextPlace: A Spatio-Temporal Prediction Framework for Pervasive Systems. Pervasive’11 Proceedings of the 9th international conference on Pervasive computing. 2011;152–169.</mixed-citation>
</ref>
<ref id="pone.0134508.ref008">
<label>8</label>
<mixed-citation publication-type="journal">
<name>
<surname>Barthélemy</surname>
<given-names>M</given-names>
</name>
.
<article-title>Spatial Networks</article-title>
.
<source>Phys Rep</source>
.
<year>2010</year>
;
<volume>499</volume>
:
<fpage>1</fpage>
<lpage>101</lpage>
.</mixed-citation>
</ref>
<ref id="pone.0134508.ref009">
<label>9</label>
<mixed-citation publication-type="journal">
<name>
<surname>Cattuto</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Van den Broeck</surname>
<given-names>W</given-names>
</name>
,
<name>
<surname>Barrat</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Colizza</surname>
<given-names>V</given-names>
</name>
,
<name>
<surname>Pinton</surname>
<given-names>JF</given-names>
</name>
,
<name>
<surname>Vespignani</surname>
<given-names>A</given-names>
</name>
.
<article-title>Dynamics of person-to-person interactions from distributed RFID sensor networks</article-title>
.
<source>PLoS ONE</source>
.
<year>2010</year>
;
<volume>5</volume>
(
<issue>7</issue>
):
<fpage>e11596</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0011596">10.1371/journal.pone.0011596</ext-link>
</comment>
<pub-id pub-id-type="pmid">20657651</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref010">
<label>10</label>
<mixed-citation publication-type="journal">
<name>
<surname>Roth</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Kang</surname>
<given-names>SM</given-names>
</name>
,
<name>
<surname>Batty</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Barthélemy</surname>
<given-names>M</given-names>
</name>
.
<article-title>Structure of Urban Movements: Polycentric Activity and Entangled Hierarchical Flows</article-title>
.
<source>PLoS ONE</source>
.
<year>2011</year>
;
<volume>6</volume>
(
<issue>1</issue>
):
<fpage>e15923</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0015923">10.1371/journal.pone.0015923</ext-link>
</comment>
<pub-id pub-id-type="pmid">21249210</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref011">
<label>11</label>
<mixed-citation publication-type="journal">
<name>
<surname>Szell</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Sinatra</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Petri</surname>
<given-names>G</given-names>
</name>
,
<name>
<surname>Thurner</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Latora</surname>
<given-names>V</given-names>
</name>
.
<article-title>Understanding mobility in a social petri dish</article-title>
.
<source>Sci Rep</source>
.
<year>2012</year>
;
<volume>2</volume>
:
<fpage>457</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/srep00457">10.1038/srep00457</ext-link>
</comment>
<pub-id pub-id-type="pmid">22708055</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref012">
<label>12</label>
<mixed-citation publication-type="journal">
<name>
<surname>Song</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Koren</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Wang</surname>
<given-names>P</given-names>
</name>
,
<name>
<surname>Barabási</surname>
<given-names>AL</given-names>
</name>
.
<article-title>Modelling the scaling properties of human mobility</article-title>
.
<source>Nat Phys</source>
.
<year>2010</year>
;
<volume>6</volume>
:
<fpage>818</fpage>
<lpage>823</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nphys1760">10.1038/nphys1760</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref013">
<label>13</label>
<mixed-citation publication-type="journal">
<name>
<surname>Song</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Qu</surname>
<given-names>Z</given-names>
</name>
,
<name>
<surname>Blumm</surname>
<given-names>N</given-names>
</name>
,
<name>
<surname>Barabási</surname>
<given-names>AL</given-names>
</name>
.
<article-title>Limits of predictability in human mobility</article-title>
.
<source>Science</source>
.
<year>2010</year>
;
<volume>327</volume>
(
<issue>5968</issue>
):
<fpage>1018</fpage>
<lpage>1021</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1126/science.1177170">10.1126/science.1177170</ext-link>
</comment>
<pub-id pub-id-type="pmid">20167789</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref014">
<label>14</label>
<mixed-citation publication-type="journal">
<name>
<surname>Hufnagel</surname>
<given-names>L</given-names>
</name>
,
<name>
<surname>Brockmann</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Geisel</surname>
<given-names>T</given-names>
</name>
.
<article-title>Forecast and control of epidemics in a globalized world</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<year>2004</year>
;
<volume>101</volume>
(
<issue>42</issue>
):
<fpage>15124</fpage>
<lpage>15129</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.0308344101">10.1073/pnas.0308344101</ext-link>
</comment>
<pub-id pub-id-type="pmid">15477600</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref015">
<label>15</label>
<mixed-citation publication-type="journal">
<name>
<surname>Belik</surname>
<given-names>V</given-names>
</name>
,
<name>
<surname>Geisel</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Brockmann</surname>
<given-names>D</given-names>
</name>
.
<article-title>Natural human mobility patterns and spatial spread of infectious diseases</article-title>
.
<source>Phys Rev X</source>
.
<year>2011</year>
;
<volume>1</volume>
:
<fpage>011001</fpage>
.</mixed-citation>
</ref>
<ref id="pone.0134508.ref016">
<label>16</label>
<mixed-citation publication-type="journal">
<name>
<surname>Colizza</surname>
<given-names>V</given-names>
</name>
,
<name>
<surname>Barrat</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Barthélemy</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Vespignani</surname>
<given-names>A</given-names>
</name>
.
<article-title>The role of the airline transportation network in the prediction and predictability of global epidemics</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<year>2006</year>
;
<volume>103</volume>
(
<issue>7</issue>
):
<fpage>2015</fpage>
<lpage>2020</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.0510525103">10.1073/pnas.0510525103</ext-link>
</comment>
<pub-id pub-id-type="pmid">16461461</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref017">
<label>17</label>
<mixed-citation publication-type="journal">
<name>
<surname>Balcan</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Colizza</surname>
<given-names>V</given-names>
</name>
,
<name>
<surname>Gonçalves</surname>
<given-names>B</given-names>
</name>
,
<name>
<surname>Hu</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Ramasco</surname>
<given-names>JJ</given-names>
</name>
,
<name>
<surname>Vespignani</surname>
<given-names>A</given-names>
</name>
.
<article-title>Multiscale mobility networks and the spatial spreading of infectious diseases</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<year>2009</year>
;
<volume>106</volume>
(
<issue>51</issue>
):
<fpage>21484</fpage>
<lpage>21489</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.0906910106">10.1073/pnas.0906910106</ext-link>
</comment>
<pub-id pub-id-type="pmid">20018697</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref018">
<label>18</label>
<mixed-citation publication-type="book">
<name>
<surname>Batty</surname>
<given-names>M</given-names>
</name>
.
<source>The New Science of Cities</source>
.
<publisher-name>MIT Press</publisher-name>
;
<year>2013</year>
.</mixed-citation>
</ref>
<ref id="pone.0134508.ref019">
<label>19</label>
<mixed-citation publication-type="journal">
<name>
<surname>Zipf</surname>
<given-names>GK</given-names>
</name>
.
<article-title>The P<sub>1</sub>P<sub>2</sub>/D hypothesis: on the intercity movement of persons</article-title>
.
<source>American Sociological Review</source>
.
<year>1946</year>
;
<volume>11</volume>
(
<issue>6</issue>
):
<fpage>677</fpage>
<lpage>686</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2307/2087063">10.2307/2087063</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref020">
<label>20</label>
<mixed-citation publication-type="journal">
<name>
<surname>Simini</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>González</surname>
<given-names>MC</given-names>
</name>
,
<name>
<surname>Maritan</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Barabási</surname>
<given-names>AL</given-names>
</name>
.
<article-title>A universal model for mobility and migration patterns</article-title>
.
<source>Nature</source>
.
<year>2012</year>
;
<volume>484</volume>
(
<issue>7392</issue>
):
<fpage>96</fpage>
<lpage>100</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature10856">10.1038/nature10856</ext-link>
</comment>
<pub-id pub-id-type="pmid">22367540</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref021">
<label>21</label>
<mixed-citation publication-type="journal">
<name>
<surname>Stouffer</surname>
<given-names>SA</given-names>
</name>
.
<article-title>Intervening Opportunities: A Theory Relating Mobility and Distance</article-title>
.
<source>American Sociological Review</source>
.
<year>1940</year>
;
<volume>5</volume>
(
<issue>6</issue>
):
<fpage>845</fpage>
<lpage>867</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2307/2084520">10.2307/2084520</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref022">
<label>22</label>
<mixed-citation publication-type="journal">
<name>
<surname>Clauset</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Shalizi</surname>
<given-names>CR</given-names>
</name>
,
<name>
<surname>Newman</surname>
<given-names>MEJ</given-names>
</name>
.
<article-title>Power-Law Distributions in Empirical Data</article-title>
.
<source>SIAM Rev Soc Ind Appl Math</source>
.
<year>2009</year>
;
<volume>51</volume>
(
<issue>4</issue>
):
<fpage>661</fpage>
<lpage>703</lpage>
.</mixed-citation>
</ref>
<ref id="pone.0134508.ref023">
<label>23</label>
<mixed-citation publication-type="other">Sagarra O. Statistical Complex Analysis of Taxi Mobility in San Francisco. M. Sc. Thesis. Universitat Politècnica de Catalunya; 2011.</mixed-citation>
</ref>
<ref id="pone.0134508.ref024">
<label>24</label>
<mixed-citation publication-type="journal">
<name>
<surname>Santi</surname>
<given-names>P</given-names>
</name>
,
<name>
<surname>Resta</surname>
<given-names>G</given-names>
</name>
,
<name>
<surname>Szell</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Sobolevsky</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Strogatz</surname>
<given-names>SH</given-names>
</name>
,
<name>
<surname>Ratti</surname>
<given-names>C</given-names>
</name>
.
<article-title>Quantifying the Benefits of Vehicle Pooling with Shareability Networks</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<year>2014</year>
;
<volume>111</volume>
(
<issue>37</issue>
):
<fpage>13290</fpage>
<lpage>13294</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.1403657111">10.1073/pnas.1403657111</ext-link>
</comment>
<pub-id pub-id-type="pmid">25197046</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref025">
<label>25</label>
<mixed-citation publication-type="other">Piorkowski M, Sarafijanovic-Djukic N, Grossglauser M. A Parsimonious Model of Mobile Partitioned Networks. Int Conf Commun Syst Netw (COMSNET) IEEE. 2009;1–10.</mixed-citation>
</ref>
<ref id="pone.0134508.ref026">
<label>26</label>
<mixed-citation publication-type="other">Spieser K, Treleaven K, Zhang R, Frazzoli E, Morton D, Pavone M. Towards a Systematic Approach to the Design and Evaluation of Automated Mobility-on-Demand Systems: a Case Study in Singapore. Road Vehicle Automation. 2014;229–245.</mixed-citation>
</ref>
<ref id="pone.0134508.ref027">
<label>27</label>
<mixed-citation publication-type="journal">
<name>
<surname>Wilson</surname>
<given-names>A</given-names>
</name>
.
<article-title>A statistical theory of spatial distribution models</article-title>
.
<source>Transportation Research</source>
.
<year>1967</year>
;
<volume>1</volume>
(
<issue>3</issue>
):
<fpage>253</fpage>
<lpage>269</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/0041-1647(67)90035-4">10.1016/0041-1647(67)90035-4</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref028">
<label>28</label>
<mixed-citation publication-type="book">
<name>
<surname>Erlander</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Stewart</surname>
<given-names>NF</given-names>
</name>
.
<source>The gravity model in transportation analysis: theory and extensions</source>
.
<publisher-name>VSP Utrecht</publisher-name>
;
<year>1990</year>
.</mixed-citation>
</ref>
<ref id="pone.0134508.ref029">
<label>29</label>
<mixed-citation publication-type="journal">
<name>
<surname>Sagarra</surname>
<given-names>O</given-names>
</name>
,
<name>
<surname>Pérez-Vicente</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Díaz-Guilera</surname>
<given-names>A</given-names>
</name>
.
<article-title>Statistical mechanics of multi-edge networks</article-title>
.
<source>Phys Rev E</source>
.
<year>2013</year>
;
<volume>88</volume>
:
<fpage>062806</fpage>
;
<fpage>1</fpage>
<lpage>14</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1103/PhysRevE.88.062806">10.1103/PhysRevE.88.062806</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref030">
<label>30</label>
<mixed-citation publication-type="journal">
<name>
<surname>Sagarra</surname>
<given-names>O</given-names>
</name>
,
<name>
<surname>Font-Clos</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Pérez-Vicente</surname>
<given-names>CJ</given-names>
</name>
,
<name>
<surname>Díaz-Guilera</surname>
<given-names>A</given-names>
</name>
.
<article-title>The configuration multi-edge model: Assessing the effect of fixing node strengths on weighted network magnitudes</article-title>
.
<source>Europhys Lett</source>
.
<year>2014</year>
;
<volume>107</volume>
(
<issue>3</issue>
):
<fpage>38002</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1209/0295-5075/107/38002">10.1209/0295-5075/107/38002</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref031">
<label>31</label>
<mixed-citation publication-type="journal">
<name>
<surname>Peng</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Jin</surname>
<given-names>X</given-names>
</name>
,
<name>
<surname>Wong</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Shi</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Lio</surname>
<given-names>P</given-names>
</name>
.
<article-title>Collective Human Mobility Pattern from Taxi Trips in Urban Area</article-title>
.
<source>PLoS ONE</source>
.
<year>2012</year>
;
<volume>7</volume>
(
<issue>4</issue>
):
<fpage>e34487</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0034487">10.1371/journal.pone.0034487</ext-link>
</comment>
<pub-id pub-id-type="pmid">22529917</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref032">
<label>32</label>
<mixed-citation publication-type="journal">
<name>
<surname>Yang</surname>
<given-names>Y</given-names>
</name>
,
<name>
<surname>Herrera</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Eagle</surname>
<given-names>N</given-names>
</name>
,
<name>
<surname>González</surname>
<given-names>MC</given-names>
</name>
.
<article-title>Limits of Predictability in Commuting Flows in the Absence of Data for Calibration</article-title>
.
<source>Sci Rep</source>
.
<year>2014</year>
;
<volume>4</volume>
:
<fpage>5662</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/srep05662">10.1038/srep05662</ext-link>
</comment>
<pub-id pub-id-type="pmid">25012599</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref033">
<label>33</label>
<mixed-citation publication-type="journal">
<name>
<surname>Bianconi</surname>
<given-names>G</given-names>
</name>
,
<name>
<surname>Pin</surname>
<given-names>P</given-names>
</name>
,
<name>
<surname>Marsili</surname>
<given-names>M</given-names>
</name>
.
<article-title>Assessing the relevance of node features for network structure</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<year>2009</year>
;
<volume>106</volume>
(
<issue>28</issue>
):
<fpage>11433</fpage>
<lpage>11438</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.0811511106">10.1073/pnas.0811511106</ext-link>
</comment>
<pub-id pub-id-type="pmid">19571013</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref034">
<label>34</label>
<mixed-citation publication-type="journal">
<name>
<surname>Lenormand</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Picornell</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Cantú-Ros</surname>
<given-names>OG</given-names>
</name>
,
<name>
<surname>Tugores</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Louail</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Herranz</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Barthélemy</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Frías-Martínez</surname>
<given-names>E</given-names>
</name>
,
<name>
<surname>Ramasco</surname>
<given-names>JJ</given-names>
</name>
.
<article-title>Cross-checking different sources of mobility information</article-title>
.
<source>PLoS ONE</source>
.
<year>2014</year>
;
<volume>9</volume>
(
<issue>8</issue>
):
<fpage>e105184</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0105184">10.1371/journal.pone.0105184</ext-link>
</comment>
<pub-id pub-id-type="pmid">25133549</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref035">
<label>35</label>
<mixed-citation publication-type="journal">
<name>
<surname>Lenormand</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Huet</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Gargiulo</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Deffuant</surname>
<given-names>G</given-names>
</name>
.
<article-title>A universal model of commuting networks</article-title>
.
<source>PLoS ONE</source>
.
<year>2012</year>
;
<volume>7</volume>
(
<issue>10</issue>
):
<fpage>e45985</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0045985">10.1371/journal.pone.0045985</ext-link>
</comment>
<pub-id pub-id-type="pmid">23049691</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref036">
<label>36</label>
<mixed-citation publication-type="journal">
<name>
<surname>Robillard</surname>
<given-names>P</given-names>
</name>
.
<article-title>Estimating the OD matrix from observed link volumes</article-title>
.
<source>Transportation Research</source>
.
<year>1975</year>
;
<volume>9</volume>
(
<issue>2</issue>
):
<fpage>123</fpage>
<lpage>128</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/0041-1647(75)90049-0">10.1016/0041-1647(75)90049-0</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0134508.ref037">
<label>37</label>
<mixed-citation publication-type="other">Sagarra O. ODME: Origin Destination Multi-Edge network package; 2014. Available from:
<ext-link ext-link-type="uri" xlink:href="https://github.com/osagarra/ODME_lite">https://github.com/osagarra/ODME_lite</ext-link>
.</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/TelematiV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000033 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000033 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    TelematiV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4537279
   |texte=   Supersampling and Network Reconstruction of Urban Mobility
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:26275237" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a TelematiV1 

Wicri

This area was generated with Dilib version V0.6.31.
Data generation: Thu Nov 2 16:09:04 2017. Site generation: Sun Mar 10 16:42:28 2024