Exploration server for computer science research in Lorraine

Please note: this site is under development!
Please note: this site is generated automatically from raw corpora.
The information is therefore not validated.

Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed

Internal identifier: 002E27 (Istex/Corpus); previous: 002E26; next: 002E28


Authors: Raphaël Bolze; Franck Cappello; Eddy Caron; Michel Daydé; Frédéric Desprez; Emmanuel Jeannot; Yvon Jégou; Stephane Lanteri; Julien Leduc; Noredine Melab; Guillaume Mornet; Raymond Namyst; Pascale Primet; Benjamin Quetier; Olivier Richard; El-Ghazali Talbi; Iréa Touche

Source:

RBID : ISTEX:C3211077D2750B2369FA4662C4215BD8245E6D38

English descriptors

Abstract

Large scale distributed systems such as Grids are difficult to study from theoretical models and simulators only. Most Grids deployed at large scale are production platforms that are inappropriate research tools because of their limited reconfiguration, control and monitoring capabilities. In this paper, we present Grid'5000, a 5000 CPU nation-wide infrastructure for research in Grid computing. Grid'5000 is designed to provide a scientific tool for computer scientists similar to the large-scale instruments used by physicists, astronomers, and biologists. We describe the motivations, design considerations, architecture, control, and monitoring infrastructure of this experimental platform. We present configuration examples and performance results for the reconfiguration subsystem.

URL:
DOI: 10.1177/1094342006070078

Links to Exploration step

ISTEX:C3211077D2750B2369FA4662C4215BD8245E6D38

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed</title>
<author wicri:is="90%">
<name sortKey="Bolze, Raphael" sort="Bolze, Raphael" uniqKey="Bolze R" first="Raphaël" last="Bolze">Raphaël Bolze</name>
<affiliation>
<mods:affiliation>Lip, Ens Lyon</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Cappello, Franck" sort="Cappello, Franck" uniqKey="Cappello F" first="Franck" last="Cappello">Franck Cappello</name>
<affiliation>
<mods:affiliation>Inria, Lri, Paris</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: Fci@Lri.Fr</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Caron, Eddy" sort="Caron, Eddy" uniqKey="Caron E" first="Eddy" last="Caron">Eddy Caron</name>
<affiliation>
<mods:affiliation>Lip, Ens Lyon</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Dayde, Michel" sort="Dayde, Michel" uniqKey="Dayde M" first="Michel" last="Daydé">Michel Daydé</name>
<affiliation>
<mods:affiliation>Inpt/Irit, Toulouse</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Desprez, Frederic" sort="Desprez, Frederic" uniqKey="Desprez F" first="Frédéric" last="Desprez">Frédéric Desprez</name>
<affiliation>
<mods:affiliation>Lip, Ens Lyon</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Jeannot, Emmanuel" sort="Jeannot, Emmanuel" uniqKey="Jeannot E" first="Emmanuel" last="Jeannot">Emmanuel Jeannot</name>
<affiliation>
<mods:affiliation>Loria, Inria</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Jegou, Yvon" sort="Jegou, Yvon" uniqKey="Jegou Y" first="Yvon" last="Jégou">Yvon Jégou</name>
<affiliation>
<mods:affiliation>Irisa, Inria</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Lanteri, Stephane" sort="Lanteri, Stephane" uniqKey="Lanteri S" first="Stephane" last="Lanteri">Stephane Lanteri</name>
<affiliation>
<mods:affiliation>Inria Sophia Antipolis</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Leduc, Julien" sort="Leduc, Julien" uniqKey="Leduc J" first="Julien" last="Leduc">Julien Leduc</name>
<affiliation>
<mods:affiliation>Inria, Lri, Paris</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: Fci@Lri.Fr</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Melab, Noredine" sort="Melab, Noredine" uniqKey="Melab N" first="Noredine" last="Melab">Noredine Melab</name>
<affiliation>
<mods:affiliation>Lifl, Université De Lille</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Mornet, Guillaume" sort="Mornet, Guillaume" uniqKey="Mornet G" first="Guillaume" last="Mornet">Guillaume Mornet</name>
<affiliation>
<mods:affiliation>Irisa, Inria</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Namyst, Raymond" sort="Namyst, Raymond" uniqKey="Namyst R" first="Raymond" last="Namyst">Raymond Namyst</name>
<affiliation>
<mods:affiliation>Labri, Université De Bordeaux</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Primet, Pascale" sort="Primet, Pascale" uniqKey="Primet P" first="Pascale" last="Primet">Pascale Primet</name>
<affiliation>
<mods:affiliation>Lip, Ens Lyon</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Quetier, Benjamin" sort="Quetier, Benjamin" uniqKey="Quetier B" first="Benjamin" last="Quetier">Benjamin Quetier</name>
<affiliation>
<mods:affiliation>Inria, Lri, Paris</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: Fci@Lri.Fr</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Richard, Olivier" sort="Richard, Olivier" uniqKey="Richard O" first="Olivier" last="Richard">Olivier Richard</name>
<affiliation>
<mods:affiliation>Laboratoire Id-Imag</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Talbi, El Ghazali" sort="Talbi, El Ghazali" uniqKey="Talbi E" first="El-Ghazali" last="Talbi">El-Ghazali Talbi</name>
<affiliation>
<mods:affiliation>Lifl, Université De Lille</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Touche, Irea" sort="Touche, Irea" uniqKey="Touche I" first="Iréa" last="Touche">Iréa Touche</name>
<affiliation>
<mods:affiliation>Lgc, Toulouse</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:C3211077D2750B2369FA4662C4215BD8245E6D38</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1177/1094342006070078</idno>
<idno type="url">https://api.istex.fr/ark:/67375/M70-8RJ7Z37Z-T/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002E27</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">002E27</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed</title>
<author wicri:is="90%">
<name sortKey="Bolze, Raphael" sort="Bolze, Raphael" uniqKey="Bolze R" first="Raphaël" last="Bolze">Raphaël Bolze</name>
<affiliation>
<mods:affiliation>Lip, Ens Lyon</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Cappello, Franck" sort="Cappello, Franck" uniqKey="Cappello F" first="Franck" last="Cappello">Franck Cappello</name>
<affiliation>
<mods:affiliation>Inria, Lri, Paris</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: Fci@Lri.Fr</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Caron, Eddy" sort="Caron, Eddy" uniqKey="Caron E" first="Eddy" last="Caron">Eddy Caron</name>
<affiliation>
<mods:affiliation>Lip, Ens Lyon</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Dayde, Michel" sort="Dayde, Michel" uniqKey="Dayde M" first="Michel" last="Daydé">Michel Daydé</name>
<affiliation>
<mods:affiliation>Inpt/Irit, Toulouse</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Desprez, Frederic" sort="Desprez, Frederic" uniqKey="Desprez F" first="Frédéric" last="Desprez">Frédéric Desprez</name>
<affiliation>
<mods:affiliation>Lip, Ens Lyon</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Jeannot, Emmanuel" sort="Jeannot, Emmanuel" uniqKey="Jeannot E" first="Emmanuel" last="Jeannot">Emmanuel Jeannot</name>
<affiliation>
<mods:affiliation>Loria, Inria</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Jegou, Yvon" sort="Jegou, Yvon" uniqKey="Jegou Y" first="Yvon" last="Jégou">Yvon Jégou</name>
<affiliation>
<mods:affiliation>Irisa, Inria</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Lanteri, Stephane" sort="Lanteri, Stephane" uniqKey="Lanteri S" first="Stephane" last="Lanteri">Stephane Lanteri</name>
<affiliation>
<mods:affiliation>Inria Sophia Antipolis</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Leduc, Julien" sort="Leduc, Julien" uniqKey="Leduc J" first="Julien" last="Leduc">Julien Leduc</name>
<affiliation>
<mods:affiliation>Inria, Lri, Paris</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: Fci@Lri.Fr</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Melab, Noredine" sort="Melab, Noredine" uniqKey="Melab N" first="Noredine" last="Melab">Noredine Melab</name>
<affiliation>
<mods:affiliation>Lifl, Université De Lille</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Mornet, Guillaume" sort="Mornet, Guillaume" uniqKey="Mornet G" first="Guillaume" last="Mornet">Guillaume Mornet</name>
<affiliation>
<mods:affiliation>Irisa, Inria</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Namyst, Raymond" sort="Namyst, Raymond" uniqKey="Namyst R" first="Raymond" last="Namyst">Raymond Namyst</name>
<affiliation>
<mods:affiliation>Labri, Université De Bordeaux</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Primet, Pascale" sort="Primet, Pascale" uniqKey="Primet P" first="Pascale" last="Primet">Pascale Primet</name>
<affiliation>
<mods:affiliation>Lip, Ens Lyon</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Quetier, Benjamin" sort="Quetier, Benjamin" uniqKey="Quetier B" first="Benjamin" last="Quetier">Benjamin Quetier</name>
<affiliation>
<mods:affiliation>Inria, Lri, Paris</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: Fci@Lri.Fr</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Richard, Olivier" sort="Richard, Olivier" uniqKey="Richard O" first="Olivier" last="Richard">Olivier Richard</name>
<affiliation>
<mods:affiliation>Laboratoire Id-Imag</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Talbi, El Ghazali" sort="Talbi, El Ghazali" uniqKey="Talbi E" first="El-Ghazali" last="Talbi">El-Ghazali Talbi</name>
<affiliation>
<mods:affiliation>Lifl, Université De Lille</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Touche, Irea" sort="Touche, Irea" uniqKey="Touche I" first="Iréa" last="Touche">Iréa Touche</name>
<affiliation>
<mods:affiliation>Lgc, Toulouse</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">The International Journal of High Performance Computing Applications</title>
<idno type="ISSN">1094-3420</idno>
<idno type="eISSN">1741-2846</idno>
<imprint>
<publisher>Sage Publications</publisher>
<pubPlace>Sage CA: Thousand Oaks, CA</pubPlace>
<date type="published" when="2006-11">2006-11</date>
<biblScope unit="volume">20</biblScope>
<biblScope unit="issue">4</biblScope>
<biblScope unit="page" from="481">481</biblScope>
<biblScope unit="page" to="494">494</biblScope>
</imprint>
<idno type="ISSN">1094-3420</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1094-3420</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="Teeft" xml:lang="en">
<term>Algorithm</term>
<term>Associate professor</term>
<term>Batch scheduler</term>
<term>Bittorrent</term>
<term>Computational</term>
<term>Computational grid</term>
<term>Computer science</term>
<term>Configuration examples</term>
<term>Default environment</term>
<term>Deployment</term>
<term>Ecole normale superieure</term>
<term>Emulator</term>
<term>Experiment diversity</term>
<term>Experimental platform</term>
<term>Fault tolerance</term>
<term>Franck cappello</term>
<term>Front node</term>
<term>Georgiou</term>
<term>Globus</term>
<term>Grid</term>
<term>Grid level</term>
<term>Grid middleware</term>
<term>High performance</term>
<term>Ieee</term>
<term>Ieee computer society</term>
<term>Infrastructure</term>
<term>Inria</term>
<term>Inria sophia antipolis</term>
<term>Institut</term>
<term>International journal</term>
<term>Internet</term>
<term>Kadeploy2</term>
<term>Large scale</term>
<term>Lille</term>
<term>Middleware</term>
<term>Mpls</term>
<term>Networking</term>
<term>Next section</term>
<term>Node</term>
<term>Optimal solution</term>
<term>Optimization</term>
<term>Protocol</term>
<term>Real platforms</term>
<term>Real software</term>
<term>Reboot</term>
<term>Reboot system</term>
<term>Recherche</term>
<term>Reconfiguration</term>
<term>Research interests</term>
<term>Researcher</term>
<term>Scheduler</term>
<term>Scheduling</term>
<term>Server</term>
<term>Simulator</term>
<term>Software</term>
<term>Software environment</term>
<term>Software image</term>
<term>Steering committee</term>
<term>Toolkit</term>
<term>Toulouse</term>
<term>User</term>
<term>User accounts</term>
<term>Validation</term>
<term>Virtual globus grid</term>
<term>Virtualization</term>
<term>Work unit</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Large scale distributed systems such as Grids are difficult to study from theoretical models and simulators only. Most Grids deployed at large scale are production platforms that are inappropriate research tools because of their limited reconfiguration, control and monitoring capabilities. In this paper, we present Grid'5000, a 5000 CPU nation-wide infrastructure for research in Grid computing. Grid'5000 is designed to provide a scientific tool for computer scientists similar to the large-scale instruments used by physicists, astronomers, and biologists. We describe the motivations, design considerations, architecture, control, and monitoring infrastructure of this experimental platform. We present configuration examples and performance results for the reconfiguration subsystem.</div>
</front>
</TEI>
<istex>
<corpusName>sage</corpusName>
<keywords>
<teeft>
<json:string>grid</json:string>
<json:string>node</json:string>
<json:string>software</json:string>
<json:string>inria</json:string>
<json:string>reconfiguration</json:string>
<json:string>reboot</json:string>
<json:string>globus</json:string>
<json:string>deployment</json:string>
<json:string>computer science</json:string>
<json:string>bittorrent</json:string>
<json:string>infrastructure</json:string>
<json:string>middleware</json:string>
<json:string>server</json:string>
<json:string>scheduler</json:string>
<json:string>internet</json:string>
<json:string>emulator</json:string>
<json:string>simulator</json:string>
<json:string>mpls</json:string>
<json:string>networking</json:string>
<json:string>kadeploy2</json:string>
<json:string>ieee</json:string>
<json:string>lille</json:string>
<json:string>large scale</json:string>
<json:string>virtualization</json:string>
<json:string>toolkit</json:string>
<json:string>georgiou</json:string>
<json:string>algorithm</json:string>
<json:string>user</json:string>
<json:string>high performance</json:string>
<json:string>user accounts</json:string>
<json:string>toulouse</json:string>
<json:string>scheduling</json:string>
<json:string>researcher</json:string>
<json:string>software image</json:string>
<json:string>grid middleware</json:string>
<json:string>reboot system</json:string>
<json:string>research interests</json:string>
<json:string>ecole normale superieure</json:string>
<json:string>franck cappello</json:string>
<json:string>real platforms</json:string>
<json:string>experimental platform</json:string>
<json:string>work unit</json:string>
<json:string>computational grid</json:string>
<json:string>computational</json:string>
<json:string>validation</json:string>
<json:string>optimization</json:string>
<json:string>recherche</json:string>
<json:string>software environment</json:string>
<json:string>international journal</json:string>
<json:string>fault tolerance</json:string>
<json:string>configuration examples</json:string>
<json:string>grid level</json:string>
<json:string>default environment</json:string>
<json:string>batch scheduler</json:string>
<json:string>experiment diversity</json:string>
<json:string>virtual globus grid</json:string>
<json:string>front node</json:string>
<json:string>real software</json:string>
<json:string>inria sophia antipolis</json:string>
<json:string>optimal solution</json:string>
<json:string>ieee computer society</json:string>
<json:string>steering committee</json:string>
<json:string>associate professor</json:string>
<json:string>next section</json:string>
<json:string>institut</json:string>
<json:string>protocol</json:string>
<json:string>lyon</json:string>
<json:string>resource management system</json:string>
<json:string>simple broker</json:string>
<json:string>virtual machines</json:string>
<json:string>specific queue</json:string>
<json:string>node reconfiguration</json:string>
<json:string>grid community</json:string>
<json:string>experimental platforms</json:string>
<json:string>reboot order</json:string>
<json:string>light kernel</json:string>
<json:string>user partition</json:string>
<json:string>user environment</json:string>
<json:string>boot time</json:string>
<json:string>time diagram</json:string>
<json:string>scheduling algorithms</json:string>
<json:string>desktop grid</json:string>
<json:string>computational grids</json:string>
<json:string>transport protocol</json:string>
<json:string>system image</json:string>
<json:string>user home directory</json:string>
<json:string>bittorrent master node</json:string>
<json:string>different aspects</json:string>
<json:string>globus installation</json:string>
<json:string>globus toolkit</json:string>
<json:string>software stack</json:string>
<json:string>main goal</json:string>
<json:string>heterogeneous environments</json:string>
<json:string>first step</json:string>
<json:string>total execution time</json:string>
<json:string>search tree</json:string>
<json:string>deployment tool</json:string>
<json:string>dynamic availability</json:string>
<json:string>problem instance</json:string>
<json:string>rhodes island</json:string>
<json:string>total time</json:string>
<json:string>parallel efficiency</json:string>
<json:string>several sites</json:string>
<json:string>high speed networks</json:string>
<json:string>easy access</json:string>
<json:string>production platforms</json:string>
<json:string>mpls technology</json:string>
<json:string>program committees</json:string>
<json:string>grid architecture</json:string>
<json:string>assistant professor</json:string>
<json:string>technical manager</json:string>
<json:string>communication architecture</json:string>
<json:string>parallel libraries</json:string>
<json:string>memory machines</json:string>
<json:string>current research interests</json:string>
<json:string>networking protocols</json:string>
<json:string>main research interests</json:string>
<json:string>laboratoire fondamentale</json:string>
<json:string>opac team</json:string>
<json:string>dolphin project</json:string>
<json:string>inria futurs</json:string>
<json:string>other sites</json:string>
<json:string>combinatorial optimization algorithms</json:string>
<json:string>software frameworks</json:string>
<json:string>pascale primet</json:string>
<json:string>cluster networking</json:string>
<json:string>active networks</json:string>
<json:string>networking journals</json:string>
<json:string>monitoring infrastructure</json:string>
<json:string>kernel</json:string>
</teeft>
</keywords>
<author>
<json:item>
<name>Raphaël Bolze</name>
<affiliations>
<json:string>Lip, Ens Lyon</json:string>
</affiliations>
</json:item>
<json:item>
<name>Franck Cappello</name>
<affiliations>
<json:string>Inria, Lri, Paris</json:string>
<json:string>E-mail: Fci@Lri.Fr</json:string>
</affiliations>
</json:item>
<json:item>
<name>Eddy Caron</name>
<affiliations>
<json:string>Lip, Ens Lyon</json:string>
</affiliations>
</json:item>
<json:item>
<name>Michel Daydé</name>
<affiliations>
<json:string>Inpt/Irit, Toulouse</json:string>
</affiliations>
</json:item>
<json:item>
<name>Frédéric Desprez</name>
<affiliations>
<json:string>Lip, Ens Lyon</json:string>
</affiliations>
</json:item>
<json:item>
<name>Emmanuel Jeannot</name>
<affiliations>
<json:string>Loria, Inria</json:string>
</affiliations>
</json:item>
<json:item>
<name>Yvon Jégou</name>
<affiliations>
<json:string>Irisa, Inria</json:string>
</affiliations>
</json:item>
<json:item>
<name>Stephane Lanteri</name>
<affiliations>
<json:string>Inria Sophia Antipolis</json:string>
</affiliations>
</json:item>
<json:item>
<name>Julien Leduc</name>
<affiliations>
<json:string>Inria, Lri, Paris</json:string>
<json:string>E-mail: Fci@Lri.Fr</json:string>
</affiliations>
</json:item>
<json:item>
<name>Noredine Melab</name>
<affiliations>
<json:string>Lifl, Université De Lille</json:string>
</affiliations>
</json:item>
<json:item>
<name>Guillaume Mornet</name>
<affiliations>
<json:string>Irisa, Inria</json:string>
</affiliations>
</json:item>
<json:item>
<name>Raymond Namyst</name>
<affiliations>
<json:string>Labri, Université De Bordeaux</json:string>
</affiliations>
</json:item>
<json:item>
<name>Pascale Primet</name>
<affiliations>
<json:string>Lip, Ens Lyon</json:string>
</affiliations>
</json:item>
<json:item>
<name>Benjamin Quetier</name>
<affiliations>
<json:string>Inria, Lri, Paris</json:string>
<json:string>E-mail: Fci@Lri.Fr</json:string>
</affiliations>
</json:item>
<json:item>
<name>Olivier Richard</name>
<affiliations>
<json:string>Laboratoire Id-Imag</json:string>
</affiliations>
</json:item>
<json:item>
<name>El-Ghazali Talbi</name>
<affiliations>
<json:string>Lifl, Université De Lille</json:string>
</affiliations>
</json:item>
<json:item>
<name>Iréa Touche</name>
<affiliations>
<json:string>Lgc, Toulouse</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Grid</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>P2P</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>experimental platform</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>highly reconfigurable system</value>
</json:item>
</subject>
<articleId>
<json:string>10.1177_1094342006070078</json:string>
</articleId>
<arkIstex>ark:/67375/M70-8RJ7Z37Z-T</arkIstex>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>research-article</json:string>
</originalGenre>
<abstract>Large scale distributed systems such as Grids are difficult to study from theoretical models and simulators only. Most Grids deployed at large scale are production platforms that are inappropriate research tools because of their limited reconfiguration, control and monitoring capabilities. In this paper, we present Grid'5000, a 5000 CPU nation-wide infrastructure for research in Grid computing. Grid'5000 is designed to provide a scientific tool for computer scientists similar to the large-scale instruments used by physicists, astronomers, and biologists. We describe the motivations, design considerations, architecture, control, and monitoring infrastructure of this experimental platform. We present configuration examples and performance results for the reconfiguration subsystem.</abstract>
<qualityIndicators>
<score>8.248</score>
<pdfWordCount>8532</pdfWordCount>
<pdfCharCount>53716</pdfCharCount>
<pdfVersion>1.5</pdfVersion>
<pdfPageCount>14</pdfPageCount>
<pdfPageSize>595 x 842 pts (A4)</pdfPageSize>
<refBibsNative>true</refBibsNative>
<abstractWordCount>104</abstractWordCount>
<abstractCharCount>789</abstractCharCount>
<keywordCount>4</keywordCount>
</qualityIndicators>
<title>Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed</title>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<title>The International Journal of High Performance Computing Applications</title>
<language>
<json:string>unknown</json:string>
</language>
<issn>
<json:string>1094-3420</json:string>
</issn>
<eissn>
<json:string>1741-2846</json:string>
</eissn>
<publisherId>
<json:string>HPC</json:string>
</publisherId>
<volume>20</volume>
<issue>4</issue>
<pages>
<first>481</first>
<last>494</last>
</pages>
<genre>
<json:string>journal</json:string>
</genre>
</host>
<namedEntities>
<unitex>
<date></date>
<geogName></geogName>
<orgName></orgName>
<orgName_funder></orgName_funder>
<orgName_provider></orgName_provider>
<persName></persName>
<placeName></placeName>
<ref_url></ref_url>
<ref_bibl></ref_bibl>
<bibl></bibl>
</unitex>
</namedEntities>
<ark>
<json:string>ark:/67375/M70-8RJ7Z37Z-T</json:string>
</ark>
<categories>
<wos>
<json:string>1 - science</json:string>
<json:string>2 - computer science, theory & methods</json:string>
<json:string>2 - computer science, interdisciplinary applications</json:string>
<json:string>2 - computer science, hardware & architecture</json:string>
</wos>
<scienceMetrix>
<json:string>1 - applied sciences</json:string>
<json:string>2 - information & communication technologies</json:string>
<json:string>3 - distributed computing</json:string>
</scienceMetrix>
<scopus>
<json:string>1 - Physical Sciences</json:string>
<json:string>2 - Computer Science</json:string>
<json:string>3 - Hardware and Architecture</json:string>
<json:string>1 - Physical Sciences</json:string>
<json:string>2 - Mathematics</json:string>
<json:string>3 - Theoretical Computer Science</json:string>
<json:string>1 - Physical Sciences</json:string>
<json:string>2 - Computer Science</json:string>
<json:string>3 - Software</json:string>
</scopus>
<inist>
<json:string>1 - sciences appliquees, technologies et medecines</json:string>
<json:string>2 - sciences exactes et technologie</json:string>
<json:string>3 - terre, ocean, espace</json:string>
<json:string>4 - astronomie</json:string>
</inist>
</categories>
<publicationDate>2006</publicationDate>
<copyrightDate>2006</copyrightDate>
<doi>
<json:string>10.1177/1094342006070078</json:string>
</doi>
<id>C3211077D2750B2369FA4662C4215BD8245E6D38</id>
<score>1</score>
<fulltext>
<json:item>
<extension>pdf</extension>
<original>true</original>
<mimetype>application/pdf</mimetype>
<uri>https://api.istex.fr/ark:/67375/M70-8RJ7Z37Z-T/fulltext.pdf</uri>
</json:item>
<json:item>
<extension>zip</extension>
<original>false</original>
<mimetype>application/zip</mimetype>
<uri>https://api.istex.fr/ark:/67375/M70-8RJ7Z37Z-T/bundle.zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/ark:/67375/M70-8RJ7Z37Z-T/fulltext.tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher scheme="https://scientific-publisher.data.istex.fr">Sage Publications</publisher>
<pubPlace>Sage CA: Thousand Oaks, CA</pubPlace>
<availability>
<licence>
<p>sage</p>
</licence>
</availability>
<p scheme="https://loaded-corpus.data.istex.fr/ark:/67375/XBH-0J1N7DQT-B"></p>
<date>2006</date>
</publicationStmt>
<notesStmt>
<note type="research-article" scheme="https://content-type.data.istex.fr/ark:/67375/XTP-1JC4F85T-7">research-article</note>
<note type="journal" scheme="https://publication-type.data.istex.fr/ark:/67375/JMC-0GLKJH51-B">journal</note>
</notesStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed</title>
<author xml:id="author-0000">
<persName>
<forename type="first">Raphaël</forename>
<surname>Bolze</surname>
</persName>
<affiliation>Lip, Ens Lyon</affiliation>
</author>
<author xml:id="author-0001">
<persName>
<forename type="first">Franck</forename>
<surname>Cappello</surname>
</persName>
<email>Fci@Lri.Fr</email>
<affiliation>Inria, Lri, Paris</affiliation>
</author>
<author xml:id="author-0002">
<persName>
<forename type="first">Eddy</forename>
<surname>Caron</surname>
</persName>
<affiliation>Lip, Ens Lyon</affiliation>
</author>
<author xml:id="author-0003">
<persName>
<forename type="first">Michel</forename>
<surname>Daydé</surname>
</persName>
<affiliation>Inpt/Irit, Toulouse</affiliation>
</author>
<author xml:id="author-0004">
<persName>
<forename type="first">Frédéric</forename>
<surname>Desprez</surname>
</persName>
<affiliation>Lip, Ens Lyon</affiliation>
</author>
<author xml:id="author-0005">
<persName>
<forename type="first">Emmanuel</forename>
<surname>Jeannot</surname>
</persName>
<affiliation>Loria, Inria</affiliation>
</author>
<author xml:id="author-0006">
<persName>
<forename type="first">Yvon</forename>
<surname>Jégou</surname>
</persName>
<affiliation>Irisa, Inria</affiliation>
</author>
<author xml:id="author-0007">
<persName>
<forename type="first">Stephane</forename>
<surname>Lanteri</surname>
</persName>
<affiliation>Inria Sophia Antipolis</affiliation>
</author>
<author xml:id="author-0008">
<persName>
<forename type="first">Julien</forename>
<surname>Leduc</surname>
</persName>
<email>Fci@Lri.Fr</email>
<affiliation>Inria, Lri, Paris</affiliation>
</author>
<author xml:id="author-0009">
<persName>
<forename type="first">Noredine</forename>
<surname>Melab</surname>
</persName>
<affiliation>Lifl, Université De Lille</affiliation>
</author>
<author xml:id="author-0010">
<persName>
<forename type="first">Guillaume</forename>
<surname>Mornet</surname>
</persName>
<affiliation>Irisa, Inria</affiliation>
</author>
<author xml:id="author-0011">
<persName>
<forename type="first">Raymond</forename>
<surname>Namyst</surname>
</persName>
<affiliation>Labri, Université De Bordeaux</affiliation>
</author>
<author xml:id="author-0012">
<persName>
<forename type="first">Pascale</forename>
<surname>Primet</surname>
</persName>
<affiliation>Lip, Ens Lyon</affiliation>
</author>
<author xml:id="author-0013">
<persName>
<forename type="first">Benjamin</forename>
<surname>Quetier</surname>
</persName>
<email>Fci@Lri.Fr</email>
<affiliation>Inria, Lri, Paris</affiliation>
</author>
<author xml:id="author-0014">
<persName>
<forename type="first">Olivier</forename>
<surname>Richard</surname>
</persName>
<affiliation>Laboratoire Id-Imag</affiliation>
</author>
<author xml:id="author-0015">
<persName>
<forename type="first">El-Ghazali</forename>
<surname>Talbi</surname>
</persName>
<affiliation>Lifl, Université De Lille</affiliation>
</author>
<author xml:id="author-0016">
<persName>
<forename type="first">Iréa</forename>
<surname>Touche</surname>
</persName>
<affiliation>Lgc, Toulouse</affiliation>
</author>
<idno type="istex">C3211077D2750B2369FA4662C4215BD8245E6D38</idno>
<idno type="ark">ark:/67375/M70-8RJ7Z37Z-T</idno>
<idno type="DOI">10.1177/1094342006070078</idno>
<idno type="article-id">10.1177_1094342006070078</idno>
</analytic>
<monogr>
<title level="j">The International Journal of High Performance Computing Applications</title>
<idno type="pISSN">1094-3420</idno>
<idno type="eISSN">1741-2846</idno>
<idno type="publisher-id">HPC</idno>
<idno type="PublisherID-hwp">sphpc</idno>
<imprint>
<publisher>Sage Publications</publisher>
<pubPlace>Sage CA: Thousand Oaks, CA</pubPlace>
<date type="published" when="2006-11"></date>
<biblScope unit="volume">20</biblScope>
<biblScope unit="issue">4</biblScope>
<biblScope unit="page" from="481">481</biblScope>
<biblScope unit="page" to="494">494</biblScope>
</imprint>
</monogr>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2006</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>Large scale distributed systems such as Grids are difficult to study from theoretical models and simulators only. Most Grids deployed at large scale are production platforms that are inappropriate research tools because of their limited reconfiguration, control and monitoring capabilities. In this paper, we present Grid'5000, a 5000 CPU nation-wide infrastructure for research in Grid computing. Grid'5000 is designed to provide a scientific tool for computer scientists similar to the large-scale instruments used by physicists, astronomers, and biologists. We describe the motivations, design considerations, architecture, control, and monitoring infrastructure of this experimental platform. We present configuration examples and performance results for the reconfiguration subsystem.</p>
</abstract>
<textClass>
<keywords scheme="keyword">
<list>
<head>keywords</head>
<item>
<term>Grid</term>
</item>
<item>
<term>P2P</term>
</item>
<item>
<term>experimental platform</term>
</item>
<item>
<term>highly reconfigurable system</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="2006-11">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<extension>txt</extension>
<original>false</original>
<mimetype>text/plain</mimetype>
<uri>https://api.istex.fr/ark:/67375/M70-8RJ7Z37Z-T/fulltext.txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus sage not found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" URI="journalpublishing.dtd" name="istex:docType"></istex:docType>
<istex:document>
<article article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="hwp">sphpc</journal-id>
<journal-id journal-id-type="publisher-id">HPC</journal-id>
<journal-title>The International Journal of High Performance Computing Applications</journal-title>
<issn pub-type="ppub">1094-3420</issn>
<publisher>
<publisher-name>Sage Publications</publisher-name>
<publisher-loc>Sage CA: Thousand Oaks, CA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.1177/1094342006070078</article-id>
<article-id pub-id-type="publisher-id">10.1177_1094342006070078</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Bolze</surname>
<given-names>Raphaël</given-names>
</name>
<aff>Lip, Ens Lyon</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Cappello</surname>
<given-names>Franck</given-names>
</name>
<aff>Inria, Lri, Paris
<email xlink:type="simple">Fci@Lri.Fr</email>
</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Caron</surname>
<given-names>Eddy</given-names>
</name>
<aff>Lip, Ens Lyon</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Daydé</surname>
<given-names>Michel</given-names>
</name>
<aff>Inpt/Irit, Toulouse</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Desprez</surname>
<given-names>Frédéric</given-names>
</name>
<aff>Lip, Ens Lyon</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Jeannot</surname>
<given-names>Emmanuel</given-names>
</name>
<aff>Loria, Inria</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Jégou</surname>
<given-names>Yvon</given-names>
</name>
<aff>Irisa, Inria</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Lanteri</surname>
<given-names>Stephane</given-names>
</name>
<aff>Inria Sophia Antipolis</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Leduc</surname>
<given-names>Julien</given-names>
</name>
<aff>Inria, Lri, Paris
<email xlink:type="simple">Fci@Lri.Fr</email>
</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Melab</surname>
<given-names>Noredine</given-names>
</name>
<aff>Lifl, Université De Lille</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Mornet</surname>
<given-names>Guillaume</given-names>
</name>
<aff>Irisa, Inria</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Namyst</surname>
<given-names>Raymond</given-names>
</name>
<aff>Labri, Université De Bordeaux</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Primet</surname>
<given-names>Pascale</given-names>
</name>
<aff>Lip, Ens Lyon</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Quetier</surname>
<given-names>Benjamin</given-names>
</name>
<aff>Inria, Lri, Paris
<email xlink:type="simple">Fci@Lri.Fr</email>
</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Richard</surname>
<given-names>Olivier</given-names>
</name>
<aff>Laboratoire Id-Imag</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Talbi</surname>
<given-names>El-Ghazali</given-names>
</name>
<aff>Lifl, Université De Lille</aff>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="author" xlink:type="simple">
<name name-style="western">
<surname>Touche</surname>
<given-names>Iréa</given-names>
</name>
<aff>Lgc, Toulouse</aff>
</contrib>
</contrib-group>
<pub-date pub-type="ppub">
<month>11</month>
<year>2006</year>
</pub-date>
<volume>20</volume>
<issue>4</issue>
<fpage>481</fpage>
<lpage>494</lpage>
<abstract>
<p>Large scale distributed systems such as Grids are difficult to study from theoretical models and simulators only. Most Grids deployed at large scale are production platforms that are inappropriate research tools because of their limited reconfiguration, control and monitoring capabilities. In this paper, we present Grid'5000, a 5000 CPU nation-wide infrastructure for research in Grid computing. Grid'5000 is designed to provide a scientific tool for computer scientists similar to the large-scale instruments used by physicists, astronomers, and biologists. We describe the motivations, design considerations, architecture, control, and monitoring infrastructure of this experimental platform. We present configuration examples and performance results for the reconfiguration subsystem.</p>
</abstract>
<kwd-group>
<kwd>Grid</kwd>
<kwd>P2P</kwd>
<kwd>experimental platform</kwd>
<kwd>highly reconfigurable system</kwd>
</kwd-group>
<custom-meta-wrap>
<custom-meta xlink:type="simple">
<meta-name>sagemeta-type</meta-name>
<meta-value>Journal Article</meta-value>
</custom-meta>
<custom-meta xlink:type="simple">
<meta-name>cover-date</meta-name>
<meta-value>Winter 2006</meta-value>
</custom-meta>
<custom-meta xlink:type="simple">
<meta-name>search-text</meta-name>
<meta-value>GRID'5000: A LARGE SCALE AND HIGHLY RECONFIGURABLE EXPERIMENTAL GRID TESTBED

Raphaël Bolze, Franck Cappello, Eddy Caron, Michel Daydé, Frédéric Desprez, Emmanuel Jeannot, Yvon Jégou, Stephane Lanteri, Julien Leduc, Noredine Melab, Guillaume Mornet, Raymond Namyst, Pascale Primet, Benjamin Quetier, Olivier Richard, El-Ghazali Talbi, Iréa Touche

Abstract
Large scale distributed systems such as Grids are difficult to study from theoretical models and simulators only. Most Grids deployed at large scale are production platforms that are inappropriate research tools because of their limited reconfiguration, control and monitoring capabilities. In this paper, we present Grid'5000, a 5000 CPU nation-wide infrastructure for research in Grid computing. Grid'5000 is designed to provide a scientific tool for computer scientists similar to the large-scale instruments used by physicists, astronomers, and biologists. We describe the motivations, design considerations, architecture, control, and monitoring infrastructure of this experimental platform. We present configuration examples and performance results for the reconfiguration subsystem.

Key words: Grid, P2P, experimental platform, highly reconfigurable system

1 Introduction
Grid is well established as a research domain and proposes technologies that are mature enough to be used for real-life applications. Projects such as e-Science (http://www.nesc.ac.uk), TeraGrid (http://www.teragrid.org), Grid3 (http://www.ivdlg.org/grid2003), DEISA (http://www.deisa.org), and NAREGI (http://www.naregi.org/index_e.html/) demonstrate that large-scale infrastructures can be deployed to provide scientists with fairly easy access to geographically distributed resources belonging to different administration domains. Despite its establishment as a workable computing infrastructure, there are still many issues to be solved and mechanisms needed to optimize in performance, fault tolerance, QoS, security, and fairness.

As large-scale distributed systems, Grid software and architecture combine several characteristics which make them difficult to study by following a theoretical approach. Most of the research conducted in Grids is currently performed using simulators, emulators or production platforms. As discussed in the next section, all these tools have limitations making the study of new software and optimizations difficult. Given the complexity of Grids, there is a strong need for highly configurable real-life experimental platforms that can be controlled and monitored directly. Such tools already exist in other contexts. The closest example is PlanetLab (Chun et al. 2003). It consists of a set of PCs connected to the Internet and forming an experimental distributed system. PlanetLab is used for network studies as well as for distributed systems research.

In this paper we present the Grid'5000 (http://www.grid5000.org) project, still under construction but already in use in France. We first explain the motivation for developing a large scale, real-life experimental platform by discussing the limitations of existing tools. In Section 3, we present the design principles of Grid'5000 which were based on the results of Grid researchers' interviews.
The implementation of Grid'5000 is described in Section 4. In Section 5 we present evaluation results for the deployment and reboot system, a key component of Grid'5000. Section 6 gives some configuration examples, demonstrating the high reconfigurability of the platform.

2 Motivations and Related Work
As with other scientific domains, research in Grid computing is based on a variety of methodologies and tools. Figure 1 presents the spectrum of methodologies used by researchers to study research issues in distributed systems. In large distributed systems, numerous parameters must be considered and complex interactions between resources make analytical modeling impractical. Thus simulators, emulators, and real platforms are preferred.

Simulators focus on a specific behavior or mechanism of the distributed system and abstract the rest of the system. Their fundamental advantage is their independence of the execution platform. For example, Bricks (Takefusa et al. 1999) was proposed for studies and comparisons of scheduling algorithms and frameworks. Researchers can specify network topologies, server architectures, communication models and scheduling framework components to study multi-client, multi-server Grid scenarios. Some Bricks components are replaceable by real software, allowing validation of external software. SimGrid (Casanova, Legrand, and Marchal 2003) is used to study single-client multi-server scheduling in the context of complex, distributed, dynamic, and heterogeneous environments. SimGrid is based on event-driven simulation, providing a set of abstractions and functionalities to build a simulator corresponding to the applications and infrastructures. Resource latency and service rate may be set as constants or evolve according to traces. The topology is fully configurable. GangSim (Dumitrescu and Foster 2005) considers a context where hundreds of institutions and thousands of individuals collectively use tens or hundreds of thousands of computers and the associated storage systems. It models usage policies at the site and Virtual Organization (VO) levels and can combine simulated components with instances of a VO Ganglia Monitoring toolkit running on real resources.

Surprisingly, very few studies have provided validation for these simulators. The validation of Bricks was performed by incorporating NWS (Network Weather Service) in Bricks and comparing the NWS results measured on a real Grid with the ones obtained on a Grid simulated by Bricks. SimGrid validation consisted in comparing the simulator results with the ones obtained analytically on a mathematically tractable problem.

In some situations, complex behaviors and interactions of the distributed system nodes cannot be simulated, because of the difficulty of capturing and extracting the factors influencing the distributed systems. Emulators can address this limitation by executing the actual software part of the distributed system, in its whole complexity. Emulators are generally run on rather ideal infrastructures (i.e. controlled clusters).
MicroGrid (Liu, Xia, and Chien 2004) allows researchers to run Grid applications on virtual Grid resources. Resource virtualization is done by intercepting all direct use of resources. The emulation coordination essentially controls the simulation rate, which is determined by the virtualization ratio for all resources. The emulation time base is controlled by a virtualization library returning adjusted times to the system routines. Accurate processor virtualization relies on specific schedulers and the network virtualization (Liu, Xia, and Chien 2004) uses the MaSSF system for a scalable online network simulation.

[Fig. 1 Methodologies used in distributed system studies.]

The authors of MicroGrid have conducted a thorough validation (Liu, Xia, and Chien 2004). The internal timing of MicroGrid was validated using the AutoPilot system. The capacity of the emulator to enforce memory limitation and to maintain the processing model under CPU and I/O competition was validated using microbenchmarks. Emulation results were compared with experimental ones on real platforms for the NAS benchmark, in order to validate the full emulation engine. Validation with real applications compared the execution times of the CACTUS problem solving environment, Jacobi, ScaLAPACK, Fish, Game of life, and Fasta on real platforms with the ones obtained by MicroGrid. Emulab (White et al. 2002) is another emulator, originally designed for network emulation. It provides advanced controlling mechanisms for the user, allowing the rebooting of nodes in specific OS configurations and the control of the network topology.

Because emulators use the real software, they cannot scale as well as simulators. Furthermore, there is still a gap between emulators and the reality: even traffic and fault injection techniques, generally based on traces or synthetic generators, cannot capture all the dynamic, variety and complexity of real-life conditions. Real-life experimental platforms solve this problem by running the real software on realistic hardware. DAS2 (http://www.cs.vu.nl/das2/) is basically an idealized Grid, all sites being connected on the Internet. Experiments are run on top of a Grid middleware managing the classical security and runtime interface issues related to Grid platforms. The nodes are voluntarily homogeneous, providing a much simpler management and helping a better environment for performance comparison (speed up of parallel applications) and understanding. PlanetLab (Chun et al. 2003) is another real-life experimental platform, connecting real machines through the Internet, at the planet scale. Some production Grids (TeraGrid, eScience, DataGrid) have also been used as experimental platforms, before being opened to actual users or during dedicated time slots.

Two major limitations of real-life platforms as experimentation tools are 1) their low software reconfiguration capability and 2) the lack of deep control and monitoring mechanisms for the users. The next section highlights how Grid'5000 addresses these limitations.

3 Designing Grid'5000
The design of Grid'5000 derives from the combination of 1) the limitations observed in simulators, emulators and real platforms and 2) an investigation into the research topics conducted by the Grid community. These two elements led to the proposal for a large scale experimental tool, with deep reconfiguration capability, a controlled level of heterogeneity and a strong control and monitoring infrastructure.
3.1 Experiment Diversity
During the preparation of the project (2003), we asked researchers in Grid computing which experiments they were willing to conduct on a large scale real-life experimental platform. The members of 10 teams in France, involved in different aspects of Grid computing and well connected to the international Grid community, proposed a set of about 100 experiments. It was surprising to discover that almost all teams required different infrastructure settings for their experiments. The experiment diversity nearly covered all layers of the software stack used in Grid computing: networking protocols (improving point to point and multipoint protocols in the Grid context, etc.); operating system mechanisms (virtual machines, single system image, etc.); Grid middleware; application runtimes (object oriented, desktop oriented, etc.); applications (life science, physics, engineering, etc.); problem solving environments. Research in these layers concerns scalability (up to thousands of CPUs), performance, fault tolerance, QoS, and security.

3.2 Deep Reconfiguration
For researchers involved in network protocols, OS and Grid middleware research, the software setting for their experiments often requires a specific OS. Some researchers need Linux, while others are interested in Solaris 10 or Windows. For networking research, FreeBSD is preferred because network emulators such as Dummynet and Modelnet run only on this operating system. Some researchers also need to test and improve protocol performance (for example changing the size of the TCP window or testing alternative protocols). Some research on virtual machines, process checkpointing and migration needs the installation of specific OS versions or OS patches that may not be compatible with each other. Even for experiments over the OS layers, researchers have some preferences: for example some prefer Linux kernel 2.4 or 2.6 because their schedulers differ. Researchers' needs are quite different in Grid middleware: some require Globus (in different versions: 3.2, 4, DataGrid version) while others need Unicore, Desktop Grid or P2P middleware. Some other researchers need to make experiments without any Grid middleware and test applications and mechanisms in a multi-site, multi-cluster environment before evaluating the middleware overhead. According to this inquiry on researchers' needs, Grid'5000 should provide a deep reconfiguration mechanism allowing researchers to deploy, install, boot and run their specific software images, possibly including all the layers of the software stack. In a typical experiment sequence, a researcher reserves a partition of Grid'5000, deploys their software image, reboots all the machines of the partition, runs the experiment, collects results and releases the machines. This reconfiguration capability allows all researchers to run their experiments in the software environment exactly corresponding to their needs.

3.3 A Two-Level Security Approach
Because researchers must be able to boot and run their specific software stack on Grid'5000 sites and machines, we cannot make any assumption on the correct configuration of the security mechanisms. As a consequence, we should consider that Grid'5000 machines are not protected.
Two other constraints increase the security issue complexity: 1) all the sites hosting the machines are connected through the Internet and 2) basically inter-site communication should not suffer any platform security restriction and overhead during experiments. From this set of constraints, we decided to use a two-level security design with the following rules: a) Grid'5000 sites are not directly connected to the Internet and b) all communication packets fly without limitation between Grid'5000 sites. The first rule ensures that Grid'5000 will resist hacker attacks and will not be used as a basis for attacks (i.e. massive DoS or other more restricted attacks).

These design rules led to building a large scale confined cluster of clusters. Users connect to Grid'5000 from the lab where the machines are hosted. A rigorous authentication and authorization check is done first to enter the lab and then to log in to Grid'5000 nodes from the lab. In order to participate in multiplatform experiments, it is possible for Grid'5000 sites to open restricted routes through the Internet to external clusters (called satellite sites).

3.4 Two Thirds as Homogeneous Nodes
Performance evaluation in Grids is a complex issue. Speedup is hard to evaluate with heterogeneous hardware. In addition, the hardware diversity increases the complexity of the deployment, reboot and control subsystem. Moreover, multiplying the hardware configurations directly leads to an increase in the everyday management and maintenance cost. Considering these three parameters, we decided that 2/3 of the total machines should be homogeneous. However, Grids are heterogeneous by nature and this is an important dimension in the experiment diversity. This is the reason why we chose to keep 1/3 as heterogeneous machines.

3.5 Precise Control and Measurement
Grid'5000 is used for Grid software evaluation and making fair comparisons of alternative algorithms, software, protocols, etc. This implies two elements: first, users should be able to steer their experiments in a reproducible way and second, they should be able to access probes providing precise measurements during the experiments. The reproducibility of experiment steering includes the capability to 1) reserve the same set of nodes, 2) deploy and run the same piece of software on the same nodes, 3) synchronize the experiment execution on all the involved machines, 4) if needed, repeat sequences of operations in a timely and synchronous way, and 5) inject the same experimental conditions (synthetic or trace based: fault injection, packet loss, latency increase, bandwidth reduction). As described in the next section, the Grid'5000 software set provides a reservation tool (OAR, see Georgiou et al. 2005), a deployment tool (Kadeploy2, see Georgiou et al. 2006) and several experimental condition injectors. Precise and extensive measurement is a fundamental aspect of experimental evaluation on real-life platforms.

Global observation of the network (from its edges) and local observation of processor, memory or disk is difficult at the hardware level and, since users may use their own software configuration, there is no way to provide a built-in and trustworthy monitoring system for CPU, memory and disk. Hence, it is the responsibility of the users to properly install, configure and manage the software observation tools they need for their experiments.

4 Grid'5000 Architecture
The Grid'5000 architecture implements the principles described in the previous section.
Based on the researchers' requirements, the scalability needs and the number of researchers, we decided to build a platform of 5000 CPUs distributed over 9 sites in France. Figure 2 presents an overview of Grid'5000. Every site hosts a cluster and all sites are connected by a high speed network (a novel network architecture is being deployed, connecting the sites with 10 Gbps links). Numbers in Figure 2 give the target number of CPUs for every cluster. Two-thirds of the nodes are dual-CPU 1U racks equipped with 2 AMD Opteron processors running at 2 GHz, 2 GB of memory and two 1 Gbps Ethernet adapters. Clusters are also equipped with high speed networks (Myrinet, Infiniband, etc.). In the rest of this section we present the key architectural elements of Grid'5000.

Fig. 2 Overview of Grid'5000.

4.1 A Confined System

As discussed earlier, the Grid'5000 architecture should provide an isolated domain where communication is allowed without restriction between sites and is not possible directly with the outside world. Mechanisms based on state-of-the-art technology such as public key infrastructures and X509 certificates, produced by the Grid community to secure all accessed resources, are not suitable for Grid'5000. The GSI high level security approach imposes a heavy overhead and impacts performance, biasing the results of studies not directly related to security. A private dedicated network (PN) or a virtual private network (VPN) is, then, the only solution to compose a secure grid backbone and to build such a confined infrastructure. In Grid'5000, we chose to interconnect the sites with a combination of DiffServ and MPLS (Multiprotocol Label Switching) technology provided by RENATER (our service provider). MPLS is an efficient way to build secure virtual private networks. As the packet encapsulation is done at a very low level by very high performance routers, the overhead is negligible and has no impact on the end-to-end performance. One difference between the classical Internet and MPLS is that MPLS fixes the routing of Grid'5000 datagrams and flows. We consider that this static routing constraint is reasonable for such a testbed. Concerning the background traffic, and the meaningfulness of our end-to-end measures, we chose to let researchers load the network with artificially generated traffic that they can monitor, rather than letting them deal with unknown Internet traffic. This allows the calibration of tools, the debugging of protocols and the investigation of alternative traffic control strategies and different types of traffic models. In this respect Grid'5000 is complementary to PlanetLab, as our instrument enables fine tuning and a good understanding of basic phenomena in the absence of extra noise. PlanetLab offers a more realistic view of present Internet behavior, but cannot capture behaviors at the limit of the resource capacities and potentially the future Internet behavior. Many VPN implementation solutions are available, but they do not provide security and QoS guarantees simultaneously. For security, network layer VPNs may use tunneling or network layer encryption (layer 3 VPN). Link layer VPNs such as MPLS are directly provided by network service providers (layer 2-3 VPN). The advantage of the MPLS VPN over an IP VPN (IPsec) is performance. As Grid'5000 sites are connected to the same NREN (National Research and Education Network), the multi-domain issue of the MPLS technology is avoided here.
For performance guarantees, a combination of DiffServ and MPLS will be configured for the Grid'5000 links. The Premium service will be used for the delay and bandwidth guarantees required for reproducible experimental conditions and performance measurements. This MPLS-based Grid architecture allows the creation of a trust context that even makes it possible to experiment with new security solutions for IP VPN-based Grids. Figure 3 presents the resulting communication architecture. Using MPLS in a Grid architecture is not an isolated choice. Recently, a Grid VPN research group was created within the GGF, attesting to a real interest in developing and using MPLS, G-MPLS or lower level optical switching technologies for the Grid.

Fig. 3 Communication architecture.

4.2 User View and Data Management

As previously mentioned, communications are done with minimal authentication between Grid'5000 machines. The logical consequence is that a user has a single account across the whole platform. However, each Grid'5000 site manages its own user accounts. Reliability of the authentication system is also critical: a local network outage should not break the authentication process on other sites. These two requirements have been fulfilled by the installation of an LDAP directory. Every site runs an LDAP server containing the same tree: under a common root, a branch is defined for each site. On a given site, the local administrator has read-write access to the branch and can manage its user accounts. The other branches are periodically synchronized from the remote servers and are read-only. From the user's point of view, this design is transparent. Once the account is created, the user can access any of the Grid'5000 sites or services (monitoring tools, Wiki, deployment, etc.). The user's data, however, are local to every site. They are shared on any given cluster through NFS, but distribution to another remote site is done by the user through classical file transfer tools (rsync, scp, sftp, etc.). Data transfers with the outside of Grid'5000 are restricted to secure tools to prevent identity spoofing, and public key authentication is used to prevent brute-force attacks.

4.3 Experiment Scheduling

Experiment scheduling and resource allocation are managed by a resource management system called OAR (Georgiou et al. 2005) at the cluster level and by a simple broker at the grid level. The OAR architecture is built around a relational database engine (MySQL). All large-scale operations such as parallel task launching, node probing or monitoring are performed using a specialized parallel launching tool named Taktuk (Augerat, Martin, and Stein 2002). OAR provides most of the important features implemented by other batch schedulers, such as priority scheduling by queues, advance reservations, backfilling and resource matchmaking. At the grid level, a simple broker allows co-allocating sets of nodes on every selected cluster. The co-allocation process works as follows: 1) the user submits an experiment which needs several sets of nodes on different clusters; 2) in a round-robin sequence, the broker submits a reservation to each local batch scheduler. If one reservation is refused, all previously accepted reservations are canceled. When all local reservations are accepted, the user receives an identifier from the broker, allowing the user to retrieve information about the allocated sets of nodes. In Grid'5000, the resource management system is coupled with the node reconfiguration operation at different points.
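The grid-level co-allocation procedure described above (one local reservation per selected cluster, with rollback on the first refusal) can be summarized by the following sketch. It assumes hypothetical reserve() and cancel() calls standing in for submissions to the local batch schedulers; it illustrates the all-or-nothing logic, not the broker's actual implementation.

from typing import Dict, Optional
import uuid

def reserve(cluster: str, nb_nodes: int) -> Optional[str]:
    """Hypothetical submission to a local batch scheduler; returns a reservation id, or None if refused."""
    return f"{cluster}-{uuid.uuid4().hex[:8]}"       # simulated acceptance

def cancel(cluster: str, reservation_id: str) -> None:
    """Hypothetical cancellation of a previously accepted local reservation."""
    print(f"cancel {reservation_id} on {cluster}")

def co_allocate(request: Dict[str, int]) -> Optional[Dict[str, str]]:
    """All-or-nothing grid-level co-allocation: one local reservation per cluster, rolled back on first refusal."""
    accepted: Dict[str, str] = {}
    for cluster, nb_nodes in request.items():        # the broker walks over the selected clusters in turn
        rid = reserve(cluster, nb_nodes)
        if rid is None:                              # a single refusal cancels everything accepted so far
            for c, r in accepted.items():
                cancel(c, r)
            return None
        accepted[cluster] = rid
    return accepted                                  # acts as the grid-level identifier returned to the user

# Example: ask for 100 nodes in Lyon and 50 in Orsay in one co-allocation.
print(co_allocate({"lyon": 100, "orsay": 50}))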
First, a specific queue is defined where users can submit experiments requesting node reconfiguration. Second, there is a dynamic control of deployment rights in the prologue script that is executed before starting the experiment. This gives the user the capability of deploying system images on the allocated node partition. Rights are revoked in the epilogue script after the experiment. Third, after the completion of experiments involving node reconfiguration, all nodes are rebooted in a default environment. This default environment provides libraries and middleware for experiments without reconfiguration.

4.4 Node Reconfiguration

The node reconfiguration operation is based on a deployment tool called Kadeploy2 (Georgiou et al. 2006). This tool allows users to deploy their own software environment on a disk partition of selected nodes. As previously mentioned, the software environment contains all software layers, from the OS to the application level, needed by users for their experiments. The architecture of Kadeploy2 is also designed around a database and a set of specialized operating components. The database is used to manage different aspects of the node configuration (disk partition schemes, environment deployed on every partition), user rights to deploy on nodes, environment descriptions (kernel, initrd, custom kernel parameters, desired filesystem for the environment, associated postinstallation) and the logging of deployment operations. Several deployment procedures are available, depending mainly on the OS type and filesystem specificity. We only sketch the usual deployment procedure. First, when a user initiates a deploy operation, he provides an environment name allowing the retrieval of the associated information from the database. The user provides this information at environment registration. Deployment begins by rebooting all nodes on a minimal system through a network booting sequence. This system prepares the target disk for deployment (disk partitioning, partition formatting and mounting). The next step in the deployment is the environment broadcast, which uses a pipelined transfer between nodes with on-the-fly image decompression. At this point, some adjustments must be made on the broadcast environment in order to be compliant with node and site policies (mounting tables, keys for authentication, information for specific services that cannot support auto-configuration). The last deployment step consists in rebooting the nodes on the deployed system from a network-loaded bootloader.

5 Deployment System Evaluation

In this section, we present the evaluation of the deployment and reboot system of Grid'5000. Evaluation of the other parts of Grid'5000 will be presented in future papers. The deployment and reboot system is certainly the most important mechanism of Grid'5000, enabling a rapid turnaround of experiments on the platform. Typical deployment and reboot mechanisms for clusters cannot be coupled to a batch scheduler. Moreover, they are not designed to concurrently install different systems on separate cluster partitions. Our objective is to provide a reconfiguration time (boot-to-boot: B2B) lower than 10 minutes for the 5000 CPUs of the platform. This means: 1) deploying the software image on all the nodes of every site (a site may contain up to 500 nodes); 2) issuing the reboot order on all Grid'5000 nodes; and 3) rebooting all nodes from the deployed software image.
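The usual deployment procedure sketched in Section 4.4 can be summarized as a short pipeline. The following outline uses hypothetical helpers mirroring the steps named above; it is only a sketch, not Kadeploy2's real API.

from dataclasses import dataclass
from typing import List

@dataclass
class Environment:
    """Hypothetical environment description, mirroring the database fields listed above."""
    name: str
    image: str                 # path of the compressed root filesystem archive
    partition_scheme: str
    postinstall: str

def step(message: str, nodes: List[str]) -> None:
    """Stub standing in for an actual deployment operation on a set of nodes."""
    print(f"{message} on {len(nodes)} node(s)")

def deploy_environment(nodes: List[str], env: Environment) -> None:
    """Outline of the usual deployment sequence; every step is a placeholder."""
    step("reboot minimal preparation system (network boot)", nodes)
    step(f"prepare disks ({env.partition_scheme}): partition, format, mount", nodes)
    step(f"broadcast {env.image} (pipelined transfer, on-the-fly decompression)", nodes)
    step(f"apply site policies and postinstall {env.postinstall}", nodes)
    step("reboot on deployed system via network-loaded bootloader", nodes)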
As previously mentioned, Kadeploy2 uses more steps, booting a light kernel to prepare the user partition that will be booted for the experiment. The B2B time depends not only on the performance of Kadeploy2 but also on the OS to be booted (operating systems have different configurations and run different sets of services). Figure 4 presents the B2B time according to the number of nodes, in a single site, for a simple kernel without services, on a cluster of 200 nodes. The figure presents the completion time of every step included in the B2B time (as a cumulative graph): 1) the time to boot the preparation OS launching a light kernel (first check); 2) the time to prepare the disk partitions before the installation of the user environment (preinstall); 3) the time to transfer the user environment archive (transfer); and 4) the time to boot the user OS (lastcheck).

Fig. 4 Time (in seconds) to deploy and boot a new OS on a cluster with Kadeploy2.

First, the figure shows that the boot time depends on the number of nodes. This is because the boot time differs across machines and we consider only the slowest one. In contrast, the disk preparation and environment transfer times increase negligibly with the number of nodes. The time to reboot the two OSes largely dominates the environment transfer time. Altogether, the figure clearly shows a B2B time evolving linearly with the number of nodes, following an affine function that can be evaluated as B2Btime = 200 s + (0.33 s × X), X being the number of nodes.

Figure 5 presents the time diagram of a deployment and reboot phase involving 2 Grid'5000 sites for a total of 260 nodes (180 nodes in site 1 and 80 nodes in site 2). The vertical axis corresponds to the number of nodes in the deployment and reboot states. At t = 0 s, all the nodes are running an OS. At t = 30 s, a deployment sequence is issued. At t = 50 s, all nodes are rebooting the deployment kernel. At t = 160 s, all nodes have rebooted and are preparing the user partition. The clusters start the second reboot at t = 200 s for site 2 and t = 340 s for site 1. Site 2 nodes are rebooted with the user OS at t = 320 s. All nodes are rebooted with the user OS (including site 1) at t = 450 s. At t = 800 s, the user experiment is completed and a reboot order is issued, making all nodes reboot to the default environment. This figure demonstrates that the current B2B time at the Grid level (450 seconds) is well below the 10 minute mark. The deployment and reboot system is still in an alpha version. It is not tuned and there are many optimization opportunities (Georgiou et al. 2006).

Fig. 5 Time diagram for the deployment and reboot of a user environment on 2 sites.
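As a small worked example of the affine behavior reported above, the sketch below evaluates the single-site model B2B(X) ≈ 200 s + 0.33 s × X. The coefficients are the fit to Figure 4, and the extrapolation to a full 500-node site is only indicative.

def b2b_time(nb_nodes: int, constant_s: float = 200.0, per_node_s: float = 0.33) -> float:
    """Affine boot-to-boot model fitted to the single-site measurements of Figure 4."""
    return constant_s + per_node_s * nb_nodes

# For a 500-node site the model predicts about 200 + 0.33 * 500 = 365 s,
# comfortably below the 10 minute (600 s) objective.
print(b2b_time(500))  # 365.0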
6 Grid'5000 Configuration Examples

The main objective of the Grid'5000 software set is to ease the deployment, execution and result collection of large scale Grid experiments. In this section, we present 5 examples of Grid'5000 reconfiguration for experiments in networking protocols, Grid middleware infrastructures, and a GridRPC environment.

6.1 Testing Recent P2P Protocols in a Grid Context

BitTorrent is a popular file distribution system outperforming FTP when delivering large and highly demanded files. The key idea of BitTorrent is the cooperation of the downloaders of the same file, which upload chunks of the file to each other. As such, BitTorrent is a nice broadcast protocol for large files in data and computational Grids. BitTorrent uses TCP as the transport protocol. In this section, we describe how we can deploy, run and collect experiment results when performing a simple BitTorrent performance evaluation for a variation of the TCP protocol, on homogeneous nodes of Grid'5000. The modification of the TCP stack involves the compilation and deployment of a specific OS kernel. The experiment requires 9 steps. Step 1) The BitTorrent code is instrumented to log reception and emission events (type of communication, sender identifier, receiver identifier, time and chunk identifier). BitTorrent has also been instrumented to replay the logged sequence of events. Step 2) The software image is prepared (installing specific libraries and software, such as Python for BitTorrent), based on a minimal image certified to work on the experimental nodes. The kernel is patched and compiled with the alternative TCP versions. The local root file system is then archived and registered in the deployment software database on all sites. Step 3) Nodes are reserved, possibly from the same selection file, using OAR. Step 4) The archived file system image is deployed on a user-specified partition of all nodes, using Kadeploy2. Step 5) Kadeploy2 reboots all the reserved nodes and checks that each machine responds to ping and ssh. Step 6) The BitTorrent file to be broadcast is stored in the user home directory where the BitTorrent master node (the seeder) will run. The list of nodes provided by OAR is stored on the BitTorrent master node. Step 7) Node clocks are synchronized using ntpdate. Step 8) A distributed launcher program controls the start of the experiment script on all the nodes. The BitTorrent tracker is started first, then the Torrent file created is registered in the tracker, then the seeder is started on the master and finally, the clients (leechers) are started on all the other nodes. The BitTorrent events are recorded locally on all the nodes. Step 9) All log files are collected and stored in the user home directory on the user's site gateway. Reserved nodes are released.
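Steps 6 to 9 above can be driven from the master node by a small script. The sketch below is only illustrative: run_on() and copy_from() are assumed helpers wrapping ssh and scp, and the start_tracker/start_seeder/start_leecher commands are placeholders, not the real BitTorrent command lines.

from typing import List

def run_on(node: str, command: str) -> None:
    """Hypothetical helper: run a command on a node over ssh (stubbed here)."""
    print(f"[{node}] {command}")

def copy_from(node: str, remote_path: str, local_path: str) -> None:
    """Hypothetical helper: fetch a file from a node (stubbed here)."""
    print(f"scp {node}:{remote_path} {local_path}")

def run_bittorrent_experiment(nodes: List[str], torrent_file: str, payload: str) -> None:
    """Steps 6-9 of the experiment above; every command string is a placeholder."""
    master, leechers = nodes[0], nodes[1:]
    run_on(master, "start_tracker")                        # tracker first
    run_on(master, f"register_torrent {torrent_file}")     # register the torrent in the tracker
    run_on(master, f"start_seeder {payload}")              # seeder on the master node
    for node in leechers:
        run_on(node, f"start_leecher {torrent_file}")      # leechers on all other nodes
    for node in nodes:                                     # collect the locally recorded event logs
        copy_from(node, "bittorrent_events.log", f"logs/{node}.log")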
6.2 Deploying a Globus Toolkit

Globus is an open source grid middleware toolkit used for building grid systems and applications. This part describes how we can map a Globus (Toolkit 2) virtual grid onto Grid'5000, deploy Globus, and run experiments. The topology we chose for our virtual Globus grid was to have one Globus installation on each Grid'5000 site. We consider each site to be a separate cluster that provides services through the Globus Toolkit. Since we are emulating a grid, each cluster manages its own user accounts (i.e. there is no grid-wide user directory). Job execution on clusters is managed by a batch job scheduler (e.g. OAR, PBS). Each cluster manages user accounts and job scheduling with its software of choice, as we only need homogeneity inside clusters. Each site runs a certification authority (CA) that delivers user certificates for its users, as well as host certificates. We pick a front node on each site and install the Globus services on this front node. These services accept requests from other sites, authenticate and authorize them, then perform an action (e.g. submit a job) on behalf of the client. Clients authenticate services with a host certificate delivered by the site the services run on. The Gatekeeper maps user certificates to the user accounts of each cluster, and executes jobs with the local job scheduler. Front nodes also run the MDS (Monitoring and Discovery System) service, and GSIFTP (data transfer). The Globus toolkit is deployed by creating a system image that contains a Globus installation tailored for the experiment (since we deploy the whole system image, everything can be customized, up to the operating system kernel). We create for each site an image for the cluster compute nodes with a batch scheduler, and an image for the front node with the Globus Toolkit services (Gatekeeper, MDS, GSIFTP, and certificates). The virtual Globus grid is deployed on Grid'5000 machines using the Kadeploy tools, thereby turning Grid'5000 into a virtual Globus grid for as long as the Kadeploy reservation lasts. While Globus users are running their experiments, log files are saved to the local drives of each node. As soon as the experiment is done, Kadeploy reboots the nodes with their default system image, and users can retrieve their log files and process them.

6.3 A Corba Based Grid Running DIET and TLSE

The DIET (Caron and Desprez 2006) middleware infrastructure follows the GridRPC paradigm (Seymour et al. 2004) for client-server computing over the Grid. It is designed as a set of hierarchical components (client, master and local agents, and server daemons). It finds an appropriate server according to the information provided in the client request (problem to be solved, size of the data involved), the performance of the target platform (server load, available memory, communication performance), and the availability of data stored during previous computations. The scheduler is distributed using several hierarchies connected either statically (in a Corba fashion) or dynamically (in a peer-to-peer fashion). The main goal of the Grid-TLSE project (Daydé et al. 2004) is to design an expert site that provides easy access to a number of sparse matrix solver packages, allowing their comparative analysis on user-submitted problems as well as on matrices from collections also available on the site. The site provides user assistance in choosing the right solver for a given problem and appropriate values for the solver parameters. A computational Grid managed by DIET is used to deal with all the runs related to user requests. Our goal in the Grid'5000 project is twofold. First we want to validate the scalability of our distributed scheduling architecture at a large scale (using thousands of servers and clients) and then to test some deployments of the TLSE architecture for future production use. With the current availability of the Grid'5000 platform, the deployment of DIET with the TLSE server works in three phases. The first phase consists in sending one OAR request to each site, to reserve a maximum of available nodes. The second phase consists in receiving the OAR information to know which nodes are given by the reservation. The third phase generates an XML file with the dynamic information as well as the names of the nodes at each site. These files are then used by GoDIET to deploy DIET. Our main goal during this first experiment is to corroborate a theoretical study of the deployment with the hardware capability of the Grid'5000 platform (CPU performance, bandwidth, etc.), in order to design a hierarchy that achieves good scalability and good efficiency for DIET. From this XML file, GoDIET deploys the agents (or schedulers), servers and services bound to DIET as Corba services (i.e. the naming service), along with a distributed log tool designed for the visualization tools (VizDIET, see Bolze, Caron, and Desprez 2006).
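The third phase of the DIET/TLSE deployment (turning the per-site node lists returned by OAR into an XML description consumed by GoDIET) could look like the sketch below. The element names are hypothetical and do not reproduce GoDIET's actual schema; the sketch only illustrates a mapping of one agent and several servers per site.

import xml.etree.ElementTree as ET
from typing import Dict, List

def build_deployment_xml(site_nodes: Dict[str, List[str]], out_path: str) -> None:
    """Write a deployment description: one agent per site, server daemons on the remaining nodes."""
    root = ET.Element("diet_deployment")                    # hypothetical root element
    for site, nodes in site_nodes.items():
        site_el = ET.SubElement(root, "site", name=site)
        ET.SubElement(site_el, "agent", host=nodes[0])      # one agent (scheduler) per site
        for host in nodes[1:]:
            ET.SubElement(site_el, "server", host=host)     # remaining nodes run server daemons
    ET.ElementTree(root).write(out_path, encoding="utf-8", xml_declaration=True)

# Example with two sites and the node names returned by the OAR reservations.
build_deployment_xml({"lyon": ["node-1", "node-2", "node-3"], "orsay": ["node-4", "node-5"]},
                     "deployment.xml")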
Figure 6 shows a large deployment of DIET using 574 computing nodes and 9 agents for the scheduling of 45 000 requests. The 574 servers are deployed on 8 clusters and 7 sites.

Fig. 6 Large DIET deployment on Grid'5000.

6.4 Process Design, Optimization, Planning and Scheduling

Process systems engineering is concerned with the understanding and development of systematic procedures for the design and operation of chemical process systems, ranging from continuous to batch processes at industrial scale (Ponish et al. 2005). More precisely, the optimal design of a continuous process consists in selecting simultaneously the unit operations, the topology and the best operating conditions. Several software tools have been developed for solving this type of problem. One of them, AG (1,2), used at the LGC (Laboratoire de Genie Chimique), is a serial Fortran code. To give an illustration, the treated problem instances involve problem sizes of between 170 and 210 variables. Half of them are integers, which corresponds to a combinatorial aspect of about 1e40 to 1e50 (since 3 values are possible for each integer variable). The problem is identified as NP-hard. This application is multi-parametric by nature since it uses a stochastic algorithm whose execution is repeated 100 times with different parameters. Each execution requires 4 hours on the previous example, giving a total execution time of 400 hours on a single PC. Gridification of such applications is straightforward. The first step consists in modifying the code in order to be able to schedule the 100 executions over 100 different nodes. The code has then been deployed over 5 clusters of Grid'5000: Lyon, Orsay, Sophia, Bordeaux, and Toulouse. The code does not reference any external library, which greatly simplifies the installation process at every site. Getting 100 nodes over 5 clusters of Grid'5000 is usually not a problem, and the elapsed time for a simulation is reduced from 400 hours down to 4 hours. There is an obvious benefit: the possibility of solving larger problems. But a major benefit compared with traditional large computer infrastructures is that the time between launching an execution (usually through a batch system that may limit the maximal number of processors to a lower number and impose a long wait in a specific queue) and recovering the results is drastically reduced, allowing researchers to carry out more simulations.

6.5 The Flow-Shop Challenge on Grid'5000

The Flow-Shop problem consists roughly in finding a schedule of a set of jobs on a set of machines that minimizes the total execution time, called the makespan. The jobs must be scheduled in the same order on all machines, and each machine cannot be simultaneously assigned to two jobs. The complexity of the problem is very high for large size instances in terms of the number of potential solutions (i.e. schedules). Even with a modern workstation, a resolution based on an exhaustive enumeration of all possible combinations would take several years. Therefore, the challenge is to reduce the number of explored solutions using efficient algorithms in order to solve the problem in a reasonable time. Nevertheless, even if these algorithms allow a significant reduction in the size of the search space, the complexity remains high and the problem could not be efficiently solved without computational grids. To solve the problem, a new grid exact method based on the Branch-and-Bound (B&B) algorithm has been proposed by Melab (2005).
The method is based on a large scale dispatcher-worker cycle stealing approach. The dispatcher controls the exploration of the search tree generated by the distributed B&B algorithm. It maintains the best solution found so far and a pool of work units, and ensures their dynamic allocation to the different workers joining the computational grid. Each work unit represents a set of nodes (or solutions) to be explored or being explored, and is designated by a small descriptor. Each worker explores its assigned tree nodes using the B&B algorithm and sends a solution back to the dispatcher if it is better than the best solution known so far. To deal with the load balancing issue, as soon as its local pool of nodes to be explored is empty, each worker requests a work unit from the dispatcher. The dispatcher selects a work unit being executed and splits it in two parts. The second part, which is probably not yet explored, is sent to the worker asking for work. Another issue dealt with in the proposed grid exact method is fault tolerance. This issue arises on a computational grid because of failures of resources (processors, networks, etc.) or their dynamic availability. Within the Grid'5000 context, the dynamic availability is a result of the reservation policy of the grid. Indeed, for long-running applications, a series of reservations is required to resume their execution. The end of each reservation is put in the same category as a failure of the corresponding resources. In the proposed method, a checkpointing-based approach is used to deal with the fault-tolerance issue. Each worker periodically requests the dispatcher to update the descriptor of its associated work unit with its current state. The descriptor is based on a special coding of the search tree which minimizes the memory space required by the checkpointing mechanism and the communications involved in the dynamic work distribution. The proposed method has been implemented using the XtremWeb middleware (Mezmaz, Melab, and Talbi 2006) and RPCs.
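A minimal sketch of the dispatcher-side logic described above, covering solution reporting, periodic descriptor checkpointing and work-unit splitting on a work request. It illustrates the general cycle-stealing scheme under a simplified work-unit coding; it is not the implementation of Melab (2005).

from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class WorkUnit:
    """Hypothetical compact descriptor: a list of unexplored subtree intervals."""
    intervals: List[Tuple[int, int]]

class Dispatcher:
    def __init__(self, root_unit: WorkUnit):
        self.best_solution: Optional[int] = None          # best makespan reported so far
        self.assigned: Dict[str, WorkUnit] = {"seed": root_unit}

    def report_solution(self, makespan: int) -> None:
        """Keep the best (smallest) makespan reported by any worker."""
        if self.best_solution is None or makespan < self.best_solution:
            self.best_solution = makespan

    def checkpoint(self, worker: str, descriptor: WorkUnit) -> None:
        """Periodic descriptor update; used to restart after a failure or the end of a reservation."""
        self.assigned[worker] = descriptor

    def request_work(self, worker: str) -> Optional[WorkUnit]:
        """Split a unit that still has work and hand the (probably unexplored) second half over."""
        for unit in list(self.assigned.values()):
            if len(unit.intervals) >= 2:
                half = len(unit.intervals) // 2
                stolen = WorkUnit(unit.intervals[half:])      # second part, likely untouched
                unit.intervals = unit.intervals[:half]        # first part stays with its current owner
                self.assigned[worker] = stolen
                return stolen
        return None                                           # nothing left to split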
The second implementation was used to solve Taillard's Flow-Shop problem instance Ta056 (see Note 1) of scheduling 50 jobs on 20 machines. A near-optimal solution was found in (Ruiz and Stutzle 2004). However, as it is CPU time consuming, this instance had never been optimally solved. The method proposed in Melab (2005) allowed not only an improvement on the best known solution (Ruiz and Stutzle 2004) for this problem instance, but also a proof of the optimality of the provided solution. Indeed, once the supposed optimal solution is found, it has to be compared with the remaining solutions still to be visited to prove that it is really the best. The experiments were performed on a computational grid including, simultaneously, processors from Grid'5000 and different educational networks of Université de Lille 1 (Polytech'Lille, IEEA, IUT "A"). The number of processors averaged approximately 500, and peaked at 1245 machines during one night. The Grid'5000 sites involved in the computation were Bordeaux, Lille, Orsay, Rennes, Sophia-Antipolis and Toulouse. The optimal solution was found with a total wall-clock time of 7 weeks. The experiment was performed a second time starting from the best known near-optimal solution minus 1. The optimal solution was then found within 25 days and 46 minutes. The resolution would take 22 years, 185 days and 16 hours on a single machine. During the resolution, an average of 328 processors were used, with a peak of 1195 processors. The total number of explored nodes was 6.5874e+12, and a small fraction of them (0.39%) were explored twice (redundant work). Moreover, 129 958 work allocation operations and 4 094 176 checkpointing operations were performed. Such statistics show that the dynamic load-balancing and checkpointing mechanisms have been heavily solicited and performed well. Furthermore, the parallel efficiency is measured as the ratio between the aggregated execution time of the workers and the total time of their availability. The parallel efficiency observed in this experiment is 97%, so the load balancing approach is very efficient. Finally, the average CPU time consumed by the dispatcher is 1.7%. The approach scales up to 1195 processors without any problem.

7 Conclusion

Grid'5000 belongs to a novel category of research tools for Grid research: a large scale distributed platform that can be easily controlled, reconfigured, and monitored. We have presented the motivations behind the design and architecture of this platform. The main difference between Grid'5000 and previous real-life experimental platforms is its degree of reconfigurability, allowing researchers to deploy and install the exact software environment they need for each experiment. This capability raises a security difficulty, solved in Grid'5000 by establishing a virtual domain spanning several sites, rigorously controlling the communications at the domain boundaries and relaxing restrictions for intra-domain communications. We have described some configuration examples, illustrating the variety of experiments that can benefit from Grid'5000. We also presented the performance of the reconfiguration system, which provides a "boot-to-boot" time of less than 10 minutes on the full platform. Ongoing work focuses on several areas: 1) easing software image construction for the users; 2) providing automatic validation of software images; 3) supporting and coordinating experiments; and 4) tuning and validating network performance. For more information about the Grid'5000 project, please contact the corresponding author, Franck Cappello (fci@lri.fr), who is responsible for this project.

Acknowledgments

We would like to thank the French Ministry of Research and the ACI Grid and ACI Data Mass incentives, especially Thierry Priol (Director of the ACI GRID), Brigitte Plateau (Head of the Scientific Committee of the ACI Grid), and Dany Vandromme (Director of RENATER) for their support. We also thank INRIA, CNRS, the regional councils of Aquitaine, Bretagne, Ile de France and Provence-Alpes-Côte d'Azur, the Alpes Maritimes General Council and the following universities: University of Paris Sud, Orsay, University Joseph Fourier, Grenoble, University of Nice-Sophia Antipolis, University of Rennes 1, Institut National Polytechnique de Toulouse/INSA/FERIA/Universite Paul Sabatier, Toulouse, University Bordeaux 1, University Lille 1/GENOPOLE, and Ecole Normale Superieure de Lyon. We also thank MYRICOM.

Author Biographies

Raphaël Bolze is a Ph.D. student at Ecole Normale Superieure de Lyon. He received his Masters in computer science in 2003 from the Institut de Recherche en Informatique de Nantes and also a Masters degree in engineering from l'Ecole Polytechnique de l'Universite de Nantes. His research interests focus on workflow scheduling over grid environments.
Franck Cappello holds a Research Director position at INRIA and leads the Grand-Large project at INRIA. He has initiated the XtremWeb (Desktop Grid) and MPICH-V (fault tolerant MPI) projects. He is currently the director of the Grid'5000 project, designing, building and running a large scale Grid experimental platform. He has authored more than 60 papers in the domains of high performance programming, desktop grids, grids and fault tolerant MPI. He has contributed to more than 30 program committees. He is an editorial board member of the International Journal on GRID Computing and a steering committee member of IEEE HPDC and IEEE/ACM CCGRID. He is the general chair of IEEE HPDC'2006.

Eddy Caron is an assistant professor at Ecole Normale Superieure de Lyon and holds a position with the LIP laboratory (ENS Lyon, France). He is a member of the GRAAL project and technical manager for the DIET software package. He received his Ph.D. in computer science from Universite de Picardie Jules Verne in 2000. His research interests include parallel libraries for scientific computing on parallel distributed memory machines, problem solving environments, and grid computing.

Michel J. Daydé received his Ph.D. in computer science from Institut National Polytechnique de Toulouse (France) in 1986. From 1987 to 1995, he was a postdoctoral fellow and then a visiting Senior Scientist in the Parallel Algorithms Group at CERFACS. From 1988, he has been Professor at Ecole Nationale Superieure d'Electrotechnique, d'Electronique, d'Informatique, d'Hydraulique et des Télécommunications (ENSEEIHT) at Institut National Polytechnique de Toulouse. Since 1996, he has been Research Director in the Groupe Algorithmes Paralleles et Optimisation at Institut de Recherche en Informatique de Toulouse (IRIT). He is Head of the ENSEEIHT site of IRIT and Vice-Head of IRIT. His current research interests are in grid computing, parallel computing and computational kernels in linear algebra and large scale nonlinear optimization. He is the coordinator of the GRID-TLSE project and scientific coordinator of the Toulouse/Midi-Pyrenees site of GRID'5000.

Frédéric Desprez is a director of research at INRIA and holds a position at the LIP laboratory (ENS Lyon, France). He received his Ph.D. in computer science from the Institut National Polytechnique de Grenoble in 1994 and his M.S. in computer science from ENS Lyon in 1990. His research interests include parallel libraries for scientific computing on parallel distributed memory machines, problem solving environments, and grid computing.

Emmanuel Jeannot is currently a full-time researcher at INRIA (Institut National de Recherche en Informatique et en Automatique) and carries out his research at the LORIA laboratory. From September 1999 to September 2005 he was an associate professor at the Université Henri Poincaré, Nancy 1. He received his Master's degree (1996) and his Ph.D. (1999) in computer science, both from Ecole Normale Superieure de Lyon. His main research interests are scheduling for heterogeneous environments and grids, data redistribution, grid computing software, adaptive online compression and programming models. He is currently visiting the ICL Laboratory of the University of Tennessee.

Yvon Jégou is a full time INRIA researcher in the PARIS project of INRIA-Rennes (IRISA). His research activities are centered on architecture, operating systems and compilation techniques for parallel and distributed computing.
His current work is focused on the development of a DSM for the implementation of runtime systems on large clusters and for the management of data repositories on the Grid. In the recent past, he participated in the IST POP European project on the implementation of an OpenMP system for clusters using distributed shared memories (DSM). He is currently involved in the XtreemOS European project. The objective of XtreemOS is the development of a Grid operating system with native support for virtual organizations. He is the leader of the Grid'5000 team at INRIA-Rennes.

Stephane Lanteri is a researcher at INRIA Sophia Antipolis in a scientific computing team. His current activities are concerned with the design of unstructured mesh based numerical methods for the discretization of PDE systems modeling wave propagation phenomena, domain decomposition and multilevel algorithms, and high performance parallel and distributed computing. He is the scientific coordinator of the Grid5000@Sophia project, which defines the contributions of INRIA Sophia Antipolis to the Grid'5000 project.

Julien Leduc is a contract CNRS Research Engineer, a member of the Grid'5000 technical committee, and participated in the design of the Grid'5000 grid services architecture. He is the technical manager of the reconfiguration feature of Grid'5000, and the designer and main developer of Kadeploy. Previously, he worked on the Clic clustering distribution and on the system administration of several clusters in Grenoble.

Nordine Melab received his Master's, Ph.D. and HDR degrees in computer science from the Laboratoire d'Informatique Fondamentale de Lille (LIFL, Universite de Lille 1). He is an Associate Professor at Polytech'Lille and a member of the OPAC team at LIFL. He is involved in the DOLPHIN project of INRIA Futurs. In particular, he is a member of the Steering Committee of the Grid'5000 French nation-wide project. His major research interests include parallel and grid computing, combinatorial optimization algorithms and applications, and software frameworks.

Dr. Pascale Vicat-Blanc Primet, graduated in Computer Science, is a senior researcher (Directrice de Recherche) at INRIA. Her research interests include distributed and real-time systems, high performance Grid and cluster networking, active networks, Internet protocols (TCP/IP), and network quality of service. She is leading the RESO team, labelled RESO project of the Institut National de Recherche en Informatique et en Automatique (INRIA), at the LIP laboratory in Lyon (France). Since 2000, she has been very active in the international and national Grid community. As co-chair of the Data Transport Research Group in the Global Grid Forum, she has co-edited several Grid Networking and Transport protocol GGF documents. She is general co-chair of the international GRIDNETS conference and the PFLDNET workshop, a member of international conference steering and program committees, and a reviewer for international conferences and journals in Grids and Networking. She has published her work in more than 60 papers in Grid and Networking journals and conferences. She is a member of the steering committees of the French ACI MD DataGRID Explorer (GdX) project, the ACI Grid Grid5000 project, ANR IGTMD and the EU Strep EC-GIN project.

Raymond Namyst received his Ph.D. in computer science from Lille in 1997. He held the position of assistant professor in the Computer Science Department of the Ecole Normale Superieure of Lyon (1997-2002).
In 2002, he joined the Computer Science Laboratory of Bordeaux (LaBRI), where he holds a Professor position. He is the head of the Runtime INRIA research project, devoted to the design of high performance runtime systems for parallel architectures. His main research interests are in parallel computing, thread scheduling on multiprocessor architectures, communications over high speed networks and communications within Grids. He has played a major role in the development of the PM2 software suite. He has written numerous papers about the design of efficient runtime systems. He also serves as the chair of the Computer Science Teaching Department of the University of Bordeaux.

Pascale Vicat-Blanc Primet received a Ph.D. degree in computer science from INSA Lyon, France in 1988. Based at École Normale Supérieure de Lyon (ENS-Lyon), she is currently "directrice de recherche" at INRIA and leads the RESO team-project. This team is specialized in communication protocols and software optimization for high-speed networks. Pascale's research interests include high-performance Grid and cluster networking, active networks, TCP, QoS, bandwidth sharing and security. She was a co-chair of the GGF Data Transport Research Group. She is a member of the Steering Committee of the Gridnets and Pfldnet conferences. A member of the GRID'5000 project steering committee, she is co-chairing its Lyon site. She has published about a hundred papers in distributed computing and networking journals and conferences.

Benjamin Quetier is a Ph.D. student of Franck Cappello (INRIA) and works in the Grand-Large project and in the European project CoreGrid. He is also involved in the Grid'5000 project, working on virtualization. The goal of his thesis is to build a large scale emulation platform (more than 100 K nodes) over Grid'5000. The first part of his thesis was a comparison of diverse virtualization tools such as Xen or VMware. He works on the comparison of applications on a real-life platform and on an emulated one.

Olivier Richard is an associate professor at the ID-IMAG laboratory. He graduated from Paris XI University with a Ph.D. in computer science in 1999. His research interests are focused on system architecture for high performance computing and large distributed systems (clusters, Grids and P2P). He is co-leader of the OAR and Kadeploy software projects.

El-Ghazali Talbi received his Master's and Ph.D. degrees in computer science, both from the Institut National Polytechnique de Grenoble. He is presently Professor in computer science at Polytech'Lille (Universite de Lille 1), and a researcher in the Laboratoire d'Informatique Fondamentale de Lille. He is the leader of the OPAC team at LIFL, the DOLPHIN project at INRIA Futurs and the bioinformatics platform of Lille (Genopole de Lille). He took part in several CEC Esprit and national research projects. His current research interests are mainly parallel and grid computing, combinatorial optimization algorithms and applications, and software frameworks.

Iréa Touche took part in the e-toile project, the first major French high performance data transfer grid project, while she was studying for her engineering degree in applied mathematics and scientific computation. She was in charge of part of the cluster configuration at the CEA (Commissariat à l'énergie atomique) and also had to deploy a parallel molecular dynamics application called CHARMM.
After that, she worked for 6 months at IRIT (Institut de Recherche en Informatique de Toulouse) on the Grid'5000 project. She had to "gridify" four applications used by different laboratories of INP Toulouse (Institut National Polytechnique). To do this she studied the applications' features and deployed them on one or more clusters of the grid. For the multi-parametric applications, she used several sites in order to be able to easily carry out a great number of executions. For the parallel ones, she performed scalability tests. She is currently working at the LGC (Laboratoire de Génie Chimique), where she helps researchers optimize their scientific codes.

Note

1. http://ina2.eivd.ch/Collaborateurs/etd/problemes.dir/ordonnancement.dir/ordonnancement.html

References

Augerat, P., Martin, C., and Stein, B. 2002. Scalable monitoring and configuration tools for grids and clusters. Proceedings of the 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing. IEEE Computer Society.

Bolze, R., Caron, E., and Desprez, F. 2006. A monitoring and visualization tool and its application for a network enabled server platform. Parallel and Distributed Computing Workshop of ICCSA 2006, LNCS, 8-11 May, Glasgow, UK.

Caron, E. and Desprez, F. 2006. DIET: A scalable toolbox to build network enabled servers on the Grid. International Journal of High Performance Computing Applications, 20(2):335-352.

Casanova, H., Legrand, A., and Marchal, L. 2003. Scheduling distributed applications: the SimGrid simulation framework. Proceedings of the Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid'03), Tokyo, Japan.

Chun, B., Culler, D., Roscoe, T., Bavier, A., Peterson, L., Wawrzoniak, M., and Bowman, M. 2003. PlanetLab: An overlay testbed for broad-coverage services. ACM SIGCOMM Computer Communication Review, 33(3):3-12.

Daydé, M., Giraud, L., Hernandez, M., L'Excellent, J.-Y., Pantel, M., and Puglisi, C. 2004. An overview of the Grid-TLSE project. Proceedings of the 6th International Meeting VECPAR 04, June, Valencia, Spain.

Dumitrescu, C. and Foster, I. 2005. GangSim: A simulator for grid scheduling studies. Proceedings of the IEEE International Symposium on Cluster Computing and the Grid (CCGrid'05), May, Cardiff, UK.

Georgiou, Y., Leduc, J., Videau, B., Peyrard, J., and Richard, O. 2006. A tool for environment deployment in clusters and light grids. Second Workshop on System Management Tools for Large-Scale Parallel Systems (SMTPS'06), April, Rhodes Island, Greece.

Georgiou, Y., Richard, O., Neyron, P., Huard, G., and Martin, C. 2005. A batch scheduler with high level components. Proceedings of CCGRID'2005, May, Cardiff, UK. IEEE Computer Society.

Liu, X., Xia, H., and Chien, A. 2004. Validating and scaling the MicroGrid: A scientific instrument for grid dynamics. The Journal of Grid Computing, 2(2):141-161.

Melab, N. 2005. Contributions a la resolution de problemes d'optimisation combinatoire sur grilles de calcul. Ph.D. thesis, November, LIFL, USTL.

Mezmaz, M., Melab, N., and Talbi, E.-G. 2006. A grid hybrid exact approach for solving multi-objective problems. Proceedings of the 9th IEEE/ACM International Workshop on Nature Inspired Distributed Computing (NIDISC'06, in conjunction with IPDPS'2006), Rhodes Island, Greece.

Ponish, A., Azzaro-Pantel, C., Domenech, S., and Pibouleau, L. 2005. About the relevance of mathematical programming and stochastic optimisation methods: application to the optimal batch plant design problems.
ESCAPE 15, May 29-June 1.

Ruiz, R. and Stutzle, T. 2004. A simple and effective iterative greedy algorithm for the flowshop scheduling problem. Technical report, European Journal of Operational Research, in print.

Seymour, K., Lee, C., Desprez, F., Nakada, H., and Tanaka, Y. 2004. The end-user and middleware APIs for GridRPC. Workshop on Grid Application Programming Interfaces, in conjunction with GGF12, September, Brussels, Belgium.

Takefusa, A., Matsuoka, S., Aida, K., Nakada, H., and Nagashima, U. 1999. Overview of a performance evaluation system for global computing scheduling algorithms. HPDC '99: Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing, Washington, DC, USA. IEEE Computer Society.

White, B., Lepreau, J., Stoller, L., Ricci, R., Guruprasad, S., Newbold, M., Hibler, M., Barb, C., and Joglekar, A. 2002. An integrated experimental environment for distributed systems and networks. OSDI'02: Proceedings of the Fifth Symposium on Operating Systems Design and Implementation, pp. 255-270, December, Boston, MA.</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<back>
<notes>
<p>1 http://ina2.eivd.ch/Collaborateurs/etd/problemes.dir/ordonnancement.dir/ordonnancement.html</p>
</notes>
<ref-list>
<ref>
<citation citation-type="confproc" xlink:type="simple">
<name name-style="western">
<surname>Augerat, P.</surname>
</name>
,
<name name-style="western">
<surname>Martin, C.</surname>
</name>
, and
<name name-style="western">
<surname>Stein, B.</surname>
</name>
2002.
<article-title>Scalable monitoring and configuration tools for grids and clusters</article-title>
.
<conf-name>Proceedings of the 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing</conf-name>
.
<conf-loc>IEEE Computer Society</conf-loc>
.</citation>
</ref>
<ref>
<citation citation-type="book" xlink:type="simple">
<name name-style="western">
<surname>Bolze, R.</surname>
</name>
,
<name name-style="western">
<surname>Caron, E.</surname>
</name>
, and
<name name-style="western">
<surname>Desprez, F.</surname>
</name>
<year>2006</year>
.
<article-title>A monitoring and visualization tool and its application for a network enabled server platform</article-title>
. In
<name name-style="western">
<surname>LNCS</surname>
</name>
, editor,
<source>Parallel and Distributed Computing Workshop of ICCSA 2006</source>
,
<fpage>8</fpage>
<lpage>11</lpage>
May, Glasgow, UK.</citation>
</ref>
<ref>
<citation citation-type="journal" xlink:type="simple">
<name name-style="western">
<surname>Caron, E.</surname>
</name>
and
<name name-style="western">
<surname>Desprez, F.</surname>
</name>
<year>2006</year>
.
<article-title>DIET: A scalable toolbox to build network enabled servers on the Grid</article-title>
.
<source>International Journal of High Performance Computing Applications</source>
,
<volume>20</volume>
(
<issue>2</issue>
):
<fpage>335</fpage>
<lpage>352</lpage>
.</citation>
</ref>
<ref>
<citation citation-type="confproc" xlink:type="simple">
<name name-style="western">
<surname>Casanova, H.</surname>
</name>
,
<name name-style="western">
<surname>Legrand, A.</surname>
</name>
, and
<name name-style="western">
<surname>Marchal, L.</surname>
</name>
2003.
<article-title>Scheduling distributed applications: the simgrid simulation framework</article-title>
.
<conf-name>Proceedings of the Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid'03)</conf-name>
, Tokyo, Japan.</citation>
</ref>
<ref>
<citation citation-type="journal" xlink:type="simple">
<name name-style="western">
<surname>Chun, B.</surname>
</name>
,
<name name-style="western">
<surname>Culler, D.</surname>
</name>
,
<name name-style="western">
<surname>Roscoe, T.</surname>
</name>
,
<name name-style="western">
<surname>Bavier, A.</surname>
</name>
,
<name name-style="western">
<surname>Peterson, L.</surname>
</name>
,
<name name-style="western">
<surname>Wawrzoniak, M.</surname>
</name>
, and
<name name-style="western">
<surname>Bowman, M.</surname>
</name>
<year>2003</year>
.
<article-title>PlanetLab: An overlay testbed for broad-coverage services</article-title>
.
<source>ACM SIGCOMM Computer Communication Review</source>
,
<volume>33</volume>
(
<issue>3</issue>
):
<fpage>3</fpage>
<lpage>12</lpage>
.</citation>
</ref>
<ref>
<citation citation-type="confproc" xlink:type="simple">
<name name-style="western">
<surname>Daydé, M.</surname>
</name>
,
<name name-style="western">
<surname>Giraud, L.</surname>
</name>
,
<name name-style="western">
<surname>Hernandez, M.</surname>
</name>
,
<name name-style="western">
<surname>L'Excellent, J.-Y.</surname>
</name>
,
<name name-style="western">
<surname>Pantel, M.</surname>
</name>
, and
<name name-style="western">
<surname>Puglisi, C.</surname>
</name>
2004.
<article-title>An overview of the grid-tlse project</article-title>
.
<conf-name>Proceedings of 6th International Meeting VECPAR 04</conf-name>
, June,
<conf-loc>Valencia, Spain</conf-loc>
.</citation>
</ref>
<ref>
<citation citation-type="confproc" xlink:type="simple">
<name name-style="western">
<surname>Dumitrescu, C.</surname>
</name>
and
<name name-style="western">
<surname>Foster, I.</surname>
</name>
2005.
<article-title>Gangsim: A simulator for grid scheduling studies</article-title>
.
<conf-name>Proceedings of the IEEE International Symposium on Cluster Computing and the Grid (CCGrid'05)</conf-name>
, May,
<conf-loc>Cardiff, UK</conf-loc>
.</citation>
</ref>
<ref>
<citation citation-type="book" xlink:type="simple">
<name name-style="western">
<surname>Georgiou, Y.</surname>
</name>
,
<name name-style="western">
<surname>Leduc, J.</surname>
</name>
,
<name name-style="western">
<surname>Videau, B.</surname>
</name>
,
<name name-style="western">
<surname>Peyrard, J.</surname>
</name>
, and
<name name-style="western">
<surname>Richard, O.</surname>
</name>
<year>2006</year>
.
<article-title>A tool for environment deployment in clusters and light grids</article-title>
.
<source>Second Workshop on System Management Tools for Large-Scale Parallel Systems (SMTPS'06)</source>
, April, Rhodes Island, Greece.</citation>
</ref>
<ref>
<citation citation-type="confproc" xlink:type="simple">
<name name-style="western">
<surname>Georgiou, Y.</surname>
</name>
,
<name name-style="western">
<surname>Richard, O.</surname>
</name>
,
<name name-style="western">
<surname>Neyron, P.</surname>
</name>
,
<name name-style="western">
<surname>Huard, G.</surname>
</name>
, and
<name name-style="western">
<surname>Martin, C.</surname>
</name>
2005.
<article-title>A batch scheduler with high level components</article-title>
.
<conf-name>Proceedings of CCGRID'2005</conf-name>
, May,
<conf-loc>Cardiff, UK. IEEE Computer Society</conf-loc>
.</citation>
</ref>
<ref>
<citation citation-type="journal" xlink:type="simple">
<name name-style="western">
<surname>Liu, X.</surname>
</name>
,
<name name-style="western">
<surname>Xia, H.</surname>
</name>
, and
<name name-style="western">
<surname>Chien, A.</surname>
</name>
<year>2004</year>
.
<article-title>Validating and scaling the MicroGrid: A scientific instrument for grid dynamics</article-title>
.
<source>The Journal of Grid Computing</source>
<volume>2</volume>
(
<issue>2</issue>
):
<fpage>141</fpage>
<lpage>161</lpage>
.</citation>
</ref>
<ref>
<citation citation-type="other" xlink:type="simple">Melab, N. 2005. Contributions a la resolution de problemes d'optimisation combinatoire sur grilles de calcul. Ph.D. thesis, November, LIFL, USTL.</citation>
</ref>
<ref>
<citation citation-type="confproc" xlink:type="simple">
<name name-style="western">
<surname>Mezmaz, M.</surname>
</name>
,
<name name-style="western">
<surname>Melab, N.</surname>
</name>
, and
<name name-style="western">
<surname>Talbi, E.-G.</surname>
</name>
2006.
<article-title>A grid hybrid exact approach for solving multi-objective problems</article-title>
.
<conf-name>Proceedings of the 9th IEEE/ACM International Workshop on Nature Inspired Distributed Computing (NIDISC'06 – in conjunction with IPDPS'2006)</conf-name>
,
<conf-loc>Rhodes Island, Greece</conf-loc>
.</citation>
</ref>
<ref>
<citation citation-type="book" xlink:type="simple">
<name name-style="western">
<surname>Ponish, A.</surname>
</name>
,
<name name-style="western">
<surname>Azzaro-Pantel, C.</surname>
</name>
,
<name name-style="western">
<surname>Domenech, S.</surname>
</name>
, and
<name name-style="western">
<surname>Pibouleau, L.</surname>
</name>
<year>2005</year>
.
<article-title>About the relevance of mathematical programming and stochastic optimisation methods: application to the optimal batch plant design problems</article-title>
.
<source>ESCAPE 15</source>
, May 29–June 1.</citation>
</ref>
<ref>
<citation citation-type="other" xlink:type="simple">Ruiz, R. and Stutzle, T. 2004. A simple and effective iterative greedy algorithm for the flowshop scheduling problem. Technical Report, European Journal of Operational Research, in print.</citation>
</ref>
<ref>
<citation citation-type="book" xlink:type="simple">
<name name-style="western">
<surname>Seymour, K.</surname>
</name>
,
<name name-style="western">
<surname>Lee, C.</surname>
</name>
,
<name name-style="western">
<surname>Desprez, F.</surname>
</name>
,
<name name-style="western">
<surname>Nakada, H.</surname>
</name>
, and
<name name-style="western">
<surname>Tanaka, Y.</surname>
</name>
<year>2004</year>
.
<article-title>The end-user and middleware APIs for GridRPC</article-title>
.
<source>Workshop on Grid Application Programming Interfaces</source>
, in conjunction with GGF12, September, Brussels, Belgium.</citation>
</ref>
<ref>
<citation citation-type="book" xlink:type="simple">
<name name-style="western">
<surname>Takefusa, A.</surname>
</name>
,
<name name-style="western">
<surname>Matsuoka, S.</surname>
</name>
,
<name name-style="western">
<surname>Aida, K.</surname>
</name>
,
<name name-style="western">
<surname>Nakada, H.</surname>
</name>
, and
<name name-style="western">
<surname>Nagashima, U.</surname>
</name>
<year>1999</year>
.
<article-title>Overview of a performance evaluation system for global computing scheduling algorithms</article-title>
.
<source>HPDC '99: Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing</source>
,
<publisher-loc>Washington, DC, USA</publisher-loc>
.
<publisher-name>IEEE Computer Society</publisher-name>
.</citation>
</ref>
<ref>
<citation citation-type="book" xlink:type="simple">
<name name-style="western">
<surname>White, B.</surname>
</name>
,
<name name-style="western">
<surname>Lepreau, J.</surname>
</name>
,
<name name-style="western">
<surname>Stoller, L.</surname>
</name>
,
<name name-style="western">
<surname>Ricci, R.</surname>
</name>
,
<name name-style="western">
<surname>Guruprasad, S.</surname>
</name>
,
<name name-style="western">
<surname>Newbold, M.</surname>
</name>
,
<name name-style="western">
<surname>Hibler, M.</surname>
</name>
,
<name name-style="western">
<surname>Barb, C.</surname>
</name>
, and
<name name-style="western">
<surname>Joglekar, A.</surname>
</name>
<year>2002</year>
.
<article-title>An integrated experimental environment for distributed systems and networks</article-title>
.
<source>OSDI02 Proceedings of the Fifth Symposium on Operating Systems Design and Implementation</source>
, pp.
<fpage>255</fpage>
<lpage>270</lpage>
, December, Boston, MA.</citation>
</ref>
</ref-list>
</back>
</article>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed</title>
</titleInfo>
<titleInfo type="alternative" lang="en" contentType="CDATA">
<title>Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed</title>
</titleInfo>
<name type="personal">
<namePart type="given">Raphaël</namePart>
<namePart type="family">Bolze</namePart>
<affiliation>Lip, Ens Lyon</affiliation>
</name>
<name type="personal">
<namePart type="given">Franck</namePart>
<namePart type="family">Cappello</namePart>
<affiliation>Inria, Lri, Paris</affiliation>
<affiliation>E-mail: Fci@Lri.Fr</affiliation>
</name>
<name type="personal">
<namePart type="given">Eddy</namePart>
<namePart type="family">Caron</namePart>
<affiliation>Lip, Ens Lyon</affiliation>
</name>
<name type="personal">
<namePart type="given">Michel</namePart>
<namePart type="family">Daydé</namePart>
<affiliation>Inpt/Irit, Toulouse</affiliation>
</name>
<name type="personal">
<namePart type="given">Frédéric</namePart>
<namePart type="family">Desprez</namePart>
<affiliation>Lip, Ens Lyon</affiliation>
</name>
<name type="personal">
<namePart type="given">Emmanuel</namePart>
<namePart type="family">Jeannot</namePart>
<affiliation>Loria, Inria</affiliation>
</name>
<name type="personal">
<namePart type="given">Yvon</namePart>
<namePart type="family">Jégou</namePart>
<affiliation>Irisa, Inria</affiliation>
</name>
<name type="personal">
<namePart type="given">Stephane</namePart>
<namePart type="family">Lanteri</namePart>
<affiliation>Inria Sophia Antipolis</affiliation>
</name>
<name type="personal">
<namePart type="given">Julien</namePart>
<namePart type="family">Leduc</namePart>
<affiliation>Inria, Lri, Paris</affiliation>
<affiliation>E-mail: Fci@Lri.Fr</affiliation>
</name>
<name type="personal">
<namePart type="given">Noredine</namePart>
<namePart type="family">Melab</namePart>
<affiliation>Lifl, Université De Lille</affiliation>
</name>
<name type="personal">
<namePart type="given">Guillaume</namePart>
<namePart type="family">Mornet</namePart>
<affiliation>Irisa, Inria</affiliation>
</name>
<name type="personal">
<namePart type="given">Raymond</namePart>
<namePart type="family">Namyst</namePart>
<affiliation>Labri, Université De Bordeaux</affiliation>
</name>
<name type="personal">
<namePart type="given">Pascale</namePart>
<namePart type="family">Primet</namePart>
<affiliation>Lip, Ens Lyon</affiliation>
</name>
<name type="personal">
<namePart type="given">Benjamin</namePart>
<namePart type="family">Quetier</namePart>
<affiliation>Inria, Lri, Paris</affiliation>
<affiliation>E-mail: Fci@Lri.Fr</affiliation>
</name>
<name type="personal">
<namePart type="given">Olivier</namePart>
<namePart type="family">Richard</namePart>
<affiliation>Laboratoire Id-Imag</affiliation>
</name>
<name type="personal">
<namePart type="given">El-Ghazali</namePart>
<namePart type="family">Talbi</namePart>
<affiliation>Lifl, Université De Lille</affiliation>
</name>
<name type="personal">
<namePart type="given">Iréa</namePart>
<namePart type="family">Touche</namePart>
<affiliation>Lgc, Toulouse</affiliation>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="research-article" authority="ISTEX" authorityURI="https://content-type.data.istex.fr" valueURI="https://content-type.data.istex.fr/ark:/67375/XTP-1JC4F85T-7">research-article</genre>
<originInfo>
<publisher>Sage Publications</publisher>
<place>
<placeTerm type="text">Sage CA: Thousand Oaks, CA</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2006-11</dateIssued>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
</language>
<abstract lang="en">Large scale distributed systems such as Grids are difficult to study from theoretical models and simulators only. Most Grids deployed at large scale are production platforms that are inappropriate research tools because of their limited reconfiguration, control and monitoring capabilities. In this paper, we present Grid'5000, a 5000 CPU nation-wide infrastructure for research in Grid computing. Grid'5000 is designed to provide a scientific tool for computer scientists similar to the large-scale instruments used by physicists, astronomers, and biologists. We describe the motivations, design considerations, architecture, control, and monitoring infrastructure of this experimental platform. We present configuration examples and performance results for the reconfiguration subsystem.</abstract>
<subject>
<genre>keywords</genre>
<topic>Grid</topic>
<topic>P2P</topic>
<topic>experimental platform</topic>
<topic>highly reconfigurable system</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>The International Journal of High Performance Computing Applications</title>
</titleInfo>
<genre type="journal" authority="ISTEX" authorityURI="https://publication-type.data.istex.fr" valueURI="https://publication-type.data.istex.fr/ark:/67375/JMC-0GLKJH51-B">journal</genre>
<identifier type="ISSN">1094-3420</identifier>
<identifier type="eISSN">1741-2846</identifier>
<identifier type="PublisherID">HPC</identifier>
<identifier type="PublisherID-hwp">sphpc</identifier>
<part>
<date>2006</date>
<detail type="volume">
<caption>vol.</caption>
<number>20</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>4</number>
</detail>
<extent unit="pages">
<start>481</start>
<end>494</end>
</extent>
</part>
</relatedItem>
<identifier type="istex">C3211077D2750B2369FA4662C4215BD8245E6D38</identifier>
<identifier type="ark">ark:/67375/M70-8RJ7Z37Z-T</identifier>
<identifier type="DOI">10.1177/1094342006070078</identifier>
<identifier type="ArticleID">10.1177_1094342006070078</identifier>
<recordInfo>
<recordContentSource authority="ISTEX" authorityURI="https://loaded-corpus.data.istex.fr" valueURI="https://loaded-corpus.data.istex.fr/ark:/67375/XBH-0J1N7DQT-B">sage</recordContentSource>
</recordInfo>
</mods>
<json:item>
<extension>json</extension>
<original>false</original>
<mimetype>application/json</mimetype>
<uri>https://api.istex.fr/ark:/67375/M70-8RJ7Z37Z-T/record.json</uri>
</json:item>
</metadata>
<serie></serie>
</istex>
</record>
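
The metadata above also references a JSON rendering of this record, served through the ISTEX API (the record.json URI in the json:item element). A minimal command-line sketch for retrieving it, assuming only that curl is installed; the output filename is illustrative:

# Fetch the JSON form of this record from the ISTEX API (URI copied from the json:item above).
curl -s "https://api.istex.fr/ark:/67375/M70-8RJ7Z37Z-T/record.json" -o record-002E27.json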

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002E27 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 002E27 | SxmlIndent | more
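
The two commands above differ only in how the path to biblio.hfd is built; a small shell helper can factor this out. This is a sketch only: the function name show_record is hypothetical, and it assumes that the Dilib tools HfdSelect and SxmlIndent are on the PATH and that EXPLOR_STEP is set as shown above.

# Hypothetical helper (not part of Dilib): display one record by its internal key,
# reusing exactly the HfdSelect | SxmlIndent pipeline shown above.
show_record () {
    local key="$1"                                  # e.g. 002E27
    HfdSelect -h "$EXPLOR_STEP/biblio.hfd" -nk "$key" | SxmlIndent | more
}

# Example: show_record 002E27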

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:C3211077D2750B2369FA4662C4215BD8245E6D38
   |texte=   Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022