Serveur d'exploration sur l'Université de Trèves

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Scheduling expression DAGs for minimal register need

Identifieur interne : 001485 ( Istex/Corpus ); précédent : 001484; suivant : 001486

Scheduling expression DAGs for minimal register need

Auteurs : Christoph W. Kessler1

Source :

RBID : ISTEX:05DEE45F06AC2589021C00311CF21F99695D673C

Abstract

Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial O(n!) enumeration algorithm. We present a new algorithm with worst-case complexity O(n22n) and very good average behavior. Applying a dynamic programming scheme and reordering techniques, our algorithm is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.

Url:
DOI: 10.1016/S0096-0551(98)00002-2

Links to Exploration step

ISTEX:05DEE45F06AC2589021C00311CF21F99695D673C

Le document en format XML

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Scheduling expression DAGs for minimal register need</title>
<author>
<name sortKey="Kessler1, Christoph W" sort="Kessler1, Christoph W" uniqKey="Kessler1 C" first="Christoph W." last="Kessler1">Christoph W. Kessler1</name>
<affiliation>
<mods:affiliation>FB 4 — Informatik, Universität Trier, D-54286 Trier, Germany</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:05DEE45F06AC2589021C00311CF21F99695D673C</idno>
<date when="1998" year="1998">1998</date>
<idno type="doi">10.1016/S0096-0551(98)00002-2</idno>
<idno type="url">https://api.istex.fr/document/05DEE45F06AC2589021C00311CF21F99695D673C/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001485</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001485</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">Scheduling expression DAGs for minimal register need</title>
<author>
<name sortKey="Kessler1, Christoph W" sort="Kessler1, Christoph W" uniqKey="Kessler1 C" first="Christoph W." last="Kessler1">Christoph W. Kessler1</name>
<affiliation>
<mods:affiliation>FB 4 — Informatik, Universität Trier, D-54286 Trier, Germany</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Computer Languages</title>
<title level="j" type="abbrev">CL</title>
<idno type="ISSN">0096-0551</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1998">1998</date>
<biblScope unit="volume">24</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="33">33</biblScope>
<biblScope unit="page" to="53">53</biblScope>
</imprint>
<idno type="ISSN">0096-0551</idno>
</series>
<idno type="istex">05DEE45F06AC2589021C00311CF21F99695D673C</idno>
<idno type="DOI">10.1016/S0096-0551(98)00002-2</idno>
<idno type="PII">S0096-0551(98)00002-2</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0096-0551</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial O(n!) enumeration algorithm. We present a new algorithm with worst-case complexity O(n22n) and very good average behavior. Applying a dynamic programming scheme and reordering techniques, our algorithm is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.</div>
</front>
</TEI>
<istex>
<corpusName>elsevier</corpusName>
<author>
<json:item>
<name>Christoph W. Kessler1</name>
<affiliations>
<json:string>FB 4 — Informatik, Universität Trier, D-54286 Trier, Germany</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Program optimization</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Basic block</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Expression DAG</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Instruction scheduling</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Register allocation</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Code generation</value>
</json:item>
</subject>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>Full-length article</json:string>
</originalGenre>
<abstract>Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial O(n!) enumeration algorithm. We present a new algorithm with worst-case complexity O(n22n) and very good average behavior. Applying a dynamic programming scheme and reordering techniques, our algorithm is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.</abstract>
<qualityIndicators>
<score>6.776</score>
<pdfVersion>1.2</pdfVersion>
<pdfPageSize>595 x 842 pts (A4)</pdfPageSize>
<refBibsNative>true</refBibsNative>
<keywordCount>6</keywordCount>
<abstractCharCount>967</abstractCharCount>
<pdfWordCount>7636</pdfWordCount>
<pdfCharCount>41179</pdfCharCount>
<pdfPageCount>21</pdfPageCount>
<abstractWordCount>148</abstractWordCount>
</qualityIndicators>
<title>Scheduling expression DAGs for minimal register need</title>
<pii>
<json:string>S0096-0551(98)00002-2</json:string>
</pii>
<refBibs>
<json:item>
<author>
<json:item>
<name>GJ Chaitin</name>
</json:item>
<json:item>
<name>MA Auslander</name>
</json:item>
<json:item>
<name>AK Chandra</name>
</json:item>
<json:item>
<name>J Cocke</name>
</json:item>
<json:item>
<name>ME Hopkins</name>
</json:item>
<json:item>
<name>PW Markstein</name>
</json:item>
</author>
<host>
<volume>6</volume>
<pages>
<last>57</last>
<first>47</first>
</pages>
<author></author>
<title>Computer Languages</title>
</host>
<title>Register allocation via coloring</title>
</json:item>
<json:item>
<author>
<json:item>
<name>GJ Chaitin</name>
</json:item>
</author>
<host>
<volume>17</volume>
<pages>
<last>207</last>
<first>201</first>
</pages>
<issue>6</issue>
<author></author>
<title>ACM SIGPLAN Notices</title>
</host>
<title>Register allocation and spilling via graph coloring</title>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<author>
<json:item>
<name>FC Chow</name>
</json:item>
<json:item>
<name>JL Hennessy</name>
</json:item>
</author>
<host>
<volume>19</volume>
<pages>
<last>232</last>
<first>222</first>
</pages>
<issue>6</issue>
<author></author>
<title>ACM SIGPLAN Notices</title>
</host>
<title>Register allocation by priority-based coloring</title>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<author>
<json:item>
<name>JJ Dongarra</name>
</json:item>
<json:item>
<name>AR Jinds</name>
</json:item>
</author>
<host>
<volume>9</volume>
<pages>
<last>226</last>
<first>219</first>
</pages>
<issue>3</issue>
<author></author>
<title>Software: Practice and Experience</title>
</host>
<title>Unrolling loops in Fortran</title>
</json:item>
<json:item>
<author>
<json:item>
<name>JA Fisher</name>
</json:item>
</author>
<host>
<volume>C30</volume>
<pages>
<last>490</last>
<first>478</first>
</pages>
<issue>7</issue>
<author></author>
<title>IEEE Transactions on Computers,</title>
</host>
<title>Trace scheduling: a technique for global microcode compaction</title>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<author>
<json:item>
<name>R Sethi</name>
</json:item>
</author>
<host>
<volume>4</volume>
<pages>
<last>248</last>
<first>226</first>
</pages>
<author></author>
<title>SIAM J Comput</title>
</host>
<title>Complete register allocation problems</title>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<author>
<json:item>
<name>CW Kessler</name>
</json:item>
<json:item>
<name>T Rauber</name>
</json:item>
</author>
<host>
<volume>21</volume>
<pages>
<last>127</last>
<first>113</first>
</pages>
<issue>2</issue>
<author></author>
<title>Computer Languages</title>
</host>
<title>Generating optimal contiguous evaluations for expression DAGs</title>
</json:item>
<json:item>
<author>
<json:item>
<name>R Sethi</name>
</json:item>
<json:item>
<name>JD Ullman</name>
</json:item>
</author>
<host>
<volume>17</volume>
<pages>
<last>728</last>
<first>715</first>
</pages>
<author></author>
<title>Journal of the ACM</title>
</host>
<title>The generation of optimal code for arithmetic expressions</title>
</json:item>
<json:item>
<author>
<json:item>
<name>AV Aho</name>
</json:item>
<json:item>
<name>SC Johnson</name>
</json:item>
</author>
<host>
<volume>23</volume>
<pages>
<last>501</last>
<first>488</first>
</pages>
<issue>3</issue>
<author></author>
<title>Journal of the ACM</title>
</host>
<title>Optimal code generation for expression trees</title>
</json:item>
<json:item>
<author>
<json:item>
<name>D Bernstein</name>
</json:item>
<json:item>
<name>I Gertner</name>
</json:item>
</author>
<host>
<volume>11</volume>
<pages>
<last>67</last>
<first>57</first>
</pages>
<issue>1</issue>
<author></author>
<title>ACM Trans on Progr Lang and Systems</title>
</host>
<title>Scheduling expressions on a pipelined processor with a maximal delay of one cycle</title>
</json:item>
<json:item>
<author>
<json:item>
<name>D Bernstein</name>
</json:item>
<json:item>
<name>JM Jaffe</name>
</json:item>
<json:item>
<name>M Rodeh</name>
</json:item>
</author>
<host>
<volume>18</volume>
<pages>
<last>1127</last>
<first>1098</first>
</pages>
<author></author>
<title>SIAM J Comput</title>
</host>
<title>Scheduling arithmetic and load operations in parallel with no spilling</title>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<author>
<json:item>
<name>R Venugopal</name>
</json:item>
<json:item>
<name>YN Srikant</name>
</json:item>
</author>
<host>
<volume>21</volume>
<pages>
<last>65</last>
<first>49</first>
</pages>
<issue>1</issue>
<author></author>
<title>Computer Languages</title>
</host>
<title>Scheduling expression trees with reusable registers on delayed-load architectures</title>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<author>
<json:item>
<name>J Hennessy</name>
</json:item>
<json:item>
<name>T Gross</name>
</json:item>
</author>
<host>
<volume>5</volume>
<pages>
<last>448</last>
<first>422</first>
</pages>
<issue>3</issue>
<author></author>
<title>ACM Transactions on Programming Languages and Systems</title>
</host>
<title>Postpass code optimization of pipeline constraints</title>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
<json:item>
<author>
<json:item>
<name>CW Kessler</name>
</json:item>
<json:item>
<name>H Seidl</name>
</json:item>
</author>
<host>
<volume>25</volume>
<pages>
<last>50</last>
<first>17</first>
</pages>
<issue>1</issue>
<author></author>
<title>International Journal on Parallel Programming</title>
</host>
<title>The Fork95 parallel programming language: design, implementation, application</title>
</json:item>
<json:item>
<host>
<author></author>
</host>
</json:item>
</refBibs>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<volume>24</volume>
<pii>
<json:string>S0096-0551(00)X0009-4</json:string>
</pii>
<pages>
<last>53</last>
<first>33</first>
</pages>
<issn>
<json:string>0096-0551</json:string>
</issn>
<issue>1</issue>
<genre>
<json:string>journal</json:string>
</genre>
<language>
<json:string>unknown</json:string>
</language>
<title>Computer Languages</title>
<publicationDate>1998</publicationDate>
</host>
<publicationDate>1998</publicationDate>
<copyrightDate>1998</copyrightDate>
<doi>
<json:string>10.1016/S0096-0551(98)00002-2</json:string>
</doi>
<id>05DEE45F06AC2589021C00311CF21F99695D673C</id>
<score>2.0476103</score>
<fulltext>
<json:item>
<extension>pdf</extension>
<original>true</original>
<mimetype>application/pdf</mimetype>
<uri>https://api.istex.fr/document/05DEE45F06AC2589021C00311CF21F99695D673C/fulltext/pdf</uri>
</json:item>
<json:item>
<extension>zip</extension>
<original>false</original>
<mimetype>application/zip</mimetype>
<uri>https://api.istex.fr/document/05DEE45F06AC2589021C00311CF21F99695D673C/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/05DEE45F06AC2589021C00311CF21F99695D673C/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a">Scheduling expression DAGs for minimal register need</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>ELSEVIER</publisher>
<availability>
<p>©1998 Elsevier Science Ltd</p>
</availability>
<date>1998</date>
</publicationStmt>
<notesStmt>
<note type="content">Fig. 1: Example: suppose the expression tree for (−a)×((−a+b)×(b+c)) has been scheduled using the Labeling algorithm of Sethi/Ullman with minimal register need 4. —Common subexpression elimination results in a DAG G, reducing the number of instructions to be executed. If G is scheduled according to S0, now 5 registers are required. The better schedule on the right obtained by reordering the instructions needs only 4 registers, which is optimal.</note>
<note type="content">Fig. 2: Algorithm nce.</note>
<note type="content">Fig. 3: A snapshot of a DAG being processed by algorithm nce.</note>
<note type="content">Fig. 4: An example DAG and its selection DAG.</note>
<note type="content">Fig. 5: Algorithm ncc.</note>
<note type="content">Fig. 6: Algorithm ncv.</note>
<note type="content">Fig. 7: Data dependencies among the sets Lik of selection nodes.</note>
<note type="content">Fig. 8: Algorithm time computes the time slots and the completion time of a schedule</note>
<note type="content">Table 1: ncv applied to some random DAGs</note>
<note type="content">Table 2: ncv applied to some DAGs taken from real programs (LL=Livermore Loop Kernels; MDG=molecular dynamics, and SPEC77=atmospheric flow simulation; the last two are taken from the Perfect Club Benchmark Suite)</note>
<note type="content">Table 3: Real and random DAGs, submitted to ncn with two target machine constellations: (1) delayed Load with one delay slot, and (2) delayed Load with 2 delay slots and delayed Compute with 0 or 1 delay slots (randomly chosen with probability 0.5)</note>
</notesStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a">Scheduling expression DAGs for minimal register need</title>
<author xml:id="author-1">
<persName>
<forename type="first">Christoph W.</forename>
<surname>Kessler1</surname>
</persName>
<affiliation>FB 4 — Informatik, Universität Trier, D-54286 Trier, Germany</affiliation>
</author>
</analytic>
<monogr>
<title level="j">Computer Languages</title>
<title level="j" type="abbrev">CL</title>
<idno type="pISSN">0096-0551</idno>
<idno type="PII">S0096-0551(00)X0009-4</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1998"></date>
<biblScope unit="volume">24</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="33">33</biblScope>
<biblScope unit="page" to="53">53</biblScope>
</imprint>
</monogr>
<idno type="istex">05DEE45F06AC2589021C00311CF21F99695D673C</idno>
<idno type="DOI">10.1016/S0096-0551(98)00002-2</idno>
<idno type="PII">S0096-0551(98)00002-2</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>1998</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial O(n!) enumeration algorithm. We present a new algorithm with worst-case complexity O(n22n) and very good average behavior. Applying a dynamic programming scheme and reordering techniques, our algorithm is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.</p>
</abstract>
<textClass>
<keywords scheme="keyword">
<list>
<head>Keywords</head>
<item>
<term>Program optimization</term>
</item>
<item>
<term>Basic block</term>
</item>
<item>
<term>Expression DAG</term>
</item>
<item>
<term>Instruction scheduling</term>
</item>
<item>
<term>Register allocation</term>
</item>
<item>
<term>Code generation</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="1997-12-08">Modified</change>
<change when="1998">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<extension>txt</extension>
<original>false</original>
<mimetype>text/plain</mimetype>
<uri>https://api.istex.fr/document/05DEE45F06AC2589021C00311CF21F99695D673C/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="Elsevier, elements deleted: ce:floats; body; tail">
<istex:xmlDeclaration>version="1.0" encoding="utf-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//ES//DTD journal article DTD version 4.5.2//EN//XML" URI="art452.dtd" name="istex:docType">
<istex:entity SYSTEM="gr1" NDATA="IMAGE" name="gr1"></istex:entity>
<istex:entity SYSTEM="gr2" NDATA="IMAGE" name="gr2"></istex:entity>
<istex:entity SYSTEM="gr3" NDATA="IMAGE" name="gr3"></istex:entity>
<istex:entity SYSTEM="gr4" NDATA="IMAGE" name="gr4"></istex:entity>
<istex:entity SYSTEM="gr5" NDATA="IMAGE" name="gr5"></istex:entity>
<istex:entity SYSTEM="gr6" NDATA="IMAGE" name="gr6"></istex:entity>
<istex:entity SYSTEM="gr7" NDATA="IMAGE" name="gr7"></istex:entity>
<istex:entity SYSTEM="gr8" NDATA="IMAGE" name="gr8"></istex:entity>
</istex:docType>
<istex:document>
<converted-article version="4.5.2" docsubtype="fla">
<item-info>
<jid>CL</jid>
<aid>61</aid>
<ce:pii>S0096-0551(98)00002-2</ce:pii>
<ce:doi>10.1016/S0096-0551(98)00002-2</ce:doi>
<ce:copyright year="1998" type="full-transfer">Elsevier Science Ltd</ce:copyright>
</item-info>
<head>
<ce:title>Scheduling expression DAGs for minimal register need</ce:title>
<ce:author-group>
<ce:author>
<ce:given-name>Christoph W.</ce:given-name>
<ce:surname>Kessler
<ce:sup>1</ce:sup>
</ce:surname>
<ce:cross-ref refid="CORR1">*</ce:cross-ref>
</ce:author>
<ce:affiliation id="AFF1">
<ce:label>affl1</ce:label>
<ce:textfn>FB 4 — Informatik, Universität Trier, D-54286 Trier, Germany</ce:textfn>
</ce:affiliation>
<ce:correspondence id="CORR1">
<ce:label>*</ce:label>
<ce:text>Corresponding author. Fax: +49-651-2013822; E-mail: kessler@psi.uni-trier.de</ce:text>
</ce:correspondence>
</ce:author-group>
<ce:date-received day="29" month="7" year="1997"></ce:date-received>
<ce:date-revised day="8" month="12" year="1997"></ce:date-revised>
<ce:abstract>
<ce:section-title>Abstract</ce:section-title>
<ce:abstract-sec>
<ce:simple-para>Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial
<ce:italic>O</ce:italic>
(
<ce:italic>n</ce:italic>
!) enumeration algorithm. We present a new algorithm with worst-case complexity
<ce:italic>O</ce:italic>
(
<ce:italic>n</ce:italic>
2
<ce:sup>2
<ce:italic>n</ce:italic>
</ce:sup>
) and very good average behavior. Applying a dynamic programming scheme and reordering techniques, our algorithm is able to defer the combinatorial explosion and to generate an
<ce:italic>optimal</ce:italic>
schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.</ce:simple-para>
</ce:abstract-sec>
</ce:abstract>
<ce:keywords class="keyword">
<ce:section-title>Keywords</ce:section-title>
<ce:keyword>
<ce:text>Program optimization</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Basic block</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Expression DAG</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Instruction scheduling</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Register allocation</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Code generation</ce:text>
</ce:keyword>
</ce:keywords>
</head>
</converted-article>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo>
<title>Scheduling expression DAGs for minimal register need</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA">
<title>Scheduling expression DAGs for minimal register need</title>
</titleInfo>
<name type="personal">
<namePart type="given">Christoph W.</namePart>
<namePart type="family">Kessler1</namePart>
<affiliation>FB 4 — Informatik, Universität Trier, D-54286 Trier, Germany</affiliation>
<description>Corresponding author. Fax: +49-651-2013822; E-mail: kessler@psi.uni-trier.de</description>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="Full-length article"></genre>
<originInfo>
<publisher>ELSEVIER</publisher>
<dateIssued encoding="w3cdtf">1998</dateIssued>
<dateModified encoding="w3cdtf">1997-12-08</dateModified>
<copyrightDate encoding="w3cdtf">1998</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial O(n!) enumeration algorithm. We present a new algorithm with worst-case complexity O(n22n) and very good average behavior. Applying a dynamic programming scheme and reordering techniques, our algorithm is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.</abstract>
<note type="content">Fig. 1: Example: suppose the expression tree for (−a)×((−a+b)×(b+c)) has been scheduled using the Labeling algorithm of Sethi/Ullman with minimal register need 4. —Common subexpression elimination results in a DAG G, reducing the number of instructions to be executed. If G is scheduled according to S0, now 5 registers are required. The better schedule on the right obtained by reordering the instructions needs only 4 registers, which is optimal.</note>
<note type="content">Fig. 2: Algorithm nce.</note>
<note type="content">Fig. 3: A snapshot of a DAG being processed by algorithm nce.</note>
<note type="content">Fig. 4: An example DAG and its selection DAG.</note>
<note type="content">Fig. 5: Algorithm ncc.</note>
<note type="content">Fig. 6: Algorithm ncv.</note>
<note type="content">Fig. 7: Data dependencies among the sets Lik of selection nodes.</note>
<note type="content">Fig. 8: Algorithm time computes the time slots and the completion time of a schedule</note>
<note type="content">Table 1: ncv applied to some random DAGs</note>
<note type="content">Table 2: ncv applied to some DAGs taken from real programs (LL=Livermore Loop Kernels; MDG=molecular dynamics, and SPEC77=atmospheric flow simulation; the last two are taken from the Perfect Club Benchmark Suite)</note>
<note type="content">Table 3: Real and random DAGs, submitted to ncn with two target machine constellations: (1) delayed Load with one delay slot, and (2) delayed Load with 2 delay slots and delayed Compute with 0 or 1 delay slots (randomly chosen with probability 0.5)</note>
<subject>
<genre>Keywords</genre>
<topic>Program optimization</topic>
<topic>Basic block</topic>
<topic>Expression DAG</topic>
<topic>Instruction scheduling</topic>
<topic>Register allocation</topic>
<topic>Code generation</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>Computer Languages</title>
</titleInfo>
<titleInfo type="abbreviated">
<title>CL</title>
</titleInfo>
<genre type="journal">journal</genre>
<originInfo>
<dateIssued encoding="w3cdtf">19980401</dateIssued>
</originInfo>
<identifier type="ISSN">0096-0551</identifier>
<identifier type="PII">S0096-0551(00)X0009-4</identifier>
<part>
<date>19980401</date>
<detail type="volume">
<number>24</number>
<caption>vol.</caption>
</detail>
<detail type="issue">
<number>1</number>
<caption>no.</caption>
</detail>
<extent unit="issue pages">
<start>1</start>
<end>54</end>
</extent>
<extent unit="pages">
<start>33</start>
<end>53</end>
</extent>
</part>
</relatedItem>
<identifier type="istex">05DEE45F06AC2589021C00311CF21F99695D673C</identifier>
<identifier type="DOI">10.1016/S0096-0551(98)00002-2</identifier>
<identifier type="PII">S0096-0551(98)00002-2</identifier>
<accessCondition type="use and reproduction" contentType="copyright">©1998 Elsevier Science Ltd</accessCondition>
<recordInfo>
<recordContentSource>ELSEVIER</recordContentSource>
<recordOrigin>Elsevier Science Ltd, ©1998</recordOrigin>
</recordInfo>
</mods>
</metadata>
<serie></serie>
</istex>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Rhénanie/explor/UnivTrevesV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001485 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 001485 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Wicri/Rhénanie
   |area=    UnivTrevesV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:05DEE45F06AC2589021C00311CF21F99695D673C
   |texte=   Scheduling expression DAGs for minimal register need
}}

Wicri

This area was generated with Dilib version V0.6.31.
Data generation: Sat Jul 22 16:29:01 2017. Site generation: Wed Feb 28 14:55:37 2024