Taking snapshots of the Web with a TEI camera
Internal identifier: 000065 (PascalFrancis/Curation); previous: 000064; next: 000066
Authors: D. Walker [Canada]
Source: Computers and the humanities [0010-4817]; 1999.
RBID : Francis:524-99-12228
French descriptors
English descriptors
Abstract
Electronic texts are claimed to exhibit features distinct from their more tangible cousins. The Snapshot project aims to observe and capture language usage in an electronic medium by creating an open corpus of World Wide Web documents. These documents are re-encoded using the TEI guidelines to create a flexible, persistent and portable data repository. This report gives an overview of the decisions made with respect to the re-encoding of HTML documents and the structuring of the overall corpus.
pA
A01 | 01 | 1 | | @0 0010-4817 |
A02 | 01 | | | @0 COHUAD |
A03 | | 1 | | @0 Comput. humanit. |
A05 | | | | @2 33 |
A06 | | | | @2 1-2 |
A08 | 01 | 1 | ENG | @1 Taking snapshots of the Web with a TEI camera |
A09 | 01 | 1 | ENG | @1 Selected papers from TEI 10: Celebrating the tenth anniversary of the Text Encoding Initiative |
A11 | 01 | 1 | | @1 WALKER (D.) |
A12 | 01 | 1 | | @1 MYLONAS (Elli) @9 ed. |
A12 | 02 | 1 | | @1 RENEAR (Allen) @9 ed. |
A14 | 01 | | | @1 Computing and Information Science, Queen's University @2 Kingston, Ontario, K7L 3N6 @3 CAN @Z 1 aut. |
A15 | 01 | | | @1 Scholarly Technology Group, Brown University @2 Providence, RI @3 USA @Z 1 aut. @Z 2 aut. |
A20 | | | | @1 185-192 |
A21 | | | | @1 1999 |
A23 | 01 | | | @0 ENG |
A43 | 01 | | | @1 INIST @2 14902 @5 354000084333370130 |
A44 | | | | @0 0000 @1 © 1999 INIST-CNRS. All rights reserved. |
A45 | | | | @0 17 ref. |
A47 | 01 | 1 | | @0 524-99-12228 |
A60 | | | | @1 P @2 C |
A61 | | | | @0 A |
A64 | 01 | 1 | | @0 Computers and the humanities |
A66 | 01 | | | @0 NLD |
A68 | 01 | 1 | FRE | @1 Prendre des instantanés sur le Web avec un appareil-photo TEI |
A69 | 01 | 1 | FRE | @1 Sélection d'articles célébrant le 10e anniversaire de la TEI |
C01 | 01 | | ENG | @0 Electronic texts are claimed to exhibit features distinct from their more tangible cousins. The Snapshot project aims to observe and capture language usage in an electronic medium by creating an open corpus of World Wide Web documents. These documents are re-encoded using the TEI guidelines to create a flexible, persistent and portable data repository. This report gives an overview of the decisions made with respect to the re-encoding of HTML documents, and with the structuring the overall corpus |
C02 | 01 | L | | @0 52478 @1 XV |
C02 | 02 | L | | @0 524 |
C03 | 01 | L | FRE | @0 Linguistique informatique @5 02 |
C03 | 01 | L | ENG | @0 Computational linguistics @5 02 |
C03 | 02 | L | FRE | @0 Texte électronique @5 03 |
C03 | 02 | L | ENG | @0 Electronic text @5 03 |
C03 | 03 | L | FRE | @0 Description @5 04 |
C03 | 03 | L | ENG | @0 Description @5 04 |
C03 | 04 | L | FRE | @0 Standardisation @5 05 |
C03 | 04 | L | ENG | @0 Standardization @5 05 |
C03 | 05 | L | FRE | @0 Usage linguistique @5 06 |
C03 | 06 | L | FRE | @0 Encodage @4 INC @5 31 |
C03 | 07 | L | FRE | @0 Internet @4 INC @5 33 |
C03 | 08 | L | FRE | @0 TEI @4 CD @5 96 |
C03 | 08 | L | ENG | @0 TEI @4 CD @5 96 |
C03 | 09 | L | FRE | @0 Linguistique de corpus @4 CD @5 97 |
C03 | 09 | L | ENG | @0 Corpus linguistics @4 CD @5 97 |
N21 | | | | @1 193 |

pR
A30 | 01 | 1 | ENG | @1 Text Encoding Initiative 10th Anniversary Conference @3 Providence, RI USA @4 1997-11 |
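The flattened rows above can be read mechanically. The following is a minimal Python sketch, not part of Dilib. It assumes each row has five pipe-separated cells (field tag, then what appear to be the `i1`, `i2` and `l` attributes of the XML form of this record, then the content) and that each subfield starts with `@` plus a one-character code. The function name `parse_inist_row` is hypothetical.

```python
import re

def parse_inist_row(row):
    """Split one flattened row, e.g. 'A14 | 01 | | | @1 ... @2 ... |', into its parts."""
    # Drop the trailing pipe, then split the remaining text into cells.
    cells = [c.strip() for c in row.strip().rstrip("|").split("|")]
    # Pad to five cells so short rows do not raise an error.
    tag, i1, i2, lang, content = (cells + [""] * 5)[:5]
    # A subfield is '@' + one code character + text, up to the next '@x ' or the end.
    # A list of pairs is used because a code may repeat (e.g. two '@Z' in field A15).
    subfields = [
        (m.group(1), m.group(2).strip())
        for m in re.finditer(r"@(\w)\s+(.*?)(?=\s+@\w\s|$)", content)
    ]
    return {"tag": tag, "i1": i1, "i2": i2, "lang": lang, "subfields": subfields}

row = ("A14 | 01 | | | @1 Computing and Information Science, Queen's University "
       "@2 Kingston, Ontario, K7L 3N6 @3 CAN @Z 1 aut. |")
parsed = parse_inist_row(row)
print(parsed["tag"], parsed["subfields"])
```

Content that itself contains a pipe character or an `@x ` sequence would be mis-split by this sketch; none of the rows in this record do.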
Links to previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: to go to this record in the Curation step: 000064
Links to Exploration step
Francis:524-99-12228
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Taking snapshots of the Web with a TEI camera</title>
<author><name sortKey="Walker, D" sort="Walker, D" uniqKey="Walker D" first="D." last="Walker">D. Walker</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Computing and Information Science, Queen's University</s1>
<s2>Kingston, Ontario, K7L 3N6</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Canada</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">524-99-12228</idno>
<date when="1999">1999</date>
<idno type="stanalyst">FRANCIS 524-99-12228 INIST</idno>
<idno type="RBID">Francis:524-99-12228</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000064</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000065</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Taking snapshots of the Web with a TEI camera</title>
<author><name sortKey="Walker, D" sort="Walker, D" uniqKey="Walker D" first="D." last="Walker">D. Walker</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Computing and Information Science, Queen's University</s1>
<s2>Kingston, Ontario, K7L 3N6</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Canada</country>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Computers and the humanities</title>
<title level="j" type="abbreviated">Comput. humanit.</title>
<idno type="ISSN">0010-4817</idno>
<imprint><date when="1999">1999</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Computers and the humanities</title>
<title level="j" type="abbreviated">Comput. humanit.</title>
<idno type="ISSN">0010-4817</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Computational linguistics</term>
<term>Corpus linguistics</term>
<term>Description</term>
<term>Electronic text</term>
<term>Standardization</term>
<term>TEI</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Linguistique informatique</term>
<term>Texte électronique</term>
<term>Description</term>
<term>Standardisation</term>
<term>Usage linguistique</term>
<term>Encodage</term>
<term>Internet</term>
<term>TEI</term>
<term>Linguistique de corpus</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Electronic texts are claimed to exhibit features distinct from their more tangible cousins. The Snapshot project aims to observe and capture language usage in an electronic medium by creating an open corpus of World Wide Web documents. These documents are re-encoded using the TEI guidelines to create a flexible, persistent and portable data repository. This report gives an overview of the decisions made with respect to the re-encoding of HTML documents, and with the structuring the overall corpus</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>0010-4817</s0>
</fA01>
<fA02 i1="01"><s0>COHUAD</s0>
</fA02>
<fA03 i2="1"><s0>Comput. humanit.</s0>
</fA03>
<fA06><s2>1-2</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG"><s1>Taking snapshots of the Web with a TEI camera</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>Selected papers from TEI 10: Celebrating the tenth anniversary of the Text Encoding Initiative</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>WALKER (D.)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>MYLONAS (Elli)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>RENEAR (Allen)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>Computing and Information Science, Queen's University</s1>
<s2>Kingston, Ontario, K7L 3N6</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
</fA14>
<fA15 i1="01"><s1>Scholarly Technology Group, Brown University</s1>
<s2>Providence, RI</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA15>
<fA20><s1>185-192</s1>
</fA20>
<fA21><s1>1999</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA43 i1="01"><s1>INIST</s1>
<s2>14902</s2>
<s5>354000084333370130</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 1999 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>17 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>524-99-12228</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA64 i1="01" i2="1"><s0>Computers and the humanities</s0>
</fA64>
<fA66 i1="01"><s0>NLD</s0>
</fA66>
<fA68 i1="01" i2="1" l="FRE"><s1>Prendre des instantanés sur le Web avec un appareil-photo TEI</s1>
</fA68>
<fA69 i1="01" i2="1" l="FRE"><s1>Sélection d'articles célébrant le 10<sup>e</sup>
anniversaire de la TEI</s1>
</fA69>
<fC01 i1="01" l="ENG"><s0>Electronic texts are claimed to exhibit features distinct from their more tangible cousins. The Snapshot project aims to observe and capture language usage in an electronic medium by creating an open corpus of World Wide Web documents. These documents are re-encoded using the TEI guidelines to create a flexible, persistent and portable data repository. This report gives an overview of the decisions made with respect to the re-encoding of HTML documents, and with the structuring the overall corpus</s0>
</fC01>
<fC02 i1="01" i2="L"><s0>52478</s0>
<s1>XV</s1>
</fC02>
<fC02 i1="02" i2="L"><s0>524</s0>
</fC02>
<fC03 i1="01" i2="L" l="FRE"><s0>Linguistique informatique</s0>
<s5>02</s5>
</fC03>
<fC03 i1="01" i2="L" l="ENG"><s0>Computational linguistics</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="L" l="FRE"><s0>Texte électronique</s0>
<s5>03</s5>
</fC03>
<fC03 i1="02" i2="L" l="ENG"><s0>Electronic text</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="L" l="FRE"><s0>Description</s0>
<s5>04</s5>
</fC03>
<fC03 i1="03" i2="L" l="ENG"><s0>Description</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="L" l="FRE"><s0>Standardisation</s0>
<s5>05</s5>
</fC03>
<fC03 i1="04" i2="L" l="ENG"><s0>Standardization</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="L" l="FRE"><s0>Usage linguistique</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="L" l="FRE"><s0>Encodage</s0>
<s4>INC</s4>
<s5>31</s5>
</fC03>
<fC03 i1="07" i2="L" l="FRE"><s0>Internet</s0>
<s4>INC</s4>
<s5>33</s5>
</fC03>
<fC03 i1="08" i2="L" l="FRE"><s0>TEI</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="08" i2="L" l="ENG"><s0>TEI</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="09" i2="L" l="FRE"><s0>Linguistique de corpus</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="09" i2="L" l="ENG"><s0>Corpus linguistics</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fN21><s1>193</s1>
</fN21>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>Text Encoding Initiative 10th Anniversary Conference</s1>
<s3>Providence, RI USA</s3>
<s4>1997-11</s4>
</fA30>
</pR>
</standard>
</inist>
</record>
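As an illustration of how the keyword lists in such a record might be read, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. It works on a trimmed sample rather than the full record: as displayed here, the record uses `wicri:` and `inist:` prefixes without visible namespace declarations, which a strict XML parser would likely reject unless they are declared. The function name `keywords_by_language` and the `SAMPLE` string are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed sample modelled on the <textClass> part of the record above.
SAMPLE = """<record><TEI><teiHeader><profileDesc><textClass>
<keywords scheme="KwdEn" xml:lang="en"><term>Computational linguistics</term>
<term>TEI</term></keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Linguistique informatique</term>
<term>TEI</term></keywords>
</textClass></profileDesc></teiHeader></TEI></record>"""

# ElementTree exposes xml:lang under the predefined XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def keywords_by_language(xml_text):
    """Return a dict mapping each xml:lang value to its list of <term> texts."""
    root = ET.fromstring(xml_text)
    result = {}
    for keywords in root.iter("keywords"):
        lang = keywords.get(XML_LANG)
        terms = [term.text for term in keywords.findall("term")]
        result.setdefault(lang, []).extend(terms)
    return result

print(keywords_by_language(SAMPLE))
```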
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/PascalFrancis/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000065 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Curation/biblio.hfd -nk 000065 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien
|wiki= Wicri/Ticri
|area= TeiVM2
|flux= PascalFrancis
|étape= Curation
|type= RBID
|clé= Francis:524-99-12228
|texte= Taking snapshots of the Web with a TEI camera
}}
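For generating many such link templates, a short Python sketch along the following lines may help. It only reproduces the text layout of the template shown above; the function name `explor_link` and its parameter names are hypothetical, and nothing here calls the wiki itself.

```python
def explor_link(wiki, area, flux, etape, type_, cle, texte):
    """Build the text of an 'Explor lien' template with the same fields as above."""
    # The field names (including the accented ones) are copied from the template.
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    lines += [f"|{name}= {value}" for name, value in fields]
    lines.append("}}")
    return "\n".join(lines)

print(explor_link(
    wiki="Wicri/Ticri",
    area="TeiVM2",
    flux="PascalFrancis",
    etape="Curation",
    type_="RBID",
    cle="Francis:524-99-12228",
    texte="Taking snapshots of the Web with a TEI camera",
))
```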
This area was generated with Dilib version V0.6.31. Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024