Discussion Wicri:HypertextV6
From H2PTM
Results of the download phase
- Step 1
- 31 August
time IstexGetCorpus -q "hypertext*" -s 2500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.00000
real	16m39.445s
user	0m52.675s
sys	0m10.033s
time IstexGetCorpus -q "hypertext*" -s 2500 -f 2500 -A \
     | IstexToSxml     \
     | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.02500
real	17m2.458s
user	0m49.014s
sys	0m9.563s
- Step 2
- 4 September:
The number of documents has increased => reset
Crash 1
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.00000
^C
real	11m13.539s
user	0m13.005s
sys	0m2.921s
Restart
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.00000
real	19m0.494s
user	0m42.690s
sys	0m8.521s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 2500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.02500
real	20m36.548s
user	0m39.006s
sys	0m8.032s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 5000 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.05000
real	20m2.215s
user	0m43.069s
sys	0m8.637s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 7500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.07500
real	18m51.286s
user	0m36.873s
sys	0m7.478s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 10000 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.10000
real	22m58.873s
user	0m35.527s
sys	0m7.422s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 12500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.12500
real	25m36.362s
user	0m36.576s
sys	0m7.652s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 15000 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.15000
real	30m29.968s
user	0m35.259s
sys	0m7.587s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 17500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.17500
real	32m33.527s
user	0m24.830s
sys	0m5.797s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 20000 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.20000
real	16m19.937s
user	0m12.012s
sys	0m3.052s
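The batches above all follow the same pattern: a fixed size of 2500, a start offset passed with -f, and an output name carrying the offset padded to five digits. A minimal sketch of how that sequence could be generated is given here. It only prints the pipelines (a dry run) and does not execute them. The function names batch_offsets and print_download_commands are hypothetical, the total of 21596 is the corpus size reported in the verification step, and the flags are copied from the log without any further knowledge of the tools.

```shell
# Print the start offset of every batch needed to cover `total`
# documents in chunks of `size` (hypothetical helper).
batch_offsets() {
    total=$1; size=$2; offset=0
    while [ "$offset" -lt "$total" ]; do
        echo "$offset"
        offset=$((offset + size))
    done
}

# Dry run: print one download pipeline per batch, mirroring the
# commands in the log. Nothing is executed.
print_download_commands() {
    query=$1; total=$2; size=$3
    for offset in $(batch_offsets "$total" "$size"); do
        # Output files in the log are suffixed with the 5-digit offset.
        suffix=$(printf '%05d' "$offset")
        # The first batch in the log has no -f flag, so it is omitted at 0.
        if [ "$offset" -eq 0 ]; then from=""; else from=" -f $offset"; fi
        printf 'IstexGetCorpus -q "%s" -s %s%s -A | IstexToSxml | HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.%s\n' \
            "$query" "$size" "$from" "$suffix"
    done
}

print_download_commands "hypertext* OR hypermedia" 21596 2500
```

Printing the commands first makes it possible to check the offsets and file names before launching downloads that take around 20 minutes each.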
Verification:
HfdCat /Users/jacquesducloy/Documents/WicriRoot/Ticri/H2ptm/corpus/Hypertext.storage/HypertextV6/Import/IstexDownload.*.hfd | wc
21596 189124377 2035508930
IstexGetCorpusSize -q "hypertext* OR hypermedia"
21596
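The check above compares the line count of the downloaded records with the corpus size reported by the server. A small sketch of how that comparison could be automated follows. The function name check_counts is hypothetical, and the commented-out usage assumes that HfdCat writes one record per line and that IstexGetCorpusSize prints only the number, as the log suggests.

```shell
# Compare the local record count with the remote corpus size
# (hypothetical helper; returns non-zero on mismatch).
check_counts() {
    local_count=$1
    remote_count=$2
    if [ "$local_count" -eq "$remote_count" ]; then
        echo "OK: $local_count documents"
    else
        echo "MISMATCH: local=$local_count remote=$remote_count"
        return 1
    fi
}

# Intended use with the tools from the log (assumptions, left commented out):
# local_count=$(HfdCat "$EXPLOR_AREA"/Import/IstexDownload.*.hfd | wc -l | tr -d ' ')
# remote_count=$(IstexGetCorpusSize -q "hypertext* OR hypermedia")
# check_counts "$local_count" "$remote_count"

# Example with the figures recorded above.
check_counts 21596 21596
```

A mismatch would indicate that the corpus changed during the download, which is the situation that forced the reset in step 2.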
Creating the repository
time HfdCat $EXPLOR_AREA/Import/IstexDownload.*.hfd \
      | SgmlFast -c 1 | HfdBuild -bh $EXPLOR_AREA/Import/IstexRepository
real	4m2.600s
user	1m30.546s
sys	0m16.059s
Creating the metadata repository
- Documents edited by hand (JSON errors)


