Discussion Wicri:HypertextV6: difference between revisions
From H2PTM
imported>Jacques Ducloy m (1 revision imported)
Current revision as of 20 July 2017, 14:03
Results of the download phase
- Step 1
- 31 August
time IstexGetCorpus -q "hypertext*" -s 2500 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.00000
real 16m39.445s
user 0m52.675s
sys 0m10.033s
time IstexGetCorpus -q "hypertext*" -s 2500 -f 2500 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.02500
real 17m2.458s
user 0m49.014s
sys 0m9.563s
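The `real` values reported by `time` are easier to compare between batches once converted to seconds. A minimal sketch, assuming the `NmS.SSSs` format shown in this log; `real_to_seconds` is a hypothetical helper name, not part of the tools used here.

```shell
#!/bin/sh
# real_to_seconds (hypothetical helper): convert a `time` value such as
# 16m39.445s into seconds with three decimals.
real_to_seconds() {
    # Split on "m": $1 is the minutes, $2 the seconds with a trailing "s".
    printf '%s\n' "$1" \
        | LC_ALL=C awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# The two wall-clock times measured in Step 1.
real_to_seconds 16m39.445s
real_to_seconds 17m2.458s
```

The same helper can be applied to every `real` line to see how the wall-clock time changes as the start offset grows.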
- Step 2
- 4 September:
The number of documents turns out to have increased => reset
Crash 1
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.00000
^C
real 11m13.539s
user 0m13.005s
sys 0m2.921s
Rerun
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -A | IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.00000
real 19m0.494s
user 0m42.690s
sys 0m8.521s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 2500 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.02500
real 20m36.548s
user 0m39.006s
sys 0m8.032s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 5000 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.05000
real 20m2.215s
user 0m43.069s
sys 0m8.637s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 7500 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.07500
real 18m51.286s
user 0m36.873s
sys 0m7.478s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 10000 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.10000
real 22m58.873s
user 0m35.527s
sys 0m7.422s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 12500 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.12500
real 25m36.362s
user 0m36.576s
sys 0m7.652s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 15000 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.15000
real 30m29.968s
user 0m35.259s
sys 0m7.587s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 17500 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.17500
real 32m33.527s
user 0m24.830s
sys 0m5.797s
time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 20000 -A \
| IstexToSxml \
| HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.20000
real 16m19.937s
user 0m12.012s
sys 0m3.052s
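The batch commands of Step 2 differ only in the `-f` offset and in the zero-padded suffix of the output name, so they could be generated by a loop. The following is a minimal dry-run sketch that only prints the pipelines; `batch_commands` is a hypothetical helper, and the reading of `-s` as batch size and `-f` as start offset is an assumption inferred from the commands in this log.

```shell
#!/bin/sh
# Dry run: print one download pipeline per batch instead of executing it.
# batch_commands is a hypothetical helper, not part of the Istex tools.
# Assumption inferred from the log: -s is the batch size, -f the start offset.
batch_commands() {
    total=$1   # number of documents matching the query
    size=$2    # documents per batch
    query=$3   # query string passed to -q
    from=0
    while [ "$from" -lt "$total" ]; do
        suffix=$(printf '%05d' "$from")   # 0 -> 00000, 2500 -> 02500
        if [ "$from" -eq 0 ]; then
            offset=""                     # the first batch in the log has no -f
        else
            offset=" -f $from"
        fi
        # $EXPLOR_AREA stays literal here; it is expanded when the line is run.
        printf 'IstexGetCorpus -q "%s" -s %s%s -A | IstexToSxml | HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.%s\n' \
            "$query" "$size" "$offset" "$suffix"
        from=$((from + size))
    done
}

batch_commands 21596 2500 'hypertext* OR hypermedia'
```

With 21596 documents and batches of 2500, the loop prints nine pipelines, for offsets 0 to 20000. Printing rather than executing leaves room to review the commands before running them one at a time.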
Verification:
HfdCat /Users/jacquesducloy/Documents/WicriRoot/Ticri/H2ptm/corpus/Hypertext.storage/HypertextV6/Import/IstexDownload.*.hfd | wc
21596 189124377 2035508930
IstexGetCorpusSize -q "hypertext* OR hypermedia"
21596
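The check above compares the line count of the downloaded records with the corpus size reported for the same query. It could be scripted roughly as follows; `check_count` is a hypothetical helper, and treating one output line of `HfdCat` as one record is an assumption based on the two matching figures in this log.

```shell
#!/bin/sh
# check_count (hypothetical helper): compare the number of downloaded
# records with the corpus size reported for the same query.
check_count() {
    # Strip whitespace: wc pads its output with spaces on some systems.
    downloaded=$(printf '%s' "$1" | tr -d '[:space:]')
    expected=$(printf '%s' "$2" | tr -d '[:space:]')
    if [ "$downloaded" -eq "$expected" ]; then
        echo "OK: $downloaded records"
    else
        echo "MISMATCH: downloaded $downloaded, expected $expected"
        return 1
    fi
}

# Values taken from the log; in practice they would come from
#   HfdCat $EXPLOR_AREA/Import/IstexDownload.*.hfd | wc -l
#   IstexGetCorpusSize -q "hypertext* OR hypermedia"
check_count 21596 21596
```

A non-zero return status on mismatch lets the check stop a larger script before the repository is built from an incomplete download.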
Creating the repository
time HfdCat $EXPLOR_AREA/Import/IstexDownload.*.hfd \
| SgmlFast -c 1 | HfdBuild -bh $EXPLOR_AREA/Import/IstexRepository
real 4m2.600s
user 1m30.546s
sys 0m16.059s
Creating the metadata repository
- Documents edited by hand (JSON errors)