
Discussion Wicri:HypertextV6

From H2PTM

Results of the download phase

Step 1
31 August
time IstexGetCorpus -q "hypertext*" -s 2500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.00000

real	16m39.445s
user	0m52.675s
sys	0m10.033s


time IstexGetCorpus -q "hypertext*" -s 2500 -f 2500 -A \
     | IstexToSxml     \
     | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.02500

real	17m2.458s
user	0m49.014s
sys	0m9.563s
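
As a rough aid for reading these timings, the `real` values reported by `time` can be converted to seconds and summed. The following is only a minimal sketch in POSIX shell with awk; the helper name `to_seconds` is hypothetical and is not part of the toolchain used on this page.

```shell
#!/bin/sh
# to_seconds VALUE: convert a `time` value such as 16m39.445s to seconds.
# Hypothetical helper; assumes the NNmSS.SSSs format shown above.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Sum the `real` values of the two runs above.
{ to_seconds 16m39.445s; to_seconds 17m2.458s; } \
    | awk '{ total += $1 } END { printf "total: %.3f s\n", total }'
```

Under these assumptions the two runs above add up to a little under 34 minutes of wall-clock time, almost all of it spent waiting rather than in `user` or `sys` time.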


Step 2
4 September:

We find that the number of documents has increased => reset.

Crash 1

time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.00000
^C

real	11m13.539s
user	0m13.005s
sys	0m2.921s

Restart

time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.00000

real	19m0.494s
user	0m42.690s
sys	0m8.521s


time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 2500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.02500

real	20m36.548s
user	0m39.006s
sys	0m8.032s

time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 5000 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.05000

real	20m2.215s
user	0m43.069s
sys	0m8.637s

time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 7500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.07500

real	18m51.286s
user	0m36.873s
sys	0m7.478s


time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 10000 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.10000

real	22m58.873s
user	0m35.527s
sys	0m7.422s

time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 12500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.12500

real	25m36.362s
user	0m36.576s
sys	0m7.652s

time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 15000 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.15000

real	30m29.968s
user	0m35.259s
sys	0m7.587s



time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 17500 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.17500

real	32m33.527s
user	0m24.830s
sys	0m5.797s


time IstexGetCorpus -q "hypertext* OR hypermedia" -s 2500 -f 20000 -A \
      | IstexToSxml     \
      | HfdBuild -bh    $EXPLOR_AREA/Import/IstexDownload.20000

real	16m19.937s
user	0m12.012s
sys	0m3.052s
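
The batches above differ only in the `-f` offset and in the zero-padded suffix of the output file, so the sequence could be generated rather than typed by hand. The sketch below is a dry run: it only prints the command lines and does not call the tools. The function names `batch_offsets` and `batch_command` are hypothetical, and the total of 21596 is taken from the verification step on this page; the options `-q`, `-s`, `-f` and `-A` are used exactly as in the commands above, with `-f` omitted for the first batch as in the log.

```shell
#!/bin/sh
# batch_offsets TOTAL STEP: print one offset per line (0, STEP, 2*STEP, ...)
# for as long as the offset is below TOTAL. Hypothetical helper.
batch_offsets() {
    total=$1
    step=$2
    offset=0
    while [ "$offset" -lt "$total" ]; do
        echo "$offset"
        offset=$((offset + step))
    done
}

# batch_command QUERY SIZE OFFSET: print (not run) the pipeline for one batch.
# The file suffix is the offset zero-padded to five digits, as in the log.
batch_command() {
    query=$1
    size=$2
    offset=$3
    suffix=$(printf '%05d' "$offset")
    if [ "$offset" -eq 0 ]; then
        from=""                 # the first batch in the log has no -f option
    else
        from=" -f $offset"
    fi
    printf 'IstexGetCorpus -q "%s" -s %s%s -A | IstexToSxml | HfdBuild -bh $EXPLOR_AREA/Import/IstexDownload.%s\n' \
        "$query" "$size" "$from" "$suffix"
}

# Dry run: list the nine batches needed for 21596 documents in steps of 2500.
for off in $(batch_offsets 21596 2500); do
    batch_command 'hypertext* OR hypermedia' 2500 "$off"
done
```

Printing the commands first makes it easy to compare them with the log before running anything; each printed line could then be pasted into the shell, or prefixed with `time` as above.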


Verification:

HfdCat /Users/jacquesducloy/Documents/WicriRoot/Ticri/H2ptm/corpus/Hypertext.storage/HypertextV6/Import/IstexDownload.*.hfd | wc
   21596 189124377 2035508930
IstexGetCorpusSize -q "hypertext* OR hypermedia"
21596
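
The check above compares two numbers by eye: the first column of the `wc` output and the size reported by `IstexGetCorpusSize`. A minimal sketch of the same comparison as a script follows. The function name `compare_counts` is hypothetical, and reading the first `wc` column (the line count) as the number of downloaded documents is an assumption suggested by the matching values, not something stated on this page.

```shell
#!/bin/sh
# compare_counts DOWNLOADED EXPECTED: succeed only when both counts agree.
# Hypothetical helper, not part of the toolchain used above.
compare_counts() {
    if [ "$1" -eq "$2" ]; then
        echo "OK: $1 documents"
    else
        echo "MISMATCH: downloaded $1, expected $2" >&2
        return 1
    fi
}

# Values copied from the output above. In a live session they could come from
#   HfdCat $EXPLOR_AREA/Import/IstexDownload.*.hfd | wc -l
#   IstexGetCorpusSize -q "hypertext* OR hypermedia"
compare_counts 21596 21596
```

Returning a non-zero status on a mismatch lets a wrapper script stop before the repository is built from an incomplete download.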

Creating the repository

time HfdCat $EXPLOR_AREA/Import/IstexDownload.*.hfd \
  | SgmlFast -c 1  | HfdBuild -bh $EXPLOR_AREA/Import/IstexRepository

real	4m2.600s
user	1m30.546s
sys	0m16.059s

Creating the metadata repository

Documents edited by hand (JSON errors)