Talk page: Serveur d'exploration sur la grippe au Canada
From Wicri Santé
Revision as of 3 September 2020 at 15:10 by imported>Jacques Ducloy (→Cumuls)
Epidemic facts via MeSH
WicriGetPage -l wicri-canada.fr -p "Provinces et territoires du Canada/Terminologie" \
| MediaWikiCleanTable \
| MediaWikiTable2SxmlRowCol \
| MediaWikiTableTransformCol -l 5 \
| SxmlSelect -g r/c5/p/t/1 -p @g1 \
> $EXPLOR_AREA/FixInput/provincesMeSH.dict
WicriGetPage -l wicri-canada.fr -p "Provinces et territoires du Canada/Terminologie" \
| MediaWikiCleanTable \
| MediaWikiTable2SxmlRowCol \
| MediaWikiTableTransformCol -l 5 \
| SxmlSelect -g r/c5/p/t/1 -p "@g1 (epidemiology)" \
> $EXPLOR_AREA/FixInput/provincesEpidemioMeSH.dict
WicriGetPage -l wicri-canada.fr -p "Provinces et territoires du Canada/Terminologie" \
| MediaWikiCleanTable \
| MediaWikiTable2SxmlRowCol \
| MediaWikiTableTransformCol -l 5 \
| SxmlSelect -g r/c5/p/t/1 -g r/c3/l/1 -p @g1 -p @g2 \
> $EXPLOR_AREA/FixInput/provincesMesh2geoNames.dict
WicriGetPage -l wicri-canada.fr -p "Provinces et territoires du Canada/Terminologie" \
| MediaWikiCleanTable \
| MediaWikiTable2SxmlRowCol \
| MediaWikiTableTransformCol -l 5 \
| SxmlSelect -g r/c5/p/t/1 -p "@g1 (epidemiology)" -g r/c3/l/1 -p @g2 \
> $EXPLOR_AREA/FixInput/provincesEpidemioMeSH2geoNames.dict
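The four pipelines above appear to derive, from the same terminology table, a list of MeSH terms, the same list qualified with "(epidemiology)", and two MeSH-to-GeoNames mappings. A minimal sketch of that idea with standard tools follows. The tab-separated layout (column 3: GeoNames name, column 5: MeSH term), the sample rows, and the function names are assumptions made for illustration; they are not the Dilib commands themselves.

```shell
#!/bin/sh
# Sketch with standard tools; the tab-separated layout below is a
# hypothetical stand-in for the wiki table (column 3: GeoNames name,
# column 5: MeSH term). The sample rows are illustrative.
sample_table() {
  printf '1\tAB\tAlberta\t-\tAlberta\n'
  printf '2\tBC\tBritish Columbia\t-\tBritish Columbia\n'
  printf '3\tQC\tQuebec\t-\tQuebec\n'
}

# MeSH terms only (role of provincesMeSH.dict).
mesh_terms() { awk -F'\t' '{ print $5 }'; }

# MeSH terms qualified with the epidemiology subheading
# (role of provincesEpidemioMeSH.dict).
mesh_epidemio_terms() { awk -F'\t' '{ print $5 " (epidemiology)" }'; }

# MeSH term -> GeoNames name (role of provincesMesh2geoNames.dict).
mesh_to_geonames() { awk -F'\t' -v OFS='\t' '{ print $5, $3 }'; }

sample_table | mesh_epidemio_terms
```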
Epidemic facts in PubMed Central
Mapping dictionaries for PMC
HfdCat $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
| SxmlSelect -g record/TEI/teiHeader/fileDesc/publicationStmt/idno@type=pmc/1 -p @g1 -p @1 \
| sort > $EXPLOR_AREA/FixInput/pmcToHfdRefPubMed.dict
Simple filtering
HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd \
| grep Vancouver | SxmlFindText -s Vancouver -a 30 -b 30 | grep body \
| StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr
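The last step of this filter translates repository identifiers into bibliographic record keys through a sorted dictionary. The sketch below illustrates that kind of lookup with the standard `join` utility. It assumes the dictionary step behaves as a plain key lookup, which these notes do not spell out; the identifiers, file names, and the `translate_hits` function are made up for the example.

```shell
#!/bin/sh
# Sketch only: 'join' stands in for the dictionary lookup; the sample
# identifiers below are made up for illustration.
tab=$(printf '\t')
workdir=$(mktemp -d)

# Dictionary: PMC identifier -> bibliographic record key, sorted on the key column.
printf 'PMC111\t000A01\nPMC222\t000B02\n' | LC_ALL=C sort > "$workdir/pmc2ref.dict"

# Hits: PMC identifier -> text snippet found in the full text.
printf 'PMC222\tcases reported in Vancouver\nPMC999\tno match here\n' \
  | LC_ALL=C sort > "$workdir/hits.list"

# Keep only hits whose identifier is in the dictionary, and append the record key.
translate_hits() {
  LC_ALL=C join -t "$tab" "$1" "$2"
}

translate_hits "$workdir/hits.list" "$workdir/pmc2ref.dict"
```

Both inputs must be sorted on the first column with the same collation, which is why the sketch forces `LC_ALL=C` everywhere.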
With GeoNames
Goal: produce the page Serveur d'exploration sur la grippe au Canada/Où trouver des faits épidémiques ?
- Download the Canadian postal code file from GeoNames:
curl http://download.geonames.org/export/zip/CA.zip -o CA.zip
unzip CA.zip
mv CA.txt Import/geoNamesPosCodesCA.txt
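The GeoNames postal code export is a tab-separated file whose columns are country code, postal code, place name, admin name1, admin code1, admin name2, admin code2, admin name3, admin code3, latitude, longitude and accuracy. The steps that follow read column 4 as the province, column 6 as the city and column 3 as the postal place. The sketch below shows how those columns can be inspected with standard tools; the sample rows are illustrative values in that layout, not lines copied from CA.txt, and `distinct_column` is a hypothetical helper.

```shell
#!/bin/sh
# Illustrative rows in the GeoNames postal code layout (tab-separated):
# 1 country code, 2 postal code, 3 place name, 4 admin name1 (province),
# 5 admin code1, 6 admin name2, ... The values are examples, not copied from CA.txt.
sample_postal_codes() {
  printf 'CA\tT5A\tEdmonton (West Clareview)\tAlberta\tAB\tEdmonton\t\t\t\t53.59\t-113.44\t6\n'
  printf 'CA\tT2A\tCalgary (Penbrooke Meadows)\tAlberta\tAB\tCalgary\t\t\t\t51.05\t-113.96\t6\n'
  printf 'CA\tK1A\tOttawa\tOntario\tON\tOttawa\t\t\t\t45.42\t-75.70\t6\n'
}

# Print the distinct non-empty values of one column (1-based index).
distinct_column() {
  cut -f "$1" | grep -v '^$' | LC_ALL=C sort -u
}

sample_postal_codes | distinct_column 4   # provinces
sample_postal_codes | distinct_column 6   # admin name2, read as city names on this page
```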
Provinces
- Extract the provinces
cat Import/geoNamesPosCodesCA.txt \
| SxmlSelect -p @4 | sort -u > FixInput/geoNamesProvincesCA.dict
cat FixInput/geoNamesProvincesCA.dict \
| SxmlFindTextBuildDict > FixInput/geoNamesProvincesCaFind.dict
- Filter on provinces
HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd \
| SxmlFindText -eD FixInput/geoNamesProvincesCaFind.dict -a 20 -b 20 | grep body \
| StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr \
| SxmlSelect -p @6 -p @1 > FixData/provinces2ref.list
HfdCat $EXPLOR_AREA/FixData/Istex/Corpus/repository.hfd \
| SxmlFindText -eD FixInput/geoNamesProvincesCaFind.dict -a 20 -b 20 | grep body \
| StrDictSelect -t FixInput/HfdIstexRepo2HfdPubMed.dict -sr \
| SxmlSelect -p @6 -p @1 > FixData/provinces2IstexRef.list
HfdCat Data/Main/Exploration/biblio.hfd \
| SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title -p @1 -p @g1 \
| SxmlFindText -eD FixInput/geoNamesProvincesCaFind.dict -a 20 -b 20 \
| SxmlSelect -p @6 -p @1 > FixData/provincesInTitle2ref.list
HfdCat Data/Main/Exploration/biblio.hfd \
| SxmlSelect -g record/TEI/front -p @1 -p @g1 \
| SxmlFindText -eD FixInput/geoNamesProvincesCaFind.dict -a 20 -b 20 \
| SxmlSelect -p @6 -p @1 > FixData/provincesInFront2ref.list
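The filters in this subsection search each document for any province name from a dictionary and keep a short window of text around each match. The sketch below reproduces that idea with `awk`, under the assumption that the `-a 20 -b 20` options mean 20 characters of context after and before a match. The input layout (record key, tab, text) and the `find_terms` function are hypothetical.

```shell
#!/bin/sh
# Sketch only: awk stands in for the dictionary-driven text search.
# Input lines are "record key<TAB>text"; output lines are
# "record key<TAB>matched term<TAB>context".
find_terms() {
  # $1: file with one search term per line (plain text, no regex metacharacters)
  # $2: number of context characters kept on each side of a match
  pattern=$(paste -s -d '|' "$1")
  awk -F'\t' -v OFS='\t' -v pat="$pattern" -v ctx="$2" '
    {
      pos = 1
      rest = $2
      while (match(rest, pat)) {
        at = pos + RSTART - 1                 # match position in the full text
        start = (at > ctx) ? at - ctx : 1     # clamp the window at the text start
        print $1, substr(rest, RSTART, RLENGTH), substr($2, start, (at - start) + RLENGTH + ctx)
        pos = at + RLENGTH
        rest = substr($2, pos)
      }
    }'
}

terms=$(mktemp)
printf 'Alberta\nOntario\n' > "$terms"
printf 'R1\tAn outbreak in Alberta and later in Ontario was described\n' \
  | find_terms "$terms" 20
```

Terms are joined into a single alternation, so a name containing regular expression metacharacters would need escaping first.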
Cities
- Extract the cities
cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @6 \
| StrDictClean | sort -u > FixInput/geoNamesCitiesCA.dict
cat FixInput/geoNamesCitiesCA.dict \
| SxmlFindTextBuildDict > FixInput/geoNamesCitiesCaFind.dict
HfdCat Data/Main/Exploration/biblio.hfd \
| SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title -p @1 -p @g1 \
| SxmlFindText -eD FixInput/geoNamesCitiesCaFind.dict -a 20 -b 20 \
| SxmlSelect -p @6 -p @1 > FixData/citiesInTitle2ref.list
HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd \
| SxmlFindText -eD FixInput/geoNamesCitiesCaFind.dict -a 20 -b 20 | grep body \
| StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr \
| grep -v University | SxmlSelect -p @6 -p @1 > FixData/cities2ref.list
HfdCat $EXPLOR_AREA/FixData/Istex/Corpus/repository.hfd \
| SxmlFindText -eD FixInput/geoNamesCitiesCaFind.dict -a 20 -b 20 | grep body \
| StrDictSelect -t FixInput/HfdIstexRepo2HfdPubMed.dict -sr \
| grep -v University | SxmlSelect -p @6 -p @1 > FixData/cities2IstexRef.list
Postal areas
cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @3 | sort -u > FixInput/geoNamesPlacesCA.dict
HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd \
| SxmlFindText -D FixInput/geoNamesPlacesCA.dict -a 20 -b 20 | grep body \
| StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr \
| grep -v University | SxmlSelect -p @6 -p @1 > FixData/places2ref.list
cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @6| sort -u > FixInput/geoNamesCitiesCA.dict
HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd \
| SxmlFindText -D FixInput/geoNamesCitiesCA.dict -a 20 -b 20 | grep body \
| StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr \
| grep -v University | SxmlSelect -p @6 -p @1 > FixData/cities2ref.list
cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @3 -p @4 | sort -u > FixInput/geoNamesPlaces2ProvCA.dict
cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @6 -p @4 | StrDictClean |sort -u > FixInput/geoNamesCities2ProvCA.dict
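The two dictionaries just built map a place or city name to its province, so that city-level hits can later be counted at province level. A minimal sketch of that translation with `awk` is given below; the sample names, record keys, and the `cities_to_provinces` function are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: translate "city<TAB>record key" lines into
# "province<TAB>record key" lines using a city -> province dictionary.
cities_to_provinces() {
  # $1: dictionary file (city<TAB>province); stdin: city<TAB>record key
  awk -F'\t' -v OFS='\t' '
    NR == FNR { province[$1] = $2; next }   # first file: load the dictionary
    ($1 in province) { print province[$1], $2 }
  ' "$1" -
}

dict=$(mktemp)
printf 'Calgary\tAlberta\nOttawa\tOntario\n' > "$dict"
printf 'Ottawa\t000B02\nParis\t000C03\nCalgary\t000A01\n' | cities_to_provinces "$dict"
```

In this sketch a city name that occurs in several provinces keeps only the last province read, so ambiguous names would need separate handling.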
Totals (Cumuls)
(
HfdCat Data/Main/Exploration/Mesh.i.hfd \
| SxmlSelect -g idx/kw/1 -p @g1 -p @2 \
| StrDictSelect -t FixInput/provincesMeSH.dict -s \
| SxmlSelect -s idx/l/e/1 -p @1 -p @s1 \
| StrDictSelect -t FixInput/provincesMesh2geoNames.dict -sr \
| SxmlSelect -p @2,@1 -p 1
HfdCat Data/Main/Exploration/KwdEn.i.hfd \
| SxmlSelect -g idx/kw/1 -p @g1 -p @2 \
| StrDictSelect -t FixInput/provincesEpidemioMeSH.dict -s \
| SxmlSelect -s idx/l/e/1 -p @1 -p @s1 \
| StrDictSelect -t FixInput/provincesEpidemioMeSH2geoNames.dict -sr \
| SxmlSelect -p @2,@1 -p 2
cat FixData/provinces2ref.list | SxmlSelect -p @2,@1 -p 1
cat FixData/provincesInTitle2ref.list | SxmlSelect -p @2,@1 -p 2
cat FixData/provinces2IstexRef.list | SxmlSelect -p @2,@1 -p 1
cat FixData/provincesInFront2ref.list | SxmlSelect -p @2,@1 -p 1
cat FixData/cities2ref.list | StrDictSelect -t FixInput/geoNamesCities2ProvCA.dict -sr | SxmlSelect -p @2,@1 -p 2
cat FixData/citiesInTitle2ref.list| StrDictSelect -t FixInput/geoNamesCities2ProvCA.dict -sr | SxmlSelect -p @2,@1 -p 3
cat FixData/cities2IstexRef.list | StrDictSelect -t FixInput/geoNamesCities2ProvCA.dict -sr | SxmlSelect -p @2,@1 -p 2
cat FixData/places2ref.list | StrDictSelect -t FixInput/geoNamesPlaces2ProvCA.dict -sr | SxmlSelect -p @2,@1 -p 2
) | sort | SxmlCumul -wp | SxmlSelect -g i/k/1 -g i/n/1 -p @g1 -p @g2 \
| StrDictFromStream -T, > FixData/geoNamesRefProvW.list
cat FixData/geoNamesRefProvW.list | SxmlSelect -p @2 -p @3 | sort | SxmlCumul -wd
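Each source in the parenthesized group above emits a "record key,province" pair with a weight (1, 2 or 3 depending on where the match was found), and the weights are then summed per pair. The sketch below shows that weighted accumulation with `awk`, assuming the cumulation step is a plain sum per key. The sample keys and the `cumulate_weights` function are hypothetical.

```shell
#!/bin/sh
# Sketch only: sum the weights of identical "record key,province" pairs.
# Input:  "record key,province<TAB>weight"
# Output: "record key<TAB>province<TAB>total weight", sorted.
cumulate_weights() {
  awk -F'\t' -v OFS='\t' '
    { total[$1] += $2 }
    END {
      for (key in total) {
        split(key, part, ",")
        print part[1], part[2], total[key]
      }
    }
  ' | LC_ALL=C sort
}

printf '000A01,Alberta\t1\n000B02,Ontario\t3\n000A01,Alberta\t2\n' | cumulate_weights
```

A record found both through MeSH indexing (weight 1) and in a title (weight 2) therefore ends up with a total of 3 for that province.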
- Test with Alberta
cat FixData/geoNamesRefProvW.list \
| grep Alberta | SxmlSelect -p @3 -p @1 | sort -rn | head -15 | SxmlSelect -p @2 -p @1 \
| HfdSelect -h Data/Main/Exploration/biblio -i \
| SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title/1 \
-g record/TEI/teiHeader/fileDesc/publicationStmt/idno@type=RBID/1 -p "*[@2] : {{Explor lien
|wiki= Wicri/Sante
|area= GrippeCanadaV4
|flux= Main
|étape= Exploration
|type= RBID | clé=@g2 | texte=@g1}}"
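The first half of this test keeps the 15 highest-weighted records for one province. A sketch of that selection with standard tools follows. It assumes the weighted list holds record key, province and weight as tab-separated columns; `top_refs` and the sample rows are made up for the example.

```shell
#!/bin/sh
# Sketch only: pick the N highest-weighted records for one province.
# stdin: "record key<TAB>province<TAB>weight"
top_refs() {
  # $1: province name (exact match on column 2), $2: number of records to keep
  awk -F'\t' -v OFS='\t' -v prov="$1" '$2 == prov { print $3, $1 }' \
    | sort -rn \
    | head -n "$2"
}

printf 'R1\tAlberta\t3\nR2\tAlberta\t7\nR3\tOntario\t9\nR4\tAlberta\t1\n' \
  | top_refs Alberta 2
```

Matching on the province column rather than on the whole line avoids picking up rows where the name only appears in another field.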
Page generation
Shell script (FixBin/generListProvince.sh), to be retrieved with:
WicriGetPage -l wicri-sante.fr -p "Discussion:Serveur d'exploration sur la grippe au Canada" \
| MediaWikiExtractSources -w | HfdStoreFile
#!/bin/sh
echo '<span id="'$2'"></span>'
echo "==[[$3]]=="
cat FixData/geoNamesRefProvW.list \
| grep "$1" | SxmlSelect -p @3 -p @1 | sort -rn | head -15 | SxmlSelect -p @2 -p @1 \
| HfdSelect -h Data/Main/Exploration/biblio -i \
| SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title/1 \
-g record/TEI/teiHeader/fileDesc/publicationStmt/idno@type=RBID/1 -p "*[@2] : {{Explor lien
|wiki= Wicri/Sante
|area= GrippeCanadaV4
|flux= Main
|étape= Exploration
|type= RBID | clé=@g2 | texte=@g1}}"
Generalization
(
. FixBin/generListProvince.sh Alberta AB Alberta
. FixBin/generListProvince.sh Columbia BC "Colombie-Britannique"
. FixBin/generListProvince.sh Prince PE "Île-du-Prince-Édouard"
. FixBin/generListProvince.sh Manitoba MB Manitoba
. FixBin/generListProvince.sh Brunswick NB Nouveau-Brunswick
. FixBin/generListProvince.sh Scotia NS Nouvelle-Écosse
. FixBin/generListProvince.sh Nunavut NU Nunavut
. FixBin/generListProvince.sh Ontario ON Ontario
. FixBin/generListProvince.sh Quebec QC Québec
. FixBin/generListProvince.sh Saskatchewan SK Saskatchewan
. FixBin/generListProvince.sh Labrador NL Terre-Neuve-et-Labrador
. FixBin/generListProvince.sh Northwest NT "Territoires du Nord-Ouest"
. FixBin/generListProvince.sh Yukon YT Yukon
)
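The thirteen calls above differ only in their three arguments (search pattern, province code, page title), so the same result could be driven from a small table. The sketch below shows that loop; `emit_province_section` is a hypothetical stand-in that prints only the anchor and heading lines, and the table is shortened to four rows.

```shell
#!/bin/sh
# Sketch only: emit_province_section is a hypothetical stand-in that prints
# just the anchor and heading lines the real script starts with.
emit_province_section() {
  # $1: search pattern, $2: province code, $3: page title
  printf '<span id="%s"></span>\n' "$2"
  printf '==[[%s]]==\n' "$3"
}

emit_all_sections() {
  # One row per province: pattern|code|title
  while IFS='|' read -r pattern code title; do
    emit_province_section "$pattern" "$code" "$title"
  done <<'EOF'
Alberta|AB|Alberta
Columbia|BC|Colombie-Britannique
Prince|PE|Île-du-Prince-Édouard
Northwest|NT|Territoires du Nord-Ouest
EOF
}

emit_all_sections
```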
Generation via a template
#!/bin/sh
echo "|$2="
cat FixData/geoNamesRefProvW.list \
| grep "$1" | SxmlSelect -p @3 -p @1 | sort -rn | head -15 | SxmlSelect -p @2 -p @1 \
| HfdSelect -h Data/Main/Exploration/biblio -i \
| SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title/1 \
-g record/TEI/teiHeader/fileDesc/publicationStmt/idno@type=RBID/1 -p "*[@2] : {{Explor lien
|wiki= Wicri/Sante
|area= GrippeCanadaV4
|flux= Main
|étape= Exploration
|type= RBID | clé=@g2 | texte=@g1}}"
(
MediaWikiExportCommand -c fileBegin
MediaWikiExportCommand -c pageBegin -p "Modèle:GrippeCanadaV4/Liste par province"
echo "{{#switch:{{{code}}}"
. FixBin/generModeleListProvince.sh Alberta AB Alberta
. FixBin/generModeleListProvince.sh Columbia BC "Colombie-Britannique"
. FixBin/generModeleListProvince.sh Prince PE "Île-du-Prince-Édouard"
. FixBin/generModeleListProvince.sh Manitoba MB Manitoba
. FixBin/generModeleListProvince.sh Brunswick NB Nouveau-Brunswick
. FixBin/generModeleListProvince.sh Scotia NS Nouvelle-Écosse
. FixBin/generModeleListProvince.sh Nunavut NU Nunavut
. FixBin/generModeleListProvince.sh Ontario ON Ontario
. FixBin/generModeleListProvince.sh Quebec QC Québec
. FixBin/generModeleListProvince.sh Saskatchewan SK Saskatchewan
. FixBin/generModeleListProvince.sh Labrador NL Terre-Neuve-et-Labrador
. FixBin/generModeleListProvince.sh Northwest NT "Territoires du Nord-Ouest"
. FixBin/generModeleListProvince.sh Yukon YT Yukon
echo "}}"
MediaWikiExportCommand -c pageEnd
MediaWikiExportCommand -c fileEnd
) >exportModeleListProvinces.xml
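The exported template wraps one branch per province code inside a `#switch` parser function, so a page can request a single province list by its code. The sketch below prints only that wikitext skeleton; `list_for_province` is a hypothetical placeholder for the per-province pipeline, and only three provinces are listed.

```shell
#!/bin/sh
# Sketch only: list_for_province is a hypothetical placeholder for the
# per-province pipeline; here it just emits one dummy list item.
list_for_province() { printf '* (entries for %s)\n' "$1"; }

emit_switch_template() {
  printf '%s\n' '{{#switch:{{{code}}}'
  # One row per province: search pattern, then province code
  while read -r pattern code; do
    printf '|%s=\n' "$code"
    list_for_province "$pattern"
  done <<'EOF'
Alberta AB
Ontario ON
Quebec QC
EOF
  printf '%s\n' '}}'
}

emit_switch_template
```

A page would then presumably transclude the template with the code as a parameter, for example `{{GrippeCanadaV4/Liste par province|code=AB}}`.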
Map generation
For the test page on this wiki:
cat FixData/geoNamesRefProvW.list \
| SxmlSelect -p @2 -p @3 \
| StrDictSelect -t $DILIB_ROOT/Data/Wicri/Canada/geoNamesProvinces2Wicri.dict -sr \
| sort | SxmlCumul -wd \
| MediaWikiGeoStatMap -o w -wn w -k k -g $DILIB_ROOT/Data/Wicri/Canada/localisationProvincesCanada.dict \
-m Canada -f "Canada location map.svg" -s 650 \
-a GrippeCanadaV4 -t "Carte Provinces Canada" -C "#%k" -T "%a/Faits épidémiques/%k" \
-h https://lorexplor.istex.fr/Wicri/Sante/explor
Via a template
(
MediaWikiExportCommand -c fileBegin
MediaWikiExportCommand -c pageBegin -p "Modèle:GrippeCanadaV4/Faits épidémiques/carte"
cat FixData/geoNamesRefProvW.list \
| SxmlSelect -p @2 -p @3 \
| StrDictSelect -t $DILIB_ROOT/Data/Wicri/Canada/geoNamesProvinces2Wicri.dict -sr \
| sort | SxmlCumul -wd \
| MediaWikiGeoStatMap -o m -wn w -k k -g $DILIB_ROOT/Data/Wicri/Canada/localisationProvincesCanada.dict \
-m Canada -f "Canada location map.svg" -s 650 \
-a GrippeCanadaV4 -t "Carte Provinces Canada" -C "%a/Faits épidémiques/%k" -T "%a/Faits épidémiques/%k"
MediaWikiExportCommand -c pageEnd
MediaWikiExportCommand -c fileEnd
) >exportModeleCarteProvinces.xml
Epidemic facts in ISTEX
Simple filtering on ISTEX
HfdCat $EXPLOR_AREA/FixData/Istex/Corpus/repository.hfd \
| grep Vancouver | SxmlFindText -s Vancouver -a 30 -b 30 | grep body \
| StrDictSelect -t FixInput/HfdIstexRepo2HfdPubMed.dict -sr