Discussion:Serveur d'exploration sur la grippe au Canada

Epidemic facts via MeSH

WicriGetPage -l wicri-canada.fr -p "Provinces et territoires du Canada/Terminologie"   \
   |  MediaWikiCleanTable                        \
   |  MediaWikiTable2SxmlRowCol                  \
   |  MediaWikiTableTransformCol  -l 5           \
   |  SxmlSelect -g r/c5/p/t/1 -p @g1            \
        > $EXPLOR_AREA/FixInput/provincesMeSH.dict

WicriGetPage -l wicri-canada.fr -p "Provinces et territoires du Canada/Terminologie"   \
   |  MediaWikiCleanTable                        \
   |  MediaWikiTable2SxmlRowCol                  \
   |  MediaWikiTableTransformCol  -l 5           \
   |  SxmlSelect -g r/c5/p/t/1 -p "@g1 (epidemiology)" \
       > $EXPLOR_AREA/FixInput/provincesEpidemioMeSH.dict

WicriGetPage -l wicri-canada.fr -p "Provinces et territoires du Canada/Terminologie"   \
   |  MediaWikiCleanTable                        \
   |  MediaWikiTable2SxmlRowCol                  \
   |  MediaWikiTableTransformCol  -l 5           \
   |  SxmlSelect -g r/c5/p/t/1 -g r/c3/l/1 -p @g1 -p @g2  \
       > $EXPLOR_AREA/FixInput/provincesMesh2geoNames.dict

WicriGetPage -l wicri-canada.fr -p "Provinces et territoires du Canada/Terminologie"   \
   |  MediaWikiCleanTable                        \
   |  MediaWikiTable2SxmlRowCol                  \
   |  MediaWikiTableTransformCol  -l 5           \
   |  SxmlSelect -g r/c5/p/t/1 -p "@g1 (epidemiology)" -g r/c3/l/1 -p @g2 \
       > $EXPLOR_AREA/FixInput/provincesEpidemioMeSH2geoNames.dict

Epidemic facts in PubMed Central

Mapping dictionaries for PMC

HfdCat $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd      \
 | SxmlSelect -g record/TEI/teiHeader/fileDesc/publicationStmt/idno@type=pmc/1 -p @g1 -p @1  \
 | sort > $EXPLOR_AREA/FixInput/pmcToHfdRefPubMed.dict

Simple filtering

HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd               \
   | grep Vancouver | SxmlFindText -s Vancouver -a 30 -b 30 | grep body       \
   | StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr
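
For readers without the Dilib tools, the keyword-in-context idea behind this filter can be approximated with standard utilities. The sketch below assumes that `SxmlFindText -s WORD -a N -b N` keeps up to N characters after and before each match, which is an interpretation of the options rather than something this page states. The `context` helper and the sample sentence are made up for illustration.

```shell
# context WORD WIDTH: print each occurrence of WORD with up to WIDTH
# characters of context on either side.
# Illustrative helper built on grep, not a Dilib tool.
context() {
  grep -oE ".{0,$2}$1.{0,$2}"
}

# Made-up sentence standing in for a document body.
snippet=$(printf '%s\n' 'An outbreak was reported in Vancouver during the winter' \
  | context Vancouver 10)
printf '%s\n' "$snippet"
```

Unlike the pipeline above, this sketch does not carry a document identifier along with the snippet, so it cannot be joined to a reference dictionary as is.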

With GeoNames

Goal: produce the page Serveur d'exploration sur la grippe au Canada/Où trouver des faits épidémiques ?

Download the Canadian postal code file from GeoNames:
curl http://download.geonames.org/export/zip/CA.zip -o CA.zip
unzip CA.zip
mv CA.txt Import/geoNamesPosCodesCA.txt

Provinces

Extracting the provinces
cat Import/geoNamesPosCodesCA.txt  \
 | SxmlSelect -p @4| sort -u > FixInput/geoNamesProvincesCA.dict
cat FixInput/geoNamesProvincesCA.dict    \
 | SxmlFindTextBuildDict > FixInput/geoNamesProvincesCaFind.dict
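
The column numbers used in these extractions follow the GeoNames postal-code layout, a tab-separated file whose fields are country code, postal code, place name, admin name1 (the province), admin code1, admin name2, and so on. As a rough equivalent with standard tools, and assuming that `SxmlSelect -p @N` simply selects the Nth tab-separated field (not stated on this page), the province list could be sketched as follows. The three sample rows are made up.

```shell
# Made-up rows in the GeoNames postal-code layout (tab-separated):
# country code, postal code, place name, admin name1, admin code1, admin name2
sample=$(mktemp)
printf 'CA\tT0A\tPlace A\tAlberta\tAB\tCity A\n'  >  "$sample"
printf 'CA\tK0A\tPlace B\tOntario\tON\tCity B\n'  >> "$sample"
printf 'CA\tT0B\tPlace C\tAlberta\tAB\tCity C\n'  >> "$sample"

# Field 4 holds the province name; sort -u removes the duplicates.
provinces=$(cut -f4 "$sample" | sort -u)
printf '%s\n' "$provinces"
rm -f "$sample"
```

The same pattern with field 3 or field 6 would give the place names and the city names used further down.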
Filtering on provinces
HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd         \
   |   SxmlFindText -eD FixInput/geoNamesProvincesCaFind.dict -a 20 -b 20 | grep body       \
   | StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr       \
   | SxmlSelect -p @6 -p @1 > FixData/provinces2ref.list
HfdCat $EXPLOR_AREA/FixData/Istex/Corpus/repository.hfd         \
   |   SxmlFindText -eD FixInput/geoNamesProvincesCaFind.dict -a 20 -b 20 | grep body       \
   | StrDictSelect -t FixInput/HfdIstexRepo2HfdPubMed.dict -sr       \
   | SxmlSelect -p @6 -p @1 > FixData/provinces2IstexRef.list
HfdCat Data/Main/Exploration/biblio.hfd  \
  | SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title -p @1 -p @g1 \
  | SxmlFindText -eD FixInput/geoNamesProvincesCaFind.dict -a 20 -b 20       \
   | SxmlSelect -p @6 -p @1 > FixData/provincesInTitle2ref.list
HfdCat Data/Main/Exploration/biblio.hfd  \
  | SxmlSelect -g record/TEI/front -p @1 -p @g1 \
  | SxmlFindText -eD FixInput/geoNamesProvincesCaFind.dict -a 20 -b 20       \
   | SxmlSelect -p @6 -p @1 > FixData/provincesInFront2ref.list

Cities

Extracting the cities
cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @6 \
| StrDictClean | sort -u > FixInput/geoNamesCitiesCA.dict

cat FixInput/geoNamesCitiesCA.dict    \
 | SxmlFindTextBuildDict > FixInput/geoNamesCitiesCaFind.dict
HfdCat Data/Main/Exploration/biblio.hfd  \
  | SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title -p @1 -p @g1 \
  | SxmlFindText -eD FixInput/geoNamesCitiesCaFind.dict -a 20 -b 20       \
   | SxmlSelect -p @6 -p @1 > FixData/citiesInTitle2ref.list
HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd         \
   |   SxmlFindText -eD FixInput/geoNamesCitiesCaFind.dict -a 20 -b 20 | grep body       \
   | StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr       \
   |  grep -v University | SxmlSelect -p @6 -p @1 > FixData/cities2ref.list
HfdCat $EXPLOR_AREA/FixData/Istex/Corpus/repository.hfd         \
   |   SxmlFindText -eD FixInput/geoNamesCitiesCaFind.dict -a 20 -b 20 | grep body       \
   | StrDictSelect -t FixInput/HfdIstexRepo2HfdPubMed.dict -sr       \
   | grep -v University | SxmlSelect -p @6 -p @1 > FixData/cities2IstexRef.list

Postal areas

cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @3 | sort -u > FixInput/geoNamesPlacesCA.dict
HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd         \
   |   SxmlFindText -D FixInput/geoNamesPlacesCA.dict -a 20 -b 20 | grep body       \
   | StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr       \
   | grep -v University | SxmlSelect -p @6 -p @1 > FixData/places2ref.list

cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @6| sort -u > FixInput/geoNamesCitiesCA.dict


HfdCat $EXPLOR_AREA/FixData/Pmc/Corpus/repository.hfd         \
   |   SxmlFindText -D FixInput/geoNamesCitiesCA.dict -a 20 -b 20 | grep body       \
   | StrDictSelect -t FixInput/HfdPmcRepo2HfdPubMed.dict -sr       \
   |  grep -v University | SxmlSelect -p @6 -p @1 > FixData/cities2ref.list

cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @3 -p @4 | sort -u > FixInput/geoNamesPlaces2ProvCA.dict

cat Import/geoNamesPosCodesCA.txt | SxmlSelect -p @6 -p @4 | StrDictClean |sort -u > FixInput/geoNamesCities2ProvCA.dict

Cumulative counts

(
 HfdCat Data/Main/Exploration/Mesh.i.hfd     \
   | SxmlSelect -g idx/kw/1 -p @g1 -p @2     \
   | StrDictSelect -t FixInput/provincesMeSH.dict -s   \
   | SxmlSelect -s idx/l/e/1 -p @1 -p @s1             \
   | StrDictSelect -t FixInput/provincesMesh2geoNames.dict -sr   \
   | SxmlSelect -p @2,@1 -p  1

HfdCat Data/Main/Exploration/KwdEn.i.hfd     \
   | SxmlSelect -g idx/kw/1 -p @g1 -p @2     \
   | StrDictSelect -t FixInput/provincesEpidemioMeSH.dict -s   \
     | SxmlSelect -s idx/l/e/1 -p @1 -p @s1             \
   | StrDictSelect -t FixInput/provincesEpidemioMeSH2geoNames.dict -sr   \
   | SxmlSelect -p @2,@1 -p  2


 cat FixData/provinces2ref.list | SxmlSelect -p @2,@1 -p 1
 cat FixData/provincesInTitle2ref.list | SxmlSelect -p @2,@1 -p 2
 cat FixData/provinces2IstexRef.list | SxmlSelect -p @2,@1 -p 1
 cat FixData/provincesInFront2ref.list | SxmlSelect -p @2,@1 -p 1

 cat FixData/cities2ref.list | StrDictSelect -t FixInput/geoNamesCities2ProvCA.dict -sr | SxmlSelect -p @2,@1 -p 2
 cat FixData/citiesInTitle2ref.list| StrDictSelect -t FixInput/geoNamesCities2ProvCA.dict -sr | SxmlSelect -p @2,@1 -p 3
 cat FixData/cities2IstexRef.list | StrDictSelect -t FixInput/geoNamesCities2ProvCA.dict -sr | SxmlSelect -p @2,@1 -p 2


 cat FixData/places2ref.list | StrDictSelect -t FixInput/geoNamesPlaces2ProvCA.dict -sr | SxmlSelect -p @2,@1 -p 2
) | sort | SxmlCumul -wp | SxmlSelect -g i/k/1  -g i/n/1 -p @g1 -p @g2  \
  | StrDictFromStream -T, > FixData/geoNamesRefProvW.list

cat FixData/geoNamesRefProvW.list | SxmlSelect -p @2 -p @3 | sort | SxmlCumul -wd
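
The accumulation step can be pictured with standard tools. The sketch below assumes that each stream inside the parentheses emits a `reference,province` key followed by a weight, and that `SxmlCumul -w…` adds up the weights of identical keys. This is a reading of the pipeline, not documented behaviour. The `sum_weights` helper and the reference numbers are made up.

```shell
# sum_weights: read "key<TAB>weight" lines and print one line per key
# with the total weight (illustrative stand-in for the cumulation step).
sum_weights() {
  awk -F'\t' '{ sum[$1] += $2 }
              END { for (k in sum) printf "%s\t%d\n", k, sum[k] }' | sort
}

# Made-up evidence: one reference found twice for Alberta with
# weights 1 and 2, another found once for Ontario with weight 3.
totals=$(printf '000123,Alberta\t1\n000123,Alberta\t2\n000456,Ontario\t3\n' \
  | sum_weights)
printf '%s\n' "$totals"
```

Under that reading, a reference that matches a province through several channels (MeSH term, title, body text) ends up with a higher total than one that matches through a single channel.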
Test with Alberta
cat FixData/geoNamesRefProvW.list   \
  | grep Alberta | SxmlSelect -p @3 -p @1 | sort -rn  | head -15 | SxmlSelect -p @2 -p @1 \
  | HfdSelect -h Data/Main/Exploration/biblio -i \
  | SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title/1 \
    -g record/TEI/teiHeader/fileDesc/publicationStmt/idno@type=RBID/1 -p "*[@2] : {{Explor lien
   |wiki=    Wicri/Sante
   |area=    GrippeCanadaV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID | clé=@g2 | texte=@g1}}"
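
The ranking part of this test (select a province, order by weight, keep the best entries) can be sketched without the Dilib tools. The sketch assumes that `geoNamesRefProvW.list` holds tab-separated lines of the form reference, province, weight, which is consistent with the field numbers used above but is not spelled out on this page. The `top_refs` helper and the sample values are made up.

```shell
# top_refs PATTERN COUNT: keep lines matching PATTERN, rank them by
# weight (field 3) in decreasing order, and print "reference<TAB>weight"
# for the COUNT best ones. Illustrative helper only.
top_refs() {
  grep -- "$1" \
    | awk -F'\t' '{ print $3 "\t" $1 }' \
    | sort -rn \
    | head -n "$2" \
    | awk -F'\t' '{ print $2 "\t" $1 }'
}

# Made-up list: reference, province, weight.
best=$(printf '000001\tAlberta\t5\n000002\tAlberta\t9\n000003\tOntario\t7\n000004\tAlberta\t2\n' \
  | top_refs Alberta 2)
printf '%s\n' "$best"
```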

Generating the page

Shell script (FixBin/generListProvince.sh), to be retrieved with

WicriGetPage -l wicri-sante.fr -p "Discussion:Serveur d'exploration sur la grippe au Canada" \
  |  MediaWikiExtractSources -w | HfdStoreFile
#!/bin/sh
echo '<span id="'$2'"></span>'
echo "==[[$3]]=="

cat FixData/geoNamesRefProvW.list   \
  | grep "$1" | SxmlSelect -p @3 -p @1 | sort -rn  | head -15 | SxmlSelect -p @2 -p @1 \
  | HfdSelect -h Data/Main/Exploration/biblio -i \
  | SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title/1 \
    -g record/TEI/teiHeader/fileDesc/publicationStmt/idno@type=RBID/1 -p "*[@2] : {{Explor lien
   |wiki=    Wicri/Sante
   |area=    GrippeCanadaV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID | clé=@g2 | texte=@g1}}"

Generalization

(
. FixBin/generListProvince.sh Alberta AB Alberta
. FixBin/generListProvince.sh Columbia BC "Colombie-Britannique"
. FixBin/generListProvince.sh Prince PE "Île-du-Prince-Édouard"
. FixBin/generListProvince.sh Manitoba MB Manitoba
. FixBin/generListProvince.sh Brunswick NB Nouveau-Brunswick
. FixBin/generListProvince.sh Scotia NS Nouvelle-Écosse
. FixBin/generListProvince.sh Nunavut NU Nunavut
. FixBin/generListProvince.sh Ontario ON Ontario
. FixBin/generListProvince.sh Quebec QC Québec
. FixBin/generListProvince.sh Saskatchewan SK Saskatchewan
. FixBin/generListProvince.sh Labrador NL Terre-Neuve-et-Labrador
. FixBin/generListProvince.sh Northwest NT "Territoires du Nord-Ouest"
. FixBin/generListProvince.sh Yukon YT Yukon
)

Generation via a template

#!/bin/sh
echo "|$2="
cat FixData/geoNamesRefProvW.list   \
  | grep "$1" | SxmlSelect -p @3 -p @1 | sort -rn  | head -15 | SxmlSelect -p @2 -p @1 \
  | HfdSelect -h Data/Main/Exploration/biblio -i \
  | SxmlSelect -g record/TEI/teiHeader/fileDesc/titleStmt/title/1 \
    -g record/TEI/teiHeader/fileDesc/publicationStmt/idno@type=RBID/1 -p "*[@2] : {{Explor lien
   |wiki=    Wicri/Sante
   |area=    GrippeCanadaV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID | clé=@g2 | texte=@g1}}"


(
MediaWikiExportCommand -c fileBegin
MediaWikiExportCommand -c pageBegin -p "Modèle:GrippeCanadaV4/Liste par province"
echo "{{#switch:{{{code}}}"
. FixBin/generModeleListProvince.sh Alberta AB Alberta
. FixBin/generModeleListProvince.sh Columbia BC "Colombie-Britannique"
. FixBin/generModeleListProvince.sh Prince PE "Île-du-Prince-Édouard"
. FixBin/generModeleListProvince.sh Manitoba MB Manitoba
. FixBin/generModeleListProvince.sh Brunswick NB Nouveau-Brunswick
. FixBin/generModeleListProvince.sh Scotia NS Nouvelle-Écosse
. FixBin/generModeleListProvince.sh Nunavut NU Nunavut
. FixBin/generModeleListProvince.sh Ontario ON Ontario
. FixBin/generModeleListProvince.sh Quebec QC Québec
. FixBin/generModeleListProvince.sh Saskatchewan SK Saskatchewan
. FixBin/generModeleListProvince.sh Labrador NL Terre-Neuve-et-Labrador
. FixBin/generModeleListProvince.sh Northwest NT "Territoires du Nord-Ouest"
. FixBin/generModeleListProvince.sh Yukon YT Yukon
echo "}}"
MediaWikiExportCommand -c pageEnd 
MediaWikiExportCommand -c fileEnd 
)  >exportModeleListProvinces.xml
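
The thirteen near-identical calls could also be driven from a small table. The sketch below only shows the shape of the generated `{{#switch:}}` body, one `|CODE=` branch per province. The `gen_switch` helper and the `pattern|code|title` input format are hypothetical, and the per-province reference list produced by the real script is replaced by a comment.

```shell
# gen_switch: read "pattern|code|title" lines and print the skeleton of
# a {{#switch:}} template with one branch per province code.
# Hypothetical helper; the real branches would also contain the
# reference list generated for each pattern.
gen_switch() {
  printf '%s\n' '{{#switch:{{{code}}}'
  while IFS='|' read -r pattern code title; do
    printf '%s\n' "|$code="
    # here the real script would list the references matching "$pattern"
  done
  printf '%s\n' '}}'
}

out=$(printf 'Alberta|AB|Alberta\nQuebec|QC|Québec\n' | gen_switch)
printf '%s\n' "$out"
```

Keeping the province table in one place would also let the page generation and the template generation share the same list.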

Generating the map

For the test page on this wiki

cat FixData/geoNamesRefProvW.list   \
 | SxmlSelect -p @2 -p @3           \
 | StrDictSelect -t $DILIB_ROOT/Data/Wicri/Canada/geoNamesProvinces2Wicri.dict -sr \
 | sort | SxmlCumul -wd             \
 | MediaWikiGeoStatMap -o w -wn w -k k -g $DILIB_ROOT/Data/Wicri/Canada/localisationProvincesCanada.dict \
            -m Canada -f "Canada location map.svg" -s 650  \
            -a GrippeCanadaV4 -t "Carte Provinces Canada"    -C "#%k"     -T "%a/Faits épidémiques/%k"               \
            -h https://lorexplor.istex.fr/Wicri/Sante/explor

Via a template

(
MediaWikiExportCommand -c fileBegin
MediaWikiExportCommand -c pageBegin -p "Modèle:GrippeCanadaV4/Faits épidémiques/carte"
cat FixData/geoNamesRefProvW.list   \
 | SxmlSelect -p @2 -p @3           \
 | StrDictSelect -t $DILIB_ROOT/Data/Wicri/Canada/geoNamesProvinces2Wicri.dict -sr \
 | sort | SxmlCumul -wd             \
 | MediaWikiGeoStatMap -o m -wn w -k k -g $DILIB_ROOT/Data/Wicri/Canada/localisationProvincesCanada.dict \
            -m Canada -f "Canada location map.svg" -s 650  \
            -a GrippeCanadaV4 -t "Carte Provinces Canada"    -C "%a/Faits épidémiques/%k"     -T "%a/Faits épidémiques/%k"  
MediaWikiExportCommand -c pageEnd 
MediaWikiExportCommand -c fileEnd 
)  >exportModeleCarteProvinces.xml

Epidemic facts in ISTEX

Simple filtering on ISTEX

HfdCat $EXPLOR_AREA/FixData/Istex/Corpus/repository.hfd               \
   | grep Vancouver | SxmlFindText -s Vancouver -a 30 -b 30 | grep body       \
   | StrDictSelect -t FixInput/HfdIstexRepo2HfdPubMed.dict -sr