Difference between revisions of "DC 2010 Artist paper"
Revision as of 08:50, 26 February 2010
DC 2010
This article is to be submitted to the DC 2010 Conference (Dublin Core Metadata Initiative), Pittsburgh, 20-22 October 2010.
- Title
- Metadata for semantic wikis networks
- Abstract
- bla bla bla (to be done at the end of the process)
- Authors
- Jacques Ducloy(i), Thierry Daunois(ii), Muriel Foulonneau(iii), Alice Hermann(iv), Jean-Charles Lamirel(v), Stéphane Sire(vi) and Christine Vanoirbeek(vi).
- (i) DRRT Lorraine, Metz
- (ii) Université de Lorraine (INPL), Nancy
- (iii) Henri Tudor Research Centre, Luxembourg
- (iv) Rennes
- (v) Loria
- (vi) EPFL, Lausanne (Switzerland)
Introduction
Since March 25, 1995, when Ward Cunningham launched WikiWikiWeb, a site devoted to software development, wikis have played an increasing role in the field of scientific and technical information. With Wikipedia, the size reaches millions of records and the need for metadata becomes ubiquitous. The first generation of metadata is based mainly on categories, which resembles traditional indexing practice. Semantic wikis introduce a new generation of metadata, allowing knowledge modelling in an RDF framework.
Most wikis are quite monolithic, and their metadata perform an internal function. What happens with an editorial collection of scientific information distributed over a network of semantic wikis? This article aims at identifying several metadata issues we faced when starting the Wicri network.
WICRI is an acronym for "WIkis for Communities in Research and Innovation". Right now, Wicri is a demonstrator that contains about sixty wikis, organized on a regional basis, with a small set of topics. But the knowledge architecture we must design is much the same as would be required for several thousand wikis. Thus metadata play a crucial role in this project.
In this paper, we first introduce the Wicri network; then we review several existing solutions. Directions to explore in the future are discussed from two points of view: that of a contributor who has to produce metadata, and that of the computer scientist developing new services.
- Note
- This article is being written collaboratively. It will be published in two versions: a traditional one on the web site of the conference, and a wicrified[1] one on the Artist wiki.
Introducing Wicri
Wicri, a network of wikis for research and innovation
The Wicri network has been created in the framework of Mission Ticri (Technologies dealing with Information and Communication for Communities involved in Research and Innovation). This initiative was launched by the Lorraine representative of the Ministry in charge of research. Ticri aims at disseminating the main results of research communities in order to promote partnerships between innovation actors, to encourage outreach, and to develop technology transfer in a multidisciplinary context.
Wikipedia has demonstrated the value of the wiki approach for building and disseminating common knowledge on a very large scale. Wikipedia thus brings us a first answer (and we do use this medium), but it is not a complete one. A main point is that Wikipedia's contributors must present information that is attested by references. Authors can be anonymous as long as their bibliographic references are significant and link to explicitly named people. In research fields, however, the academic communities are the ones producing the knowledge that Wikipedia could use. In many cases, knowledge is still in progress and many assertions are only hypotheses. For these reasons we think that authors must be clearly identified; thus anonymous contributions are forbidden.
As a result, such a wiki infrastructure must be driven by institutional entities in order to manage registration processes. The institutions must therefore find an advantage in investing in the wiki approach, and visibility becomes a strong factor. The network approach allows each partner to promote its own wiki site and its own visibility.
As a first step, we built a small demonstrator with several institutional wikis. Its limits appeared almost immediately: if several organizations are working on the same topic, this topic must be developed on a thematic wiki. Thus we quickly introduced several wikis with a thematic or regional design.
A small team, mainly 3 people in the same office, has operated the demonstrator. As soon as more than one person was involved, several coherency problems arose and effective metadata handling was introduced.
The current Wicri network
The wiki network accepts two main types of wikis.
- Institutional wikis: an institutional wiki is handled by an organization. In this paper, we will often use a naming scheme with two parts: region, then acronym. For instance, Lorraine/sge stands for the research cluster SGE (Science et Génie de l'Environnement) in the Lorraine area. For wikis related to scientific working groups, the first part is a code identifying the overall theme; for instance, Ist/Artist is the wiki of the Artist WG, dealing with Information Science and Technology.
- Common wikis: a common wiki is designed by the global Wicri community. Whether or not it is managed by an organization, it fully shares the common rules and is moderated by independent scientific committees. In this paper we use a naming scheme with Wicri as first part, as in Wicri/Lorraine or Wicri/Water.
In a multilingual approach, most wikis are really families of wikis (i.e. a set of wikis, one for each language, connected by interwiki links). In this paper, we use the notation Wicri/Water(fr) to denote the French component of the family, and Wicri/Water(en) for the English one.
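The naming scheme above is regular enough to be handled by a program. The following Python sketch is only an illustration, not part of the Wicri software: the names WikiName and parse_wiki_name are hypothetical, and the assumption that a language code is two or three lowercase letters is ours.

```python
import re
from typing import NamedTuple, Optional


class WikiName(NamedTuple):
    """One wiki (or wiki family) written in the notation used in this paper."""
    prefix: str               # region, thematic code, or "Wicri"
    name: str                 # acronym or topic, e.g. "Water"
    language: Optional[str]   # e.g. "fr"; None designates the whole family

    @property
    def is_common(self) -> bool:
        # Common wikis are the ones whose first part is "Wicri".
        return self.prefix == "Wicri"


# Assumption: a language code is 2 or 3 lowercase letters, e.g. "fr" or "en".
_NOTATION = re.compile(
    r"^(?P<prefix>[^/()]+)/(?P<name>[^/()]+?)(?:\((?P<lang>[a-z]{2,3})\))?$"
)


def parse_wiki_name(text: str) -> WikiName:
    """Split 'Prefix/Name' or 'Prefix/Name(lang)' into its components."""
    match = _NOTATION.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid wiki name: {text!r}")
    return WikiName(match["prefix"], match["name"], match["lang"])
```

For example, parse_wiki_name("Wicri/Water(fr)") yields the prefix "Wicri", the name "Water" and the language "fr", while parse_wiki_name("Lorraine/sge") yields an institutional wiki with no language component.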
A first set of common wikis is designed on a regional framework, such as Wicri/Lorraine or Wicri/Alsace. A main objective is to obtain a highly detailed and understandable CRIS (Current Research Information System). This approach resembles those of Jeffery [j2] and Erbach [e1], who would like to merge organization-related items (CRIS) with open archives in order to produce an e-Science infrastructure [j1]. Wicri adds a wiki, with its editorial facilities, to provide a readable summary. (Editorial note: I have added a new reference, [j1], whose title seems more explicit to me.)
Another set of common wikis is devoted to thematic fields. At this time, one of them, Wicri/Ticri, is related to Information Science & Technology (a DCMI portal is included). Another part deals with the environment and contains 4 wiki families: Wicri/Water, Wicri/Woods, Wicri/Biomass and Wicri/UrbanSoils. They are also organized with information system items (such as program committees) and editorial content (scientific articles, scientific surveys).
A few wikis have been designed to ensure the global coherency of the network. The most visible is Wicri/Wicri, which gives a global view of the network: all topics must appear there and link to more detailed pages or desks in other wikis.
Another, Wicri/Media, is an image repository (and plays the same role as Commons in the Wikipedia family). It can also host PDF documents, but we are looking for a better solution, using Fedora for instance.
Finally, for metadata handling, a wiki named Wicri/Base contains templates and semantic items that can be used in all other wikis.
Network coherency versus differentiated contents
Most information has to be developed several times, on different wikis. For instance, each research project with several partners must be cited and commented on in the regional wiki of each partner, as well as in all relevant thematic wikis.
Even in the initial phase of the Wicri project, we have encountered a significant number of such cases. Three strongly differentiated cases follow: a city description, a scientific paper, and an author page.
- The city of Pittsburgh, where the DC 2010 conference will be held, appears on at least 3 wikis. On Wicri/Ticri, Pittsburgh is directly connected to DC 2010, and the corresponding page describes the main activities related to information science in this geographic area[2]. On Wicri/Water, we describe the confluence of the Allegheny and Monongahela rivers, which forms the source of the Ohio[3]. On Wicri/Wicri, we give general facts about the city and introduce commented links to the other pages[4]. These 3 pages are related to the same topic, but display distinct contents.
- Carl Lagoze has written an article which is becoming very popular in the French-speaking area: Qu’est-ce qu’une bibliothèque numérique, au juste ? / What Is a Digital Library anymore, anyway? []. In IST/Artist, the paper is integrated in the portal of the Ametist journal, in which it was first translated[5]. A copy has been made in Wicri/Ticri, as it was considered a reference paper for a wiki dealing with digital libraries[6]. Anchors and links sometimes differ from those on IST/Artist. Finally, the first part (and only the first part) has been introduced on Wicri/Wicri[7]. New proposal: Finally, this paper contains an interesting introduction that could reach a very large audience. Thus this first part (and only this first part) has been introduced on Wicri/Wicri[8].
- Then the page on Jean-Claude Guédon in Artist / Ticri and Wicri and/or conferences
All these pages are mainly written by human contributors, not by computers. Computers can help in various ways, but in the end pages are made by contributors. In a repository-based network, using OAI-PMH for example, coherency is ensured by computer protocols that share controlled metadata. In a wiki network, a contributor can write on many wikis and interact with metadata. Thus metadata play a crucial role not only in programming activities but also in the authoring process.
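To make the contrast with repository-based networks concrete, here is a minimal Python sketch of how a harvester builds an OAI-PMH request. The verb ListRecords and the arguments metadataPrefix, set and from belong to the OAI-PMH protocol, and oai_dc is its unqualified Dublin Core format; the function name and the repository address used below are hypothetical, and nothing here is part of Wicri.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None,
                     from_date: Optional[str] = None) -> str:
    """Build the URL of an OAI-PMH ListRecords request (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec      # restrict harvesting to one set
    if from_date is not None:
        params["from"] = from_date    # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address, for illustration only.
url = list_records_url("http://repository.example.org/oai", from_date="2010-02-26")
```

In such a network the exchange is fully specified by the protocol and the metadata format, whereas on a wiki the same coherency depends on what each contributor writes.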
Issues about networks of semantic wikis
This section discusses the technical choices made in the initial design of Wicri. The Wicri project aims at setting up an operational set of services. At the present time, it is a demonstrator which is becoming a digital infrastructure. So, even though we are close to research projects, we have implemented some pragmatic solutions. What I meant was this: we are setting up a service of an operational nature, and we must find pragmatic solutions that are not necessarily the best from a research point of view. Jacques Ducloy 12:26, 22 February 2010 (UTC)
Don't apologize. I suggest building on the following assertion, "Researchers need to stop thinking of themselves as researchers and start thinking of themselves as implementors. ", from a post by Zack Rosen, "RDF Semantic web research isn't working". He points precisely to the fact that Semantic Web researchers are not pragmatic and do not integrate with existing environments. I can write this up later [1] Muriel Foulonneau 16:38, 24 February 2010 (UTC)
Wikis for scientists who approach the world
The first choice we faced when starting the Wicri project was the wiki engine. A priority for this project is to allow as many researchers as possible to disseminate their results to as many potentially interested actors as possible. Thus we have chosen to be fully compatible with Wikipedia, and to use MediaWiki[9] as the wiki engine of the Wicri network. This CMS[10] is used by Wikipedia and is becoming very popular.
This option has several consequences. The first is the need to supplement the functionality of MediaWiki with the PHP extensions and templates that are commonly used in Wikipedia, so that an occasional contributor is not disoriented when moving from Wikipedia to the Wicri network. The function of Wicri/base is to manage the collection of templates (and also semantic items) needed throughout the network.
The second consequence is the use of the local language (French, for example) to express and manipulate metadata in a given wiki.
Semantic wikis for scientific objects
Scientists and engineers work with many technical objects, such as formulas, drawings, 3D images and knowledge items, and not only texts. This section raises some issues about the way in which a wiki can address these requirements.
In other words, Wicri's pages must carry many texts that contain scientific results described with scientific objects. Using MediaWiki opens a first set of features dealing with formulas or drawings. Some of them are very easy to install, for instance "imagemap"[11]. However, our experience already shows some difficulties, stressing the need for technical support. For instance, LaTeX extensions require LaTeX to be installed on the operating system that hosts the wiki.
MediaWiki supports SVG (Scalable Vector Graphics) rather poorly. A contributor can upload an SVG image, but this image is then converted into PNG format. So, right now, it appears difficult to manage interactions between text and images with the basic SVG facility. Life sciences give a good example of the interactions that could be needed in a scientific area. The Proteopedia project [h1] carries 3D images of molecular items such as proteins, RNA, DNA and other macromolecules[12]. The contributor can set up several kinds of interaction by using green links in the wiki text. These links interact with a Java applet (Jmol).
Obviously, as the mission of science is to build knowledge, semantic tools are ubiquitous. At a first level, Wikipedia has implemented several sets of taxonomies. The taxonomy of living species is a fairly complete example. It is implemented with a tree of categories and related templates (taxobox). This taxonomy is distributed over each language version, Wikimedia Commons (images), and Wikispecies. A comparison between these different wikis shows a multipurpose use of 3 classification schemes[13].
Finally, Semantic MediaWiki allows contributors to enter the Semantic Web with an RDF approach. (Linking sentence to be revised. Jacques Ducloy 06:46, 26 February 2010 (UTC)) Finally, Wicri is in the process of adopting Semantic MediaWiki, an extension that enables wiki users to semantically annotate wiki pages; based on these annotations, the wiki contents can be browsed, searched, and reused in novel ways[k2].
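As an illustration of what such annotations look like, Semantic MediaWiki lets a contributor type a link with a property name, in the form [[property::value]] or [[property::value|displayed text]]. The Python sketch below collects these annotations from a piece of wikitext. It is a simplified assumption of ours, not the parser used by Semantic MediaWiki: it ignores templates, nested links and multi-valued syntax, and the property names in the sample are invented.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches [[property::value]] and [[property::value|displayed text]].
_ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_annotations(wikitext: str) -> Dict[str, List[str]]:
    """Return the semantic annotations of a page as {property: [values]}."""
    found = defaultdict(list)
    for prop, value in _ANNOTATION.findall(wikitext):
        found[prop.strip()].append(value.strip())
    return dict(found)


# Hypothetical page fragment; "located in" and "organized by" are invented properties.
sample = (
    "DC 2010 takes place in [[located in::Pittsburgh]] and is organized by the "
    "[[organized by::Dublin Core Metadata Initiative|DCMI]]. "
    "See also [[Wicri/Ticri]]."
)
annotations = extract_annotations(sample)
```

Ordinary links such as [[Wicri/Ticri]] carry no property and are left out, which is the difference between a first-generation wiki link and a semantic annotation.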
Semantic MediaWiki for research
- description of organizations
- description of objects
- references on semantics using MediaWiki
- BOwiki: Robert Hoehndorf, Joshua Bacher, Michael Backhaus, Sergio E Gregorio, Frank Loebe, Kay Prüfer, Alexandr Uciteli, Johann Visagie, Heinrich Herre, and Janet Kelso. BMC Bioinformatics 2009, 10(Suppl 5):S5. doi:10.1186/1471-2105-10-S5-S5
- http://bowiki.net/wiki/index.php/Main_page
- Problem: it has not been updated since 2008
- references on scientific objects running on other wiki engines
- Christoph Lange. SWiM – a semantic wiki for mathematical knowledge management. In Sean Bechhofer, Manfred Hauswirth, Jörg Hoffmann, and Manolis Koubarakis, editors, ESWC, volume 5021 of Lecture Notes in Computer Science, pages 832–837. Springer, 2008.
Networks and Distributed Wiki Applications
This section should identify 5 modes of replication:
- wiki level: wikis that can be distributed peer-to-peer (for example the pool, for security reasons)
- page level: pages that are duplicated across the network
- paragraph level: paragraphs that are duplicated verbatim
- paragraph level: paragraphs that are duplicated with transformation
- page-graph level: graphs of pages that are duplicated conditionally
Examples:
- peer-to-peer networks
- Charbel Rahhal, Hala Skaf-Molli, Pascal Molli, and Stéphane Weiss: Multi-synchronous Collaborative Semantic Wikis. In Wise'09: International Conference on Web Information Systems , 2009.
< http://www.loria.fr/~molli/pmwiki/uploads/Main/Skaf09wise.pdf >
See also: several interesting papers about SemWiki 2009 on Ticri (en)
XML handling
- some references on the subject, focusing in particular on replication control, the coexistence of structured and unstructured information, and complex scientific objects
- we come back in particular to the 3D graphics examples
- we discuss the modelling of the network in Wicri/base
Metadata for contributors
In most content management systems designed "before blogs and wikis", a clear barrier exists between editing content, programming, and managing metadata. On a wiki, all these activities can be handled by any actor, on any page, at any time. Almost any contributor may have to create new metadata. We have to give contributors a robust environment in which to explore, define and comment on such an activity.
This section introduces the need for a new wiki for designing metadata items, its content and its organization.
Introducing Wicri/metadata
- We specify the need: metadata of generic scope (at least initially), to allow a contributor to design a new metadata element
- references about wikis dealing with metadata
- DCMI set of wikis
- Fredrik Enoksson: A MoinMoin Wiki Syntax for Description Set Profiles, DCMI Working Draft (2008)
< http://dublincore.org/documents/2008/10/06/dsp-wiki-syntax/ >
Metadata sources
- We give the main sources to rely on
Main sources of metadata:
- DCMI
- CRIS: The skeleton of Wicri is a Current Research Information System
- generic semantic systems dealing with research : for example Openresearch.org
- bibliographic formats, mainly those that operate in a multilingual context (for instance Unimarc)
- Text formatting????: TEI...
- LOM...
- note: multilingualism must be introduced earlier in the article
- warning: copyright on metadata systems ()
Metadata for Wicri/metadata
- We give initial information on the organization of data on the wiki, still from the contributor's point of view
- we make a link with the next section on computing aspects
Metadata for computers
We want to use metadata to open up new facilities. This section is highly prospective and aims to identify lines of investigation for improving a process that is beginning to work in practice.
Handling network coherency
2 points:
- network coherency
- extending the functionality available at the wiki level to the network, but I am not sure there is much metadata involved on this point. Jacques Ducloy 08:55, 19 February 2010 (UTC)
External interoperability
- exporting / importing an ontology with systems that are not Wicri
Human machine interface for contributors
We must make the task easier for contributors, especially in the learning phase, since they have to work in a world where structured and unstructured information coexist. The literature offers partial answers: RDF-based ones for data-centred approaches (in the DBMS sense), and XML-based ones for building a human-readable document. Is it possible to reconcile the 2 approaches? Are there things to do in the short term?
One paragraph -> EPFL
External web mining
One paragraph -> Loria:
- how can the information formalized in the network be used for monitoring purposes?
- How can metadata play a role?
Discussion & conclusion
:-) metadata is important!!!
More seriously, we can develop something around "the earlier you start, the better"... (whereas the wiki makes it possible to defer...)
The possibilities of taking advantage of the network of wikis and of semantic reasoning
References
- [e1] Gregor Erbach - Data-centric view in e-Science information systems. Data Science Journal Vol. 5 (2006) pp.219-222
< http://www.jstage.jst.go.jp/article/dsj/5/0/219/_pdf >
- [h1] Eran Hodis, Jaime Prilusky, Eric Martz, Israel Silman, John Moult and Joel L. Sussman: Proteopedia - a scientific 'wiki' bridging the rift between 3D structure and function of biomacromolecules, Genome Biology 2008, 9:R121 doi:10.1186/gb-2008-9-8-r121
< http://genomebiology.com/2008/9/8/R121 >
- [j1] Keith G. Jeffery. CRIS + open access = the route to research knowledge on the GRID. In 71st IFLA General Conference and Council proceedings, Oslo, Norway, 2005
< http://www.ifla.org/IV/ifla71/papers/007e-Jeffery.pdf >
- [j2] Keith G. Jeffery - Technical Infrastructure and Policy Framework for Maximising the Benefits from Research Output in: ELPUB2007. Openness in Digital Publishing: Awareness, Discovery and Access - Proceedings of the 11th International Conference on Electronic Publishing held in Vienna, Austria 13-15 June 2007 / Edited by: Leslie Chan and Bob Martens. ISBN 978-3-85437-292-9, 2007, pp. 1-12
< http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.102.5044&rep=rep1&type=pdf >
- [k1] Markus Krötzsch, Denny Vrandecic, Max Völkel, Heiko Haller, Rudi Studer. Semantic Wikipedia. In Journal of Web Semantics 5/2007, pp. 251–261. Elsevier 2007.
Notes
- ↑ Wicrified is a neologism derived from the term “wikified” in Wikipedia jargon. This task consists in using wiki markup in order to adapt a document to the Wicri network, i.e. setting wiki links, categories or semantic annotations.
- ↑ http://maquettewicri.loria.fr/fr.ticri/index.php5?title=Pittsburgh
- ↑ http://maquettewicri.loria.fr/fr.wicri-t-eau/index.php5?title=Pittsburgh
- ↑ http://maquettewicri.loria.fr/fr.wicri/index.php5?title=Pittsburgh
- ↑ http://maquettewicri.loria.fr/fr.artist/index.php5?title=Qu%E2%80%99est-ce_qu%E2%80%99une_biblioth%C3%A8que_num%C3%A9rique%2C_au_juste_%3F
- ↑ http://maquettewicri.loria.fr/fr.ticri/index.php5?title=Qu%E2%80%99est-ce_qu%E2%80%99une_biblioth%C3%A8que_num%C3%A9rique%2C_au_juste_%3F
- ↑ http://maquettewicri.loria.fr/fr.wicri/index.php5?title=Qu%27est-ce_qu%27une_biblioth%C3%A8que_num%C3%A9rique%2C_au_juste_%3F
- ↑ http://maquettewicri.loria.fr/fr.wicri/index.php5?title=Qu%27est-ce_qu%27une_biblioth%C3%A8que_num%C3%A9rique%2C_au_juste_%3F
- ↑ < http://www.mediawiki.org/wiki/MediaWiki >
- ↑ Content Management System
- ↑ An image map is a list of coordinates relating to a specific image, created in order to hyperlink areas of this image to various destinations.
- ↑ < http://proteopedia.org/wiki/index.php >
- ↑ For instance, Acer on:
- Wikispecies: http://species.wikimedia.org/wiki/Acer
- Wikipedia (en): http://en.wikipedia.org/w/index.php?title=Maple&oldid=345810808
- Wikipédia (fr): http://fr.wikipedia.org/wiki/%C3%89rable
- Wikimedia Commons: http://commons.wikimedia.org/wiki/Category:Acer
More biblio
- M Krötzsch, S Schaffert, D Vrandecic. Reasoning in semantic wikis - Reasoning Web, 2007 - Springer