Open data in Luxembourg, strategy and best practices (2012) chapter 1

Terminological distinctions

The stakeholders involved in producing and reusing public data describe these data with a variety of terms, often with meanings that only partly overlap or that even contradict one another. The aim of this document is not to reconcile these usages, but to indicate the most widely accepted meaning of each term. This matters because some of these terms predate open data and raise very similar issues. Public sector information is one example.

Fioretti [1] provides a basic definition of public data: "data that is of public interest, that belongs to the whole community, data that every citizen is surely entitled to know and use". This covers not only data produced by government and administrations but also a "[…] much bigger amount of data describing and measuring all the activities of private companies, from bus timetables to packaged food ingredients, aqueducts performances and composition of fumes released in the atmosphere, that have a direct impact on the health […]".

The OECD [2] gives the most precise definition of "public sector information" (PSI), characterizing it as "dynamic and continually generated, directly generated by the public sector, associated with the functioning of the public sector (for example, meteorological data, business statistics), and readily useable in commercial application…". Uhlir [3] notes that the OECD distinguishes PSI from "public content", which is "static (i.e., it is an established record), held by the public sector rather than being generated by it (cultural archives, artistic works where third-party rights may be important), not directly associated with the functioning of government, and not necessarily associated with commercial uses but having other public good purposes (culture, education)". According to Fioretti, the main problem is that public sector information and public data are often used interchangeably, although they differ in content.

These concepts can be considered sources of inspiration for open data, or at least points of comparison.

Different approaches to open data have been proposed. The aim here is to combine them in order to arrive at a definition with the broadest possible scope. The Open Data Handbook [4] defines open data as follows: "Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike." The objective is to establish a clear definition of openness that allows interoperability between datasets and avoids a "Tower of Babel". The handbook clearly positions open government data as the heir of public sector information, and even as a part of it: "Open government data : Open data produced by the government. This is generally accepted to be data gathered during the course of business as usual activities which do not identify individuals or breach commercial sensitivity. Open government data is a subset of Public Sector Information, which is broader in scope." The only data excluded a priori are sensitive data (e.g. military data) and all personal data.

However, because open data is still recent, and even more because public data are so varied, a traditional definition is less useful than a network of conditions, which allows for nuances. In 2007, in Sebastopol, California, a working group [5] that included Lawrence Lessig and Tim O'Reilly proposed a definition of open public data based on eight principles. By combining different approaches, the Open Knowledge Foundation arrives at a very broad definition of openness based on eleven criteria [6]. These criteria cover the data, their metadata, and the conditions of access and reuse. This broad definition, however, disqualifies many of the datasets found on open data platforms.
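The idea of a network of conditions, as opposed to a single yes/no definition, can be illustrated with a minimal Python sketch. The principle names are those listed by the Sebastopol working group in reference [5]; the `Dataset` class, the function names and the example dataset are hypothetical and serve only as an illustration, not as an official assessment tool.

```python
from dataclasses import dataclass, field

# Names of the eight principles published by the 2007 Sebastopol working group [5].
PRINCIPLES = (
    "complete",
    "primary",
    "timely",
    "accessible",
    "machine_processable",
    "non_discriminatory",
    "non_proprietary",
    "license_free",
)


@dataclass
class Dataset:
    """Hypothetical dataset record; `satisfied` holds the principles it meets."""
    name: str
    satisfied: set = field(default_factory=set)


def openness_report(dataset):
    """Return (met, missing) lists rather than a single yes/no verdict."""
    met = [p for p in PRINCIPLES if p in dataset.satisfied]
    missing = [p for p in PRINCIPLES if p not in dataset.satisfied]
    return met, missing


def is_fully_open(dataset):
    """Strict reading: every condition must hold."""
    return not openness_report(dataset)[1]


# Hypothetical example: a timetable published as a PDF under a restrictive licence.
timetable = Dataset("bus_timetable", {"complete", "primary", "timely", "accessible"})
met, missing = openness_report(timetable)
print(f"{timetable.name}: {len(met)}/{len(PRINCIPLES)} principles met")
print("missing:", ", ".join(missing))
```

Reporting the unmet conditions, instead of a bare pass or fail, reflects the nuance mentioned above: a dataset can fail the strict definition while still satisfying several of its conditions.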

The intellectual origins of open data

Openness

Far from being an ephemeral concept, open data is rooted in key scientific and intellectual movements of the past 60 years. The idea of openness is the common thread linking four movements: open science, open source, open government and open data. The founding principle comes from scientific research, notably the norm of communalism stated by Merton. It is therefore consistent that the first mention of open data, in the 1990s, came from the scientific community. This movement continues to be enriched, for example by the Panton Principles. Open source is another influence on open data. According to Tim O'Reilly, open data amounts to applying the principles and working methods of open source to public affairs. Linked Open Data comprises all stored data connected through the World Wide Web that could be made accessible in the public interest without any restrictions on use and distribution. The last element is open government, of which open data might be considered one component. Like all successful ideas, including open data itself, open government has a scope that is blurred and shifting. Sandoval-Almazán [7] distinguishes three essential elements: freedom of information, which can be traced back at least to the 1950s in the United States; the release of information to improve the decision-making process; and open data.

Relationships with linked data

Berners-Lee [8] is interested in open data as an extension of his thinking on the semantic web. In his view, open data is the way to provide fuel for the linked data approach. Rather than describing linked data as a source of open data, it seems more accurate to speak of hybridization between the two. Open data and linked open data can still be considered two distinct branches: to meet the definitions of openness, datasets do not need to be linked, and everything depends on the objectives of the data owners. These distinctions have important consequences for opening initiatives: presenting linked data requires, in many cases, additional work, namely an adjustment of the datasets and of the procedures.

The reflection of a political culture

Finally, open data is the product of a particular political culture. A political culture, provided it is democratic, does not inevitably call the existence of open data into question, but it strongly contributes to defining the shape that open data takes. The influence of American political culture is undeniable. The population demands to know how taxes are allocated and, even more importantly, demands accountability from its leaders. The steps toward openness can also be described as the result of a tension between two conflicting influences. The liberal idea is associated with a distrust of public institutions, so open data is seen as an instrument for controlling the government. It questions the effectiveness of public services and asks whether it is legitimate for the state to provide them. In contrast, an interventionist perspective considers open data a means to increase and optimize the effectiveness of the state, whose intervention is seen as positive. This is of course a simplified view, since reality combines these trends to varying degrees.


Notes:

  1. Fioretti M. "Open Data: Emerging trends, issues and best practices administration". Work [Online]. 2011. p. 1–34. Retrieved from: <http://www.lem.sssup.it/WPLem/odos/odos_2.html/>
  2. OECD. Digital Broadband Content: Public Sector Information and Content. Paris: OECD. 2006. p. 8.
  3. Uhlir P. F. The Socioeconomic Effects of Public Sector Information on Digital Networks. [Online]. Networks. 2009. Retrieved from: <http://www.nap.edu/catalog.php?record_id=12687>
  4. Open Knowledge Foundation. Open Data Handbook. [Online]. Open Knowledge Foundation. 2012. Retrieved from: <http://opendatahandbook.org/>
  5. 8 Principles of Open Government Data. Retrieved from: <http://www.opengovdata.org/home/8principles>
  6. Open Definition. Retrieved from: <http://opendefinition.org/okd/>
  7. Sandoval-Almazán R. "The Two Door Perspective: An Assessment Framework for Open Government". Journal of E Democracy and Open Government [Online]. 2011. Vol. 3, n°2, p. 166–181. Retrieved from: <http://www.jedem.org/article/view/67>
  8. Linked data. Retrieved from: <http://www.w3.org/DesignIssues/LinkedData.html>