Methodology for publishing datasets as Open Data
The COMSODE Open Data methodology is intended for data owners and publishers (mainly public bodies). This generic methodology for publishing open data provides answers to questions such as:
- How to identify unique resources in the datasets? How to reuse well-known codebooks, vocabularies and ontologies (currencies, NUTS codes, etc.)?
- In which formats should the data be published so that they are machine-readable?
- How should the data be transformed (e.g. anonymized) before being published?
- Which descriptive and provenance metadata should be published together with the dataset (such as name, format, location/URL, source, responsible person)?
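As an illustration of the last question, the metadata items listed above could be captured in a small machine-readable record. The following is only a minimal sketch: the class name `DatasetMetadata`, the field names and all example values are assumptions made for illustration, not a schema defined by the methodology.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical metadata record; fields mirror the examples in the text."""
    name: str
    format: str
    location: str            # URL where the dataset can be downloaded
    source: str              # organisation or system the data comes from
    responsible_person: str


def to_json(meta: DatasetMetadata) -> str:
    """Serialize the record so it can be published next to the dataset."""
    return json.dumps(asdict(meta), indent=2, sort_keys=True)


# All values below are invented placeholders.
example = DatasetMetadata(
    name="Example register of contracts",
    format="CSV",
    location="https://data.example.org/contracts.csv",
    source="Example public body",
    responsible_person="Jane Doe",
)

if __name__ == "__main__":
    print(to_json(example))
```

Publishing such a record alongside the data file lets a catalogue or a reuser process the description automatically instead of reading it from a web page.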
The methodology is made up of five main building blocks: phases, cross-cutting activities, artefacts, roles and practices.
Phases represent particular stages of the open data publication process and they reflect the lifecycle of an open dataset. The following phases of the open data publication process were defined:
- (P01) Development of open data publication plan,
- (P02) Preparation of publication,
- (P03) Realization of publication,
- (P04) Archiving.
There are also some activities that should be performed in every phase of the open data publication process – cross-cutting activities. There are four cross-cutting activities in the methodology:
- (CA01) Data quality management;
- (CA02) Communication management;
- (CA03) Risk management;
- (CA04) Benefits management.
Both the phases and the cross-cutting activities are further divided into a set of tasks. Tasks represent steps in the publication process. For each task, practices are described that provide more detailed guidelines on how the task should be performed. Responsibilities are also set for each task: there are 10 different roles, each of which may be responsible for, accountable for, consulted on or informed about the result of a task.
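The assignment of roles to tasks can be pictured as a responsibility matrix. The sketch below is illustrative only: the task name and the role names are hypothetical (the actual tasks and the 10 roles are defined in the master spreadsheet), and the rule that a task has exactly one accountable role is a common convention for such matrices, assumed here rather than taken from the methodology.

```python
from enum import Enum
from typing import Dict, List


class Involvement(Enum):
    RESPONSIBLE = "R"
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


# Hypothetical task and role names, used only for illustration.
ASSIGNMENTS: Dict[str, Dict[str, Involvement]] = {
    "Select datasets for publication": {
        "Open data coordinator": Involvement.RESPONSIBLE,
        "Data owner": Involvement.ACCOUNTABLE,
        "Legal expert": Involvement.CONSULTED,
        "Publisher": Involvement.INFORMED,
    },
}


def roles_with(task: str, involvement: Involvement) -> List[str]:
    """Return the roles that have the given involvement in a task."""
    return sorted(
        role for role, kind in ASSIGNMENTS[task].items() if kind is involvement
    )


def check_task(task: str) -> List[str]:
    """Report problems with a task's assignments (assumed convention)."""
    problems = []
    if not roles_with(task, Involvement.RESPONSIBLE):
        problems.append("no responsible role")
    if len(roles_with(task, Involvement.ACCOUNTABLE)) != 1:
        problems.append("expected exactly one accountable role")
    return problems


if __name__ == "__main__":
    for task_name in ASSIGNMENTS:
        print(task_name, check_task(task_name) or "ok")
```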
Artefacts are the last building block of the methodology. An artefact might be both an input and an output at the same time. Usually the output of one task becomes an input to the subsequent task. However, in certain situations an artefact might be both an input and an output of a single task, i.e. when the artefact is updated by that task.
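The relation between tasks and artefacts can be sketched as follows. This is a simplified model under stated assumptions: the task and artefact names are invented, and the helper `missing_inputs` is a hypothetical check that every input has been produced by an earlier task. An artefact that appears in both the inputs and the outputs of one task is treated as updated by that task.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Task:
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]

    def updated_artefacts(self) -> FrozenSet[str]:
        """Artefacts that are both input and output, i.e. updated in place."""
        return self.inputs & self.outputs


# Hypothetical tasks and artefacts, listed in the order they are performed.
TASKS: List[Task] = [
    Task("Draft publication plan", frozenset(), frozenset({"publication plan"})),
    Task(
        "Review publication plan",
        frozenset({"publication plan"}),
        frozenset({"publication plan"}),
    ),
    Task(
        "Prepare dataset",
        frozenset({"publication plan"}),
        frozenset({"prepared dataset"}),
    ),
]


def missing_inputs(tasks: List[Task]) -> Dict[str, Set[str]]:
    """Map each task to the inputs that no earlier task has produced."""
    available: Set[str] = set()
    missing: Dict[str, Set[str]] = {}
    for task in tasks:
        absent = set(task.inputs) - available
        if absent:
            missing[task.name] = absent
        available |= task.outputs
    return missing


if __name__ == "__main__":
    for task in TASKS:
        print(task.name, "updates", sorted(task.updated_artefacts()))
    print("missing inputs:", missing_inputs(TASKS))
```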
The methodology for publishing datasets as open data is made up of the following parts:
- Methodology overview – describes the overall concept of the methodology;
- Master spreadsheet – provides the definition of phases, cross-cutting activities, tasks, roles, and the set of responsibilities and artefacts (inputs/outputs);
- Documentation of practices – describes practices for each of the tasks.
Methodology for publishing Open Data:
- D5.1 – Methodology for publishing datasets as open data – DOWNLOAD
  - ANNEX 1 (Documentation of practices)
  - ANNEX 2 (Methodology Master Spreadsheet)
- D5.2 – Methodologies for deployment and usage of COMSODE publication platform (ODN), tools and data – DOWNLOAD
- D5.3 – Contribution to international standards and best practices – DOWNLOAD.
If you have any questions or comments, please contact us via our webpage.