National Open Data Catalog of the Czech Republic (NODC)
The NODC is a data catalog fully compliant with DCAT-AP, the European standard for dataset metadata. It is open source, developed on GitHub, and built from other open source projects. It can be reused at various levels of government and addresses an important issue: currently available data catalog implementations do not comply with today’s metadata standards.
In 2015, when the Czech Republic decided to establish a national open data catalog, only two open source implementations were available: CKAN and DKAN. Both had a hardwired metadata data model that was not sufficiently compatible with DCAT (a W3C Recommendation) and DCAT-AP (a recommendation by the European Commission). Therefore, it was decided to develop a proprietary solution using the existing IT infrastructure of the Ministry of the Interior. However, that solution was not user friendly and was considered temporary. A new, open source, standards-compliant solution had to be developed.
The current NODC is a national open data catalog implementation focused on data standards. It is open source software developed on GitHub, consisting of multiple open source projects and their configurations. Its primary function is to harvest open data catalogs at lower levels of government, creating an open central repository of metadata about open data published in the Czech Republic. The data is fully compliant with DCAT-AP, the European Commission recommendation for data catalogs. The data is decoupled from the rest of the catalog, which comprises a viewer of the harvested data intended for human users, and input forms intended for public administrations that do not want to run their own local catalog instance. Because of this decoupling, everyone is welcome to reuse the data as they wish.
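The harvesting step described above can be illustrated with a minimal Python sketch. It is not the NODC implementation: the catalog and dataset IRIs, the `LOCAL_CATALOGS` structure, and the `harvest` function are all hypothetical, and a real harvester would fetch DCAT-AP metadata from each local catalog over the network rather than read it from memory.

```python
from typing import Dict, List

# Hypothetical in-memory stand-ins for local catalogs (assumption):
# each catalog IRI maps to the dataset records it publishes.
LOCAL_CATALOGS: Dict[str, List[dict]] = {
    "https://example.org/catalog/city-a": [
        {"iri": "https://example.org/dataset/budget", "title": "City budget"},
        {"iri": "https://example.org/dataset/parking", "title": "Parking zones"},
    ],
    "https://example.org/catalog/ministry-b": [
        {"iri": "https://example.org/dataset/grants", "title": "Grants"},
        # The same dataset may be listed by more than one catalog.
        {"iri": "https://example.org/dataset/budget", "title": "City budget"},
    ],
}


def harvest(catalogs: Dict[str, List[dict]]) -> Dict[str, dict]:
    """Merge dataset records from all catalogs into one central index.

    Records are keyed by dataset IRI, so a dataset listed in several
    catalogs appears once, with every source catalog recorded.
    """
    central: Dict[str, dict] = {}
    for catalog_iri, datasets in catalogs.items():
        for dataset in datasets:
            record = central.setdefault(
                dataset["iri"], {**dataset, "sources": []}
            )
            record["sources"].append(catalog_iri)
    return central


central = harvest(LOCAL_CATALOGS)
for iri, record in central.items():
    print(iri, "<-", ", ".join(record["sources"]))
```

Keying the central index by dataset IRI is one simple way to keep the central repository free of duplicates while preserving where each record came from.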
The viewer and the input forms can be directly reused by anyone wishing to view DCAT-AP compatible data or create machine-readable, standardized metadata for their datasets.
In the future, we aim to create a standards-driven, ready-to-deploy alternative to well-established but non-standard and rigid catalog implementations such as CKAN or DKAN.
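To give a sense of what machine-readable, standardized metadata can look like, here is a hedged Python sketch that builds a small dataset record as JSON-LD using DCAT and Dublin Core terms. The function name `make_dataset_record`, the example IRIs, and the chosen subset of properties are assumptions for illustration; the actual output of the NODC input forms is not described in this text and may differ.

```python
import json

# Namespace prefixes for the DCAT and Dublin Core Terms vocabularies.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def make_dataset_record(iri: str, title: str, description: str,
                        access_url: str, lang: str = "en") -> dict:
    """Build a minimal dataset description with one distribution.

    Hypothetical helper: it covers only a title, a description, and an
    access URL, not the full set of DCAT-AP properties.
    """
    return {
        "@context": CONTEXT,
        "@id": iri,
        "@type": "dcat:Dataset",
        "dct:title": {"@value": title, "@language": lang},
        "dct:description": {"@value": description, "@language": lang},
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": access_url},
            }
        ],
    }


record = make_dataset_record(
    iri="https://example.org/dataset/budget",          # hypothetical IRI
    title="City budget",
    description="Annual budget of an example city.",
    access_url="https://example.org/files/budget.csv",  # hypothetical URL
)
print(json.dumps(record, indent=2, ensure_ascii=False))
```

Because the record is plain data, any application can consume it without depending on the catalog software that produced it.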
What Makes Your Project Innovative?
It is open source, developed on GitHub, and built from other open source projects. The NODC is a data catalog fully compliant with DCAT-AP, the European standard for dataset metadata. It can be reused at various levels of government and addresses an important issue: currently available data catalog implementations do not comply with today’s metadata standards. It divides data and applications into several categories.
Collaborations & Partnerships
Data publishers - they contributed requirements on the harvesting part of the solution (Ministry of Finance, Ministry of the Interior, Ministry of Regional Development, Czech Statistical Office, State Administration of Land Surveying and Cadastre)
Data consumers - they contributed requirements on the frontend of the solution
Academic researchers - they contributed the technology and know-how behind the data model and API (Charles University of Prague, University of Economics Prague)
Users, Stakeholders & Beneficiaries
Citizens - City of Prague, City of Brno, City of Pilsen, City of Ostrava, City of Bohumin, Hlidac Smlub
Government officials - Czech Police, Ministry of Finance, Ministry of Interior, Ministry of the Environment, Czech Telecommunication Office
Civil Society - Open Society Fund Prague
Companies - DHL Company, Financial Portals - Finance.cz
Results, Outcomes & Impacts
Three main results were observed:
- Potential users of open data find the data viewer UI friendlier than the original open data catalog. This was measured using the System Usability Scale (https://www.usability.gov/how-to-and-tools/resources/templates/system-usability-scale-sus.html).
- Users appreciate the availability of the metadata from the catalog as open data.
- The software is open source, and it is one of the first open source systems developed for the public administration. This sets a good practice for other public administration software, which should also be made open source, e.g. to avoid vendor lock-in.
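For readers unfamiliar with the System Usability Scale mentioned in the first result, the following Python sketch shows the standard SUS scoring rule. The text does not report the individual responses or the resulting scores, so the example inputs are invented for illustration, and the function name `sus_score` is an assumption. In standard SUS scoring, each of the ten items is answered on a 1 to 5 scale; odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Compute a System Usability Scale score from ten 1-5 responses."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower is better.
            total += 5 - response
    return total * 2.5


# Invented example responses (not data from the NODC evaluation).
neutral = sus_score([3] * 10)
best = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
print(neutral, best)
```

Comparing the mean score of two interfaces across the same group of respondents is one common way to support a claim that one UI is friendlier than another.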
Challenges and Failures
The main challenge we face is deploying the solution within the environment of the Ministry of the Interior. Even though the hardware requirements are not high, the process of acquiring the hardware takes too long.
Another challenge is the lack of understanding of the need to primarily focus on having clean, standard data before dealing with how to show the data to people. The solution to this challenge is patient, thorough education of civil servants involved in IT decisions.
Conditions for Success
Success depends on diminishing the fear of open source software and on explaining, through education of civil servants, the need to separate well-documented data from functionality. These concepts go against the interests of software suppliers who want to create vendor lock-in, which is a very favourable position for them. They purposefully misinform civil servants about these concepts to maintain their advantage.
Replication
The solution has not been replicated yet, but replication is currently in progress. In the near future, it will probably be used by one of Prague’s universities to establish its local open data catalog.
Lessons Learned
- It is important to focus first on having standardized data, which can be reused by various applications. Only then should the focus shift to applications working with the data in an interoperable manner.
- Open source and software reuse are key to lowering the high costs of public administration IT systems.