
CKAN 3.0 Product Strategy Research (part 2)

As promised, here are five more of the 37 interviews Alexander Gostev has conducted with various stakeholders during the engagement process. These insights will help make CKAN 3.0 even better than before. More updates will be coming soon!

Taleatha Franks
ckan, ckan 3.0, ckan roadmap, ckan product strategy, CKAN
20 Oct 2022


STAKEHOLDER ENGAGEMENT RESULTS 6-10 of 37

Respondent 6: Manager

Interview date: 16 June 2022

OVERVIEW

The organization operates in the Pacific Islands region and publishes, manages and supplies data to decision-makers in local governments. The interviewee works on data procurement and data stewardship. A major UI update is planned for the near future.

The respondent looked into Socrata and used DKAN for a proof of concept (PoC) but switched to CKAN because of its larger, more active community and the greater number of extensions available.

Plugins used

Data integration extensions
Data harvesters
Metadata extensions

CKAN issues

Generally, CKAN is good 🟢
Customization can be frustrating 🔴
He needs a UI form for adding licenses and managing legal documents 💡
Program managers want more visibility from their tech partners to catch malfunctions faster, for example when a harvester stops working.
He needs a workflow for restricted datasets, where the metadata is available but the dataset itself is restricted 💡

Jobs customers do:

In general, it’s a centralized catalogue that provides:

Preview of datasets
Grouping of datasets

National governments

Easy access to decision-ready data products
Links to other data sources

Large research institutions

To know what data exists in a particular area

Other online services used for working with data

.Stat (dotstat) platform - statistical platform
- To publish indicators for UN SDGs and other international development frameworks
Geospatial databases that provide OGC endpoints, e.g. WMS, WFS, CSW
- GeoServer
Spatial data explorer
- Open-source tool - TerriaJS
- Integrated via iframe

Respondent 7: Manager

Interview date: 16 June 2022

OVERVIEW

The interviewee installs, customises and maintains CKAN, and provides governance and training to departmental staff. He built an automated, data-driven, collaboratively edited, web-based reporting system.

He’s a CKAN maintainer and manages an internal-facing CKAN instance, with a focus on workflow automation.

The interviewee says that CKAN is easy to install 🟢

CKAN integrations

OpenRefine (formerly Google Refine) https://openrefine.org/ 💡
- Spreadsheet on steroids
- Very easy to clean the data
  - Duplicates
https://nationalmap.gov.au/ 💡
- If the dataset has csv-geo (❓) you can display it on a map
- TerriaJS
- Immediate value
- Plugin but not upgraded
- CKAN Cesium Preview
- Configure CKAN as data source
CKAN XQA - rudimentary plugin, it can be upgraded 💡
- Checking links
- https://goodtables.io/
  - Better checking for CSV
- Functionality - list of issues you should fix

Issues

The interviewee’s datasets are for internal use, but he wants a visibility model more flexible than just public or private 💡
- Sensitive data
- Visibility option: public sector only
- CKAN dataset visibility
- Use case: metadata public, data private, single datasets can be shared
He’s limited to technology that can run on a small footprint
- Kubernetes
- Docker compose
- CKAN 🟢
People don’t adopt technology on their own; it should be part of their workflow. The respondent enforces this internally through a policy. The controlled parameters are:
- Quality of data published
- Amount of data delivered
- Tags added

Ideas

It’s critical to get data engineers to use CKAN

Respondent 8: Data Distributor

Interview date: 16 June 2022

OVERVIEW

The interviewee has been in the industry for 2-3 years. The respondent’s primary role is in data distribution: picking the technology (CKAN), defining the flow of services and installing CKAN instances. He has also worked as a developer and created a CKAN client library for Java, which he uses himself.

Current usage

As a data distributor, he works a lot on metadata, as people should know exactly what they’re buying.
The most exciting CKAN feature for the interviewee is the preview, which he uses to show visualizations to people 🟢
- CKAN + Python
- CKAN as a visualization tool
- Plugin: Data → algo → visualization
- Created frontend based on CKAN → visualization on top of it

Business Model

He got funding for some of his projects
15 countries in the consortium
The interviewee gets data from different countries in Europe and Asia
His clients, the data exploiters, are from the EU (AI, data analysis)
The respondent has 100+ datasets published
He would like to standardize descriptors automatically (15 pages)
- Still figuring out how to attach descriptors to the dataset 💡
- Still figuring out how to attach an HTML document to a published dataset automatically
- Doesn’t want a long description in the UI, as it gets in the way of the dataset download button
The respondent developed his own data access system for data distribution 💡:
- Open access
- Controlled access
- Manual
He develops his own plugins to cover his needs
- Has a number of plugins in the pipeline

Issues

Publishing issue - doesn’t know how to add tags 🔴
The interviewee wants to contribute as a developer. He has an idea but needs help in creating a PR

Ideas

If a plugin contains a new feature, it would be good to separate that feature from the plugin and contribute it to CKAN. At present the respondent has to get in touch with the contributor. With guidelines, he could build his own plugin from the features he liked (reusability) 💡
Getting feedback from the community (community feature) 💡

How to make CKAN support easier

Gitter works great. All questions get answered 🟢
FAQ would help. The obvious stuff is important for newbies 💡

Used Plugins

Harvesting
Scheming
DCAT
Google Analytics
Keycloak - identity provider
Helm for deployment (customized it heavily)

CKAN 3.0 top-3 directions of improvement

Expand distribution portal: catalogue, data sets
Expand data exploration: if you provide API, you can connect your datasets
Easier deployment with Docker: it took a couple of weeks to deploy. But because it’s open source, the documentation is there (big +)
Support: Keeping up to date is important. Documentation helps with troubleshooting
Solr: it was hard to understand how to install it; this was fixed by the Docker install

Respondent 9: Data Consultant

Interview date: 16 June 2022

OVERVIEW

The interviewee works with government and enterprise clients, managing the delivery of data-driven solutions. He doesn’t contribute to CKAN but follows its development.

The respondent is now interested in CKAN’s enterprise capabilities for internal use: multiple data environments, and consumers who don’t understand data.

Products the respondent used

Socrata,
JKAN (smaller out-of-the-box one),
Mashta (Australia),
Dataverse (from Harvard, data marketplace),
Data Republic (Australia),
data.world,
open-metadata.org,
one library from Ln on data discovery and cataloguing

CKAN features, strengths, weaknesses

CKAN misses a lot as a catalogue 🔴
For the data portal, important functionality is missing as well 🔴
- Ability to create a workflow between several users who manage data 💡
- User management and roles: developers, data publishers, users 💡
Integration with internal tools can be better 🔴
- Connectors don’t cover his needs
- Authenticators for corporate environments 💡
- More data sources 💡
UI doesn’t matter, as everyone uses their own UI (+1 for decoupling the frontend) 💡
Dockerization is helpful and works great 🟢

Success metrics for the interviewee’s customers

Case 1: Search and discoverability - the ability to identify data assets we didn’t know existed.
Case 2: % of users in the organization who use CKAN as storage and then move the data into another tool, such as Tableau, for analysis.
Case 3: % of team members who use CKAN in their workflow. Engagement metric for non-tech users.

Respondent 10: Data Manager

Interview date: 24 June 2022

OVERVIEW

The interviewee manages a CKAN data portal that aggregates data from several other CKAN instances. They use Magda in their setup for searching and cataloguing. This is done to improve UX: the fewest possible clicks to reach the data 💡.

CKAN visually looks better than Magda 🟢

Success metrics for the respondent’s customers

End users:
- Number of dataset downloads 💡
- Number of visitors
Managers:
- Number of datasets
- Number of downloaded datasets
- Accessibility and ease of use for data custodians (UX for publishing data) 💡

CKAN 3.0 top-3 directions of improvement

Improve the visualization of data (maybe as a plugin)
1. Ability to apply a particular visualization to the data
  - At the moment, when you load data, you get the same table and graph for everything
2. Mapping program (map for specialized data)
  - Magda has a good preview map
User management/data sharing
1. These datasets are open
2. These are open if you apply
Customizable dashboard - linking Tableau, Power BI, and Google Data Studio

CKAN issues

Logging - the interviewee found it difficult to track usage by API
- How many people (by API) have downloaded datasets
- Simple statistics on content usage and engagement (Twitter, Reddit style) 💡
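
One way to narrow the logging gap described above, without changing CKAN itself, is to count resource downloads in the web server’s access log. The following is a minimal sketch under stated assumptions, not a CKAN feature: it assumes the log is in Common/Combined Log Format and that download URLs have the form /dataset/<id>/resource/<id>/download/<file>. The function name count_downloads is hypothetical, and distinct client IPs are only a rough proxy for "how many people".

```python
import re
from collections import Counter

# Assumption: Common/Combined Log Format, with download paths shaped like
# /dataset/<dataset-id>/resource/<resource-id>/download/<filename>.
DOWNLOAD_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "GET '
    r'/dataset/(?P<dataset>[^/\s]+)/resource/(?P<resource>[^/\s]+)/download\S* '
    r'[^"]*" (?P<status>\d{3}) '
)


def count_downloads(lines):
    """Return (downloads per dataset, distinct client IPs per dataset)."""
    downloads = Counter()
    clients = {}
    for line in lines:
        match = DOWNLOAD_RE.match(line)
        # Only successful (200) or partial-content (206) responses count.
        if match is None or match.group("status") not in {"200", "206"}:
            continue
        dataset = match.group("dataset")
        downloads[dataset] += 1
        clients.setdefault(dataset, set()).add(match.group("ip"))
    unique_clients = {name: len(ips) for name, ips in clients.items()}
    return downloads, unique_clients
```

Calling count_downloads on an open log file (a file object iterates line by line) gives per-dataset totals that could feed the simple usage statistics mentioned above. Telling API clients apart from browser users would need more information than this sketch uses, such as the user-agent field.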

How to make CKAN support easier

Training for people who are just starting to manage CKAN
Options for doing things in different ways
Best practices
