DCAT JSON
The DCAT JSON harvester is a CKAN harvester that can be used to harvest metadata from DCAT JSON files.
DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web.
Warning
This harvester is based on the original DCAT harvester from ckanext-dcat, so ckanext-dcat must be installed.
Enable the Harvester
To enable the harvester, add basket_dcat_json_harvester to the ckan.plugins setting in your CKAN configuration file (e.g., ckan.ini or production.ini).
ckan.plugins = ... basket_dcat_json_harvester ...
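As a quick sanity check, the following sketch reads a configuration file and reports whether the plugin is listed. It assumes the setting lives in the [app:main] section, as in a typical CKAN ini file; the helper name plugin_enabled and the other plugin names in the sample are purely illustrative.

```python
import configparser


def plugin_enabled(ini_text: str, plugin: str) -> bool:
    """Return True if `plugin` is listed in ckan.plugins under [app:main]."""
    # Disable interpolation so values such as %(here)s do not raise errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    plugins = parser.get("app:main", "ckan.plugins", fallback="")
    # ckan.plugins is a whitespace-separated list of plugin names.
    return plugin in plugins.split()


# Illustrative sample; the surrounding plugin names are placeholders.
sample = """
[app:main]
ckan.plugins = some_plugin basket_dcat_json_harvester another_plugin
"""

print(plugin_enabled(sample, "basket_dcat_json_harvester"))
```

To check a real file, pass its contents, for example `plugin_enabled(open("ckan.ini").read(), "basket_dcat_json_harvester")`.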
Configuration options
tsm_schema
[optional]
A transmute schema defines how the harvested data is transformed before the harvester creates or updates a dataset in CKAN.
This is useful when the harvested data doesn't match the CKAN dataset schema and you need to transform it.
Otherwise, you'd need to write a custom harvester and process the remote data yourself.
See the ckanext-transmute documentation to learn more about the transmute schema syntax.
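To make the idea concrete, here is a toy model of a schema-driven transformation. It is not the ckanext-transmute implementation: the function apply_field_rules and the validator names are hypothetical, and it assumes that "validators" run in order on a field's value and that "map" renames the field in the result.

```python
from typing import Any, Callable

# Hypothetical stand-ins for validators; the real ones live in ckanext-transmute.
VALIDATORS: dict[str, Callable[[Any], Any]] = {
    "to_lowercase": lambda value: value.lower(),
    "to_uppercase": lambda value: value.upper(),
}


def apply_field_rules(record: dict[str, Any], rules: dict[str, dict]) -> dict[str, Any]:
    """Apply per-field validators, then rename fields that declare a "map" target."""
    result: dict[str, Any] = {}
    for key, value in record.items():
        rule = rules.get(key, {})
        # Run each named validator in the order it is listed.
        for name in rule.get("validators", []):
            value = VALIDATORS[name](value)
        # Fields without a "map" entry keep their original name.
        result[rule.get("map", key)] = value
    return result


rules = {"extension": {"validators": ["to_uppercase"], "map": "format"}}
print(apply_field_rules({"extension": "csv", "id": "1"}, rules))
```

The point of the sketch is only that the remote record is reshaped declaratively, by rules keyed on field names, rather than by custom harvester code.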
Example
{
  "root": "Dataset",
  "types": {
    "Dataset": {
      "fields": {
        "title": {
          "validators": [
            "tsm_string_only",
            "tsm_to_lowercase",
            "tsm_name_validator"
          ],
          "map": "name"
        },
        "resources": {
          "type": "Resource",
          "multiple": true,
          "map": "attachments"
        },
        "metadata_created": {
          "validators": [
            "tsm_isodate"
          ],
          "default": "2022-02-03T15:54:26.359453"
        },
        "metadata_modified": {
          "validators": [
            "tsm_isodate"
          ],
          "default_from": "metadata_created"
        },
        "metadata_reviewed": {
          "validators": [
            "tsm_isodate"
          ],
          "replace_from": "metadata_modified"
        }
      }
    },
    "Resource": {
      "fields": {
        "title": {
          "validators": [
            "tsm_string_only"
          ],
          "map": "name"
        },
        "extension": {
          "validators": [
            "tsm_string_only",
            "tsm_to_uppercase"
          ],
          "map": "format"
        },
        "web": {
          "validators": [
            "tsm_string_only"
          ],
          "map": "url"
        },
        "sub-resources": {
          "type": "Sub-Resource",
          "multiple": true
        }
      }
    }
  }
}
Type: dict[str, Any]
Default: None
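The sketch below shows one way to assemble a configuration object that carries the schema under the tsm_schema key and serialize it to JSON. It assumes the option is supplied through the harvest source's JSON configuration; the helper name build_source_config and the shortened schema are illustrative only.

```python
import json
from typing import Any, Optional

# Shortened schema for illustration; see the full example above for more fields.
tsm_schema: dict[str, Any] = {
    "root": "Dataset",
    "types": {
        "Dataset": {
            "fields": {
                "title": {"validators": ["tsm_string_only"], "map": "name"},
            }
        }
    },
}


def build_source_config(schema: Optional[dict[str, Any]] = None) -> str:
    """Serialize a configuration dict, including tsm_schema only when given."""
    config: dict[str, Any] = {}
    if schema is not None:
        # The option is optional; leaving it out matches the default of None.
        config["tsm_schema"] = schema
    return json.dumps(config, indent=2)


print(build_source_config(tsm_schema))
```

Serializing with json.dumps also guards against invalid JSON such as trailing commas, which a hand-written configuration can easily contain.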