CSIRO
The CSIRO harvester is a CKAN harvester that can be used to harvest metadata from CSIRO data sources.
CSIRO is Australia's national science research agency. It is a government agency that conducts scientific research to solve problems that are important to Australia and the world.
The CSIRO Data Access Portal provides access to research data, software and other digital assets published by CSIRO across a range of disciplines.
Enable the Harvester
To enable the harvester, add basket_csiro_harvester to the ckan.plugins setting in your CKAN configuration file (e.g., ckan.ini or production.ini).
ckan.plugins = ... basket_csiro_harvester ...
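A quick way to confirm that the plugin is listed is to parse the configuration file and inspect the ckan.plugins value. The sketch below is only an illustration: it assumes the setting lives in the [app:main] section, as is usual for CKAN configuration files, and the helper name plugin_enabled and the sample plugin list (including the harvest entry) are made up for this example.

```python
import configparser


def plugin_enabled(ini_text: str, plugin: str = "basket_csiro_harvester") -> bool:
    """Return True if `plugin` appears in ckan.plugins under [app:main]."""
    # interpolation=None keeps placeholders such as %(here)s from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    plugins = parser.get("app:main", "ckan.plugins", fallback="")
    # Plugin names are separated by whitespace (spaces or line breaks).
    return plugin in plugins.split()


# Hypothetical configuration fragment, used only for demonstration.
sample = """
[app:main]
ckan.plugins = harvest basket_csiro_harvester
"""

print(plugin_enabled(sample))
```

To check a real file, read its contents first (for example with pathlib.Path("ckan.ini").read_text()) and pass the text to the helper.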
Configuration options
tsm_schema
[optional]
The transmute schema defines how harvested data is transformed before a dataset is created or updated in CKAN.
This is useful when the harvested data doesn't match the CKAN dataset schema and you need to transform it.
Otherwise, you'd need to write a custom harvester and process the remote data yourself.
See the ckanext-transmute documentation to learn more about the transmute schema syntax.
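To give a rough feel for what a field mapping does, the sketch below renames keys of a remote record in plain Python. It does not use ckanext-transmute and is not how that extension is implemented. It assumes only that the map key of a schema field renames that field in the output; the function rename_fields and the sample record are hypothetical.

```python
from typing import Any


def rename_fields(record: dict[str, Any], field_map: dict[str, str]) -> dict[str, Any]:
    """Return a copy of `record` with keys renamed according to `field_map`."""
    # Keys without an entry in field_map are kept unchanged.
    return {field_map.get(key, key): value for key, value in record.items()}


# Hypothetical remote record, for illustration only.
remote = {"title": "Ocean Temperature 2021", "notes": "Sample record"}

# Conceptually similar to a schema field "title" with "map": "name".
renamed = rename_fields(remote, {"title": "name"})
print(renamed)
```

A real transmute schema also runs validators and handles nested types, which this sketch leaves out.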
Example
{
  "root": "Dataset",
  "types": {
    "Dataset": {
      "fields": {
        "title": {
          "validators": [
            "tsm_string_only",
            "tsm_to_lowercase",
            "tsm_name_validator"
          ],
          "map": "name"
        },
        "resources": {
          "type": "Resource",
          "multiple": true,
          "map": "attachments"
        },
        "metadata_created": {
          "validators": [
            "tsm_isodate"
          ],
          "default": "2022-02-03T15:54:26.359453"
        },
        "metadata_modified": {
          "validators": [
            "tsm_isodate"
          ],
          "default_from": "metadata_created"
        },
        "metadata_reviewed": {
          "validators": [
            "tsm_isodate"
          ],
          "replace_from": "metadata_modified"
        }
      }
    },
    "Resource": {
      "fields": {
        "title": {
          "validators": [
            "tsm_string_only"
          ],
          "map": "name"
        },
        "extension": {
          "validators": [
            "tsm_string_only",
            "tsm_to_uppercase"
          ],
          "map": "format"
        },
        "web": {
          "validators": [
            "tsm_string_only"
          ],
          "map": "url"
        },
        "sub-resources": {
          "type": "Sub-Resource",
          "multiple": true
        }
      }
    }
  }
}
Type: dict[str, Any]
Default: None
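Because the option is a dictionary, it can be convenient to build it in Python and serialize it, which avoids hand-written JSON mistakes such as trailing commas. The following is a minimal sketch: it assumes that tsm_schema is a top-level key of the harvest source configuration JSON, and the schema is a trimmed-down variant of the example above.

```python
import json
from typing import Any

# Trimmed-down schema: map the remote "title" field to "name".
tsm_schema: dict[str, Any] = {
    "root": "Dataset",
    "types": {
        "Dataset": {
            "fields": {
                "title": {"validators": ["tsm_string_only"], "map": "name"},
            }
        }
    },
}

# Assumption: the option sits at the top level of the source configuration.
source_config: dict[str, Any] = {"tsm_schema": tsm_schema}

config_json = json.dumps(source_config, indent=2)

# Round-trip check: json.loads raises ValueError if the text is not valid JSON.
assert json.loads(config_json) == source_config

print(config_json)
```

The printed text can then be used as the configuration of the harvest source.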