
Group/organization images

Note

Internally, groups and organizations are the same entity, so this workflow applies to both of them.

First of all, you need a configured storage that supports public links. As all group/organization images are stored in the local filesystem, you can use the files:public_fs storage adapter.

Storage name

This extension expects the group images storage to be named group_images. This name is used in all other commands of this migration workflow. If you want to use a different name for the group images storage, override the ckanext.files.group_images_storage config option, which has the default value group_images, and don't forget to adapt the commands accordingly.

Size restriction

This configuration example sets a 10MiB restriction on upload size via the ckanext.files.storage.group_images.max_size option. Feel free to change it or remove it completely to allow any upload size. This restriction applies to future uploads only. Any existing file that exceeds the limit is kept.

Type restriction

Uploads are restricted to the image/* MIME type via the ckanext.files.storage.group_images.supported_types option. You can make this option more or less restrictive. This restriction applies to future uploads only. Any existing file with a different MIME type is kept.
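
To picture what the size and type restrictions mean for a single upload, here is a minimal standalone sketch. It is not the extension's implementation: parse_size and is_allowed are hypothetical helpers written for this illustration, and the matching rule (an entry matches either the full MIME type or its main type) is an assumption.

```python
import re

# Binary multipliers for the size suffixes handled by this sketch.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def parse_size(value):
    """Convert a human-readable size such as "10MiB" into bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", value)
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"Unsupported size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def is_allowed(size, mimetype, max_size, supported_types):
    """Check one upload against a size limit and a list of allowed types."""
    if size > parse_size(max_size):
        return False
    main_type = mimetype.split("/", 1)[0]
    # Assumption: an entry matches the full MIME type or only its main type.
    return any(entry in (mimetype, main_type) for entry in supported_types)


# A 2 MiB PNG passes; a 2 MiB PDF and a 20 MiB PNG do not.
print(is_allowed(2 * 1024**2, "image/png", "10MiB", ["image"]))
print(is_allowed(2 * 1024**2, "application/pdf", "10MiB", ["image"]))
print(is_allowed(20 * 1024**2, "image/png", "10MiB", ["image"]))
```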

Location

ckanext.files.storage.group_images.path controls the location of the upload folder in the filesystem. It should match the value of the ckan.storage_path option plus storage/uploads/group. In the example below we assume that the value of ckan.storage_path is /var/storage/ckan.

Public URL

The ckanext.files.storage.group_images.public_root option specifies the base URL from which every group image can be accessed. In most cases it's the CKAN URL plus uploads/group. If CKAN is served directly from ckan.site_url, use the value from the example below unchanged. If you are using ckan.root_path, like /data/, insert this root path into the value of the option. The example below uses the %(ckan.site_url)s placeholder, which is automatically replaced with the value of the ckan.site_url config option. You can specify the site URL explicitly if you don't like this placeholder syntax.

ckanext.files.storage.group_images.type = files:public_fs
ckanext.files.storage.group_images.max_size = 10MiB
ckanext.files.storage.group_images.supported_types = image
ckanext.files.storage.group_images.path = /var/storage/ckan/storage/uploads/group
ckanext.files.storage.group_images.public_root = %(ckan.site_url)s/uploads/group
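
If you prefer to derive the path and public_root values instead of typing them by hand, the rules described above can be sketched as follows. The helper names are hypothetical, https://example.org is a stand-in site URL, and the sketch assumes that ckan.root_path is a plain prefix such as /data/.

```python
import posixpath


def group_images_path(storage_path):
    """Upload folder: ckan.storage_path plus storage/uploads/group."""
    return posixpath.join(storage_path, "storage", "uploads", "group")


def group_images_public_root(site_url, root_path=""):
    """Base URL of group images: site URL, optional root path, uploads/group."""
    parts = [site_url.rstrip("/")]
    # Assumption: root_path is a plain prefix like "/data/".
    if root_path.strip("/"):
        parts.append(root_path.strip("/"))
    parts.append("uploads/group")
    return "/".join(parts)


print(group_images_path("/var/storage/ckan"))
print(group_images_public_root("https://example.org"))
print(group_images_public_root("https://example.org", "/data/"))
```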

Now let's run a command that shows the list of files available in the newly configured storage:

ckan files scan -s group_images

None of these files are tracked by the files extension yet, i.e. they don't have a corresponding record in the DB with basic details like size, MIME type, file hash, etc. Let's create these records via the command below. It's safe to run this command multiple times: it gathers and stores information about files not yet registered in the system and ignores any previously registered file.

ckan files scan -s group_images -t

Finally, let's run the command that shows only untracked files. Ideally, you'll see nothing when you run it, because you have just registered every file in the system.

ckan files scan -s group_images -u

Note

All the files are still available inside the storage directory. If the previous command shows nothing, it only means that CKAN already knows the details of each file in the storage directory. If you want to see the list of files again, omit the -u flag (which stands for "untracked") and all the files will appear in the command output again:

ckan files scan -s group_images

Now that all images are tracked by the system, we can give ownership of these files to the groups/organizations that use them. Run the command below to connect files with their owners. It searches for groups/organizations first and reports how many connections were identified. It then offers to show the identified relationships and the list of files that have no owner (if there are any). Files without an owner usually mean that you removed a group/organization from the database but did not remove its image.

Finally, you'll be asked whether you want to transfer ownership of the files. This operation does not change existing data, and if you disable ckanext-files after the ownership transfer, you won't see any difference. The whole ownership transfer is managed inside custom DB tables created by ckanext-files, so it's a safe operation.

ckan files migrate groups group_images

Here's an example of output that you can see when running the command:

Found 3 files. Searching file owners...
[####################################] 100% Located owners for 2 files out of 3.

Show group IDs and corresponding file? [y/N]: y
d7186937-3080-429f-a434-22b74b9a8d39: file-1.png
87e2a1aa-7905-4a28-a087-90433f8e169e: file-2.png

Show files that do not belong to any group? [y/N]: y
file-3.png

Transfer file ownership to group identified in previous steps? [y/N]: y
Transfering file-2.png  [####################################]  100%
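
The numbers in this output can be reproduced with a small standalone sketch of the matching step. This is only a guess at the idea, not the command's real code: it assumes that a group's image_url holds the bare file name of an uploaded image, and locate_owners, the third group and its example.org link are hypothetical.

```python
def locate_owners(files, groups):
    """Split files into those referenced by a group's image_url and orphans."""
    # Assumption: for uploaded images, image_url holds the bare file name.
    by_image = {g["image_url"]: g["id"] for g in groups if g.get("image_url")}
    owned = {name: by_image[name] for name in files if name in by_image}
    orphans = [name for name in files if name not in by_image]
    return owned, orphans


files = ["file-1.png", "file-2.png", "file-3.png"]
groups = [
    {"id": "d7186937-3080-429f-a434-22b74b9a8d39", "image_url": "file-1.png"},
    {"id": "87e2a1aa-7905-4a28-a087-90433f8e169e", "image_url": "file-2.png"},
    # Hypothetical group that links to an external image instead of an upload.
    {"id": "hypothetical-group", "image_url": "https://example.org/logo.png"},
]

owned, orphans = locate_owners(files, groups)
print(f"Located owners for {len(owned)} files out of {len(files)}.")
print("Files without owner:", orphans)
```

Under these assumptions, file-3.png is reported as a file that does not belong to any group because no group references it.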

Now comes the most complex part. You need to change metadata schema and UI in order to:

  • make sure that all new files are uploaded and managed by ckanext-files instead of CKAN's native uploader
  • generate image URLs using ckanext-files functionality. Right now, while files are stored in the original storage folder, it makes no difference. But if you change the upload directory in the future, or even decide to move files from the local filesystem into a different storage backend, it will guarantee that files remain visible.

The original CKAN workflow for uploading files was:

  • just save the image URL provided by the user, or
  • upload a file
  • put it into a directory that is publicly served by the application
  • replace the uploaded file in the HTML form/group metadata with the public URL of the uploaded file

This approach differs from the strategy recommended by ckanext-files. But to keep the migration as simple as possible, we'll stay close to the original workflow.

Note

The suggested approach resembles the existing process of file uploads in CKAN. But ckanext-files was designed as a system that gives you a choice. Check file upload strategies to learn more about alternative upload implementations and their pros and cons.

First, we need to replace the Upload/Link widget on the group/organization form. If you are using native group templates, create group/snippets/group_form.html and organization/snippets/organization_form.html. Inside both files, extend the original template and override the basic_fields block. You only need to replace the last field:

{{ form.image_upload(
    data, errors, is_upload_enabled=h.uploads_enabled(),
    is_url=is_url, is_upload=is_upload) }}

with

{{ form.image_upload(
    data, errors, is_upload_enabled=h.files_group_images_storage_is_configured(),
    is_url=is_url, is_upload=is_upload,
    field_upload="files_image_upload") }}

There are two differences from the original. First, we use h.files_group_images_storage_is_configured() instead of h.uploads_enabled(). As we are using different storages for different upload types, upload widgets can now be enabled independently. Second, we pass the field_upload="files_image_upload" argument into the macro. It sends the uploaded file to CKAN inside files_image_upload instead of the original image_upload field. This must be done because CKAN unconditionally strips the image_upload field from the submission payload, which makes processing of the file too unreliable. CKAN keeps the renamed field, so we can process it as we wish.

Tip

If you are using ckanext-scheming, you only need to replace the form_snippet of the image_url field instead of rewriting the whole template.

Now, let's define validation rules for this new upload field. We need to create plugins that modify the validation schema for group and organization. Due to CKAN implementation details, you need a separate plugin for each of them.

Tip

If you are using ckanext-scheming, you can add the files_image_upload validators to the schemas of organization and group. Check the list of validators that must be applied to this new field below.

Here's an example of plugins that modify the validation schemas of group and organization. As you can see, they are mostly the same:

import ckan.plugins as p
import ckan.plugins.toolkit as tk
from ckan.lib.plugins import DefaultGroupForm, DefaultOrganizationForm
from ckan.logic.schema import default_create_group_schema, default_update_group_schema


def _modify_schema(schema, entity_type):
    # Attach ckanext-files validators to the custom upload field.
    schema["files_image_upload"] = [
        tk.get_validator("ignore_empty"),
        tk.get_validator("files_into_upload"),
        tk.get_validator("files_validate_with_storage")("group_images"),
        tk.get_validator("files_upload_as")(
            "group_images",
            entity_type,
            "id",
            "public_url",
            entity_type + "_patch",
            "image_url",
        ),
    ]
    # Return the schema: the plugin methods below pass this value on to CKAN.
    return schema


class FilesGroupPlugin(p.SingletonPlugin, DefaultGroupForm):
    p.implements(p.IGroupForm, inherit=True)
    is_organization = False

    def group_types(self):
        return ["group"]

    def create_group_schema(self):
        return _modify_schema(default_create_group_schema(), "group")

    def update_group_schema(self):
        return _modify_schema(default_update_group_schema(), "group")


class FilesOrganizationPlugin(p.SingletonPlugin, DefaultOrganizationForm):
    p.implements(p.IGroupForm, inherit=True)
    is_organization = True

    def group_types(self):
        return ["organization"]

    def create_group_schema(self):
        return _modify_schema(default_create_group_schema(), "organization")

    def update_group_schema(self):
        return _modify_schema(default_update_group_schema(), "organization")

There are 4 validators that must be applied to the new upload field:

  • ignore_empty: to skip validation when the image URL is set manually and no upload is selected.
  • files_into_upload: to convert the value of the upload field into the normalized format expected by ckanext-files.
  • files_validate_with_storage(STORAGE_NAME): this validator requires an argument: the name of the storage we are using for image uploads. The validator uses the storage settings to verify the size and MIME type of the upload.
  • files_upload_as(STORAGE_NAME, GROUP_TYPE, NAME_OF_ID_FIELD, "public_url", NAME_OF_PATCH_ACTION, NAME_OF_URL_FIELD): this validator is the most challenging. It accepts 6 arguments:

    • the name of the storage used for image uploads
    • group or organization, depending on the processed entity
    • the name of the ID field of the processed entity. It's id in our case.
    • public_url - use this exact value. It tells which property of the file you want to use as a link to the file.
    • group_patch or organization_patch, depending on the processed entity
    • image_url - the name of the field that contains the URL of the image. ckanext-files will put the public link of the uploaded file into this field when the form is processed.
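
To see the six positional arguments side by side for both entity types, here is a minimal standalone sketch. upload_as_args is a hypothetical helper that only assembles the values listed above; it does not call CKAN or ckanext-files.

```python
def upload_as_args(entity_type, storage="group_images"):
    """Build the six positional arguments described in the list above."""
    return (
        storage,                 # name of the storage used for image uploads
        entity_type,             # "group" or "organization"
        "id",                    # name of the ID field of the entity
        "public_url",            # file property used as the link to the file
        entity_type + "_patch",  # group_patch or organization_patch
        "image_url",             # field that receives the public link
    )


for entity_type in ("group", "organization"):
    print(entity_type, upload_as_args(entity_type))
```

Only the second and the fifth arguments differ between groups and organizations, which is why the two plugins look almost identical.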

That's all. Now every image upload for a group/organization is handled by ckanext-files. To verify it, do the following. First, check the list of files currently stored in the group_images storage via the command that we used at the beginning of the migration:

ckan files scan -s group_images

You'll see a list of existing files. Their names follow the format <ISO_8601_DATETIME><FILENAME>, e.g. 2024-06-14-133840.539670photo.jpg.

Now upload an image into an existing group, or create a new group with any image. When you check the list of files again, you'll see one new record. But this time the name of the record looks like a UUID: da046887-e76c-4a68-97cf-7477665710ff.
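
If the storage contains many files, the two naming styles can be told apart programmatically. The sketch below is a standalone illustration: classify is a hypothetical helper, and the timestamp pattern is inferred from the single example name shown above.

```python
import re
import uuid

# Timestamp prefix inferred from the example 2024-06-14-133840.539670photo.jpg.
_LEGACY_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}-\d{6}\.\d{6}")


def classify(name):
    """Guess which uploader produced a file name."""
    if _LEGACY_PREFIX.match(name):
        return "native uploader"
    try:
        uuid.UUID(name)
    except ValueError:
        return "unknown"
    return "ckanext-files"


print(classify("2024-06-14-133840.539670photo.jpg"))
print(classify("da046887-e76c-4a68-97cf-7477665710ff"))
```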