Replace DataPusher with XLoader (Compose and CKAN Dockerfile changes) #207

Open
kowh-ai wants to merge 3 commits into master from XLoader-replacement

@kowh-ai
Contributor

@kowh-ai kowh-ai commented Mar 20, 2025

These are the ckan-docker changes needed to replace DataPusher with XLoader.

Can be implemented/merged once the "Add XLoader image to run in it's own container" PR is merged.

The new xloader/ directory contains a README with instructions on how to make the changes.

@amercader
Member

TL;DR: My suggestion for this:

  • Update entrypoint scripts in ckan-docker-base to be able to run the worker command
  • Add new ckan-worker service in compose
  • Install xloader in ckan/Dockerfile

Let's step back for a second and look at what we need:

  • Xloader is just another plugin that uses background jobs, for which we need a running worker. It might not be the only plugin sending jobs to the worker so any service, image etc created should be named "worker", not "xloader". For instance, we want to end up with a new ckan-worker service alongside the ckan one (or the ckan-dev one in the dev compose file)
  • The image running the worker should be identical to the one running the web app, both in terms of base image and in terms of custom image changes and configuration that users have added to their own sites (including the xloader extension). That means that both the ckan and the ckan-worker service should use the image built by ckan/Dockerfile in this repo.
  • The only difference between these two services should be the command they are running. In ckan-worker it should be ckan -c $CKAN_INI jobs worker instead of the uwsgi or dev server of the ckan service.

Now that we know what we need let's see what's the best way to achieve it.

  • Let's start with the command to run. Our entrypoint scripts do several other things besides running the server, and we also want those to happen when running the CKAN jobs worker. So I think the best way is to define a CKAN_WORKER env var (set in the environment: section of the ckan-worker service in the compose file) and modify the entrypoint scripts so that they run the jobs worker command if that env var is present and set to true, e.g.:

    diff --git a/ckan-2.11/setup/start_ckan.sh b/ckan-2.11/setup/start_ckan.sh
    index a503832..506553c 100755
    --- a/ckan-2.11/setup/start_ckan.sh
    +++ b/ckan-2.11/setup/start_ckan.sh
    @@ -54,8 +54,13 @@ fi
    
     if [ $? -eq 0 ]
     then
    -    # Start uwsgi
    -    uwsgi $UWSGI_OPTS
    +    if [[ "${CKAN_WORKER,,}" == "true" ]]; then
    +        # Start the CKAN worker
    +        ckan -c "$CKAN_INI" jobs worker $CKAN_WORKER_OPTS
    +    else
    +        # Start uwsgi serving the CKAN web app (options left unquoted so they word-split)
    +        uwsgi $UWSGI_OPTS
    +    fi
     else
       echo "[prerun] failed...not starting CKAN."
     fi

    We definitely need a $CKAN_WORKER_OPTS env var so users can pass their own options from the compose file or the .env file. The same should be repeated in the dev entrypoint and across the supported image versions.

  • Where do we actually install the xloader extension?

    • Creating a separate ckan-base-worker image that includes xloader, as done in "Add XLoader image to run in it's own container, separate from the CKAN UI container" (ckan-docker-base#76), doesn't make much sense because we need xloader in the ckan web container as well.
    • We could include it by default in ckan-base. We are already installing ckanext-envvars and doing some datapusher configuration in the entrypoint scripts, so we could consider xloader a "core" requirement and include it there. It would simplify users' custom Dockerfiles, but we would be adding more stuff and requirements to the base images, which are supposed to be "just CKAN".
    • So I would perhaps suggest adding some lines to install ckanext-xloader in the project Dockerfile (ckan/Dockerfile), so users get it by default but can remove it if not needed, and can treat it like any other extension they add to their site (bumping the version from time to time).
  • The only thing left would be to document the worker service and that xloader is enabled by default, and update .env.example to add it to the active default plugins (and remove all mentions of datapusher).
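To make the compose side of this more concrete, here is a rough sketch of what the new ckan-worker service could look like, written as a small shell function that prints a Compose override file. Only the ckan-worker name, the ckan/Dockerfile build and the CKAN_WORKER / CKAN_WORKER_OPTS variables come from the points above. The function name, the build context path, the .env file, the db, solr and redis service names and the restart policy are assumptions for illustration and would need to match the actual compose files.

```shell
# Sketch only, not the final compose change.
# worker_override is a hypothetical helper that prints an override file
# adding a ckan-worker service next to the existing ckan one.
worker_override() {
  cat <<'EOF'
services:
  ckan-worker:
    # Built from the same Dockerfile as the ckan service, so the worker
    # has the same extensions and configuration as the web app
    build:
      context: ckan/
      dockerfile: Dockerfile
    # Assumption: the project keeps its settings in .env
    env_file:
      - .env
    environment:
      # Read by the modified entrypoint: run the jobs worker instead of uwsgi
      CKAN_WORKER: "true"
      # Optional extra arguments appended to the worker command
      CKAN_WORKER_OPTS: ""
    # Assumption: these are the backing service names in the compose file
    depends_on:
      - db
      - solr
      - redis
    restart: unless-stopped
EOF
}
```

Something like `worker_override > docker-compose.override.yml` would write it to a file that Compose merges with the main file by default. In the real change the service would more likely be added directly to the compose files, with an equivalent entry in the dev one.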

How does that sound?
