BUG: environment variables from startup.sh are not imported due to a bug in entrypoint.sh #428

@mi-volodin

Summary

There's a bug in entrypoint.sh: when DAGs are downloaded from S3, the working directory is changed. As a result, the stored_env file is created in the dags folder instead of $AIRFLOW_HOME, and the subsequent import fails.

Description

We use mwaa-local-runner to emulate MWAA behaviour for ephemeral deployments. We recently discovered that environment variables defined in startup.sh are never actually imported into the local-runner environment.

The issue lives in this block of code in entrypoint.sh. I am copying the lines here, with numbered markers:

    if [ -n "$S3_DAGS_PATH" ]; then             # <--- 1
      echo "Syncing $S3_DAGS_PATH"
      mkdir -p $AIRFLOW_HOME/dags
      cd $AIRFLOW_HOME/dags                     # <--- 2
      aws s3 sync --exact-timestamp --delete $S3_DAGS_PATH .
    fi
    # if S3_REQUIREMENTS_PATH
    if [ -n "$S3_REQUIREMENTS_PATH" ]; then
      echo "Downloading $S3_REQUIREMENTS_PATH"
      mkdir -p $AIRFLOW_HOME/requirements
      aws s3 cp $S3_REQUIREMENTS_PATH $AIRFLOW_HOME/$REQUIREMENTS_FILE
    fi

    execute_startup_script                      # <--- 3
    source stored_env                           # <--- 4

So if we provide S3_DAGS_PATH (1), the script changes directory to $AIRFLOW_HOME/dags (2) before the download happens. When the startup script then runs (3), stored_env (see run-startup.sh) is written to the dags folder instead of $AIRFLOW_HOME.

Unfortunately, execute_startup_script explicitly switches back to $AIRFLOW_HOME before finishing, so source stored_env (4) looks for the file in $AIRFLOW_HOME, where it was never written.

Then we see the following error in the logs:

/entrypoint.sh: line 130: stored_env: No such file or directory
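
The sequence can be reproduced outside the container with a small sketch. Everything here is a stand-in: FAKE_HOME plays the role of $AIRFLOW_HOME, and fake_startup is a hypothetical function that only mimics the behaviour described above (write stored_env relative to the current directory, then switch back to the home directory). It is not the real execute_startup_script.

```shell
# Minimal reproduction sketch; all names below are hypothetical stand-ins.
unset MY_VAR
FAKE_HOME="$(mktemp -d)"            # stands in for $AIRFLOW_HOME
mkdir -p "$FAKE_HOME/dags"

# Mimics the described behaviour of execute_startup_script:
# stored_env is written relative to the current directory,
# then the function switches back to the home directory.
fake_startup() {
  echo 'export MY_VAR=hello' > stored_env
  cd "$FAKE_HOME" || return 1
}

cd "$FAKE_HOME/dags" || exit 1      # what the S3_DAGS_PATH branch does (2)
fake_startup                        # (3) writes stored_env into dags/

# (4) the file is looked up relative to the current directory, which is now FAKE_HOME
if [ -f stored_env ]; then
  . ./stored_env
else
  echo "stored_env not found in $(pwd)"
fi

REPRO_RESULT="${MY_VAR:-unset}"
echo "MY_VAR is $REPRO_RESULT"
```

In this sketch the file ends up in the dags stand-in directory and the variable is never set, which matches the symptom reported in the log line above.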

Expected behaviour

stored_env should be imported successfully, making the variables exported in startup.sh available to the local-runner process.

Suggested fix

I am not a shell-ninja, but I think stored_env should be guaranteed to be available in the same context as the call to execute_startup_script.

So that place should look something like this:

    cd $AIRFLOW_HOME
    execute_startup_script
    source stored_env
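
A possible variation on the same idea, sketched with stand-ins rather than the real scripts: keep the directory change for the sync inside a subshell so the parent's working directory never moves, and source the file by absolute path. FAKE_HOME, fake_startup and the touched file are all hypothetical; the real entrypoint.sh would run aws s3 sync at that point, and whether this fits the rest of the script is an assumption I have not verified.

```shell
# Sketch of an alternative fix; all names below are hypothetical stand-ins.
unset MY_VAR
FAKE_HOME="$(mktemp -d)"            # stands in for $AIRFLOW_HOME
mkdir -p "$FAKE_HOME/dags"
cd "$FAKE_HOME" || exit 1

# The cd is confined to a subshell, so the parent's working directory is untouched.
(
  cd "$FAKE_HOME/dags" || exit 1
  : > synced_dag.py                 # stand-in for the aws s3 sync step
)

# Stand-in for execute_startup_script: writes stored_env relative to the current directory.
fake_startup() {
  echo 'export MY_VAR=hello' > stored_env
}

fake_startup
. "$FAKE_HOME/stored_env"           # absolute path, independent of the current directory
echo "MY_VAR is $MY_VAR"
```

The subshell and the absolute path are two independent safeguards; either one alone would avoid the failure in this sketch, and the plain cd shown above remains the smallest change.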
