
Different behavior for DagBag.dagbag_stats.file on Linux vs Windows #45172

@Dev-iL

Apache Airflow version

2.10.4

If "Other Airflow 2 version" selected, which one?

No response

What happened?

On Windows, the file field of each entry in dagbag.dagbag_stats contains an absolute path, whereas on Linux it contains a path relative to dagbag.dag_folder.

What you think should happen instead?

The behavior should be the same regardless of the OS.

How to reproduce

Run the script below on Windows and on Linux and compare the results:

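from os import environ
from pathlib import Path

from airflow.models import DagBag

# Locate the repository root by walking up from this file.
repo_root: Path = next(p for p in Path(__file__).parents if p.name == "my_proj")
home: Path = repo_root / "airflow"
environ["AIRFLOW_HOME"] = home.as_posix()
dagbag = DagBag(include_examples=False)

# dagbag_stats holds one entry per parsed file; compare the `file` field across OSes.
for stat in dagbag.dagbag_stats:
    print(stat.file)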

Operating System

Win10 + RL9.3

Versions of Apache Airflow Providers

Irrelevant

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

I believe the reason for this difference is that the following line

file=filepath.replace(settings.DAGS_FOLDER, ""),

doesn't normalize the path separators, so in a case such as

# These are values copied from the debugger on Win
settings.DAGS_FOLDER == 'C:\\repositories\\my_proj\\airflow/dags'
filepath == "C:\\repositories\\my_proj\\airflow\\dags\\my_dag.py"

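settings.DAGS_FOLDER (which mixes "\\" and "/") is not a substring of filepath, so nothing gets replaced and the absolute path is kept. A possible solution is to apply Path(raw_path).as_posix() to all paths involved before doing the replacement.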
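To illustrate the mismatch and the proposed normalization, here is a minimal sketch using the two values from the debugger. The helper name relative_to_dags_folder is hypothetical and is not part of Airflow. PureWindowsPath is used instead of Path only so that the snippet behaves the same when run on Linux; it is not meant as the actual patch.

```python
from pathlib import PureWindowsPath

# Values copied from the issue (debugger output on Windows).
dags_folder = "C:\\repositories\\my_proj\\airflow/dags"
filepath = "C:\\repositories\\my_proj\\airflow\\dags\\my_dag.py"

# Current behaviour: the mixed-separator folder string is not a substring
# of filepath, so str.replace leaves the absolute path untouched.
unchanged = filepath.replace(dags_folder, "")


def relative_to_dags_folder(filepath: str, dags_folder: str) -> str:
    """Hypothetical helper: normalize separators, then strip the folder prefix."""
    norm_file = PureWindowsPath(filepath).as_posix()
    norm_folder = PureWindowsPath(dags_folder).as_posix()
    return norm_file.replace(norm_folder, "")


normalized = relative_to_dags_folder(filepath, dags_folder)

print(unchanged)   # still the absolute Windows path
print(normalized)  # /my_dag.py
```

With both strings converted to forward slashes, the prefix matches and the result is the same leading-slash relative path that the plain replace produces on Linux.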

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
