Canonical way to isolate binary info from PyExecutableInfo #3324

@FrankPortman

🚀 feature request

Relevant Rules

Adding information to PyExecutableInfo (impacts py_binary, py_test).

Description

Adding something to PyExecutableInfo that helps communicate "here's extra stuff for the binary that wouldn't be present were this a library".

Right now I have a custom rule that takes deps which expose PyInfo (which py_binary also does).

It then executes some custom logic stemming from something like this:

    inputs = depset(
        transitive = [dep[DefaultInfo].data_runfiles.files for dep in ctx.attr.deps] +
                     [dep[DefaultInfo].default_runfiles.files for dep in ctx.attr.deps],
    )

After some further processing of the "inputs" depset, the rule exposes those files to other rules, which then try to copy or symlink them.

This fails due to errors such as:

FileNotFoundError: [Errno 2] No such file or directory: 'bazel-out/darwin_arm64-fastbuild/bin/tools/runnable/examples/python/_example_app_bin.venv/bin/python3'

This only occurs with build --@rules_python//python/config_settings:bootstrap_impl=script.

Describe the solution you'd like

It sounds like I need to rework my rules in general, because accepting heterogeneous inputs (deps with only PyInfo vs. deps with both PyInfo and PyExecutableInfo) will need better handling on my end. Ideally, though, there would be a canonical way for me to remove any extra files that are only present because a dependency is a binary.

Please see this Slack thread for extra context: https://bazelbuild.slack.com/archives/CA306CEV6/p1759447665927069?thread_ts=1759177072.316529&cid=CA306CEV6

If PyExecutableInfo were modified to add something like

PyExecutableInfo = provider(
    fields = {
        # ... existing fields ...
        "content_files": """
        :type: depset[File]
        The actual content files (sources, data) without runtime scaffolding.
        This excludes venv directories, bootstrap scripts, and executable wrappers.
        """,
    }
)

then I think my use case would be unblocked relatively nicely.
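
To make the intended consumption concrete, here is a minimal sketch of how my rule might choose its inputs if such a field existed. It is a model only, written in the Starlark-compatible subset of Python: each dep is a plain dict standing in for a Bazel Target, "content_files" is the proposed field (it does not exist in rules_python today), and "default_runfiles_files" is a hypothetical stand-in for dep[DefaultInfo].default_runfiles.files.

```python
def select_inputs(dep):
    """Pick the files to forward for one dep (a dict standing in for a Target)."""
    if "PyExecutableInfo" in dep:
        # Proposed (hypothetical) field: content only, no venv or bootstrap files.
        return dep["PyExecutableInfo"]["content_files"]

    # Library deps carry no binary scaffolding, so their runfiles are used as-is.
    return dep["DefaultInfo"]["default_runfiles_files"]

def collect_inputs(deps):
    """Merge the selected files of all deps, dropping duplicates, keeping order."""
    inputs = []
    for dep in deps:
        for f in select_inputs(dep):
            if f not in inputs:
                inputs.append(f)
    return inputs

# Illustrative deps: one library and one binary that depends on it.
lib = {"DefaultInfo": {"default_runfiles_files": ["pkg/lib.py"]}}
binary = {
    "DefaultInfo": {
        "default_runfiles_files": [
            "pkg/app.py",
            "pkg/lib.py",
            "pkg/_app.venv/bin/python3",
        ],
    },
    "PyExecutableInfo": {"content_files": ["pkg/app.py", "pkg/lib.py"]},
}
```

In a real rule the membership check would be "PyExecutableInfo in dep" on the Target itself and the result would stay a depset rather than a list. The point is only that the consumer would no longer need to know which paths are scaffolding.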

Describe alternatives you've considered

Right now I have added a workaround to my custom rule: it checks for the presence of PyExecutableInfo and then heuristically filters paths based on patterns that I've seen.

This seems to work but feels non-ideal, since it depends on path patterns that are implementation details of rules_python:

def _is_runtime_artifact(file):
    """
    Identify files that are runtime scaffolding, not actual content.

    This cannot identify the actual shell script that `py_binary`
    generates, but that is largely static and small, and more importantly it exists
    at build time, so there is no real harm in including it.

    The bootstrap script is also harmless to keep in but we remove it for completeness.
    """
    path = file.short_path
    basename = file.basename

    # Venv artifacts added by `rules_python` with bootstrap_impl=script
    if path.endswith(".venv/bin/python3"):
        return True

    # Bootstrap scripts added by `rules_python` with bootstrap_impl=script
    if basename.endswith("_stage2_bootstrap.py"):
        return True

    return False
