@kevinschoon

This adds a new subcommand to bst called inspect, which dumps structured data about a given project to stdout. The output can be either JSON or YAML, with JSON being the default.

Having this command available will make it easier to write external tools which need to inspect the state of a BuildStream project, since JSON and YAML encodings are widely supported. The command can easily be extended in the future to cover other inspectable aspects of a bst project which aren't present in this first commit.



def _dump_project(self):
# TODO: What else do we want here?
Author

Please take note of this, I'm unsure what other items we should include in the project output.

project = str(datafiles)
result = cli.run(project=project, silent=True, args=["inspect", "*.bst"])
result.assert_success()
json.loads(result.output)
Contributor

I would perform an assert between the result, and an 'expected' JSON result stored in the project directory.

Author

I looked into doing this, but the inspect output contains information that is local to the caller's machine, for example:

      "sources": [
        {
          "kind": "remote",
          "url": "file:///home/kevin/repos/external/github.com/apache/buildstream/tests/frontend/source-fetch/files/bananas",
          "medium": "remote-file",
          "version-type": "sha256",
          "version": "e49295702f7da8670778e9b95a281b72b41b31cb16afa376034b45f59a18ea3f"
        }
      ]

It would therefore not be possible to store the entire expected output of the inspect command. Instead, I just tested for the existence of a few keys. If you think we should test more, I'm open to trying a different approach.
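
One way to test more than key existence without storing machine-specific paths might be to compare only a stable subset of the output. The following is a minimal sketch, not code from this PR: the helper name `assert_subset` and the sample data are hypothetical, and it only assumes the output shape shown in the snippet above.

```python
import json


def assert_subset(expected, actual, path="$"):
    """Assert that everything in `expected` also appears in `actual`.

    Keys present only in `actual` (for example an absolute "url" that
    differs per checkout) are ignored.
    """
    if isinstance(expected, dict):
        assert isinstance(actual, dict), f"{path}: expected an object"
        for key, value in expected.items():
            assert key in actual, f"{path}: missing key {key!r}"
            assert_subset(value, actual[key], f"{path}.{key}")
    elif isinstance(expected, list):
        assert isinstance(actual, list), f"{path}: expected a list"
        assert len(expected) == len(actual), (
            f"{path}: expected {len(expected)} items, got {len(actual)}"
        )
        for index, (exp_item, act_item) in enumerate(zip(expected, actual)):
            assert_subset(exp_item, act_item, f"{path}[{index}]")
    else:
        assert expected == actual, f"{path}: expected {expected!r}, got {actual!r}"


# Hypothetical output resembling the snippet above; "url" varies per machine.
actual = json.loads("""
{"sources": [{"kind": "remote",
              "url": "file:///some/checkout/files/bananas",
              "medium": "remote-file",
              "version-type": "sha256"}]}
""")

# The stored expectation simply leaves out the machine-specific "url".
expected = {
    "sources": [
        {"kind": "remote", "medium": "remote-file", "version-type": "sha256"}
    ]
}

assert_subset(expected, actual)
```

With this approach the expected JSON stored in the project directory would contain only the fields that are stable across machines.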

project = str(datafiles)
result = cli.run(project=project, silent=True, args=["inspect", "--state", "--deps", "all"])
result.assert_success()
json.loads(result.output)
Contributor

Same comment as above.

Comment on lines 41 to 45
assert(output["project"]["name"] == "test")
element = _element_by_name(output["elements"], "import-bin.bst")
source = element["sources"][0]
assert(source["kind"] == "local")
assert(source["url"] == "files/bin-files")
Contributor

suggestion: assert should report the reason if the assert fails.
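
As an illustration of this suggestion, the assertions could carry a message that is shown on failure. This is only a sketch: `element_by_name` is a hypothetical stand-in for the `_element_by_name` helper used in the test (its real implementation is not shown here), and the `output` dictionary is made-up sample data.

```python
def element_by_name(elements, name):
    """Return the element dict whose "name" matches.

    Hypothetical stand-in for the `_element_by_name` test helper.
    """
    for element in elements:
        if element["name"] == name:
            return element
    raise AssertionError(f"element {name!r} not found in inspect output")


# Hypothetical, trimmed-down inspect output for illustration only.
output = {
    "project": {"name": "test"},
    "elements": [
        {
            "name": "import-bin.bst",
            "sources": [{"kind": "local", "url": "files/bin-files"}],
        },
    ],
}

name = output["project"]["name"]
assert name == "test", f"unexpected project name: {name!r}"

source = element_by_name(output["elements"], "import-bin.bst")["sources"][0]
assert source["kind"] == "local", f"unexpected source kind: {source['kind']!r}"
assert source["url"] == "files/bin-files", f"unexpected source url: {source['url']!r}"
```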

Author

Thanks, I have cleaned up and parameterized the tests a bit in my last commit.

@harrysarson
Contributor

This PR allows you to see the version of a nested junction element. Currently this is only possible by hacking around with .bst/staged-junctions.

With this PR you can do something like

bst inspect xxxx.bst:freedesktop-sdk.bst | jq '.elements[0].sources[0]["version-guess"]'

Which is really nice and will be very useful.

@abderrahim
Contributor

This PR allows you to see the version of a nested junction element. Currently this is only possible by hacking around with .bst/staged-junctions.

With this PR you can do something like

bst inspect xxxx.bst:freedesktop-sdk.bst | jq '.elements[0].sources[0]["version-guess"]'

You should be able to get the source information already using bst show

bst show --format %{source-info} xxxx.bst:freedesktop-sdk.bst

Unless you're referring to a bug with the above command?

@harrysarson
Contributor

Ah --format %{source-info} is new. And also works!

sources: list[dict[str, str]]

@dataclass
class _ProjectOutput:
Contributor

A lot of project-related data is missing here, such as which plugins have been loaded and their provenances, which cache servers were accessed, and the provenance string of the project itself (i.e. which junction it was loaded from and where).

All of which might not be reported by default.

I think we should cover everything that is reported in LogLine.print_heading() - even if some of this data may be conditional (i.e. user config and access to caches may dictate which artifact/source remotes are active).

Author

A lot of project related data missing here,

such as which plugins have been loaded and their provenances,

Plugins are now exposed, example:

...
 "plugins": [
	{
	  "name": "stack",
	  "description": "core plugin",
	  "plugin_type": "element"
	},
	{
	  "name": "import",
	  "description": "core plugin",
	  "plugin_type": "element"
	},
  ...
]

which cache servers were accessed

Remote state, including cache servers, is currently not included in the inspect output.

the provenance string of the project itself (i.e. which junction it was loaded from and where).

Each project loaded shows the junction from where it comes, example:

...
 "junction": "freedesktop-sdk.bst:plugins/buildstream-plugins.bst",
...

self.stream.set_project(self.project)

# Initialize the inspector
self.inspector = Inspector(self.stream, self.project, self.context)
Contributor

This is not needed for anything outside of the bst inspect command, as such I'd rather this be created directly in _frontend/cli.py within the app.initialized() block.

Author

This has been moved from app.py -> cli.py.

if _artifact.cached():
artifact = {
"files": artifact.get_files(),
"digest": artifact_files._get_digest(),
Contributor

Looks untested, the artifact_files variable appears to be undeclared here.

Also artifact.get_files() reports a CasBasedDirectory; perhaps this patch is expecting the __iter__ implementation to serialize into a list of paths?

Not sure what's going on overall in this code block, but if we want to report files, we need more structured data here, such that we can report file types, symlink targets, file sizes, permissions, etc.

That said; I think it is acceptable to not support reporting artifact files for an initial version of bst inspect.

Author

I've eliminated all of the artifact inspection for this first version.

)
@click.argument("elements", nargs=-1, type=click.Path(readable=False))
@click.pass_obj
def inspect(app, elements, state, deps):
Contributor

As with bst show, we should also support the --except option here.

Author

I've added support for --except in the same way bst show works.


state = self._read_state(element).value

# BUG: Due to the assertion within .get_artifact this will
Contributor

I believe the correct way to go about this is Element._load_artifact().

I think we can avoid pull for now and consider whether bst inspect pulls data from remotes as a possible followup - and strict comes from the Context object here which will be resolved by the toplevel options (defaults to True unless bst --no-strict inspect .... is specified).

Author

Sorry I'm not sure I fully understand this. Are you saying we should avoid pulling the state entirely with bst inspect (for this PR anyway)?

Contributor

I think what you are misunderstanding is that there is a difference between the local and remote caches.

Loading artifacts from the local CAS is expensive, and so is checking the cached state of artifacts. That is why we avoid loading cached state where possible.

In this comment, we’re talking about loading the artifact to check its files and CAS digest. We can do that if the artifact is in the local cache.

Some bst commands also allow pulling from remotes, which can be useful in case the artifacts were built in CI or pushed by another user, but I don’t think having bst inspect support downloading artifacts from remotes is important for an initial implementation.

For this case, you should follow the code paths for bst artifact list-contents in order to see how artifact loading is done… and follow the bst show code path for displaying %{artifact-cas-digest} to observe how to get the digest.

It is acceptable to just omit the data for these in the case that the local artifact is not cached and is thus unavailable.

@gtristan
Contributor

Overall this is coming along nicely... some things which come to mind:

  • We need to consider how this will eventually evolve to support project-less use
    • I.e. there is a certain amount of information which is serialized into artifact metadata, that is extensible and can grow, and bst inspect should (at least eventually) allow inspection of artifact dependency chains loaded from a remote cache, without needing to have a local checkout of the project
    • This means we need to consider whether we intend to support this directly in bst inspect, or if we intend to have that as a separate addition as bst artifact inspect... perhaps the latter seems more sensible
  • We still do not have any reporting of public data here as far as I can see
    • Public data is special as it is mutable
    • We need to consider what is sensible to report here
    • Initially, my feeling is that we should support reporting both original public data and mutated public data separately
      • This might require some added features to the core
      • Displaying original public data is only possible if we have a local project (not loading from a remote artifact)
      • Displaying mutated public data is only possible if the element has a locally cached artifact
  • Looks like this is still missing the ability to show artifact build logs
    • Perhaps not necessary for an initial implementation but shouldn't be too hard to include
    • Definitely should not be included in output by default (only if specified on the command line --fields)

@gtristan
Contributor

gtristan commented Jul 17, 2025

Suggestion: I think we should have a whole "artifact": { … } block which contains only things which are loaded from the artifact, and omit it entirely in the case that the artifact is not present.

This can include:

  • the public data as found in the artifact, which may not be the same as the public data loaded from the bst files before building
  • The cas digest(s)
  • The file metadata
  • The build logs
  • The failure state (whether it is a cached failed build)
  • whether the artifact has a cached build tree

I think this nested structure disambiguates things, solves the public data concern nicely and is all around nicely structured.
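
A minimal sketch of how such an optional block could be serialized, under stated assumptions: the class and field names below (`ArtifactOutput`, `ElementOutput`, `cas_digest`, `failed`, `has_build_tree`) are hypothetical and only loosely follow the list above; they are not the names used in this PR.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ArtifactOutput:
    # Hypothetical fields, loosely following the list above.
    cas_digest: str
    failed: bool
    has_build_tree: bool


@dataclass
class ElementOutput:
    name: str
    # None when no artifact is available in the local cache.
    artifact: Optional[ArtifactOutput] = None


def element_to_dict(element: ElementOutput) -> dict:
    """Serialize an element, dropping the "artifact" key when it is absent."""
    data = asdict(element)
    if data["artifact"] is None:
        del data["artifact"]
    return data


cached = ElementOutput("target.bst", ArtifactOutput("<digest value>", False, True))
uncached = ElementOutput("import-local-files.bst")

print(json.dumps([element_to_dict(cached), element_to_dict(uncached)], indent=2))
```

A consumer can then test for the presence of the "artifact" key to tell whether artifact data was available, rather than interpreting empty or null values.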

Also… I think that given we present a lot of data in the output json, I think maybe we should not do the %{state} thing at all, and leave that only to bst show. Rationale being that the “state” concept is really very user facing, and it can anyway be easily computed based on the json reported by bst inspect

@kevinschoon kevinschoon force-pushed the ks/bst-inspect-command branch from 38e085a to a224502 Compare July 21, 2025 09:11
@kevinschoon
Author

Also… I think that given we present a lot of data in the output json, I think maybe we should not do the %{state} thing at all, and leave that only to bst show. Rationale being that the “state” concept is really very user facing, and it can anyway be easily computed based on the json reported by bst inspect

I have eliminated state entirely from the output.

@kevinschoon kevinschoon force-pushed the ks/bst-inspect-command branch from 09e7542 to 8d32ba3 Compare July 21, 2025 12:27
@kevinschoon kevinschoon force-pushed the ks/bst-inspect-command branch from 4c6752d to c72aa10 Compare July 21, 2025 12:56
@kevinschoon
Author

Here is an example of the current state of the inspect output against a test project.

Any insight into specifically what types of fields we would like to include in this output as well as general feedback would be appreciated.

{
  "project": [
    {
      "duplicates": [],
      "declarations": [],
      "config": {
        "name": "test",
        "directory": "/home/kevin/repos/external/github.com/apache/buildstream/tests/frontend/inspect",
        "aliases": {
          "example": "https://example.org/"
        },
        "element_overrides": {},
        "source_overrides": {},
        "plugins": [
          {
            "name": "stack",
            "description": "core plugin",
            "plugin_type": "element"
          },
          {
            "name": "import",
            "description": "core plugin",
            "plugin_type": "element"
          },
          {
            "name": "local",
            "description": "core plugin",
            "plugin_type": "source"
          },
          {
            "name": "remote",
            "description": "core plugin",
            "plugin_type": "source"
          },
          {
            "name": "tar",
            "description": "core plugin",
            "plugin_type": "source"
          },
          {
            "name": "local",
            "description": "core plugin",
            "plugin_type": "source-mirror"
          },
          {
            "name": "remote",
            "description": "core plugin",
            "plugin_type": "source-mirror"
          },
          {
            "name": "tar",
            "description": "core plugin",
            "plugin_type": "source-mirror"
          }
        ]
      }
    }
  ],
  "user_config": {
    "configuration": "/home/kevin/.config/buildstream.conf",
    "cache_directory": "/home/kevin/.cache/buildstream",
    "log_directory": "/home/kevin/.cache/buildstream/logs",
    "source_directory": "/home/kevin/.cache/buildstream/sources",
    "build_directory": "/home/kevin/.cache/buildstream/build",
    "source_mirrors": "/home/kevin/.cache/buildstream/sources",
    "build_area": "/home/kevin/.cache/buildstream/build",
    "strict_build_plan": "/home/kevin/.config/buildstream.conf",
    "maximum_fetch_tasks": 10,
    "maximum_build_tasks": 4,
    "maximum_push_tasks": 4,
    "maximum_network_retries": 2
  },
  "elements": [
    {
      "name": "import-local-files.bst",
      "description": "",
      "environment": {
        "PATH": "/usr/bin:/bin:/usr/sbin:/sbin",
        "SHELL": "/bin/sh",
        "TERM": "dumb",
        "USER": "tomjon",
        "USERNAME": "tomjon",
        "LOGNAME": "tomjon",
        "LC_ALL": "C",
        "HOME": "/tmp",
        "TZ": "UTC",
        "SOURCE_DATE_EPOCH": "1321009871"
      },
      "variables": {
        "prefix": "/usr",
        "exec_prefix": "/usr",
        "bindir": "/usr/bin",
        "sbindir": "/usr/sbin",
        "libexecdir": "/usr/libexec",
        "datadir": "/usr/share",
        "sysconfdir": "/etc",
        "sharedstatedir": "/usr/com",
        "localstatedir": "/var",
        "lib": "lib",
        "libdir": "/usr/lib",
        "debugdir": "/usr/lib/debug",
        "includedir": "/usr/include",
        "docdir": "/usr/share/doc",
        "infodir": "/usr/share/info",
        "mandir": "/usr/share/man",
        "build-root": "/buildstream/test/import-local-files.bst",
        "conf-root": ".",
        "install-root": "/buildstream-install",
        "strip-binaries": "",
        "schema": "https",
        "project-name": "test",
        "max-jobs": "8",
        "element-name": "import-local-files.bst"
      },
      "dependencies": [],
      "build_dependencies": [],
      "runtime_dependencies": [],
      "sources": [
        {
          "kind": "local",
          "url": "files",
          "medium": "local",
          "version-type": "cas-digest",
          "version": "d8c20623d7160ffe2fd69bd03b3ad7b24a6d1dfd7c96f3ed6c8deb3f268d2d64/85"
        }
      ]
    },
    {
      "name": "import-remote-files.bst",
      "description": "",
      "environment": {
        "PATH": "/usr/bin:/bin:/usr/sbin:/sbin",
        "SHELL": "/bin/sh",
        "TERM": "dumb",
        "USER": "tomjon",
        "USERNAME": "tomjon",
        "LOGNAME": "tomjon",
        "LC_ALL": "C",
        "HOME": "/tmp",
        "TZ": "UTC",
        "SOURCE_DATE_EPOCH": "1321009871"
      },
      "variables": {
        "prefix": "/usr",
        "exec_prefix": "/usr",
        "bindir": "/usr/bin",
        "sbindir": "/usr/sbin",
        "libexecdir": "/usr/libexec",
        "datadir": "/usr/share",
        "sysconfdir": "/etc",
        "sharedstatedir": "/usr/com",
        "localstatedir": "/var",
        "lib": "lib",
        "libdir": "/usr/lib",
        "debugdir": "/usr/lib/debug",
        "includedir": "/usr/include",
        "docdir": "/usr/share/doc",
        "infodir": "/usr/share/info",
        "mandir": "/usr/share/man",
        "build-root": "/buildstream/test/import-remote-files.bst",
        "conf-root": ".",
        "install-root": "/buildstream-install",
        "strip-binaries": "",
        "schema": "https",
        "project-name": "test",
        "max-jobs": "8",
        "element-name": "import-remote-files.bst"
      },
      "dependencies": [],
      "build_dependencies": [],
      "runtime_dependencies": [],
      "sources": [
        {
          "kind": "remote",
          "url": "https://example.org/foo.bar.bin",
          "medium": "remote-file",
          "version-type": "sha256",
          "version": "d1bc8d3ba4afc7e109612cb73acbdddac052c93025aa1f82942edabb7deb82a1"
        },
        {
          "kind": "tar",
          "url": "https://example.org/baz.qux.tar.gz",
          "medium": "remote-file",
          "version-type": "sha256",
          "version": "d1bc8d3ba4afc7e109612cb73acbdddac052c93025aa1f82942edabb7deb82a1"
        }
      ]
    },
    {
      "name": "target.bst",
      "description": "Main stack target for the bst build test",
      "environment": {
        "PATH": "/usr/bin:/bin:/usr/sbin:/sbin",
        "SHELL": "/bin/sh",
        "TERM": "dumb",
        "USER": "tomjon",
        "USERNAME": "tomjon",
        "LOGNAME": "tomjon",
        "LC_ALL": "C",
        "HOME": "/tmp",
        "TZ": "UTC",
        "SOURCE_DATE_EPOCH": "1321009871"
      },
      "variables": {
        "prefix": "/usr",
        "exec_prefix": "/usr",
        "bindir": "/usr/bin",
        "sbindir": "/usr/sbin",
        "libexecdir": "/usr/libexec",
        "datadir": "/usr/share",
        "sysconfdir": "/etc",
        "sharedstatedir": "/usr/com",
        "localstatedir": "/var",
        "lib": "lib",
        "libdir": "/usr/lib",
        "debugdir": "/usr/lib/debug",
        "includedir": "/usr/include",
        "docdir": "/usr/share/doc",
        "infodir": "/usr/share/info",
        "mandir": "/usr/share/man",
        "build-root": "/buildstream/test/target.bst",
        "conf-root": ".",
        "install-root": "/buildstream-install",
        "strip-binaries": "",
        "schema": "https",
        "project-name": "test",
        "max-jobs": "8",
        "element-name": "target.bst"
      },
      "dependencies": [
        "import-local-files.bst",
        "import-remote-files.bst"
      ],
      "build_dependencies": [
        "import-local-files.bst",
        "import-remote-files.bst"
      ],
      "runtime_dependencies": [
        "import-local-files.bst",
        "import-remote-files.bst"
      ],
      "sources": []
    }
  ]
}
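
To illustrate how an external tool might consume this output, here is a small sketch that lists the source URLs of each element. It assumes the top-level "elements" layout shown in this example (which may change as the PR evolves); the function name `source_urls` is hypothetical and the sample is a trimmed copy of the output above.

```python
import json


def source_urls(inspect_output: dict) -> dict:
    """Map each element name to the list of URLs of its sources."""
    return {
        element["name"]: [source["url"] for source in element.get("sources", [])]
        for element in inspect_output.get("elements", [])
    }


# Trimmed copy of the example output above.
sample = json.loads("""
{
  "elements": [
    {"name": "import-local-files.bst",
     "sources": [{"kind": "local", "url": "files"}]},
    {"name": "import-remote-files.bst",
     "sources": [{"kind": "remote", "url": "https://example.org/foo.bar.bin"},
                 {"kind": "tar", "url": "https://example.org/baz.qux.tar.gz"}]},
    {"name": "target.bst", "sources": []}
  ]
}
""")

for name, urls in source_urls(sample).items():
    print(name, urls)
```

In a real script the JSON text would come from the stdout of bst inspect instead of an inline string.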

@gtristan
Contributor

Here is an example of the current state of the inspect output against a test project.

Any insight into specifically what types of fields we would like to include in this output as well as general feedback would be appreciated.

Here are my thoughts for today...

Connecting the dots

Something that we'll absolutely need is a way for scripts to piece together
the data in interesting ways, this means we need reliable ways to make references
to other data in the JSON output.

For instance, when printing a dependency - we need a reliable way to uniquely
address that dependency in the accompanying data.

Element references (dependencies)

How to uniquely address an element dependency in the output may depend on other
things.

Currently with bst show output, we print the full element paths (including
the junction prefixes) which would allow one to make the connection.

In your comment you are proposing:

      "dependencies": [
        "import-local-files.bst",
        "import-remote-files.bst"
      ],
      "build_dependencies": [
        "import-local-files.bst",
        "import-remote-files.bst"
      ],
      "runtime_dependencies": [
        "import-local-files.bst",
        "import-remote-files.bst"
      ],

I think it would be better to have one "dependencies" list following the
full dependency dictionary

E.g.:

      "dependencies" : [
        {
          "name": "foo.bst",
          "junction": "junction.bst",
          "type" : "dependency type"
        }
      ]

Where "type" can be any of the valid dependency types.

This would mean that for a dependency in the toplevel project, the "junction" entry
would be empty, if the element is in a sub-subproject, the junction would be a junction
path, such as "subproject.bst:subsubproject.bst".
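
A sketch of how a script could use such references to look up the element a dependency points at. It assumes, hypothetically, that each serialized element also carries its own "junction" entry (empty for the toplevel project); the function names and sample data are made up for illustration.

```python
def index_elements(elements):
    """Index element dicts by (junction, name).

    The junction is "" for elements of the toplevel project.
    """
    return {(e.get("junction", ""), e["name"]): e for e in elements}


def resolve(dependency, index):
    """Look up the element that a dependency entry refers to."""
    key = (dependency.get("junction", ""), dependency["name"])
    return index[key]


# Hypothetical data: the same file name loaded from two different projects.
elements = [
    {"name": "foo.bst", "junction": "", "description": "toplevel foo"},
    {
        "name": "foo.bst",
        "junction": "subproject.bst:subsubproject.bst",
        "description": "nested foo",
    },
]
index = index_elements(elements)

dep = {
    "name": "foo.bst",
    "junction": "subproject.bst:subsubproject.bst",
    "type": "build",
}
print(resolve(dep, index)["description"])
```

Because the key combines junction and name, two elements called "foo.bst" in different projects remain distinguishable.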

Project junction

In order to piece together the data, we should have a special "junction" parameter
which is serialized as a regular element, serializing data from the junction element
directly.

In this way, the junction's "name" will be a junction element present in the parent
project which the project was loaded from.

Also, this is very helpful as it adds the ability to provide a SourceInfo
list for the junction itself.

Project elements

I think that the elements which were loaded from a given project, should be found
in a nested "elements" list defined within the "project".

E.g.:

{
  "projects" : [
    {
      "name": "freedesktop-sdk",
      "junction" : {
        "name": "freedesktop-sdk.bst",
        "source-info" : [
           ...
        ]
      },

      # Serialize the elements from this project, which are loaded in this pipeline
      "elements" : [
         ...
      ]
    }
  ],
  ...
}

Let's start smaller and build on that

There is a bunch of stuff here whose usefulness is unclear; let's
reduce this dramatically and iteratively add aspects to the serialized data with
careful thought, depending on what it might be useful for.

User config

I think we should just drop the "user_config" stuff, it is unclear whether
any of this will be useful.

Also, it doesn't seem to make sense regarding the BuildStream data
model to dump data about "project config" and "user config" separately.

The finalized data model is an amalgamation of defaults, user configuration,
project configuration and element specific information, and I am skeptical
about serializing/framing the loaded data model in terms of the input
which BuildStream uses to load the data model.

In any case, I suggest we just drop all of this for now.

Project level things

  • "duplicates" and "declarations"

    I don't think we need "duplicates" or "declarations" here, it is not clear
    what you propose "declarations" to actually mean, either.

    About duplicates: Given my previous comment about Connecting the dots above, I do not
    think it is relevant to serialize how a project configures this; instead we focus on how
    the data model was constructed.

    In the case that the same project was loaded twice in the pipeline, they will be
    distinguished by their "junction", and any dependencies on an element that is loaded
    twice in the pipeline, will be able to distinguish which "foo.bst" was being
    referred to, by way of providing strong references to elements, as described
    in my previous comment about Element references.

  • "element_overrides" and "source_overrides"

    I don't know what you intended here, but in either of my interpretations, let's
    drop these.

    If we are talking about overriding elements in subprojects via junctions, then
    let's ignore this detail.

    The resolved data model will show which element was used from which project, even in the
    case that a subproject ends up depending on an element in its parent project, which
    depends on elements in the parent and subproject.

    I think instead you are referring to overriding plugin configuration.

    There is no reason to include this in the serialized json, as it is not a part of the
    resolved data model. These dictionaries are a part of the element composition,
    and we are only concerned with providing the end resulting variables, environment etc in
    the reported elements.

  • "config" dictionary

    I don't think we need any nesting here, I think we can assume that the toplevel
    "project" dictionary is stuff about the project.

  • "directory"

    I don't think we need this, because it is only contextual to where the project itself
    is checked out and built - I don't see how this is useful for a scripting interface.

  • "aliases"

    Again this is input data to the data model, but not relevant I think.

    Ultimately relevant URL results will be reported in the source URLs.

    Later we may consider doing something fancy to report possible mirror URLs for
    a given loaded project/user config - allowing us to print which URLs will be
    traversed in order for a given source - but nobody is asking for this now
    and we don't need it in an initial revision.

Plugin serialization

We need to consider some different useful things for each plugin origin.

We have local plugins, pip loaded plugins, and plugins loaded through junctions.

I think it will make sense to have separate lists here within a "project"
dictionary, specifying more precisely where a plugin came from.

Similarly to my suggestion about the project junction, we should also
serialize the junction from whence the plugins from that origin were
loaded in the same way (so we can have the SourceInfo describing where
plugins came from).

While I feel strongly that fully descriptive data needs to be provided
about loaded plugins, I do not think that serializing the project's plugins
in the bst inspect output is a hard blocker for an initial version.

What we need and don't have

  • We definitely want the cache keys here.

    For each element, at least the equivalent of "%{full-key}" from
    bst show, or empty if the element or its dependencies are in
    an inconsistent state (inconsistent state means that the element
    or its dependencies are missing a resolved, specific source reference).

  • Element description

    This is easy, and can be useful for generating nice reports

  • Artifact cas digest

    I'm on the fence about whether we need to have this in an initial
    revision, not having it is a step down from bst show.

    If we add this, we should have it in a nested "artifact" block
    found on the element dictionary which is enumerated on a per
    project basis.

    E.g.:

    {
      "projects" : [
        {
          ...
          "elements" : [
            {
              "artifact" : {
                "cas-digest" : "<digest value>"
              }
            }
          ]
        }
      ]
    }
    
    This will allow extensibility for other things from the artifact
    which we'll want to add later, like files metadata, build logs, etc.

@juergbi
Contributor

juergbi commented Jul 25, 2025

Connecting the dots

Something that we'll absolutely need is a way for scripts to piece together the data in interesting ways, this means we need reliable ways to make references to other data in the JSON output.

I fully agree. In addition to making sure that dependencies are fully qualified, also the element definitions need to be fully qualified.

And I think we want to make sure that they use the same structure. As far as I can tell, this branch currently uses _get_full_name() in both places, which results in qualified strings such as foo.bst:bar.bst when junctions are involved. This seems perfectly reasonable to me.

If we want to extract the junction part to a separate JSON member, we should do it also in element definitions for consistency. However, I suspect that the current approach with a single colon-separated string will be more convenient in practice, as there will be no need to do any string manipulation just for 'connecting the dots'. And it also matches bst show.
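
For comparison, if a consumer did need the junction part on its own, splitting a colon-separated name is short. A sketch, assuming that the element name is whatever follows the last colon; `split_full_name` is a hypothetical helper, not part of BuildStream.

```python
def split_full_name(full_name: str) -> tuple:
    """Split "junction.bst:sub.bst:element.bst" into (junction path, element name).

    Assumes the element name is whatever follows the last colon; the
    junction path is "" for elements of the toplevel project.
    """
    junction, _, name = full_name.rpartition(":")
    return junction, name


print(split_full_name("foo.bst:bar.bst"))
print(split_full_name("target.bst"))
```

When both element definitions and dependencies use the same qualified string, no splitting is needed at all: the string itself can serve as the lookup key.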

@kevinschoon
Author

I think we should just drop the "user_config" stuff, it is unclear whether any of this will be useful.

I've dropped all of the user config.

@kevinschoon
Author

I've re-written the dependencies section of the elements to now output their names and junctions as you described.

Example:

   ....
    {
      "name": "bootstrap/base-sdk/automake.bst",
      "junction": "freedesktop-sdk.bst",
      "kind": "all"
    },
    {
      "name": "bootstrap/build/rsync.bst",
      "junction": "freedesktop-sdk.bst",
      "kind": "all"
    },
  ...

Non-junction output:

  "dependencies": [
    {
      "name": "import-local-files.bst",
      "kind": "all"
    },
    {
      "name": "import-remote-files.bst",
      "kind": "all"
    },
    {
      "name": "target.bst",
      "kind": "all"
    }
  ]

@kevinschoon
Author

I've continued to make updates based on your feedback, although this is still a WIP. Please see the updated project output:

{
  "projects": [
    {
      "name": "test",
      "plugins": [
        {
          "name": "stack",
          "description": "core plugin",
          "plugin_type": "element"
        },
        {
          "name": "import",
          "description": "core plugin",
          "plugin_type": "element"
        },
        {
          "name": "local",
          "description": "core plugin",
          "plugin_type": "source"
        },
        {
          "name": "remote",
          "description": "core plugin",
          "plugin_type": "source"
        },
        {
          "name": "tar",
          "description": "core plugin",
          "plugin_type": "source"
        }
      ],
      "elements": [
        {
          "name": "import-local-files.bst",
          "description": "",
          "environment": {},
          "variables": {
            "prefix": "/usr",
            "exec_prefix": "/usr",
            "bindir": "/usr/bin",
            "sbindir": "/usr/sbin",
            "libexecdir": "/usr/libexec",
            "datadir": "/usr/share",
            "sysconfdir": "/etc",
            "sharedstatedir": "/usr/com",
            "localstatedir": "/var",
            "lib": "lib",
            "libdir": "/usr/lib",
            "debugdir": "/usr/lib/debug",
            "includedir": "/usr/include",
            "docdir": "/usr/share/doc",
            "infodir": "/usr/share/info",
            "mandir": "/usr/share/man",
            "build-root": "/buildstream/test/import-local-files.bst",
            "conf-root": ".",
            "install-root": "/buildstream-install",
            "strip-binaries": "",
            "schema": "https",
            "project-name": "test",
            "max-jobs": "8",
            "element-name": "import-local-files.bst"
          },
          "dependencies": [
            {
              "name": "import-local-files.bst",
              "kind": "all"
            }
          ],
          "sources": [
            {
              "kind": "local",
              "url": "files",
              "medium": "local",
              "version-type": "cas-digest",
              "version": "d8c20623d7160ffe2fd69bd03b3ad7b24a6d1dfd7c96f3ed6c8deb3f268d2d64/85"
            }
          ]
        },
        {
          "name": "import-remote-files.bst",
          "description": "",
          "environment": {},
          "variables": {
            "prefix": "/usr",
            "exec_prefix": "/usr",
            "bindir": "/usr/bin",
            "sbindir": "/usr/sbin",
            "libexecdir": "/usr/libexec",
            "datadir": "/usr/share",
            "sysconfdir": "/etc",
            "sharedstatedir": "/usr/com",
            "localstatedir": "/var",
            "lib": "lib",
            "libdir": "/usr/lib",
            "debugdir": "/usr/lib/debug",
            "includedir": "/usr/include",
            "docdir": "/usr/share/doc",
            "infodir": "/usr/share/info",
            "mandir": "/usr/share/man",
            "build-root": "/buildstream/test/import-remote-files.bst",
            "conf-root": ".",
            "install-root": "/buildstream-install",
            "strip-binaries": "",
            "schema": "https",
            "project-name": "test",
            "max-jobs": "8",
            "element-name": "import-remote-files.bst"
          },
          "dependencies": [
            {
              "name": "import-remote-files.bst",
              "kind": "all"
            }
          ],
          "sources": [
            {
              "kind": "remote",
              "url": "https://example.org/foo.bar.bin",
              "medium": "remote-file",
              "version-type": "sha256",
              "version": "d1bc8d3ba4afc7e109612cb73acbdddac052c93025aa1f82942edabb7deb82a1"
            },
            {
              "kind": "tar",
              "url": "https://example.org/baz.qux.tar.gz",
              "medium": "remote-file",
              "version-type": "sha256",
              "version": "d1bc8d3ba4afc7e109612cb73acbdddac052c93025aa1f82942edabb7deb82a1"
            }
          ]
        },
        {
          "name": "target.bst",
          "description": "Main stack target for the bst build test",
          "environment": {},
          "variables": {
            "prefix": "/usr",
            "exec_prefix": "/usr",
            "bindir": "/usr/bin",
            "sbindir": "/usr/sbin",
            "libexecdir": "/usr/libexec",
            "datadir": "/usr/share",
            "sysconfdir": "/etc",
            "sharedstatedir": "/usr/com",
            "localstatedir": "/var",
            "lib": "lib",
            "libdir": "/usr/lib",
            "debugdir": "/usr/lib/debug",
            "includedir": "/usr/include",
            "docdir": "/usr/share/doc",
            "infodir": "/usr/share/info",
            "mandir": "/usr/share/man",
            "build-root": "/buildstream/test/target.bst",
            "conf-root": ".",
            "install-root": "/buildstream-install",
            "strip-binaries": "",
            "schema": "https",
            "project-name": "test",
            "max-jobs": "8",
            "element-name": "target.bst"
          },
          "dependencies": [
            {
              "name": "import-local-files.bst",
              "kind": "all"
            },
            {
              "name": "import-remote-files.bst",
              "kind": "all"
            },
            {
              "name": "target.bst",
              "kind": "all"
            }
          ],
          "sources": []
        }
      ]
    }
  ],
  "defaults": {
    "environment": {
      "PATH": "/usr/bin:/bin:/usr/sbin:/sbin",
      "SHELL": "/bin/sh",
      "TERM": "dumb",
      "USER": "tomjon",
      "USERNAME": "tomjon",
      "LOGNAME": "tomjon",
      "LC_ALL": "C",
      "HOME": "/tmp",
      "TZ": "UTC",
      "SOURCE_DATE_EPOCH": "1321009871"
    }
  }
}
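As an illustration of how an external tool might consume this output, here is a small sketch that maps each element to its source URLs. It assumes the WIP layout shown above (`projects`, then `elements`, then `sources`), which may still change; `source_urls` is a hypothetical helper, and the embedded document is a trimmed copy of the output above rather than real command output.

```python
import json

# Trimmed copy of the output above; a real tool would read the inspect output instead.
raw = """
{"projects": [{"name": "test", "elements": [
  {"name": "import-local-files.bst", "sources": [{"kind": "local", "url": "files"}]},
  {"name": "target.bst", "sources": []}
]}]}
"""


def source_urls(doc: dict) -> dict:
    """Map each element name to the list of URLs of its sources."""
    return {
        element["name"]: [source["url"] for source in element["sources"]]
        for project in doc["projects"]
        for element in project["elements"]
    }


urls = source_urls(json.loads(raw))
print(urls)
```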

@kevinschoon
Author

One thing to note: I added a top-level defaults field which contains the default environment loaded from the default configuration, so we don't have to duplicate redundant information within each element. Happy to remove this if you think the previous, duplicated way is still better, though.

@gtristan
Contributor

One thing to note: I added a top-level defaults field which contains the default environment loaded from the default configuration, so we don't have to duplicate redundant information within each element. Happy to remove this if you think the previous, duplicated way is still better, though.

Yes, it is interesting to consider the optimization this might provide for external tooling which wants to look at the env vars of many elements.

I wouldn’t want to frame such an optimization in terms of project defaults though, as this would imply the user repeats the composition algorithm, which is a bit more complex than just elements overriding project settings.

Also, we need to construct data which can equally be obtained by loading a local buildstream project, or, by downloading artifacts, so we really want to be dealing with only resolved element data which can potentially be encoded into artifact metadata (not all data is currently included in artifact metadata, but that is an interesting thing to enrich).

Perhaps there is some semantic we could have to reduce redundant data in the output, especially when the requested output fields include things which typically repeat (like env vars). I think it would also be fine to punt that optimization down the road, though.

@gtristan
Contributor

gtristan commented Jul 25, 2025

I've continued to make updates based on your feedback, although this is still a WIP. Please see the updated project output:

This is great progress and already looking much better!

{
  "projects": [
    {
      "name": "test",
      "plugins": [
        {
          "name": "stack",
          "description": "core plugin",
          "plugin_type": "element"
        },

For the plugins, we will need to nest this into individual origins.

This way we can see an origin defined with its junction or its pip package and version, and see which plugins were loaded from that origin.

Also, I think it will be better to list the different plugin types as lists, i.e. separate lists for element, source, and source mirror plugins within a plugin origin block. This might be more a matter of taste, but it would seem to avoid many redundant “plugin_type” members.

Finally, I think we can drop “description” from plugins. I think it was inspired by an unstructured string intended for user comprehension, coming from _frontend/widget.py. Defining the origins in a more structured way will be more useful for introspection and interoperability purposes anyway.
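To make the suggested shape concrete, here is a rough sketch of nesting plugins under their origin, with one list per plugin type. Everything named here is an assumption for illustration: the `PluginOrigin` dataclass, its field names, the `"origin"` key on the flat records (the current output has no such member), and the `"core"` origin label. The real schema is still to be decided in this PR.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PluginOrigin:
    # Hypothetical schema: one block per origin, one list per plugin type.
    origin: str
    elements: list = field(default_factory=list)
    sources: list = field(default_factory=list)
    source_mirrors: list = field(default_factory=list)


def group_plugins(flat: list) -> list:
    """Nest flat {name, plugin_type, origin} records under their origin."""
    list_for = {"element": "elements", "source": "sources", "source-mirror": "source_mirrors"}
    origins = {}
    for plugin in flat:
        entry = origins.setdefault(plugin["origin"], PluginOrigin(plugin["origin"]))
        getattr(entry, list_for[plugin["plugin_type"]]).append(plugin["name"])
    return [asdict(entry) for entry in origins.values()]


# Flat records as in the current output, plus an assumed "origin" key.
flat = [
    {"name": "stack", "plugin_type": "element", "origin": "core"},
    {"name": "import", "plugin_type": "element", "origin": "core"},
    {"name": "local", "plugin_type": "source", "origin": "core"},
]

nested = group_plugins(flat)
print(json.dumps(nested, indent=2))
```

A real origin block would presumably also carry the junction name or the pip package and version, as described above; those are left out here because their exact form is not specified in this thread.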

@abderrahim
Contributor

I gave this a try and it crashed with the following traceback. (Yes, I do have an open workspace).

[--:--:--][        ][    main:core activity                 ] BUG     Object of type Workspace is not JSON serializable

    Traceback (most recent call last):
      File "/home/abderrahimkitouni/.local/bin/bst", line 10, in <module>
        sys.exit(cli())
                 ~~~^^
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/click/core.py", line 1128, in __call__
        return self.main(*args, **kwargs)
               ~~~~~~~~~^^^^^^^^^^^^^^^^^
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/buildstream/_frontend/cli.py", line 273, in override_main
        original_main(self, args=args, prog_name=prog_name, complete_var=None, standalone_mode=standalone_mode, **extra)
        ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/click/core.py", line 1053, in main
        rv = self.invoke(ctx)
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/click/core.py", line 1659, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
                               ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/click/core.py", line 1395, in invoke
        return ctx.invoke(self.callback, **ctx.params)
               ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/click/core.py", line 754, in invoke
        return __callback(*args, **kwargs)
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/click/decorators.py", line 38, in new_func
        return f(get_current_context().obj, *args, **kwargs)
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/buildstream/_frontend/cli.py", line 600, in inspect
        inspector.dump_to_stdout(elements, except_=except_, selection=deps)
        ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/abderrahimkitouni/.local/share/uv/tools/buildstream/lib/python3.13/site-packages/buildstream/_frontend/inspect.py", line 251, in dump_to_stdout
        json.dump(_dump_dataclass(output), sys.stdout)
        ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.13/json/__init__.py", line 179, in dump
        for chunk in iterable:
                     ^^^^^^^^
      File "/usr/lib/python3.13/json/encoder.py", line 433, in _iterencode
        yield from _iterencode_dict(o, _current_indent_level)
      File "/usr/lib/python3.13/json/encoder.py", line 407, in _iterencode_dict
        yield from chunks
      File "/usr/lib/python3.13/json/encoder.py", line 326, in _iterencode_list
        yield from chunks
      File "/usr/lib/python3.13/json/encoder.py", line 407, in _iterencode_dict
        yield from chunks
      File "/usr/lib/python3.13/json/encoder.py", line 326, in _iterencode_list
        yield from chunks
      File "/usr/lib/python3.13/json/encoder.py", line 407, in _iterencode_dict
        yield from chunks
      File "/usr/lib/python3.13/json/encoder.py", line 440, in _iterencode
        o = _default(o)
      File "/usr/lib/python3.13/json/encoder.py", line 180, in default
        raise TypeError(f'Object of type {o.__class__.__name__} '
                        f'is not JSON serializable')
    TypeError: Object of type Workspace is not JSON serializable
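One possible way to avoid this class of crash is to give the encoder a fallback through the `default` parameter of `json.dump` / `json.dumps`. The sketch below only reproduces the pattern: the `Workspace` class is a stand-in with an assumed `path` attribute, not the real BuildStream class, and `to_jsonable` is a hypothetical helper. Converting the workspace to plain data before it reaches the output dataclasses would work just as well.

```python
import json


class Workspace:
    """Stand-in for the real Workspace class; the 'path' attribute is an assumption."""

    def __init__(self, path: str):
        self.path = path


def to_jsonable(obj):
    """Fallback for the json encoder, called only for objects it cannot encode itself."""
    if isinstance(obj, Workspace):
        return {"path": obj.path}
    # Keep the standard behaviour for anything else we do not know about.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


output = {"elements": [{"name": "target.bst", "workspace": Workspace("/tmp/ws")}]}
encoded = json.dumps(output, default=to_jsonable)
print(encoded)
```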

@abderrahim
Contributor

Another thing I noticed is that all elements are included under all projects, even the plugins project that has no elements of its own.
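For illustration, the expected grouping could look something like the sketch below, which files each element under the project named in its `project-name` variable (a variable that appears in the output earlier in this thread). This is an assumption about how ownership might be derived, not how the PR computes it, and `elements_by_project` and the sample project names are hypothetical.

```python
from collections import defaultdict


def elements_by_project(elements: list) -> dict:
    """Group element names by the project named in their 'project-name' variable."""
    grouped = defaultdict(list)
    for element in elements:
        grouped[element["variables"]["project-name"]].append(element["name"])
    return dict(grouped)


# Illustrative records; only the members needed for grouping are included.
elements = [
    {"name": "target.bst", "variables": {"project-name": "test"}},
    {"name": "bootstrap/build/rsync.bst", "variables": {"project-name": "freedesktop-sdk"}},
]

grouped = elements_by_project(elements)
print(grouped)
```

With this kind of grouping, a project that only provides plugins would end up with no elements at all instead of a copy of every element.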

6 participants