
Conversation

barne856 (Contributor) commented Aug 3, 2025

Closes #109 Replaces #547

Introduces a stone.lock file to keep builds reproducible when git tags or branches are re-pointed to different commits.

Changes to recipes

This is a breaking change and will require any recipes that use git upstreams with tags to specify them using the tag key.

Before:

upstreams:
    - git|https://github.com/serpent-os/boulder : v1.0.1

After:

upstreams:
    - git|https://github.com/serpent-os/boulder:
        tag: v1.0.1

Packagers can also specify the rev (commit hash) directly, like this:

upstreams:
    - git|https://github.com/serpent-os/boulder : a94a8fe5ccb19ba61c4c0873d391e987982fbbd3

Or through the rev key, like this:

upstreams:
    - git|https://github.com/serpent-os/boulder:
        rev: a94a8fe5ccb19ba61c4c0873d391e987982fbbd3

A branch key is also possible, but only one of tag, branch, or rev can be provided in the stone.yaml.
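For completeness, a branch-based upstream would be written the same way; a minimal sketch, assuming the same syntax as the tag and rev keys (the branch name below is just a placeholder):

upstreams:
    - git|https://github.com/serpent-os/boulder:
        branch: main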

Lock File Design

  • When you run boulder build on a recipe with a git tag or branch, the tag or branch is resolved to a specific commit hash (rev) at that moment
  • Only git type upstreams that specify either a tag or a branch will be recorded in the lock file
  • Boulder then saves this information to a new stone.lock file in the same directory as the provided recipe file, but only after the build succeeds
  • On subsequent builds, boulder will use the locked commit hash from stone.lock (and print a warning if the freshly resolved hash does not match the lock file)

Example stone.lock generated for the aom recipe

# This file is automatically generated by boulder.
# It is not intended for manual editing.

upstreams:
- uri: git|https://aomedia.googlesource.com/aom.git
  tag: v3.12.1
  rev: fc5cf6a132697487fbaa9965b249012e0238768f

Future Considerations

This brings up a question for a follow-up discussion: How should a developer intentionally update the lock file if a tag or branch has been updated on the remote and they want the new version?

Currently, the only way to refresh the commit hash would be to manually delete the stone.lock file and run the build again.

Should we create a dedicated command for this like boulder lock --update, or should we expand the scope of an existing command like boulder recipe update to also handle refreshing the lock file?

Just deleting the file and rebuilding doesn't necessarily seem like a bad option either, but I'm open to thoughts on the best path forward here.

ermo (Member) commented Aug 3, 2025

Out of curiosity, if we can generate a lock file, could we not also update the fields in the recipe itself where they're immediately visible (instead of hidden away)?

barne856 (Contributor, Author) commented Aug 3, 2025

Out of curiosity, if we can generate a lock file, could we not also update the fields in the recipe itself where they're immediately visible (instead of hidden away)?

Yes, that is a valid approach to consider as well. My thinking was to treat the stone.yml as the user-controlled file that expresses intent (e.g., I want to follow this tag), while the stone.lock file is the machine-generated artifact that records the resolved state for that intent.

The pattern is pretty common in other package managers (Cargo.toml / Cargo.lock, package.json / package-lock.json), and it felt like a good model to follow here too while keeping the stone.yml more readable. Although at the moment, I only really see a use for it to resolve git style upstreams, as the plain style upstreams still require the full hash to be provided in the yml file...
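For reference, a plain upstream today carries its hash directly in the recipe, roughly like this (the archive URL and sha256 below are placeholders, not taken from a real recipe):

upstreams:
    - https://example.org/foo-1.0.tar.xz : 0000000000000000000000000000000000000000000000000000000000000000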

barne856 (Contributor, Author) commented Aug 3, 2025

This would be out of scope for this PR, but do you see anything else whose state we would want to resolve during or after the build process, e.g., the versions of builddeps used or even the hash of the final built stone file?

barne856 (Contributor, Author) commented Aug 3, 2025

If we do adopt a lock file approach, it may be beneficial to include some metadata in the lock file, at the very least a format version string.
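For example, a lock file carrying a format version might look something like this (the version key and its value are hypothetical, layered on top of the aom example above):

# This file is automatically generated by boulder.
# It is not intended for manual editing.
version: "1"

upstreams:
- uri: git|https://aomedia.googlesource.com/aom.git
  tag: v3.12.1
  rev: fc5cf6a132697487fbaa9965b249012e0238768f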

jplatte (Member) commented Aug 3, 2025

This is a breaking change and will require any recipes that use git upstreams with tags to specify them using the tag key.

Question to @ermo: Do we want to do breaking changes at this point? I tend to prefer smooth transitions with deprecation warnings and such. Of course that's a little more implementation work, but probably not actually that much here?

ermo (Member) commented Aug 4, 2025

Regarding breaking changes, the repo is currently small enough that, in general, any breaking change we make just needs a follow-up PR in the recipes repo to fix the fallout. This is also the reason that we have a separate repo named "volatile" -- there is no guarantee that this repo will be in a working state.

That said, a smooth transition process with deprecation warnings does sound like a nice thing to have, particularly in a production-ready setting. For now, while we are still alpha, I am asking myself whether we want to be constrained by such a process.

In some ways, if the tooling goes "I'm sorry Dave, I'm afraid I can't do that. However, if we do it this way, you're good to go.", then that would essentially be the deprecation warning and the smooth process?

Comments and input welcome.

Regarding the lock-file, the manifest is in many ways similar to that. We have already seen one user vociferously complain that, in their opinion:

  • There were too many build files
  • It seemed odd to commit binary data to git
  • Lockfiles would be a better approach

My thinking is that we need to consider the following design properties:

  • Recipes should express intent (as mentioned) and be mostly devoid of superfluous ceremony. Anything that can be auto-discovered should be auto-discovered for ergonomics.
  • Hence, there should exist a build-generated artefact (or set of artefacts) that contain the discovered consequences of the intent expressed in the recipes at the time they were built.
    • Currently, manifests capture the data necessary to resolve DAGs when the infra schedules builds and are also quick to parse. They represent the minimum info necessary for scheduling builds.
    • We would like to also capture installed files in order to be able to provide a remote repo index for reverse searches of the form moss what-provides <a file under /usr>.
    • The manifest format was made binary to avoid clever developers trying to shirk the requirement to provide a manifest as proof-of-build, as the point is to never commit things that don't build locally.
    • I have seen the requirement to build and test locally be a point of contention -- some people don't think it is their responsibility to check that the stuff they change actually works, and will make comments about this surely being the responsibility of CI and other people testing their changes.
    • The ethos about the person making the change being responsible for testing is a long-standing one in Solus-derived packaging workflows. It serves to ensure that we never commit recipes that don't actually build at the point where they were committed.
    • That said, one of our goals is to enable what we call a "try-build" approach, where we can take a recipe change and ask the infra to schedule a remote build to see if the recipe change worked, and if it doesn't, report the error back. This in some ways serves to offload the "proof-of-build", though it does not remove it.

In summary, it seems to me that we have an opportunity to rethink our approach as it relates to both the discovery process itself, and how we capture/encode that in a git friendly way without losing the "proof of build" guarantees associated with recipe updates.

barne856 (Contributor, Author) commented Aug 5, 2025

Thanks for looking into this. I appreciate the context.

TBH I haven't looked into the manifest details at all yet, so feel free to correct me, but it sounds like there's definitely some overlap between what manifests and lock files are trying to solve, and there is an opportunity to unify these concepts into something more ergonomic.

I'm pretty much on board with the user feedback you mentioned:

  • There were too many build files
  • It seemed odd to commit binary data to git
  • Lockfiles would be a better approach

I think if we migrated to a single text-based stone.lock (YAML or eventually KDL?), this could satisfy all the design properties you outlined.

If the main reason for choosing binary was to prevent people from circumventing local builds, I think CI can solve that problem just as effectively. And I don't see any reason why a lock file couldn't capture installed files too.
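As a very rough illustration of what a unified, text-based stone.lock could carry, here is a speculative sketch; the version, packages, and files keys are not part of this PR, and the path is just illustrative:

version: "1"

upstreams:
- uri: git|https://aomedia.googlesource.com/aom.git
  tag: v3.12.1
  rev: fc5cf6a132697487fbaa9965b249012e0238768f

# Speculative: data the binary manifest currently captures
packages:
- name: aom
  files:
  - /usr/lib/libaom.so.3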

We could enforce that the submitted stone.lock is current by having CI run boulder build and verify there are no changes to the lock file. A smart CI process could rebuild only the packages with changes. Obviously compute isn't free though, so this really depends on how many rebuilds we're typically looking at. If we start seeing several large PRs each month with tons of package changes, that could get expensive pretty quickly on the org's GitHub Actions quota. Self-hosted runners on your own infra, or just pulling and building yourself locally, are also options. Regardless, appropriate safeguards should be in place to prevent excessive CI runs.
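A minimal CI check along those lines might look roughly like this GitHub Actions sketch; how boulder gets onto the runner and the recipe path are hand-waved assumptions, not part of this PR:

name: verify-lockfile
on: pull_request

jobs:
  check-lock:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumes boulder is already available on the runner (e.g. a self-hosted runner or a prepared image)
      - name: Rebuild the changed recipe
        run: boulder build path/to/stone.yaml
      - name: Fail if stone.lock is stale
        run: git diff --exit-code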

some people don't think it is their responsibility to check that the stuff they change actually works

I think this is a shared responsibility. Of course I would expect anyone contributing changes to test that they work, but ultimately we do need to provide a mechanism to verify that was done (hence CI checks and branch protection). This gives automated feedback to the contributor if they've done something obviously wrong or unacceptable.

one of our goals is to enable what we call a "try-build" approach

I'm not familiar with the infra setup, but could this potentially work through CI with self-hosted runners? Just thinking out loud.

One other thought: since you mentioned auto-discovery for ergonomics, we could probably auto-discover hashes for plain upstreams and store those in the lock file too.
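For instance, the recipe could then declare a plain upstream by URL alone, and the lock file would record the discovered hash, perhaps like this (a purely speculative layout; the hash is a dummy value):

upstreams:
- uri: https://example.org/foo-1.0.tar.xz
  sha256: 0000000000000000000000000000000000000000000000000000000000000000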

Approach-wise, we could merge this PR as-is to get the lock file foundation in place, then tackle the manifest migration incrementally: keep both systems running in parallel and gradually add lock file generation for the same data currently in manifests. We could probably compress the full migration into 2-3 PRs?

barne856 marked this pull request as draft on August 8, 2025 04:48