Skip to content

Just-ify CI#322

Merged
rustaceanrob merged 1 commit into2140-dev:masterfrom
nyonson:justify-ci
Apr 17, 2025
Merged

Just-ify CI#322
rustaceanrob merged 1 commit into2140-dev:masterfrom
nyonson:justify-ci

Conversation

@nyonson
Copy link
Collaborator

@nyonson nyonson commented Apr 1, 2025

Some more deets in the commit message, but quick hits.

  • Build times are faster, everything runs in about a minute. Probably due to the caching added by the consistent use of the setup rust plugin.
  • The setup rust plugin has a neat problem matchers feature where it shows lint/fmt errors in the github UI itself.
  • The justfile holds all the build complexity so tried to have each task commented with a description. Used a little dispatcher pattern so only a single test recipe is exposed with an arg to run a specific suite.
  • Lockfile complexity has been relegated to the _test-msrv recipe. The check no longer depends on a checked in lockfile which technically doesn't matter much since consumers create their own.

Nice clean interface for contributors:

image

@nyonson nyonson force-pushed the justify-ci branch 9 times, most recently from afbc8fb to 4b86fd6 Compare April 2, 2025 20:46
@nyonson nyonson marked this pull request as ready for review April 2, 2025 20:48
@nyonson nyonson force-pushed the justify-ci branch 2 times, most recently from 4457fa3 to a533711 Compare April 2, 2025 21:00
@nyonson nyonson force-pushed the justify-ci branch 7 times, most recently from f40e237 to 088e5b3 Compare April 3, 2025 03:44
@rustaceanrob
Copy link
Collaborator

rustaceanrob commented Apr 3, 2025

I may add some individual comments but this is my overall stance after viewing this:

  1. Obfuscating the complexity to a separate file doesn't remove it. If anything, now future maintainers have to know that the logic of the CI is not contained in the typical file. Just because a file is particularly ugly doesn't mean it's wrong. If I see a CI failure I would like to know exactly what line in the yaml failed personally, without having to cross-reference with the justfile.
  2. The setup-toolchain action is great and I would like to at least preserve that
  3. Code comments should be kept to an absolute minimum because ultimately many lines in the repository could deserve some sort of explanation - Rust is complicated. We can get to those clippy lints another time, but I don't see the upside of adding the comments.
  4. Wasn't the whole point of adding the resolver so we don't have to take any special care to the MSRV lockfile? If it is simply because you can't execute certain shell commands with just, we should be ditching just for CI, not the resolver. The more files we add, like rust-toolchain.toml, is just more maintenance overhead.
  5. If we are going to use just more heavily, I think we should look into passing arguments to commands. Prefixing things with test doesn't do anything but make one add a hyphen to type a command imo. It would be nice if instead we had: just test [lib/doc/sync/integration] which ran each type of test depending on the argument.

@nyonson
Copy link
Collaborator Author

nyonson commented Apr 3, 2025

  1. Obfuscating the complexity to a separate file doesn't remove it. If anything, now future maintainers have to know that the logic of the CI is not contained in the typical file. Just because a file is particularly ugly doesn't mean it's wrong. If I see a CI failure I would like to know exactly what line in the yaml failed personally, without having to cross-reference with the justfile.

The goal isn't to remove complexity or avoid the YAML. If a tool does its job its OK to look ugly (like nix). But I think there are a few benefits to using a special purpose task runner which can be weighed against the cost of the introduced indirection.

  1. Running commands locally helps us avoid accidentally depending on subtle parts of the CI environment. And a task runner makes it easier for contributors to run the same tests locally. I am more likely to run just check than to go over to .github/workflows/ci.yaml and copy/paste the exact cargo commands.
  2. A task runner also allows for quicker iterations locally when makin any changes to the CI pipe, avoiding the push, wait for github, get strange syntax error loops.
  3. And while there is indirection (CI config -> just), I think its helpful to have this file at the root vs. under a hidden directory. Developers and CI are two callers to the shared definitions at the heart of a project. I think ideally the ci.yaml file is almost static, while the justfile changes with new requirements.

Just is a nice interface, but doesn't have to be the task runner. I was also looking into cargo aliases which are cool cause no new dependency. But they are also defined in a file under a hidden directory and don't offer great cross platform support if you try to run more than a single cargo command. Looks like they defer to shell calls or recommend writing little rust programs which is interesting, but feels a little much for now.

  1. Code comments should be kept to an absolute minimum because ultimately many lines in the repository could deserve some sort of explanation - Rust is complicated. We can get to those clippy lints another time, but I don't see the upside of adding the comments.

Not sure I agree on keeping comments to a minimum, although they can be a sign things are getting too complicated? Guess I am just not offended by over documentation. It does require upkeep, but doesn't usually actively hurt and can always be easily deleted. Stuff doesn't build on top of documentation like it does with code. And for build stuff in particular, lots of times even small declarative changes have a bunch of subtle effects which are helpful to document.

For the clippy comments, I usually like to document why I am turning off something. Kinda justifying the short cut. It might help in the future if the filter-control feature is refactored to remind us to remove the short cuts instead of just unintentionally living on leaving an opening for leaks.

  1. Wasn't the whole point of adding the resolver so we don't have to take any special care to the MSRV lockfile? If it is simply because you can't execute certain shell commands with just, we should be ditching just for CI, not the resolver. The more files we add, like rust-toolchain.toml, is just more maintenance overhead.

I don't think I have made the lockfile stuff clear, so dropping some high level thoughts.

The lockfile for a library doesn't matter to the consumers since they will generate their own dependency set. But they do care about the dependency constraints of a library since they will have to deal with them to generate that lockfile. So the goal of a library is to gain confidence that its dependency constraints are as flexible as possible (easy for consumers) while capturing the requirements of the code. The worst case scenario is the constraints are too strict and actually impossible to satisfy. An example would be a low MSRV paired with high versions for some dependencies. Its impossible to satisfy a 1.63.0 MSRV and a recent version tokio. The v2 resolver doesn't detect this, but the v3 resolver fails, so at least lets the maintainers know. Generating a lockfile is proof for the maintainers that the constraints work. Maintainers could generate lockfiles for every possible combination of dependency versions for the highest level of confidence the constraints are good, but that might actually take forever. I think the best middle ground is generating an MSRV compliant version set as a bare minimum, and then some sort of minimum set and max set to make sure the extremes work. There is still potential for some version set in the middle to cause issues, but feels like a lower chance without that high of a cost.

I think generating an MSRV lockfile is very useful check to just ensure dependency constraints are possible. But maybe just that generation should be added to the CI pipe and less care given to whatever lockfile happens to be checked in?

The rust-toolchain.toml is a bit tangential. It is just a soft push for contributors to get them the right toolchain out of the gate using rustup which picks it up pretty well since version 1.28. I think the one footgun it may help avoid is when using general toolchain specs with cargo like +stable, that matches any stable toolchain installed. So very possible contributors are all using different toolchains. As far as I can tell, rust-toolchain.toml doesn't enforce anything and is just a nice push in the right direction. I don't think CI or anything should depend on it, so doesn't have to be used.

  1. If we are going to use just more heavily, I think we should look into passing arguments to commands. Prefixing things with test doesn't do anything but make one add a hyphen to type a command imo. It would be nice if instead we had: just test [lib/doc/sync/integration] which ran each type of test depending on the argument.

Prefixing is just a form of namespacing, but I don't have a strong opinion between them and instead taking an argument. Conditionals in just are a little weird, but maybe there is a smoother pattern to dispatch an arg to a certain command.

@nyonson nyonson force-pushed the justify-ci branch 5 times, most recently from a3dd967 to eec6517 Compare April 4, 2025 22:00
Cargo.toml Outdated
path = "example/managed.rs"
required-features = ["filter-control"]

[lints.rust]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some cases where intermediate commits might have lints, but with the intention of them being removed in a later commit. For instance making a new struct and unit testing it will result in lints because it isn't used yet, but I would consider that a good software practice

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this is just too strict a setting? I flipped it on because there was only like 2 warnings and easy to fix. I think its one of those things that would be easier to enforce now, cause I've accepted that anything not tested in CI slips through. But don't have a super strong take on this setting.

rust-bitcoin has a policy that every commit needs to pass CI. While strict, I appreciate the consistent bar for commits on master. Obviously depends on the merge strategy though since squash would make the effort worthless.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the purpose of this library I think it is pretty strict. By exporting a type as public, there is no lint incurred even if you don't "use" the method. For rust-bitcoin most type are public. In this crate, like with the BlockTree example, some abstractions can and should be completely internal. Getting lints on something that isn't used would lead us to try to build a type and integrate in one commit, which seems bad to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll roll it back since I am not totally sold on it myself, could definitely cause pain. I think the ideal might be to enable it and then poke holes for certain rules that are the most annoying, but can think on that later.

FWIW I am not sure your use case is the most compelling though. I think it is another sign that there are some downsides to pub(crate) and it might be worth using the more traditional "private module with pub items" pattern in some scenarios.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, I rolled back the lint-warn-to-error and associated fixes, only to have CI fail on them. Turns out, the fancy new rust action enables that flag by default: https://github.com/actions-rust-lang/setup-rust-toolchain?tab=readme-ov-file#rustflags

Can disable it on the action for now, but may take a quick look at more granular controls.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, instead of reverting the default on all instances of the github action, took it as a sign to "roll forward" and keep the setting in the manifest, that way local and CI have the same settings.

There are only 2 errors I iron'd out for now, both due to code being available (or not) with the filter-control features. Feels like something to clean up later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFICT the solution you described would still cause the lint. If it is public via pub to the module or pub(crate) to the entire crate I think the linter would still complain there is no user of the method. It feels like a massive pain in the ass to try to develop a new feature and still pass clippy. I don't necessarily understand the reasoning that local and CI has to be the same on this policy. To get into master, sure - deny lints, but intermediate commits in development?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a bad assumption about the visibility modifiers and thought pub was always considered possibly external facing by the linter, so never dead. Turns out it's smarter than that though, just ran this test.

pub mod exposed {
    mod hidden {
        pub fn public_in_private() {}  // Will be flagged as dead
    }
    
    pub mod visible {
        pub fn truly_public() {}  // Won't be flagged as dead
    }
}

As far as local and CI, tbh I've always worked under the assumption that life is easier when those match. Didn't consider it a feature to have flexible settings between them. My local workflow isn't harmed by the setting so figured take the bonus and match CI, but i'll drop it since it hurts.

@rustaceanrob
Copy link
Collaborator

Small conflict in lib.rs, sorry about that. Locally if you don't want to resolve with a rebase you can just change the Filter import to use chain::Filter

@nyonson nyonson force-pushed the justify-ci branch 8 times, most recently from e07d745 to 2e82030 Compare April 16, 2025 18:58
@nyonson
Copy link
Collaborator Author

nyonson commented Apr 16, 2025

Dang, there is a cool new feature for giving structure to lint disable reasons, but it isn't available for our msrv. Will keep it in mind though for the future. Looks like it was stabilized in rust 1.77.0.

Copy link
Collaborator

@rustaceanrob rustaceanrob Apr 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Back to this one. If there is the change proposed in .cargo/config.toml and resolver=3 is removed, do we really care what toolchain people develop on?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be helpful to avoid certain "broken on my machines" issues, but also could cause confusion that this is a requirement. I'll drop it.

Keep build and test logic as agnostic as possible to whatever CI task
runner is used (today that is Github Actions). Let the CI job focus on
performance, like caching artifacts and keep the build and test logic in
the forefront easy for developers to use.

With that goal in mind, the CI job has been refactored to a handful
of jobs which each call a `just` recipe. Consistent caching has been
added to each job which shows a nice improvement in speed for the
workflow, running in less than a minute. The justfile has been
refactored with the logic that previously lived in the CI yaml. This
makes it easier to maintain as well as run locally when developing. A
few un-used recipes have been cleaned up.

Lockfile complexity has been relegated to the _test-msrv recipe. This
recipe checks that there is at least one set of dependencies which
satisfy the msrv requirements. The check no longer depends on the
checked in lockfile which doesn't matter much since consumers create
their own.

A rust-toolchain.toml file was added as a gently nudge for contributers
to stay on a recent version of "stable", that way we can use some of the
newer features like the v3 resolver and see less of "works on my
machine". CI will most likely not use the file however, since it usually
needs to test more complex things like MSRV.
@rustaceanrob
Copy link
Collaborator

ACK 3e5feb7

@rustaceanrob rustaceanrob merged commit 4f01931 into 2140-dev:master Apr 17, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants