Indexes 3: Adds flatbuffers schema and SolverPackageSpec for indexes to spk#1338
dcookspi wants to merge 2 commits into index-2-new-unchecked-additions
Conversation
| @@ -0,0 +1,327 @@
| // Copyright (c) Contributors to the SPK project.
This flatbuffer schema does not match the rust spk schema objects one-to-one. The focus is on making an index that is useful for spk solvers. There is not enough data in the index schema to build or run tests on any of the packages; there is only enough to run solves on them.
Some of the tables have the same name as their rust counterparts, others do not. Some are flatbuffer specific. Some do not store all the fields of their rust counterpart; the fields not stored are noted as 'ignored' in comments. Some of the objects have been flattened down to include pieces of intermediate or more deeply nested rust objects. Those should also be noted in comments.
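For illustration, the flattening and the 'ignored' convention might look like this as a schema fragment (field names here are made up for the example, not taken from the actual schema):

```
// Illustrative only: a table flattened from nested rust objects,
// noting the fields that are intentionally not stored.
table BuildIndex {
  // Flattened from the nested rust build options:
  option_names: [string];
  option_values: [string];
  // ignored: sources, tests, install scripts - not needed for solving
}
```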
| // TODO: should this be IndexedPackageSpec or something else? It comes
| // from an index and requires one to function, but it is used/usable
| // only by the solver.
| #[derive(Debug, Serialize)]
| pub struct SolverPackageSpec {
Does this name seem sensible?
First I'm not sure we even want to use "spec" in the name here, since to me that word makes me think more about content a human would write and/or yaml-formatted content, based on all the existing things we refer to as a "spec".
I'll propose IndexedPackage as a name for this.
I don't think it is true that it is only used by the solver. We already have plans to use the index for other kinds of reporting.
| // TODO: skip this? It could be quite large?
| self.buf.hash(state);
I'm not sure how meaningful the underlying bytes are here. The offset might be enough.
It would presumably be way cheaper to only hash the offset. That should be enough to cover uniqueness; I'm assuming no two packages in the index will have the same offset.
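A minimal sketch of offset-only hashing, using simplified stand-in types (the real SolverPackageSpec wraps generated flatbuffer accessors, not a plain Vec<u8>):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical simplification: a view into shared flatbuffer bytes,
// identified by its offset within those bytes.
struct SolverPackageSpec {
    buf: Vec<u8>,   // the (potentially large) flatbuffer contents
    offset: usize,  // where this package's table starts in `buf`
}

impl Hash for SolverPackageSpec {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash only the offset: hashing `buf` would walk the whole
        // index, and no two packages in one index share an offset.
        self.offset.hash(state);
    }
}

fn hash_of(spec: &SolverPackageSpec) -> u64 {
    let mut hasher = DefaultHasher::new();
    spec.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let a = SolverPackageSpec { buf: vec![0; 1024], offset: 64 };
    let b = SolverPackageSpec { buf: vec![1; 4096], offset: 64 };
    // Same offset hashes the same regardless of buffer contents.
    assert_eq!(hash_of(&a), hash_of(&b));
}
```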
| // TODO: should this change the return value, update all the
| // caller's handling, and return an error for this implementation?
| unreachable!("{err}");
This question comes up in all the unimplemented parts of the SolverPackageSpec. The same decision can be applied to all of them. I went with unreachable for now to avoid the resulting Result<...> checking cascade throughout the rest of the codebase. But I don't know how large a change it would be.
It's more work of course but I would like to refactor traits such that metadata isn't a required method on Package and then also the solver doesn't require an implementation of whatever [new] trait metadata moves to. Therefore you know for a fact this is unreachable; the code doesn't have to exist.
| // TODO: all these functions are here to keep the flatbuffer
| // conversions together while deciding on whether to replace the
| // backing of current rust objects with flatbuffer ones, like
| // SolverPackageSpec does. Or whether to split these functions up into
| // a couple of traits (e.g. IntoFlatbuffer, FromFlatbuffer) across
| // each of the existing rust types.
Or, another alternative is to leave them together in this file for now.
I like the idea of leaving them all here in one place as a starting point.
| // TODO: this will be the next thing to get a
| // proper flatbuffer representation because it
| // is now showing up as the next significant
| // chunk on the large solve flamegraphs
I want to highlight this as one of the next likely improvements to the overall indexing. It's not part of this PR-chain at the moment.
| // Note: fb_compats in packages are not optional in the rust structs,
| // but fb_compats stored in var opts are optional in the rust struct.
| #[inline]
| fn var_opt_fb_compat_to_var_opt_compat(fb_compat: Option<&str>) -> Option<Compat> {
|     // None stored as an fb_compat represents no compat specified at
|     // all for a var opt. Some compat stored means a compat was
|     // specified, and it may even be the same as the default compat.
|     fb_compat.map(|fc| {
|         Compat::new_unchecked(fc)
|             .expect("A Compat in flatbuffer data should be a valid Compat when parsed")
|     })
| }
|
| #[inline]
| pub fn fb_compat_to_compat(fb_compat: Option<&str>) -> Compat {
|     if let Some(compat) = fb_compat {
|         Compat::new_unchecked(compat)
|             .expect("A Compat in flatbuffer data should be a valid Compat when parsed")
|     } else {
|         // In this case, None, so nothing, stored as an fb_compat
|         // represents the default compat. This is different to the var
|         // opt compat method above.
|         Compat::default()
|     }
| }
The Compats and the two uses of CompatRules described in the comments on these functions bit me more than once during this work.
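The difference between the two cases can be sketched with simplified stand-ins: String in place of Compat, and "x.a.b" assumed as the default compat (hypothetical here; check Compat::default() for the real value):

```rust
// For var opts, a missing compat in the flatbuffer round-trips to
// None: "no compat specified at all".
fn var_opt_fb_compat(fb_compat: Option<&str>) -> Option<String> {
    fb_compat.map(str::to_owned)
}

// For packages, compat is not optional in the rust struct, so a
// missing value means "use the default compat" instead.
fn package_fb_compat(fb_compat: Option<&str>) -> String {
    fb_compat.map(str::to_owned).unwrap_or_else(|| "x.a.b".to_owned())
}

fn main() {
    // The same stored value (None) decodes differently per case.
    assert_eq!(var_opt_fb_compat(None), None);
    assert_eq!(package_fb_compat(None), "x.a.b");
    assert_eq!(package_fb_compat(Some("x.x.b")), "x.x.b");
}
```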
…kage data. Adds v0 SolverPackageSpec as a package implementation backed by a flatbuffer. Adds fb_converter functions for converting spk schema objects into and out of flatbuffer objects. Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
| // The pieces of a pkg/version/build Ident, but not combined as a
| // BuildIdent. This is used for embedded packages.
| table IdentPartsBuf {
|     repository_name: string;
I can't think of a reason why this field would ever [need to be] populated in the index. But if there's zero cost to having it here...
|
| // A Package/Version/Build containing what is needed when solving
| table BuildIndex {
|     // Older packages may have been published with non-normalized
This isn't really about "older packages"; rather, this preserves the version number as written in the recipe (to please users), as opposed to the normalized version number that is used for comparisons.
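A toy illustration of the as-written versus normalized distinction (the normalization rule below, dropping trailing zero parts, is a stand-in for the example, not necessarily spk's actual rule):

```rust
// Illustrative only: keep the version as written for display while
// comparing on a normalized form.
fn normalize(version: &str) -> Vec<u64> {
    let mut parts: Vec<u64> = version
        .split('.')
        .filter_map(|p| p.parse().ok())
        .collect();
    // Stand-in rule: trailing zero parts do not affect comparisons.
    while parts.len() > 1 && parts.last() == Some(&0) {
        parts.pop();
    }
    parts
}

fn main() {
    // "1.0" and "1.0.0" compare as equal, while each index entry can
    // still store the digits the recipe author actually wrote.
    assert_eq!(normalize("1.0"), normalize("1.0.0"));
    assert!(normalize("1.0") != normalize("1.1"));
}
```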
| table VersionIndex {
|     // The version number only
|     version: Version (required);
|     builds: [BuildIndex];
| }
|
| // The index for a repository
| table RepositoryIndex {
Let's at least add some field that represents a high-level version number governing the practices that are outlined in comments but are not represented in the types. If we change validation rules, expectations, etc., then we would bump this number up.
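As a sketch, the proposed field might look like this (hypothetical name and type, not yet in the schema):

```
// A schema-level version stamp for the index format itself. Bump it
// whenever validation rules or conventions change, even when the
// table layout does not.
table RepositoryIndex {
  format_version: uint32;
  // ... existing fields ...
}
```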
| thiserror = { workspace = true }
|
| [build-dependencies]
| flatc-rust = "0.2"
Please move this version up to workspace and update spfs-proto to use the workspace version too.
|     true
| }
|
| fn sources(&self) -> &Vec<SourceSpec> {
Similar to the idea above about moving metadata out of Package, I'm thinking Package shouldn't have a sources either -- wouldn't that be a property for Recipe to have? In the future with #1187 we won't have this property in a "package".
| Cow::Owned((**self.cached_requirements.load()).clone())
| }
|
| fn get_all_tests(&self) -> Vec<SpecTest> {
Same idea here of something that shouldn't be in Package.
It does leave me wondering if we want to have a separate SolverPackage trait with a reduced set of requirements that the solver expects. We either need that or some combination of sub-dividing Package to cater to the different needs of different subsystems.
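One way such a split could look, sketched with hypothetical trait names and a drastically reduced method set (the real Package trait has many more methods):

```rust
// The solver's reduced requirements live in their own trait...
trait SolverPackage {
    fn name(&self) -> &str;
}

// ...and the full Package trait layers build-time concerns on top.
trait Package: SolverPackage {
    fn sources(&self) -> Vec<String>;
}

// A full spec implements both.
struct FullSpec {
    name: String,
    sources: Vec<String>,
}

impl SolverPackage for FullSpec {
    fn name(&self) -> &str {
        &self.name
    }
}

impl Package for FullSpec {
    fn sources(&self) -> Vec<String> {
        self.sources.clone()
    }
}

// An index-backed package implements only SolverPackage, so the
// build-only methods simply do not exist on it - no unreachable!().
struct IndexedPackage {
    name: String,
}

impl SolverPackage for IndexedPackage {
    fn name(&self) -> &str {
        &self.name
    }
}

// The solver only ever asks for SolverPackage.
fn solve(candidates: &[&dyn SolverPackage]) -> Vec<String> {
    candidates.iter().map(|p| p.name().to_owned()).collect()
}

fn main() {
    let full = FullSpec { name: "python".into(), sources: vec!["src".into()] };
    let indexed = IndexedPackage { name: "gcc".into() };
    assert_eq!(solve(&[&full, &indexed]), vec!["python", "gcc"]);
}
```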
| spk_proto::PreReleasePolicy::None => None,
| spk_proto::PreReleasePolicy::ExcludeAll => Some(PreReleasePolicy::ExcludeAll),
| spk_proto::PreReleasePolicy::IncludeAll => Some(PreReleasePolicy::IncludeAll),
| _ => None,
are these enums generated with non-exhaustive?
I guess based on some code comments below that these are plain numbers.
It could be a good idea to add debug_assert!()s to each of these catch-all cases to detect values that were missed after any schema changes.
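A sketch of the suggested pattern, with stand-in types rather than the generated flatbuffer enums: the debug_assert! makes unmapped values fail loudly in debug builds while release builds still fall back safely.

```rust
#[derive(Debug, PartialEq)]
enum PreReleasePolicy {
    ExcludeAll,
    IncludeAll,
}

// Flatbuffer enums arrive as plain integers, so matching needs a
// catch-all arm; guard it so schema drift is caught in debug builds.
fn pre_release_policy_from_fb(value: i8) -> Option<PreReleasePolicy> {
    match value {
        0 => None, // the schema's None variant
        1 => Some(PreReleasePolicy::ExcludeAll),
        2 => Some(PreReleasePolicy::IncludeAll),
        other => {
            debug_assert!(false, "unmapped PreReleasePolicy value: {other}");
            None
        }
    }
}

fn main() {
    assert_eq!(pre_release_policy_from_fb(1), Some(PreReleasePolicy::ExcludeAll));
    assert_eq!(pre_release_policy_from_fb(0), None);
}
```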
| ComponentEmbeddedPackage::new_unchecked(ident, components)
| })
| .collect()
| // TODO: This does not set fabricated, not sure if that is a problem or not?
It should be set true if the list was populated from default values, the point being that if it was fabricated then the list is not serialized. So what's most important is to skip serializing it when creating the flatbuffer instance (if fabricated is true). Then, this would go through the None case when deserializing.
Perhaps we can cook up some tests around the expected behavior.
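The expected round trip could be pinned down with a sketch like this (simplified types and a hypothetical default value; the real structs carry much more):

```rust
#[derive(Clone, Debug, PartialEq)]
struct EmbeddedPackages {
    packages: Vec<String>,
    fabricated: bool, // true when populated from default values
}

// Writing: a fabricated list is not serialized into the flatbuffer.
fn to_fb(embedded: &EmbeddedPackages) -> Option<Vec<String>> {
    if embedded.fabricated {
        None
    } else {
        Some(embedded.packages.clone())
    }
}

// Reading: a missing list means the defaults should be fabricated again.
fn from_fb(stored: Option<Vec<String>>) -> EmbeddedPackages {
    match stored {
        Some(packages) => EmbeddedPackages { packages, fabricated: false },
        None => EmbeddedPackages {
            packages: vec!["default-embedded".to_owned()], // hypothetical defaults
            fabricated: true,
        },
    }
}

fn main() {
    // Real data survives a round trip unchanged...
    let real = EmbeddedPackages { packages: vec!["dep/1.0".to_owned()], fabricated: false };
    assert_eq!(from_fb(to_fb(&real)), real);

    // ...while fabricated defaults are never stored and come back fabricated.
    let fabricated = from_fb(None);
    assert!(fabricated.fabricated);
    assert!(to_fb(&fabricated).is_none());
}
```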
Note: for info on the benefits of indexing for spk solves, see #1340 (5 of 5). Maybe start there and work back down to this PR if you prefer to review PRs top-down.
This adds a new `spk-proto` crate to spk, and a new v0 package implementation: `SolverPackageSpec`.

`spk-proto` contains a flatbuffers schema for index package, version, build, options, and global variables data. The flatbuffers schema is designed for solver use. It only stores data needed for solving with the packages. It is not a complete replacement for all package operations. It does not have enough data to build or test packages.

`SolverPackageSpec` is a package implementation backed by a flatbuffer. Those packages will be produced from an index by changes added in a subsequent PR (#1339, 4 of 5). The functions for converting spk schema objects into and out of flatbuffer objects are contained in `fb_converter.rs`.

This is one of the changes that supports adding indexes and index-based packages to Spk repositories.
This is 3 of 5 chained PRs for adding indexes to spk solves:
- `new_unchecked()` constructors to spk schema objects #1337
- `spk repo index` subcommand for index generation and updates #1340