Indexes 4: Adds Indexes for SPK repositories by dcookspi · Pull Request #1339 · spkenv/spk

dcookspi · 2026-03-19T23:03:20Z

Note: for info on benefits of indexing for spk solves see #1340 (5 of 5). Maybe start there and work back down to this PR if you prefer to review PRs top down.

This adds a new Indexed repository type, RepositoryIndex traits, a RepoIndex enum, and a flatbuffer index implementation in FlatBufferRepoIndex, along with some configuration. The indexing is designed to help the solvers. An index only contains information useful for solving. It does not contain enough information to do other things, such as build or test the packages.

The hierarchy of new types is roughly:

RepositoryHandle (existing spk repos)
   Indexed 
       RepoIndex
           FlatBufferRepoIndex  (makes SolverPackageSpecs, a v0 package alternate, see PR 3 of 5 below)
                flatbuffers schema (see PR 3 of 5 below)

This also updates the solver and repo based tests to include the new Indexed repo. This does not add command line tools or solver interaction for indexes. Those are in the last PR in this chain.

This is the change that adds indexes and index based packages to Spk repositories.

This is 4 of 5 chained PRs for adding indexes to spk solves:

dcookspi · 2026-03-19T23:04:14Z

crates/spk-config/src/config.rs

+
+    /// Whether to get the solver to use repository indexes instead of
+    /// the repository directly.
+    pub use_indexes: bool,
+


Not all of these config settings are used in this PR; some are used in the last PR in the chain, #1340.

Docs for these have been added to PR 5 #1340

dcookspi · 2026-03-19T23:06:50Z

crates/spk-storage/src/storage/flatbuffer_index.rs

+            // The default number of tables seems to be 1 million, and
+            // that isn't enough for the current index. Tried 100
+            // million and that didn't error, but this is not a great
+            // solution long term, especially once VersionFilters are
+            // converted from strings.
+            max_tables: MAX_FLATBUFFER_TABLES,


This may become a problem longer term in repositories that do not delete old, unused, deprecated packages. It may also become a problem if VersionFilters are properly represented in the flatbuffer schema instead of being left as strings (I think that will be worth it, but it's something to keep in mind).


dcookspi · 2026-03-19T23:17:58Z

crates/spk-storage/src/storage/flatbuffer_index.rs

+                // Does not store any more details because the package
+                // is deprecated.


This means that deprecated packages in the index cannot be safely used in current solves as things stand. Because they have no install requirements in the index, they will short-circuit any solve they are resolved in. This is a problem for solves that ask for a deprecated build specifically. However, the number of times this happens is vanishingly small in our experience. There are several options to address this:

add a --deprecated flag that disables index use if deprecated packages are wanted (it turns out this does not exist for solves at the moment). These solves would lack the benefit of any index.

add a way of marking a specific request as allowing deprecated builds, like the --deprecated flag but for that request only, as spk has for prerelease policies in requests. This could either disable index use, or we could split the solver's processing to first ask the repo if the build is deprecated, and then load the package data from the index or underlying repo as required.

include the full solver requirements data for deprecated packages in the index, which increases the index size, loading time, and generation time. The increases might be manageable, but the data would be unused for most solves.

something else?

This probably needs a separate discussion.

Another option: have 2 indexes, one for non-deprecated and one for deprecated things, and access the appropriate index as required. It's unlikely that lots of deprecated builds will be pulled into most solves. This idea might be suitable to use with the "ask the repo if the build is deprecated before deciding whether to fully load it" option above.

And another (a subset of the second option above): reverse and split the solver's processing so that it first asks the repo whether the build is deprecated, and then loads the package data only if it needs the package. An Indexed repo, when asked to read a package, could get the data from the index for non-deprecated packages and from the underlying repo for deprecated ones. The solver wouldn't need to know where the package is coming from, but the current order of operations for deprecated checks in the solvers would need to change.
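To make the "index for non-deprecated, underlying repo for deprecated" idea concrete, here is a minimal sketch. All of the types and method names (`PackageData`, `IndexEntry`, `IndexedRepo`, `is_deprecated`, `read_package`) are hypothetical stand-ins built on in-memory maps, not the types in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the data a solver needs.
#[derive(Debug, Clone, PartialEq)]
struct PackageData {
    deprecated: bool,
    install_requirements: Vec<String>,
}

/// Hypothetical index entry: deprecated builds carry no requirements.
struct IndexEntry {
    deprecated: bool,
    install_requirements: Vec<String>,
}

/// Hypothetical indexed repo: an index plus the repo it was built from.
struct IndexedRepo {
    index: HashMap<String, IndexEntry>,
    underlying: HashMap<String, PackageData>,
}

impl IndexedRepo {
    /// Cheap check the solver could make before loading anything else.
    fn is_deprecated(&self, ident: &str) -> Option<bool> {
        self.index.get(ident).map(|entry| entry.deprecated)
    }

    /// Reads from the index when it holds full data, otherwise falls
    /// back to the underlying repo. The caller cannot tell the difference.
    fn read_package(&self, ident: &str) -> Option<PackageData> {
        let entry = self.index.get(ident)?;
        if entry.deprecated {
            self.underlying.get(ident).cloned()
        } else {
            Some(PackageData {
                deprecated: false,
                install_requirements: entry.install_requirements.clone(),
            })
        }
    }
}
```

With this shape the solver would call `is_deprecated` first and only call `read_package` for builds it actually intends to use, so the slower fallback path is taken only for deprecated builds that were explicitly requested.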

This comment is out of date now. I tested adding all the deprecated data to the index, and while it increases the index size, it doesn't impact the solve times (see PR5 #1340 for some times).

Update the comment in the code

dcookspi · 2026-03-19T23:18:43Z

crates/spk-storage/src/storage/flatbuffer_index.rs

+                // TODO: might want to add something to load them from
+                // underlying repo if required, such as when the
+                // --deprecate flag is set.


It turns out there isn't one (a --deprecate flag) for spk env, build or explain, just for ls and search.

codecov · 2026-03-19T23:18:51Z

Codecov Report

❌ Patch coverage is 62.35409% with 387 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
crates/spk-storage/src/storage/flatbuffer_index.rs	59.36%	269 Missing ⚠️
crates/spk-storage/src/storage/indexed.rs	57.36%	55 Missing ⚠️
crates/spk-schema/src/v0/solver_package_spec.rs	0.00%	29 Missing ⚠️
...s/spk-storage/src/storage/flatbuffer_index_test.rs	83.72%	21 Missing ⚠️
crates/spk-storage/src/storage/handle.rs	44.44%	5 Missing ⚠️
crates/spk-storage/src/storage/repository_index.rs	85.71%	4 Missing ⚠️
crates/spk-solve/src/solvers/solver_test.rs	80.00%	2 Missing ⚠️
crates/spk-storage/src/storage/runtime.rs	0.00%	2 Missing ⚠️


dcookspi · 2026-03-19T23:23:26Z

crates/spk-storage/src/storage/repository_index.rs

+pub enum RepoIndex {
+    Flat(FlatBufferRepoIndex),
+}


There were more kinds of index during development and testing. There may be more in future.
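For readers unfamiliar with the pattern, a single-variant enum like this keeps room for further index kinds while still dispatching statically. The sketch below assumes a much-reduced, hypothetical `RepositoryIndex` trait with one invented method (`package_names`); the real traits in this PR are larger.

```rust
/// Hypothetical, much-reduced version of a repository index trait.
trait RepositoryIndex {
    fn package_names(&self) -> Vec<String>;
}

/// Stand-in for the flatbuffer-backed index; here it just holds names.
struct FlatBufferRepoIndex {
    names: Vec<String>,
}

impl RepositoryIndex for FlatBufferRepoIndex {
    fn package_names(&self) -> Vec<String> {
        self.names.clone()
    }
}

/// One variant per index implementation. Adding a new kind of index
/// means adding a variant here and a match arm below.
enum RepoIndex {
    Flat(FlatBufferRepoIndex),
}

impl RepositoryIndex for RepoIndex {
    fn package_names(&self) -> Vec<String> {
        // Delegate to whichever concrete index this value wraps.
        match self {
            RepoIndex::Flat(index) => index.package_names(),
        }
    }
}
```

Callers hold a `RepoIndex` and never name the concrete type, so a second implementation could be added later without changing call sites.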

dcookspi · 2026-03-20T00:50:53Z

I can't replicate the test failures locally at the moment. I'll dig into them further tomorrow.

dcookspi · 2026-03-20T20:10:11Z

crates/spk-storage/src/storage/flatbuffer_index_test.rs

+    assert_repo_and_index_have_same_packages(repo2, indexed_repo2).await;
+}
+
+// TODO: add rest of solves sample repos to this as tests


The repos used in these tests (so far) are copied from the solver_test.rs tests. There are only a few of them, but the others could be duplicated into more tests here, or all the test repo setups could be moved to somewhere shared between spk-storage and spk-solve.

At the moment, there is a difference between the repo setup for the standard Repository tests (which use the make_repo() fixture function) and the Solver tests (which use the make_repo!() macro). That difference should probably be removed in future; it's a minor irritation when adding tests.

dcookspi · 2026-03-21T01:11:38Z

I can't replicate the test failures locally at the moment. I'll dig into them further tomorrow.

The issue was related to deprecated packages and the migration-to-components feature. I've put in a temporary fix so the tests pass for now, but it should not get merged as is. A direction needs to be decided with regard to deprecated packages, solves, and what to index, so that the temporary fix can be replaced with a better solution. See the #1339 (comment) above.


jrray · 2026-03-27T23:31:22Z

crates/spk-config/src/config.rs

    /// Name of the solver whose output to show  when multiple solvers are being run.
    pub solver_to_show: String,
+
+    /// Whether to get the solver to use repository indexes instead of


Suggested change

/// Whether to get the solver to use repository indexes instead of

/// Whether to let the solver use repository indexes instead of

This sounds more natural IMO.

jrray · 2026-03-27T23:38:22Z

crates/spk-config/src/config.rs

+    /// Whether to validate flatbuffers index data before using it.
+    /// Validating is safer but adds some overhead at the start of a
+    /// solve when using an index.
+    pub verify_flatbuffers_index_before_use: bool,


It's a bit awkward to both try to be generic about what index types are available but also have "top-level" configuration options that are specific to one index type.

For that reason I would suggest having a sub-struct for containing options pertaining to the flatbuffers index.

Calling it the flatbuffers index is a separate concern. Who's to say some other index wouldn't get created and also choose to use flatbuffers for the storage representation? In practice I don't see it being likely that we'll have competing index implementations, but it is nice to leave the possibility open.
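One possible shape for the suggested sub-struct is sketched below. The field `use_indexes` comes from this PR; the struct names (`SolverConfig`, `IndexConfig`, `FlatbuffersIndexConfig`), the nested field names, and the default values are assumptions made only for illustration. The serde derives that real config structs would likely carry are left out to keep the sketch self-contained.

```rust
/// Options that only make sense for the flatbuffers-backed index
/// (hypothetical grouping).
#[derive(Debug, Clone, PartialEq)]
struct FlatbuffersIndexConfig {
    /// Whether to validate the index data before using it.
    verify_before_use: bool,
}

impl Default for FlatbuffersIndexConfig {
    fn default() -> Self {
        // Assumed default: prefer safety over start-up speed.
        Self { verify_before_use: true }
    }
}

/// Index-related options, with one sub-struct per index implementation.
#[derive(Debug, Clone, Default, PartialEq)]
struct IndexConfig {
    flatbuffers: FlatbuffersIndexConfig,
}

/// Hypothetical top-level solver config that stays generic about index kinds.
#[derive(Debug, Clone, Default, PartialEq)]
struct SolverConfig {
    /// Whether to let the solver use repository indexes instead of
    /// the repository directly.
    use_indexes: bool,
    index: IndexConfig,
}
```

The generic switch stays at the top level, while anything tied to one storage format lives under its own key, so a second index implementation would add a sibling sub-struct rather than more top-level fields.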

jrray · 2026-03-27T23:42:01Z

crates/spk-solve/src/solvers/resolvo/spk_provider.rs

+                            // Exclude this deprecated build.
+                            let reason = provider
+                                .pool
+                                .intern_string(format!("{} is deprecated", ident));


Suggested change

.intern_string(format!("{} is deprecated", ident));

.intern_string(format!("{ident} is deprecated"));

jrray · 2026-03-27T23:46:11Z

crates/spk-solve/src/solvers/solver_test.rs

+#[case::step(step_solver(), false)]
+#[case::step_indexed(step_solver(), true)]
+#[case::resolvo(resolvo_solver(), false)]
+#[case::resolvo_indexed(resolvo_solver(), true)]
 #[tokio::test]
 async fn test_solver_package_with_no_recipe(
    #[case] mut solver: SolverImpl,
+    #[case] use_index: bool,
    random_build_id: BuildId,
 ) {


Suggested change

#[case::step(step_solver(), false)]
#[case::step_indexed(step_solver(), true)]
#[case::resolvo(resolvo_solver(), false)]
#[case::resolvo_indexed(resolvo_solver(), true)]
#[tokio::test]
async fn test_solver_package_with_no_recipe(
    #[case] mut solver: SolverImpl,
    #[case] use_index: bool,
    random_build_id: BuildId,
) {

#[case::step(step_solver())]
#[case::resolvo(resolvo_solver())]
#[tokio::test]
async fn test_solver_package_with_no_recipe(
    #[case] mut solver: SolverImpl,
    #[values(true, false)] use_index: bool,
    random_build_id: BuildId,
) {

I'd prefer this pattern for simplicity / extensibility. I was thinking we'd have a declarative macro to help with removing boilerplate on every test.

jrray · 2026-03-27T23:49:04Z

crates/spk-solve/src/solvers/solver_test.rs

    repo.publish_package(&spec.into(), &components)
        .await
        .unwrap();
+    let repo = wrap_repo_for_test(repo, use_index).await;


It's not ideal that every test has to remember to call this, though I guess there would be a lint that use_index is unused if that's the case.

jrray · 2026-03-27T23:51:42Z

crates/spk-solve/src/solvers/solver_test.rs

 #[tokio::test]
 async fn test_solver_component_embedded_multiple_versions(
    #[values(step_solver(), resolvo_solver())] mut solver: SolverImpl,
+    #[values(false, true)] use_index: bool,


Ayyy...

You could argue this is better than defining cases for the solver flavors.

jrray · 2026-03-27T23:55:34Z

crates/spk-storage/src/storage/flatbuffer_index.rs

+            // The default number of tables seems to be 1 million, and
+            // that isn't enough for the current index. Tried 100
+            // million and that didn't error, but this is not a great
+            // solution long term, especially once VersionFilters are
+            // converted from strings.
+            max_tables: MAX_FLATBUFFER_TABLES,


Along the lines of the idea to do the verification after writing a new index file, can we find out how many tables were used and print a warning if it is some percentage of the maximum?
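The threshold check itself is simple once a table count is available. The sketch below covers only that check; how the count would be obtained from the flatbuffers verifier or builder is not shown, and is an open question here. The function name and the percentage parameter are hypothetical.

```rust
/// Returns a warning message when `used` tables reach or exceed
/// `threshold_percent` of `max` tables, and `None` otherwise.
fn table_usage_warning(used: usize, max: usize, threshold_percent: u8) -> Option<String> {
    if max == 0 {
        // Nothing meaningful to compare against.
        return None;
    }
    // Widen to u128 so the multiplication cannot overflow.
    let percent_used = (used as u128 * 100) / (max as u128);
    if percent_used >= u128::from(threshold_percent) {
        Some(format!(
            "index uses {used} of {max} flatbuffer tables ({percent_used}%)"
        ))
    } else {
        None
    }
}
```

The result could be logged right after a new index file is written and verified, with `max` set to the same value passed as `max_tables`.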

jrray · 2026-03-28T00:00:35Z

crates/spk-storage/src/storage/flatbuffer_index.rs

+        }
+
+        // Create the new index file with the correct permissions
+        if let Err(err) = tokio::fs::write(&filepath, builder.finished_data()).await {


We need to write to a temp file first and rename if successful.
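A minimal sketch of the write-then-rename pattern follows. It uses `std::fs` so that it is self-contained; the PR's code is async, and `tokio::fs::write` and `tokio::fs::rename` could take the place of the blocking calls. The function name and the temp-file naming scheme are assumptions, and the sketch does not set permissions or fsync.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `data` to a temporary sibling of `path`, then renames it into
/// place, so readers never observe a partially written file.
fn write_atomically(path: &Path, data: &[u8]) -> io::Result<()> {
    // Keep the temp file in the same directory so the rename stays on
    // one filesystem.
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(format!(".tmp.{}", std::process::id()));
    let tmp_path = path.with_file_name(tmp_name);

    if let Err(err) = fs::write(&tmp_path, data) {
        // Best-effort cleanup of a partial temp file.
        let _ = fs::remove_file(&tmp_path);
        return Err(err);
    }
    if let Err(err) = fs::rename(&tmp_path, path) {
        let _ = fs::remove_file(&tmp_path);
        return Err(err);
    }
    Ok(())
}
```

If the write fails part way through, any existing index file is left untouched, which matters when other processes may be reading the index during regeneration.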

jrray · 2026-03-28T00:03:17Z

crates/spk-storage/src/storage/flatbuffer_index_test.rs

+use crate::{IndexedRepository, RepoWalkerBuilder, RepoWalkerItem, RepositoryHandle};
+
+// A copy of the one in spk-solve-macros because using it directly
+// here cases an import loop


Suggested change

// here cases an import loop

// here causes an import loop

jrray · 2026-03-28T00:05:47Z

crates/spk-storage/src/storage/handle.rs

        matches!(self, Self::Runtime(_))
    }

+    pub fn is_indexed(&self) -> bool {


If you add Variantly to the derive list you get this method for free.

dcookspi · 2026-03-28T00:32:00Z

I can't replicate the test failures locally at the moment. I'll dig into them further tomorrow.

These have been fixed now.

dcookspi self-assigned this Mar 19, 2026

dcookspi added enhancement New feature or request SPI AOI Area of interest for SPI pr-chain This PR doesn't target the main branch, don't merge! labels Mar 19, 2026


dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch 2 times, most recently from 00757f2 to 8530adc Compare March 20, 2026 00:18

dcookspi requested review from jrray and rydrman March 20, 2026 00:50

dcookspi force-pushed the index-3-flatbuffer-and-solver-package-spec branch from 47ea722 to af8dcc2 Compare March 20, 2026 19:29

dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch 2 times, most recently from 68bb519 to 1e28bf4 Compare March 20, 2026 19:53


dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch from 1e28bf4 to 854446a Compare March 21, 2026 01:07

dcookspi force-pushed the index-3-flatbuffer-and-solver-package-spec branch from af8dcc2 to 53f24d1 Compare March 25, 2026 01:07

dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch 2 times, most recently from a58142a to b32582c Compare March 25, 2026 17:08

dcookspi force-pushed the index-3-flatbuffer-and-solver-package-spec branch from 53f24d1 to cb733cd Compare March 27, 2026 19:09

Adds new Indexed repository type, related RepositoryIndex traits, and…

1a43f64

… flabuffers index and configuration Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>

dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch from b32582c to 1a43f64 Compare March 27, 2026 19:30

jrray requested changes Mar 28, 2026
