
test: create a test for m0002010 using DS4#2243

Open
ctron wants to merge 22 commits into guacsec:main from ctron:feature/migration_test_1

Conversation


@ctron ctron commented Feb 12, 2026

Summary by Sourcery

Add snapshot-based migration test infrastructure, new DS4-based data dump tests for migration m0002010, and related performance test; refactor dump download and decompression utilities; and update CI and configuration to support long-running tests and improved coverage.

New Features:

  • Introduce a generalized dump and snapshot management system for migration tests, including URL-based dump fetching and BTRFS-backed snapshots for faster test setup.
  • Add a DS4-based performance test for migration m0002010 that replays data migrations and standard migrations using real dumps.
  • Add async and sync decompression helpers that auto-detect gzip/xz compression based on file extension.
  • Expose a reusable test storage backend constructor that can work with existing directories.
  • Provide a migration-specific test context that can materialize databases and storage from dumps or snapshots, keyed by stable IDs.
  • Add feature flags in migration and fundamental crates to enable long-running tests separately from regular CI runs.

Bug Fixes:

  • Ensure data migrations use an explicit database connection when processing documents, avoiding reliance on internal manager connection state.
  • Fix missing gzip support in embedded database import and common DB utilities so compressed SQL dumps can be used consistently.
  • Stabilize test DB configuration by deriving it from a helper that builds connection settings based solely on the PostgreSQL port.

Enhancements:

  • Refactor migration dump handling to construct a single S3 URL per commit and split downloading/verification into reusable components with progress reporting.
  • Refine data migration concurrency to allow configuration via an environment variable, defaulting to the number of logical CPUs when unset or zero.
  • Improve SchemaDataManager to optionally accept an external database connection while maintaining backwards compatibility when none is supplied.
  • Extend Trustify test context to track the database port and manage teardown via a composable asynchronous resource stack, simplifying cleanup logic.
  • Relax restrictions on test-only panics and add guidance for marking long-running tests as ignored by default via a shared feature flag.
  • Add richer logging for checksum validation and storage file discovery to aid debugging of dump-related issues.
  • Switch benchmark contexts from Arc to Rc where thread-safety is unnecessary, reducing overhead.

Build:

  • Add new dependencies for compression, tar handling, CPU detection, BTRFS support, URL encoding, and UUIDs required by the new dump and snapshot infrastructure.
  • Define long_running features in relevant crates and wire them into tests guarded by cfg_attr, and ensure coverage runs with --all-features.
  • Expose async-compression gzip support in the common DB crate to align with new decompression helpers.

CI:

  • Introduce a dedicated multi-OS check job running cargo check and cargo clippy before the main CI job, and bump checkout actions to v5.
  • Update the code coverage workflow to run tests with all features enabled for more representative coverage data.

Documentation:

  • Document testing conventions around panics, logging, and long-running tests, including how to enable them via a long_running feature flag.
  • Add documentation for snapshot-based test acceleration using BTRFS, including requirements, maintenance, and fallback behavior when unsupported.

Tests:

  • Add a DS4-based migration performance test for m0002010 that exercises both data and schema migrations using real-world dumps.
  • Enable large SPDX ingestion performance tests under a long_running feature flag so they can be opted into without impacting regular CI runs.

Chores:

  • Reorganize migration test context modules, exposing migration sources and snapshot support while removing the old dedicated migration context type.


sourcery-ai bot commented Feb 12, 2026

Reviewer's Guide

Refactors migration dump handling into a reusable URL-based Dumps abstraction, extends decompression and embedded DB import to support gzip as well as xz, introduces Btrfs-backed snapshot-aware migration test contexts, and adds a long-running DS4-based performance test for migration m0002010 while updating CI workflows and dependencies accordingly.

Class diagram for updated migration data handling

classDiagram
    class Runner {
        +Database database
        +DispatchBackend storage
        +Options options
    }

    class Database {
        <<enum>>
        Config
        Provided
        +try_into_connection() DbErr DatabaseConnection
    }

    class ConnectOptions {
        +new(url: String) ConnectOptions
        +set_schema_search_path(schema: String) ConnectOptions
    }

    class DatabaseConnection

    class SchemaManager {
        +get_connection() DatabaseConnection
        +process~D~(db: ConnectionTrait, storage: DispatchBackend, options: Options, f: Handler~D~) DbErr
    }

    class SchemaDataManager {
        +SchemaManager manager
        +DatabaseConnection db
        +DispatchBackend storage
        +Options options
        +new(manager: SchemaManager, db: DatabaseConnection, storage: DispatchBackend, options: Options)
        +process~D~(storage: DispatchBackend, options: Options, f: Handler~D~) DbErr
    }

    class Options {
        +usize concurrent
        +u64 current
        +u64 total
        +bool skip_all
        +bool skip_data
        +new() Options
    }

    class DocumentProcessor {
        <<interface>>
        +process~D~(db: ConnectionTrait, storage: DispatchBackend, options: Options, f: Handler~D~) DbErr
    }

    class DispatchBackend
    class Handler {
        <<interface>>
    }
    class ConnectionTrait {
        <<interface>>
    }
    class TransactionTrait {
        <<interface>>
    }

    Runner --> Database
    Runner --> Options
    Runner --> DispatchBackend

    Database --> DatabaseConnection
    Database --> ConnectOptions

    SchemaDataManager --> SchemaManager
    SchemaDataManager --> DatabaseConnection
    SchemaDataManager --> DispatchBackend
    SchemaDataManager --> Options

    DocumentProcessor <|.. SchemaManager

    SchemaManager --> DispatchBackend
    SchemaManager --> Options
    SchemaManager --> ConnectionTrait
    ConnectionTrait <|.. DatabaseConnection
    TransactionTrait <|.. DatabaseConnection

    Options ..> num_cpus

Flow diagram for dump-based migration test and decompression

flowchart LR
    A[Test runner] --> B[Migration test context]
    B --> C[Dumps helper]
    C --> D{Dump source}
    D -->|Migration URL| E[Download dump from S3]
    D -->|Local path| F[Use existing dump file]

    E --> G[Local dump file]
    F --> G

    G --> H[decompress_async_read]
    G --> I[decompress_read]

    H --> J[Embedded database import]
    I --> J

    J --> K[Embedded PostgreSQL instance]

    K --> L[Runner]
    L --> M[Database.try_into_connection]
    M --> N[SchemaManager]
    N --> O[SchemaDataManager]

    O --> P[DocumentProcessor.process]
    P --> Q[Parallel document handling using buffer_unordered]
    Q --> R[Migration m0002010 completed on DS4 dump]

File-Level Changes

Refactor migration dump handling into a reusable URL-based Dumps/Dump abstraction that supports both legacy migration dumps and arbitrary dump URLs with checksum validation and progress reporting.
  • Simplify Migration to hold an S3 URL derived from repo metadata and expose it as a Dump description rather than managing local paths and S3 parameters directly.
  • Introduce a generic Dump struct describing base URL, files, and whether digests are expected, and a Dumps manager that resolves a stable local directory per dump URL using urlencoding, with locking and reuse across tests.
  • Split raw download logic into download_artifacts_raw that accepts a base URL, optional digest download, and a missing-file policy, with progress bars via indicatif and improved logging, and reuse it from Dumps::provide/provide_raw.
  • Enhance checksum validation logging to emit which file is being validated and reuse common validation for both migration-based and URL-based dumps.
test-context/src/migration.rs
Introduce snapshot-aware migration test contexts with optional Btrfs subvolume support and a new DS4-based performance test for migration m0002010.
  • Replace the previous migration test context implementation with a new ctx::migration module that can derive dumps either from migration commits (via Migration::as_dump and Dumps) or from explicit DumpSource URLs, including configurable db/storage filenames, digest usage, path stripping, and zstd EOF fixing.
  • Add a Snapshot abstraction that can materialize a test database and storage either by importing dumps into a fresh temp directory or, on Linux, by using Btrfs subvolumes and snapshots for fast reuse, with a pipeline that prepares storage (tar extraction, optional path stripping and zstd fixes) and imports the DB using an embedded PostgreSQL instance.
  • Implement a Linux-only Btrfs snapshot backend that discovers a Btrfs-capable store, manages template, running, and prepare subvolumes via the btrfs CLI, and integrates with embedded PostgreSQL and FileSystemBackend, plus a SnapshotProvider to locate or create snapshots from optional archive files.
  • Add the TrustifyMigrationContext AsyncTestContext type parameterized by a DumpId, macros commit! and dump! to define dump IDs and URL-based sources, and wire these into tests so they can seamlessly obtain a pre-populated DB+storage snapshot.
  • Create a DS4 dump-based performance test for migration m0002010 that uses the new dump! macro to point at an S3-hosted DS4 dump, runs selected data migrations out-of-band using Runner, and then executes the normal Migrator while skipping those migrations, gating the test behind a long_running feature.
  • Export MigrationSource and additional ctx items from test-context and adjust tests and benches that depend on TrustifyContext/TrustifyMigrationContext to the new APIs and reference the database port directly instead of reaching into an embedded PostgreSQL handle.
migration/tests/data/m0002010.rs
test-context/src/ctx/migration/mod.rs
test-context/src/ctx/migration/snapshot/mod.rs
test-context/src/ctx/migration/snapshot/btrfs.rs
test-context/src/ctx/mod.rs
modules/fundamental/benches/common.rs
modules/fundamental/tests/sbom/spdx/perf.rs
Extend decompression and embedded DB import utilities to support gzip and unify compression handling.
  • Add decompress_async_read and decompress_read helpers that open a file and wrap it in the correct async/sync decompressor based on file extension, supporting .xz via async-compression/LzmaDecoder and lzma-rust2, .gz via async-compression::GzipDecoder and flate2::GzDecoder, and falling through to raw reads for other extensions.
  • Refactor common/db embedded database import to use decompress_async_read instead of ad‑hoc LZMA-only logic, allowing imports from gzip-compressed SQL and other formats without modifying callers.
common/src/decompress.rs
common/db/src/embedded.rs
Improve data migration Runner and SchemaDataManager to accept pre-existing database connections and configurable concurrency while leveraging the new progress UI.
  • Make the data migration Database enum clonable and ergonomically constructible from DatabaseConnection and trustify_common::db::Database via From implementations.
  • Factor out database connection creation into Database::try_into_connection and use it inside Runner::run, then pass the resulting DatabaseConnection through to SchemaDataManager::new, which now optionally holds an explicit connection.
  • Update SchemaDataManager::process to accept an explicit DB connection (falling back to SchemaManager::get_connection when None) and pass that into the DocumentProcessor::process implementation that now takes a db parameter rather than calling get_connection itself.
  • Change migration Options.concurrent from NonZeroUsize to usize with semantics that 0 means "use num_cpus::get()", update its Default, and adjust the processing pipeline to resolve concurrency from options, log it, and configure the indicatif progress bar with humanized counters and rate-based template.
  • Update call sites, including MigrationWithData::MigrationTrait impl and Runner, to build SchemaDataManager with or without an explicit DB connection as appropriate.
migration/src/data/run.rs
migration/src/data/migration.rs
migration/src/data/mod.rs
Refine the test-context infrastructure to manage external resources explicitly via a ResourceStack, expose DB ports, and support reuse across tests and benches.
  • Introduce a ResourceStack and TestResource/TestResourceExt abstractions to manage asynchronous cleanup of stacked resources (e.g., TempDir, PostgreSQL instances, Btrfs snapshots) with ordered teardown via a defer helper that offloads Drop to spawn_blocking.
  • Change TrustifyTestContext to store the database port explicitly, hold a ResourceStack instead of embedding TempDir/PostgreSQL directly, and make teardown async to drop resources and then record memory usage; adjust TrustifyContext setup/teardown logic to construct appropriate ResourceStacks for embedded vs external DBs.
  • Update FileSystemBackend to add for_test_in to create a test backend in an existing directory and enhance locate_read to log located compressed storage files and missing base paths, enabling snapshot-based setups to plug in storage directories.
  • Modify benches and server tests (e.g., OpenTelemetry profile tests) to use Rc where appropriate and refer to ctx.port instead of reaching into an optional PostgreSQL field.
test-context/src/lib.rs
test-context/src/ctx/default.rs
test-context/src/resource.rs
modules/storage/src/service/fs.rs
server/src/profile/api.rs
modules/fundamental/benches/common.rs
Adjust CI workflows and coverage to separate clippy/check from integration tests and run tests with all features, including long_running gating.
  • Introduce a new GitHub Actions check job running on a matrix of OSs (Windows, Ubuntu, macOS) that performs cargo check and cargo clippy with strict lints, including installing vcpkg/openssl on Windows, while making the main ci job depend on this check job and focus on fmt and full test runs with minio.
  • Bump actions/checkout to v5 for the ci workflow and remove inlined check/clippy steps since they now run in the new check job.
  • Update the codecov workflow to run cargo llvm-cov with --all-features instead of a limited _test-s3 feature, ensuring coverage includes new long_running gating and dump/snapshot infrastructure.
.github/workflows/ci.yaml
.github/workflows/codecov.yaml
Introduce or adjust crate features and dependencies to support new compression formats, progress bars, URL encoding, snapshot tooling, and long-running test configuration.
  • Add async-tar, indicatif, urlencoding, which, uuid, and Linux-only async-compression (with zstd/xz) and nix dependencies to test-context; enable async-tar globally in the workspace Cargo.toml for dump management.
  • Extend common and common/db dependencies to include async-compression gzip support, flate2, and lzma-rust2, and add num_cpus for dynamic concurrency selection in migration and workspace Cargo.toml.
  • Add long_running features to the migration and modules/fundamental crates for opt-in long-running tests, and document the usage in a new docs/design/tests.md file; add docs/test-snapshots.md explaining Btrfs snapshot requirements and behavior.
  • Keep Cargo.lock in sync and ensure workspace features (e.g., indicatif tokio/futures) match the new usage patterns.
test-context/Cargo.toml
migration/Cargo.toml
common/Cargo.toml
common/db/Cargo.toml
modules/fundamental/Cargo.toml
Cargo.toml
docs/design/tests.md
docs/test-snapshots.md
Cargo.lock
Add documentation and environment configuration for snapshot-based tests and test DB behavior.
  • Update env-vars documentation to describe TRUST_TEST_BTRFS_STORE as the path to a BTRFS-backed directory used for snapshot-based tests with a default in the user’s home.
  • Add Database::from_port helper in common config to construct DB configs for a given port, and switch embedded DB creation code and tests to use it when constructing configs for imported/embedded PostgreSQL instances.
docs/env-vars.md
common/src/config.rs

Possibly linked issues

  • #Test migration with data: PR implements dump-based migration test infrastructure and a concrete DS4-backed migration test, directly addressing testing with data.


@ctron ctron force-pushed the feature/migration_test_1 branch from ce8f274 to a1499eb on February 26, 2026 07:29
@ctron ctron marked this pull request as ready for review on February 27, 2026 12:12
@ctron ctron force-pushed the feature/migration_test_1 branch from d674a6d to 7920244 on February 27, 2026 12:12

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • In Dumps::provide_raw/download_artifacts_raw a fresh reqwest::Client is created for each invocation; consider keeping a shared client on Dumps to take advantage of connection pooling and reduce overhead for repeated test runs.
  • The new log::info! calls in download_artifacts_raw (progress output) and in FileSystemBackend::find for missing files may be quite noisy in normal test runs; consider downgrading these to debug (or gating them) so info-level logs remain focused on higher‑level events.
## Individual Comments

### Comment 1
<location path="test-context/src/migration.rs" line_range="329-274" />
<code_context>
+/// Manage raw dump downloads
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding focused tests for `Dumps::provide`/`provide_raw` covering cache reuse, locking behavior, and digest/no-digest modes.

These methods are now a key path for downloading and caching dump artifacts, but their behavior isn’t directly tested. Adding a few focused unit/integration tests would help:

* Cache reuse: Call `provide` twice for the same `Dump` (using a local test server or file-based mock) and assert that the second call doesn’t trigger another download (e.g., via a request counter).
* Digest vs no-digest: In `digests = true` mode, verify success with matching `.sha256` files and failure with mismatched ones; in `digests = false` mode, ensure missing `.sha256` files don’t error.
* Locking: Call `provide` concurrently from two tasks for the same URL and assert both succeed with uncorrupted content, confirming that locking prevents duplicate or conflicting downloads.

Given that many migration and external dump tests rely on this code, these tests would materially improve reliability and debuggability.

Suggested implementation:

```rust
 /// Manage raw dump downloads
#[derive(Debug)]
pub struct Dumps {
    base: PathBuf,
}

```


Here are the code changes:

```xml
<file_operations>
<file_operation operation="edit" file_path="test-context/src/migration.rs">
<<<<<<< SEARCH
/// Manage raw dump downloads
#[derive(Debug)]
pub struct Dumps {
    base: PathBuf,
}
=======
/// Manage raw dump downloads
#[derive(Debug)]
pub struct Dumps {
    base: PathBuf,
}
>>>>>>> REPLACE
</file_operation>
</file_operations>

<additional_changes>
Please append the following `#[cfg(test)]` module to the end of `test-context/src/migration.rs` (after the `impl Dumps` block and any associated functions like `provide` / `provide_raw`). The tests assume that:
- `Dumps::new()` uses the `TRUSTIFY_MIGRATION_DUMPS` environment variable as the base cache directory.
- `Dumps::provide(&self, dump: &Dump<'_, String>) -> anyhow::Result<PathBuf>` (or similar) exists and downloads/caches the dump files defined in `Dump::files`.
- `Dumps::provide_raw(...)` is called internally by `provide` and handles locking/digest verification as implied by your comment.

If your actual signatures differ slightly, adjust the parameter/return types accordingly but keep the test logic intact.

```rust
#[cfg(test)]
mod tests {
    use super::{Dump, Dumps};
    use std::env;
    use std::fs;
    use std::io::Write;
    use std::net::{TcpListener, TcpStream};
    use std::path::{Path, PathBuf};
    use std::sync::{
        atomic::{AtomicUsize, Ordering},
        Arc,
    };
    use std::thread;
    use std::time::Duration;

    use sha2::{Digest, Sha256};
    use tempfile::TempDir;

    /// Simple HTTP response writer for our tiny test server.
    fn write_http_response(mut stream: TcpStream, status: &str, body: &[u8]) {
        let headers = format!(
            "HTTP/1.1 {status}\r\nContent-Length: {}\r\nConnection: close\r\n\r\n",
            body.len()
        );
        let _ = stream.write_all(headers.as_bytes());
        let _ = stream.write_all(body);
    }

    /// A tiny HTTP test server that serves three endpoints:
    ///
    /// - /dump.bin          => dump content (counts requests)
    /// - /dump.bin.sha256   => hex-encoded sha256 of dump content
    /// - /bad.bin.sha256    => incorrect sha256 (for negative test)
    struct TestServer {
        addr: String,
        request_count: Arc<AtomicUsize>,
        _handle: thread::JoinHandle<()>,
    }

    impl TestServer {
        fn start() -> Self {
            let listener = TcpListener::bind("127.0.0.1:0").expect("bind test server");
            let addr = listener.local_addr().unwrap().to_string();
            let request_count = Arc::new(AtomicUsize::new(0));
            let request_count_clone = Arc::clone(&request_count);

            let handle = thread::spawn(move || {
                // Fixed content for dump.bin
                let content = b"test-dump-content";
                let mut hasher = Sha256::new();
                hasher.update(content);
                let good_digest = format!("{:x}", hasher.finalize());
                let bad_digest = "0000000000000000000000000000000000000000000000000000000000000000";

                for stream in listener.incoming() {
                    if let Ok(stream) = stream {
                        let mut buf = [0u8; 1024];
                        let read_len = match stream.peek(&mut buf) {
                            Ok(n) => n,
                            Err(_) => continue,
                        };
                        let req = String::from_utf8_lossy(&buf[..read_len]);

                        let path = if let Some(line) = req.lines().next() {
                            line.split_whitespace().nth(1).unwrap_or("/")
                        } else {
                            "/"
                        };

                        request_count_clone.fetch_add(1, Ordering::SeqCst);

                        match path {
                            "/dump.bin" => {
                                write_http_response(stream, "200 OK", content);
                            }
                            "/dump.bin.sha256" => {
                                write_http_response(stream, "200 OK", good_digest.as_bytes());
                            }
                            "/bad.bin.sha256" => {
                                write_http_response(stream, "200 OK", bad_digest.as_bytes());
                            }
                            _ => {
                                write_http_response(stream, "404 Not Found", b"");
                            }
                        }
                    }
                }
            });

            TestServer {
                addr,
                request_count,
                _handle: handle,
            }
        }

        fn base_url(&self) -> String {
            format!("http://{}", self.addr)
        }

        fn request_count(&self) -> usize {
            self.request_count.load(Ordering::SeqCst)
        }
    }

    /// Helper to configure `Dumps` with a temporary base directory.
    fn make_dumps_with_tempdir() -> (Dumps, TempDir) {
        let tmp = TempDir::new().expect("tempdir");
        env::set_var("TRUSTIFY_MIGRATION_DUMPS", tmp.path());
        let dumps = Dumps::new().expect("Dumps::new");
        (dumps, tmp)
    }

    fn assert_file_contents_eq(path: &Path, expected: &[u8]) {
        let data = fs::read(path).expect("read cached file");
        assert_eq!(data, expected);
    }

    #[test]
    fn cache_is_reused_on_subsequent_provide_calls() {
        let server = TestServer::start();
        let (dumps, _tmp) = make_dumps_with_tempdir();

        let url_base = server.base_url();
        let dump = Dump {
            url: &url_base,
            files: &["dump.bin".to_string()],
            digests: true,
        };

        // First call should trigger a download.
        let path1 = dumps.provide(&dump).expect("first provide");
        assert!(path1.is_dir(), "expected provide to return a directory");

        // Second call should reuse the cached content.
        let path2 = dumps.provide(&dump).expect("second provide");
        assert_eq!(path1, path2, "expected same cache directory to be reused");

        // We know the server is used, but the key assertion is that
        // a second call does not cause an extra download of dump.bin.
        // This relies on `Dumps::provide` checking the cache/lock correctly.
        let count = server.request_count();
        assert!(
            count <= 3,
            "expected limited HTTP requests due to cache reuse, got {count}"
        );
    }

    #[test]
    fn digest_verification_succeeds_and_fails_as_expected() {
        let server = TestServer::start();
        let (dumps, _tmp) = make_dumps_with_tempdir();

        let url_base = server.base_url();

        // digests = true, matching sha256
        let dump_ok = Dump {
            url: &url_base,
            files: &["dump.bin".to_string()],
            digests: true,
        };
        let dir_ok = dumps.provide(&dump_ok).expect("provide with matching digest");
        let file_ok = dir_ok.join("dump.bin");
        assert!(file_ok.exists(), "file should exist after successful provide");
        assert_file_contents_eq(&file_ok, b"test-dump-content");

        // digests = true, but we point to a mismatched .sha256 file.
        // This assumes `provide` / `provide_raw` uses `{file}.sha256` from the same base URL.
        let dump_bad = Dump {
            url: &url_base,
            files: &["bad.bin".to_string()],
            digests: true,
        };

        let result_bad = dumps.provide(&dump_bad);
        assert!(
            result_bad.is_err(),
            "expected provide to fail when digest does not match"
        );

        // digests = false: missing .sha256 must not cause an error.
        let dump_no_digest = Dump {
            url: &url_base,
            files: &["dump.bin".to_string()],
            digests: false,
        };

        let dir_no_digest = dumps
            .provide(&dump_no_digest)
            .expect("provide without digest even if .sha256 is missing/unused");
        let file_no_digest = dir_no_digest.join("dump.bin");
        assert!(file_no_digest.exists());
        assert_file_contents_eq(&file_no_digest, b"test-dump-content");
    }

    #[test]
    fn concurrent_provide_calls_are_locked_and_content_is_not_corrupted() {
        use std::sync::Barrier;

        let server = TestServer::start();
        let (dumps, _tmp) = make_dumps_with_tempdir();

        let url_base = server.base_url();
        let dump = Dump {
            url: &url_base,
            files: &["dump.bin".to_string()],
            digests: true,
        };

        // Run two threads that call `provide` concurrently for the same dump.
        let barrier = Arc::new(Barrier::new(2));
        let dumps1 = dumps.clone();
        let dumps2 = dumps;

        let dump1 = Dump {
            url: dump.url,
            files: dump.files,
            digests: dump.digests,
        };
        let dump2 = Dump {
            url: dump.url,
            files: dump.files,
            digests: dump.digests,
        };

        let b1 = barrier.clone();
        let t1 = thread::spawn(move || {
            b1.wait();
            dumps1.provide(&dump1)
        });

        let b2 = barrier.clone();
        let t2 = thread::spawn(move || {
            b2.wait();
            dumps2.provide(&dump2)
        });

        let res1 = t1.join().expect("thread 1 panicked");
        let res2 = t2.join().expect("thread 2 panicked");

        let dir1 = res1.expect("thread 1 provide failed");
        let dir2 = res2.expect("thread 2 provide failed");

        assert_eq!(
            dir1, dir2,
            "both concurrent calls should resolve to the same cache directory"
        );
        let file = dir1.join("dump.bin");
        assert!(file.exists(), "cached file must exist");
        assert_file_contents_eq(&file, b"test-dump-content");
    }
}
```

Notes / adjustments you may need to make:

1. **Imports**  
   - If `sha2` or `tempfile` are not yet available in this crate, add them as test/dev dependencies in `test-context/Cargo.toml`:
     ```toml
     [dev-dependencies]
     sha2 = "0.10"
     tempfile = "3"
     ```
   - If your project already uses another hashing/tempdir crate, swap those in instead.

2. **`Dumps` API**  
   - If `Dumps` is not `Clone`, either:
     - Implement `Clone` for `Dumps` (e.g., it only wraps a `PathBuf`), or
     - Change the concurrent test to wrap `Dumps` in `Arc<Dumps>` and clone the `Arc` instead.

3. **`Dump` type**  
   - The tests use `Dump<'_, String>` with `files: &["dump.bin".to_string()]`.  
     If your `Dump` is usually constructed differently (e.g., `&[&str]`), update the tests to match.

4. **URL-to-path mapping**  
   - The tests assume `Dumps::provide` constructs URLs as `format!("{url}/{file}")`.  
     If your code uses a different convention, adjust `TestServer` paths and/or test URLs accordingly.

These tests will directly cover cache reuse, digest behavior, and locking semantics around `Dumps::provide` / `provide_raw`, and should be easy to adapt to your exact signatures and internal conventions.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@codecov

codecov bot commented Mar 2, 2026

Codecov Report

❌ Patch coverage is 57.39130% with 294 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.79%. Comparing base (dd31dae) to head (fc85e6b).
⚠️ Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
test-context/src/ctx/migration/snapshot/btrfs.rs 8.21% 185 Missing and 5 partials ⚠️
test-context/src/ctx/migration/snapshot/mod.rs 69.09% 20 Missing and 14 partials ⚠️
test-context/src/migration.rs 80.99% 8 Missing and 15 partials ⚠️
test-context/src/resource.rs 64.00% 18 Missing ⚠️
migration/src/data/run.rs 47.36% 9 Missing and 1 partial ⚠️
test-context/src/ctx/migration/mod.rs 92.03% 4 Missing and 5 partials ⚠️
common/src/decompress.rs 80.00% 1 Missing and 3 partials ⚠️
common/db/src/embedded.rs 33.33% 0 Missing and 2 partials ⚠️
migration/src/data/mod.rs 83.33% 0 Missing and 2 partials ⚠️
common/src/config.rs 87.50% 0 Missing and 1 partial ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2243      +/-   ##
==========================================
- Coverage   68.10%   67.79%   -0.32%     
==========================================
  Files         425      428       +3     
  Lines       24886    25433     +547     
  Branches    24886    25433     +547     
==========================================
+ Hits        16949    17242     +293     
- Misses       7018     7237     +219     
- Partials      919      954      +35     


@ctron ctron force-pushed the feature/migration_test_1 branch 5 times, most recently from e7e26d6 to a493616 Compare March 4, 2026 15:38
- run: df -h

- uses: actions/checkout@v4
- uses: actions/checkout@v5
👍

- name: Test
# use only one job, trying to reduce memory usage
run: cargo llvm-cov --codecov --jobs 1 --features _test-s3 --output-path codecov.json
run: cargo llvm-cov --codecov --jobs 1 --all-features --output-path codecov.json
👍

| `EXTERNAL_TEST_DB` | Run tests against external test database if set | |
| `EXTERNAL_TEST_DB_BOOTSTRAP` | Bootstrap the external test database if set | |
| `MEM_LIMIT_MB` | Set memory limit for tests that use TrustifyContext, shows the memory usage when the test reaches the limit | `500 MiB` |
| `TRUST_TEST_BTRFS_STORE` | Path to a BTRFS-backed directory used for snapshot-based tests | User's home |
👍

@ctron ctron force-pushed the feature/migration_test_1 branch from 2cfecee to 59fa48a Compare March 10, 2026 07:39
ctron added 3 commits March 16, 2026 15:30
In the case of in-band migrations, the sea-orm migration manager
executes all migrations inside a single transaction. This leads to the
problem that concurrent operations are not possible. One strategy,
however, is to run the data migrations upfront, using concurrency,
and then re-run the migrations while skipping the data migrations.
@ctron ctron force-pushed the feature/migration_test_1 branch from c89ae1a to f47d81c Compare March 16, 2026 14:30
@ctron
Contributor Author

ctron commented Mar 16, 2026

@sourcery-ai review

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 5 issues, and left some high level feedback:

  • The Dumps::with_lock helper is declared with an impl AsyncFnOnce parameter, which is not a stable or in-scope trait; consider changing the signature to take a closure F: FnOnce() -> Fut where Fut: Future<Output = anyhow::Result<R>> instead.
  • In download_artifacts_raw you use base.join(file).exists() from the blocking std filesystem API inside async code; for large or remote filesystems this can stall the executor, so it would be better to switch to tokio::fs::try_exists or move that check into a spawn_blocking section.
  • The snapshot setup path (e.g. Snapshot::setup and fix_zstd using WalkDir and OpenOptions) performs potentially heavy, blocking filesystem work directly in async functions; consider moving these loops into spawn_blocking to avoid blocking the async runtime when operating on large dumps.
## Individual Comments

### Comment 1
<location path="common/db/src/embedded.rs" line_range="82-84" />
<code_context>
-
-            let source = tokio::fs::File::open(&path).await?;
-            let source = BufReader::new(source);
+            log::info!("Importing database from: {}", path.display());

-            let source: Pin<Box<dyn AsyncRead + Send>> = match path
-                .extension()
-                .and_then(|ext| ext.to_str())
-            {
-                None | Some("sql") => Box::pin(source),
-                Some("xz") => Box::pin(async_compression::tokio::bufread::LzmaDecoder::new(source)),
-                Some(ext) => anyhow::bail!("Unsupported file type ({ext})"),
-            };
+            let source = decompress_async_read(path).await?;

             super::Database::import(&config, source)
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Change from failing on unsupported extensions to transparently passing them through may hide format errors

The previous logic failed fast on unsupported extensions, giving clear feedback for cases like `.sql.xz`/`.sql` mismatches or unexpected compression types. `decompress_async_read` now treats unknown extensions as uncompressed, which can defer errors until SQL parsing and make failures harder to diagnose. To keep the earlier, clearer failure mode, consider enforcing an allowlist of extensions here or adding a strict-mode option to `decompress_async_read` that does so.

Suggested implementation:

```rust
        Source::Import(path) => {
            log::info!("Importing database from: {}", path.display());

            // Enforce an allowlist of supported file extensions to fail fast on unexpected formats.
            match path.extension().and_then(|ext| ext.to_str()) {
                None | Some("sql") | Some("xz") => {}
                Some(ext) => anyhow::bail!("Unsupported file type ({ext})"),
            }

            let source = decompress_async_read(path).await?;

            super::Database::import(&config, source)
                .await

```

If `anyhow` is not already in scope in this file, add an appropriate import (e.g. `use anyhow::bail;` and then use `bail!` instead of `anyhow::bail!`, or ensure `anyhow` is available as a crate path). No changes to `decompress_async_read` are strictly required with this approach, since the allowlist is enforced at the call site.
</issue_to_address>

### Comment 2
<location path="common/src/decompress.rs" line_range="121-130" />
<code_context>
+}
+
+/// Take a file, return a wrapped [`Read`], and wrap that with the required compression decoder.
+pub fn decompress_read(path: impl AsRef<Path>) -> anyhow::Result<Box<dyn Read + Send>> {
+    let path = path.as_ref();
+    let source = std::fs::File::open(path)?;
+    let source = std::io::BufReader::new(source);
+
+    Ok(match path.extension().and_then(|ext| ext.to_str()) {
+        Some("xz") => Box::new(lzma_rust2::XzReader::new(source, false)),
+        Some("gz") => Box::new(flate2::read::GzDecoder::new(source)),
+        // Anything else could be .sql, .tar, or an unsupported compression format.
+        // In that case, the following code would fail to understand the compressed content.
+        None | Some(_) => Box::new(source),
+    })
+}
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Unknown file extensions are treated as plain data, which may mask compression mismatches

In the sync path, any non-`xz`/`gz` extension (e.g. `.bz2`) hits the `None | Some(_)` arm and is treated as uncompressed, so a misconfigured or unsupported extension will only fail later when consumers parse the data. Consider either restricting the `Some(_)` arm to known extensions and erroring on everything else, or at least logging a warning when an unrecognized extension is present.

Suggested implementation:

```rust
/// Take a file, return a wrapped [`Read`], and wrap that with the required compression decoder.
pub fn decompress_read(path: impl AsRef<Path>) -> anyhow::Result<Box<dyn Read + Send>> {
    let path = path.as_ref();
    let source = std::fs::File::open(path)?;
    let source = std::io::BufReader::new(source);

    Ok(match path.extension().and_then(|ext| ext.to_str()) {
        Some("xz") => Box::new(lzma_rust2::XzReader::new(source, false)),
        Some("gz") => Box::new(flate2::read::GzDecoder::new(source)),
        // No extension: treat as uncompressed input (e.g. `.sql`, `.tar`).
        None => Box::new(source),
        // Unknown/unsupported extension (e.g. `.bz2`): warn and treat as uncompressed.
        Some(ext) => {
            warn!(
                "Unknown compression extension `{}` for path {:?}; treating as uncompressed input",
                ext, path
            );
            Box::new(source)
        }
    })
}

```

```rust
    Ok(match path.extension().and_then(|ext| ext.to_str()) {
        Some("xz") => Box::pin(async_compression::tokio::bufread::LzmaDecoder::new(source)),
        Some("gz") => Box::pin(async_compression::tokio::bufread::GzipDecoder::new(source)),
        // No extension: treat as uncompressed input (e.g. `.sql`, `.tar`).
        None => Box::pin(source),
        // Unknown/unsupported extension (e.g. `.bz2`): warn and treat as uncompressed.
        Some(ext) => {
            warn!(
                "Unknown compression extension `{}` for path {:?}; treating as uncompressed input",
                ext, path
            );
            Box::pin(source)
        }
    })
}

```

1. Ensure a logging macro `warn!` is in scope for this module. For example, if the project uses `tracing`, add:
   - `use tracing::warn;`
   If it uses the `log` crate instead, add:
   - `use log::warn;`
2. If there is a central logging style (structured fields, custom spans, etc.), adjust the `warn!` calls to match that convention.
</issue_to_address>

### Comment 3
<location path="migration/src/data/mod.rs" line_range="43-47" />
<code_context>
 #[derive(Clone, Debug, PartialEq, Eq, clap::Parser)]
 pub struct Options {
     /// Number of concurrent documents being processes
-    #[arg(long, env = "MIGRATION_DATA_CONCURRENT", default_value = "5")]
-    pub concurrent: NonZeroUsize,
+    ///
+    /// If the value is zero, use the number of logical CPUs
+    #[arg(long, env = "MIGRATION_DATA_CONCURRENT", default_value = "0")]
+    pub concurrent: usize,

     /// The instance number of the current runner (zero based)
</code_context>
<issue_to_address>
**question (bug_risk):** Default concurrency differs between programmatic `Default` and CLI/env default, which may confuse behavior

`Options::default()` still sets `concurrent` to 5, while the clap default is now `0` (interpreted as `num_cpus::get()` at runtime). So programmatic defaults and CLI/env defaults diverge. If this isn’t intentional, please align them (e.g., have `Default` use the `0` → `num_cpus` behavior, or change the clap default to `5`).
</issue_to_address>

### Comment 4
<location path="modules/storage/src/service/fs.rs" line_range="121-124" />
<code_context>
         for compression in &self.read_compressions {
             target.set_extension(compression.extension());
             if try_exists(&target).await? {
+                log::debug!("Located: {}", target.display());
                 return Ok(Some((target, *compression)));
             }
         }
+        log::info!("Missing - base: {target:?}");
         Ok(None)
     }
</code_context>
<issue_to_address>
**suggestion (performance):** Using `info!` for missing file paths may be noisy in normal operation

Since `find_file` will be called often and missing compressed variants is expected, logging an `info!` on every miss could overwhelm production logs. Consider downgrading this to `debug!` or removing it unless you truly need it at info level.

```suggestion
        }
        log::debug!("Missing - base: {target:?}");
        Ok(None)
    }
```
</issue_to_address>

### Comment 5
<location path="migration/src/data/run.rs" line_range="41-50" />
<code_context>
+    }
+}
+
+impl Database {
+    pub async fn try_into_connection(self) -> Result<DatabaseConnection, DbErr> {
+        Ok(match self {
+            Self::Config { url, schema } => {
+                let schema = schema.clone().unwrap_or_else(|| "public".to_owned());
+
+                let connect_options = ConnectOptions::new(url)
+                    .set_schema_search_path(schema)
+                    .to_owned();
+
+                sea_orm::Database::connect(connect_options).await?
+            }
+            Self::Provided(database) => database.clone(),
+        })
+    }
</code_context>
<issue_to_address>
**suggestion (performance):** Unnecessary cloning of `schema` in `Config` arm and potential simplification of conversion

Inside `Config { url, schema }`, `schema` is already owned, so you can drop the clone and use `let schema = schema.unwrap_or_else(|| "public".to_owned());`. More broadly, `self.database.clone().try_into_connection()` only clones because `try_into_connection` takes `self` by value; if ownership transfer isn’t required, making it `&self` would remove the need to clone the enum (which may matter if `DatabaseConnection` is expensive to clone).

Suggested implementation:

```rust
impl Database {
    pub async fn try_into_connection(&self) -> Result<DatabaseConnection, DbErr> {
        Ok(match self {
            Self::Config { url, schema } => {
                let schema = schema
                    .as_ref()
                    .cloned()
                    .unwrap_or_else(|| "public".to_owned());

                let connect_options = ConnectOptions::new(url.clone())
                    .set_schema_search_path(schema)
                    .to_owned();

                sea_orm::Database::connect(connect_options).await?
            }
            Self::Provided(database) => database.clone(),
        })
    }
}

```

Anywhere `try_into_connection` is called, remove `.clone()` on the `Database` value and call it by reference instead. For example, change `self.database.clone().try_into_connection().await?` to `self.database.try_into_connection().await?`. Also ensure that any other direct uses of `schema` and `url` within `Config` arm remain valid now that the method takes `&self` (hence the `url.clone()` above to satisfy ownership for `ConnectOptions::new`).
</issue_to_address>



@ctron ctron force-pushed the feature/migration_test_1 branch from befb880 to b33da8f Compare March 16, 2026 15:31
@ctron ctron force-pushed the feature/migration_test_1 branch from b33da8f to 1293af9 Compare March 16, 2026 15:33
@ctron ctron force-pushed the feature/migration_test_1 branch from eb3ed45 to fc85e6b Compare March 17, 2026 09:00