-
Notifications
You must be signed in to change notification settings - Fork 11
improvement: reuse docker compose for e2e tests #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
78a567c to
f508d06
Compare
|
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 9387833 | Triggered | Generic Password | 22fcc91 | e2e/packages/client/reset_db.go | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
d6f09f0 to
7fa0723
Compare
f508d06 to
22fcc91
Compare
Greptile OverviewGreptile SummaryThis PR introduces a smart Docker Compose caching system to avoid recreating containers for E2E tests, along with new database and Redis reset utilities. Key ChangesContainer Reuse System (
Reset Utilities (
Test Improvements (
Critical Security IssueSQL Injection Vulnerability in Other Issues Found
Architecture AssessmentThe caching approach is sound and will significantly speed up test execution by reusing containers. The reset mechanism properly isolates tests while avoiding the overhead of full container recreation. However, the SQL injection issue needs immediate attention before merging. Confidence Score: 2/5
Important Files ChangedFile Analysis
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7 files reviewed, 7 comments
e2e/packages/infisical/compose.go
Outdated
| if len(containers) > 0 { | ||
| services := make([]string, 0, len(s.Project.Services)) | ||
| for name := range s.Project.Services { | ||
| services = append(services, name) | ||
| } | ||
|
|
||
| missingServices := make(map[string]int, len(services)) | ||
| for _, service := range services { | ||
| missingServices[service] = 1 | ||
| } | ||
| for _, c := range containers { | ||
| if c.State == container.StateRunning { | ||
| serviceName, ok := c.Labels[api.ServiceLabel] | ||
| if !ok { | ||
| continue | ||
| } | ||
| _, ok = missingServices[serviceName] | ||
| if ok { | ||
| delete(missingServices, serviceName) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if len(missingServices) == 0 { | ||
| provider, err := testcontainers.NewDockerProvider(testcontainers.WithLogger(log.Default())) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| s.dockerCompose = &RunningCompose{ | ||
| name: uniqueName, | ||
| client: dockerClient, | ||
| provider: provider, | ||
| services: services, | ||
| containers: make(map[string]*testcontainers.DockerContainer), | ||
| } | ||
| slog.Info("Found existing running containers", "name", uniqueName) | ||
| // Found existing compose, reuse instead | ||
| return s.dockerCompose.Up(ctx) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code checks if containers exist and are running, then decides to reuse them. However, there's a time-of-check to time-of-use (TOCTOU) race condition here. Between checking the container state (lines 61-82) and actually using it (line 98), the containers could:
- Stop running
- Be removed by another process
- Change state
While this is test code and the race window is small, in concurrent test scenarios this could lead to flaky tests where containers appear to exist but fail to respond.
Recommendation: Consider adding retry logic or additional state validation when attempting to reuse existing containers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache system is not meant for CI test or any form of parallel way of running yet. For now, the main focus is on speeding up the development cycle of writing the test. Because otherwise we will waste tons of time in just waiting the compose container to boot up. If we really care about it, we will either set CLI_E2E_DISABLE_COMPOSE_CACHE=1 to disable this feature on CI or add a lock to it.
| panic("Cannot modify running compose") | ||
| } | ||
|
|
||
| func (c *RunningCompose) WithEnv(m map[string]string) compose.ComposeStack { | ||
| panic("Cannot modify running compose") | ||
| } | ||
|
|
||
| func (c *RunningCompose) WithOsEnv() compose.ComposeStack { | ||
| panic("Cannot modify running compose") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods use panic() to indicate that operations are not supported on running compose instances. While this makes the intent clear, panicking in library code is generally considered poor practice as it prevents callers from gracefully handling the error.
Recommendation: Consider returning an error instead:
| panic("Cannot modify running compose") | |
| } | |
| func (c *RunningCompose) WithEnv(m map[string]string) compose.ComposeStack { | |
| panic("Cannot modify running compose") | |
| } | |
| func (c *RunningCompose) WithOsEnv() compose.ComposeStack { | |
| panic("Cannot modify running compose") | |
| } | |
| func (c *RunningCompose) WaitForService(s string, strategy wait.Strategy) compose.ComposeStack { | |
| // Cannot modify running compose - return self to maintain interface compatibility | |
| return c | |
| } | |
| func (c *RunningCompose) WithEnv(m map[string]string) compose.ComposeStack { | |
| // Cannot modify running compose - return self to maintain interface compatibility | |
| return c | |
| } | |
| func (c *RunningCompose) WithOsEnv() compose.ComposeStack { | |
| // Cannot modify running compose - return self to maintain interface compatibility | |
| return c | |
| } |
Or if you want to preserve the error behavior, log a warning instead of panicking since these methods are part of the ComposeStack interface and may be called by library code that doesn't expect panics.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, this library is for test. We don't want to handle the error, cashing the app is the easy way to tell the developer that you shouldn't do that. There's no need for recovery from that kind of error anyway, and this is not code for production anyway.
| if _, ok := opts.SkipTables[table]; ok { | ||
| continue | ||
| } | ||
| builder.WriteString(fmt.Sprintf("TRUNCATE TABLE %s RESTART IDENTITY CASCADE;\n", table)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚨 CRITICAL: SQL Injection Vulnerability - Unescaped Table Identifiers
This line directly interpolates table names from the database into a SQL TRUNCATE statement without proper identifier quoting. While table names come from information_schema, a malicious actor who can create tables with special characters in their names could potentially inject SQL.
Attack Scenario:
If an attacker can create a table named something like: myTable; DROP DATABASE infisical; --, this could execute arbitrary SQL commands.
Example of vulnerability:
table := `myTable; DROP TABLE important_table; --`
// Results in: TRUNCATE TABLE myTable; DROP TABLE important_table; -- RESTART IDENTITY CASCADE;Fix: Use PostgreSQL identifier quoting with pgx.Identifier or properly escape identifiers:
| builder.WriteString(fmt.Sprintf("TRUNCATE TABLE %s RESTART IDENTITY CASCADE;\n", table)) | |
| builder.WriteString(fmt.Sprintf("TRUNCATE TABLE %s RESTART IDENTITY CASCADE;\n", pgx.Identifier{schema, table}.Sanitize())) |
Or use double quotes for identifier escaping:
// Escape table names properly
escapedTable := strings.ReplaceAll(table, `"`, `""`)
builder.WriteString(fmt.Sprintf(`TRUNCATE TABLE "%s" RESTART IDENTITY CASCADE;`, escapedTable) + "\n")Security Impact: HIGH - Could allow arbitrary SQL execution in test environments, potentially affecting other tests or leaking data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, hope, this is an e2e test running with a one time use database in a container. there's zero value to SQL inject it, not to mention the table name is from the db itself instead of user input. you have no idea what you're talking about, it's just the stats making you say that it looks like a SQL injection attack vector unfortunately 😅
| slog.Info("Connected to Redis", "pong", pong) | ||
|
|
||
| // Clear all keys in the current database | ||
| err = rdb.FlushAll(ctx).Err() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code uses FlushAll() which clears all databases in the Redis instance (databases 0-15 by default), not just the currently selected one. This could be problematic if:
- Multiple test suites run in parallel and use different Redis database numbers
- The Redis instance is shared with other applications during development
Consider: Using FlushDB() instead to only flush the current database, or explicitly select a specific database number before flushing:
// Flush only the current database (safer for shared Redis)
err = rdb.FlushDB(ctx).Err()Or if you intentionally want to clear all databases (current behavior is likely correct for e2e tests):
// FlushAll clears all databases - this is intentional for e2e test reset
err = rdb.FlushAll(ctx).Err()Add a comment to clarify the intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I don't need to add a comment. The intention is very clear: reset_redis. And this is under e2e folder / package, it's obviously for e2e test to reset the redis db.
| _, err = conn.Exec(ctx, | ||
| `INSERT INTO public.super_admin ("id", "fipsEnabled", "initialized", "allowSignUp") VALUES ($1, $2, $3, $4)`, | ||
| "00000000-0000-0000-0000-000000000000", true, false, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code inserts a hardcoded all-zeros identifier for the super_admin record. While this is consistent and deterministic for testing, there are potential issues:
- If the truncate operation fails for the
super_admintable (due to it being in SkipTables), this INSERT will fail with a unique constraint violation - The hardcoded values may not match what a real initialization would create
Recommendation: Consider using ON CONFLICT DO UPDATE or ON CONFLICT DO NOTHING to make this operation idempotent and prevent failures on repeated calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, if the truncate failed for whatever reason, the whole test should just fail. And hope, the super admin is exactly like that using this uuid. the quality of review is really bad this time unfortunately
Description 📣
ref: https://linear.app/infisical/issue/PLATFRM-134/make-a-smarter-docker-compose-cache-system-to-avoid-recreating-compose
based on: #91
Tests 🛠️
The way you test this PR is mostly like the #91. The major difference is that now we have a cache system for the docker-compose. If we found a compose (multiple containers with a compose label) already meets the spec, we will not create a new one but reuse it instead. The purpose is to speed up development cycle.
To ensure this system works, you can run the only relay test case we have right now with
TESTCONTAINERS_RYUK_DISABLED=true. This will stop the Ryuk container from cleaning up our compose containers. After successfully running the relay test case, you can then run again to see how long it takes and if it reuses the existing container.After that, you can also set
CLI_E2E_DISABLE_COMPOSE_CACHE=1to disable the cache and you should see it boot the container from the ground up instead of using the cache.Type ✨