
Conversation

@ahmedxgouda
Collaborator

@ahmedxgouda ahmedxgouda commented Nov 24, 2025

Proposed change

Resolves #2709

  • Added DB environment variables to the Docker Compose backend service.
  • Added `csrf_decorate`, which skips CSRF protection when the environment is e2e.
  • Fixed a REST API internal error.
  • Updated the docs to explain how to update the e2e DB data.

Checklist

  • I've read and followed the contributing guidelines.
  • I've run make check-test locally; all checks and tests passed.

@coderabbitai
Contributor

coderabbitai bot commented Nov 24, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced end-to-end testing environment configuration with dedicated initialization workflow.
    • Added data export utility for generating database snapshots.
  • Improvements

    • E2E testing workflow now includes timeout protection and optimized initialization sequence.
    • Migrated data operations to SQL-based approach.
  • Tests

    • Updated test infrastructure to support new data handling approach.


Walkthrough

Adds E2E-specific settings and SITE_URL, inserts an IS_E2E_ENVIRONMENT branch in API settings, introduces a new SQL-based dump_data management command with tests, removes the prior load_data command/tests, updates Makefile and CI workflows to use SQL dumps and timeouts, and adjusts env examples, lint ignores, and dictionary entries.

Changes

  • E2E settings & API customization — backend/settings/e2e.py, backend/apps/api/rest/v0/__init__.py, backend/.env.e2e.example: Added SITE_URL and CACHES to E2E settings; added an IS_E2E_ENVIRONMENT branch setting auth=None, a single E2E server using SITE_URL, and an empty throttle list; changed the DJANGO_DB_HOST example value.
  • E2E workflow / CI — .github/workflows/setup-e2e-environment/action.yaml, .github/workflows/run-ci-cd.yaml: Added 5-minute timeouts around DB/backend readiness waits; moved the initial Postgres data load to after backend startup and switched to nest.sql.gz; added DJANGO_DB_HOST=localhost to the backend step; updated the frontend env file to .env.e2e.example.
  • Data dump & Makefile — backend/apps/common/management/commands/dump_data.py, backend/Makefile: Added a dump_data management command that creates a temporary DB, masks email fields, and produces a gzipped SQL dump via pg_dump; updated Makefile targets to use and load nest.sql.gz into containers; removed the old dump/load-e2e targets and Django-based load.
  • Tests — backend/tests/apps/common/management/commands/dump_data_test.py, backend/tests/apps/common/management/commands/load_data_test.py: Added unit tests for the new dump_data command; removed tests for the removed load_data command.
  • Lint, docs, dictionary — backend/pyproject.toml, CONTRIBUTING.md, cspell/custom-dict.txt: Added per-file Ruff ignores (S603, S607) for the new command; documented the gzip prerequisite in CONTRIBUTING; added dictionary entries (Atqc, PGPASSWORD, attisdropped, nspname).
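The dump_data flow summarized above (temporary DB, email masking, gzipped pg_dump) can be sketched roughly as follows. The table name `github_user`, the helper name, and the exact flags are assumptions for illustration — the commands are only constructed here, not executed:

```python
import shlex


def build_dump_commands(db: str, temp_db: str, user: str, host: str):
    """Return the three steps of the dump flow as argv lists / a shell pipeline."""
    # 1. Clone the source database into a disposable copy.
    create_temp = [
        "psql", "-h", host, "-U", user, "-d", "postgres",
        "-c", f"CREATE DATABASE {temp_db} TEMPLATE {db};",
    ]
    # 2. Mask PII in the copy (table/column names are hypothetical).
    mask_emails = [
        "psql", "-h", host, "-U", user, "-d", temp_db,
        "-c", "UPDATE github_user SET email = '' WHERE email <> '';",
    ]
    # 3. Dump the masked copy through gzip.
    dump = (
        f"pg_dump -h {host} -U {user} {shlex.quote(temp_db)}"
        " | gzip -9 > backend/data/nest.sql.gz"
    )
    return create_temp, mask_emails, dump


create, mask, dump = build_dump_commands(
    "nest_db_dev", "nest_db_tmp", "nest_user_dev", "db"
)
```

Cleanup of the temporary database on failure (the trap discussed later in this thread) would wrap these steps.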

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas needing focused review:
    • backend/apps/common/management/commands/dump_data.py — subprocess usage, environment/PGPASSWORD handling, SQL masking correctness, and cleanup on failure.
    • .github/workflows/setup-e2e-environment/action.yaml — ordering of steps, timeouts, and the updated dump file path.
    • Consistency of SITE_URL and DJANGO_DB_HOST across settings, env examples, Makefile, and CI.

Possibly related PRs

Suggested labels

docker

Suggested reviewers

  • kasya
  • arkid15r

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 38.46%, which is below the required 80.00% threshold. Run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (4 passed)

  • Title check — ✅ Passed: The title 'Fix running e2e backend' is concise and directly relates to the main objective of resolving the e2e backend instance internal server errors.
  • Description check — ✅ Passed: The description is related to the changeset, listing specific changes made including environment variables, CSRF decoration, API fixes, and documentation updates for e2e database data.
  • Linked Issues check — ✅ Passed: The PR directly addresses issue #2709 by eliminating the internal server error in the e2e backend through multiple fixes: E2E environment configuration, REST API settings customization, database configuration, and workflow adjustments [#2709].
  • Out of Scope Changes check — ✅ Passed: All changes are scoped to fixing the e2e backend instance: E2E environment settings, API customization, database configuration, workflow setup, data dumping/loading, and documentation updates are all necessary for the objective.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1570cf and ddc7e32.

📒 Files selected for processing (1)
  • backend/tests/apps/common/management/commands/dump_data_test.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/tests/apps/common/management/commands/dump_data_test.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run CI Denendencies Scan
  • GitHub Check: Run Code Scan
  • GitHub Check: CodeQL (python)


@github-actions

The PR must be linked to an issue assigned to the PR author.

@github-actions github-actions bot closed this Nov 24, 2025
@ahmedxgouda
Collaborator Author

I think I am assigned to the issue :)
[screenshot of the issue assignment]

@ahmedxgouda ahmedxgouda reopened this Nov 24, 2025
@github-actions
Copy link

The PR must be linked to an issue assigned to the PR author.

@github-actions github-actions bot closed this Nov 24, 2025
@ahmedxgouda ahmedxgouda linked an issue Nov 24, 2025 that may be closed by this pull request
@ahmedxgouda ahmedxgouda reopened this Nov 24, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/apps/common/utils.py (1)

71-76: Consider adding type hints for consistency.

Other functions in this file include type hints (e.g., convert_to_camel_case(text: str) -> str). Consider adding type hints to match the existing code style.

```diff
-def csrf_decorate(view):
+from typing import Callable
+from django.http import HttpRequest, HttpResponse
+
+def csrf_decorate(view: Callable[[HttpRequest], HttpResponse]) -> Callable[[HttpRequest], HttpResponse]:
     """Apply CSRF protection based on settings."""
     if settings.IS_E2E_ENVIRONMENT:
         return csrf_exempt(view)  # NOSONAR
     return csrf_protect(view)
```
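As a standalone illustration of the conditional-decorator pattern discussed here, a minimal sketch with the Django pieces stubbed out (the `_Settings` class and the stub `csrf_*` decorators are assumptions for demonstration, not project code):

```python
# Minimal, self-contained sketch of the conditional CSRF decorator pattern.
# `settings` and the csrf_* decorators are stubs standing in for Django's.
class _Settings:
    IS_E2E_ENVIRONMENT = False


settings = _Settings()


def csrf_exempt(view):
    view.csrf_exempt = True
    return view


def csrf_protect(view):
    view.csrf_protected = True
    return view


def csrf_decorate(view):
    """Apply CSRF protection based on settings."""
    if settings.IS_E2E_ENVIRONMENT:
        return csrf_exempt(view)
    return csrf_protect(view)


def my_view(request):
    return "ok"


protected = csrf_decorate(my_view)  # csrf_protect path in non-e2e settings
```

In real code the branch is evaluated at decoration time (import time), so flipping the setting afterwards has no effect on already-decorated views.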
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 133c89f and fd0aa86.

📒 Files selected for processing (3)
  • backend/apps/common/utils.py (2 hunks)
  • backend/settings/urls.py (2 hunks)
  • docker-compose/e2e.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-31T13:48:09.830Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2155
File: frontend/graphql-codegen.ts:0-0
Timestamp: 2025-08-31T13:48:09.830Z
Learning: In the OWASP/Nest project, Django's GraphQL endpoint accepts CSRF tokens via 'x-csrftoken' header (lowercase) without requiring a Referer header, working fine in their configuration for GraphQL codegen introspection.

Applied to files:

  • backend/settings/urls.py
🧬 Code graph analysis (1)
backend/settings/urls.py (1)
backend/apps/common/utils.py (1)
  • csrf_decorate (71-75)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (4)
docker-compose/e2e.yaml (1)

15-20: LGTM! DB environment variables properly configured.

Adding explicit database connection environment variables ensures the backend service can connect to the PostgreSQL database. The values correctly reference the db service name for the host and align with the database service configuration (lines 38-40). This explicit configuration takes precedence over the env_file and likely resolves the internal server error mentioned in the linked issue.
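A hedged sketch of what the explicit DB environment variables on the backend service might look like; the service layout and values here are assumptions reconstructed from the review text, not the actual docker-compose/e2e.yaml contents:

```yaml
services:
  backend:
    environment:
      DJANGO_DB_HOST: db            # matches the db service name below
      DJANGO_DB_NAME: nest_db_e2e
      DJANGO_DB_USER: nest_user_e2e
      DJANGO_DB_PASSWORD: nest_user_e2e_password
      DJANGO_DB_PORT: "5432"
  db:
    image: postgres
```

Values set under `environment` take precedence over those loaded via `env_file`, which is why this explicit block can fix a misconfigured host.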

backend/settings/urls.py (2)

14-14: LGTM! Import correctly references the new utility.

The import is correctly placed and the csrf_decorate utility is appropriately used for both the Algolia search and GraphQL endpoints.


24-25: Implementation correctly applies conditional CSRF protection.

The routes now use csrf_decorate which conditionally applies CSRF protection based on the environment. In production, these endpoints remain protected. Security implications of the E2E exemption are addressed in the review of backend/apps/common/utils.py.

Based on learnings, the GraphQL endpoint accepts CSRF tokens via the 'x-csrftoken' header, which continues to work with this implementation in non-E2E environments.

backend/apps/common/utils.py (1)

71-76: Verification confirms safeguards already in place—concern is substantially mitigated.

The architecture provides multiple layers of protection:

  1. Defensive default: IS_E2E_ENVIRONMENT = False in backend/settings/base.py
  2. Explicit opt-in: E2E config requires DJANGO_CONFIGURATION=E2E environment variable (not set in production deployment)
  3. Separate deployment: E2E environments use distinct .env.e2e.example and separate docker-compose files; production uses production_nest-app-network

The code pattern itself is sound—settings default to False and require explicit configuration to enable E2E behavior. The residual risk (human misconfiguration of DJANGO_CONFIGURATION) is addressed through deployment infrastructure and separate environment configurations.

@github-actions github-actions bot added the ci label Nov 24, 2025
@github-actions github-actions bot added the docs Improvements or additions to documentation label Nov 24, 2025
@ahmedxgouda ahmedxgouda marked this pull request as ready for review November 24, 2025 12:40
CONTRIBUTING.md Outdated

**Please note that you only need to do these steps once.**

#### Updating e2e Test Data
Collaborator

My data file update comment was not about the instructions for data loading. It was about how to maintain the nest-e2e.sql.gz file.

```python
return url.strip().rstrip(".,;:!?") or None


def csrf_decorate(view):
```
Collaborator

Is there a way to avoid settings based on the running environment? We want it to be as close to production as possible.

Collaborator Author

The last time I tested sending fuzzing requests (in the fuzzing PR), it didn't work even with the CSRF header set, but that was before we had a dedicated backend instance for e2e. I will remove this for now and try again in the fuzzing PR.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/.env.e2e.example (1)

7-13: Consider ordering DJANGO_DB_* variables alphabetically.

The change is functionally correct and aligns with the PR objectives. However, DJANGO_DB_HOST at line 9 is not in alphabetical order relative to other keys. Reordering these variables alphabetically would improve consistency and satisfy dotenv-linter's expectations.

Apply this diff to reorder the DB-related variables alphabetically:

```diff
 DJANGO_SETTINGS_MODULE=settings.e2e
 DJANGO_CONFIGURATION=E2E
-DJANGO_DB_HOST=db
 DJANGO_DB_NAME=nest_db_e2e
+DJANGO_DB_HOST=db
 DJANGO_DB_USER=nest_user_e2e
 DJANGO_DB_PASSWORD=nest_user_e2e_password
 DJANGO_DB_PORT=5432
```

Note: If your project uses dotenv-linter or similar tools in CI, you may want to apply this reorganization to avoid future warnings.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1556873 and c35e43f.

📒 Files selected for processing (3)
  • .github/workflows/setup-e2e-environment/action.yaml (3 hunks)
  • CONTRIBUTING.md (1 hunks)
  • backend/.env.e2e.example (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/setup-e2e-environment/action.yaml
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-17T15:25:55.624Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/providers.tf:1-3
Timestamp: 2025-10-17T15:25:55.624Z
Learning: The infrastructure code in the OWASP/Nest repository (infrastructure/ directory) is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • CONTRIBUTING.md
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • backend/.env.e2e.example
🪛 dotenv-linter (4.0.0)
backend/.env.e2e.example

[warning] 9-9: [UnorderedKey] The DJANGO_DB_HOST key should go before the DJANGO_SETTINGS_MODULE key

(UnorderedKey)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (2)
CONTRIBUTING.md (2)

444-470: Ensure the e2e data loading procedure aligns with the linked issue objectives.

The new "Updating e2e Test Data" section documents the procedure for refreshing e2e test data. Per the PR objective to "resolve the e2e backend instance error," verify that these steps fully address the data setup requirements and are consistent with any Docker Compose or backend setup changes made elsewhere in this PR.

A past reviewer flagged concerns about maintaining the nest-e2e.sql.gz data file itself. Confirm that the current documentation provides sufficient guidance on how to update the source data file (if applicable) before reloading into the container, rather than just the container lifecycle steps.


444-470: Let me retry the shell script execution to verify these issues:

```shell
#!/bin/bash
# Check the exact indentation on line 469 in CONTRIBUTING.md
echo "=== Checking line 469 indentation in CONTRIBUTING.md ==="
sed -n '465,475p' CONTRIBUTING.md | cat -A

echo ""
echo "=== Finding docker-compose files ==="
find . -maxdepth 3 -type f \( -name "docker-compose.yml" -o -name "docker-compose.yaml" -o -name ".env*" \) 2>/dev/null | head -20
```

```shell
#!/bin/bash
# Search for e2e database volume definitions
echo "=== Searching for e2e-db volume definitions ==="
rg -n "e2e.*db.*volume|nest-e2e" --type yaml -C 3
```

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/Makefile (1)

69-71: Clarify the relationship between dump-data-local and dump-data-e2e targets.

Both dump-data-local and dump-data-e2e write to the same output file (backend/data/nest-e2e.sql.gz), which could be confusing for developers. The naming suggests these serve different purposes, but they're actually part of the same workflow (local data → e2e seed data).

Consider renaming either the target or output file to make the intent clearer:

  • Option 1: Rename the output to nest-e2e-local-seed.sql.gz to indicate it's local data prepared for e2e
  • Option 2: Rename the target to prepare-e2e-data-from-local to clarify its purpose
  • Option 3: Add a brief comment above the target explaining the workflow

The PR description mentions updated documentation on updating e2e DB data—ensure the docs clarify this workflow.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c35e43f and d2d8a58.

📒 Files selected for processing (1)
  • backend/Makefile (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
📚 Learning: 2025-10-26T12:50:50.512Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.

Applied to files:

  • backend/Makefile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)

backend/Makefile Outdated
```makefile
	@echo "Dumping Nest e2e data"
	@CMD="pg_dumpall -U nest_user_e2e --clean | gzip -9 > backend/data/nest-e2e.sql.gz" $(MAKE) exec-db-command-e2e

dump-data-local:
```
Collaborator

Let's look at this one more time. Are we going to use our full data snapshot for e2e testing (e.g. the entire data from nest.json.gz)? If so I'd prefer to keep it simple and have just a single data file -- without nest-e2e.sql.gz introduction.

Collaborator Author

@ahmedxgouda ahmedxgouda Nov 27, 2025

Maybe we can use nest.sql.gz instead of nest.json.gz, as it is much faster to load.

```python
def decorator(view_func):
    @wraps(view_func)
    def _wrapper(request, *args, **kwargs):
        if settings.IS_E2E_ENVIRONMENT:
```
Collaborator

Why e2e needs this change?

Collaborator Author

We didn't set up a cache for e2e, so when the backend tries to access the cache it returns a 500 internal error. I think there is an option to set up a Redis cache in CI/CD. Maybe we can do that, or keep it simple.

Collaborator

Yes, let's add cache service for e2e instead. The closer to production architecture the better -- for both local and CI/CD cases.

Collaborator Author

Ok I will add the cache in another PR after this one.

Collaborator

Let's configure the cache backend for e2e via Django settings (locmem for now).
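A minimal sketch of what that could look like in the e2e settings module, assuming Django's local-memory cache backend (the exact LOCATION name is an assumption):

```python
# Hypothetical e2e settings fragment: in-memory cache, no external service.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        "LOCATION": "nest-e2e",
    }
}
```

LocMemCache is per-process and non-persistent, which is acceptable for e2e runs but, as noted above, further from production than a dedicated Redis service.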

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
backend/apps/api/rest/v0/__init__.py (1)

61-71: Consider combining duplicate environment configurations.

The E2E configuration (lines 61-71) is nearly identical to the local configuration (lines 49-59), differing only in the server description. This duplication can be reduced for better maintainability.

Apply this diff to combine the configurations:

```diff
-if settings.IS_LOCAL_ENVIRONMENT:
+if settings.IS_LOCAL_ENVIRONMENT or settings.IS_E2E_ENVIRONMENT:
+    description = "E2E" if settings.IS_E2E_ENVIRONMENT else "Local"
     api_settings_customization = {
         "auth": None,
         "servers": [
             {
-                "description": "Local",
+                "description": description,
                 "url": settings.SITE_URL,
             }
         ],
         "throttle": [],
     }
-
-elif settings.IS_E2E_ENVIRONMENT:
-    api_settings_customization = {
-        "auth": None,
-        "servers": [
-            {
-                "description": "E2E",
-                "url": settings.SITE_URL,
-            }
-        ],
-        "throttle": [],
-    }
-
-elif settings.IS_STAGING_ENVIRONMENT:
+
+elif settings.IS_STAGING_ENVIRONMENT:
```
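The combined branch suggested above can be exercised standalone; the `_Settings` stub below is an assumption replacing the real Django settings object:

```python
# Standalone check that the merged local/E2E branch picks the right description.
class _Settings:
    IS_LOCAL_ENVIRONMENT = False
    IS_E2E_ENVIRONMENT = True
    SITE_URL = "http://localhost:9000"


settings = _Settings()

api_settings_customization = None
if settings.IS_LOCAL_ENVIRONMENT or settings.IS_E2E_ENVIRONMENT:
    description = "E2E" if settings.IS_E2E_ENVIRONMENT else "Local"
    api_settings_customization = {
        "auth": None,
        "servers": [{"description": description, "url": settings.SITE_URL}],
        "throttle": [],
    }
```

Both environments get identical auth/throttle behavior; only the server description differs, which is what makes the merge safe.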
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2d8a58 and 74bdb6c.

⛔ Files ignored due to path filters (2)
  • backend/data/nest-e2e.sql.gz is excluded by !**/*.gz
  • backend/data/nest.sql.gz is excluded by !**/*.gz
📒 Files selected for processing (5)
  • .github/workflows/setup-e2e-environment/action.yaml (4 hunks)
  • CONTRIBUTING.md (1 hunks)
  • backend/Makefile (2 hunks)
  • backend/apps/api/rest/v0/__init__.py (1 hunks)
  • backend/data/dump.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • CONTRIBUTING.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
📚 Learning: 2025-10-26T12:50:50.512Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.

Applied to files:

  • .github/workflows/setup-e2e-environment/action.yaml
  • backend/Makefile
🪛 GitHub Actions: Run CI/CD
backend/data/dump.sh

[error] 3-3: CSpell: Unknown word (PGPASSWORD)


[error] 13-13: CSpell: Unknown word (Atqc)


[error] 15-15: CSpell: Unknown word (nspname)


[error] 23-23: CSpell: Unknown word (attisdropped)


[error] 24-24: CSpell: Unknown word (nspname)


[error] CSpell: 5 unknown words found in 1 file. CSpell check failed; verify spelling or update dictionary.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (4)
backend/Makefile (1)

50-52: LGTM! Simplified dump workflow.

The refactored dump-data target now delegates to the dedicated dump.sh script, which is cleaner and more maintainable than the previous inline Django command sequence.

.github/workflows/setup-e2e-environment/action.yaml (3)

10-15: Good addition of timeout safeguards.

Adding 1-minute timeouts to the readiness loops prevents the workflow from hanging indefinitely if services fail to start. This improves CI reliability and provides faster feedback on failures.

Also applies to: 57-62


10-15: Verify 1-minute timeout is adequate for CI environments.

The 1-minute timeout for both database and backend readiness might be too short in resource-constrained or slow CI runners, potentially causing false failures. Consider whether this duration provides enough margin for typical startup times plus occasional CI slowness.

Monitor the first few CI runs after merging to ensure:

  1. Services consistently start within the timeout
  2. No spurious timeout failures occur during slow CI periods

If timeout failures occur, consider increasing to 2-3 minutes or making it configurable via environment variable.

Also applies to: 57-62


26-26: Unable to verify file existence due to repository access restrictions.

The review comment requests verification that backend/data/nest.sql.gz exists, but I cannot access the OWASP Nest repository to confirm this. The PostgreSQL import command syntax is correct (gunzip -c ... | psql), and the filename nest.sql.gz follows standard PostgreSQL dump conventions.

The verification should be completed by:

  1. Running the proposed script locally to confirm the file exists and is valid
  2. Ensuring the file path is updated consistently across any related Makefile or documentation changes
  3. Verifying the data file contains appropriate E2E test fixtures

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
backend/data/dump.sh (2)

14-15: Add environment variable validation at script entry.

The script doesn't validate that required environment variables are set before use. If any variable is empty or unset, PostgreSQL commands fail with cryptic errors downstream. Building on prior feedback, add explicit validation.

Apply this diff to validate required variables early:

```diff
 #!/bin/sh
+# cspell:ignore PGPASSWORD Atqc nspname attisdropped
 
 set -e
 
+# Validate required environment variables
+: "${DJANGO_DB_PASSWORD:?DJANGO_DB_PASSWORD is required}"
+: "${DJANGO_DB_HOST:?DJANGO_DB_HOST is required}"
+: "${DJANGO_DB_USER:?DJANGO_DB_USER is required}"
+: "${DJANGO_DB_NAME:?DJANGO_DB_NAME is required}"
+
 # Cleanup function to ensure temp DB is dropped even on error
 cleanup() {
     if [ -n "$TEMP_DB" ]; then
         echo "Cleaning up temporary database $TEMP_DB..."
         psql -h "$DJANGO_DB_HOST" -U "$DJANGO_DB_USER" -d postgres -c "DROP DATABASE IF EXISTS $TEMP_DB;" 2>/dev/null || true
     fi
 }
 trap cleanup EXIT
 
 export PGPASSWORD="$DJANGO_DB_PASSWORD"
```
This adds the CSpell configuration (which was flagged in prior CI runs) and validates all critical variables before any database operations.
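The `${VAR:?message}` guard used above can be demonstrated in isolation; the variable name mirrors the script's but the `check` helper is hypothetical:

```shell
#!/bin/sh
# Demonstration of the ${VAR:?message} guard: the expansion fails with a
# non-zero status when the variable is unset or empty, and succeeds otherwise.
check() {
    ( : "${DJANGO_DB_PASSWORD:?DJANGO_DB_PASSWORD is required}" ) 2>/dev/null \
        && echo "set" || echo "missing"
}

unset DJANGO_DB_PASSWORD
check                      # prints "missing"

DJANGO_DB_PASSWORD=secret
export DJANGO_DB_PASSWORD
check                      # prints "set"
```

Running the expansion in a subshell keeps the demo alive; in the real script the failure aborts immediately, which is the point of validating early.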


6-11: Add explicit return statement to cleanup function for SonarCloud compliance.

SonarCloud flags shell functions without explicit return statements. While this is often a false positive for shell cleanup functions, adding an explicit return improves linter compliance.

Apply this diff:

```diff
 cleanup() {
     if [ -n "$TEMP_DB" ]; then
         echo "Cleaning up temporary database $TEMP_DB..."
         psql -h "$DJANGO_DB_HOST" -U "$DJANGO_DB_USER" -d postgres -c "DROP DATABASE IF EXISTS $TEMP_DB;" 2>/dev/null || true
     fi
+    return 0
 }
```
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b4c9e2 and f747481.

📒 Files selected for processing (1)
  • backend/data/dump.sh (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
🪛 GitHub Check: SonarCloud Code Analysis
backend/data/dump.sh

[warning] 6-6: Add an explicit return statement at the end of the function.

See more on https://sonarcloud.io/project/issues?id=OWASP_Nest&issues=AZrF2NveBsDskKfEapS9&open=AZrF2NveBsDskKfEapS9&pullRequest=2710

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run CI Denendencies Scan
  • GitHub Check: Run Code Scan
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (1)
backend/data/dump.sh (1)

38-43: LGTM.

The conditional check before applying UPDATE statements is sound—it gracefully handles cases where no email columns exist. Combined with the error handling (set -e, trap cleanup), the script is robust and properly addresses the prior feedback on error handling and cleanup safeguards.

@ahmedxgouda ahmedxgouda requested a review from arkid15r November 27, 2025 15:32
```python
def decorator(view_func):
    @wraps(view_func)
    def _wrapper(request, *args, **kwargs):
        if settings.IS_E2E_ENVIRONMENT:
```
Collaborator

Yes, let's add cache service for e2e instead. The closer to production architecture the better -- for both local and CI/CD cases.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (7)
.github/workflows/setup-e2e-environment/action.yaml (2)

51-56: Use a known health endpoint for backend readiness check instead of /a.

The readiness check probes /a, which is a non-standard, arbitrary path. If this endpoint doesn't exist or is removed, the check will silently pass but the API may not be fully ready. Use a dedicated health or status endpoint (e.g., /api/health, /healthz, or a known public endpoint) to ensure the backend is genuinely operational.

```diff
-          until wget --spider http://localhost:9000/a; do
+          until wget --spider http://localhost:9000/api/health; do
```

Verify that the chosen endpoint exists, is publicly accessible (no auth required), and reflects backend readiness.


60-64: Add error handling for the data load step.

If the psql command fails (e.g., due to connection issues, SQL errors, or permission problems), the workflow silently continues. This can mask data load failures and cause misleading E2E test results.

       - name: Load Postgres data
         env:
           PGPASSWORD: nest_user_e2e_password
         run: |
           gunzip -c backend/data/nest.sql.gz | psql -h localhost -U nest_user_e2e -d nest_db_e2e
+          if [ $? -ne 0 ]; then
+            echo "Failed to load Postgres data"
+            exit 1
+          fi
         shell: bash

Or, use set -e to fail fast on any command error.

backend/pyproject.toml (1)

97-100: Scoped Ruff ignore looks fine; consider narrowing if more subprocess usage is added later.

Limiting S603/S607 ignores to just dump_data.py is a reasonable compromise given the trusted, ops-only nature of that command. If this file grows more subprocess logic over time, consider moving to line-level ignores (# noqa: S603,S607) around the specific psql/pg_dump invocations so future additions in the same file still benefit from those checks.
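
Line-level suppressions would look roughly like this — a sketch only, with `echo` standing in for the real `pg_dump`/`psql` binaries so the snippet runs anywhere:

```python
import subprocess


def tool_version() -> str:
    """Line-scoped Ruff suppression: only this call opts out of S603/S607,
    so new subprocess code elsewhere in the same file is still checked.
    ``echo`` is a stand-in for the real pg_dump/psql invocation."""
    result = subprocess.run(  # noqa: S603,S607
        ["echo", "pg_dump (PostgreSQL) 16.4"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

This keeps the suppression next to the trusted invocation instead of blanket-exempting the whole file in `pyproject.toml`.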

backend/Makefile (1)

50-76: Make SQL‑based load robust against missing/corrupt dump files.

Switching load-data / load-data-e2e to pipe nest.sql.gz directly into psql is a good fit for the new dump command and avoids the previous schema‑drop behavior. To harden this a bit, consider explicitly checking that the dump file exists (and is non‑empty) before piping it, so a missing/corrupt dump can fail the make target instead of just emitting a gunzip warning:

load-data:
	@echo "Loading Nest data"
	@test -s backend/data/nest.sql.gz || { echo "Missing or empty backend/data/nest.sql.gz"; exit 1; }
	@gunzip -c backend/data/nest.sql.gz | docker exec -i nest-db psql -U nest_user_dev -d nest_db_dev

load-data-e2e:
	@echo "Loading Nest e2e data"
	@test -s backend/data/nest.sql.gz || { echo "Missing or empty backend/data/nest.sql.gz"; exit 1; }
	@gunzip -c backend/data/nest.sql.gz | docker exec -i e2e-nest-db psql -U nest_user_e2e -d nest_db_e2e

This keeps behavior the same on the happy path while surfacing issues earlier when the dump file is unavailable.

backend/apps/common/management/commands/dump_data.py (3)

20-30: CLI --table arguments are additive to the defaults (not replacing them).

Because tables uses action="append" with a non‑empty default list, passing --table will append to the default patterns instead of replacing them:

  • No `--table` → `["public.owasp_*", "public.github_*", "public.slack_*"]`
  • `--table public.foo` → `["public.owasp_*", "public.github_*", "public.slack_*", "public.foo"]`

If that additive behavior is intentional, it might be worth clarifying it in the help text (e.g., “may be specified multiple times to add tables; defaults are always included”). If you intended CLI tables to override the defaults, you can instead:

  • Use default=None in add_arguments, and

  • In handle, do something like:

    tables = options["tables"] or ["public.owasp_*", "public.github_*", "public.slack_*"]

Also applies to: 40-41
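
The additive behavior can be reproduced with `argparse` directly (Django's `add_arguments` delegates to it). The table patterns below are the defaults quoted above; `parse_tables_override` sketches the suggested `default=None` fix:

```python
import argparse

DEFAULT_TABLES = ["public.owasp_*", "public.github_*", "public.slack_*"]


def parse_tables(argv: list[str]) -> list[str]:
    """Current behavior: action="append" with a non-empty default is additive."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--table", action="append", dest="tables", default=list(DEFAULT_TABLES))
    return parser.parse_args(argv).tables


def parse_tables_override(argv: list[str]) -> list[str]:
    """Suggested fix: default=None, fall back to the defaults only when unset."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--table", action="append", dest="tables", default=None)
    return parser.parse_args(argv).tables or list(DEFAULT_TABLES)
```

With the first variant, `--table public.foo` yields the three defaults plus `public.foo`; with the second, it yields `["public.foo"]` alone.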


44-60: Temp database name can cause collisions; consider making it unique per run.

temp_db = f"temp_{name}" combined with an unconditional DROP DATABASE IF EXISTS {temp_db} in finally means:

  • Two concurrent runs against the same source DB will race on the same temp DB name; one run can drop the other’s temp DB in the middle of its dump.
  • If the main DB name ever contains characters invalid in unquoted identifiers, CREATE DATABASE {temp_db} TEMPLATE {name}; will fail.

For this ops‑only command it’s not a blocker, but you can make it more robust by incorporating a unique suffix (PID, timestamp, or both) and quoting via psql:

import os
import time

temp_db = f"temp_{name}_{os.getpid()}_{int(time.time())}"

and then generating safe SQL with identifier quoting if you want to fully guard against unusual DB names (e.g. `SELECT format('CREATE DATABASE %I TEMPLATE %I', ...)` executed via psql's `\gexec`; note that `CREATE DATABASE` cannot run inside a `DO` block or a transaction).

Also applies to: 88-104


134-158: Subprocess wrapper is consistent; consider minor enhancements if you extend it later.

The _psql helper centralizes all psql calls, uses check=True, and passes credentials via environment (PGPASSWORD), which is a solid, simple pattern for this internal command. If this grows, a couple of optional tweaks you might consider:

  • Add text=True and pass strings instead of bytes to avoid manual .encode() for stdin cases.
  • Capture stderr on failures to include more context in the CommandError message (e.g., via run(..., capture_output=True) and surfacing e.stderr).

Not required for this PR, but these can make debugging dump failures easier when operating this command.
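
A minimal sketch of both tweaks combined, assuming a helper shaped like the command's `_psql` (names and defaults here are illustrative, not the actual implementation):

```python
import os
import subprocess


class CommandError(Exception):
    """Stand-in for django.core.management.CommandError in this sketch."""


def run_psql(
    sql: str,
    *,
    host: str = "localhost",
    user: str = "nest_user",
    database: str = "nest_db",
    password: str = "",
) -> subprocess.CompletedProcess:
    """text=True lets callers pass plain strings (no manual .encode()),
    and capture_output=True preserves psql's stderr for error messages."""
    env = {**os.environ, "PGPASSWORD": password}
    try:
        return subprocess.run(
            ["psql", "-h", host, "-U", user, "-d", database],
            input=sql,
            text=True,
            capture_output=True,
            check=True,
            env=env,
        )
    except subprocess.CalledProcessError as exc:
        # Surface psql's own diagnostics instead of a bare exit code.
        raise CommandError(f"psql failed: {exc.stderr.strip()}") from exc
```

A failed dump would then report, e.g., `psql failed: ERROR: relation "x" does not exist` rather than just a return code.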

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f747481 and ddae26b.

⛔ Files ignored due to path filters (1)
  • backend/data/nest.sql.gz is excluded by !**/*.gz
📒 Files selected for processing (4)
  • .github/workflows/setup-e2e-environment/action.yaml (2 hunks)
  • backend/Makefile (2 hunks)
  • backend/apps/common/management/commands/dump_data.py (1 hunks)
  • backend/pyproject.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
📚 Learning: 2025-10-26T12:50:50.512Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.

Applied to files:

  • backend/Makefile
  • .github/workflows/setup-e2e-environment/action.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
🔇 Additional comments (2)
.github/workflows/setup-e2e-environment/action.yaml (1)

44-45: I'll verify this by examining the actual workflow file and the database initialization sequence.
backend/apps/common/management/commands/dump_data.py (1)

106-132: Email masking strategy is broad but appropriate for e2e; just be aware of side‑effects.

The DO block walks all non‑system schemas and unconditionally runs:

UPDATE <schema>.<table> SET email = '';

for every column named email. This is a simple and effective way to scrub PII for e2e, but keep in mind:

  • Any “email” column used for non‑PII purposes will also be blanked.
  • Very large tables will get full‑table updates, which may be slow on big dumps.

Given this is an explicit data‑masking step for generating test/e2e dumps, this trade‑off is generally fine; just something to keep in mind if performance or more granular masking ever becomes a concern.
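
The discovery-plus-update flow can be sketched outside PL/pgSQL too; a hypothetical Python fragment that finds `email` columns and emits the equivalent statements (schema/table names in the test are illustrative, not from the real DB):

```python
# Roughly the query the DO block relies on to find candidate columns.
DISCOVER_EMAIL_COLUMNS_SQL = """
SELECT table_schema, table_name
FROM information_schema.columns
WHERE column_name = 'email'
  AND table_schema NOT IN ('pg_catalog', 'information_schema');
"""


def masking_statements(email_columns: list[tuple[str, str]]) -> list[str]:
    """Build one full-table UPDATE per discovered (schema, table) pair,
    mirroring what the DO block executes for every ``email`` column."""
    return [
        f"UPDATE {schema}.{table} SET email = '';"
        for schema, table in email_columns
    ]
```

Separating discovery from execution like this would also make it easy to add per-table exclusions later, should more granular masking become necessary.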

def decorator(view_func):
@wraps(view_func)
def _wrapper(request, *args, **kwargs):
if settings.IS_E2E_ENVIRONMENT:
Collaborator

Let's configure the cache backend for e2e via Django settings (locmem for now)

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/apps/api/rest/v0/__init__.py (1)

60-70: E2E env behavior looks correct; consider DRYing up shared config.

The E2E branch cleanly mirrors the local environment (no auth, no throttling, same SITE_URL), which matches the intent for a test-only backend and keeps staging/prod untouched. To avoid duplication between the local and E2E branches, you could extract a small helper:

@@
-api_settings = {
-    "auth": ApiKey(),  # The `api_key` param name is based on the ApiKey class name.
-    "default_router": RouterPaginated(),
-    "description": "Open Worldwide Application Security Project API",
-    "docs": Swagger(settings={"persistAuthorization": True}),
-    "throttle": [AuthRateThrottle("10/s")],
-    "title": "OWASP Nest",
-    "version": "0.3.1",
-}
-
-api_settings_customization = {}
-if settings.IS_LOCAL_ENVIRONMENT:
-    api_settings_customization = {
-        "auth": None,
-        "servers": [
-            {
-                "description": "Local",
-                "url": settings.SITE_URL,
-            }
-        ],
-        "throttle": [],
-    }
-elif settings.IS_E2E_ENVIRONMENT:
-    api_settings_customization = {
-        "auth": None,
-        "servers": [
-            {
-                "description": "E2E",
-                "url": settings.SITE_URL,
-            }
-        ],
-        "throttle": [],
-    }
+api_settings = {
+    "auth": ApiKey(),  # The `api_key` param name is based on the ApiKey class name.
+    "default_router": RouterPaginated(),
+    "description": "Open Worldwide Application Security Project API",
+    "docs": Swagger(settings={"persistAuthorization": True}),
+    "throttle": [AuthRateThrottle("10/s")],
+    "title": "OWASP Nest",
+    "version": "0.3.1",
+}
+
+
+def _no_auth_env_server(description: str) -> dict:
+    return {
+        "auth": None,
+        "servers": [
+            {
+                "description": description,
+                "url": settings.SITE_URL,
+            }
+        ],
+        "throttle": [],
+    }
+
+
+api_settings_customization = {}
+if settings.IS_LOCAL_ENVIRONMENT:
+    api_settings_customization = _no_auth_env_server("Local")
+elif settings.IS_E2E_ENVIRONMENT:
+    api_settings_customization = _no_auth_env_server("E2E")
@@
 elif settings.IS_STAGING_ENVIRONMENT:
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95b1391 and 4d2e344.

📒 Files selected for processing (2)
  • .github/workflows/setup-e2e-environment/action.yaml (2 hunks)
  • backend/apps/api/rest/v0/__init__.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-26T12:50:50.512Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.

Applied to files:

  • .github/workflows/setup-e2e-environment/action.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (4)
.github/workflows/setup-e2e-environment/action.yaml (4)

50-58: Verify the health check endpoint /a exists and is accessible.

The workflow checks for backend readiness by hitting http://localhost:9000/a, but this endpoint path appears arbitrary. If this endpoint doesn't exist, returns an error, or requires authentication, the workflow could timeout or fail even when the backend is functionally ready. Verify that /a is intentional or consider using a standard health check endpoint (e.g., /health, /, or /api/).


60-64: Verify that loading data after backend startup is correct.

Data loading has been moved to occur after backend startup and migrations (line 60), whereas previously it ran before backend startup. Since migrations execute before gunicorn starts (line 44), the database schema is initialized first, but confirm this sequence doesn't break expected behavior or require data to be present during any initialization steps.


35-47: Security: Verify --network host is necessary.

Using --network host exposes the container to all host network interfaces, which is a permissive configuration. Confirm this is required for your E2E testing or consider using an explicit Docker network bridge for better isolation.


8-16: Timeout pattern is well-structured.

The timeout 5m wrapper around the pg_isready retry loop is a solid defensive measure to prevent indefinite hangs. Similarly applied to the backend readiness check (lines 51–56).

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
backend/tests/apps/common/management/commands/dump_data_test.py (2)

53-62: Make the SELECT assertion less brittle to whitespace/formatting

The multi‑line string comparison for the information_schema.columns SELECT is quite sensitive to indentation and line breaks, so minor formatting changes in the command could break the test even if the query is logically the same.

You could relax this to look for key fragments instead:

-        executed_sql = [str(c.args[0]) for c in mock_cursor.execute.call_args_list]
-        assert "UPDATE public.users SET email = '';" in executed_sql
-        assert (
-            """
-        SELECT table_name
-        FROM information_schema.columns
-        WHERE table_schema = 'public' AND column_name = 'email';
-        """
-            in executed_sql
-        )
+        executed_sql = [str(c.args[0]) for c in mock_cursor.execute.call_args_list]
+        assert "UPDATE public.users SET email = '';" in executed_sql
+        assert any(
+            "FROM information_schema.columns" in sql
+            and "column_name = 'email'" in sql
+            for sql in executed_sql
+        )

This still asserts the right query is executed but is more resilient to harmless formatting changes.


64-65: Drop leftover debug print from the test

The print(mock_run.call_args[0][0]) will add noise to test output without adding assertions.

You can safely remove it:

-        print(mock_run.call_args[0][0])
-        assert [
+        assert [
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ea8fc7 and d3f9aef.

📒 Files selected for processing (3)
  • backend/apps/common/management/commands/load_data.py (0 hunks)
  • backend/tests/apps/common/management/commands/dump_data_test.py (1 hunks)
  • backend/tests/apps/common/management/commands/load_data_test.py (0 hunks)
💤 Files with no reviewable changes (2)
  • backend/tests/apps/common/management/commands/load_data_test.py
  • backend/apps/common/management/commands/load_data.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run CI Denendencies Scan
  • GitHub Check: CodeQL (javascript-typescript)
  • GitHub Check: CodeQL (python)
🔇 Additional comments (1)
backend/tests/apps/common/management/commands/dump_data_test.py (1)

1-38: Solid end‑to‑end coverage of the dump_data command behavior

The test wiring (override_settings, psycopg/connect/sql/Path/subprocess.run patches, and call_command) gives good, realistic coverage of the management command without touching a real DB or filesystem. The assertions for template DB creation, masking SQL, pg_dump args, and directory creation all look coherent and purposeful.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
backend/tests/apps/common/management/commands/dump_data_test.py (2)

18-18: Consider inheriting from a Django test base class or using pytest consistently.

The test class doesn't inherit from any base class. If you're using pytest, this is fine, but Django's @override_settings decorator is typically paired with Django's test classes (e.g., TestCase, SimpleTestCase). Consider clarifying the testing approach:

  • If using pytest: Use pytest fixtures instead of Django decorators for consistency.
  • If using Django: Inherit from django.test.SimpleTestCase or django.test.TestCase.

65-85: Consider conventional assertion order.

The assertion uses assert expected == actual, which is backwards from the pytest/Python convention of assert actual == expected. While functionally equivalent, the conventional order provides better failure messages.

-        assert [
+        actual_args = mock_run.call_args[0][0]
+        expected_args = [
             "pg_dump",
             "-h",
             "db-host",
             ...
             str(mock_resolve),
-        ] == mock_run.call_args[0][0]
+        ]
+        assert actual_args == expected_args
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3f9aef and a1570cf.

📒 Files selected for processing (1)
  • backend/tests/apps/common/management/commands/dump_data_test.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
🔇 Additional comments (3)
backend/tests/apps/common/management/commands/dump_data_test.py (3)

1-15: LGTM!

The imports and test database configuration are appropriate. The NOSONAR comments correctly suppress false positives for hardcoded test passwords.


24-38: LGTM!

The mock setup is comprehensive and correctly configured to simulate the PostgreSQL connection, cursor operations, and file path resolution.


86-89: LGTM!

The cleanup verification correctly ensures that the temporary database is dropped and path resolution is called as expected.

@ahmedxgouda ahmedxgouda requested a review from arkid15r December 2, 2025 19:32
@sonarqubecloud

sonarqubecloud bot commented Dec 2, 2025


Labels

backend, backend-tests, ci, docs (Improvements or additions to documentation), makefile


Development

Successfully merging this pull request may close these issues.

Fix the running e2e backend instance

2 participants