Skip to content

Conversation

@VirtualAgentics
Copy link
Owner

@VirtualAgentics VirtualAgentics commented Oct 19, 2025

Summary

This PR removes dead code and fixes all linting issues across the codebase.

Changes Made

Dead Code Removal

  • Removed unused score function from src/contextforge_memory/main.py (lines 90-92)
  • The function was dead code as the inline distance logic is used instead

Flake8 Issues Fixed

  • Fixed line length issues in clients/python/contextforge_client.py
  • Properly broke long function calls across multiple lines

Markdownlint Issues Fixed

  • CODE_OF_CONDUCT.md: Fixed line lengths and heading spacing (MD022)
  • CONTRIBUTING.md: Fixed line lengths and list/code block spacing (MD032, MD031)
  • README.md: Fixed line length issue
  • SECURITY.md: Fixed multiple line length issues

Verification

All linting tools now pass without errors:

  • ✅ Flake8: All checks passed
  • ✅ Markdownlint: All checks passed
  • ✅ Ruff: All checks passed
  • ✅ Black: All files properly formatted

Files Changed

  • src/contextforge_memory/main.py
  • clients/python/contextforge_client.py
  • CODE_OF_CONDUCT.md
  • CONTRIBUTING.md
  • README.md
  • SECURITY.md
  • .github/workflows/ci.yml

This improves code quality and ensures consistent formatting across the entire codebase.

Summary by CodeRabbit

  • New Features

    • Added an AI review/configuration for automated repository reviews.
  • Security / CI

    • Enhanced CI security audit with dedicated scanning, result uploads as artifacts, and failure notifications.
    • Excluded security scan artifacts from source tracking and removed stored scan output.
  • Performance

    • Improved retrieval speed by precomputing item embeddings for faster scoring.
  • Chores

    • Pinned/added dependencies (FastAPI, Starlette, NumPy).
  • Documentation

    • Reformatted README, CONTRIBUTING, SECURITY, and Code of Conduct for clarity.

- Add .coderabbit.yaml with security-focused configuration
- Configure path-specific instructions for different code areas
- Add auto-approval for small, safe changes
- Include ContextForge Memory-specific review prompts
- Add comprehensive setup guide for CodeRabbit AI
- Optimize for security, performance, and code quality reviews
Security fixes:
- Fix pip version constraint from >=25.0 to >=25.2 (GHSA-4xh5-x5gv-qwph)
- Add explicit starlette==0.48.0 to requirements.txt
- Pin fastapi to exact version 0.119.0 for reproducibility

CI workflow improvements:
- Remove duplicate dependency installation (safety/pip-audit now in requirements.txt)
- Remove ad-hoc dependency upgrades from CI
- Replace deprecated 'safety check' with 'safety scan'
- Use continue-on-error instead of shell || true for cleaner JSON output

Performance optimizations:
- Remove unused _score computation in search function
- Fix redundant query embedding computation in sort key
- Reuse precomputed query_vec instead of recomputing for each item

Code quality:
- Add safety-results.json and audit-results.json to .gitignore
- Remove committed safety-results.json artifact file
- Ensure all dependencies are properly pinned for reproducibility

All CodeRabbit AI suggestions have been implemented.
- Remove .coderabbit.yaml (system-specific configuration)
- Remove CODERABBIT_SETUP.md (setup documentation)
- Keep CodeRabbit AI integration but use default settings
- Configuration should be managed at user/organization level, not in repo
- Remove unused score function from main.py (dead code)
- Fix flake8 line length issues in Python client
- Fix markdownlint issues across all markdown files:
  - Break long lines to comply with 80-character limit
  - Add proper spacing around headings and lists
  - Fix code block spacing issues
- All linting tools now pass without errors
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 19, 2025

Walkthrough

Adds a CodeRabbit AI review config; updates CI to run and upload security audits with a failure-reporting step; pins/introduces dependencies; removes a stored safety-results.json artifact; applies documentation and client formatting; and refactors the memory service to precompute per-item embeddings and sort by distance for queries.

Changes

Cohort / File(s) Change Summary
CodeRabbit config
​.coderabbit.yaml
New AI review configuration with tone and detailed, path-specific review rules and file filters.
CI / Security audit
.github/workflows/ci.yml, .gitignore
Add security-audit step (runs safety scan --json and pip-audit --format=json with continue-on-error), upload security-audit-results artifact containing safety-results.json and audit-results.json, add failure-notification step that prints grouped outputs; update .gitignore to exclude audit artifacts.
Documentation formatting
CODE_OF_CONDUCT.md, CONTRIBUTING.md, README.md, SECURITY.md
Whitespace/line-wrapping and structural formatting changes only; no semantic edits.
Dependencies
requirements.txt
Pin fastapi==0.119.0; add starlette==0.48.0 and numpy==2.1.1; removed inline CI upgrade comments.
Client formatting
clients/python/contextforge_client.py
Reflowed multi-line function signatures and HTTP call formatting; no behavioral changes.
Embedding pipeline refactor
src/contextforge_memory/main.py
Replace per-item on-the-fly embedding and scoring with precomputed (MemoryItem, embedding) pairs; compute query embedding once, compute distances to stored embeddings, sort and return top_k MemoryItems.
Removed artifact
safety-results.json
Deleted generated safety scan JSON results file and its contents.

Sequence Diagram(s)

sequenceDiagram
    participant Scanner as File Scanner
    participant Encoder as Embedding Engine
    participant Store as items_with_embeddings
    participant Querier as Query Handler
    participant Sorter as Sort & Return

    rect rgb(248,250,252)
    Note over Scanner,Sorter: Old flow — per-item embedding & scoring
    Scanner->>Encoder: compute embedding for item N
    Encoder->>Querier: score item N vs query embedding
    Querier->>Sorter: accumulate scored item N
    end

    rect rgb(237,247,237)
    Note over Scanner,Sorter: New flow — precompute embeddings once
    Scanner->>Encoder: compute embeddings for all items
    Encoder->>Store: store (MemoryItem, embedding) pairs
    Querier->>Encoder: compute query embedding
    Store->>Sorter: compute distances between query and stored embeddings
    Sorter->>Sorter: sort distances & extract top_k MemoryItems
    end
Loading
sequenceDiagram
    participant CI as GitHub Actions
    participant Scanner as security scanners
    participant Artifacts as Artifact Store
    participant Notifier as Failure Notifier

    CI->>Scanner: run `safety scan --json > safety-results.json` and `pip-audit --format=json --output=audit-results.json` (continue-on-error)
    Scanner->>Artifacts: upload `safety-results.json` & `audit-results.json` as security-audit-results
    alt scanner failure
        CI->>Notifier: run failure step (on: failure())
        Notifier-->>CI: read & print grouped audit outputs if present (handles missing files)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐇 Embeddings precomputed, tidy and swift,
Audits packaged up, reported if adrift,
Docs smoothed lines, deps pinned tight,
CI whispers warnings in the night,
CodeRabbit nods — checks pass in light.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "Remove dead code and fix linting issues" is partially related to the changeset. It accurately captures some legitimate aspects of the PR—specifically the removal of an unused score function in main.py and the multiple formatting and linting fixes across markdown and Python files (Flake8, Markdownlint corrections). However, the title does not represent the full scope of changes, as the PR includes significant additional work: a substantial refactoring of the embedding pipeline in src/contextforge_memory/main.py (shifting from in-loop per-item embedding to precomputed embeddings), CI workflow enhancements with new security audit functionality, addition of a new .coderabbit.yaml configuration file, and version pinning changes in requirements.txt. These changes go beyond dead code removal and linting fixes, representing optimization and feature additions that are not conveyed by the title.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch add-coderabbit-config

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/contextforge_memory/main.py (1)

90-114: Consider scalability for large datasets.

The current implementation loads all matching items into memory before sorting. For repositories with many items per namespace/project, this could cause memory pressure.

For production use with large datasets, consider:

  1. Adding pagination support to limit results
  2. Implementing a heap-based top-k selection to avoid sorting all items
  3. Adding observability metrics to track item counts per namespace
.coderabbit.yaml (1)

18-35: Ignore patterns are comprehensive but verify necessity.

The ignore section defines reasonable patterns for Python artifacts, build outputs, and cache directories. However, verify:

  • Are safety-results.json and audit-results.json (lines 29-30) intentionally excluded from review, given the new CI artifacts? These should likely be reviewed or explicitly documented as generated artifacts.
  • The data/ and logs/ directories (lines 26-27) are broad; consider if version control should exclude these at .gitignore level instead.

Consider documenting why security result JSON files are ignored, as they're now being generated and uploaded as CI artifacts (per .github/workflows/ci.yml changes).

.github/workflows/ci.yml (1)

43-60: Security audit notification step has redundant condition and could improve error handling.

Line 44: if: failure() && steps.security-audit.outcome == 'failure'

Issues:

  • The failure() context condition is redundant; if the step outcome is 'failure', the failure() context is already true.
  • Simpler condition: if: steps.security-audit.outcome == 'failure'

Additionally:

  • The graceful file-not-found handling (lines 48-51, 55-58) is good, but if results are missing, it may indicate the tools didn't run properly. Consider logging a warning for this case.

Simplify the condition and add explicit logging for missing audit files:

- name: Security audit failure notification
  if: steps.security-audit.outcome == 'failure'
  run: |
    echo "::warning::Security audit failed! Check the security-audit-results artifact for details."
    echo "::group::Safety scan results"
    if [ -f safety-results.json ]; then
      cat safety-results.json
    else
      echo "::warning::safety-results.json not found - safety scan may not have executed"
    fi
    echo "::endgroup::"
    echo "::group::Pip audit results"
    if [ -f audit-results.json ]; then
      cat audit-results.json
    else
      echo "::warning::audit-results.json not found - pip-audit may not have executed"
    fi
    echo "::endgroup::"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c85c74e and 16c4e90.

📒 Files selected for processing (11)
  • .coderabbit.yaml (1 hunks)
  • .github/workflows/ci.yml (2 hunks)
  • .gitignore (1 hunks)
  • CODE_OF_CONDUCT.md (2 hunks)
  • CONTRIBUTING.md (4 hunks)
  • README.md (1 hunks)
  • SECURITY.md (3 hunks)
  • clients/python/contextforge_client.py (2 hunks)
  • requirements.txt (2 hunks)
  • safety-results.json (0 hunks)
  • src/contextforge_memory/main.py (2 hunks)
💤 Files with no reviewable changes (1)
  • safety-results.json
🧰 Additional context used
🧬 Code graph analysis (2)
src/contextforge_memory/main.py (1)
clients/typescript/contextforgeClient.ts (1)
  • MemoryItem (1-8)
clients/python/contextforge_client.py (2)
src/contextforge_memory/main.py (1)
  • search (83-114)
clients/typescript/contextforgeClient.ts (1)
  • search (33-41)
🪛 LanguageTool
CODE_OF_CONDUCT.md

[style] ~30-~30: Try using a synonym here to strengthen your wording.
Context: ...ind * Trolling, insulting or derogatory comments, and personal or political attacks * Pu...

(COMMENT_REMARK)

README.md

[uncategorized] ~36-~36: The official name of this software platform is spelled with a capital “H”.
Context: ...sic import smoke in GitHub Actions (see .github/workflows/ci.yml).

(GITHUB)

🔇 Additional comments (17)
.gitignore (1)

18-20: LGTM!

The additions properly exclude security scan artifacts generated by the CI workflow, maintaining a clean working directory.

README.md (1)

1-36: LGTM! Documentation structure improved.

The formatting changes and new sections enhance readability and provide better structure for users.

CODE_OF_CONDUCT.md (1)

1-129: LGTM!

The formatting adjustments improve readability without changing the content or meaning.

CONTRIBUTING.md (1)

1-188: LGTM!

The formatting adjustments improve consistency and comply with Markdownlint rules without altering the content.

clients/python/contextforge_client.py (3)

17-22: LGTM!

The line breaks improve compliance with line-length limits without changing functionality.


24-26: LGTM!

The multi-line method signature improves readability and meets linting standards.


37-42: LGTM!

The line breaks improve compliance with line-length limits without changing functionality.

SECURITY.md (1)

1-57: LGTM!

The formatting adjustments improve readability without changing the security policy content.

src/contextforge_memory/main.py (3)

90-103: Efficient precomputation of embeddings.

The refactor correctly precomputes embeddings during the filtering pass, eliminating redundant computations during sorting.


105-110: Correct L2 distance sorting.

The sort key correctly uses negative squared L2 distance to rank items by similarity (smallest distance first).


113-113: Verify the max(1, req.top_k) behavior.

The use of max(1, req.top_k) forces at least one result even when top_k=0. This might be unexpected behavior—users might expect zero results when explicitly requesting top_k=0.

Consider whether the minimum-one-result behavior is intentional:

-    results = [item for item, _ in items_with_embeddings[: max(1, req.top_k)]]
+    results = [item for item, _ in items_with_embeddings[: max(0, req.top_k)]]

Or add validation to enforce top_k >= 1 in the SearchRequest model:

class SearchRequest(BaseModel):
    namespace: str
    project_id: str
    query: str
    top_k: int = Field(default=5, ge=1)  # Enforce minimum of 1
requirements.txt (3)

1-1: FastAPI version is current and secure—no action needed.

Verification confirms that fastapi==0.119.0 is the latest version on PyPI and is not affected by known security vulnerabilities. The two disclosed advisories (CSRF and Content-Type Header ReDoS) both target older versions and are already patched in 0.119.0.


5-5: NumPy 2.1.1 is secure; consider updating for currency.

The pinned version numpy==2.1.1 is not affected by the listed security advisories (which target older versions such as 1.x releases). However, the latest stable version is 2.3.4, making 2.1.1 two minor versions behind. If compatibility with your codebase and dependencies allows, updating to the latest version would improve currency. Otherwise, the current version is secure for use.


10-10: Starlette 0.48.0 is current, secure, and compatible with FastAPI 0.119.0.

Starlette 0.48.0 is the latest version on PyPI. All known CVEs for Starlette are patched in 0.48.0, including advisories for multipart DoS, form field vulnerabilities, path traversal, and header ReDoS attacks. FastAPI 0.119.0 supports Starlette >=0.40.0,<0.49.0, confirming full compatibility with the pinned version.

.coderabbit.yaml (1)

90-103: Auto-approval conditions may be overly permissive; clarify intent.

The auto_approve section enables approval when small_change and other conditions are met. Confirm:

  • Does "small_change" have a defined threshold (lines of code, files changed)? If not, any PR could potentially auto-approve.
  • Given the new security audit step, consider whether security scan results (safety/pip-audit) should be an explicit approval condition rather than relying on ci_passes.

Clarify the definition of small_change and whether it should explicitly include "security_audit_passes" as a condition.

.github/workflows/ci.yml (2)

29-34: Security audit step design allows workflow to pass despite vulnerabilities.

The combination of continue-on-error: true (line 31) and security scanning commands (lines 33-34) means:

  • Security vulnerabilities detected by safety or pip-audit will not fail the workflow
  • Results are collected in JSON artifacts for visibility but do not block merging

Confirm this is intentional security posture. If vulnerabilities should block merges, either:

  1. Remove continue-on-error: true and handle failures explicitly, or
  2. Add a separate step that parses JSON results and fails if issues exceed a threshold

Clarify whether the workflow should fail on security findings or only report them.


36-42: Artifact upload configuration is solid.

The if: always() and multi-path artifact collection ensures security scan results are captured regardless of outcome. This supports the audit trail and debugging workflow. Good practice.

- Move tone from review.tone to top-level tone_instructions
- Move path_instructions under reviews section
- Replace ignore.paths with reviews.path_filters using ! prefix
- Remove unsupported custom sections (behavior, security, performance, integrations, custom_prompts)
- Restructure to match official CodeRabbit schema requirements
- Preserve all review guidance content in properly formatted structure
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
.coderabbit.yaml (1)

31-39: Consider enabling additional review tools for broader coverage (optional improvement).

The path_instructions are focused and practical. As a nice-to-have, you might evaluate enabling additional reviews.tools (e.g., gitleaks for secrets, semgrep or checkov for infrastructure code) if not already configured in CodeRabbit's UI. This would complement the security-focused tone and align with the PR's emphasis on linting and audit.

Also applies to: 42-49, 52-59, 62-69, 71-78, 81-96, 97-109

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 16c4e90 and 0bb6ac3.

📒 Files selected for processing (1)
  • .coderabbit.yaml (1 hunks)
🔇 Additional comments (2)
.coderabbit.yaml (2)

1-109: Configuration structure now aligns with the official CodeRabbit schema—excellent fix.

The file correctly implements the official YAML structure: top-level tone_instructions, nested reviews section with path_filters and path_instructions arrays, each item properly typed. This addresses the schema concerns flagged in the previous review.

The path filters comprehensively exclude build artifacts, caches, and security scan outputs (line 20–21), which nicely complements the CI/CD security audits mentioned in the PR. Path-specific instructions are well-targeted: API design for src/**/*.py, client patterns for clients/*, workflow security for .github/**/*, and documentation quality.


5-5: Tone instructions are concise and aligned with project priorities.

At 135 characters, the message stays well under the 250-character limit and clearly signals focus on security, performance, and best practices—consistent with the PR's security audit and linting theme.

- Combine identical instructions for *.md and docs/**/* into single entry
- Use **/*.md pattern to cover all markdown files
- Remove duplication while preserving all review instructions
- Recommend enabling gitleaks, semgrep, checkov in CodeRabbit UI
- Add IaC/container/K8s review instructions under path_instructions
- Keep path fields as strings to satisfy linter
…yaml

- Replace 8 identical path instruction blocks with single canonical rule
- Use brace-expanded glob to cover all original patterns:
  - **/*.tf, **/Dockerfile, **/Dockerfile.*
  - **/docker-compose.yml, **/docker-compose.*.yml
  - helm/**/*, k8s/**/*, kubernetes/**/*
- Preserve original instruction text for Infrastructure-as-Code security
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
.coderabbit.yaml (1)

1-5: Consider adding a trailing newline at end of file.

Most style guides and linters (Flake8, Prettier, etc.) expect files to end with a newline. If not already present, add one after line 113 to follow Unix convention and avoid potential CI/linting warnings.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b0dfa6 and ae86695.

📒 Files selected for processing (1)
  • .coderabbit.yaml (1 hunks)
🔇 Additional comments (1)
.coderabbit.yaml (1)

1-113: Configuration now properly adheres to CodeRabbit's official schema.

The file has been successfully restructured to align with the official specification:

  • tone_instructions is now a top-level string property (within 250-char limit)
  • path_filters and path_instructions are properly nested under reviews
  • No unsupported custom sections remain

The configuration is comprehensive and well-organized. Path filters appropriately exclude build artifacts, caches, logs, and generated files. Path-specific instructions are clear, actionable, and tailored to each file type—covering API code, clients, CI/CD workflows, infrastructure-as-code, documentation, and configuration files.

@VirtualAgentics VirtualAgentics merged commit 7f246a3 into main Oct 20, 2025
3 checks passed
@VirtualAgentics VirtualAgentics deleted the add-coderabbit-config branch October 20, 2025 08:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants