
Data Pipeline Enhancements #13

Open
danelkay93 wants to merge 3 commits into master from feature/llm-enrichment

Conversation

@danelkay93
Collaborator

This commit includes the initial work on enhancing the data pipeline. Due to environmental constraints (timeouts), I was unable to make significant progress.

- I created `scripts/load_data_to_sqlite.py` to load processed JSON data into a SQLite database.
- I attempted to run the scraper and processing scripts, but they timed out.
- The data pipeline work is incomplete and will require further investigation in a more stable environment.
Copilot AI review requested due to automatic review settings July 13, 2025 03:00
@coderabbitai
Contributor

coderabbitai bot commented Jul 13, 2025

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Introduced a FastAPI backend with REST API endpoints for retrieving and searching game card rulings.
    • Added persistent storage using SQLite for rulings data.
    • Released an enriched JSON dataset containing detailed clarifications for game cards.
    • Provided a script to load ruling data into the database.
    • Implemented data models for rulings and provenance information.
  • Tests

    • Added a test to verify the rulings API endpoint returns data as expected.
  • Chores

    • Added Pyright type checking to pre-commit hooks and development dependencies.


Walkthrough

The update replaces the previous Flet-based GUI application with a new FastAPI backend that serves game card rulings from a SQLite database via REST endpoints. Supporting scripts for AI enrichment and data loading are updated or added, and a basic test verifies the new API. A new enriched rulings dataset is also introduced.

Changes

| File(s) | Change Summary |
| --- | --- |
| assets/processed_rulings_v3_ai_enriched.json | Added a new JSON file with AI-enriched rulings data, including provenance, tags, and clarification text. |
| scripts/enrich_rulings_ai.py | Replaced placeholder AI logic with real OpenAI API calls; added helper and improved error handling/logging. |
| scripts/load_data_to_sqlite.py | New script to validate and load JSON rulings data into SQLite using Pydantic and SQLAlchemy models. |
| src/abyssal_tome/app.py | Replaced GUI app with FastAPI backend; defines ORM and Pydantic models, and implements REST API endpoints. |
| src/abyssal_tome/models.py | Added SQLAlchemy ORM and Pydantic models for rulings and provenance data. |
| tests/test_app.py | Added a test for the /rulings API endpoint, checking status and response format. |
| .pre-commit-config.yaml | Added a new pre-commit hook for Pyright static type checking on Python files. |
| pyproject.toml | Added pyright==1.1.365 as a pinned dev dependency. |
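The model pair in `src/abyssal_tome/models.py` is summarized but not shown here. As a rough, hypothetical approximation of its shape (stdlib dataclasses only; the real project uses Pydantic and SQLAlchemy, and the field names below are inferred from the dataset later in this PR):

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Provenance:
    """Where a ruling came from (shape inferred from the sample JSON)."""
    source_type: str
    retrieval_date: str
    source_url: str
    source_name: str | None = None
    source_date: str | None = None


@dataclass
class Ruling:
    """A single ruling entry (shape inferred from the sample JSON)."""
    id: str
    source_card_code: str
    ruling_type: str
    text: str
    related_card_codes: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    question: str | None = None
    answer: str | None = None


# Build model instances from one JSON-like entry
raw = {
    "id": "r1",
    "source_card_code": "00000",
    "ruling_type": "CLARIFICATION",
    "text": "Example clarification.",
    "provenance": {
        "source_type": "forum",
        "retrieval_date": "2024-03-16T10:00:00Z",
        "source_url": "https://example.invalid/thread/1",
    },
}
ruling = Ruling(
    id=raw["id"],
    source_card_code=raw["source_card_code"],
    ruling_type=raw["ruling_type"],
    text=raw["text"],
)
prov = Provenance(**raw["provenance"])
print(ruling.ruling_type, prov.source_type)  # → CLARIFICATION forum
```

In the real code the Pydantic model additionally validates types and the SQLAlchemy model maps these fields to the `rulings` table; the dataclasses above only mirror the field layout.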

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant FastAPI_App
    participant SQLite_DB

    Client->>FastAPI_App: GET /rulings
    FastAPI_App->>SQLite_DB: Query rulings (with pagination)
    SQLite_DB-->>FastAPI_App: Return list of rulings
    FastAPI_App-->>Client: Return JSON list of rulings

    Client->>FastAPI_App: GET /rulings/{ruling_id}
    FastAPI_App->>SQLite_DB: Query ruling by ID
    SQLite_DB-->>FastAPI_App: Return ruling or not found
    FastAPI_App-->>Client: Return ruling JSON or 404

    Client->>FastAPI_App: GET /search?query=...
    FastAPI_App->>SQLite_DB: Search rulings (LIKE on fields)
    SQLite_DB-->>FastAPI_App: Return matching rulings
    FastAPI_App-->>Client: Return JSON list of matches
```
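The `/search` leg above (a `LIKE` match over ruling fields) can be sketched with the stdlib `sqlite3` module. The table layout and helper name here are illustrative only, not necessarily those of the real app:

```python
import sqlite3

# In-memory database with a minimal, hypothetical rulings table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rulings (id TEXT PRIMARY KEY, text TEXT, tags TEXT)")
conn.executemany(
    "INSERT INTO rulings VALUES (?, ?, ?)",
    [
        ("r1", "Lucky! makes the test successful instead.", "skill-test"),
        ("r2", "Machete bonus damage vs Massive enemies.", "combat"),
    ],
)


def search_rulings(conn: sqlite3.Connection, query: str, limit: int = 20):
    """Parameterized LIKE search over two fields, as the diagram describes."""
    pattern = f"%{query}%"
    return conn.execute(
        "SELECT id, text FROM rulings WHERE text LIKE ? OR tags LIKE ? LIMIT ?",
        (pattern, pattern, limit),
    ).fetchall()


print(search_rulings(conn, "Lucky"))
```

The parameter binding matters: interpolating `query` directly into the SQL string would open the endpoint to injection, which is one thing to check in the real `app.py` implementation.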

Possibly related PRs

  • Refactor AI enrichment script to use LLM #12: both PRs modify `scripts/enrich_rulings_ai.py`, replacing placeholder AI logic with real OpenAI API calls and adding the `call_openai_api` helper, so their changes are directly related.

Poem

A backend reborn, the GUI set free,
FastAPI now rules the data sea.
With SQLite as anchor, and endpoints so bright,
Rulings and clarifications are served day and night.
Tests hop in, JSONs abound—
In this new warren, robust data is found!
🐇✨


@danelkay93
Collaborator Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Jul 13, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copilot AI left a comment

Pull Request Overview

This PR lays the groundwork for the new data pipeline by adding a SQLite loading script, converting the Flet app to a SQLAlchemy‐backed FastAPI service, and introducing a basic endpoint test.

  • Adds a basic test for the /rulings endpoint.
  • Refactors app.py to use SQLAlchemy models and FastAPI endpoints (/rulings, /rulings/{id}, /search).
  • Introduces scripts/load_data_to_sqlite.py to populate the SQLite database from enriched JSON.

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/test_app.py | Added TestClient and a test for GET /rulings. |
| src/abyssal_tome/app.py | Replaced Flet UI logic with FastAPI, SQLAlchemy setup, and endpoints. |
| scripts/load_data_to_sqlite.py | New script to read JSON and insert records into SQLite. |
| scripts/enrich_rulings_ai.py | Updated AI helper functions to call OpenAI API. |
| assets/processed_rulings_v3_ai_enriched.json | Sample data for the enrichment pipeline. |
Comments suppressed due to low confidence (3)

tests/test_app.py:7

  • [nitpick] This test only covers the GET /rulings endpoint. Consider adding tests for pagination parameters (skip, limit), GET /rulings/{ruling_id}, and the /search endpoint to improve coverage.

    `response = client.get("/rulings")`

scripts/load_data_to_sqlite.py:73

  • The session is committed but not explicitly closed; consider calling session.close() in a finally block to ensure database connections are properly released.

    `rulings_data = json.load(f)`
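The close-in-a-finally pattern that the second suppressed comment recommends looks like this with the stdlib `sqlite3` module (a sketch only; the actual script uses a SQLAlchemy session, whose `close()` plays the same role):

```python
import sqlite3


def load_rows(db_path: str, rows: list[tuple[str, str]]) -> int:
    """Insert rows into a rulings table, always releasing the connection."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS rulings (id TEXT PRIMARY KEY, text TEXT)"
        )
        conn.executemany("INSERT INTO rulings VALUES (?, ?)", rows)
        conn.commit()
        return len(rows)
    except Exception:
        conn.rollback()  # undo the partial batch on failure
        raise
    finally:
        conn.close()  # runs on both success and failure


count = load_rows(":memory:", [("r1", "first"), ("r2", "second")])
print(count)  # → 2
```

Without the `finally`, an exception between `connect()` and `commit()` would leak the connection; with SQLAlchemy the equivalent is `session.close()` (or a `with session_factory() as session:` context manager).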

Comment thread src/abyssal_tome/app.py

```python
# --- Database Setup ---
DATABASE_URL = "sqlite:///abyssal_tome.db"
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
```

Copilot AI Jul 13, 2025

The database tables are never created before handling requests. Consider calling Base.metadata.create_all(engine) at application startup to ensure the 'rulings' table exists.

Suggested change

```diff
 engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
+# Ensure all tables are created at application startup
+Base.metadata.create_all(engine)
```
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 12

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32174e5 and 45ac0db.

⛔ Files ignored due to path filters (1)
  • abyssal_tome.db is excluded by !**/*.db
📒 Files selected for processing (5)
  • assets/processed_rulings_v3_ai_enriched.json (1 hunks)
  • scripts/enrich_rulings_ai.py (2 hunks)
  • scripts/load_data_to_sqlite.py (1 hunks)
  • src/abyssal_tome/app.py (1 hunks)
  • tests/test_app.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
scripts/load_data_to_sqlite.py (2)
src/abyssal_tome/app.py (3)
  • Provenance (33-38)
  • Ruling (40-53)
  • RulingDB (18-30)
scripts/enrich_rulings_ai.py (1)
  • main (308-380)
src/abyssal_tome/app.py (2)
scripts/load_data_to_sqlite.py (3)
  • RulingDB (38-50)
  • Provenance (16-21)
  • Ruling (23-33)
scripts/process_new_format.py (2)
  • Provenance (65-70)
  • Ruling (73-88)
🪛 Ruff (0.11.9)

scripts/load_data_to_sqlite.py
  • 64: error() call on root logger (LOG015)
  • 69: Variable Session in function should be lowercase (N806)
  • 76: open() should be replaced by Path.open() (PTH123)

scripts/enrich_rulings_ai.py
  • 29: Line too long (102 > 100) (E501)
  • 36: error() call on root logger (LOG015)
  • 48-50: info() call on root logger (LOG015)
  • 49: Line too long (104 > 100) (E501)
  • 52: Line too long (133 > 100) (E501)
  • 54: Line too long (111 > 100) (E501)
  • 78: info() call on root logger (LOG015)
  • 105: info() call on root logger (LOG015)
  • 130: info() call on root logger (LOG015)
  • 133: Line too long (146 > 100) (E501)

src/abyssal_tome/app.py
  • 63: Missing return type annotation for public function get_db (ANN201)
  • 73: Missing return type annotation for public function get_rulings (ANN201)
  • 73: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
  • 78: Missing return type annotation for public function get_ruling (ANN201)
  • 78: Do not perform function call Depends in argument defaults (B008)
  • 86: Missing return type annotation for public function search_rulings (ANN201)
  • 86: Do not perform function call Depends in argument defaults (B008)

Comment thread tests/test_app.py
Comment on lines +6 to +9

```python
def test_get_rulings():
    response = client.get("/rulings")
    assert response.status_code == 200
    assert isinstance(response.json(), list)
```

@coderabbitai coderabbitai bot Jul 13, 2025

🧹 Nitpick (assertive)

🛠️ Refactor suggestion

Expand test coverage for comprehensive API validation.

The current test only verifies basic endpoint availability. Consider adding tests for:

  • Pagination parameters (skip, limit)
  • Individual ruling retrieval (/rulings/{ruling_id})
  • Search functionality (/search)
  • Data structure validation
  • Error cases (404, invalid parameters)

Would you like me to generate comprehensive test cases covering all endpoints and edge cases?
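One of the suggested additions, data structure validation, can be exercised even without a running server. A hypothetical helper that checks each ruling payload for its required keys (key set inferred from the sample dataset in this PR):

```python
# Required keys inferred from assets/processed_rulings_v3_ai_enriched.json
REQUIRED_KEYS = {"id", "source_card_code", "ruling_type", "text", "provenance"}


def validate_ruling_payload(item) -> list[str]:
    """Return the sorted list of missing required keys; empty means valid."""
    if not isinstance(item, dict):
        return sorted(REQUIRED_KEYS)
    return sorted(REQUIRED_KEYS - item.keys())


good = {
    "id": "r1",
    "source_card_code": "00000",
    "ruling_type": "CLARIFICATION",
    "text": "Example clarification.",
    "provenance": {},
}
bad = {"id": "r2", "text": "Missing most fields."}

print(validate_ruling_payload(good))  # → []
print(validate_ruling_payload(bad))
```

In an expanded test suite, a check like this could run over every item returned by `/rulings` to assert the response schema, alongside the status-code assertions the existing test already makes.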

🤖 Prompt for AI Agents
In tests/test_app.py around lines 6 to 9, the current test only checks basic
availability of the /rulings endpoint. Expand test coverage by adding tests for
pagination parameters like skip and limit on /rulings, tests for retrieving
individual rulings via /rulings/{ruling_id}, tests for the /search endpoint
functionality, validation of the data structure returned by these endpoints, and
tests for error cases such as 404 responses and invalid parameters. Implement
these additional test cases to ensure comprehensive API validation.

@danelkay93
Collaborator Author

Yes, please do.

@coderabbitai
Contributor

@danelkay93 I'll generate comprehensive test cases covering all endpoints and edge cases for your FastAPI application.

Comprehensive Test Suite Expansion

  • Add tests for pagination parameters on /rulings endpoint
  • Add tests for individual ruling retrieval via /rulings/{ruling_id}
  • Add tests for search functionality via /search endpoint
  • Add data structure validation tests
  • Add error case testing (404, invalid parameters)
  • Add fixtures and test data setup

I'll implement a comprehensive test suite that covers all your FastAPI endpoints with proper mocking, data validation, and error handling. Another agent will handle implementing these changes to your test file.


@coderabbitai
Contributor
A summary of the changes CodeRabbit can apply:

  • Add a comprehensive pytest test file at tests/test_app.py that expands coverage by including fixtures, endpoint availability, pagination, validation, individual retrieval, search functionality, data structure validation, error handling, integration workflows, performance edge cases, content type checks, and database error handling to thoroughly test the FastAPI app’s ruling-related endpoints.

Comment on lines +1 to +56

```json
[
  {
    "id": "f787ed89-a728-410b-be0e-dcb3d6ed118c",
    "source_card_code": "00000",
    "related_card_codes": [],
    "provenance": {
      "source_type": "discord_mythosbusters_rules",
      "source_name": null,
      "source_date": null,
      "retrieval_date": "2024-03-16T10:00:00Z",
      "source_url": "https://discord.com/channels/mythosbusters/1234567890"
    },
    "original_html_snippet": "UserX: Quick question about [Lucky!]. If I fail by 2, can I play two copies to pass?\nMJNewman_FFG: No, Lucky! (0) says 'If this skill test is successful instead'. Once the first Lucky resolves and makes the test successful, the trigger condition for the second Lucky is no longer met. (Approx Dec 2022)",
    "tags": [],
    "ruling_type": "CLARIFICATION",
    "question": null,
    "answer": null,
    "text": "UserX: Quick question about [Lucky!]. If I fail by 2, can I play two copies to pass?\nMJNewman_FFG: No, Lucky! (0) says 'If this skill test is successful instead'. Once the first Lucky resolves and makes the test successful, the trigger condition for the second Lucky is no longer met. (Approx Dec 2022)"
  },
  {
    "id": "07f645bc-da8e-45db-9c3a-3330d8585f87",
    "source_card_code": "00000",
    "related_card_codes": [],
    "provenance": {
      "source_type": "bgg_card_comment",
      "source_name": null,
      "source_date": null,
      "retrieval_date": "2024-03-16T10:05:00Z",
      "source_url": "https://boardgamegeek.com/arkham-horror-the-card-game/card/some-card/forums/0"
    },
    "original_html_snippet": "I saw a ruling somewhere that said Machete's bonus damage applies against Massive enemies if you're engaged with them. Was that official? Seems to be from an old email.",
    "tags": [],
    "ruling_type": "CLARIFICATION",
    "question": null,
    "answer": null,
    "text": "I saw a ruling somewhere that said Machete's bonus damage applies against Massive enemies if you're engaged with them. Was that official? Seems to be from an old email."
  },
  {
    "id": "67ff66c3-a24e-46a2-9d28-08298b3e6b9a",
    "source_card_code": "00000",
    "related_card_codes": [],
    "provenance": {
      "source_type": "generic_forum_post",
      "source_name": null,
      "source_date": null,
      "retrieval_date": "2024-03-16T10:10:00Z",
      "source_url": "someforum.com/thread/123"
    },
    "original_html_snippet": "Q: Can [Working a Hunch] be played if there are no clues at your location? A: The card says 'Discover 1 clue at your location'. If there are no clues to discover, you cannot play the card, as you cannot resolve its effect. This is a general game principle for 'discover' effects. (Source: FFG Email ruling, early 2018)",
    "tags": [],
    "ruling_type": "CLARIFICATION",
    "question": null,
    "answer": null,
    "text": "Q: Can [Working a Hunch] be played if there are no clues at your location? A: The card says 'Discover 1 clue at your location'. If there are no clues to discover, you cannot play the card, as you cannot resolve its effect. This is a general game principle for 'discover' effects. (Source: FFG Email ruling, early 2018)"
  }
]
```
@coderabbitai
Contributor

💡 Verification agent

🧩 Analysis chain

AI enrichment appears incomplete - verify the enrichment pipeline.

The JSON data shows signs of incomplete AI enrichment:

  • All entries use placeholder source_card_code: "00000"
  • Empty related_card_codes despite card references like "[Lucky!]" and "[Machete]" in text
  • Empty tags arrays
  • Null question/answer fields despite Q&A format in the text

This suggests the AI enrichment in scripts/enrich_rulings_ai.py may not have executed properly.

Run the following script to verify the AI enrichment process:


🏁 Script executed:

```bash
#!/bin/bash
# Description: Check if the OpenAI API key is set and test the enrichment script

# Test 1: Check if OPENAI_API_KEY environment variable is set
if [ -z "$OPENAI_API_KEY" ]; then
    echo "ERROR: OPENAI_API_KEY environment variable is not set"
else
    echo "OPENAI_API_KEY is set"
fi

# Test 2: Check if the enrichment script exists and has proper imports
echo -e "\n--- Checking enrichment script ---"
if [ -f "scripts/enrich_rulings_ai.py" ]; then
    # Check for OpenAI import and usage
    rg -A 2 "from openai import OpenAI|client = OpenAI" scripts/enrich_rulings_ai.py

    # Check if the AI functions are being called
    echo -e "\n--- AI function calls ---"
    rg "ai_get_related_cards|ai_generate_tags|ai_extract_q_and_a" scripts/enrich_rulings_ai.py
fi

# Test 3: Look for any error logs from previous runs
echo -e "\n--- Checking for error patterns in logs ---"
fd -e log -e txt | xargs -I {} rg -i "error.*openai|api.*key|enrichment.*failed" {} 2>/dev/null || echo "No log files found"
```

Length of output: 1459


Set and export OPENAI_API_KEY, then rerun AI enrichment

The enrichment script never saw a valid API key (we saw “ERROR: OPENAI_API_KEY environment variable is not set”), so it skipped all OpenAI calls. As a result, fields like source_card_code, related_card_codes, tags, and question/answer remain unset.

• Export your key before running the script, in your shell or CI:
  `export OPENAI_API_KEY=<your_key_here>`
• Rerun the enrichment:
  `python scripts/enrich_rulings_ai.py assets/processed_rulings_v3.json assets/processed_rulings_v3_ai_enriched.json`
• Verify the output no longer contains placeholders:
  `grep -R '"source_card_code": "00000"' assets/processed_rulings_v3_ai_enriched.json` should return no matches
• Inspect a sample entry to ensure related_card_codes, tags, question, and answer are populated

🤖 Prompt for AI Agents
In assets/processed_rulings_v3_ai_enriched.json lines 1 to 56, the AI enrichment
script did not run properly because the OPENAI_API_KEY environment variable was
not set, resulting in missing data fields like source_card_code,
related_card_codes, tags, question, and answer. To fix this, export your OpenAI
API key in your shell or CI environment using 'export
OPENAI_API_KEY=<your_key_here>', then rerun the enrichment script with the
correct input and output file paths. After rerunning, verify that the output
file no longer contains placeholder values such as "00000" for source_card_code
and that the relevant fields are properly populated.

Comment thread scripts/load_data_to_sqlite.py Outdated
Comment thread scripts/load_data_to_sqlite.py Outdated
Comment on lines +53 to +100

```python
def main():
    # Define paths
    json_path = Path("assets/processed_rulings_v3_ai_enriched.json")
    db_path = Path("abyssal_tome.db")

    # Check if JSON file exists
    if not json_path.exists():
        logging.error(f"JSON file not found at {json_path}")
        return

    # Create engine and session
    engine = create_engine(f"sqlite:///{db_path}")
    Session = sessionmaker(bind=engine)
    session = Session()

    # Create tables
    Base.metadata.create_all(engine)

    # Load data from JSON
    with open(json_path, 'r', encoding='utf-8') as f:
        rulings_data = json.load(f)

    # Process and add data to the database
    for ruling_dict in rulings_data:
        # Validate with Pydantic
        ruling = Ruling(**ruling_dict)

        # Create a RulingDB instance
        ruling_db = RulingDB(
            id=ruling.id,
            source_card_code=ruling.source_card_code,
            related_card_codes=ruling.related_card_codes,
            ruling_type=ruling.ruling_type,
            question=ruling.question,
            answer=ruling.answer,
            text=ruling.text,
            provenance=ruling.provenance.dict(),
            original_html_snippet=ruling.original_html_snippet,
            tags=ruling.tags,
        )
        session.add(ruling_db)

    # Commit the session
    session.commit()
    logging.info(f"Successfully loaded {len(rulings_data)} rulings into {db_path}")


if __name__ == "__main__":
    main()
```
@coderabbitai
Contributor

🛠️ Refactor suggestion

⚠️ Potential issue

Add error handling and use proper conventions.

The script lacks error handling for database operations and has several issues flagged by static analysis.

Apply these improvements:

```diff
-def main():
+def main() -> None:
     # Define paths
     json_path = Path("assets/processed_rulings_v3_ai_enriched.json")
     db_path = Path("abyssal_tome.db")
+    logger = logging.getLogger(__name__)

     # Check if JSON file exists
     if not json_path.exists():
-        logging.error(f"JSON file not found at {json_path}")
+        logger.error(f"JSON file not found at {json_path}")
         return

     # Create engine and session
     engine = create_engine(f"sqlite:///{db_path}")
-    Session = sessionmaker(bind=engine)
-    session = Session()
+    session_factory = sessionmaker(bind=engine)
+    session = session_factory()

     # Create tables
     Base.metadata.create_all(engine)

     # Load data from JSON
-    with open(json_path, 'r', encoding='utf-8') as f:
+    with json_path.open('r', encoding='utf-8') as f:
         rulings_data = json.load(f)

-    # Process and add data to the database
-    for ruling_dict in rulings_data:
-        # Validate with Pydantic
-        ruling = Ruling(**ruling_dict)
-
-        # Create a RulingDB instance
-        ruling_db = RulingDB(
-            id=ruling.id,
-            source_card_code=ruling.source_card_code,
-            related_card_codes=ruling.related_card_codes,
-            ruling_type=ruling.ruling_type,
-            question=ruling.question,
-            answer=ruling.answer,
-            text=ruling.text,
-            provenance=ruling.provenance.dict(),
-            original_html_snippet=ruling.original_html_snippet,
-            tags=ruling.tags,
-        )
-        session.add(ruling_db)
-
-    # Commit the session
-    session.commit()
-    logging.info(f"Successfully loaded {len(rulings_data)} rulings into {db_path}")
+    try:
+        # Process and add data to the database
+        for ruling_dict in rulings_data:
+            # Validate with Pydantic
+            ruling = Ruling(**ruling_dict)
+
+            # Check for existing ruling to avoid duplicates
+            existing = session.query(RulingDB).filter_by(id=ruling.id).first()
+            if existing:
+                logger.warning(f"Ruling {ruling.id} already exists, skipping")
+                continue
+
+            # Create a RulingDB instance
+            ruling_db = RulingDB(
+                id=ruling.id,
+                source_card_code=ruling.source_card_code,
+                related_card_codes=ruling.related_card_codes,
+                ruling_type=ruling.ruling_type,
+                question=ruling.question,
+                answer=ruling.answer,
+                text=ruling.text,
+                provenance=ruling.provenance.dict(),
+                original_html_snippet=ruling.original_html_snippet,
+                tags=ruling.tags,
+            )
+            session.add(ruling_db)
+
+        # Commit the session
+        session.commit()
+        logger.info(f"Successfully loaded {len(rulings_data)} rulings into {db_path}")
+    except Exception as e:
+        session.rollback()
+        logger.error(f"Error loading rulings: {e}")
+        raise
+    finally:
+        session.close()
```

🤖 Prompt for AI Agents
In scripts/load_data_to_sqlite.py around lines 53 to 100, add try-except blocks
around the database session operations to catch and log exceptions during data
insertion and commit. Ensure the session is properly closed in a finally block
to avoid resource leaks. Also, use logging instead of print statements and
follow PEP8 conventions for imports and function definitions. This will improve
robustness and maintainability of the script.
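The commit/rollback/close pattern the prompt describes can be illustrated without SQLAlchemy; a minimal sketch using the stdlib sqlite3 module (the table and column names here are hypothetical, not the project's schema):

```python
import logging
import sqlite3

logger = logging.getLogger(__name__)

def load_rows(db_path: str, rows: list) -> None:
    """Insert rows, rolling back on error and always closing the connection."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS rulings (id TEXT PRIMARY KEY, text TEXT)"
        )
        for row in rows:
            # INSERT OR IGNORE mirrors the duplicate-skipping check in the script
            conn.execute(
                "INSERT OR IGNORE INTO rulings (id, text) VALUES (?, ?)", row
            )
        conn.commit()
        logger.info("Loaded %d rows", len(rows))
    except Exception:
        conn.rollback()
        logger.exception("Error loading rows")
        raise
    finally:
        conn.close()
```

The same shape (try for commit, except for rollback, finally for close) carries over directly to a SQLAlchemy Session.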

Comment thread scripts/enrich_rulings_ai.py
Comment thread src/abyssal_tome/app.py Outdated
Comment thread src/abyssal_tome/app.py
Comment thread src/abyssal_tome/app.py
Comment thread src/abyssal_tome/app.py
Comment on lines +67 to +70
@app.get("/rulings", response_model=List[Ruling])
def get_rulings(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
rulings = db.query(RulingDB).offset(skip).limit(limit).all()
return rulings

🧹 Nitpick (assertive)

Add parameter validation and adjust default limit.

The default limit of 100 might be too high for performance. Also, negative values should be prevented.

 @app.get("/rulings", response_model=List[Ruling])
-def get_rulings(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
+def get_rulings(skip: int = 0, limit: int = 20, db: Session = Depends(get_db)) -> List[Ruling]:
+    if skip < 0:
+        raise HTTPException(status_code=400, detail="Skip must be non-negative")
+    if limit < 1 or limit > 100:
+        raise HTTPException(status_code=400, detail="Limit must be between 1 and 100")
     rulings = db.query(RulingDB).offset(skip).limit(limit).all()
     return rulings
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@app.get("/rulings", response_model=List[Ruling])
def get_rulings(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
rulings = db.query(RulingDB).offset(skip).limit(limit).all()
return rulings
@app.get("/rulings", response_model=List[Ruling])
def get_rulings(skip: int = 0, limit: int = 20, db: Session = Depends(get_db)) -> List[Ruling]:
if skip < 0:
raise HTTPException(status_code=400, detail="Skip must be non-negative")
if limit < 1 or limit > 100:
raise HTTPException(status_code=400, detail="Limit must be between 1 and 100")
rulings = db.query(RulingDB).offset(skip).limit(limit).all()
return rulings
🤖 Prompt for AI Agents
In src/abyssal_tome/app.py around lines 67 to 70, the get_rulings endpoint lacks
validation for the skip and limit parameters, allowing negative values and has a
high default limit of 100. Add validation to ensure skip and limit are
non-negative integers and reduce the default limit to a lower value such as 20
to improve performance. Use parameter validation techniques like FastAPI's Query
with ge=0 for non-negative enforcement and set the default limit accordingly.

Comment thread src/abyssal_tome/app.py
Comment on lines +79 to +88
@app.get("/search", response_model=List[Ruling])
def search_rulings(query: str, db: Session = Depends(get_db)):
# A simple search implementation
search_query = f"%{query}%"
rulings = db.query(RulingDB).filter(
(RulingDB.question.like(search_query)) |
(RulingDB.answer.like(search_query)) |
(RulingDB.text.like(search_query))
).all()
return rulings

🧹 Nitpick (assertive)

Enhance search to include more fields and add pagination.

The current search only covers question, answer, and text fields, missing original_html_snippet. Also, search results should be paginated.

 @app.get("/search", response_model=List[Ruling])
-def search_rulings(query: str, db: Session = Depends(get_db)):
+def search_rulings(
+    query: str,
+    skip: int = 0,
+    limit: int = 20,
+    db: Session = Depends(get_db)
+) -> List[Ruling]:
+    if not query or len(query) < 2:
+        raise HTTPException(status_code=400, detail="Query must be at least 2 characters")
+    if skip < 0:
+        raise HTTPException(status_code=400, detail="Skip must be non-negative")
+    if limit < 1 or limit > 100:
+        raise HTTPException(status_code=400, detail="Limit must be between 1 and 100")
+        
     # A simple search implementation
     search_query = f"%{query}%"
     rulings = db.query(RulingDB).filter(
         (RulingDB.question.like(search_query)) |
         (RulingDB.answer.like(search_query)) |
-        (RulingDB.text.like(search_query))
-    ).all()
+        (RulingDB.text.like(search_query)) |
+        (RulingDB.original_html_snippet.like(search_query))
+    ).offset(skip).limit(limit).all()
     return rulings
📝 Committable suggestion


Suggested change
@app.get("/search", response_model=List[Ruling])
def search_rulings(query: str, db: Session = Depends(get_db)):
# A simple search implementation
search_query = f"%{query}%"
rulings = db.query(RulingDB).filter(
(RulingDB.question.like(search_query)) |
(RulingDB.answer.like(search_query)) |
(RulingDB.text.like(search_query))
).all()
return rulings
@app.get("/search", response_model=List[Ruling])
def search_rulings(
query: str,
skip: int = 0,
limit: int = 20,
db: Session = Depends(get_db)
) -> List[Ruling]:
if not query or len(query) < 2:
raise HTTPException(status_code=400, detail="Query must be at least 2 characters")
if skip < 0:
raise HTTPException(status_code=400, detail="Skip must be non-negative")
if limit < 1 or limit > 100:
raise HTTPException(status_code=400, detail="Limit must be between 1 and 100")
# A simple search implementation
search_query = f"%{query}%"
rulings = (
db.query(RulingDB)
.filter(
(RulingDB.question.like(search_query)) |
(RulingDB.answer.like(search_query)) |
(RulingDB.text.like(search_query)) |
(RulingDB.original_html_snippet.like(search_query))
)
.offset(skip)
.limit(limit)
.all()
)
return rulings
🧰 Tools
🪛 Ruff (0.11.9)

86-86: Missing return type annotation for public function search_rulings

(ANN201)


86-86: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

🤖 Prompt for AI Agents
In src/abyssal_tome/app.py around lines 79 to 88, the search function only
filters by question, answer, and text fields and returns all results without
pagination. Update the filter to also include the original_html_snippet field in
the search criteria using a LIKE condition. Add pagination parameters such as
limit and offset to the function signature, apply them to the query using
limit() and offset() methods, and return the paginated results.

This commit addresses a number of refactoring suggestions to improve the overall quality of the codebase.

- Extracted duplicated Pydantic and SQLAlchemy models into a shared `src/abyssal_tome/models.py` module.
- Improved `scripts/load_data_to_sqlite.py` with better error handling, logging, and session management.
- Refactored `scripts/enrich_rulings_ai.py` to add error handling for the OpenAI API key, use a dedicated logger, and make the OpenAI model configurable.
- Enhanced `src/abyssal_tome/app.py` with type annotations, a configurable database URL, endpoint validation, and improved search functionality.
- Resolved a merge conflict in `scripts/enrich_rulings_ai.py`.
- Added Pyright to the pre-commit configuration for static type checking.
@danelkay93 danelkay93 closed this Jul 13, 2025
@danelkay93 danelkay93 reopened this Jul 13, 2025
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🔭 Outside diff range comments (1)
scripts/enrich_rulings_ai.py (1)

248-248: Consider using deep copy for nested dictionaries.

The current shallow copy may not properly isolate nested dictionaries like provenance if they're modified.

+        import copy
         # Make a copy to avoid modifying the original list of dicts in-place if it's reused
-        enriched_ruling = ruling_dict.copy()  # Shallow copy for dicts of primitives/nested dicts
+        enriched_ruling = copy.deepcopy(ruling_dict)  # Deep copy for nested structures
♻️ Duplicate comments (2)
scripts/load_data_to_sqlite.py (1)

20-33: Fix code style issues flagged by static analysis.

Apply these improvements:

+logger = logging.getLogger(__name__)
+
 if not json_path.exists():
-    logging.error(f"JSON file not found at {json_path}, aborting.")
+    logger.error(f"JSON file not found at {json_path}, aborting.")
     return

 engine = create_engine(f"sqlite:///{db_path}")
-Session = sessionmaker(bind=engine)
-session = Session()
+session_factory = sessionmaker(bind=engine)
+session = session_factory()

 try:
     Base.metadata.create_all(engine)
-    logging.info("Database tables created.")
+    logger.info("Database tables created.")

-    with open(json_path, 'r', encoding='utf-8') as f:
+    with json_path.open('r', encoding='utf-8') as f:
         rulings_data = json.load(f)
src/abyssal_tome/app.py (1)

14-16: Create database tables at application startup.

The application will fail if the database tables don't exist.

Add table creation after the engine setup:

 engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
 SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
+
+# Create tables at startup
+Base.metadata.create_all(engine)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 45ac0db and cf66170.

📒 Files selected for processing (6)
  • .pre-commit-config.yaml (1 hunks)
  • pyproject.toml (1 hunks)
  • scripts/enrich_rulings_ai.py (7 hunks)
  • scripts/load_data_to_sqlite.py (1 hunks)
  • src/abyssal_tome/app.py (1 hunks)
  • src/abyssal_tome/models.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
scripts/load_data_to_sqlite.py (1)
src/abyssal_tome/models.py (3)
  • Provenance (22-27)
  • Ruling (29-42)
  • RulingDB (8-20)
🪛 YAMLlint (1.37.1)
.pre-commit-config.yaml

[error] 28-28: too many spaces after hyphen

(hyphens)

🪛 Ruff (0.11.9)
scripts/load_data_to_sqlite.py

22-22: error() call on root logger

(LOG015)


26-26: Variable Session in function should be lowercase

(N806)


31-31: info() call on root logger

(LOG015)


33-33: open() should be replaced by Path.open()

(PTH123)


53-53: error() call on root logger

(LOG015)


56-56: info() call on root logger

(LOG015)


59-59: error() call on root logger

(LOG015)


63-63: info() call on root logger

(LOG015)

src/abyssal_tome/app.py

30-30: Module level import not at top of file

(E402)


35-35: Missing return type annotation for public function get_rulings

(ANN201)


35-35: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


35-35: Line too long (105 > 100)

(E501)


40-40: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


48-48: Missing return type annotation for public function search_rulings

(ANN201)


48-48: Line too long (120 > 100)

(E501)


48-48: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

scripts/enrich_rulings_ai.py

22-22: Line too long (106 > 100)

(E501)


27-27: Line too long (103 > 100)

(E501)


35-35: Line too long (102 > 100)

(E501)


58-58: Line too long (104 > 100)

(E501)


61-61: Line too long (133 > 100)

(E501)


63-63: Line too long (111 > 100)

(E501)


142-142: Line too long (146 > 100)

(E501)


352-352: Line too long (105 > 100)

(E501)


387-387: Line too long (103 > 100)

(E501)

🔇 Additional comments (4)
pyproject.toml (1)

47-48: Good addition of Pyright for type checking.

The version pinning ensures consistent type checking across environments.

scripts/enrich_rulings_ai.py (3)

20-24: Past review comment has been properly addressed.

The error handling for missing API key has been correctly implemented with proper logging and exit behavior.


17-17: Past review comment has been properly addressed.

The logger instance is now properly configured using logging.getLogger(__name__) instead of the root logger.


26-26: Past review comment has been properly addressed.

The model parameter is now configurable via the OPENAI_MODEL environment variable with a sensible default.

Comment thread .pre-commit-config.yaml
Comment on lines +28 to +32
- id: pyright
name: Pyright
entry: .venv/bin/pyright
language: system
types: [python]

🧹 Nitpick (assertive)

Fix YAML formatting - too many spaces after hyphen.

Apply this diff to fix the YAML formatting:

-    -   id: pyright
+    - id: pyright

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 28-28: too many spaces after hyphen

(hyphens)

🤖 Prompt for AI Agents
In .pre-commit-config.yaml around lines 28 to 32, the YAML formatting is
incorrect due to too many spaces after the hyphen in the list item for the
pyright hook. Remove the extra spaces after the hyphen so that the hyphen is
directly followed by the id key without additional spaces, ensuring proper YAML
list syntax.

Comment on lines +41 to +42
class Config:
orm_mode = True

🧹 Nitpick (assertive)

Update deprecated Pydantic configuration.

The orm_mode configuration is deprecated in Pydantic v2.

Apply this diff to use the current configuration:

    class Config:
-        orm_mode = True
+        from_attributes = True
📝 Committable suggestion


Suggested change
class Config:
orm_mode = True
class Config:
from_attributes = True
🤖 Prompt for AI Agents
In src/abyssal_tome/models.py at lines 41 to 42, the Pydantic Config class uses
the deprecated orm_mode setting. Replace the inner Config class with the Pydantic
v2 style by defining a model_config attribute using ConfigDict with
from_attributes set to True. This updates the model to the current Pydantic
configuration standard.

Comment on lines +52 to +63
logging.error(f"Error processing ruling: {ruling_dict.get('id')}, error: {e}")

session.commit()
logging.info(f"Successfully loaded {len(rulings_data)} rulings into {db_path}")

except Exception as e:
logging.error(f"An error occurred during database operations: {e}")
session.rollback()
finally:
session.close()
logging.info("Database session closed.")


🧹 Nitpick (assertive)

Use module logger instead of root logger.

Update the remaining logging calls:

-                logging.error(f"Error processing ruling: {ruling_dict.get('id')}, error: {e}")
+                logger.error(f"Error processing ruling: {ruling_dict.get('id')}, error: {e}")

         session.commit()
-        logging.info(f"Successfully loaded {len(rulings_data)} rulings into {db_path}")
+        logger.info(f"Successfully loaded {len(rulings_data)} rulings into {db_path}")

     except Exception as e:
-        logging.error(f"An error occurred during database operations: {e}")
+        logger.error(f"An error occurred during database operations: {e}")
         session.rollback()
     finally:
         session.close()
-        logging.info("Database session closed.")
+        logger.info("Database session closed.")
📝 Committable suggestion


Suggested change
logging.error(f"Error processing ruling: {ruling_dict.get('id')}, error: {e}")
session.commit()
logging.info(f"Successfully loaded {len(rulings_data)} rulings into {db_path}")
except Exception as e:
logging.error(f"An error occurred during database operations: {e}")
session.rollback()
finally:
session.close()
logging.info("Database session closed.")
logger.error(f"Error processing ruling: {ruling_dict.get('id')}, error: {e}")
session.commit()
logger.info(f"Successfully loaded {len(rulings_data)} rulings into {db_path}")
except Exception as e:
logger.error(f"An error occurred during database operations: {e}")
session.rollback()
finally:
session.close()
logger.info("Database session closed.")
🧰 Tools
🪛 Ruff (0.11.9)

53-53: error() call on root logger

(LOG015)


56-56: info() call on root logger

(LOG015)


59-59: error() call on root logger

(LOG015)


63-63: info() call on root logger

(LOG015)

🤖 Prompt for AI Agents
In scripts/load_data_to_sqlite.py around lines 52 to 63, replace all calls to
the root logger (logging.error, logging.info) with calls to the module-specific
logger instance. Identify the logger created for the module (e.g., logger =
logging.getLogger(__name__)) and use it for all logging calls in this block to
ensure consistent and configurable logging.

Comment thread src/abyssal_tome/app.py
Comment on lines +3 to +7
from fastapi import Depends, FastAPI, HTTPException
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session
from src.abyssal_tome.models import Provenance, Ruling, RulingDB, Base
from typing import List, Generator

🧹 Nitpick (assertive)

Move all imports to the top of the file.

Move the Query import to the top with other FastAPI imports:

-from fastapi import Depends, FastAPI, HTTPException
+from fastapi import Depends, FastAPI, HTTPException, Query
 from sqlalchemy import create_engine
 from sqlalchemy.orm import sessionmaker, Session
 from src.abyssal_tome.models import Provenance, Ruling, RulingDB, Base
 from typing import List, Generator

And remove line 30:

-from fastapi import Query

Also applies to: 30-30

🤖 Prompt for AI Agents
In src/abyssal_tome/app.py around lines 3 to 7 and line 30, move the import of
Query from its current position at line 30 to join the other FastAPI imports at
the top of the file. Remove the separate import statement at line 30 after
relocating it. This consolidates all imports at the top for better organization.

Comment thread src/abyssal_tome/app.py
Comment on lines +32 to +35
def get_rulings(skip: int = Query(0, ge=0), limit: int = Query(20, ge=0), db: Session = Depends(get_db)):
rulings = db.query(RulingDB).offset(skip).limit(limit).all()
return rulings


🧹 Nitpick (assertive)

Add return type annotation and fix line length.

 @app.get("/rulings", response_model=List[Ruling])
-def get_rulings(skip: int = Query(0, ge=0), limit: int = Query(20, ge=0), db: Session = Depends(get_db)):
+def get_rulings(
+    skip: int = Query(0, ge=0),
+    limit: int = Query(20, ge=0),
+    db: Session = Depends(get_db)
+) -> List[Ruling]:
     rulings = db.query(RulingDB).offset(skip).limit(limit).all()
     return rulings
📝 Committable suggestion


Suggested change
def get_rulings(skip: int = Query(0, ge=0), limit: int = Query(20, ge=0), db: Session = Depends(get_db)):
rulings = db.query(RulingDB).offset(skip).limit(limit).all()
return rulings
@app.get("/rulings", response_model=List[Ruling])
def get_rulings(
skip: int = Query(0, ge=0),
limit: int = Query(20, ge=0),
db: Session = Depends(get_db)
) -> List[Ruling]:
rulings = db.query(RulingDB).offset(skip).limit(limit).all()
return rulings
🧰 Tools
🪛 Ruff (0.11.9)

35-35: Missing return type annotation for public function get_rulings

(ANN201)


35-35: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


35-35: Line too long (105 > 100)

(E501)

🤖 Prompt for AI Agents
In src/abyssal_tome/app.py around lines 32 to 35, the function get_rulings lacks
a return type annotation and the line length may exceed style guidelines. Add an
appropriate return type annotation to the function signature, such as
List[RulingDB], and reformat the code to ensure no line exceeds the maximum
allowed length, possibly by breaking long lines or adjusting spacing.
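The bounds the review suggests (non-negative skip, limit capped at 100, default 20) can be expressed independently of FastAPI; a sketch of the validation logic that `Query(0, ge=0)` and an explicit upper check encode:

```python
def validate_pagination(skip: int = 0, limit: int = 20,
                        max_limit: int = 100) -> tuple:
    """Reject out-of-range paging parameters, mirroring Query(ge=0) plus an upper cap."""
    if skip < 0:
        raise ValueError("skip must be non-negative")
    if limit < 1 or limit > max_limit:
        raise ValueError(f"limit must be between 1 and {max_limit}")
    return skip, limit
```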

Comment thread src/abyssal_tome/app.py
Comment on lines +44 to +53
def search_rulings(query: str, skip: int = Query(0, ge=0), limit: int = Query(20, ge=0), db: Session = Depends(get_db)):
# A simple search implementation
search_query = f"%{query}%"
rulings = db.query(RulingDB).filter(
(RulingDB.question.like(search_query)) |
(RulingDB.answer.like(search_query)) |
(RulingDB.text.like(search_query)) |
(RulingDB.original_html_snippet.like(search_query))
).offset(skip).limit(limit).all()
return rulings

🧹 Nitpick (assertive)

Add return type annotation and fix line length.

 @app.get("/search", response_model=List[Ruling])
-def search_rulings(query: str, skip: int = Query(0, ge=0), limit: int = Query(20, ge=0), db: Session = Depends(get_db)):
+def search_rulings(
+    query: str,
+    skip: int = Query(0, ge=0),
+    limit: int = Query(20, ge=0),
+    db: Session = Depends(get_db)
+) -> List[Ruling]:
     # A simple search implementation
     search_query = f"%{query}%"
📝 Committable suggestion


Suggested change
def search_rulings(query: str, skip: int = Query(0, ge=0), limit: int = Query(20, ge=0), db: Session = Depends(get_db)):
# A simple search implementation
search_query = f"%{query}%"
rulings = db.query(RulingDB).filter(
(RulingDB.question.like(search_query)) |
(RulingDB.answer.like(search_query)) |
(RulingDB.text.like(search_query)) |
(RulingDB.original_html_snippet.like(search_query))
).offset(skip).limit(limit).all()
return rulings
@app.get("/search", response_model=List[Ruling])
def search_rulings(
query: str,
skip: int = Query(0, ge=0),
limit: int = Query(20, ge=0),
db: Session = Depends(get_db)
) -> List[Ruling]:
# A simple search implementation
search_query = f"%{query}%"
rulings = db.query(RulingDB).filter(
(RulingDB.question.like(search_query)) |
(RulingDB.answer.like(search_query)) |
(RulingDB.text.like(search_query)) |
(RulingDB.original_html_snippet.like(search_query))
).offset(skip).limit(limit).all()
return rulings
🧰 Tools
🪛 Ruff (0.11.9)

48-48: Missing return type annotation for public function search_rulings

(ANN201)


48-48: Line too long (120 > 100)

(E501)


48-48: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

🤖 Prompt for AI Agents
In src/abyssal_tome/app.py around lines 44 to 53, the search_rulings function
lacks a return type annotation and some lines exceed recommended length limits.
Add an appropriate return type annotation to the function signature reflecting
the expected return type, such as a list of RulingDB objects. Also, refactor the
filter conditions to break long lines into shorter ones, improving readability
and adhering to line length standards.


api_key = os.environ.get("OPENAI_API_KEY")
if not api_key:
logger.error("OPENAI_API_KEY environment variable not set. Please set it before running this script.")

🧹 Nitpick (assertive)

Fix line length violations.

Multiple lines exceed the 100-character limit as flagged by static analysis.

Consider breaking these long lines for better readability:

-    logger.error("OPENAI_API_KEY environment variable not set. Please set it before running this script.")
+    logger.error(
+        "OPENAI_API_KEY environment variable not set. Please set it before running this script."
+    )

-def call_openai_api(prompt: str, model: str = os.environ.get("OPENAI_MODEL", "gpt-4o")) -> dict | None:
+def call_openai_api(
+    prompt: str, model: str = os.environ.get("OPENAI_MODEL", "gpt-4o")
+) -> dict | None:

-                {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
+                {
+                    "role": "system", 
+                    "content": "You are a helpful assistant designed to output JSON."
+                },

Apply similar formatting to other long lines for consistency.

Also applies to: 27-27, 35-35, 58-58, 61-61, 63-63, 142-142, 352-352, 387-387

🧰 Tools
🪛 Ruff (0.11.9)

22-22: Line too long (106 > 100)

(E501)

🤖 Prompt for AI Agents
In scripts/enrich_rulings_ai.py at line 22 and other specified lines (27, 35,
58, 61, 63, 142, 352, 387), some lines exceed the 100-character limit. To fix
this, break these long lines into multiple shorter lines using line continuation
or string concatenation to improve readability and comply with style guidelines.
Apply consistent formatting across all these lines.

Comment on lines +59 to +69
prompt = f"""
Given the following ruling text for the card with code '{source_card_code}', identify any other card codes mentioned in the text.
Card codes are 5-digit strings.
Return a JSON object with a single key "related_card_codes" containing a list of the identified card codes.
Do not include the source card code '{source_card_code}' in the list.

Ruling text:
---
{ruling_text}
---
"""

🧹 Nitpick (assertive)

Consider improving prompt structure for better AI responses.

The prompt could be more structured to ensure consistent JSON responses and handle edge cases better.

     prompt = f"""
-    Given the following ruling text for the card with code '{source_card_code}', identify any other card codes mentioned in the text.
-    Card codes are 5-digit strings.
-    Return a JSON object with a single key "related_card_codes" containing a list of the identified card codes.
-    Do not include the source card code '{source_card_code}' in the list.
+    Analyze the following ruling text for the card '{source_card_code}' and identify any other card codes mentioned.
+    
+    Rules:
+    - Card codes are exactly 5-digit strings (e.g., "01001", "02034")
+    - Look for patterns like [01001], card 01001, or just 01001
+    - Exclude the source card code '{source_card_code}'
+    - Return empty list if no other cards are found
+    
+    Return JSON format: {{"related_card_codes": ["01001", "02034"]}}

     Ruling text:
     ---
     {ruling_text}
     ---
     """
📝 Committable suggestion


Suggested change
prompt = f"""
Given the following ruling text for the card with code '{source_card_code}', identify any other card codes mentioned in the text.
Card codes are 5-digit strings.
Return a JSON object with a single key "related_card_codes" containing a list of the identified card codes.
Do not include the source card code '{source_card_code}' in the list.
Ruling text:
---
{ruling_text}
---
"""
prompt = f"""
Analyze the following ruling text for the card '{source_card_code}' and identify any other card codes mentioned.
Rules:
- Card codes are exactly 5-digit strings (e.g., "01001", "02034")
- Look for patterns like [01001], card 01001, or just 01001
- Exclude the source card code '{source_card_code}'
- Return empty list if no other cards are found
Return JSON format: {{"related_card_codes": ["01001", "02034"]}}
Ruling text:
---
{ruling_text}
---
"""
🧰 Tools
🪛 Ruff (0.11.9)

61-61: Line too long (133 > 100)

(E501)


63-63: Line too long (111 > 100)

(E501)

🤖 Prompt for AI Agents
In scripts/enrich_rulings_ai.py around lines 59 to 69, the prompt for the AI
should be restructured to improve clarity and consistency of JSON responses.
Refine the prompt by explicitly instructing the AI to always return a valid JSON
object, even if no related card codes are found, and clarify the format and
constraints. Add examples or specify how to handle cases with no matches to
ensure robust parsing of the AI output.
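A deterministic regex pass could complement (or sanity-check) the LLM extraction the prompt describes; a sketch, assuming card codes are exactly five digits as the prompt states (the function name is ours, not the script's):

```python
import re

# Five digits with word boundaries, so [01001], "card 01001", and bare 01001 all match.
CARD_CODE = re.compile(r"\b(\d{5})\b")

def extract_related_codes(text: str, source_code: str) -> list:
    """Find 5-digit card codes in a ruling, excluding the source card itself."""
    seen = []
    for code in CARD_CODE.findall(text):
        if code != source_code and code not in seen:
            seen.append(code)  # preserve first-seen order, deduplicated
    return seen
```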

Comment on lines +40 to +42
if isinstance(content, dict):
return content
return json.loads(content)

⚠️ Potential issue

Fix potential JSON parsing logic error.

The content from OpenAI API is always a string that needs JSON parsing. The isinstance check for dict will never be true.

-        content = response.choices[0].message.content
-        if isinstance(content, dict):
-            return content
-        return json.loads(content)
+        content = response.choices[0].message.content
+        if content is None:
+            logger.error("Received null content from OpenAI API")
+            return None
+        return json.loads(content)
📝 Committable suggestion


Suggested change
if isinstance(content, dict):
return content
return json.loads(content)
content = response.choices[0].message.content
if content is None:
logger.error("Received null content from OpenAI API")
return None
return json.loads(content)
🤖 Prompt for AI Agents
In scripts/enrich_rulings_ai.py around lines 40 to 42, remove the isinstance
check for dict, since the content from the OpenAI API is always a string. Guard
against a None content value, then parse the string with json.loads instead of
conditionally returning it as a dict.
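The None-guard plus json.loads flow from the suggested fix can be isolated in a small helper; a sketch (the helper name is ours, not the script's):

```python
import json
import logging
from typing import Optional

logger = logging.getLogger(__name__)

def parse_api_content(content: Optional[str]) -> Optional[dict]:
    """Parse a JSON string from a chat completion; None or invalid JSON yields None."""
    if content is None:
        logger.error("Received null content from API")
        return None
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        logger.error("Content was not valid JSON: %.80s", content)
        return None
```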

…ped)

This commit addresses a number of refactoring suggestions to improve the overall quality of the codebase. The rebase onto master was skipped due to timeouts in the environment.

- Extracted duplicated Pydantic and SQLAlchemy models into a shared `src/abyssal_tome/models.py` module.
- Improved `scripts/load_data_to_sqlite.py` with better error handling, logging, and session management.
- Refactored `scripts/enrich_rulings_ai.py` to add error handling for the OpenAI API key, use a dedicated logger, and make the OpenAI model configurable.
- Enhanced `src/abyssal_tome/app.py` with type annotations, a configurable database URL, endpoint validation, and improved search functionality.
- Resolved a merge conflict in `scripts/enrich_rulings_ai.py`.
- Added Pyright to the pre-commit configuration for static type checking.