Add CypherSession class - Context retrieval #117

Closed
galshubeli wants to merge 6 commits into main from kg-agent

Conversation

@galshubeli
Contributor

@galshubeli galshubeli commented Jun 26, 2025

Summary by CodeRabbit

  • New Features

    • Added a Jupyter notebook demonstrating an AI agent for querying a UFC knowledge graph using a web interface.
    • Introduced streaming response capability for question answering, allowing users to receive answers in real time.
  • Improvements

    • Enhanced the clarity and structure of system prompts for generating Cypher queries, including better instructions, error handling, and examples.
    • Improved validation and error messages for Cypher query relationship direction checks.
  • Refactor

    • Simplified and unified internal state management for chat sessions and query generation.
    • Streamlined ontology cleaning and Cypher query generation interfaces.
  • Bug Fixes

    • Added robust error handling for Cypher generation failures.

@galshubeli galshubeli requested a review from swilly22 June 26, 2025 13:04
@coderabbitai
coderabbitai bot commented Jun 26, 2025

Walkthrough

This update refactors the chat session and QA streaming logic, consolidating QA step usage and removing the separate streaming QA step class. It restructures Cypher prompt templates for clarity and error handling, enhances Cypher validation, and introduces a comprehensive UFC knowledge graph AI agent demo notebook with Gradio integration.

Changes

| File(s) | Change Summary |
|---------|----------------|
| graphrag_sdk/chat_session.py | Refactored ChatSession to unify QA step usage, removed StreamingQAStep, improved state management, error handling, and ontology cleaning. |
| graphrag_sdk/fixtures/prompts.py | Rewrote Cypher generation prompt templates, adding structure, error handling, and improved examples. |
| graphrag_sdk/helpers.py | Refactored Cypher relation direction validation with improved regex parsing, label extraction, and error handling. |
| graphrag_sdk/steps/qa_step.py | Added run_stream method to QAStep for streaming QA responses. |
| graphrag_sdk/steps/stream_qa_step.py | Deleted StreamingQAStep class and file. |
| examples/ufc/pydantic-ai-agent-demo.ipynb | Added a full-featured UFC knowledge graph AI agent demo notebook with Gradio web UI and Pydantic AI integration. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant GradioUI
    participant Agent
    participant ChatSession
    participant QAStep

    User->>GradioUI: Submit query
    GradioUI->>Agent: Pass user query and history
    Agent->>ChatSession: Generate Cypher/query context
    ChatSession->>QAStep: run or run_stream (QA)
    QAStep-->>ChatSession: QA response (stream or full)
    ChatSession-->>Agent: Cypher, context, and answer
    Agent-->>GradioUI: Final answer
    GradioUI-->>User: Display response

Possibly related PRs

  • [Feature] Streaming option to the Q&A step #103: That PR added streaming QA by introducing StreamingQAStep and send_message_stream. This PR removes StreamingQAStep and unifies streaming under QAStep, so the two changes are directly related but structurally opposite.

Suggested labels

enhancement, Impact S, Review effort 2/5

Suggested reviewers

  • swilly22
  • gkorland

Poem

A rabbit hopped through streaming code,
Refactored steps, a lighter load.
Prompts now clear, with rules in sight,
Cypher queries crafted right.
UFC demos, Gradio’s gleam,
Knowledge hops in every stream!
🐇✨

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🔭 Outside diff range comments (1)
graphrag_sdk/helpers.py (1)

210-299: Excellent refactoring of relationship direction validation!

The new implementation using a comprehensive regex pattern is more robust and handles various edge cases better. The improved error messages that include valid relations are very helpful.

However, there's an unused variable that should be removed:

-        except Exception as e:
+        except Exception:
             # Skip problematic patterns rather than failing
             continue
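
For readers unfamiliar with this kind of check, here is a minimal sketch of regex-based relationship direction validation. It is not the code in graphrag_sdk/helpers.py: the labels, the relation names, the `VALID_RELATIONS` set, and the `validate_directions` function are all hypothetical, and the pattern only covers single-hop patterns in which both nodes carry an explicit label.

```python
import re

# Hypothetical ontology: (source label, relation, target label) triples.
VALID_RELATIONS = {
    ("Fighter", "FOUGHT_IN", "Fight"),
    ("Fight", "PART_OF", "Event"),
}

# Matches (a:Left)-[:REL]->(b:Right) and (a:Left)<-[:REL]-(b:Right).
PATTERN = re.compile(
    r"\(\s*\w*\s*:\s*(?P<left>\w+)[^)]*\)\s*"
    r"(?P<larrow><)?-\s*\[\s*\w*\s*:\s*(?P<rel>\w+)[^\]]*\]\s*-(?P<rarrow>>)?\s*"
    r"\(\s*\w*\s*:\s*(?P<right>\w+)[^)]*\)"
)


def validate_directions(cypher: str) -> list[str]:
    """Return one error message per relationship whose direction is invalid."""
    errors = []
    for m in PATTERN.finditer(cypher):
        left, rel, right = m["left"], m["rel"], m["right"]
        if m["rarrow"] and not m["larrow"]:
            source, target = left, right
        elif m["larrow"] and not m["rarrow"]:
            source, target = right, left
        else:
            continue  # undirected pattern: nothing to check
        if (source, rel, target) not in VALID_RELATIONS:
            # Include the valid alternatives so the caller can correct the query.
            valid = sorted(r for r in VALID_RELATIONS if r[1] == rel)
            errors.append(
                f"Invalid direction ({source})-[:{rel}]->({target}); "
                f"valid relations: {valid}"
            )
    return errors


good = validate_directions("MATCH (f:Fighter)-[:FOUGHT_IN]->(x:Fight) RETURN f")
reversed_ok = validate_directions("MATCH (x:Fight)<-[:FOUGHT_IN]-(f:Fighter) RETURN f")
bad = validate_directions("MATCH (x:Fight)-[:FOUGHT_IN]->(f:Fighter) RETURN f")
```

Because `finditer` returns non-overlapping matches, a chained pattern such as `(a)-[:R]->(b)-[:S]->(c)` would need extra handling in a real implementation.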
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 191fab9 and 444730e.

📒 Files selected for processing (4)
  • graphrag_sdk/chat_session.py (2 hunks)
  • graphrag_sdk/fixtures/prompts.py (2 hunks)
  • graphrag_sdk/helpers.py (2 hunks)
  • graphrag_sdk/kg.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
graphrag_sdk/helpers.py (2)
graphrag_sdk/kg.py (2)
  • ontology (131-132)
  • ontology (135-136)
graphrag_sdk/ontology.py (1)
  • get_relations_with_label (339-349)
🪛 Ruff (0.11.9)
graphrag_sdk/helpers.py

294-294: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

🪛 Flake8 (7.2.0)
graphrag_sdk/helpers.py

[error] 294-294: local variable 'e' is assigned to but never used

(F841)


[error] 300-300: expected 2 blank lines, found 1

(E302)

graphrag_sdk/chat_session.py

[error] 186-186: expected 2 blank lines, found 1

(E302)


[error] 208-208: continuation line under-indented for visual indent

(E128)

🪛 Pylint (3.3.7)
graphrag_sdk/chat_session.py

[refactor] 186-186: Too many instance attributes (8/7)

(R0902)


[refactor] 207-207: Too many arguments (7/5)

(R0913)


[refactor] 207-207: Too many positional arguments (7/5)

(R0917)


[refactor] 186-186: Too few public methods (1/2)

(R0903)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: test (openai/gpt-4.1)
🔇 Additional comments (5)
graphrag_sdk/fixtures/prompts.py (2)

401-464: Excellent restructuring of the Cypher generation system prompt!

The new structure with clear sections for core requirements, query construction rules, relationship handling, and error handling significantly improves clarity and maintainability. The addition of explicit validation checklist and simple entity query examples will help ensure more consistent Cypher query generation.


466-523: Well-structured prompt updates for consistency!

The simplified prompts with explicit validation steps and the context analysis section in the history-aware prompt ensure consistent behavior across all Cypher generation scenarios. The emphasis on name normalization and relationship direction validation aligns perfectly with the improved system prompt.

graphrag_sdk/chat_session.py (2)

54-54: Good refactoring: Extracting ontology cleaning into a standalone function!

Moving clean_ontology_for_prompt to a standalone function eliminates code duplication between ChatSession and CypherSession. The implementation correctly handles the removal of 'unique' and 'required' attributes.

Also applies to: 318-343


276-316: Well-designed search method for Cypher-only interactions!

The search method provides a clean interface that returns structured results including execution time and proper error handling. This nicely separates Cypher query generation from Q&A processing, achieving the PR's objective.

graphrag_sdk/kg.py (1)

7-7: Clean integration of CypherSession into KnowledgeGraph!

The addition of the cypher_session() factory method follows the established pattern and provides a consistent interface for creating Cypher-only sessions alongside regular chat sessions.

Also applies to: 214-223

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
graphrag_sdk/chat_session.py (1)

12-12: Fix formatting: Add blank line before class definition.

 CYPHER_ERROR_RES = "Sorry, I could not find the answer to your question"
 
+
 class ResponseDict(dict):
🧹 Nitpick comments (4)
graphrag_sdk/chat_session.py (4)

57-57: Fix formatting: Add blank line before class definition.

         return self._response.error

+
 class ChatResponse:

62-69: Consider using keyword-only arguments to improve constructor clarity.

The ChatResponse constructor has many parameters which can lead to confusion and potential errors when calling. Consider using keyword-only arguments to improve clarity and prevent positional argument mistakes.

-    def __init__(self, question: str, context: str = None, cypher: str = None, 
-                 answer: str = None, execution_time: float = None, error: str = None):
+    def __init__(self, question: str, *, context: str = None, cypher: str = None, 
+                 answer: str = None, execution_time: float = None, error: str = None):
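
To illustrate the effect of the bare `*` in the signature, here is a small self-contained sketch. The class below is a stand-in that mirrors the parameter names from the diff; it is not the SDK's actual ChatResponse, and `Optional[...]` annotations are used here as an assumption about the intended types.

```python
from typing import Optional


class ChatResponse:
    """Illustrative stand-in, not the class from graphrag_sdk."""

    def __init__(self, question: str, *, context: Optional[str] = None,
                 cypher: Optional[str] = None, answer: Optional[str] = None,
                 execution_time: Optional[float] = None,
                 error: Optional[str] = None):
        self.question = question
        self.context = context
        self.cypher = cypher
        self.answer = answer
        self.execution_time = execution_time
        self.error = error


# Keyword use works as before.
ok = ChatResponse("Who won?", cypher="MATCH (n) RETURN n", answer="Unknown")

# Passing the second argument positionally now raises TypeError,
# which prevents silently swapping context and cypher.
try:
    ChatResponse("Who won?", "some context")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Note that this change is backward incompatible for any caller that currently passes these arguments positionally.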

348-348: Fix formatting: Add blank line before function definition.

         return response

+
 def clean_ontology_for_prompt(ontology: dict) -> str:

348-373: Consider adding error handling for missing attributes.

The function assumes that 'unique' and 'required' keys exist in all attributes. Consider adding error handling to gracefully handle cases where these keys might be missing.

     for entity in ontology["entities"]:
         for attribute in entity["attributes"]:
-            del attribute['unique']
-            del attribute['required']
+            attribute.pop('unique', None)
+            attribute.pop('required', None)
     
     for relation in ontology["relations"]:
         for attribute in relation["attributes"]:
-            del attribute['unique']
-            del attribute['required']
+            attribute.pop('unique', None)
+            attribute.pop('required', None)
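
The difference between `del` and `pop` with a default can be seen in a short sketch. The function name `clean_ontology_sketch`, the deep copy, and the sample ontology are assumptions made for illustration; the real clean_ontology_for_prompt returns a string and may or may not mutate its input.

```python
import copy


def clean_ontology_sketch(ontology: dict) -> dict:
    """Strip 'unique' and 'required' flags without failing on missing keys."""
    cleaned = copy.deepcopy(ontology)  # assumption: leave the caller's dict untouched
    for section in ("entities", "relations"):
        for item in cleaned.get(section, []):
            for attribute in item.get("attributes", []):
                # pop with a default is a no-op when the key is absent,
                # whereas `del attribute['unique']` would raise KeyError.
                attribute.pop("unique", None)
                attribute.pop("required", None)
    return cleaned


sample = {
    "entities": [
        {"label": "Fighter", "attributes": [
            {"name": "name", "type": "string", "unique": True, "required": True},
            {"name": "nickname", "type": "string"},  # flags missing on purpose
        ]},
    ],
    "relations": [
        {"label": "FOUGHT_IN", "attributes": [
            {"name": "round", "type": "number", "unique": False, "required": False},
        ]},
    ],
}

result = clean_ontology_sketch(sample)
```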
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 444730e and f2777d4.

📒 Files selected for processing (3)
  • .gitignore (1 hunks)
  • graphrag_sdk/chat_session.py (9 hunks)
  • graphrag_sdk/kg.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (1)
  • graphrag_sdk/kg.py
🧰 Additional context used
🪛 Flake8 (7.2.0)
graphrag_sdk/chat_session.py

[error] 12-12: expected 2 blank lines, found 1

(E302)


[error] 57-57: expected 2 blank lines, found 1

(E302)


[error] 348-348: expected 2 blank lines, found 1

(E302)

🪛 Pylint (3.3.7)
graphrag_sdk/chat_session.py

[refactor] 62-62: Too many arguments (7/5)

(R0913)


[refactor] 62-62: Too many positional arguments (7/5)

(R0917)

🔇 Additional comments (6)
graphrag_sdk/chat_session.py (6)

125-134: Update docstring examples to reflect new property access patterns.

The enhanced docstring examples are well-structured and demonstrate both backward compatibility and new property access patterns effectively.


195-195: Access to last_complete_response is guarded against None.

The code safely checks if last_complete_response exists before accessing its answer property, which is good defensive programming.


211-224: Excellent backward compatibility implementation.

The refactored send_message method maintains backward compatibility while adding new functionality through the ResponseDict wrapper. The separation of concerns with _send_message_internal is well-designed.
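
One way such a wrapper can keep old dictionary-style callers working while offering property access is sketched below. This is a guess at the general pattern only: `ChatResponseSketch` and `ResponseDictSketch` are hypothetical names, and the field set is reduced to three for brevity.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ChatResponseSketch:
    """Hypothetical structured response object."""
    question: str
    answer: Optional[str] = None
    error: Optional[str] = None


class ResponseDictSketch(dict):
    """A dict (for existing callers) that also exposes properties (for new ones)."""

    def __init__(self, response: ChatResponseSketch):
        super().__init__(asdict(response))
        self._response = response

    @property
    def answer(self) -> Optional[str]:
        return self._response.answer

    @property
    def error(self) -> Optional[str]:
        return self._response.error


resp = ResponseDictSketch(ChatResponseSketch("Who won?", answer="Unknown"))

legacy_value = resp["answer"]   # existing code keeps using key access
new_value = resp.answer         # new code can use attribute access
```

Since the object is still a real `dict`, `isinstance(resp, dict)` checks and JSON serialization in existing callers would continue to work under this approach.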


235-244: Consistent error handling with structured response.

The error handling creates a proper ChatResponse object with error information, maintaining consistency with the new response structure.


311-346: Well-implemented search method with proper error handling.

The new search method provides a clean interface for cypher generation without QA. The error handling is consistent and the method properly updates the session state.
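
A rough sketch of a Cypher-only search that returns a structured result instead of raising is shown below. The function `search_sketch`, the `run_cypher` callable, the result keys, and the stub functions are assumptions for illustration; only the `CYPHER_ERROR_RES` string and the three-value return of the Cypher step appear elsewhere in this review.

```python
import time
from typing import Callable, Optional, Tuple

CYPHER_ERROR_RES = "Sorry, I could not find the answer to your question"

# Hypothetical step signature: question -> (context, cypher, query_execution_time)
CypherStep = Callable[[str], Tuple[str, str, float]]


def search_sketch(question: str, run_cypher: CypherStep) -> dict:
    """Generate and run a Cypher query, reporting failures in the result."""
    started = time.perf_counter()
    context: Optional[str] = None
    cypher: Optional[str] = None
    error: Optional[str] = None
    try:
        context, cypher, _ = run_cypher(question)
    except Exception as exc:
        # Surface the failure to the caller without propagating the exception.
        error = f"{CYPHER_ERROR_RES}: {exc}"
    return {
        "question": question,
        "context": context,
        "cypher": cypher,
        "execution_time": time.perf_counter() - started,
        "error": error,
    }


def working_step(question: str) -> Tuple[str, str, float]:
    return ("[['Fighter A']]", "MATCH (f:Fighter) RETURN f.name", 0.01)


def failing_step(question: str) -> Tuple[str, str, float]:
    raise RuntimeError("generation failed")


success = search_sketch("Who fought?", working_step)
failure = search_sketch("Who fought?", failing_step)
```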


165-165: Extraction Verified: clean_ontology_for_prompt Behavior Preserved

  • clean_ontology_for_prompt is defined at graphrag_sdk/chat_session.py:348–373, matching the previous inline logic for removing unique and required from both entities and relations.
  • It’s invoked exactly once at graphrag_sdk/chat_session.py:165, replacing the inlined implementation without changing its behavior.

No further changes are needed.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
graphrag_sdk/steps/qa_step.py (1)

54-74: Consider refactoring to reduce code duplication.

The run_stream method implementation is correct and follows the same pattern as the existing run method. However, the prompt formatting logic is duplicated between both methods.

Consider extracting the prompt formatting logic into a private helper method:

+    def _format_qa_prompt(self, question: str, cypher: str, context: str) -> str:
+        """Format the QA prompt with the given parameters."""
+        return self.qa_prompt.format(
+            context=context, cypher=cypher, question=question
+        )
+
     def run(self, question: str, cypher: str, context: str) -> str:
         """
         Run the QA step.
         
         Args:
             question (str): The question being asked.
             cypher (str): The Cypher query to run.
             context (str): Context for the QA.
             
         Returns:
             str: The response from the QA session.
         """
-        qa_prompt = self.qa_prompt.format(
-            context=context, cypher=cypher, question=question
-        )
+        qa_prompt = self._format_qa_prompt(question, cypher, context)

         logger.debug(f"QA Prompt: {qa_prompt}")
         qa_response = self.chat_session.send_message(qa_prompt)

         return qa_response.text

     def run_stream(self, question: str, cypher: str, context: str) -> Iterator[str]:
         """
         Run the QA step and stream the response chunks.
         
         Args:
             question (str): The question being asked.
             cypher (str): The Cypher query to run.
             context (str): Context for the QA.
             
         Returns:
             Iterator[str]: A generator that yields response chunks.
         """
-        qa_prompt = self.qa_prompt.format(
-            context=context, cypher=cypher, question=question
-        )
+        qa_prompt = self._format_qa_prompt(question, cypher, context)

         logger.debug(f"QA Prompt (Stream): {qa_prompt}")
         
         # Send the message and stream the response
         for chunk in self.chat_session.send_message_stream(qa_prompt):
             yield chunk

This refactoring would improve maintainability by following the DRY principle and ensuring consistent prompt formatting across both methods.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2777d4 and fa83b3b.

📒 Files selected for processing (3)
  • graphrag_sdk/chat_session.py (5 hunks)
  • graphrag_sdk/steps/qa_step.py (2 hunks)
  • graphrag_sdk/steps/stream_qa_step.py (0 hunks)
💤 Files with no reviewable changes (1)
  • graphrag_sdk/steps/stream_qa_step.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • graphrag_sdk/chat_session.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
graphrag_sdk/steps/qa_step.py (3)
graphrag_sdk/chat_session.py (1)
  • send_message_stream (162-185)
graphrag_sdk/models/litellm.py (1)
  • send_message_stream (239-275)
graphrag_sdk/models/model.py (1)
  • send_message_stream (106-107)
🔇 Additional comments (1)
graphrag_sdk/steps/qa_step.py (1)

2-2: LGTM - Import addition for streaming support.

The Iterator import is correctly added to support the new run_stream method's return type annotation.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (3)
examples/ufc/agent/falkor_agent.py (2)

84-85: Remove commented debug code

Debug print statements should be removed from production code.

         result = cypher_session.search(query)
-        # print(result)
         # Check if there was an error

147-149: Remove commented debug code

Another debug print statement that should be removed.

     cypher_session = kg_client.cypher_session()
-    # print(cypher_session.search("Who is Salsa Boy?"))
-
examples/ufc/pydantic-ai-agent-demo.ipynb (1)

232-232: Fix model name in comment

The comment mentions "gpt-4.1" but should be "gpt-4o" for consistency.

-# Initialize LiteModel for GraphRAG SDK (by default - gpt-4.1)
+# Initialize LiteModel for GraphRAG SDK (by default - gpt-4o)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa83b3b and 4eae718.

📒 Files selected for processing (5)
  • examples/ufc/agent/falkor_agent.py (1 hunks)
  • examples/ufc/demo-ufc.ipynb (14 hunks)
  • examples/ufc/ontology.json (10 hunks)
  • examples/ufc/pydantic-ai-agent-demo.ipynb (1 hunks)
  • graphrag_sdk/chat_session.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • graphrag_sdk/chat_session.py
🧰 Additional context used
🪛 Ruff (0.11.9)
examples/ufc/agent/falkor_agent.py

2-2: typing.Dict imported but unused

Remove unused import: typing.Dict

(F401)

🔇 Additional comments (2)
examples/ufc/ontology.json (1)

1-408: Well-structured ontology schema

The UFC ontology is comprehensive and follows consistent patterns. All entities have properly typed attributes with clear uniqueness and requirement constraints. The relationships between entities are logical and well-defined.

examples/ufc/demo-ufc.ipynb (1)

1-650: Excellent demo notebook

The notebook provides a comprehensive walkthrough of the GraphRAG SDK usage with the UFC dataset. The workflow from data loading through ontology extraction to QA is clear and well-documented.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
graphrag_sdk/chat_session.py (1)

113-141: Fix unused exception variable.

The method implementation is solid, but there's an unused variable in the exception handling.

Apply this diff to address the static analysis warning:

        try:
            (context, cypher, query_execution_time) = self.cypher_step.run(message)
-        except Exception as e:
+        except Exception:
            # If there's an error, return empty context and cypher with error message
            context = None
            cypher = CYPHER_ERROR_RES
            query_execution_time = None
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51b3bb8 and 30837b9.

📒 Files selected for processing (1)
  • graphrag_sdk/chat_session.py (6 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
graphrag_sdk/chat_session.py

133-133: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

🔇 Additional comments (9)
graphrag_sdk/chat_session.py (9)

26-38: Updated example documentation looks comprehensive.

The enhanced docstring example clearly demonstrates both full QA pipeline usage and standalone Cypher query generation, which aligns well with the new generate_cypher_query method functionality.


51-55: Constructor parameter documentation is clear and well-structured.

The updated parameter descriptions properly reflect the new constructor signature and provide clear explanations of each parameter's purpose.


69-69: Good refactoring: Extracted ontology cleaning logic.

Using the extracted clean_ontology_for_prompt function improves code organization and reusability.


84-96: Excellent architectural improvement: Pre-initialized step instances.

This change from creating steps on-demand to pre-initializing them during construction provides better performance and cleaner state management. The approach enables proper state tracking with last_answer and avoids repeated object creation.


108-112: Well-designed helper method for state synchronization.

The _update_last_complete_response method properly maintains state consistency between the session and cypher step instances.


151-155: Clear and detailed return type documentation.

The updated docstring provides comprehensive details about the returned dictionary structure, making the API more discoverable.


157-179: Excellent refactoring: Improved method structure and consistency.

The refactored logic properly separates concerns by:

  1. Using the new generate_cypher_query method for query generation
  2. Maintaining consistent response format structure
  3. Properly updating state with _update_last_complete_response

This design makes the code more maintainable and provides better separation of concerns.


191-212: Solid streaming implementation with proper state management.

The streaming method correctly:

  1. Reuses the generate_cypher_query logic for consistency
  2. Uses the new qa_step.run_stream method
  3. Properly handles final response state updates
  4. Maintains consistency with the synchronous version
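
The "final response state update" point can be illustrated with a generator that forwards chunks as they arrive and records the full answer once the stream is exhausted. This is a minimal sketch under assumed names (`StreamingSessionSketch`, `stream_fn`, `last_answer`), not the SDK's ChatSession.

```python
from typing import Callable, Iterable, Iterator, Optional


class StreamingSessionSketch:
    """Hypothetical session that tracks the last complete streamed answer."""

    def __init__(self, stream_fn: Callable[[str], Iterable[str]]):
        self._stream_fn = stream_fn
        self.last_answer: Optional[str] = None

    def send_message_stream(self, message: str) -> Iterator[str]:
        chunks = []
        for chunk in self._stream_fn(message):
            chunks.append(chunk)
            yield chunk  # forward each chunk to the caller immediately
        # Reached only after the consumer has drained the generator.
        self.last_answer = "".join(chunks)


def fake_model_stream(prompt: str) -> Iterable[str]:
    # Stand-in for a model that yields its response in pieces.
    return ["The ", "answer ", "is 42."]


session = StreamingSessionSketch(fake_model_stream)
received = list(session.send_message_stream("What is the answer?"))
```

One consequence of this design is that `last_answer` stays at its previous value if the caller stops iterating early, so code that depends on the state should consume the whole stream.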

214-239: Well-implemented utility function with clear purpose.

The extracted clean_ontology_for_prompt function:

  1. Has a clear, descriptive name
  2. Includes comprehensive documentation
  3. Properly handles the ontology transformation
  4. Uses appropriate error handling (implicit through dict operations)

The function correctly removes the unique and required attributes that aren't needed for Q&A prompts, which should improve prompt clarity.

@galshubeli galshubeli closed this Jul 7, 2025