Conversation

Contributor

@Eli4479 Eli4479 commented Jun 9, 2025

Closes #64

📝 Description

Implement a hybrid database architecture using Supabase and Weaviate to create two main knowledge bases:

  • User_Info_collection: stores unified user profiles with linked Discord/GitHub identities, contribution stats, and skill analysis
  • GitHubInfoDB: stores the repository codebase with fine-grained chunking for code search and analysis

Both databases will support RAG (Retrieval-Augmented Generation) so that the DevRel agent can provide intelligent assistance.

🔧 Changes Made

  • 🛠 Integrated Weaviate with Supabase for a hybrid vector+relational database setup
  • 📚 Defined schemas for both databases

✅ Checklist

  • I have read the contributing guidelines.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have added necessary documentation (if applicable).
  • Any dependent changes have been merged and published in downstream modules.

Summary by CodeRabbit

  • New Features

    • Introduced user authentication via GitHub and Discord OAuth.
    • Added integration with Supabase and Weaviate databases for managing users, repositories, code chunks, and interactions.
    • Defined comprehensive data models for users, repositories, code chunks, and interactions.
    • Added scripts to create and populate database schemas with sample data.
    • Provided Docker Compose setup for running a Weaviate instance.
    • Enhanced application lifecycle management with Weaviate client initialization and graceful shutdown.
  • Tests

    • Added extensive tests for CRUD operations on Supabase and Weaviate data models to ensure database integration works as expected.

Eli4479 added 3 commits June 2, 2025 13:47
- Implemented `create_schemas.py` to define schemas for user profiles, code chunks, and interactions in Weaviate.
- Added `populate_db.py` to insert sample data into the Weaviate collections.
- Created unit tests in `test_supabase.py` for user, interaction, code chunk, and repository models, including CRUD operations.
Contributor

coderabbitai bot commented Jun 9, 2025

Walkthrough

This pull request introduces a hybrid database architecture integrating Supabase and Weaviate. It adds modules for client initialization, Pydantic data models for both databases, authentication logic, database schema creation scripts, and test suites covering CRUD operations. Docker Compose configuration for Weaviate and scripts for populating both databases with sample data are also included.

Changes

File(s) | Change Summary
backend/app/db/supabase/auth.py | New module: Sync OAuth login/logout functions for GitHub/Discord using Supabase client with error handling.
backend/app/db/supabase/supabase_client.py | New module: Loads environment variables, initializes, and provides Supabase client instance.
backend/app/db/weaviate/weaviate_client.py | New module: Initializes and provides Weaviate client connection to local instance.
backend/app/model/supabase/models.py | New module: Pydantic models for User, Repository, CodeChunk, and Interaction entities (Supabase).
backend/app/model/weaviate/models.py | New module: Pydantic models for WeaviateUserProfile, WeaviateCodeChunk, WeaviateInteraction (Weaviate).
backend/app/scripts/supabase/populate_db.sql | New SQL script: Creates and populates Supabase tables for users, repositories, code chunks, and interactions.
backend/app/scripts/weaviate/create_schemas.py | New script: Creates Weaviate schemas for user profiles, code chunks, and interactions with vectorizer settings.
backend/app/scripts/weaviate/populate_db.py | New script: Populates Weaviate collections with sample data for user profiles, code chunks, and interactions.
backend/docker-compose.yml | New file: Docker Compose config for Weaviate service with modules, ports, volumes, and environment variables.
backend/main.py | Adds Weaviate client initialization and cleanup to DevRAIApplication with error handling and logging.
tests/test_supabase.py | New test module: Full CRUD tests for Supabase User, Repository, CodeChunk, and Interaction models.
tests/test_weaviate.py | New test module: Full CRUD tests for WeaviateUserProfile, WeaviateCodeChunk, and WeaviateInteraction models.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant App
    participant Supabase
    participant Weaviate

    User->>App: Initiate OAuth login (GitHub/Discord)
    App->>Supabase: sign_in_with_oauth(provider, redirect_url)
    Supabase-->>App: Redirect URL
    App-->>User: Redirect to OAuth provider

    User->>App: Perform actions (CRUD)
    App->>Supabase: CRUD on structured data (User, Repo, etc.)
    App->>Weaviate: CRUD on vector data (Profile, CodeChunk, Interaction)
    Supabase-->>App: Structured data response
    Weaviate-->>App: Vector data response
    App-->>User: Results

Assessment against linked issues

Objective Addressed Explanation
Hybrid database architecture: Supabase for structured data, Weaviate for vector embeddings (#64)
User_Info_collection: Unified user profiles with Discord/GitHub, stats, skills (#64)
GitHubInfoDB: Repository codebase with chunking for code search/analysis (#64)
Synchronization: Foreign key relationships, cross-platform identity linking (#64)
RAG-enabled search, scalable vector search, rich metadata, multi-modal data storage (#64)

Poem

ʕ•ᴥ•ʔ
Hopping through the fields of code,
Two databases now share the load!
Supabase keeps things neat and right,
Weaviate vectors take flight.
Users, chunks, and chats align—
Hybrid dreams by design!
🥕✨


@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @Eli4479, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes the core database infrastructure for the application by implementing a hybrid data storage solution. It integrates Supabase for managing structured relational data like user profiles and repository metadata, and Weaviate for handling vector embeddings of code chunks and interactions. This setup is foundational for enabling the Retrieval Augmented Generation (RAG) capabilities required by the DevRel agent. The changes include defining the necessary data models and schemas, setting up client connections, providing initial data population scripts, and adding basic test coverage for database interactions.

Highlights

  • Hybrid Database Setup: Introduced a hybrid database architecture using Supabase for relational data and Weaviate for vector data, supporting RAG for the DevRel agent.
  • Database Schemas Defined: Defined the database schemas for both Supabase (User, Repository, CodeChunk, Interaction) and Weaviate (vectorized UserProfile, CodeChunk, Interaction) using Pydantic models and SQL/Weaviate schema definitions.
  • Database Client Connections: Added client setup and getter functions for connecting to both Supabase and Weaviate.
  • Sample Data & Testing: Included scripts to populate both databases with sample data and added basic CRUD test cases for interacting with the defined models in both databases.
  • Supabase Authentication: Added basic functions for GitHub and Discord OAuth login and user logout using the Supabase client.
  • Weaviate Docker Configuration: Added a Docker Compose service definition to easily run a local Weaviate instance.
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the foundational database layer for the DevRel agent, integrating Supabase for relational data and Weaviate for vector storage. The changes include defining database schemas, Pydantic models for data representation, client initialization for both databases, and scripts for schema creation and data population. The structure and models are well-defined, providing a solid base for the hybrid architecture. Review comments address areas for improvement related to configuration flexibility, consistency in Weaviate schema handling, and critical correctness issues in the Weaviate population script regarding async function usage and data structure alignment with models. Adherence to standard Python practices like PEP 8 for naming is generally followed.

from app.db.weaviate.weaviate_client import get_client


async def populate_Weaviate_code_chunk(client):

critical

The function populate_Weaviate_code_chunk is defined as async def but is not awaited when called in populate_all_collections. This will cause a runtime error. Either remove async if it doesn't perform async operations, or ensure it's awaited.

Suggested change
-async def populate_Weaviate_code_chunk(client):
+def populate_Weaviate_code_chunk(client):
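
For context on the async mismatch, here is a minimal standard-library sketch. The name `populate_sample` is hypothetical and not taken from the PR. In standard asyncio, calling an `async def` function without `await` does not raise at the call site; it returns a coroutine object whose body never runs, and Python typically reports a "coroutine ... was never awaited" RuntimeWarning later. The practical symptom would therefore be silently skipped work rather than an immediate exception.

```python
import asyncio
import inspect

calls = []  # records whether the function body actually ran


async def populate_sample(client):
    # Hypothetical stand-in for one of the populate_* helpers.
    calls.append(client)


# Calling without await only builds a coroutine object; the body has not run.
pending = populate_sample("client-a")
is_coroutine = inspect.iscoroutine(pending)
ran_without_await = len(calls) > 0
pending.close()  # discard it explicitly to avoid a "never awaited" warning


async def main():
    # Awaiting the call is what actually executes the body.
    await populate_sample("client-b")


asyncio.run(main())
```

If the helpers contain no awaitable work, dropping `async` as suggested keeps the call sites simpler.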

Contributor

Could you please resolve this? I understand it might not be strictly necessary, since this is a temporary workaround to populate the database, but it still helps maintain code quality throughout the codebase.

        print("Populated: weaviate_code_chunk with sample data.")
    except Exception as e:
        print(f"Error populating weaviate_code_chunk: {e}")
async def populate_Weaviate_interaction(client):

critical

The function populate_Weaviate_interaction is defined as async def but is not awaited when called in populate_all_collections. This will cause a runtime error. Either remove async if it doesn't perform async operations, or ensure it's awaited.

Suggested change
-async def populate_Weaviate_interaction(client):
+def populate_Weaviate_interaction(client):

Contributor

Similar here, please.

Comment on lines 109 to 111
            "userId": "095a5ff0-545a-48ff-83ad-2ea3566f5674",
            "message": "Hi, can you explain the code chunk with ID 095a5ff0-545a-48ff-83ad-2ea3566f5674?",
            "timestamp": "2023-01-01T12:00:00Z"

critical

The sample data structure here (userId, message, timestamp) does not match the properties defined in the WeaviateInteraction model (supabaseInteractionId, conversationSummary, platform, topics). This will cause insertion errors when using batch.add_object with the Pydantic model's dictionary representation.

        {
            "supabaseInteractionId": "7c59fe66-53b6-44b5-8ae1-ddc29b071097",
            "conversationSummary": "Summary of interaction 7c59fe66-53b6-44b5-8ae1-ddc29b071097",
            "platform": "github",
            "topics": ["analysis", "health"]
        },
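
A mismatch like this can be caught before insertion with a small key check. The sketch below uses only the standard library and takes the four property names from the comment above as an assumption about the WeaviateInteraction model; `key_mismatch` and the sample dicts are hypothetical helpers for illustration.

```python
# Property names taken from the review comment above; treat them as an
# assumption about the WeaviateInteraction model, not a verified schema.
EXPECTED_INTERACTION_KEYS = {
    "supabaseInteractionId",
    "conversationSummary",
    "platform",
    "topics",
}


def key_mismatch(sample: dict, expected: set) -> tuple:
    """Return (missing, unexpected) property names for one sample object."""
    keys = set(sample)
    return sorted(expected - keys), sorted(keys - expected)


old_sample = {
    "userId": "095a5ff0-545a-48ff-83ad-2ea3566f5674",
    "message": "Hi, can you explain the code chunk?",
    "timestamp": "2023-01-01T12:00:00Z",
}
new_sample = {
    "supabaseInteractionId": "7c59fe66-53b6-44b5-8ae1-ddc29b071097",
    "conversationSummary": "Summary of interaction",
    "platform": "github",
    "topics": ["analysis", "health"],
}

old_missing, old_unexpected = key_mismatch(old_sample, EXPECTED_INTERACTION_KEYS)
new_missing, new_unexpected = key_mismatch(new_sample, EXPECTED_INTERACTION_KEYS)
```

Running a check like this over every sample object in the populate script would flag the old `userId`/`message`/`timestamp` shape before any batch call is made.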

        print("Populated: weaviate_interaction with sample data.")
    except Exception as e:
        print(f"Error populating weaviate_interaction: {e}")
async def populate_Weaviate_user_profile(client):

critical

The function populate_Weaviate_user_profile is defined as async def but is not awaited when called in populate_all_collections. This will cause a runtime error. Either remove async if it doesn't perform async operations, or ensure it's awaited.

def populate_Weaviate_user_profile(client):

Comment on lines 245 to 247
    await populate_Weaviate_code_chunk(client)
    await populate_Weaviate_interaction(client)
    await populate_Weaviate_user_profile(client)

critical

The individual populate functions (populate_Weaviate_code_chunk, populate_Weaviate_interaction, populate_Weaviate_user_profile) are called with await, but they are not defined as async def functions. This will result in a runtime error.

    populate_Weaviate_code_chunk(client)
    populate_Weaviate_interaction(client)
    populate_Weaviate_user_profile(client)

Contributor

Hmm, I get this part. It is structured in a slightly confusing way: internally, there is nothing in these functions to await. Could you please refactor this, maybe? It would work, but it seems slightly confusing.

result = supabase_client.auth.sign_in_with_oauth({
    "provider": "github",
    "options": {
        "redirect_to": "http://localhost:3000/home"

medium

The redirect URL http://localhost:3000/home is hardcoded. It would be better to make this configurable, perhaps via environment variables or application settings, to support different environments (development, staging, production).

Suggested change
-"redirect_to": "http://localhost:3000/home"
+"redirect_to": os.getenv("SUPABASE_REDIRECT_URL", "http://localhost:3000/home")

result = supabase_client.auth.sign_in_with_oauth({
    "provider": "discord",
    "options": {
        "redirect_to": "http://localhost:3000/home"

medium

Similar to the GitHub login, the redirect URL here is hardcoded. Consider making this configurable.
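
Since both login functions need the same value, one option is a shared helper. The following is a minimal sketch, not the PR's code: `get_redirect_url` and `oauth_payload` are hypothetical names, and `SUPABASE_REDIRECT_URL` is the variable name proposed in this review. Any version of this change also requires `import os` in the module that reads the variable.

```python
import os

DEFAULT_REDIRECT_URL = "http://localhost:3000/home"


def get_redirect_url() -> str:
    """Read the OAuth redirect URL from the environment, with a dev fallback."""
    # `or` also covers the case where the variable is set but empty.
    return os.getenv("SUPABASE_REDIRECT_URL") or DEFAULT_REDIRECT_URL


def oauth_payload(provider: str) -> dict:
    """Build the dict passed to sign_in_with_oauth for a given provider."""
    return {"provider": provider, "options": {"redirect_to": get_redirect_url()}}
```

With this in place, the GitHub and Discord functions would differ only in the provider string they pass to `oauth_payload`.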

Suggested change
"redirect_to": "http://localhost:3000/home"
"redirect_to": os.getenv("SUPABASE_REDIRECT_URL", "http://localhost:3000/home")

Comment on lines 50 to 55
if "Weaviate_code_chunk" not in existing_collections:
    create_code_chunk_schema(client)
if "Weaviate_interaction" not in existing_collections:
    create_interaction_schema(client)
if "Weaviate_user_profile" not in existing_collections:
    create_user_profile_schema(client)

medium

The collection names used in the if conditions ("Weaviate_code_chunk", "Weaviate_interaction", "Weaviate_user_profile") use PascalCase with underscores, while the actual collection names created (name="weaviate_code_chunk", etc.) use snake_case. This inconsistency means the checks will always evaluate to True, and the schemas will be attempted to be created every time the script runs, potentially causing errors if they already exist. The casing in the checks should match the casing used for creation.

Suggested change
-if "Weaviate_code_chunk" not in existing_collections:
-    create_code_chunk_schema(client)
-if "Weaviate_interaction" not in existing_collections:
-    create_interaction_schema(client)
-if "Weaviate_user_profile" not in existing_collections:
-    create_user_profile_schema(client)
+if "weaviate_code_chunk" not in existing_collections:
+    create_code_chunk_schema(client)
+if "weaviate_interaction" not in existing_collections:
+    create_interaction_schema(client)
+if "weaviate_user_profile" not in existing_collections:
+    create_user_profile_schema(client)
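
Before settling on either spelling, it may be worth printing `existing_collections`: my understanding is that Weaviate capitalizes the first letter of a collection name on creation, which would explain the capitalized checks in the original code. The sketch below sidesteps the question by comparing names with the first character normalized. It is standard-library only, and `normalize` and `missing_collections` are hypothetical helpers rather than part of the Weaviate client.

```python
COLLECTION_NAMES = (
    "weaviate_code_chunk",
    "weaviate_interaction",
    "weaviate_user_profile",
)


def normalize(name: str) -> str:
    """Upper-case only the first character, leaving the rest untouched."""
    return name[:1].upper() + name[1:]


def missing_collections(expected, existing) -> list:
    """Return the expected names that have no match in `existing`."""
    existing_normalized = {normalize(name) for name in existing}
    return [name for name in expected if normalize(name) not in existing_normalized]
```

Keeping the names in one tuple also means the creation code and the existence checks cannot drift apart.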

backend/main.py Outdated
Comment on lines 21 to 22
weaviate_client = get_client()
print(f"Weaviate client initialized: {weaviate_client.is_ready()}")

medium

Initializing the Weaviate client as a class attribute means it's created when the class is defined, not when an instance is created or the application starts. This might be acceptable for a simple global client, but for more complex setups or testing, initializing it within __init__ or using a dedicated setup function might offer more flexibility and control over its lifecycle.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 13

🔭 Outside diff range comments (1)
backend/main.py (1)

53-69: 🛠️ Refactor suggestion

Add Weaviate client cleanup in stop method.

The Weaviate client should be properly closed when the application stops to free resources.

     async def stop(self):
         """Stop the application"""
         logger.info("Stopping Devr.AI Application...")

         self.running = False

+        # Close Weaviate client
+        try:
+            if hasattr(self, 'weaviate_client') and self.weaviate_client is not None:
+                self.weaviate_client.close()
+                logger.info("Weaviate client closed")
+        except Exception as e:
+            logger.error(f"Error closing Weaviate client: {str(e)}")
+
         # Stop Discord bot
         try:
             if not self.discord_bot.is_closed():
                 await self.discord_bot.close()
         except Exception as e:
             logger.error(f"Error closing Discord bot: {str(e)}")

         # Stop queue manager
         await self.queue_manager.stop()

         logger.info("Devr.AI Application stopped")
🧹 Nitpick comments (8)
backend/app/db/weaviate/weaviate_client.py (2)

1-1: Remove unused import.

The os import is not used and should be removed as flagged by static analysis.

-import os
 import weaviate
🧰 Tools
🪛 Ruff (0.11.9)

1-1: os imported but unused

Remove unused import: os

(F401)


4-5: Fix misleading comment and consider making connection configurable.

The comment mentions "Weaviate Cloud" but the code connects to a local instance. Additionally, consider making the connection URL configurable via environment variables.

-# Connect to Weaviate Cloud
-client = weaviate.connect_to_local()
+# Connect to local Weaviate instance
+client = weaviate.connect_to_local()

For better configurability, consider this approach:

import weaviate
import os
from dotenv import load_dotenv

load_dotenv()

WEAVIATE_URL = os.getenv("WEAVIATE_URL", "http://localhost:8080")

if WEAVIATE_URL.startswith("http://localhost") or WEAVIATE_URL.startswith("http://127.0.0.1"):
    client = weaviate.connect_to_local()
else:
    client = weaviate.connect_to_wcs(
        cluster_url=WEAVIATE_URL,
        auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WEAVIATE_API_KEY"))
    )
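
One thing to note about the snippet above: `connect_to_local()` is called with no arguments, so a `WEAVIATE_URL` that points at a non-default local port would be ignored. A minimal sketch of parsing the URL first, using only the standard library; `weaviate_target` is a hypothetical helper, and the client call appears only in a comment.

```python
import os
from urllib.parse import urlparse


def weaviate_target(url: str) -> dict:
    """Split a Weaviate URL into scheme, host, port and an is_local flag."""
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    return {
        "scheme": parsed.scheme,
        "host": host,
        "port": parsed.port,  # None when the URL does not specify a port
        "is_local": host in {"localhost", "127.0.0.1"},
    }


# WEAVIATE_URL is the variable name used in the snippet above.
target = weaviate_target(os.getenv("WEAVIATE_URL", "http://localhost:8080"))
# With the v4 client, the local branch could then pass these through, e.g.
# weaviate.connect_to_local(host=target["host"], port=target["port"] or 8080).
```

Parsing the hostname also avoids the prefix check missing an `https://localhost` URL.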
backend/app/scripts/weaviate/create_schemas.py (1)

4-45: Reduce code duplication in schema creation functions.

The three schema creation functions follow an identical pattern with only the collection name, properties, and print message differing. This creates maintenance overhead and potential for inconsistencies.

Consider consolidating into a generic function:

+def create_schema(client, name: str, properties: list):
+    client.collections.create(
+        name=name,
+        properties=properties,
+        vectorizer_config=wc.Configure.Vectorizer.text2vec_cohere(),
+        generative_config=wc.Configure.Generative.openai()
+    )
+    print(f"Created: {name}")
+
 def create_user_profile_schema(client):
-    client.collections.create(
-        name="weaviate_user_profile",
-        properties=[
+    properties = [
             wc.Property(name="supabaseUserId", data_type=wc.DataType.TEXT),
             wc.Property(name="profileSummary", data_type=wc.DataType.TEXT),
             wc.Property(name="primaryLanguages", data_type=wc.DataType.TEXT_ARRAY),
             wc.Property(name="expertiseAreas", data_type=wc.DataType.TEXT_ARRAY),
-        ],
-        vectorizer_config=wc.Configure.Vectorizer.text2vec_cohere(),
-        generative_config=wc.Configure.Generative.openai()
-    )
-    print("Created: weaviate_user_profile")
+    ]
+    create_schema(client, "weaviate_user_profile", properties)
backend/app/scripts/weaviate/populate_db.py (4)

5-95: Improve sample code data quality.

The code content uses lorem ipsum-style placeholder text instead of actual code samples. For a more realistic development and testing environment, consider using actual code snippets that match the declared programming languages.

Example improvement for a C++ entry:

         {
             "supabaseChunkId": "095a5ff0-545a-48ff-83ad-2ea3566f5674",
-            "codeContent": (
-                "Maybe evening clearly trial want whose far. Sound life away senior difficult put. "
-                "Whose source hand so add Mr."
-            ),
+            "codeContent": "// Function to calculate factorial\nint factorial(int n) {\n    if (n <= 1) return 1;\n    return n * factorial(n - 1);\n}",
             "language": "C++",
-            "functionNames": ["comment"]
+            "functionNames": ["factorial"]
         },

105-106: Add missing line break for better code readability.

Missing line break between functions affects code readability.

     except Exception as e:
         print(f"Error populating weaviate_code_chunk: {e}")
+
 async def populate_Weaviate_interaction(client):

168-169: Add missing line break for better code readability.

Missing line break between functions affects code readability.

     except Exception as e:
         print(f"Error populating weaviate_interaction: {e}")
+
 async def populate_Weaviate_user_profile(client):

241-242: Add missing line break for better code readability.

Missing line break between functions affects code readability.

     except Exception as e:
         print(f"Error populating weaviate_user_profile: {e}")
+
 async def populate_all_collections():
backend/app/tests/test_weaviate.py (1)

2-2: Remove unused import.

The datetime import is not used in this file.

-from datetime import datetime
🧰 Tools
🪛 Ruff (0.11.9)

2-2: datetime.datetime imported but unused

Remove unused import: datetime.datetime

(F401)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f7f405 and 9d098e8.

📒 Files selected for processing (12)
  • backend/app/db/supabase/auth.py (1 hunks)
  • backend/app/db/supabase/supabase_client.py (1 hunks)
  • backend/app/db/weaviate/weaviate_client.py (1 hunks)
  • backend/app/model/supabase/models.py (1 hunks)
  • backend/app/model/weaviate/models.py (1 hunks)
  • backend/app/scripts/supabase/populate_db.sql (1 hunks)
  • backend/app/scripts/weaviate/create_schemas.py (1 hunks)
  • backend/app/scripts/weaviate/populate_db.py (1 hunks)
  • backend/app/tests/test_supabase.py (1 hunks)
  • backend/app/tests/test_weaviate.py (1 hunks)
  • backend/docker-compose.yml (1 hunks)
  • backend/main.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (4)
backend/main.py (1)
backend/app/db/weaviate/weaviate_client.py (1)
  • get_client (8-9)
backend/app/scripts/weaviate/create_schemas.py (1)
backend/app/db/weaviate/weaviate_client.py (1)
  • get_client (8-9)
backend/app/scripts/weaviate/populate_db.py (1)
backend/app/db/weaviate/weaviate_client.py (1)
  • get_client (8-9)
backend/app/tests/test_weaviate.py (3)
backend/app/db/weaviate/weaviate_client.py (1)
  • get_client (8-9)
backend/app/model/weaviate/models.py (3)
  • WeaviateUserProfile (5-13)
  • WeaviateCodeChunk (16-24)
  • WeaviateInteraction (27-35)
backend/app/tests/test_supabase.py (7)
  • insert_code_chunk (169-179)
  • update_code_chunk (187-191)
  • delete_code_chunk (192-196)
  • test_code_chunk (197-222)
  • update_interaction (135-139)
  • delete_interaction (140-144)
  • test_interaction (146-167)
🪛 Ruff (0.11.9)
backend/app/tests/test_weaviate.py

18-18: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)
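
As a short illustration of the B011 finding, the sketch below contrasts the two forms. The function names and the failure message are hypothetical and not taken from test_weaviate.py.

```python
def fail_with_assert():
    # Stripped when Python runs with -O, so this check can silently vanish.
    assert False, "object was not created"


def fail_with_raise():
    # Always raises, regardless of optimization flags.
    raise AssertionError("object was not created")


def raised_assertion(func) -> bool:
    """Return True if calling func raises AssertionError."""
    try:
        func()
    except AssertionError:
        return True
    return False
```

In a pytest suite, `pytest.fail("...")` is another common way to express an unconditional failure.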

🪛 Pylint (3.3.7)
backend/app/model/supabase/models.py

[refactor] 7-7: Too few public methods (0/2)

(R0903)


[refactor] 64-64: Too few public methods (0/2)

(R0903)


[refactor] 111-111: Too few public methods (0/2)

(R0903)


[refactor] 146-146: Too few public methods (0/2)

(R0903)

backend/app/tests/test_weaviate.py

[refactor] 39-39: Either all return statements in a function should return an expression, or none of them should.

(R1710)


[error] 82-82: Assigning result of a function call, where the function has no return

(E1111)


[refactor] 106-106: Either all return statements in a function should return an expression, or none of them should.

(R1710)


[error] 150-150: Assigning result of a function call, where the function has no return

(E1111)


[refactor] 172-172: Either all return statements in a function should return an expression, or none of them should.

(R1710)


[error] 217-217: Assigning result of a function call, where the function has no return

(E1111)
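
The R1710 and E1111 findings above usually come down to helper functions whose return values are implicit or absent. A minimal sketch of the usual fix, with a hypothetical in-memory `store` standing in for the Weaviate collection and hypothetical helper names:

```python
from typing import Optional

store = {"a": {"platform": "github"}}  # hypothetical in-memory stand-in


def get_object(object_id: str) -> Optional[dict]:
    """Return the stored object, or None explicitly when it is missing."""
    if object_id in store:
        return store[object_id]
    return None  # explicit, so every path returns an expression (R1710)


def delete_object(object_id: str) -> bool:
    """Delete an object and report whether anything was removed."""
    # Returning a value makes `result = delete_object(...)` meaningful (E1111).
    return store.pop(object_id, None) is not None
```
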

backend/app/model/weaviate/models.py

[refactor] 5-5: Too few public methods (0/2)

(R0903)


[refactor] 16-16: Too few public methods (0/2)

(R0903)


[refactor] 27-27: Too few public methods (0/2)

(R0903)

🔇 Additional comments (8)
backend/app/db/supabase/supabase_client.py (1)

1-18: LGTM! Clean client initialization pattern.

The Supabase client initialization follows best practices with proper environment variable validation and a clear module interface.

backend/docker-compose.yml (1)

20-20: Note: Anonymous access enabled for development.

Anonymous access is enabled which is appropriate for local development but ensure this configuration is not used in production environments.

backend/app/model/weaviate/models.py (1)

5-35: Well-structured data models with appropriate use of Pydantic features.

The models are correctly designed as data transfer objects with proper field aliases and type annotations. The static analysis warnings about "too few public methods" are false positives - data models typically contain only fields and minimal behavior, which is the correct approach here.

The consistent use of:

  • Field aliases to match Weaviate schema naming conventions
  • Type annotations for Lists and proper field descriptions
  • 384-dimensional embedding vectors across all models

demonstrates good design consistency.

backend/app/model/supabase/models.py (1)

7-184: Excellent comprehensive data models!

The models are well-designed with comprehensive documentation, proper type hints, and good field organization. They effectively bridge Supabase and Weaviate systems with appropriate foreign key references.

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 7-7: Too few public methods (0/2)

(R0903)


[refactor] 64-64: Too few public methods (0/2)

(R0903)


[refactor] 111-111: Too few public methods (0/2)

(R0903)


[refactor] 146-146: Too few public methods (0/2)

(R0903)

backend/app/tests/test_weaviate.py (1)

42-46:

Details

❌ Incorrect review comment

Verify Weaviate API usage consistency.

The code mixes different Weaviate API patterns: client.collections.get() vs client.data_object.get(). Ensure you're using the correct API pattern consistently.

Also applies to: 57-60


🏁 Script executed:

#!/bin/bash
# Check Weaviate client documentation and API patterns used in the codebase
rg -A 3 -B 3 "client\.(collections|data_object)" --type py

Length of output: 7927


We need to verify whether the Python Weaviate client supports BM25 through client.collections.get(...).query.bm25 or if queries should be performed via the data_object module (e.g., client.data_object.search). Let's search for any usage of data_object.search in the codebase:


🏁 Script executed:

#!/bin/bash
rg -A3 -B3 "data_object\.search" --type py

Length of output: 44


Ignore Collections vs. DataObject mix – it’s intentional

The tests correctly use:

  • client.collections.get(…).query.bm25(…) for BM25-style text search via the Collections API
  • client.data_object.(create|get|update|delete)(…) for CRUD against individual objects

These are two distinct modules in the Weaviate Python client and aren’t interchangeable. No code changes required.

Likely an incorrect or invalid review comment.

backend/app/tests/test_supabase.py (1)

8-23: Excellent error handling and response validation!

The functions properly handle datetime serialization, validate HTTP status codes, and provide meaningful error messages. This is a solid foundation for database testing.

Also applies to: 56-75, 115-126

backend/app/scripts/supabase/populate_db.sql (2)

1-124: Excellent database schema design!

The schema is well-architected with:

  • Proper normalization and foreign key relationships
  • Good use of PostgreSQL features (JSONB, arrays, TIMESTAMPTZ)
  • Appropriate constraints and cascade behaviors
  • Comprehensive field coverage for the hybrid architecture

The table structure effectively supports both relational queries and vector database integration.


125-1057: Comprehensive sample data provides good test coverage.

The sample data insertions are extensive and realistic, covering diverse scenarios that will be valuable for testing the application's functionality across different user profiles, repositories, and interaction patterns.

- '8080'
- --scheme
- http
image: cr.weaviate.io/semitechnologies/weaviate:1.31.0

💡 Verification agent

🧩 Analysis chain

Verify that Weaviate version 1.31.0 is current and stable.

Please ensure this version is the latest stable release and doesn't have known security vulnerabilities.


🌐 Web query:

What is the latest stable version of Weaviate and are there any known security issues with version 1.31.0?

💡 Result:

As of June 9, 2025, the latest stable version of Weaviate is 1.29.x, released on February 17, 2025. (weaviate.io) Version 1.31.0 has not been released yet.

Regarding security issues in version 1.31.0, since it has not been released, there are no known security issues associated with it.

For the most current information on Weaviate releases and any associated security advisories, you can refer to the official Weaviate GitHub repository. (github.com)



Weaviate image version is not released yet

The Docker Compose is referencing weaviate:1.31.0, but as of June 2025 the latest stable release is in the 1.29.x line (released Feb 17, 2025). Please update to a valid, released version:

• File: backend/docker-compose.yml
Line: 11
Change:

- image: cr.weaviate.io/semitechnologies/weaviate:1.31.0
+ image: cr.weaviate.io/semitechnologies/weaviate:1.29.x  # use the latest 1.29.x stable release

No known security issues exist for 1.31.0 (it isn’t released), but pinning to a released version ensures stability. Verify the exact 1.29.x tag you need from the official Weaviate release notes.

🤖 Prompt for AI Agents
In backend/docker-compose.yml at line 11, the Weaviate Docker image version is
set to 1.31.0, which is not yet released. Update this version to the latest
stable released version in the 1.29.x series by checking the official Weaviate
release notes for the exact tag. Replace "1.31.0" with the valid released
version to ensure stability and avoid referencing unreleased software.

Comment on lines 21 to 24
async def logout(access_token: str):
    supabase_client.auth.set_session(access_token, refresh_token="")
    supabase_client.auth.sign_out()
    return {"message": "User logged out successfully"}

🛠️ Refactor suggestion

Improve logout function with proper error handling.

The logout function has potential issues:

  • Setting session before signing out may be redundant
  • Missing error handling for sign-out failures
  • Function declared async but doesn't use await
-async def logout(access_token: str):
-    supabase_client.auth.set_session(access_token, refresh_token="")
-    supabase_client.auth.sign_out()
-    return {"message": "User logged out successfully"}
+def logout(access_token: str):
+    try:
+        supabase_client.auth.set_session(access_token, refresh_token="")
+        supabase_client.auth.sign_out()
+        return {"message": "User logged out successfully"}
+    except Exception as e:
+        raise Exception(f"Logout failed: {str(e)}")
🤖 Prompt for AI Agents
In backend/app/db/supabase/auth.py around lines 21 to 24, the logout function
improperly sets a session before signing out, lacks error handling for sign-out
failures, and is declared async without using await. Refactor by removing the
session setting line, add try-except to catch and handle errors from sign_out,
and if sign_out is asynchronous, use await; otherwise, remove async from the
function declaration.

Comment on lines 39 to 53
def get_user_profile_by_id(user_id: str):
    client = get_client()
    try:
        questions = client.collections.get("Weaviate_user_profile")
        response = questions.query.bm25(
            query=user_id,
            properties=["supabaseUserId", "profileSummary", "primaryLanguages", "expertiseAreas"]
        )
        if response and len(response) > 0:
            user_profile_data = response[0]
            return WeaviateUserProfile(**user_profile_data)
    except Exception as e:
        print(f"Error retrieving user profile: {e}")
        return None


🛠️ Refactor suggestion

Standardize return behavior in get functions.

The get functions have inconsistent return patterns. Some return None on error while others have mixed return statements.

For consistency, ensure all get functions either:

  1. Return the object on success, None on failure, or
  2. Raise exceptions for errors instead of printing and returning None

Choose one pattern and apply it consistently across all CRUD functions.

Also applies to: 106-117, 172-183
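
The two patterns named in this comment can be sketched as follows. This is a minimal illustration, not the PR's code: the in-memory `_PROFILES` dict stands in for the Weaviate collection, and `ProfileLookupError`, `find_profile`, and `require_profile` are hypothetical names introduced only for the example.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the Weaviate collection.
_PROFILES = {"user-1": {"supabaseUserId": "user-1", "profileSummary": "Backend dev"}}


class ProfileLookupError(Exception):
    """Hypothetical error raised by the strict variant."""


def find_profile(user_id: str) -> Optional[dict]:
    """Pattern 1: return the object on success, None on a miss or a failure."""
    try:
        profile = _PROFILES.get(user_id)
    except Exception:
        return None
    # Every path ends in an explicit return, which is what Pylint R1710 asks for.
    return profile


def require_profile(user_id: str) -> dict:
    """Pattern 2: return the object or raise; never return None."""
    try:
        return _PROFILES[user_id]
    except KeyError as err:
        raise ProfileLookupError(f"No profile for {user_id!r}") from err
```

Whichever variant is chosen, callers then only have to handle one failure signal: a `None` check for the first, an `except` clause for the second.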

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 39-39: Either all return statements in a function should return an expression, or none of them should.

(R1710)

🤖 Prompt for AI Agents
In backend/app/tests/test_weaviate.py around lines 39 to 53, the
get_user_profile_by_id function inconsistently handles errors by printing the
exception and returning None. To standardize, choose either to return the user
profile object on success and None on failure without printing errors, or raise
exceptions on errors. Apply the chosen pattern consistently across all get
functions, including those at lines 106-117 and 172-183, ensuring uniform error
handling and return behavior.

@smokeyScraper
Contributor

Hey @Eli4479 ,
Thanks a lot for contributing.
Can you please resolve the potential issues mentioned by coderabbit? Meanwhile, I'll be reviewing this PR.
Thanks!!! :)

@smokeyScraper smokeyScraper self-requested a review June 9, 2025 20:55

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (2)
backend/app/db/supabase/auth.py (1)

8-8: 🛠️ Refactor suggestion

Add a default fallback for the redirect URL environment variable.

The os.getenv("SUPABASE_REDIRECT_URL") call could return None if the environment variable is not set, which would cause issues with the OAuth flow.

-                "redirect_to": os.getenv("SUPABASE_REDIRECT_URL")
+                "redirect_to": os.getenv("SUPABASE_REDIRECT_URL", "http://localhost:3000/home")
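
A small sketch of the fallback idea, assuming the default URL from the suggestion above. The helper name `resolve_redirect_url` and its `env` parameter are hypothetical and exist only to make the behaviour easy to check. Unlike a bare `os.getenv(name, default)`, this variant also falls back when the variable is set but empty.

```python
import os

# Default taken from the suggestion above; adjust for the real deployment.
DEFAULT_REDIRECT_URL = "http://localhost:3000/home"


def resolve_redirect_url(env=None) -> str:
    """Return SUPABASE_REDIRECT_URL, or the default when it is unset or empty."""
    env = os.environ if env is None else env
    return env.get("SUPABASE_REDIRECT_URL") or DEFAULT_REDIRECT_URL
```

A hard-coded localhost default is convenient for development; for production it may be safer to fail fast when the variable is missing.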
backend/app/tests/test_weaviate.py (1)

116-127: Fix inconsistent return patterns.

These functions have inconsistent return statements - some paths return None implicitly while others return explicitly.

Make the return behavior consistent by ensuring all code paths have explicit return statements.

Also applies to: 191-202

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 116-116: Either all return statements in a function should return an expression, or none of them should.

(R1710)

🧹 Nitpick comments (3)
backend/app/db/supabase/auth.py (2)

3-3: Remove async declaration - function doesn't use await.

The login_with_oauth function is declared as async but doesn't use any await operations, making the async declaration unnecessary.

-async def login_with_oauth(provider: str):
+def login_with_oauth(provider: str):

13-13: Improve exception chaining for better debugging.

Use proper exception chaining to preserve the original exception context for better debugging.

-        raise Exception(f"OAuth login failed for {provider}: {str(e)}")
+        raise Exception(f"OAuth login failed for {provider}: {str(e)}") from e
-        raise Exception(f"Logout failed: {str(e)}")
+        raise Exception(f"Logout failed: {str(e)}") from e

Also applies to: 28-28
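
To illustrate what `from e` preserves, here is a minimal sketch. `AuthError` and `sign_out_stub` are hypothetical stand-ins (the PR raises a plain `Exception` and calls the Supabase client); the point is only that the original error stays reachable through `__cause__`.

```python
class AuthError(Exception):
    """Hypothetical domain error used only for this sketch."""


def sign_out_stub():
    # Stand-in for supabase_client.auth.sign_out() failing.
    raise ConnectionError("network unreachable")


def logout():
    try:
        sign_out_stub()
    except Exception as e:
        # "from e" records the original exception as __cause__, so the
        # traceback shows both errors and marks the chain as intentional.
        raise AuthError(f"Logout failed: {e}") from e
```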

🧰 Tools
🪛 Ruff (0.11.9)

13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

backend/app/tests/test_weaviate.py (1)

2-2: Remove unused import.

The datetime import is not used anywhere in the file.

-from datetime import datetime
🧰 Tools
🪛 Ruff (0.11.9)

2-2: datetime.datetime imported but unused

Remove unused import: datetime.datetime

(F401)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d098e8 and f04755a.

📒 Files selected for processing (8)
  • backend/app/db/supabase/auth.py (1 hunks)
  • backend/app/db/weaviate/weaviate_client.py (1 hunks)
  • backend/app/model/supabase/models.py (1 hunks)
  • backend/app/scripts/weaviate/create_schemas.py (1 hunks)
  • backend/app/scripts/weaviate/populate_db.py (1 hunks)
  • backend/app/tests/test_supabase.py (1 hunks)
  • backend/app/tests/test_weaviate.py (1 hunks)
  • backend/main.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • backend/app/db/weaviate/weaviate_client.py
  • backend/app/scripts/weaviate/populate_db.py
  • backend/main.py
  • backend/app/tests/test_supabase.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
backend/app/scripts/weaviate/create_schemas.py (1)
backend/app/db/weaviate/weaviate_client.py (1)
  • get_client (7-8)
🪛 Ruff (0.11.9)
backend/app/db/supabase/auth.py

13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


28-28: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

backend/app/tests/test_weaviate.py

2-2: datetime.datetime imported but unused

Remove unused import: datetime.datetime

(F401)


18-18: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

🪛 Pylint (3.3.7)
backend/app/tests/test_weaviate.py

[refactor] 41-41: Either all return statements in a function should return an expression, or none of them should.

(R1710)


[refactor] 63-70: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 79-84: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 116-116: Either all return statements in a function should return an expression, or none of them should.

(R1710)


[refactor] 135-145: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 156-161: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 191-191: Either all return statements in a function should return an expression, or none of them should.

(R1710)


[refactor] 210-220: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 231-236: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)

backend/app/model/supabase/models.py

[refactor] 7-7: Too few public methods (0/2)

(R0903)


[refactor] 64-64: Too few public methods (0/2)

(R0903)


[refactor] 111-111: Too few public methods (0/2)

(R0903)


[refactor] 146-146: Too few public methods (0/2)

(R0903)

backend/app/scripts/weaviate/create_schemas.py

[error] 2-2: No name 'classes' in module 'weaviate'

(E0611)

🔇 Additional comments (3)
backend/app/scripts/weaviate/create_schemas.py (2)

42-47: LGTM! Collection naming inconsistency has been resolved.

The collection existence checks now correctly use lowercase names that match the actual collection names being created.


2-2:

Details

❓ Verification inconclusive

Verify the Weaviate import path.

Static analysis indicates a potential issue with the import path. Verify that weaviate.classes.config is the correct import path for your Weaviate client version.


🏁 Script executed:

#!/bin/bash
# Check if the weaviate.classes.config import is valid
python -c "
try:
    import weaviate.classes.config as wc
    print('✅ Import successful')
    print('Available attributes:', dir(wc))
except ImportError as e:
    print('❌ Import failed:', e)
    print('Checking weaviate package structure...')
    import weaviate
    print('Weaviate version:', weaviate.__version__)
    print('Available modules:', [attr for attr in dir(weaviate) if not attr.startswith('_')])
"

Length of output: 854


Action Required: Confirm the Weaviate import path

The sandbox check couldn’t import weaviate here, so please verify against your project environment that weaviate.classes.config exists. If it doesn’t, update the import to match your installed client’s structure. Common paths in recent versions include:

  • import weaviate.config
  • import weaviate.schema.config

Steps to verify in your environment:
• Run pip show weaviate-client to confirm the installed version.
• Inspect the package layout (e.g. python -c "import weaviate; print(dir(weaviate))").
• Refer to the official Python client docs for your version: https://weaviate.io/developers/weaviate/client-libraries/python

🧰 Tools
🪛 Pylint (3.3.7)

[error] 2-2: No name 'classes' in module 'weaviate'

(E0611)

backend/app/model/supabase/models.py (1)

55-56: LGTM! Mutable default argument issues have been resolved.

All previously identified mutable default arguments have been properly fixed using Field(default_factory=list) or None defaults. The models now follow Pydantic best practices.

Also applies to: 61-61, 102-102, 181-181
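
For readers unfamiliar with the issue, the sketch below shows the idea with the standard library only. It uses `dataclasses.field(default_factory=list)`, which plays the same role as Pydantic's `Field(default_factory=list)`; `UserSketch` and `shared_default` are hypothetical names, not part of the PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserSketch:
    """Hypothetical, trimmed-down stand-in for the PR's User model."""
    display_name: str
    skills: List[str] = field(default_factory=list)  # fresh list per instance
    bio: Optional[str] = None  # None default instead of a mutable value


def shared_default(item, bucket=[]):
    # Anti-pattern: the default list is created once and shared by every call.
    bucket.append(item)
    return bucket
```

With `default_factory`, appending to one instance's list leaves other instances untouched, whereas the plain-function default keeps accumulating items across calls.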

Comment on lines 30 to 32
client.data_object.create(
    data_object=user_profile.dict(by_alias=True),
    class_name="Weaviate_user_profile"

⚠️ Potential issue

Fix API inconsistency and collection naming.

The code mixes different Weaviate client API patterns and uses inconsistent collection names. The schema creation uses lowercase names, but the tests use PascalCase names.

For insert operations, use the collections API consistently:

-        client.data_object.create(
-            data_object=user_profile.dict(by_alias=True),
-            class_name="Weaviate_user_profile"
-        )
+        collection = client.collections.get("weaviate_user_profile")
+        collection.data.insert(user_profile.dict(by_alias=True))

For query operations:

-        questions = client.collections.get("Weaviate_user_profile")
+        collection = client.collections.get("weaviate_user_profile")

Also applies to: 44-44, 107-109

🤖 Prompt for AI Agents
In backend/app/tests/test_weaviate.py around lines 30 to 32 (also lines 44 and
107 to 109), the code inconsistently uses Weaviate client API patterns and
mismatched collection names between schema creation and test operations. To fix
this, update all insert and query operations to use the collections API
consistently, ensuring collection names match the lowercase format used in the
schema. Replace any PascalCase collection names with the correct lowercase names
and adjust method calls to align with the collections API conventions.

        ready = client.is_ready()
        assert ready, "Weaviate client is not ready"
    except Exception as e:
        assert False, f"Weaviate client connection failed: {e}"

🛠️ Refactor suggestion

Replace assert False with proper exception.

Using assert False is problematic because assertions are removed when Python runs with optimization (python -O).

-        assert False, f"Weaviate client connection failed: {e}"
+        raise AssertionError(f"Weaviate client connection failed: {e}")
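
The effect of optimization can be reproduced without restarting the interpreter: the built-in `compile()` accepts an `optimize` argument, and level 1 strips `assert` statements just as `python -O` does. The sketch below is illustrative only; `outcome` and the two source strings are hypothetical helpers.

```python
ASSERT_SRC = "def check():\n    assert False, 'connection failed'\n"
RAISE_SRC = "def check():\n    raise AssertionError('connection failed')\n"


def outcome(source: str, optimize: int) -> str:
    """Compile source at the given optimization level and call check()."""
    namespace = {}
    exec(compile(source, "<sketch>", "exec", optimize=optimize), namespace)
    try:
        namespace["check"]()
    except AssertionError:
        return "raised"
    return "passed silently"
```

At level 0 both variants raise; at level 1 only the explicit `raise AssertionError(...)` still does, which is why it is the safer choice in a failure path.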
🧰 Tools
🪛 Ruff (0.11.9)

18-18: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

🤖 Prompt for AI Agents
In backend/app/tests/test_weaviate.py at line 18, replace the use of `assert
False` with raising a proper exception such as `RuntimeError` or a custom
exception. This ensures the error is always raised regardless of Python
optimization settings. Modify the code to raise the exception with the same
error message currently used in the assert statement.

Comment on lines 44 to 54
        questions = client.collections.get("Weaviate_user_profile")
        response = questions.query.bm25(
            query=user_id,
            properties=["supabaseUserId", "profileSummary", "primaryLanguages", "expertiseAreas"]
        )
        if response and len(response) > 0:
            user_profile_data = response[0]
            return WeaviateUserProfile(**user_profile_data)
    except Exception as e:
        print(f"Error retrieving user profile: {e}")
        return None

🛠️ Refactor suggestion

Standardize the query API usage.

BM25 is a keyword-relevance search, so using it to look up a record by supabaseUserId is unreliable. Use an exact-match property filter (or direct ID-based retrieval) instead.

-        questions = client.collections.get("Weaviate_user_profile")
-        response = questions.query.bm25(
-            query=user_id,
-            properties=["supabaseUserId", "profileSummary", "primaryLanguages", "expertiseAreas"]
-        )
-        if response and len(response) > 0:
-            user_profile_data = response[0]
-            return WeaviateUserProfile(**user_profile_data)
+        collection = client.collections.get("weaviate_user_profile")
+        response = collection.query.fetch_objects(
+            filters=weaviate.classes.query.Filter.by_property("supabaseUserId").equal(user_id),
+            limit=1,
+        )
+        if response.objects:
+            return WeaviateUserProfile(**response.objects[0].properties)
🤖 Prompt for AI Agents
In backend/app/tests/test_weaviate.py around lines 44 to 54, the code
incorrectly uses the bm25 query method for retrieving data by user ID, which is
not appropriate for ID-based lookups. Replace the bm25 query with the correct ID
retrieval method provided by the client API, such as a direct get by ID or an
ID-specific query function, to properly fetch the user profile data. Adjust the
code to use this method and handle the response accordingly.


@smokeyScraper smokeyScraper left a comment


could you please resolve the remaining issues pointed out by coderabbit? I am testing the PR.

from app.db.weaviate.weaviate_client import get_client


async def populate_Weaviate_code_chunk(client):

Could you please resolve this? I understand it might not be necessary, since this is a temporary workaround to populate the database, but it still helps to maintain code quality throughout the codebase.

        print("Populated: weaviate_code_chunk with sample data.")
    except Exception as e:
        print(f"Error populating weaviate_code_chunk: {e}")
async def populate_Weaviate_interaction(client):

similar here please

Comment on lines 245 to 247
await populate_Weaviate_code_chunk(client)
await populate_Weaviate_interaction(client)
await populate_Weaviate_user_profile(client)

Hmm, I get the idea here, but it reads a bit confusingly: nothing inside these functions is actually awaited. Could you refactor this, maybe?
It would work as is, but it seems slightly confusing.

@smokeyScraper
Contributor

smokeyScraper commented Jun 10, 2025

@Eli4479 Thanks a lot for your contribution.

  • Could you please migrate /workspaces/Devr.AI/backend/app/tests to tests/ and resolve the remaining potential issues raised by coderabbit?
  • Also, please make sure to add the newly introduced packages to Poetry. The Poetry configuration is currently inconsistent with the new module requirements. Refer to the Poetry docs.
  • The Weaviate DB seems to be working fine, but Supabase is giving a UUID serialization error.

ps: it's always better to uncommit the recent unmerged changes and make changes before a new commit. Git history stays clean this way.
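
Regarding the UUID serialization error reported for Supabase above: the thread does not show the traceback, so the following is only a sketch under the assumption that a `uuid.UUID` value reaches a JSON encoder unconverted. The helper `to_json_safe` is hypothetical; it converts UUID and datetime values to strings before a record is sent.

```python
import json
import uuid
from datetime import datetime, timezone


def to_json_safe(record: dict) -> dict:
    """Convert UUID and datetime values to strings; leave other values as is."""
    safe = {}
    for key, value in record.items():
        if isinstance(value, uuid.UUID):
            safe[key] = str(value)
        elif isinstance(value, datetime):
            safe[key] = value.isoformat()
        else:
            safe[key] = value
    return safe


# Hypothetical sample row used only to exercise the helper.
sample = {
    "id": uuid.UUID(int=1),
    "created_at": datetime(2025, 6, 10, tzinfo=timezone.utc),
    "display_name": "example",
}
payload = json.dumps(to_json_safe(sample))
```

If the models are Pydantic models, their own JSON-mode serialization may achieve the same result; that depends on the Pydantic version in use and is not confirmed by this thread.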

@smokeyScraper smokeyScraper added the enhancement New feature or request label Jun 10, 2025
        create_interaction_schema(client)
    if "weaviate_user_profile" not in existing_collections:
        create_user_profile_schema(client)
    print("✅ All schemas ensured.")

need to add create_all_schemas() to execute the function.
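
One common way to do this is a main guard at the bottom of the script, so the function runs when the file is executed directly but not when it is imported. The body below is a placeholder, and the collection names other than `weaviate_user_profile` are assumptions based on the function names in this PR.

```python
def create_all_schemas() -> list:
    """Placeholder body; the real function creates the Weaviate collections."""
    ensured = [
        "weaviate_code_chunk",    # assumed name
        "weaviate_interaction",   # assumed name
        "weaviate_user_profile",
    ]
    print("All schemas ensured.")
    return ensured


if __name__ == "__main__":
    # Runs only when the script is executed directly, not on import.
    create_all_schemas()
```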


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

♻️ Duplicate comments (1)
backend/app/db/supabase/auth.py (1)

16-20: Wrapper functions must await OR be synchronous.

After the previous change, these wrappers should be synchronous:

-async def login_with_github():
-    return login_with_oauth("github")
+def login_with_github():
+    return login_with_oauth("github")

Repeat for login_with_discord.
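
The "await OR be synchronous" rule can be sketched with stubs. All function bodies here are hypothetical stand-ins for the Supabase calls; the `_async` variants and `forgot_await` exist only to show the contrast.

```python
import asyncio
import inspect


def login_with_oauth(provider: str) -> dict:
    """Synchronous stub; the real function calls the Supabase client."""
    return {"provider": provider}


def login_with_github() -> dict:
    # Option A: everything is synchronous, so callers get the result directly.
    return login_with_oauth("github")


async def login_with_oauth_async(provider: str) -> dict:
    """Async stub, for the case where the underlying call is awaitable."""
    return {"provider": provider}


async def login_with_github_async() -> dict:
    # Option B: an async wrapper awaits the coroutine it calls.
    return await login_with_oauth_async("github")


def forgot_await():
    # Pitfall: without await, the caller gets a coroutine object, not a dict.
    return login_with_oauth_async("github")
```

Mixing the two styles is where bugs appear: a wrapper that calls an async function without awaiting it hands an unfinished coroutine to its caller.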

🧹 Nitpick comments (1)
tests/test_weaviate.py (1)

248-253: Pytest will not discover all_tests() – drop manual aggregator.

Pytest already collects every test_* function. Keeping all_tests():

  1. Duplicates execution if someone calls it manually.
  2. Hides individual test results under one call.

Delete the helper or convert it to a pytest fixture if you need orchestration.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f04755a and a13ca6b.

📒 Files selected for processing (4)
  • backend/app/db/supabase/auth.py (1 hunks)
  • backend/app/scripts/weaviate/populate_db.py (1 hunks)
  • tests/test_supabase.py (1 hunks)
  • tests/test_weaviate.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/app/scripts/weaviate/populate_db.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/test_weaviate.py (3)
backend/app/db/weaviate/weaviate_client.py (1)
  • get_client (7-8)
backend/app/model/weaviate/models.py (3)
  • WeaviateUserProfile (5-13)
  • WeaviateCodeChunk (16-24)
  • WeaviateInteraction (27-35)
tests/test_supabase.py (8)
  • insert_code_chunk (169-179)
  • update_code_chunk (187-191)
  • delete_code_chunk (192-196)
  • test_code_chunk (197-222)
  • insert_interaction (115-125)
  • update_interaction (135-139)
  • delete_interaction (140-144)
  • test_interaction (146-167)
tests/test_supabase.py (2)
backend/app/model/supabase/models.py (4)
  • User (7-62)
  • Interaction (146-183)
  • CodeChunk (111-144)
  • Repository (64-109)
backend/app/db/supabase/supabase_client.py (1)
  • get_supabase_client (16-17)
🪛 Ruff (0.11.9)
tests/test_weaviate.py

44-44: SyntaxError: missing closing quote in string literal


49-49: SyntaxError: Expected 'else', found ':'


50-51: SyntaxError: Expected ')', found newline


51-51: SyntaxError: Unexpected indentation


52-52: SyntaxError: Expected except or finally after try block


52-52: SyntaxError: Expected a statement


52-53: SyntaxError: Expected an expression


53-53: SyntaxError: Unexpected indentation


56-56: SyntaxError: Expected a statement


57-57: SyntaxError: missing closing quote in string literal


57-58: SyntaxError: Expected ')', found newline


76-76: SyntaxError: missing closing quote in string literal


76-77: SyntaxError: Expected ')', found newline

backend/app/db/supabase/auth.py

13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


28-28: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🪛 Pylint (3.3.7)
tests/test_weaviate.py

[error] 44-44: Parsing failed: 'unterminated string literal (detected at line 44) (tests.test_weaviate, line 44)'

(E0001)

tests/test_supabase.py

[error] 1-1: Attempted relative import beyond top-level package

(E0402)


[error] 3-3: Attempted relative import beyond top-level package

(E0402)

Comment on lines +56 to +66
def update_user_profile(user_id: str):
    questions = get_client().collections.get("weaviate_user_profile"")
    try:
        user_profile = questions.query.bm25(
            query=user_id,
            properties=["supabaseUserId", "profileSummary", "primaryLanguages", "expertiseAreas"]
        )
        if user_profile:
            user_profile[0]["profileSummary"] = "Updated profile summary"
            questions.update(user_profile[0])
            print("User profile updated successfully.")

🛠️ Refactor suggestion

⚠️ Potential issue

update_user_profile cannot work – wrong API usage & more syntax errors.

  1. Same unterminated-string issue (get("weaviate_user_profile"")).
  2. You call collection.query.bm25() to fetch the object, mutate the returned dict, then pass the dict back to collection.update().
    Weaviate’s python-client requires:
client.data_object.update(
    uuid,              # object id
    class_name="WeaviateUserProfile",
    data_object={"profileSummary": "…"}
)

The current code will raise at runtime even after the syntax fix.

🧰 Tools
🪛 Ruff (0.11.9)

56-56: SyntaxError: Expected a statement


57-57: SyntaxError: missing closing quote in string literal


57-58: SyntaxError: Expected ')', found newline

🤖 Prompt for AI Agents
In tests/test_weaviate.py around lines 56 to 66, fix the unterminated string in
get("weaviate_user_profile") by removing the extra quote. Replace the incorrect
usage of collection.query.bm25() and collection.update() with the correct
Weaviate client API: first query to get the object's UUID, then call
client.data_object.update() with the UUID, class_name, and a data_object dict
containing only the fields to update. This ensures proper update calls without
runtime errors.

Comment on lines +42 to +50
    client = get_client()
    try:
        questions = client.collections.get("weaviate_user_profile"")
        response = questions.query.bm25(
            query=user_id,
            properties=["supabaseUserId", "profileSummary", "primaryLanguages", "expertiseAreas"]
        )
        if response and len(response) > 0:
            user_profile_data = response[0]

⚠️ Potential issue

Fatal syntax error: unterminated string literal & misleading variable name.

client.collections.get("weaviate_user_profile"") contains two closing quotes, producing a SyntaxError that prevents the test module from being imported.
Additionally, questions is a misleading name for a collection object.

-        questions = client.collections.get("weaviate_user_profile"")
+        collection = client.collections.get("weaviate_user_profile")

Without this fix no test in this module will run.

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.11.9)

44-44: SyntaxError: missing closing quote in string literal


49-49: SyntaxError: Expected 'else', found ':'

🪛 Pylint (3.3.7)

[error] 44-44: Parsing failed: 'unterminated string literal (detected at line 44) (tests.test_weaviate, line 44)'

(E0001)

🤖 Prompt for AI Agents
In tests/test_weaviate.py around lines 42 to 50, fix the syntax error by
removing the extra closing quote in the string literal passed to
client.collections.get so it reads "weaviate_user_profile". Also, rename the
variable 'questions' to a more meaningful name like 'collection' to accurately
reflect that it holds a collection object.

Comment on lines 22 to 28
async def logout(access_token: str):
    try:
        supabase_client.auth.set_session(access_token, refresh_token="")
        supabase_client.auth.sign_out()
        return {"message": "User logged out successfully"}
    except Exception as e:
        raise Exception(f"Logout failed: {str(e)}")

🛠️ Refactor suggestion

Exception chaining & redundant set_session.

  1. Use raise … from e to preserve the original traceback (ruff B904).
  2. Setting a session before sign_out() is unnecessary according to Supabase docs.
-        supabase_client.auth.set_session(access_token, refresh_token="")
-        supabase_client.auth.sign_out()
+        supabase_client.auth.sign_out()
 ...
-    except Exception as e:
-        raise Exception(f"Logout failed: {str(e)}")
+    except Exception as e:
+        raise Exception(f"Logout failed: {e}") from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-async def logout(access_token: str):
-    try:
-        supabase_client.auth.set_session(access_token, refresh_token="")
-        supabase_client.auth.sign_out()
-        return {"message": "User logged out successfully"}
-    except Exception as e:
-        raise Exception(f"Logout failed: {str(e)}")
+async def logout(access_token: str):
+    try:
+        supabase_client.auth.sign_out()
+        return {"message": "User logged out successfully"}
+    except Exception as e:
+        raise Exception(f"Logout failed: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)

28-28: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In backend/app/db/supabase/auth.py around lines 22 to 28, remove the call to
set_session before sign_out as it is unnecessary for logging out. Also, update
the exception handling to use 'raise ... from e' to preserve the original
traceback for better debugging.
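
To illustrate what raise … from e preserves, here is a minimal, synchronous sketch. The LogoutError class and the injected sign_out callable are hypothetical stand-ins (the real function is async and calls supabase_client.auth.sign_out() directly); only the chaining behaviour is the point.

```python
from typing import Callable


class LogoutError(Exception):
    """Hypothetical error type for failed sign-outs."""


def logout(sign_out: Callable[[], None]) -> dict:
    """Call sign_out() and wrap any failure, keeping the original cause."""
    try:
        sign_out()
        return {"message": "User logged out successfully"}
    except Exception as e:
        # "from e" stores the original exception on __cause__, so the
        # traceback shows both the wrapper and the underlying failure.
        raise LogoutError(f"Logout failed: {e}") from e
```

A caller that catches LogoutError can then inspect err.__cause__ to reach the original exception.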

Comment on lines +1 to +4
from ..backend.app.model.supabase.models import User, Interaction, CodeChunk, Repository
from uuid import uuid4
from ..backend.app.db.supabase.supabase_client import get_supabase_client
from datetime import datetime # Your User model import

⚠️ Potential issue

Invalid relative imports – tests won’t run outside package context.

from ..backend.app.… is flagged by Pylint as E0402 (Attempted relative import beyond top-level package) and fails with an ImportError when the tests run outside a package context.

Switch to absolute imports that match the project’s module path (as used in the codebase):

-from ..backend.app.model.supabase.models import User, Interaction, CodeChunk, Repository
-from ..backend.app.db.supabase.supabase_client import get_supabase_client
+from app.model.supabase.models import User, Interaction, CodeChunk, Repository
+from app.db.supabase.supabase_client import get_supabase_client

Also remove the trailing comment on the datetime import – it’s misleading.

📝 Committable suggestion

Suggested change
-from ..backend.app.model.supabase.models import User, Interaction, CodeChunk, Repository
-from uuid import uuid4
-from ..backend.app.db.supabase.supabase_client import get_supabase_client
-from datetime import datetime # Your User model import
+from app.model.supabase.models import User, Interaction, CodeChunk, Repository
+from uuid import uuid4
+from app.db.supabase.supabase_client import get_supabase_client
+from datetime import datetime
🧰 Tools
🪛 Pylint (3.3.7)

[error] 1-1: Attempted relative import beyond top-level package

(E0402)


[error] 3-3: Attempted relative import beyond top-level package

(E0402)

🤖 Prompt for AI Agents
In tests/test_supabase.py lines 1 to 4, the relative imports using '..' cause
import errors outside the package context. Replace these relative imports with
absolute imports that reflect the full project module path to ensure the tests
run correctly. Also, remove the trailing comment on the datetime import as it is
misleading and unnecessary.
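
For the absolute app.… imports to resolve when pytest is started from the repository root, the backend/ directory has to be importable. One possible way to arrange that is sketched below; the helper name add_backend_to_path is hypothetical, and the layout (an app package inside backend/) is assumed from the paths quoted in this review.

```python
import sys
from pathlib import Path


def add_backend_to_path(repo_root: Path) -> Path:
    """Prepend <repo_root>/backend to sys.path so `import app...` resolves."""
    backend_dir = (repo_root / "backend").resolve()
    # Avoid adding the same entry twice if the helper is called repeatedly.
    if str(backend_dir) not in sys.path:
        sys.path.insert(0, str(backend_dir))
    return backend_dir
```

In a tests/conftest.py this could be invoked as add_backend_to_path(Path(__file__).resolve().parents[1]). Installing the backend as a package would be a cleaner alternative to editing sys.path.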

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (4)
backend/app/db/supabase/auth.py (1)

22-28: logout still sets a session before sign-out and loses tracebacks

Previous review comments pointed out that supabase_client.auth.set_session is redundant for logout and should be removed. The call is still present and can introduce side-effects (e.g., overwriting a valid refresh token with the empty string).
Also use exception chaining as above.

-        supabase_client.auth.set_session(access_token, refresh_token="")
-        supabase_client.auth.sign_out()
+        supabase_client.auth.sign_out()
@@
-    except Exception as e:
-        raise Exception(f"Logout failed: {str(e)}")
+    except Exception as e:
+        raise Exception(f"Logout failed: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)

28-28: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

tests/test_weaviate.py (3)

41-45: ⚠️ Potential issue

Syntax error & misleading variable name break the test module

The string literal has two closing quotes and questions is not an intuitive name for a collection handle. Python will raise a SyntaxError before the test suite even starts.

-        questions = client.collections.get("weaviate_user_profile"")
+        collection = client.collections.get("weaviate_user_profile")
🧰 Tools
🪛 Ruff (0.11.9)

44-44: SyntaxError: missing closing quote in string literal

🪛 Pylint (3.3.7)

[error] 44-44: Parsing failed: 'unterminated string literal (detected at line 44) (tests.test_weaviate, line 44)'

(E0001)


56-66: ⚠️ Potential issue

update_user_profile still uses an invalid Weaviate update flow

  1. Same unterminated-string issue as above.
  2. You mutate the dict returned by .query.bm25() and pass it to questions.update(), which is not part of the python-client API.
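  3. client.data_object.update() (the v3 interface) requires the object UUID, class name, and a partial payload; on a v4 collection the equivalent is collection.data.update(uuid=…, properties={…}).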
-    questions = get_client().collections.get("weaviate_user_profile"")
+    collection = get_client().collections.get("weaviate_user_profile")

@@
-        user_profile = questions.query.bm25(
+        result = collection.query.bm25(
             query=user_id,
             properties=["supabaseUserId", "profileSummary", "primaryLanguages", "expertiseAreas"]
         )
-        if user_profile:
-            user_profile[0]["profileSummary"] = "Updated profile summary"
-            questions.update(user_profile[0])
+        if result:
+            obj_uuid = result[0]["_additional"]["id"]
+            get_client().data_object.update(
+                uuid=obj_uuid,
+                class_name="weaviate_user_profile",
+                data_object={"profileSummary": "Updated profile summary"}
+            )
🧰 Tools
🪛 Ruff (0.11.9)

56-56: SyntaxError: Expected a statement


57-57: SyntaxError: missing closing quote in string literal


57-58: SyntaxError: Expected ')', found newline


75-78: ⚠️ Potential issue

Deletion helper uses the same unterminated string & wrong API

questions.data.delete_by_id() exists only in the v4 collections interface and takes the object's UUID; the v3 equivalent is client.data_object.delete(uuid, class_name=…).

-    questions = get_client().collections.get("weaviate_user_profile"")
-    try:
-        deleted = questions.data.delete_by_id(user_id)
+    client = get_client()
+    try:
+        deleted = client.data_object.delete(
+            uuid=user_id,
+            class_name="weaviate_user_profile"
+        )
🧰 Tools
🪛 Ruff (0.11.9)

76-76: SyntaxError: missing closing quote in string literal


76-77: SyntaxError: Expected ')', found newline

🧹 Nitpick comments (2)
backend/app/db/supabase/auth.py (1)

8-9: Expose redirect URL via settings helper

Environment lookup is repeated in multiple files. Centralising configuration (e.g., with Pydantic Settings) avoids magic strings and makes default overrides test-friendly.
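
A centralised settings helper could look roughly like the sketch below. It uses a standard-library dataclass as a stand-in for Pydantic Settings so that it runs without extra dependencies; the variable name SUPABASE_REDIRECT_URL and the localhost default are assumptions for illustration, not values from the PR.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    """Single place that knows which environment variables exist."""

    supabase_redirect_url: str


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    # Assumed variable name and default; adjust to the project's real config.
    return Settings(
        supabase_redirect_url=os.environ.get(
            "SUPABASE_REDIRECT_URL", "http://localhost:3000/callback"
        )
    )
```

Tests can override the value by setting the environment variable and calling get_settings.cache_clear() before the next lookup.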

tests/test_weaviate.py (1)

248-252: all_tests will never be collected by pytest

Pytest discovers functions beginning with test_. Either rename to test_all or delete this manual runner to avoid duplicate execution.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a13ca6b and 9b0cde7.

📒 Files selected for processing (2)
  • backend/app/db/supabase/auth.py (1 hunks)
  • tests/test_weaviate.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/test_weaviate.py (3)
backend/app/db/weaviate/weaviate_client.py (1)
  • get_client (7-8)
backend/app/model/weaviate/models.py (3)
  • WeaviateUserProfile (5-13)
  • WeaviateCodeChunk (16-24)
  • WeaviateInteraction (27-35)
tests/test_supabase.py (8)
  • insert_code_chunk (169-179)
  • update_code_chunk (187-191)
  • delete_code_chunk (192-196)
  • test_code_chunk (197-222)
  • insert_interaction (115-125)
  • update_interaction (135-139)
  • delete_interaction (140-144)
  • test_interaction (146-167)
🪛 Ruff (0.11.9)
backend/app/db/supabase/auth.py

13-13: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


28-28: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

tests/test_weaviate.py

44-44: SyntaxError: missing closing quote in string literal


49-49: SyntaxError: Expected 'else', found ':'


50-51: SyntaxError: Expected ')', found newline


51-51: SyntaxError: Unexpected indentation


52-52: SyntaxError: Expected except or finally after try block


52-52: SyntaxError: Expected a statement


52-53: SyntaxError: Expected an expression


53-53: SyntaxError: Unexpected indentation


56-56: SyntaxError: Expected a statement


57-57: SyntaxError: missing closing quote in string literal


57-58: SyntaxError: Expected ')', found newline


76-76: SyntaxError: missing closing quote in string literal


76-77: SyntaxError: Expected ')', found newline

🪛 Pylint (3.3.7)
tests/test_weaviate.py

[error] 44-44: Parsing failed: 'unterminated string literal (detected at line 44) (tests.test_weaviate, line 44)'

(E0001)

🔇 Additional comments (1)
backend/app/db/supabase/auth.py (1)

1-2: Import path may break when backend/ is a package

The module lives in backend/app/... but imports from app.db….
If backend is also a Python package (i.e., has an __init__.py), this import will fail when the module is executed with python -m backend.…
Recommend using an absolute import rooted at backend.app… or, better, a project-level namespace package.

Comment on lines +89 to +95
def test_user_profile():
    inserted_user = insert_user_profile()
    assert inserted_user is not None, "User profile insertion failed"
    get_user_profile_by_id(inserted_user.supabase_user_id)
    update_user_profile(inserted_user.supabase_user_id)
    delete_user_profile(inserted_user.supabase_user_id)


🛠️ Refactor suggestion

Tests execute CRUD helpers but assert nothing

Each test_* function should verify that:

  1. The object is retrievable after insertion.
  2. The update persisted the new value.
  3. The delete returned True and subsequent retrieval fails.

Without assertions the tests can pass even when the helpers are no-ops.

Also applies to: 165-170, 241-246

🤖 Prompt for AI Agents
In tests/test_weaviate.py around lines 89 to 95, the test_user_profile function
calls CRUD helper functions but lacks assertions to verify their effects. Add
assertions to confirm the user profile is retrievable after insertion, that
updates persist the new values, and that deletion returns True and prevents
further retrieval. Apply similar assertion enhancements to the test functions at
lines 165-170 and 241-246.
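
The three checks listed above can be sketched as a round-trip test. To keep the example runnable without a Weaviate instance, it uses a hypothetical in-memory store (InMemoryProfiles) in place of the real helpers; the assertion pattern is what would carry over.

```python
import uuid
from typing import Dict, Optional


class InMemoryProfiles:
    """Hypothetical stand-in for the Weaviate-backed CRUD helpers."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def insert(self, summary: str) -> str:
        profile_id = str(uuid.uuid4())
        self._rows[profile_id] = {"profileSummary": summary}
        return profile_id

    def get(self, profile_id: str) -> Optional[dict]:
        return self._rows.get(profile_id)

    def update(self, profile_id: str, summary: str) -> bool:
        if profile_id not in self._rows:
            return False
        self._rows[profile_id]["profileSummary"] = summary
        return True

    def delete(self, profile_id: str) -> bool:
        return self._rows.pop(profile_id, None) is not None


def test_user_profile_roundtrip() -> None:
    store = InMemoryProfiles()
    profile_id = store.insert("Initial summary")
    # 1. The object is retrievable after insertion.
    assert store.get(profile_id) == {"profileSummary": "Initial summary"}
    # 2. The update persisted the new value.
    assert store.update(profile_id, "Updated profile summary") is True
    assert store.get(profile_id)["profileSummary"] == "Updated profile summary"
    # 3. The delete reports success and a later retrieval finds nothing.
    assert store.delete(profile_id) is True
    assert store.get(profile_id) is None
```

For this to work against the real helpers, they would need to return their results (the fetched object, a success flag) instead of only printing them.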

Comment on lines +210 to +215
        if interaction:
            interaction["conversationSummary"] = "Updated interaction summary"
            client.data_object.update(
                data_object=interaction,
                class_name="weaviate_interaction"
            )

⚠️ Potential issue

Same UUID omission in update_interaction

Replicate the fix applied to update_code_chunk so the interaction update actually executes.

🤖 Prompt for AI Agents
In tests/test_weaviate.py around lines 210 to 215, the update_interaction
function is missing the UUID when calling client.data_object.update, causing the
update not to execute. Fix this by including the UUID of the interaction object
as an argument in the update call, similar to how it was done in
update_code_chunk, ensuring the update targets the correct data object.

Comment on lines +128 to +141
def update_code_chunk(code_chunk_id: str):
    client = get_client()
    try:
        code_chunk = client.data_object.get(
            id=code_chunk_id,
            class_name="weaviate_code_chunk"
        )
        if code_chunk:
            code_chunk["codeContent"] = "Updated code content"
            client.data_object.update(
                data_object=code_chunk,
                class_name="weaviate_code_chunk"
            )
            print("Code chunk updated successfully.")

⚠️ Potential issue

update_code_chunk omits the UUID parameter – call will raise

client.data_object.update() requires the object’s UUID; omitting it raises a TypeError instead of updating anything.

-            code_chunk["codeContent"] = "Updated code content"
-            client.data_object.update(
-                data_object=code_chunk,
-                class_name="weaviate_code_chunk"
-            )
+            client.data_object.update(
+                uuid=code_chunk["id"],
+                class_name="weaviate_code_chunk",
+                data_object={"codeContent": "Updated code content"}
+            )
📝 Committable suggestion

Suggested change
-def update_code_chunk(code_chunk_id: str):
-    client = get_client()
-    try:
-        code_chunk = client.data_object.get(
-            id=code_chunk_id,
-            class_name="weaviate_code_chunk"
-        )
-        if code_chunk:
-            code_chunk["codeContent"] = "Updated code content"
-            client.data_object.update(
-                data_object=code_chunk,
-                class_name="weaviate_code_chunk"
-            )
-            print("Code chunk updated successfully.")
+def update_code_chunk(code_chunk_id: str):
+    client = get_client()
+    try:
+        code_chunk = client.data_object.get(
+            id=code_chunk_id,
+            class_name="weaviate_code_chunk"
+        )
+        if code_chunk:
+            client.data_object.update(
+                uuid=code_chunk["id"],
+                class_name="weaviate_code_chunk",
+                data_object={"codeContent": "Updated code content"}
+            )
+            print("Code chunk updated successfully.")
🤖 Prompt for AI Agents
In tests/test_weaviate.py around lines 128 to 141, the update_code_chunk
function calls client.data_object.update without providing the required UUID
parameter, so the call raises instead of updating the object. Modify the update
call to include the UUID of the object being updated by passing the
code_chunk_id as the uuid argument along with the data_object and class_name
parameters.
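
Passing the ID straight through, as the prompt above describes, might look like this. It is a sketch assuming the v3-style client.data_object interface used in this helper; the function name update_code_content and the client parameter are hypothetical.

```python
from typing import Any


def update_code_content(client: Any, code_chunk_id: str, content: str) -> None:
    """Partially update one code chunk via the v3-style data_object API."""
    client.data_object.update(
        data_object={"codeContent": content},  # only the changed field
        class_name="weaviate_code_chunk",
        uuid=code_chunk_id,  # required: identifies the object to update
    )
```

Because the ID is already known, no prior get call is needed just to obtain it; fetching first is only useful if the helper should report a missing object.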

@smokeyScraper

seems good to go. could you please review and merge @chandansgowda ?
The remaining potential issues flagged by CodeRabbit concern the DB tests, so they shouldn't be much of a problem, as the DB creation and population scripts seem to be working. I'll make changes later if anything's needed when finalizing the scripts, as present in the current backend/app/services/vector_db/service.py


Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FEATURE REQUEST: Database Configuration

3 participants