-
-
Notifications
You must be signed in to change notification settings - Fork 204
NestBot AI Assistant Contexts #1891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary by CodeRabbit
WalkthroughAdd a new Context model and migrate Chunk to reference Context; update retrieval to use context.entity; refactor chunk/context creation into shared Base commands and extractors; update admin, Makefile, utilities, RAG tooling, and tests; adjust constants and error handling. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Assessment against linked issues
Out-of-scope changes
Possibly related PRs
Suggested labels
Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Nitpick comments (7)
backend/apps/ai/models/context.py (1)
23-33
: Improve the string representation method.The current implementation could raise
AttributeError
and has redundant logic.Consider this more defensive approach:
def __str__(self): """Human readable representation.""" - entity = ( - getattr(self.content_object, "name", None) - or getattr(self.content_object, "key", None) - or str(self.content_object) - ) + if self.content_object: + entity = ( + getattr(self.content_object, "name", None) + or getattr(self.content_object, "key", None) + or str(self.content_object) + ) + else: + entity = "No linked object" + return ( f"{self.content_type.model if self.content_type else 'None'} {entity}: " f"{self.generated_text[:50]}" )backend/apps/ai/management/commands/ai_run_rag_tool.py (2)
34-39
: Improve the help text for content-types argument.The current help text "Content types to filter by (e.g., project chapter)" could be more descriptive.
Consider improving the help text:
- help="Content types to filter by (e.g., project chapter)", + help="Content types to filter by (e.g., 'project', 'chapter', 'committee')",
64-70
: Add input validation and improve output formatting.Consider adding validation for argument values and better output formatting.
Add validation before processing:
+ # Validate arguments + if not (0.0 <= options["threshold"] <= 1.0): + self.stderr.write(self.style.ERROR("Threshold must be between 0.0 and 1.0")) + return + if options["limit"] <= 0: + self.stderr.write(self.style.ERROR("Limit must be positive")) + return + self.stdout.write("\nProcessing query...") result = rag_tool.query( question=options["query"], limit=options["limit"], similarity_threshold=options["threshold"], content_types=options["content_types"], ) - self.stdout.write(f"\nAnswer: {result}") + self.stdout.write(self.style.SUCCESS(f"\nAnswer: {result}"))backend/tests/apps/ai/models/context_test.py (1)
10-15
: Consider extracting shared test utility.The
create_model_mock
function is duplicated fromchunk_test.py
. Consider extracting this to a shared test utility module to avoid code duplication.You could create a shared test utilities module:
# backend/tests/apps/ai/test_utils.py from unittest.mock import Mock def create_model_mock(model_class): mock = Mock(spec=model_class) mock._state = Mock() mock.pk = 1 mock.id = 1 return mockThen import it in both test files.
backend/apps/ai/agent/tools/rag/generator.py (2)
16-32
: Fix grammatical issues in SYSTEM_PROMPT.The system prompt contains several grammatical inconsistencies that should be corrected for professionalism:
-3. you will answer questions only related to OWASP and within the scope of OWASP. +3. You will answer questions only related to OWASP and within the scope of OWASP. 4. Be concise and directly answer the user's query. 5. Provide the necessary link if the context contains a URL. -6. If there is any query based on location, you need to look for latitude and longitude in the -context and provide the nearest OWASP chapter based on that. +6. If there is any query based on location, you need to look for latitude and longitude in the context and provide the nearest OWASP chapter based on that. 7. You can ask for more information if the query is very personalized or user-centric. -8. after trying all of the above, If the context does not contain the information or you think that -it is out of scope for OWASP, you MUST state: "please ask question related to OWASP." +8. After trying all of the above, if the context does not contain the information or you think that it is out of scope for OWASP, you MUST state: "please ask question related to OWASP."
115-119
: Add query context to exception logging.When logging OpenAI API errors, include relevant context to aid debugging.
except openai.OpenAIError: - logger.exception("OpenAI API error") + logger.exception("OpenAI API error for query: %s", query[:100]) answer = "I'm sorry, I'm currently unable to process your request."backend/apps/ai/agent/tools/rag/retriever.py (1)
75-183
: Refactor get_additional_context for better maintainability.This method is over 100 lines with repetitive patterns. Consider extracting content-type-specific handlers.
def get_additional_context(self, content_object, content_type: str) -> dict[str, Any]: """Get additional context information based on content type.""" clean_content_type = content_type.split(".")[-1] if "." in content_type else content_type handlers = { "chapter": self._get_chapter_context, "project": self._get_project_context, "event": self._get_event_context, "committee": self._get_committee_context, "message": self._get_message_context, } handler = handlers.get(clean_content_type) if handler: context = handler(content_object) return {k: v for k, v in context.items() if v is not None} return {} def _get_chapter_context(self, content_object) -> dict[str, Any]: """Extract context for chapter objects.""" return { "location": getattr(content_object, "suggested_location", None), "region": getattr(content_object, "region", None), # ... rest of the attributes }This approach would make the code more modular and easier to extend with new content types.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
backend/apps/ai/Makefile
(1 hunks)backend/apps/ai/admin.py
(2 hunks)backend/apps/ai/agent/tools/rag/generator.py
(1 hunks)backend/apps/ai/agent/tools/rag/rag_tool.py
(1 hunks)backend/apps/ai/agent/tools/rag/retriever.py
(1 hunks)backend/apps/ai/common/constants.py
(1 hunks)backend/apps/ai/common/utils.py
(3 hunks)backend/apps/ai/management/commands/ai_run_rag_tool.py
(1 hunks)backend/apps/ai/migrations/0005_context_alter_chunk_unique_together_chunk_context_and_more.py
(1 hunks)backend/apps/ai/models/chunk.py
(4 hunks)backend/apps/ai/models/context.py
(1 hunks)backend/tests/apps/ai/models/chunk_test.py
(5 hunks)backend/tests/apps/ai/models/context_test.py
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (8)
backend/apps/ai/management/commands/ai_run_rag_tool.py (1)
backend/apps/ai/agent/tools/rag/rag_tool.py (2)
RagTool
(13-72)query
(38-72)
backend/apps/ai/admin.py (1)
backend/apps/ai/models/context.py (1)
Context
(10-33)
backend/tests/apps/ai/models/chunk_test.py (2)
backend/apps/ai/models/context.py (1)
Context
(10-33)backend/apps/ai/models/chunk.py (2)
Chunk
(12-74)update_data
(47-74)
backend/apps/ai/common/utils.py (1)
backend/apps/ai/models/context.py (1)
Context
(10-33)
backend/apps/ai/agent/tools/rag/retriever.py (3)
backend/apps/ai/models/chunk.py (1)
Chunk
(12-74)backend/apps/ai/agent/tools/rag/rag_tool.py (1)
query
(38-72)backend/apps/slack/models/message.py (1)
text
(83-85)
backend/tests/apps/ai/models/context_test.py (3)
backend/apps/ai/models/context.py (1)
Context
(10-33)backend/tests/apps/ai/models/chunk_test.py (1)
create_model_mock
(7-12)backend/apps/common/models.py (1)
TimestampedModel
(37-46)
backend/apps/ai/models/context.py (2)
backend/apps/common/models.py (1)
TimestampedModel
(37-46)backend/apps/ai/models/chunk.py (1)
Meta
(15-18)
backend/apps/ai/models/chunk.py (2)
backend/apps/ai/models/context.py (1)
Context
(10-33)backend/apps/common/utils.py (1)
truncate
(164-176)
🔇 Additional comments (27)
backend/apps/ai/Makefile (1)
21-23
: LGTM!The new Makefile target follows the established pattern and provides a convenient way to invoke the RAG tool management command.
backend/apps/ai/common/constants.py (1)
4-5
: LGTM!The new constants have sensible default values and follow the established naming conventions. The retrieval limit of 5 and similarity threshold of 0.5 are reasonable defaults for RAG functionality.
backend/apps/ai/common/utils.py (2)
12-12
: LGTM!The import of the Context model is correctly added to support the new data model structure.
63-63
: LGTM!The chunk creation correctly uses the new context-based association instead of the previous direct content_object relationship.
backend/apps/ai/models/context.py (1)
19-21
: LGTM!The Meta configuration is appropriate with a clear database table name and verbose name.
backend/apps/ai/management/commands/ai_run_rag_tool.py (1)
15-51
: LGTM! Well-structured arguments with good defaults.The argument definitions are comprehensive and use appropriate defaults from the constants module. The help text is clear and informative.
backend/apps/ai/admin.py (4)
6-6
: LGTM!Proper import of the new
Context
model.
9-20
: Well-configured admin interface for Context model.The
ContextAdmin
provides comprehensive management capabilities with appropriate fields for display, search, and filtering. The configuration aligns well with the Context model's structure.
29-32
: Admin updates correctly reflect the model refactoring.The changes properly adapt the
ChunkAdmin
to use the newcontext
relationship instead of the previous generic foreign key approach. Thelist_filter
oncontext__content_type
maintains the ability to filter by content type through the related Context model.
35-35
: LGTM!Proper registration of the
Context
model with its admin class.backend/tests/apps/ai/models/chunk_test.py (5)
4-4
: LGTM!Proper import of the new
Context
model for testing.
17-26
: Test properly updated for new Context relationship.The
test_str_method
correctly mocks thecontext
attribute instead of the previous generic foreign key fields, maintaining test coverage for the string representation functionality.
57-58
: Update data tests correctly adapted to new Context model.The test properly uses
Context
mocks and validates thecontext
parameter in theupdate_data
method calls, reflecting the model refactoring accurately.Also applies to: 67-67, 69-69
129-129
: Unique constraint updated correctly.The unique constraint test properly reflects the change from
("content_type", "object_id", "text")
to("context", "text")
, aligning with the model schema changes.
131-137
: Good test coverage for the new context relationship.The test validates the foreign key relationship properties including the related model, null/blank settings, ensuring the relationship is properly configured.
backend/apps/ai/migrations/0005_context_alter_chunk_unique_together_chunk_context_and_more.py (1)
14-42
: Context model creation looks good.The
Context
model is properly defined with all necessary fields including the generic foreign key components (content_type
,object_id
) and additional metadata fields.backend/apps/ai/agent/tools/rag/rag_tool.py (2)
16-36
: Good initialization with proper error handling.The constructor properly initializes the retriever and generator components with appropriate error handling and logging. The exception handling ensures any initialization failures are properly logged and re-raised.
58-64
: Well-structured retrieval logic.The retrieval step properly passes all the necessary parameters and includes appropriate logging for debugging purposes.
backend/tests/apps/ai/models/context_test.py (4)
19-67
: Excellent test coverage for string representation.The tests comprehensively cover various scenarios for the
__str__
method including edge cases like empty text, exact 50-character limit, and text truncation. This ensures the string representation works correctly in all situations.
69-100
: Thorough field property validation.The tests properly validate all field properties including types, constraints, defaults, and relationships. This ensures the model is correctly configured.
115-151
: Good coverage of model manager operations.The tests properly validate the standard Django model manager operations with appropriate mocking, ensuring the Context model integrates correctly with Django's ORM.
153-176
: Proper validation testing.The tests cover both successful validation and validation error scenarios, ensuring the model's validation logic works correctly.
backend/apps/ai/agent/tools/rag/generator.py (1)
35-52
: LGTM!The initialization method properly validates the API key and provides clear error messaging.
backend/apps/ai/models/chunk.py (2)
26-29
: LGTM!The
__str__
method properly handles nullable context and provides clear representation.
46-74
: AllChunk.update_data
Callers UpdatedNo remaining calls use the old
content_object
parameter. Verified in:
backend/apps/ai/common/utils.py
backend/tests/apps/ai/models/chunk_test.py
backend/apps/ai/admin.py
backend/apps/ai/agent/tools/rag/retriever.py
- All
ai_create_*_chunks
management commandsNo further changes are needed.
backend/apps/ai/agent/tools/rag/retriever.py (2)
44-66
: LGTM!Excellent error handling with appropriate logging and re-raising of exceptions.
255-278
: LGTM!Clean implementation with good handling of singular/plural forms.
backend/apps/ai/migrations/0005_context_alter_chunk_unique_together_chunk_context_and_more.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
backend/apps/ai/admin.py
(2 hunks)backend/apps/ai/common/utils.py
(3 hunks)backend/apps/ai/migrations/0006_rename_generated_text_context_content_and_more.py
(1 hunks)backend/apps/ai/models/chunk.py
(4 hunks)backend/apps/ai/models/context.py
(1 hunks)backend/tests/apps/ai/models/chunk_test.py
(5 hunks)backend/tests/apps/ai/models/context_test.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- backend/apps/ai/common/utils.py
- backend/apps/ai/models/context.py
- backend/apps/ai/admin.py
- backend/tests/apps/ai/models/context_test.py
- backend/tests/apps/ai/models/chunk_test.py
- backend/apps/ai/models/chunk.py
🔇 Additional comments (1)
backend/apps/ai/migrations/0006_rename_generated_text_context_content_and_more.py (1)
13-17
: LGTM! Field rename looks correct.The renaming of
generated_text
tocontent
in the Context model improves naming consistency and aligns with the broader refactor objectives.
backend/apps/ai/migrations/0006_rename_generated_text_context_content_and_more.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/apps/ai/migrations/0007_alter_chunk_context_alter_context_unique_together.py
(1 hunks)backend/apps/ai/models/chunk.py
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/apps/ai/models/chunk.py
🔇 Additional comments (2)
backend/apps/ai/migrations/0007_alter_chunk_context_alter_context_unique_together.py (2)
24-27
: LGTM! Proper unique constraint for generic foreign key pattern.The unique constraint on
(content_type, object_id)
is appropriate for preventing duplicate contexts for the same OWASP entity. This follows Django best practices for generic foreign key relationships.
1-11
: LGTM! Well-structured Django migration.The migration follows Django conventions with proper dependencies, imports, and class structure. The dependency on the previous migration maintains the correct sequence for the Context model refactoring.
backend/apps/ai/migrations/0007_alter_chunk_context_alter_context_unique_together.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (1)
backend/tests/apps/ai/common/utils_test.py (1)
124-147
: Misleading test name/docstring: it asserts no sleep but says/calls “sleep_called”.This is the same issue previously flagged. The test name and docstring claim sleep is called, but the assertions enforce that it is not called.
Apply this rename to align with actual behavior:
- def test_create_chunks_and_embeddings_sleep_called(self, mock_datetime, mock_sleep): - """Tests that sleep is called when needed.""" + def test_create_chunks_and_embeddings_no_sleep_on_first_call(self, mock_datetime, mock_sleep): + """Tests that sleep is not called on the first request."""
🧹 Nitpick comments (24)
backend/apps/ai/management/commands/ai_update_committee_chunks.py (1)
3-6
: Avoid N+1 queries by selecting related repositoryextract_committee_content accesses committee.owasp_repository. Selecting it up-front improves performance for batch processing.
Apply:
+from django.db.models import QuerySet
And add:
class Command(BaseChunkCommand): key_field_name = "key" model_class = Committee def extract_content(self, entity: Committee) -> tuple[str, str]: """Extract content from the committee.""" return extract_committee_content(entity) + + def get_base_queryset(self) -> QuerySet: + """Select related fields to cut down on DB queries.""" + return super().get_base_queryset().select_related("owasp_repository")Also applies to: 8-14
backend/apps/ai/management/commands/ai_update_chapter_context.py (1)
3-6
: Select related repository to improve batch performanceextract_chapter_content uses chapter.owasp_repository; selecting it reduces per-entity DB hits.
Apply:
+from django.db.models import QuerySet
And add:
class Command(BaseContextCommand): key_field_name = "key" model_class = Chapter def extract_content(self, entity: Chapter) -> tuple[str, str]: """Extract content from the chapter.""" return extract_chapter_content(entity) + + def get_base_queryset(self) -> QuerySet: + """Select related fields to avoid N+1 queries when extracting content.""" + return super().get_base_queryset().select_related("owasp_repository")Also applies to: 8-14
backend/apps/ai/management/commands/ai_update_project_chunks.py (1)
18-20
: Select related repository to reduce DB roundtripsSince extract_project_content references project.owasp_repository, selecting it avoids N+1 queries during batch processing.
Apply:
def get_base_queryset(self) -> QuerySet: """Return the base queryset with ordering.""" - return super().get_base_queryset() + return super().get_base_queryset().select_related("owasp_repository")backend/tests/apps/ai/management/commands/ai_update_project_chunks_test.py (2)
22-22
: Rename test class for consistency with command nameThe test class is named “Create...”, but the command is ai_update_project_chunks. Rename for clarity.
Apply:
-class TestAiCreateProjectChunksCommand: +class TestAiUpdateProjectChunksCommand:
1-7
: Strengthen inheritance assertionAlso assert instance of BaseChunkCommand to better reflect the intended contract.
Apply:
from unittest.mock import Mock, patch import pytest from django.core.management.base import BaseCommand +from apps.ai.common.base.chunk_command import BaseChunkCommand
And:
def test_command_inheritance(self, command): - assert isinstance(command, BaseCommand) + assert isinstance(command, BaseCommand) + assert isinstance(command, BaseChunkCommand)Also applies to: 23-25
backend/tests/apps/ai/common/utils_test.py (3)
26-30
: Unnecessary test setup: mocking datetime.UTC and datetime.timedelta.In utils, UTC and timedelta are imported into module scope (from datetime import UTC, timedelta). Patching apps.ai.common.utils.datetime is sufficient; setting mock_datetime.UTC and mock_datetime.timedelta has no effect.
Drop these lines to simplify tests.
Also applies to: 126-130, 157-161, 193-195
44-50
: Rename local variable for clarity: it’s a Context, not content_object.Now that utils takes a Context, prefer naming accordingly to avoid confusion.
- mock_content_object = MagicMock() + mock_context = MagicMock() @@ - mock_content_object, + mock_context, @@ - context=mock_content_object, + context=mock_context, @@ - context=mock_content_object, + context=mock_context,Also applies to: 57-71
183-217
: Strengthen assertion: verify the actual sleep duration under rate limit.Given DEFAULT_LAST_REQUEST_OFFSET_SECONDS=5 and MIN_REQUEST_INTERVAL_SECONDS=10, expected sleep is 5 seconds. Assert the exact call.
- mock_sleep.assert_called_once() + mock_sleep.assert_called_once_with(5)backend/tests/apps/ai/common/extractors/project_test.py (1)
1-514
: Thorough and high-signal test suite.Covers wide scenarios: full/minimal data, partial/zero stats, URL filtering (with/without invalid_urls), date-only, zero health score, repo-only topics, and empty strings. The expectations align with extractor behavior.
Minor optional refinements:
- Consider parametrizing date formatting and statistics cases to reduce duplication.
- You can assert DELIMITER-aware boundaries by splitting metadata on the delimiter if you want to validate sections more strictly without relying on substring contains.
backend/tests/apps/ai/management/commands/ai_update_chapter_chunks_test.py (1)
22-24
: Nit: class name references “create” while command is “ai_update_*”.For consistency with the command name and other modules, prefer “Update” in the test class name.
-class TestAiCreateChapterChunksCommand: +class TestAiUpdateChapterChunksCommand:backend/tests/apps/ai/management/commands/ai_update_event_chunks_test.py (2)
1-1
: Docstring mismatch: refers to ai_create_event_chunks.Update module docstring to “ai_update_event_chunks” to match the command and file path.
-"""Tests for the ai_create_event_chunks Django management command.""" +"""Tests for the ai_update_event_chunks Django management command."""
26-28
: Nit: class name/docstring still use “create”.Align names with “update” for consistency across the codebase.
-class TestAiCreateEventChunksCommand: - """Test suite for the ai_create_event_chunks command.""" +class TestAiUpdateEventChunksCommand: + """Test suite for the ai_update_event_chunks command."""backend/tests/apps/ai/management/commands/ai_update_slack_message_chunks_test.py (1)
22-23
: Nit: class name references “create” while command is “ai_update_*”.Rename for consistency with the command and other tests.
-class TestAiCreateSlackMessageChunksCommand: +class TestAiUpdateSlackMessageChunksCommand:backend/apps/ai/management/commands/ai_update_slack_message_chunks.py (1)
13-31
: Make --message-key and --all mutually exclusiveAvoid ambiguous CLI usage by preventing both flags from being supplied together.
Apply:
- parser.add_argument( - "--message-key", - type=str, - help="Process only the message with this key", - ) - parser.add_argument( - "--all", - action="store_true", - help="Process all the messages", - ) + group = parser.add_mutually_exclusive_group() + group.add_argument( + "--message-key", + type=str, + help="Process only the message with this key", + ) + group.add_argument( + "--all", + action="store_true", + help="Process all the messages", + )backend/apps/ai/models/context.py (1)
44-66
: Limit DB writes to changed fields and keep timestamps freshWhen updating existing contexts, save only the modified fields and ensure the auto-updated timestamp is included in update_fields to reflect the change.
Apply:
- if not created and (context.content != content or context.source != source): - context.content = content - context.source = source - if save: - context.save() + if not created and (context.content != content or context.source != source): + context.content = content + context.source = source + if save: + # Include nest_updated_at so the auto_now timestamp is refreshed even with update_fields. + context.save(update_fields=["content", "source", "nest_updated_at"])Note: Always returning the Context instance is consistent with the maintainers’ preference (per retrieved learnings).
backend/apps/ai/Makefile (1)
5-44
: Declare .PHONY targets to satisfy checkmake and avoid name clashesMarking these targets as phony is a low-effort improvement that pacifies checkmake and prevents accidental filename conflicts.
Apply:
+ .PHONY: ai-run-rag-tool ai-update-chapter-chunks ai-update-chapter-context \ + ai-update-committee-chunks ai-update-committee-context \ + ai-update-event-chunks ai-update-event-context \ + ai-update-project-chunks ai-update-project-context \ + ai-update-slack-message-chunks ai-update-slack-message-contextIf you want to fully appease checkmake’s minphony rules for this sub-makefile, you could also add no-op “all/clean/test” targets here or at a higher level, but that’s optional in this context.
backend/tests/apps/ai/common/base/context_command_test.py (2)
53-55
: Use new Context field names (entity_type_id/entity_id) in mocksThe Context model uses entity_type/entity_id now. These mock attributes still reflect the old names and can cause confusion when reading tests.
Apply this diff:
- context.content_type_id = 1 - context.object_id = 1 + context.entity_type_id = 1 + context.entity_id = 1
121-134
: Unreachable “creation fails” branch given update_data always returns a ContextContext.update_data currently always returns a Context instance (truthy). This test simulates a None return to drive the failure branch. If that’s intentional (to validate defensive handling), consider adding a brief comment in the test to explain why we mock a falsey return despite the implementation never doing so. Otherwise, this branch is effectively dead and could be removed.
backend/tests/apps/ai/common/base/ai_command_test.py (2)
78-85
: Tighten get_base_queryset test and remove confusing Mock callabilityMocking objects as callable can mislead future readers. Also assert the returned queryset for stronger verification.
Apply this diff:
def test_get_base_queryset(self, command): with patch.object(MockTestModel, "objects") as mock_objects: mock_manager = Mock() mock_objects.all.return_value = mock_manager - mock_objects.return_value = mock_manager - - command.get_base_queryset() + result = command.get_base_queryset() mock_objects.all.assert_called_once() + assert result == mock_manager
99-120
: Make add_argument assertions order-agnostic (optional)Relying on the exact order of parser.add_argument calls makes the test brittle. Consider asserting using a mapping over the flags rather than sequence positions.
backend/tests/apps/ai/common/base/chunk_command_test.py (2)
55-56
: Update mock Context fields to match the new schemaThese fields still use the legacy names. Align them with Context’s entity_type/entity_id to avoid future confusion.
Apply this diff:
- context.content_type_id = 1 - context.object_id = 1 + context.entity_type_id = 1 + context.entity_id = 1
189-231
: Consider adding a test for the “chunks up to date” branchYou comprehensively cover creation/update paths. A small test where latest_chunk_timestamp is present and >= context.nest_updated_at would validate the “already up to date” message and the no-op path.
Happy to draft this test if you want it included.
backend/tests/apps/ai/management/commands/ai_update_committee_chunks_test.py (2)
1-1
: Docstring mismatch with command nameThe file tests ai_update_committee_chunks but the docstring says ai_create_committee_chunks. Align the wording for clarity.
Apply this diff:
-"""Tests for the ai_create_committee_chunks command.""" +"""Tests for the ai_update_committee_chunks command."""
29-31
: Rename test class to reflect “update” (not “create”)The class name still references “create,” which can be misleading.
Apply this diff:
-class TestAiCreateCommitteeChunksCommand: +class TestAiUpdateCommitteeChunksCommand:
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (26)
backend/apps/ai/Makefile
(1 hunks)backend/apps/ai/common/base/ai_command.py
(1 hunks)backend/apps/ai/common/base/chunk_command.py
(1 hunks)backend/apps/ai/management/commands/ai_update_chapter_chunks.py
(1 hunks)backend/apps/ai/management/commands/ai_update_chapter_context.py
(1 hunks)backend/apps/ai/management/commands/ai_update_committee_chunks.py
(1 hunks)backend/apps/ai/management/commands/ai_update_committee_context.py
(1 hunks)backend/apps/ai/management/commands/ai_update_event_chunks.py
(1 hunks)backend/apps/ai/management/commands/ai_update_event_context.py
(1 hunks)backend/apps/ai/management/commands/ai_update_project_chunks.py
(1 hunks)backend/apps/ai/management/commands/ai_update_project_context.py
(1 hunks)backend/apps/ai/management/commands/ai_update_slack_message_chunks.py
(1 hunks)backend/apps/ai/management/commands/ai_update_slack_message_context.py
(1 hunks)backend/apps/ai/models/context.py
(1 hunks)backend/tests/apps/ai/common/base/ai_command_test.py
(1 hunks)backend/tests/apps/ai/common/base/chunk_command_test.py
(1 hunks)backend/tests/apps/ai/common/base/context_command_test.py
(1 hunks)backend/tests/apps/ai/common/extractors/project_test.py
(1 hunks)backend/tests/apps/ai/common/utils_test.py
(4 hunks)backend/tests/apps/ai/management/commands/ai_update_chapter_chunks_test.py
(1 hunks)backend/tests/apps/ai/management/commands/ai_update_committee_chunks_test.py
(1 hunks)backend/tests/apps/ai/management/commands/ai_update_event_chunks_test.py
(1 hunks)backend/tests/apps/ai/management/commands/ai_update_project_chunks_test.py
(1 hunks)backend/tests/apps/ai/management/commands/ai_update_slack_message_chunks_test.py
(1 hunks)backend/tests/apps/ai/models/context_test.py
(1 hunks)backend/tests/apps/common/open_ai_test.py
(0 hunks)
💤 Files with no reviewable changes (1)
- backend/tests/apps/common/open_ai_test.py
🚧 Files skipped from review as they are similar to previous changes (6)
- backend/apps/ai/management/commands/ai_update_committee_context.py
- backend/apps/ai/common/base/chunk_command.py
- backend/tests/apps/ai/models/context_test.py
- backend/apps/ai/management/commands/ai_update_project_context.py
- backend/apps/ai/management/commands/ai_update_slack_message_context.py
- backend/apps/ai/common/base/ai_command.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-16T12:07:38.116Z
Learnt from: Dishant1804
PR: OWASP/Nest#1891
File: backend/apps/ai/models/context.py:44-66
Timestamp: 2025-08-16T12:07:38.116Z
Learning: In the Context.update_data method, the Context object should always be returned (never None), even when no changes are made to the content or source fields. This ensures consistent behavior for callers.
Applied to files:
backend/apps/ai/models/context.py
📚 Learning: 2025-07-28T14:51:14.736Z
Learnt from: adithya-naik
PR: OWASP/Nest#1894
File: frontend/src/components/TopContributorsList.tsx:74-74
Timestamp: 2025-07-28T14:51:14.736Z
Learning: In the OWASP/Nest project, the maintainer adithya-naik prefers not to create separate components for code that's only used in two specific cases, following the YAGNI principle to avoid over-engineering when the duplication is limited and manageable.
Applied to files:
backend/apps/ai/models/context.py
🧬 Code Graph Analysis (18)
backend/apps/ai/management/commands/ai_update_committee_chunks.py (4)
backend/apps/ai/common/base/chunk_command.py (1)
BaseChunkCommand
(12-91)backend/apps/ai/common/extractors/committee.py (1)
extract_committee_content
(6-55)backend/apps/ai/management/commands/ai_update_chapter_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_project_chunks.py (2)
Command
(10-20)extract_content
(14-16)
backend/apps/ai/management/commands/ai_update_project_chunks.py (7)
backend/apps/ai/common/base/chunk_command.py (1)
BaseChunkCommand
(12-91)backend/apps/ai/common/extractors/project.py (1)
extract_project_content
(6-97)backend/apps/ai/management/commands/ai_update_committee_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_chapter_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_slack_message_chunks.py (2)
Command
(9-41)extract_content
(32-34)backend/apps/ai/management/commands/ai_update_event_chunks.py (3)
Command
(10-24)extract_content
(14-16)get_base_queryset
(18-20)backend/apps/ai/management/commands/ai_create_project_chunks.py (1)
Command
(14-178)
backend/tests/apps/ai/management/commands/ai_update_project_chunks_test.py (3)
backend/apps/ai/management/commands/ai_update_chapter_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_project_chunks.py (3)
Command
(10-20)extract_content
(14-16)get_base_queryset
(18-20)backend/apps/ai/management/commands/ai_update_event_chunks.py (3)
Command
(10-24)extract_content
(14-16)get_base_queryset
(18-20)
backend/tests/apps/ai/management/commands/ai_update_slack_message_chunks_test.py (1)
backend/apps/ai/management/commands/ai_update_slack_message_chunks.py (5)
Command
(9-41)source_name
(40-41)extract_content
(32-34)get_default_queryset
(36-38)add_arguments
(13-30)
backend/tests/apps/ai/management/commands/ai_update_chapter_chunks_test.py (2)
backend/apps/ai/management/commands/ai_update_committee_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_chapter_chunks.py (2)
Command
(8-14)extract_content
(12-14)
backend/tests/apps/ai/management/commands/ai_update_event_chunks_test.py (3)
backend/apps/ai/management/commands/ai_update_chapter_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_project_chunks.py (3)
Command
(10-20)extract_content
(14-16)get_base_queryset
(18-20)backend/apps/ai/management/commands/ai_update_event_chunks.py (4)
Command
(10-24)extract_content
(14-16)get_base_queryset
(18-20)get_default_queryset
(22-24)
backend/tests/apps/ai/common/extractors/project_test.py (1)
backend/apps/ai/common/extractors/project.py (1)
extract_project_content
(6-97)
backend/apps/ai/management/commands/ai_update_chapter_chunks.py (10)
backend/apps/ai/common/base/chunk_command.py (1)
BaseChunkCommand
(12-91)backend/apps/owasp/api/internal/queries/chapter.py (1)
chapter
(14-19)backend/apps/ai/common/extractors/chapter.py (1)
extract_chapter_content
(6-76)backend/apps/ai/management/commands/ai_update_committee_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_project_chunks.py (2)
Command
(10-20)extract_content
(14-16)backend/apps/ai/management/commands/ai_update_slack_message_chunks.py (2)
Command
(9-41)extract_content
(32-34)backend/apps/ai/management/commands/ai_update_event_chunks.py (2)
Command
(10-24)extract_content
(14-16)backend/tests/apps/ai/common/base/ai_command_test.py (1)
extract_content
(50-51)backend/tests/apps/ai/common/base/chunk_command_test.py (1)
extract_content
(23-25)backend/tests/apps/ai/common/base/context_command_test.py (1)
extract_content
(21-22)
backend/apps/ai/management/commands/ai_update_slack_message_chunks.py (6)
backend/apps/ai/common/base/chunk_command.py (1)
BaseChunkCommand
(12-91)backend/apps/slack/models/message.py (1)
cleaned_text
(52-63)backend/apps/ai/management/commands/ai_update_committee_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_chapter_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_project_chunks.py (3)
Command
(10-20)extract_content
(14-16)get_base_queryset
(18-20)backend/apps/ai/management/commands/ai_update_event_chunks.py (4)
Command
(10-24)extract_content
(14-16)get_default_queryset
(22-24)get_base_queryset
(18-20)
backend/apps/ai/management/commands/ai_update_event_chunks.py (6)
backend/apps/ai/common/base/chunk_command.py (1)
BaseChunkCommand
(12-91)backend/apps/ai/common/extractors/event.py (1)
extract_event_content
(6-49)backend/apps/ai/management/commands/ai_update_committee_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_chapter_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_project_chunks.py (3)
Command
(10-20)extract_content
(14-16)get_base_queryset
(18-20)backend/apps/ai/management/commands/ai_update_slack_message_chunks.py (3)
Command
(9-41)extract_content
(32-34)get_default_queryset
(36-38)
backend/tests/apps/ai/common/base/ai_command_test.py (1)
backend/apps/ai/common/base/ai_command.py (10)
BaseAICommand
(12-109)source_name
(25-27)get_base_queryset
(29-31)get_default_queryset
(33-35)add_common_arguments
(37-54)add_arguments
(56-58)get_queryset
(60-69)get_entity_key
(71-73)setup_openai_client
(75-83)handle_batch_processing
(85-109)
backend/tests/apps/ai/common/base/chunk_command_test.py (4)
backend/apps/ai/common/base/chunk_command.py (4)
BaseChunkCommand
(12-91)help
(15-17)process_chunks_batch
(19-77)handle
(79-91)backend/apps/ai/models/chunk.py (1)
Chunk
(12-76)backend/apps/ai/models/context.py (1)
Context
(21-65)backend/apps/ai/common/base/context_command.py (2)
help
(12-14)handle
(45-54)
backend/apps/ai/management/commands/ai_update_event_context.py (6)
backend/apps/ai/common/base/context_command.py (1)
BaseContextCommand
(9-54)backend/apps/ai/common/extractors/event.py (1)
extract_event_content
(6-49)backend/apps/ai/management/commands/ai_update_chapter_context.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_committee_context.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_project_context.py (3)
Command
(10-20)extract_content
(14-16)get_base_queryset
(18-20)backend/apps/ai/common/base/ai_command.py (2)
get_base_queryset
(29-31)get_default_queryset
(33-35)
backend/tests/apps/ai/common/utils_test.py (2)
backend/apps/ai/common/utils.py (2)
create_chunks_and_embeddings
(19-70)regenerate_chunks_for_context
(73-101)backend/apps/ai/models/context.py (1)
regenerate_chunks_for_context
(14-18)
backend/tests/apps/ai/management/commands/ai_update_committee_chunks_test.py (4)
backend/apps/ai/management/commands/ai_update_committee_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/management/commands/ai_update_chapter_chunks.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/common/base/ai_command.py (1)
get_entity_key
(71-73)backend/apps/ai/common/base/chunk_command.py (1)
help
(15-17)
backend/tests/apps/ai/common/base/context_command_test.py (2)
backend/apps/ai/common/base/context_command.py (4)
BaseContextCommand
(9-54)help
(12-14)process_context_batch
(16-43)handle
(45-54)backend/apps/ai/models/context.py (2)
Context
(21-65)update_data
(45-65)
backend/apps/ai/models/context.py (3)
backend/apps/common/models.py (1)
TimestampedModel
(37-46)backend/apps/ai/common/utils.py (1)
regenerate_chunks_for_context
(73-101)backend/apps/ai/models/chunk.py (2)
Meta
(15-18)update_data
(45-76)
backend/apps/ai/management/commands/ai_update_chapter_context.py (7)
backend/apps/ai/common/base/context_command.py (1)
BaseContextCommand
(9-54)backend/apps/owasp/api/internal/queries/chapter.py (1)
chapter
(14-19)backend/apps/ai/common/extractors/chapter.py (1)
extract_chapter_content
(6-76)backend/apps/ai/management/commands/ai_update_committee_context.py (2)
Command
(8-14)extract_content
(12-14)backend/tests/apps/ai/common/base/ai_command_test.py (1)
extract_content
(50-51)backend/tests/apps/ai/common/base/chunk_command_test.py (1)
extract_content
(23-25)backend/tests/apps/ai/common/base/context_command_test.py (1)
extract_content
(21-22)
🪛 checkmake (0.2.2)
backend/apps/ai/Makefile
[warning] 1-1: Missing required phony target "all"
(minphony)
[warning] 1-1: Missing required phony target "clean"
(minphony)
[warning] 1-1: Missing required phony target "test"
(minphony)
🔇 Additional comments (20)
backend/apps/ai/management/commands/ai_update_committee_chunks.py (1)
8-14
: Solid, minimal command skeletonClean delegation to the extractor and proper BaseChunkCommand usage. Looks good.
backend/apps/ai/management/commands/ai_update_chapter_context.py (1)
8-14
: Straightforward context updaterDelegation to extract_chapter_content and BaseContextCommand usage are correct.
backend/apps/ai/management/commands/ai_update_project_chunks.py (1)
10-17
: Good adherence to the shared chunking flowThe command correctly sets model_class/key_field_name and delegates extraction.
backend/apps/ai/management/commands/ai_update_event_context.py (1)
10-24
: LGTM: sensible defaults and delegation
- Delegates to extract_event_content.
- get_default_queryset uses Event.upcoming_events(), which better fits events than a generic is_active filter.
backend/tests/apps/ai/common/utils_test.py (3)
245-294
: Regeneration flow looks solid.Good coverage: deletes existing chunks when present, splits content, instantiates OpenAI, and delegates to create_chunks_and_embeddings with the expected args. Logging assertions are precise.
295-327
: Correct handling of no-content case.Appropriate early exit with a clear warning; avoids unnecessary OpenAI calls. Assertions look accurate.
329-372
: No-existing-chunks path is correct.Skips delete when count is zero, proceeds to split and create. Assertions capture expected behavior.
backend/tests/apps/ai/management/commands/ai_update_chapter_chunks_test.py (1)
40-47
: Delegation test is precise and correct.Patches the extractor at the command module path, verifies single call with the entity, and validates returned tuple. Nicely done.
backend/tests/apps/ai/management/commands/ai_update_event_chunks_test.py (2)
61-69
: Base queryset delegation check is good.Verifies the super call path without over-specifying internals. Looks correct.
71-79
: Default queryset logic is correct.Asserting Event.upcoming_events() is used by default matches the command contract.
backend/tests/apps/ai/management/commands/ai_update_slack_message_chunks_test.py (2)
64-84
: Argument wiring test is clear and sufficient.Verifies the expected flags and defaults without over-constraining implementation details. Looks good.
44-55
: cleaned_text is a @Property, not a method—no changes neededThe
Message
model inbackend/apps/slack/models/message.py
(lines 52–54) definescleaned_text
with@property
, soextract_content
always receives a string (orNone
). The existing tests already cover both non-empty andNone
cases. You can safely ignore the proposed callable‐branch change and additional test.Likely an incorrect or invalid review comment.
backend/apps/ai/management/commands/ai_update_chapter_chunks.py (1)
12-14
: Extractor delegation and return order confirmedFile: backend/apps/ai/management/commands/ai_update_chapter_chunks.py
Lines: 12-14Snippet showing the final state of code at these lines
def extract_content(self, entity: Chapter) -> tuple[str, str]: """Extract content from the chapter.""" return extract_chapter_content(entity)
extract_chapter_content’s implementation (lines 73–76 in backend/apps/ai/common/extractors/chapter.py) returns
- prose content
- metadata content
per its signature and docstring, matching BaseChunkCommand’s expected (prose, metadata) contract. Ready to merge.backend/apps/ai/management/commands/ai_update_slack_message_chunks.py (2)
36-41
: LGTM on queryset and source namingDefaulting to the base queryset (no is_active) for messages and naming the source "slack_message" fit the model and logging conventions.
32-35
: No change needed for cleaned_text usage
cleaned_text
is decorated with@property
, soentity.cleaned_text
already returns astr
. Calling it as a method isn’t required and would actually error out.Likely an incorrect or invalid review comment.
backend/apps/ai/models/context.py (2)
24-34
: Solid GFK design and uniqueness guaranteesRequired entity_type/entity_id with a GenericForeignKey and unique_together enforces the “one context per entity” invariant discussed in the PR. This aligns with the project direction and keeps lookups efficient via the implicit unique index.
14-19
: Wrapper to avoid circular import is fineThe local indirection for regenerate_chunks_for_context avoids import cycles cleanly. No issues.
backend/apps/ai/management/commands/ai_update_event_chunks.py (2)
14-16
: Good use of extractor with BaseChunkCommandDelegation to extract_event_content(entity) matches the shared chunking flow and keeps entity-specific logic out of the command.
22-24
: Sensible default queryset for eventsDefaulting to upcoming events instead of an is_active filter is appropriate for reducing churn and processing scope.
backend/apps/ai/Makefile (1)
1-3
: Makefile target valid –ai_run_rag_tool
existsThe
ai_run_rag_tool
management command is present at
• backend/apps/ai/management/commands/ai_run_rag_tool.py
and is exercised by tests in
• backend/tests/apps/ai/management/commands/ai_run_rag_tool_test.pyNo changes to the
ai-run-rag-tool:
target are needed.
backend/tests/apps/ai/management/commands/ai_update_committee_chunks_test.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (6)
backend/apps/ai/models/context.py (3)
30-34
: Prefer named UniqueConstraint over unique_together (clearer, non-deprecated style).UniqueConstraint provides a named, explicit constraint that’s easier to manage in migrations.
Apply:
class Meta: db_table = "ai_contexts" verbose_name = "Context" - unique_together = ("entity_type", "entity_id") + constraints = [ + models.UniqueConstraint( + fields=("entity_type", "entity_id"), + name="uniq_ai_context_entity", + ) + ]
35-43
: Make str robust when the related entity is missing.If the generic target has been deleted, self.entity may be None, resulting in “model None: …”. Prefer a clearer placeholder.
- entity = ( - getattr(self.entity, "name", None) - or getattr(self.entity, "key", None) - or str(self.entity) - ) + entity_obj = self.entity + entity = ( + getattr(entity_obj, "name", None) + or getattr(entity_obj, "key", None) + or (str(entity_obj) if entity_obj is not None else "-") + )
14-19
: Consider removing the wrapper regenerate_chunks_for_context to reduce coupling.The indirection exists only to sidestep import cycles, but it reintroduces a model-to-utils dependency surface. Since update_data no longer calls it, prefer importing utils where needed or keeping the wrapper in a non-model module (e.g., apps.ai.common.init).
If retained, add a type hint for clarity:
-def regenerate_chunks_for_context(context): +def regenerate_chunks_for_context(context: "Context"): """Import regenerate_chunks_for_context to avoid circular import."""backend/tests/apps/ai/models/context_test.py (3)
32-38
: Rename to reflect field name (entity_type).The test name still uses the old “content_type” wording.
- def test_content_type_field_properties(self): + def test_entity_type_field_properties(self):
39-42
: Rename to reflect field name and, if adopting BigInteger, update the asserted type.If you switch entity_id to PositiveBigIntegerField, the test should assert that. Otherwise, just rename the test.
- def test_object_id_field_properties(self): + def test_entity_id_field_properties(self): field = Context._meta.get_field("entity_id") - assert field.__class__.__name__ == "PositiveIntegerField" + assert field.__class__.__name__ == "PositiveBigIntegerField"If you keep PositiveIntegerField, only apply the rename and keep the original assertion.
377-377
: Add a test to ensure unchanged contexts don’t trigger save (preserve timestamps).This guards against accidental regressions that would bump nest_updated_at unnecessarily.
+ @patch("apps.ai.models.context.Context.objects.get_or_create") + def test_update_data_noop_does_not_save_when_unchanged(self, mock_get_or_create): + content_object = Mock() + content_object.pk = 1 + with patch("apps.ai.models.context.ContentType.objects.get_for_model") as mock_gfm: + ct = Mock(spec=ContentType) + ct.get_source_expressions = Mock(return_value=[]) + mock_gfm.return_value = ct + existing = create_model_mock(Context) + existing.content = "same" + existing.source = "src" + existing.save = Mock() + mock_get_or_create.return_value = (existing, False) + + result = Context.update_data("same", content_object, source="src", save=True) + + assert result == existing + existing.save.assert_not_called()
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
backend/apps/ai/models/context.py
(1 hunks)backend/tests/apps/ai/models/context_test.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-16T12:07:38.116Z
Learnt from: Dishant1804
PR: OWASP/Nest#1891
File: backend/apps/ai/models/context.py:44-66
Timestamp: 2025-08-16T12:07:38.116Z
Learning: In the Context.update_data method, the Context object should always be returned (never None), even when no changes are made to the content or source fields. This ensures consistent behavior for callers.
Applied to files:
backend/apps/ai/models/context.py
📚 Learning: 2025-07-28T14:51:14.736Z
Learnt from: adithya-naik
PR: OWASP/Nest#1894
File: frontend/src/components/TopContributorsList.tsx:74-74
Timestamp: 2025-07-28T14:51:14.736Z
Learning: In the OWASP/Nest project, the maintainer adithya-naik prefers not to create separate components for code that's only used in two specific cases, following the YAGNI principle to avoid over-engineering when the duplication is limited and manageable.
Applied to files:
backend/apps/ai/models/context.py
🧬 Code Graph Analysis (2)
backend/apps/ai/models/context.py (4)
backend/apps/common/models.py (1)
TimestampedModel
(37-46)backend/apps/ai/common/utils.py (1)
regenerate_chunks_for_context
(73-101)backend/apps/owasp/admin/entity_member.py (1)
entity
(61-74)backend/apps/ai/models/chunk.py (2)
Meta
(15-18)update_data
(45-76)
backend/tests/apps/ai/models/context_test.py (2)
backend/apps/ai/models/context.py (2)
Context
(21-67)update_data
(45-67)backend/apps/common/models.py (1)
TimestampedModel
(37-46)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is still far from what'd approve with no hesitations. Please address the current comments, create issue to track your technical debt (both refactoring I suggested and tests). This can be merged after that.
backend/apps/ai/models/context.py
Outdated
def regenerate_chunks_for_context(context): | ||
"""Import regenerate_chunks_for_context to avoid circular import.""" | ||
from apps.ai.common.utils import regenerate_chunks_for_context as _regenerate_chunks | ||
|
||
return _regenerate_chunks(context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is here for what purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to resolve the circular import I cleaned out the code but this missed out, removed it
from apps.ai.common.constants import DELIMITER | ||
|
||
|
||
def extract_chapter_content(chapter) -> tuple[str, str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have 2 suggestions for the extractors:
- why do you need separation for prose/metadata if you join them later anyway?
- this needs to be migrated to Jinja templates like ones we have for Slack responses
Please create an issue to track this refactoring.
backend/apps/ai/common/utils.py
Outdated
""" | ||
from apps.ai.models.chunk import Chunk | ||
|
||
old_chunk_count = context.chunks.count() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to delete them w/o pre-checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes -- removed the checks
additional_context = self.get_additional_context( | ||
chunk.content_object, chunk.content_type.model | ||
chunk.context.entity, chunk.context.entity_type.model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you don't need the second param as you can figure that our from the model instance inside of get_additional_context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/apps/ai/agent/tools/rag/retriever.py (1)
59-67
: Update exception handling for OpenAI SDK v1.xThe v1.x Python client no longer defines
OpenAIError
. Instead you should catch its new base exception (openai.APIError
) and/or the more specific subclasses (e.g.,RateLimitError
,APIConnectionError
,APITimeoutError
) to handle errors appropriately.• Affected file:
backend/apps/ai/agent/tools/rag/retriever.py
(lines 59–67)
• Replace the genericexcept openai.OpenAIError:
with granular catches and a finalAPIError
fallback.Proposed diff:
try: response = self.openai_client.embeddings.create( input=[query], model=self.embedding_model, ) return response.data[0].embedding - except openai.OpenAIError: - logger.exception("OpenAI API error") - raise + except openai.RateLimitError: + logger.warning("Rate limit exceeded, backing off before retry") + # backoff & retry logic + except (openai.APIConnectionError, openai.APITimeoutError) as e: + logger.warning("Network error from OpenAI API: %s", e) + # retry logic + except openai.APIError as e: + logger.exception("Unexpected OpenAI API error") + raise
♻️ Duplicate comments (1)
backend/apps/ai/agent/tools/rag/retriever.py (1)
217-222
: Good fix: filter through Context entity_type (matches the model changes)Filtering by context__entity_type__… is correct post-refactor away from Chunk.content_type. This addresses the earlier review note.
🧹 Nitpick comments (12)
backend/tests/apps/ai/management/commands/ai_update_committee_context_test.py (4)
69-79
: Patch the correct base method: target BaseAICommand.get_base_queryset, not BaseContextCommand.
get_base_queryset
is defined on BaseAICommand per the shared snippets. Patching BaseContextCommand here works by overshadowing via attribute injection, but it doesn’t validate the real inheritance path and can hide regressions.Apply this diff:
- with patch( - "apps.ai.common.base.context_command.BaseContextCommand.get_base_queryset" - ) as mock_super: + with patch( + "apps.ai.common.base.ai_command.BaseAICommand.get_base_queryset" + ) as mock_super:
152-173
: Mirror production semantics: have update_data return a truthy object, not True.
Context.update_data
returns a Context instance (truthy), not a boolean. Returning a mock object here reduces the risk of a future mismatch if the implementation changes.- with patch("apps.ai.common.base.context_command.Context") as mock_context_class: - mock_context_class.update_data.return_value = True + with patch("apps.ai.common.base.context_command.Context") as mock_context_class: + mock_context_class.update_data.return_value = Mock()
188-212
: Failure path may be unreachable with current update_data contract.
Context.update_data
returns a Context object and raises on DB errors; it doesn’t return False. The tested “failure” branch will not occur in production unless the contract changes or you explicitly return falsy values fromupdate_data
.Consider one of:
- Adjust BaseContextCommand.process_context_batch to catch exceptions from
update_data
and log an error (then add a test usingside_effect=Exception("db error")
), or- Document that
update_data
returns a truthy object on success and update this test to validate exception handling rather than a False return.
1-1
: Rename staleai_create_committee_context
references in test filePlease update all remaining occurrences of
ai_create_committee_context
toai_update_committee_context
and rename the test class accordingly:• File:
backend/tests/apps/ai/management/commands/ai_update_committee_context_test.py
– Line 1:
diff -"""Tests for the ai_create_committee_context command.""" +"""Tests for the ai_update_committee_context command."""
– Around line 27–29:
diff -class TestAiCreateCommitteeContextCommand: - """Test suite for the ai_create_committee_context command.""" +class TestAiUpdateCommitteeContextCommand: + """Test suite for the ai_update_committee_context command."""
This will ensure clarity and make future greps accurate.
backend/apps/ai/agent/tools/rag/retriever.py (8)
24-30
: Make SUPPORTED_ENTITY_TYPES immutable for intent and tiny membership winsTuples are fine, but a frozenset communicates immutability and gives O(1) membership checks (you iterate and also check membership later). This is a tiny nit, not a blocker.
- SUPPORTED_ENTITY_TYPES = ( + SUPPORTED_ENTITY_TYPES = frozenset(( "chapter", "committee", "event", "message", "project", - ) + ))
72-78
: Prefer persisted source_name when available; tighten docstring and add a slug fallbackIf Context stores a source/source_name (as suggested by the AI command snippets), prefer using that value when available for consistency across pipelines. Also, the docstring still says “content object”; here it’s an entity. Adding “slug” is a cheap, useful fallback.
- def get_source_name(self, entity) -> str: - """Get the name/identifier for the content object.""" - for attr in ("name", "title", "login", "key", "summary"): + def get_source_name(self, entity) -> str: + """Get a human-readable identifier for the entity.""" + for attr in ("name", "title", "login", "key", "summary", "slug"): if getattr(entity, attr, None): return str(getattr(entity, attr)) return str(entity)
79-85
: Docstring drift: “content type” → “entity type”; clarify paramsThe function operates on entities now. Tightening the docstring avoids future confusion.
- def get_additional_context(self, entity, entity_type: str) -> dict[str, Any]: - """Get additional context information based on content type. + def get_additional_context(self, entity, entity_type: str) -> dict[str, Any]: + """Get additional context information based on entity type. @@ - entity: The source object. - entity_type: The model name of the content object. + entity: The related entity (from Context.entity). + entity_type: The entity model name (e.g., "project", "chapter").
91-91
: Minor naming clarity: clean_content_type → clean_entity_typeVariable name can be misleading after the refactor to entities. Not a blocker; consider renaming across this method when convenient.
224-224
: Avoid hauling the embedding payload; defer the “embedding” fieldAfter the annotate is done, the embedding value isn’t used in Python. Deferring it trims network I/O and memory, especially with large vector dims.
- chunks = queryset.select_related("context__entity_type").order_by("-similarity")[:limit] + chunks = ( + queryset + .select_related("context__entity_type") + .defer("embedding") + .order_by("-similarity")[:limit] + )
228-235
: Avoid double entity fetches, correct log message, and prefer persisted context source_nameAccessing GenericForeignKey triggers a DB hit; you access it for the guard and then again for source/extra context. Cache it once, fix the log message, and prefer a stored source_name when available to avoid recomputing.
- for chunk in chunks: - if not chunk.context or not chunk.context.entity: - logger.warning("Content object is None for chunk %s. Skipping.", chunk.id) - continue - - source_name = self.get_source_name(chunk.context.entity) - additional_context = self.get_additional_context( - chunk.context.entity, chunk.context.entity_type.model - ) + for chunk in chunks: + if not chunk.context: + logger.warning("Context is None for chunk %s. Skipping.", chunk.id) + continue + + entity = getattr(chunk.context, "entity", None) + if not entity: + logger.warning("Context entity is None for chunk %s. Skipping.", chunk.id) + continue + + source_name = getattr(chunk.context, "source_name", None) or self.get_source_name(entity) + additional_context = self.get_additional_context(entity, chunk.context.entity_type.model)If Context does not have source/source_name, ignore that piece of the change and keep the fallback.
263-265
: Optional: enrich keyword detection with synonymsThe simple singular/plural check works; consider a small synonyms map (e.g., “repo”, “repos” → “project”; “chapter(s)” → “chapter”) to improve recall.
232-235
: Use Context metadata where possible to reduce N+1sLonger-term, consider persisting and reading “additional_context” and “source_name” from Context at creation time. That would avoid hitting the GFK per chunk at retrieval time, especially when limit is high.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
backend/apps/ai/agent/tools/rag/retriever.py
(4 hunks)backend/apps/ai/common/base/context_command.py
(1 hunks)backend/apps/ai/models/context.py
(1 hunks)backend/tests/apps/ai/agent/tools/rag/retriever_test.py
(1 hunks)backend/tests/apps/ai/common/base/context_command_test.py
(1 hunks)backend/tests/apps/ai/management/commands/ai_update_committee_context_test.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- backend/apps/ai/common/base/context_command.py
- backend/tests/apps/ai/agent/tools/rag/retriever_test.py
- backend/tests/apps/ai/common/base/context_command_test.py
- backend/apps/ai/models/context.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
backend/tests/apps/ai/management/commands/ai_update_committee_context_test.py (4)
backend/apps/ai/management/commands/ai_update_committee_context.py (2)
Command
(8-14)extract_content
(12-14)backend/apps/ai/common/base/context_command.py (4)
BaseContextCommand
(9-54)help
(12-14)process_context_batch
(16-43)handle
(45-54)backend/apps/ai/common/base/ai_command.py (6)
get_base_queryset
(29-31)get_default_queryset
(33-35)add_arguments
(56-58)get_queryset
(60-69)get_entity_key
(71-73)source_name
(25-27)backend/apps/ai/models/context.py (1)
update_data
(45-67)
backend/apps/ai/agent/tools/rag/retriever.py (3)
backend/apps/owasp/admin/entity_member.py (1)
entity
(61-74)backend/apps/ai/common/base/ai_command.py (1)
source_name
(25-27)backend/apps/ai/management/commands/ai_update_slack_message_context.py (1)
source_name
(41-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (2)
backend/tests/apps/ai/management/commands/ai_update_committee_context_test.py (2)
79-90
: LGTM on default queryset filtering.Good verification that the default path applies
is_active=True
.
213-231
: LGTM on handle() batch plumbing.Solid assertion of option forwarding and
process_batch_func
binding.
backend/tests/apps/ai/management/commands/ai_update_committee_context_test.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (9)
backend/apps/ai/models/context.py (2)
49-60
: Close race on create and avoid bumping timestamps when unchangedTwo concurrent callers can both miss the get() and try to create, causing IntegrityError on save. Also, saving when content/source are unchanged will bump nest_updated_at, breaking “stale-chunk” detection based on timestamps.
Apply atomic get_or_create and only save when fields actually change; always return Context per your established contract:
@staticmethod def update_data( content: str, entity, source: str = "", *, save: bool = True, - ) -> "Context": + ) -> "Context": """Create or update context for a given entity.""" - entity_type = ContentType.objects.get_for_model(entity) - entity_id = entity.pk - - try: - context = Context.objects.get(entity_type=entity_type, entity_id=entity_id) - except Context.DoesNotExist: - context = Context(entity_type=entity_type, entity_id=entity_id) - - context.content = content - context.source = source - - if save: - context.save() - - return context + entity_type = ContentType.objects.get_for_model(entity) + entity_id = entity.pk + + if save: + # Atomic to avoid duplicate rows under concurrency + context, created = Context.objects.get_or_create( + entity_type=entity_type, + entity_id=entity_id, + defaults={"content": content, "source": source}, + ) + if created: + return context + # No-op if unchanged; preserve timestamps + if context.content == content and context.source == source: + return context + # Update only changed fields + context.content = content + context.source = source + context.save(update_fields=["content", "source"]) + return context + else: + # Return unsaved instance for bulk-save flows + try: + context = Context.objects.get(entity_type=entity_type, entity_id=entity_id) + except Context.DoesNotExist: + return Context( + entity_type=entity_type, entity_id=entity_id, content=content, source=source + ) + context.content = content + context.source = source + return contextThis keeps the “always return Context” behavior while removing the race and spurious timestamp updates.
19-21
: Upgradeentity_id
to PositiveBigIntegerField to match 64-bit primary keysWe’ve confirmed that your project’s
DEFAULT_AUTO_FIELD
is set toBigAutoField
inbackend/settings/base.py:194
. SinceGenericForeignKey
may point to any model using the default auto field, you should update theentity_id
field toPositiveBigIntegerField
to avoid potential overflows.• File:
backend/apps/ai/models/context.py
, Lines 19–21- entity_id = models.PositiveIntegerField() + entity_id = models.PositiveBigIntegerField()• After changing the field type, generate and apply a Django migration to alter the column in the database.
backend/tests/apps/ai/agent/tools/rag/retriever_test.py (2)
19-20
: Patch where the symbol is used: target apps.ai.agent.tools.rag.retriever.openai.OpenAIPatching the top-level openai.OpenAI is brittle and can leak across modules. Patch the symbol in the module under test.
- patch("openai.OpenAI") as mock_openai, + patch("apps.ai.agent.tools.rag.retriever.openai.OpenAI") as mock_openai,Apply the same change to all occurrences listed in the line ranges.
Also applies to: 45-46, 54-55, 74-75, 89-90, 104-105, 118-120, 135-136, 151-153, 171-172, 451-454, 483-485, 534-536, 567-569
70-84
: Make exception type resilient across OpenAI SDK versionsopenai.OpenAIError may not exist in some SDK versions. Use the exception from the retriever module’s namespace and assert accordingly.
- with ( - patch.dict(os.environ, {"DJANGO_OPEN_AI_SECRET_KEY": "test-key"}), - patch("openai.OpenAI") as mock_openai, - ): + from apps.ai.agent.tools.rag import retriever as retriever_mod + with ( + patch.dict(os.environ, {"DJANGO_OPEN_AI_SECRET_KEY": "test-key"}), + patch("apps.ai.agent.tools.rag.retriever.openai.OpenAI") as mock_openai, + ): mock_client = MagicMock() - mock_client.embeddings.create.side_effect = openai.OpenAIError("API Error") + Exc = getattr(retriever_mod.openai, "OpenAIError", Exception) + mock_client.embeddings.create.side_effect = Exc("API Error") mock_openai.return_value = mock_client retriever = Retriever() - with pytest.raises(openai.OpenAIError): + with pytest.raises(Exc): retriever.get_query_embedding("test query")Repeat similarly for any other tests that rely on OpenAIError.
backend/apps/ai/common/utils.py (3)
39-41
: Docstring claims ValueError is raised, but code does not raise itEither enforce a ValueError when context is invalid, or align the docstring with current behavior (log-and-return-empty on error). Given current tests, remove the incorrect Raises section.
- Raises: - ValueError: If context is None or invalid -
43-44
: Remove duplicate import of Chunk inside the functionChunk is already imported at module scope. The inner import is leftover and unnecessary.
- from apps.ai.models.chunk import Chunk
91-99
: Log success only when chunks were actually (re)createdCurrently logs success unconditionally. Use the return value to decide.
- create_chunks_and_embeddings( + new_chunks = create_chunks_and_embeddings( chunk_texts=new_chunk_texts, context=context, openai_client=openai_client, save=True, ) - logger.info("Successfully completed chunk regeneration for new context") + if not new_chunks: + logger.error( + "Chunk regeneration failed or produced no chunks for context id=%s", + getattr(context, "id", None), + ) + return + logger.info( + "Successfully completed chunk regeneration for context id=%s (%d chunks)", + getattr(context, "id", None), + len(new_chunks), + )backend/tests/apps/ai/common/utils_test.py (1)
117-121
: Rename to reflect actual assertion (sleep is NOT called here)The name/docstring say “sleep called” but the test asserts not_called.
- def test_create_chunks_and_embeddings_sleep_called( + def test_create_chunks_and_embeddings_no_sleep_on_first_call( self, mock_datetime, mock_sleep, mock_update_data ): - """Tests that sleep is called when needed.""" + """Tests that sleep is not called on the first request."""backend/apps/ai/agent/tools/rag/retriever.py (1)
151-158
: Normalize leaders to a JSON-serializable shape (avoid M2M manager)For committees, leaders currently reads the relation (likely a manager/QuerySet), which won’t serialize cleanly and is inconsistent with leaders_raw used elsewhere.
- "leaders": getattr(entity, "leaders", []), + "leaders": getattr(entity, "leaders_raw", []),If you must use the relation, convert explicitly, e.g., [u.login for u in entity.leaders.all()].
🧹 Nitpick comments (9)
backend/apps/ai/models/context.py (2)
23-27
: Prefer UniqueConstraint over unique_together (modern Django API)unique_together is legacy. Use UniqueConstraint for clarity and flexibility (e.g., conditionals, names).
class Meta: db_table = "ai_contexts" verbose_name = "Context" - unique_together = ("entity_type", "entity_id") + constraints = [ + models.UniqueConstraint( + fields=("entity_type", "entity_id"), + name="uniq_ai_context_entity", + ), + ]
28-35
: Minor: friendlier fallback in str when entity is missingIf the entity was deleted, str(None) renders “None”, which can be confusing in admin logs. Consider a clearer placeholder.
- entity = ( + entity = ( getattr(self.entity, "name", None) or getattr(self.entity, "key", None) - or str(self.entity) + or ("<deleted>" if self.entity is None else str(self.entity)) )backend/apps/ai/common/utils.py (3)
45-53
: Rate limiting logic has no state; time_since_last_request is constantYou re-derive “last request time” from now, so time_since_last_request is always DEFAULT_LAST_REQUEST_OFFSET_SECONDS, not the real elapsed time. Consider tracking last_request_at across calls.
Example approach:
- Add a module-global last_request_at: datetime | None = None
- Before calling the API, compare now - last_request_at to MIN_REQUEST_INTERVAL_SECONDS
- Update last_request_at = now after the request
I can draft a patch if you want to persist this state here or behind a small helper.
66-68
: Catch ValueError from zip(strict=True) length mismatchesIf the embeddings response length differs, zip(..., strict=True) raises ValueError. Include it in the handled exceptions.
- except (OpenAIError, AttributeError, TypeError): + except (OpenAIError, AttributeError, TypeError, ValueError): logger.exception("Failed to create chunks and embeddings") return []Optionally, validate len(response.data) == len(chunk_texts) and log a clearer error before zipping.
60-65
: Optional: deduplicate chunk_texts before embedding to reduce costSending duplicate texts increases API usage and can hit uniqueness on (context, text). You can preserve order with a simple de-dup.
# Before calling embeddings.create(...) seen = set() unique_texts = [] for t in chunk_texts: if t not in seen: unique_texts.append(t) seen.add(t) chunk_texts = unique_textsbackend/tests/apps/ai/common/utils_test.py (2)
82-92
: Use the exception class from the module under testTo avoid SDK-version drift, set side_effect to the OpenAIError imported by utils, not openai.OpenAIError.
- mock_openai_client.embeddings.create.side_effect = openai.OpenAIError( - "API connection failed" - ) + from apps.ai.common import utils as utils_mod + mock_openai_client.embeddings.create.side_effect = utils_mod.OpenAIError( + "API connection failed" + )
263-281
: Stabilize success-path assertion by returning a non-empty listWith the proposed utils change to log success only when chunks are created, ensure the mocked helper returns a truthy, non-empty list.
mock_openai_client = MagicMock() mock_openai_class.return_value = mock_openai_client + mock_create_chunks.return_value = [MagicMock()] regenerate_chunks_for_context(context) mock_existing_chunks.all.assert_called_once() mock_existing_chunks.all().delete.assert_called_once()
Repeat similarly in test_regenerate_chunks_for_context_no_existing_chunks if you want to assert the success log path.
backend/apps/ai/agent/tools/rag/retriever.py (2)
72-77
: Nit: docstring still says “content object”; it’s actually an entityMinor naming mismatch; update for clarity.
- def get_source_name(self, entity) -> str: - """Get the name/identifier for the content object.""" + def get_source_name(self, entity) -> str: + """Get the name/identifier for the entity."""
223-224
: Performance nit: also select_related("context")You already select_related("context__entity_type"). Adding "context" avoids an extra fetch for the FK itself.
- chunks = queryset.select_related("context__entity_type").order_by("-similarity")[:limit] + chunks = queryset.select_related("context", "context__entity_type").order_by("-similarity")[:limit]
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
backend/apps/ai/agent/tools/rag/retriever.py
(4 hunks)backend/apps/ai/common/utils.py
(2 hunks)backend/apps/ai/models/context.py
(1 hunks)backend/tests/apps/ai/agent/tools/rag/retriever_test.py
(1 hunks)backend/tests/apps/ai/common/utils_test.py
(4 hunks)backend/tests/apps/ai/models/context_test.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/tests/apps/ai/models/context_test.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-16T12:07:38.124Z
Learnt from: Dishant1804
PR: OWASP/Nest#1891
File: backend/apps/ai/models/context.py:44-66
Timestamp: 2025-08-16T12:07:38.124Z
Learning: In the Context.update_data method, the Context object should always be returned (never None), even when no changes are made to the content or source fields. This ensures consistent behavior for callers.
Applied to files:
backend/apps/ai/models/context.py
📚 Learning: 2025-07-28T14:51:14.736Z
Learnt from: adithya-naik
PR: OWASP/Nest#1894
File: frontend/src/components/TopContributorsList.tsx:74-74
Timestamp: 2025-07-28T14:51:14.736Z
Learning: In the OWASP/Nest project, the maintainer adithya-naik prefers not to create separate components for code that's only used in two specific cases, following the YAGNI principle to avoid over-engineering when the duplication is limited and manageable.
Applied to files:
backend/apps/ai/models/context.py
🧬 Code graph analysis (5)
backend/apps/ai/common/utils.py (2)
backend/apps/ai/models/chunk.py (3)
Chunk
(12-76)update_data
(45-76)split_text
(35-42)backend/apps/ai/models/context.py (2)
Context
(14-60)update_data
(38-60)
backend/apps/ai/models/context.py (3)
backend/apps/common/models.py (1)
TimestampedModel
(37-46)backend/apps/owasp/admin/entity_member.py (1)
entity
(61-74)backend/apps/ai/models/chunk.py (2)
Meta
(15-18)update_data
(45-76)
backend/tests/apps/ai/agent/tools/rag/retriever_test.py (1)
backend/apps/ai/agent/tools/rag/retriever.py (6)
Retriever
(21-268)get_query_embedding
(49-70)get_source_name
(72-77)get_additional_context
(79-182)extract_content_types_from_query
(247-268)retrieve
(184-245)
backend/tests/apps/ai/common/utils_test.py (1)
backend/apps/ai/common/utils.py (2)
create_chunks_and_embeddings
(19-70)regenerate_chunks_for_context
(73-98)
backend/apps/ai/agent/tools/rag/retriever.py (1)
backend/apps/owasp/admin/entity_member.py (1)
entity
(61-74)
🔇 Additional comments (2)
backend/apps/ai/models/context.py (1)
17-21
: Confirm “optional GFK” requirement vs current required fieldsThe PR/issue description mentions an optional GenericForeignKey on Context. Here both entity_type and entity_id are required. If contexts without an entity are needed (e.g., global/system contexts), make these fields nullable and validate the pair (both set or both null).
If optional linkage is intended:
- entity_type = models.ForeignKey(ContentType, on_delete=models.CASCADE) - entity_id = models.PositiveIntegerField() + entity_type = models.ForeignKey(ContentType, on_delete=models.CASCADE, null=True, blank=True) + entity_id = models.PositiveBigIntegerField(null=True, blank=True)Add a clean() or DB constraint to ensure both-null or both-set for integrity. If the “required” design is intentional, please update the docs/issue to reflect that.
backend/tests/apps/ai/common/utils_test.py (1)
96-113
: Docstring says “failure path when context is None” but behavior is log-and-emptyThis matches current behavior. If you adopt an explicit ValueError in utils (alternative path), update this test accordingly. Otherwise no action needed.
@@ -60,6 +60,3 @@ def test_complete_general_exception(self, mock_openai, mock_logger, openai_insta | |||
response = openai_instance.complete() | |||
|
|||
assert response is None | |||
mock_logger.exception.assert_called_once_with( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I acidentally removed it while I was testing other assertions -- I have added it back
result = str(context) | ||
assert result == ( | ||
"test_model Test Object: This is test content that is longer than 50 charac" | ||
# cspell:ignore charac |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not something we normally do. Use truncate
for Context::str instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made the changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
backend/apps/ai/models/context.py (1)
38-61
: Make update_data concurrency-safe and avoid unnecessary timestamp bumps.The current get/except-create pattern can race under concurrent calls and always saves even when nothing changed, unnecessarily bumping nest_updated_at (hurting “chunks stale if context updated” logic). Use get_or_create for atomicity and only persist when content/source actually change. This keeps the “always return Context” behavior (per the retrieved learning).
@@ @staticmethod def update_data( - content: str, - entity, + content: str, + entity: models.Model, source: str = "", *, save: bool = True, ) -> "Context": - """Create or update context for a given entity.""" - entity_type = ContentType.objects.get_for_model(entity) - entity_id = entity.pk - - try: - context = Context.objects.get(entity_type=entity_type, entity_id=entity_id) - except Context.DoesNotExist: - context = Context(entity_type=entity_type, entity_id=entity_id) - - context.content = content - context.source = source - - if save: - context.save() - - return context + """Create or update context for a given entity. + Always returns a Context instance (never None).""" + entity_type = ContentType.objects.get_for_model(entity) + entity_id = entity.pk + + if save: + context, created = Context.objects.get_or_create( + entity_type=entity_type, + entity_id=entity_id, + defaults={"content": content, "source": source}, + ) + if not created and (context.content != content or context.source != source): + context.content = content + context.source = source + # Include nest_updated_at to ensure the timestamp is persisted. + context.save(update_fields=["content", "source", "nest_updated_at"]) + return context + else: + # Support bulk-save flows by returning an unsaved instance when missing. + try: + context = Context.objects.get(entity_type=entity_type, entity_id=entity_id) + except Context.DoesNotExist: + return Context( + entity_type=entity_type, + entity_id=entity_id, + content=content, + source=source, + ) + # Update in-memory only (caller decides when to save). + context.content = content + context.source = source + return contextbackend/tests/apps/ai/models/context_test.py (1)
297-336
: Update “new context with save=True” test to patch get_or_create.This mirrors the creation path after the refactor.
- @patch("apps.ai.models.context.Context.objects.get") - @patch("apps.ai.models.context.Context.__init__") - def test_update_data_new_context_with_save(self, mock_init, mock_get): + @patch("apps.ai.models.context.Context.objects.get_or_create") + def test_update_data_new_context_with_save(self, mock_get_or_create): """Test update_data creating a new context with save=True.""" from apps.ai.models.context import Context as ContextModel - mock_get.side_effect = ContextModel.DoesNotExist - mock_init.return_value = None # __init__ should return None + new_ctx = create_model_mock(ContextModel) + mock_get_or_create.return_value = (new_ctx, True) @@ - mock_get_for_model.assert_called_once_with(mock_content_object) - mock_get.assert_called_once_with( - entity_type=mock_content_type, - entity_id=1, - ) - mock_init.assert_called_once_with( - entity_type=mock_content_type, - entity_id=1, - ) - assert result.content == content - assert result.source == source - mock_save.assert_called_once() + mock_get_for_model.assert_called_once_with(mock_content_object) + mock_get_or_create.assert_called_once_with( + entity_type=mock_content_type, + entity_id=1, + defaults={"content": content, "source": source}, + ) + assert result == new_ctx
🧹 Nitpick comments (8)
backend/apps/ai/models/context.py (4)
29-36
: Harden str when the related entity is missing.If the target object is deleted, self.entity may be None and str(self.entity) yields "None". Fall back to entity_id for a clearer label.
- entity = ( - getattr(self.entity, "name", None) - or getattr(self.entity, "key", None) - or str(self.entity) - ) - return f"{self.entity_type.model} {entity}: {truncate(self.content, 50)}" + obj = self.entity + label = ( + getattr(obj, "name", None) + or getattr(obj, "key", None) + or (str(obj) if obj is not None else str(self.entity_id)) + ) + return f"{self.entity_type.model} {label}: {truncate(self.content, 50)}"
24-28
: Optional: prefer UniqueConstraint over unique_together (named, modern style).Not urgent, but UniqueConstraint provides naming and advanced options.
class Meta: db_table = "ai_contexts" verbose_name = "Context" - unique_together = ("entity_type", "entity_id") + constraints = [ + models.UniqueConstraint( + fields=("entity_type", "entity_id"), + name="uq_ai_context_entity", + ) + ]
3-3
: Remove unused logger.logging and logger are unused in this module.
-import logging @@ -logger = logging.getLogger(__name__)Also applies to: 12-12
20-21
: Use PositiveBigIntegerField for entity_id to match BigAutoField default PKConfirmed that DEFAULT_AUTO_FIELD is set to
django.db.models.BigAutoField
in backend/settings/base.py:194, and most apps (and their migrations) useBigAutoField
for primary keys. To avoid potential overflow when referencing large PKs via the GenericForeignKey, switchentity_id
to a 64-bit field.• File to update:
backend/apps/ai/models/context.py
- entity_id = models.PositiveIntegerField() + entity_id = models.PositiveBigIntegerField()• After applying, run
./manage.py makemigrations ai
to generate anAlterField
migration.
• Review any tests or fixtures that set or compareentity_id
and update them if they assume a 32-bit range.backend/tests/apps/ai/models/context_test.py (4)
66-78
: Make manager.create test reflect required fields.Context requires entity_type and entity_id; the current test calls create without them (only works because of patching). Include those args for realism.
- @patch("apps.ai.models.context.Context.objects.create") + @patch("apps.ai.models.context.Context.objects.create") def test_context_manager_create(self, mock_create): mock_context = create_model_mock(Context) mock_create.return_value = mock_context content = "Test text" source = "test_source" - result = Context.objects.create(content=content, source=source) + mock_ct = Mock(spec=ContentType) + result = Context.objects.create( + content=content, + source=source, + entity_type=mock_ct, + entity_id=1, + ) - mock_create.assert_called_once_with(content=content, source=source) + mock_create.assert_called_once_with( + content=content, + source=source, + entity_type=mock_ct, + entity_id=1, + ) assert result == mock_context
233-264
: Adjust str expectation when entity is missing.With the hardened str, it falls back to entity_id instead of “Unknown”.
@@ - result = str(context) - assert result == "test_model Unknown: Another test content" + result = str(context) + assert result == "test_model 456: Another test content"
40-43
: Update field type assertion if switching to PositiveBigIntegerField.If you adopt the 64-bit entity_id change, update this assertion accordingly.
- assert field.__class__.__name__ == "PositiveIntegerField" + assert field.__class__.__name__ == "PositiveBigIntegerField"
158-168
: Add a no-op update test to prevent accidental timestamp bumps.Recommend adding a test ensuring update_data(save=True) does not call save() when values are unchanged (protects stale-chunk detection).
def test_update_data_noop_does_not_save(monkeypatch): from apps.ai.models.context import Context as ContextModel mock_entity = Mock() mock_entity.pk = 1 mock_ct = Mock(spec=ContentType) mock_ct.get_source_expressions = Mock(return_value=[]) # Existing instance with given values ctx = create_model_mock(ContextModel) ctx.content = "same" ctx.source = "src" with patch("apps.ai.models.context.ContentType.objects.get_for_model", return_value=mock_ct), \ patch("apps.ai.models.context.Context.objects.get_or_create", return_value=(ctx, False)) as goc, \ patch.object(ContextModel, "save") as mock_save: result = Context.update_data("same", mock_entity, source="src", save=True) goc.assert_called_once() mock_save.assert_not_called() assert result is ctx
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
backend/apps/ai/models/context.py
(1 hunks)backend/tests/apps/ai/models/context_test.py
(1 hunks)backend/tests/apps/common/open_ai_test.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/tests/apps/common/open_ai_test.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-16T12:07:38.124Z
Learnt from: Dishant1804
PR: OWASP/Nest#1891
File: backend/apps/ai/models/context.py:44-66
Timestamp: 2025-08-16T12:07:38.124Z
Learning: In the Context.update_data method, the Context object should always be returned (never None), even when no changes are made to the content or source fields. This ensures consistent behavior for callers.
Applied to files:
backend/apps/ai/models/context.py
backend/tests/apps/ai/models/context_test.py
📚 Learning: 2025-07-28T14:51:14.736Z
Learnt from: adithya-naik
PR: OWASP/Nest#1894
File: frontend/src/components/TopContributorsList.tsx:74-74
Timestamp: 2025-07-28T14:51:14.736Z
Learning: In the OWASP/Nest project, the maintainer adithya-naik prefers not to create separate components for code that's only used in two specific cases, following the YAGNI principle to avoid over-engineering when the duplication is limited and manageable.
Applied to files:
backend/apps/ai/models/context.py
🧬 Code graph analysis (2)
backend/apps/ai/models/context.py (3)
backend/apps/common/models.py (1)
TimestampedModel
(37-46)backend/apps/common/utils.py (1)
truncate
(164-176)backend/apps/ai/models/chunk.py (2)
Meta
(15-18)update_data
(45-76)
backend/tests/apps/ai/models/context_test.py (3)
backend/apps/ai/models/context.py (2)
Context
(15-61)update_data
(39-61)backend/apps/common/models.py (1)
TimestampedModel
(37-46)backend/apps/common/utils.py (1)
truncate
(164-176)
@patch("apps.ai.models.context.Context.objects.get") | ||
def test_update_data_existing_context(self, mock_get): | ||
mock_context = create_model_mock(Context) | ||
mock_get.return_value = mock_context | ||
|
||
content = "Test" | ||
mock_content_object = Mock() | ||
mock_content_object.pk = 1 | ||
|
||
with patch( | ||
"apps.ai.models.context.ContentType.objects.get_for_model" | ||
) as mock_get_for_model: | ||
mock_content_type = Mock() | ||
mock_content_type.get_source_expressions = Mock(return_value=[]) | ||
mock_get_for_model.return_value = mock_content_type | ||
|
||
result = Context.update_data(content, mock_content_object, source="src", save=True) | ||
|
||
mock_get_for_model.assert_called_once_with(mock_content_object) | ||
mock_get.assert_called_once_with( | ||
entity_type=mock_content_type, | ||
entity_id=1, | ||
) | ||
assert result == mock_context | ||
assert mock_context.content == content | ||
assert mock_context.source == "src" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Align existing-context test with get_or_create path.
If update_data switches to get_or_create, this test should patch that and simulate created=False.
- @patch("apps.ai.models.context.Context.objects.get")
- def test_update_data_existing_context(self, mock_get):
+ @patch("apps.ai.models.context.Context.objects.get_or_create")
+ def test_update_data_existing_context(self, mock_get_or_create):
mock_context = create_model_mock(Context)
- mock_get.return_value = mock_context
+ mock_get_or_create.return_value = (mock_context, False)
@@
- mock_get_for_model.assert_called_once_with(mock_content_object)
- mock_get.assert_called_once_with(
- entity_type=mock_content_type,
- entity_id=1,
- )
+ mock_get_for_model.assert_called_once_with(mock_content_object)
+ mock_get_or_create.assert_called_once_with(
+ entity_type=mock_content_type,
+ entity_id=1,
+ defaults={"content": content, "source": "src"},
+ )
assert result == mock_context
assert mock_context.content == content
assert mock_context.source == "src"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@patch("apps.ai.models.context.Context.objects.get") | |
def test_update_data_existing_context(self, mock_get): | |
mock_context = create_model_mock(Context) | |
mock_get.return_value = mock_context | |
content = "Test" | |
mock_content_object = Mock() | |
mock_content_object.pk = 1 | |
with patch( | |
"apps.ai.models.context.ContentType.objects.get_for_model" | |
) as mock_get_for_model: | |
mock_content_type = Mock() | |
mock_content_type.get_source_expressions = Mock(return_value=[]) | |
mock_get_for_model.return_value = mock_content_type | |
result = Context.update_data(content, mock_content_object, source="src", save=True) | |
mock_get_for_model.assert_called_once_with(mock_content_object) | |
mock_get.assert_called_once_with( | |
entity_type=mock_content_type, | |
entity_id=1, | |
) | |
assert result == mock_context | |
assert mock_context.content == content | |
assert mock_context.source == "src" | |
@patch("apps.ai.models.context.Context.objects.get_or_create") | |
def test_update_data_existing_context(self, mock_get_or_create): | |
mock_context = create_model_mock(Context) | |
mock_get_or_create.return_value = (mock_context, False) | |
content = "Test" | |
mock_content_object = Mock() | |
mock_content_object.pk = 1 | |
with patch( | |
"apps.ai.models.context.ContentType.objects.get_for_model" | |
) as mock_get_for_model: | |
mock_content_type = Mock() | |
mock_content_type.get_source_expressions = Mock(return_value=[]) | |
mock_get_for_model.return_value = mock_content_type | |
result = Context.update_data(content, mock_content_object, source="src", save=True) | |
mock_get_for_model.assert_called_once_with(mock_content_object) | |
mock_get_or_create.assert_called_once_with( | |
entity_type=mock_content_type, | |
entity_id=1, | |
defaults={"content": content, "source": "src"}, | |
) | |
assert result == mock_context | |
assert mock_context.content == content | |
assert mock_context.source == "src" |
🤖 Prompt for AI Agents
In backend/tests/apps/ai/models/context_test.py around lines 142 to 168, the
test currently patches Context.objects.get but update_data now uses
get_or_create; change the test to patch Context.objects.get_or_create instead,
have it return (mock_context, False) to simulate existing context, and assert
get_for_model and get_or_create were called with the correct args
(entity_type=mock_content_type, entity_id=1); then verify the returned object is
mock_context, its content and source fields were updated, and that save was
invoked when save=True.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
backend/tests/apps/ai/models/context_test.py (2)
298-337
: Align “new context + save=True” test with a get_or_create-based implementation.If
Context.update_data
is (or will be) implemented withget_or_create
, patchingContext.__init__
andobjects.get
becomes brittle. Prefer patchingobjects.get_or_create
and assertingdefaults
carrycontent
andsource
. This also models the concurrency-safe path.Proposed rewrite:
-@patch("apps.ai.models.context.Context.objects.get") -@patch("apps.ai.models.context.Context.__init__") -def test_update_data_new_context_with_save(self, mock_init, mock_get): - """Test update_data creating a new context with save=True.""" - from apps.ai.models.context import Context as ContextModel - - mock_get.side_effect = ContextModel.DoesNotExist - mock_init.return_value = None # __init__ should return None +@patch("apps.ai.models.context.Context.objects.get_or_create") +def test_update_data_new_context_with_save(self, mock_get_or_create): + """Test update_data creating a new context with save=True.""" + from apps.ai.models.context import Context as ContextModel + new_ctx = create_model_mock(ContextModel) + mock_get_or_create.return_value = (new_ctx, True) @@ - with patch( - "apps.ai.models.context.ContentType.objects.get_for_model" - ) as mock_get_for_model: + with patch("apps.ai.models.context.ContentType.objects.get_for_model") as mock_get_for_model: mock_content_type = Mock(spec=ContentType) - mock_content_type.get_source_expressions = Mock(return_value=[]) mock_get_for_model.return_value = mock_content_type @@ - # Mock the context instance and its save method - with patch.object(ContextModel, "save") as mock_save: - result = Context.update_data( - content, mock_content_object, source=source, save=True - ) - - mock_get_for_model.assert_called_once_with(mock_content_object) - mock_get.assert_called_once_with( - entity_type=mock_content_type, - entity_id=1, - ) - mock_init.assert_called_once_with( - entity_type=mock_content_type, - entity_id=1, - ) - assert result.content == content - assert result.source == source - mock_save.assert_called_once() + result = Context.update_data(content, mock_content_object, source=source, save=True) + mock_get_for_model.assert_called_once_with(mock_content_object) + mock_get_or_create.assert_called_once_with( + entity_type=mock_content_type, + entity_id=1, + defaults={"content": content, "source": source}, + ) + assert result == new_ctxTo verify which implementation the code currently uses, run:
#!/bin/bash # Inspect update_data implementation in this branch rg -nC2 -P 'class\s+Context\b.*?@staticmethod.*?def\s+update_data|get_or_create|objects\.get\(' --type=py apps/ai/models/context.py
338-377
: Do the same for the “new context + save=False” variant.Mirror the
get_or_create
approach and assert thatsave
is not forced whensave=False
.-@patch("apps.ai.models.context.Context.objects.get") -@patch("apps.ai.models.context.Context.__init__") -def test_update_data_new_context_without_save(self, mock_init, mock_get): +@patch("apps.ai.models.context.Context.objects.get_or_create") +def test_update_data_new_context_without_save(self, mock_get_or_create): @@ - mock_get.side_effect = ContextModel.DoesNotExist - mock_init.return_value = None # __init__ should return None + new_ctx = create_model_mock(ContextModel) + mock_get_or_create.return_value = (new_ctx, True) @@ - with patch( - "apps.ai.models.context.ContentType.objects.get_for_model" - ) as mock_get_for_model: + with patch("apps.ai.models.context.ContentType.objects.get_for_model") as mock_get_for_model: mock_content_type = Mock(spec=ContentType) - mock_content_type.get_source_expressions = Mock(return_value=[]) mock_get_for_model.return_value = mock_content_type @@ - # Mock the context instance and its save method - with patch.object(ContextModel, "save") as mock_save: - result = Context.update_data( - content, mock_content_object, source=source, save=False - ) - - mock_get_for_model.assert_called_once_with(mock_content_object) - mock_get.assert_called_once_with( - entity_type=mock_content_type, - entity_id=1, - ) - mock_init.assert_called_once_with( - entity_type=mock_content_type, - entity_id=1, - ) - assert result.content == content - assert result.source == source - mock_save.assert_not_called() + result = Context.update_data(content, mock_content_object, source=source, save=False) + mock_get_for_model.assert_called_once_with(mock_content_object) + mock_get_or_create.assert_called_once_with( + entity_type=mock_content_type, + entity_id=1, + defaults={"content": content, "source": source}, + ) + assert result == new_ctx
🧹 Nitpick comments (9)
backend/tests/apps/ai/models/context_test.py (9)
12-19
: Trim unused test double surface; tighten mock spec.
create_model_mock
sets achunks
relation that isn’t used anywhere in this module. Extra attributes on specs make tests harder to maintain and can hide accidental usage. Also consider autospeccing to catch typos at call sites.Apply this diff to simplify:
def create_model_mock(model_class): - mock = Mock(spec=model_class) + mock = Mock(spec=model_class) mock._state = Mock() mock.pk = 1 mock.id = 1 - mock.chunks = Mock() - mock.chunks.count.return_value = 0 return mock
32-38
: Assert FK target model explicitly to harden the test.You already verify
on_delete=CASCADE
. Add an assertion that the FK actually points toContentType
; this catches accidental regresses to a different model.def test_content_type_field_properties(self): field = Context._meta.get_field("entity_type") assert field.null is False assert field.blank is False assert hasattr(field, "remote_field") assert field.remote_field.on_delete.__name__ == "CASCADE" + assert field.remote_field.model is ContentType
49-61
: Avoid patching model init; this test only exercises Django internals.Patching
Context.__init__
is brittle and doesn’t validate our code paths. If we keep this test, it should focus onContext.update_data(save=True)
callingsave()
. Otherwise, it’s redundant with the dedicatedupdate_data
tests below.Two options (pick one):
- Remove the test entirely (preferred for noise reduction), or
- Rewrite to cover
update_data(save=True)
instead ofContext.save()
in isolation.Removal diff:
-@patch("apps.ai.models.context.Context.save") -@patch("apps.ai.models.context.Context.__init__") -def test_context_creation_with_save(self, mock_init, mock_save): - mock_init.return_value = None - - content = "Test generated text" - source = "test_source" - - context = Context(content=content, source=source) - context.save() - - mock_save.assert_called_once()
65-97
: Manager call-through tests add little value; consider trimming or upgrading to integration.Asserting
objects.create/filter/get
invocations mostly re-tests Django. Either:
- Convert to behavior-oriented tests that exercise your code paths which use these managers, or
- Mark as Django DB tests and assert real persistence semantics, or
- Remove to keep the suite lean.
98-121
: Test field-level validation instead of patching full_clean.Mocking
full_clean
verifies nothing about validation. You can avoid DB I/O and still assert constraints using the field’sclean
. This keeps the test fast and meaningful.-@patch("apps.ai.models.context.Context.full_clean") -def test_context_validation(self, mock_full_clean): - context = Context() - context.content = "Valid text" - context.source = "A" * 100 - - context.full_clean() - - mock_full_clean.assert_called_once() +def test_context_validation(self): + # Field-level validation should allow max length 100 without DB access. + context = Context(content="Valid text", source="A" * 100) + Context._meta.get_field("source").clean(context.source, context) -@patch("apps.ai.models.context.Context.full_clean") -def test_context_validation_source_too_long(self, mock_full_clean): - from django.core.exceptions import ValidationError - - mock_full_clean.side_effect = ValidationError("Source too long") - - context = Context() - context.content = "Valid text" - context.source = "A" * 101 - - with pytest.raises(ValidationError) as exc_info: - context.full_clean() - assert "Source too long" in str(exc_info.value) +def test_context_validation_source_too_long(self): + from django.core.exceptions import ValidationError + context = Context(content="Valid text") + with pytest.raises(ValidationError) as exc_info: + Context._meta.get_field("source").clean("A" * 101, context) + # Error wording may vary by Django version; keep assertion resilient. + assert "at most" in str(exc_info.value)
127-140
: Drop refresh_from_db/delete stubs; they don’t exercise our code.These tests only assert that patched Django methods were called, which is tautological. If you need them, convert to higher-level behavior where these calls are side effects of our logic.
-@patch("apps.ai.models.context.Context.refresh_from_db") -def test_context_refresh_from_db(self, mock_refresh): - context = Context() - context.refresh_from_db() - - mock_refresh.assert_called_once() - -@patch("apps.ai.models.context.Context.delete") -def test_context_delete(self, mock_delete): - context = Context() - context.delete() - - mock_delete.assert_called_once()
141-167
: Also assert save was invoked on existing-context update with save=True.
update_data(..., save=True)
should persist changes even for existing contexts. Sincemock_context
is a spec’d model, you can verifysave()
was called.assert result == mock_context assert mock_context.content == content assert mock_context.source == "src" + mock_context.save.assert_called_once()
141-167
: Add a no-op-change test to codify the “always return a Context” guarantee.Based on the retrieved learnings,
update_data
should always return aContext
, even whencontent
/source
don’t change. Add a test to pin this behavior.Example test (place near the other
update_data
tests):def test_update_data_no_changes_returns_context(self): mock_context = create_model_mock(Context) mock_context.content = "same" mock_context.source = "src" with patch("apps.ai.models.context.ContentType.objects.get_for_model") as mock_get_for_model, \ patch("apps.ai.models.context.Context.objects.get") as mock_get: mock_ct = Mock(spec=ContentType) mock_get_for_model.return_value = mock_ct mock_get.return_value = mock_context result = Context.update_data("same", Mock(pk=1), source="src", save=False) assert result is mock_contextI can push a commit with this (and the other tweaks) if you want me to.
Also applies to: 298-377
168-202
: Improve test resilience around truncation helper semanticsThe test for
Context.__str__
currently asserts a hardcoded mid-word truncation ("50 cha..."
), which will break iftruncate
ever switches to word boundaries or changes its ellipsis. To make this more robust, consider one of the following approaches:• Use prefix/suffix checks plus a sanity substring check:
assert result.startswith("test_model Test Object: ") assert result.endswith("...") assert "This is test content" in result # sanity check on content presence• Or, for exact matching, compute the expectation with the same helper:
from apps.common.utils import truncate expected = f"test_model Test Object: {truncate(context.content, 50)}" assert result == expectedNote:
truncate
is imported fromapps.common.utils
(seebackend/apps/common/utils.py
at line 164).File: backend/tests/apps/ai/models/context_test.py: lines 168–202
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
backend/tests/apps/ai/models/context_test.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-16T12:07:38.124Z
Learnt from: Dishant1804
PR: OWASP/Nest#1891
File: backend/apps/ai/models/context.py:44-66
Timestamp: 2025-08-16T12:07:38.124Z
Learning: In the Context.update_data method, the Context object should always be returned (never None), even when no changes are made to the content or source fields. This ensures consistent behavior for callers.
Applied to files:
backend/tests/apps/ai/models/context_test.py
🧬 Code graph analysis (1)
backend/tests/apps/ai/models/context_test.py (2)
backend/apps/ai/models/context.py (2)
Context
(15-61)update_data
(39-61)backend/apps/common/models.py (1)
TimestampedModel
(37-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run backend tests
- GitHub Check: Run frontend unit tests
- GitHub Check: Run frontend e2e tests
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (1)
backend/tests/apps/ai/models/context_test.py (1)
23-31
: LGTM: Meta and content field checks are precise.The db_table/verbose_name assertions and TextField verification are appropriate and stable.
Proposed change
Resolves #1873
Checklist
make check-test
locally; all checks and tests passed.