Feature/file upload integration #62
base: main
Conversation
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
Pull request overview
Copilot reviewed 14 out of 15 changed files in this pull request and generated 28 comments.
Files not reviewed (1)
- frontend/package-lock.json: Language not supported
```python
file=file_content,
file_options={
    "content-type": file_type,
    "upsert": "false",  # Don't overwrite existing files
```
Copilot AI · Dec 1, 2025
The upsert option is set to "false" (string), but it should be a boolean value False. Check the Supabase Python SDK documentation to verify the correct type, as strings may not be properly interpreted.
| "upsert": "false", # Don't overwrite existing files | |
| "upsert": False, # Don't overwrite existing files |
```python
@router.get("/{document_id}/download")
async def get_document_download_url(document_id: str, expires_in: int = 3600):
```
Copilot AI · Dec 1, 2025
Missing validation for expires_in parameter. Negative or excessively large values could cause issues. Add validation to ensure it's within a reasonable range:
```python
if expires_in <= 0 or expires_in > 86400:  # Max 24 hours
    raise HTTPException(
        status_code=status.HTTP_400_BAD_REQUEST,
        detail="expires_in must be between 1 and 86400 seconds (24 hours)"
    )
```

```python
# Delete from Supabase Storage
try:
    await delete_file(document["storage_ref"])
except StorageError as e:
    logger.warning(
        f"Failed to delete from storage (continuing with DB delete): {e}"
    )

# Delete from database
delete_query = "DELETE FROM documents WHERE id = %s"
affected_rows = execute_statement(delete_query, (str(doc_uuid),))
```
Copilot AI · Dec 1, 2025
If storage deletion fails but database deletion succeeds, the document metadata is removed but the file remains in storage, creating an orphaned file. Consider one of the following:
- Failing the entire operation if storage deletion fails
- Implementing a background cleanup job to remove orphaned files
- Logging orphaned files for manual cleanup
The current approach (continuing after storage failure) creates data inconsistency.
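If the first option is preferred, a minimal sketch of what the delete flow could do instead of continuing (reusing `delete_file`, `StorageError`, and `execute_statement` from this PR; the 502 status choice and wording are assumptions):

```python
# Sketch: abort the document delete when storage cleanup fails, so the DB row
# is only removed after the file is actually gone. Status 502 is an assumption.
try:
    await delete_file(document["storage_ref"])
except StorageError as e:
    logger.error(f"Storage deletion failed for {document['storage_ref']}: {e}")
    raise HTTPException(
        status_code=status.HTTP_502_BAD_GATEWAY,
        detail="Failed to delete file from storage; document was not removed",
    )

# Only reached once the storage object has been removed.
delete_query = "DELETE FROM documents WHERE id = %s"
affected_rows = execute_statement(delete_query, (str(doc_uuid),))
```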
```js
if (downloadData?.download_url && selectedDocumentId) {
  window.open(downloadData.download_url, "_blank");
  setSelectedDocumentId(null); // Reset after opening
}
```
Copilot AI · Dec 1, 2025
This implementation causes the download URL to open in a new tab on every render when downloadData and selectedDocumentId are truthy. This should be inside a useEffect hook to prevent it from executing on every render cycle.
```js
useEffect(() => {
  if (downloadData?.download_url && selectedDocumentId) {
    window.open(downloadData.download_url, "_blank");
    setSelectedDocumentId(null);
  }
}, [downloadData, selectedDocumentId]);
```

Suggested change:

```diff
-if (downloadData?.download_url && selectedDocumentId) {
-  window.open(downloadData.download_url, "_blank");
-  setSelectedDocumentId(null); // Reset after opening
-}
+import { useEffect } from "react";
+
+useEffect(() => {
+  if (downloadData?.download_url && selectedDocumentId) {
+    window.open(downloadData.download_url, "_blank");
+    setSelectedDocumentId(null); // Reset after opening
+  }
+}, [downloadData, selectedDocumentId]);
```
```python
class DocumentUploadRequest(BaseModel):
    """Request model for file upload endpoint."""

    thread_id: UUID = Field(..., description="Thread to attach document to")
    permission: PermissionEnum = Field(
        default=PermissionEnum.PRIVATE, description="Access permission level"
    )
    metadata: Optional[Dict[str, Any]] = Field(
        default=None, description="Additional metadata"
    )
```
Copilot AI · Dec 1, 2025
DocumentUploadRequest model is defined but never used in the endpoints. The upload endpoint uses Form parameters directly instead of utilizing this Pydantic model. Consider either using the model for validation or removing it to reduce code clutter.
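If the model is kept, one hedged option (not part of this PR) is the common `as_form` dependency pattern, so the multipart endpoint still accepts Form fields but validation goes through the Pydantic model. The endpoint path, the `as_form` helper, and the omission of the `permission` field below are assumptions:

```python
# Sketch: route Form fields through DocumentUploadRequest via an as_form
# dependency. permission is omitted because only PermissionEnum.PRIVATE is
# visible in this diff; restore it in the real model.
import json
from typing import Any, Dict, Optional
from uuid import UUID

from fastapi import APIRouter, Depends, File, Form, UploadFile
from pydantic import BaseModel, Field

router = APIRouter()


class DocumentUploadRequest(BaseModel):
    """Request model for file upload endpoint."""

    thread_id: UUID = Field(..., description="Thread to attach document to")
    metadata: Optional[Dict[str, Any]] = Field(
        default=None, description="Additional metadata"
    )

    @classmethod
    def as_form(
        cls,
        thread_id: UUID = Form(...),
        metadata: Optional[str] = Form(None),  # JSON string in multipart forms
    ) -> "DocumentUploadRequest":
        return cls(
            thread_id=thread_id,
            metadata=json.loads(metadata) if metadata else None,
        )


@router.post("/upload")
async def upload_document(
    file: UploadFile = File(...),
    request: DocumentUploadRequest = Depends(DocumentUploadRequest.as_form),
):
    # request is now a validated DocumentUploadRequest; file handling as in the PR.
    ...
```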
```js
  alert("Files uploaded successfully!");
  refetchDocuments();
} catch (error) {
  console.error("Error uploading files:", error);
  alert("Error uploading files. Please try again.");
```
Copilot AI · Dec 1, 2025
[nitpick] Using alert() for error messages is not user-friendly and doesn't follow modern UI patterns. Consider using a proper toast notification system or error state display component instead.
```python
if not result:
    # Rollback storage upload if database insert fails
    try:
        await delete_file(storage_ref)
    except StorageError:
        logger.error(f"Failed to rollback storage upload: {storage_ref}")

    raise HTTPException(
        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
        detail="Failed to save document metadata",
    )
```
Copilot AI · Dec 1, 2025
Potential race condition: if the file is successfully uploaded to storage but the database insert fails, the rollback attempts to delete the file. However, if the deletion also fails, the file remains orphaned in storage with no database record. Consider implementing a cleanup job or improving the error handling to ensure consistency between storage and database.
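One hedged variant of the cleanup-job idea: when the rollback delete also fails, record the orphan somewhere a background job can find it. The `orphaned_files` table below is hypothetical and not part of this PR:

```python
# Sketch: persist orphaned storage refs for later cleanup instead of only
# logging them. The orphaned_files table is hypothetical.
if not result:
    try:
        await delete_file(storage_ref)
    except StorageError:
        logger.error(f"Failed to rollback storage upload: {storage_ref}")
        execute_statement(
            "INSERT INTO orphaned_files (storage_ref, created_at) VALUES (%s, NOW())",
            (storage_ref,),
        )

    raise HTTPException(
        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
        detail="Failed to save document metadata",
    )
```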
```python
query = """
    SELECT id, thread_id, uploader_id, file_name, file_type,
           file_size, storage_ref, indexed, permission,
           created_at, updated_at, metadata
    FROM documents
    WHERE 1=1
"""
count_query = "SELECT COUNT(*) as total FROM documents WHERE 1=1"
params = []

if thread_id:
    try:
        UUID(thread_id)
    except ValueError:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Invalid thread_id format",
        )
    query += " AND thread_id = %s"
    count_query += " AND thread_id = %s"
    params.append(thread_id)

if uploader_id:
    try:
        UUID(uploader_id)
    except ValueError:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Invalid uploader_id format",
        )
    query += " AND uploader_id = %s"
    count_query += " AND uploader_id = %s"
    params.append(uploader_id)

# Add pagination
offset = (page - 1) * per_page
query += " ORDER BY created_at DESC LIMIT %s OFFSET %s"

# Get total count
total_result = execute_query(
    count_query, tuple(params) if params else None, fetch_one=True
)
total = total_result["total"] if total_result else 0

# Get documents
params.extend([per_page, offset])
documents = execute_query(query, tuple(params))
```
Copilot AI · Dec 1, 2025
Missing database indexes for frequently queried columns. The `list_documents` endpoint filters by `thread_id` and `uploader_id` and orders by `created_at`. Without proper indexes, these queries will perform full table scans as the documents table grows. Ensure the database schema includes indexes on:
- `thread_id`
- `uploader_id`
- `created_at` (for sorting)
- composite indexes on `(thread_id, created_at)` and `(uploader_id, created_at)` for optimal query performance
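A hedged sketch of those indexes, kept as SQL strings in the same style the PR already uses for queries; index names and the `DESC` ordering are assumptions, and they should be applied through whatever migration tooling the project uses:

```python
# Illustrative index DDL for the documents table; names are assumptions.
DOCUMENT_INDEXES = [
    "CREATE INDEX IF NOT EXISTS idx_documents_thread_id ON documents (thread_id)",
    "CREATE INDEX IF NOT EXISTS idx_documents_uploader_id ON documents (uploader_id)",
    "CREATE INDEX IF NOT EXISTS idx_documents_created_at ON documents (created_at DESC)",
    "CREATE INDEX IF NOT EXISTS idx_documents_thread_created ON documents (thread_id, created_at DESC)",
    "CREATE INDEX IF NOT EXISTS idx_documents_uploader_created ON documents (uploader_id, created_at DESC)",
]
```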
```diff
@@ -0,0 +1,112 @@
+import { useQuery, useMutation, useQueryClient } from "@tanstack/react-query";
```
Copilot AI · Dec 1, 2025
Unused imports: `useMutation`, `useQueryClient`.
Suggested change:

```diff
-import { useQuery, useMutation, useQueryClient } from "@tanstack/react-query";
+import { useQuery } from "@tanstack/react-query";
```
```js
  uploadDocument,
  getDocument,
  getDocuments,
  getDocumentDownloadUrl,
  deleteDocument,
```
Copilot AI · Dec 1, 2025
Unused imports: `deleteDocument`, `uploadDocument`.
Suggested change:

```diff
-  uploadDocument,
   getDocument,
   getDocuments,
   getDocumentDownloadUrl,
-  deleteDocument,
```
📝 Description
This pull request introduces a new document upload and storage feature integrated with Supabase, along with related backend and frontend dependency updates. The main changes include new document models, Supabase storage service utilities, API endpoint registration for documents, configuration additions for file uploads, and required package updates.
Backend: Document Upload & Storage Integration
- New document models in `models/document.py` to support file upload, metadata, permissions, and responses for document operations.
- Supabase storage service utilities in `services/storage.py` for uploading, deleting, and generating URLs for files, including validation helpers for file type and size (a rough sketch of such helpers follows this list).
- Document API endpoint registration (`api/endpoints/__init__.py`, `api/routes.py`) and exported document models in `models/__init__.py`.
- Configuration additions in `core/config.py` for Supabase credentials, storage bucket, max file size, and allowed file types.
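As a rough illustration of the validation helpers mentioned above (the real implementations live in `services/storage.py` and read their limits from `core/config.py`; the names, types, and values below are assumptions):

```python
# Sketch of file validation helpers. ALLOWED_FILE_TYPES and MAX_FILE_SIZE_MB
# stand in for the config values added in this PR; exact names are assumptions.
ALLOWED_FILE_TYPES = {"application/pdf", "text/plain", "image/png", "image/jpeg"}
MAX_FILE_SIZE_MB = 10


def validate_file_type(content_type: str) -> bool:
    """Return True if the MIME type is on the configured allow-list."""
    return content_type in ALLOWED_FILE_TYPES


def validate_file_size(size_bytes: int) -> bool:
    """Return True if the file is non-empty and under the configured limit."""
    return 0 < size_bytes <= MAX_FILE_SIZE_MB * 1024 * 1024
```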
Backend: Dependency Updates
- Added `supabase`, `asyncpg`, `sqlalchemy[asyncio]`, `python-multipart`, and `aiofiles` to `requirements.txt`.

Frontend: Dependency Updates
- Added `@tanstack/react-query` for data fetching and updated `uuid` to version 13.0.0 for unique identifier generation in `package.json` and `package-lock.json`.

Test instruction
🎯 Type of Change
🧪 Testing
📋 Checklist
📸 Screenshots (if applicable)
N/A
🔗 Related Issues