-
Notifications
You must be signed in to change notification settings - Fork 24
Bug Fix: Inconsistent Field Naming - avatar vs imageUrl #89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
❌ Deploy Preview for splitwizer failed. Why did it fail? →
|
WalkthroughThe changes standardize the user profile image field across the backend, replacing all uses of Changes
Sequence Diagram(s)sequenceDiagram
participant Admin as Admin/Operator
participant Script as Migration Script
participant DB as MongoDB
Admin->>Script: Run migrate_avatar_to_imageurl.py
Script->>DB: Query users with avatar field
Script->>DB: For each user, check for imageUrl conflicts
Script->>DB: Prepare bulk updates (set imageUrl, unset avatar)
Script->>DB: Execute bulk update
Script->>DB: Log statistics and conflicts
Script->>Admin: Output summary and log file location
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~18 minutes Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
Possibly related PRs
Poem
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (4)
backend/scripts/backup_db.py (2)
29-32: Consider making backup directory configurable.The backup directory is hardcoded as "backups". Consider making this configurable via environment variable for flexibility in different deployment environments.
- backup_dir = "backups" + backup_dir = os.getenv("BACKUP_DIR", "backups")
53-53: Improve JSON serialization robustness.The
default=strparameter is good, but consider adding explicit handling for datetime objects and other common MongoDB types.- json.dump(documents, f, indent=2, default=str) + def json_serializer(obj): + if isinstance(obj, datetime): + return obj.isoformat() + return str(obj) + json.dump(documents, f, indent=2, default=json_serializer)backend/scripts/migrate_avatar_to_imageurl.py (2)
62-65: Consider using context manager for MongoDB connection.The MongoDB client connection is not explicitly closed, which could lead to connection leaks in case of errors or exceptions.
- # Connect to MongoDB - client = MongoClient(MONGODB_URL) - db = client[DATABASE_NAME] - users = db.users + # Connect to MongoDB + with MongoClient(MONGODB_URL) as client: + db = client[DATABASE_NAME] + users = db.users + # ... rest of the migration logic should be indented under this context
131-133: Consider the destructive nature of collection drop.The rollback function drops the entire users collection, which is a destructive operation. Consider adding a confirmation mechanism or additional safety checks, especially for production environments.
Consider implementing a less destructive rollback approach:
- Rename current collection as backup before dropping
- Add confirmation prompts for production environments
- Validate backup data integrity before proceeding with rollback
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
backend/app/auth/schemas.py(1 hunks)backend/app/auth/service.py(3 hunks)backend/app/groups/service.py(2 hunks)backend/app/user/service.py(1 hunks)backend/scripts/backup_db.py(1 hunks)backend/scripts/migrate_avatar_to_imageurl.py(1 hunks)backend/tests/auth/test_auth_routes.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
backend/**/*.py
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
Backend code must be implemented using FastAPI with Python 3.9+ and organized under the /backend/ directory.
Files:
backend/app/auth/service.pybackend/tests/auth/test_auth_routes.pybackend/app/user/service.pybackend/scripts/backup_db.pybackend/app/groups/service.pybackend/app/auth/schemas.pybackend/scripts/migrate_avatar_to_imageurl.py
backend/app/{auth,user,groups,expenses}/**/*.py
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
backend/app/{auth,user,groups,expenses}/**/*.py: Backend services should be modular, with authentication and user registration in app/auth/, user profile management in app/user/, group management in app/groups/, and expense tracking in app/expenses/.
When adding a new API endpoint, add the route to the appropriate service router file in the backend.
Files:
backend/app/auth/service.pybackend/app/user/service.pybackend/app/groups/service.pybackend/app/auth/schemas.py
backend/app/auth/**/*.py
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
JWT authentication with refresh token rotation must be implemented in the backend.
Files:
backend/app/auth/service.pybackend/app/auth/schemas.py
backend/tests/**/*.py
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
Backend tests must be placed in the /backend/tests/ directory and run using pytest.
Files:
backend/tests/auth/test_auth_routes.py
🧠 Learnings (1)
backend/scripts/backup_db.py (1)
Learnt from: CR
PR: Devasy23/splitwiser#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-26T09:41:01.332Z
Learning: Applies to backend/app/database.py : MongoDB must be used for persistent data storage in the backend, with connection logic in backend/app/database.py.
🧬 Code Graph Analysis (3)
backend/tests/auth/test_auth_routes.py (1)
backend/tests/user/test_user_routes.py (2)
test_update_user_profile_partial_update(91-108)setup_test_user(26-49)
backend/app/user/service.py (1)
backend/tests/user/test_user_service.py (2)
test_transform_user_document_missing_optional_fields(63-80)test_transform_user_document_with_updated_at_different_from_created_at(82-100)
backend/app/auth/schemas.py (3)
backend/app/user/schemas.py (2)
UserProfileUpdateRequest(14-17)UserProfileResponse(5-12)backend/app/groups/schemas.py (1)
GroupUpdateRequest(21-23)backend/tests/user/test_user_service.py (1)
test_transform_user_document_missing_optional_fields(63-80)
🔇 Additional comments (12)
backend/app/auth/schemas.py (1)
45-45: Field renameavatar→imageUrlverifiedI confirmed that all schema definitions, services, routes, and tests now use
imageUrl. The only remaining references toavatarlive in the migration script (backend/scripts/migrate_avatar_to_imageurl.py), which is intentional.– No breaking changes detected
– Schema consistency maintained acrossbackend/app/auth,user, andgroups
– Tests updated to useimageUrlAll set to merge.
backend/tests/auth/test_auth_routes.py (1)
171-171: All tests consistently updated to useimageUrlVerified no remaining
avatarreferences inbackend/tests/, and all test cases now use the newimageUrlfield.
- Checked with ripgrep: no occurrences of
"avatar"inbackend/tests/- Confirmed
"imageUrl"appears in every relevant test fileTests align with the updated schema—no further changes needed.
backend/app/auth/service.py (3)
118-118: LGTM! Consistent field naming in user creation.The field name change from
avatartoimageUrlin user creation is consistent with the schema update.
205-206: LGTM! Google auth user update logic correctly updated.Both the condition check and assignment use the new
imageUrlfield name, maintaining the existing logic for updating user profile images from Google OAuth.
218-218: All avatar references have been removed
The verification script (rg -i "avatar" backend/app/auth/) returned no matches inbackend/app/auth/.LGTM! New Google user creation consistently uses
imageUrl.backend/app/user/service.py (1)
44-44: Approve removal of fallback logic after migration completionThe migration script
backend/scripts/migrate_avatar_to_imageurl.pycorrectly migrates allavatarvalues intoimageUrl, unsets the deprecated field, and logs any conflicts for manual resolution. Thetransform_user_documenttests inbackend/tests/user/test_user_service.pyhave been updated to expectimageUrl(orNone) and no longer rely on the oldavatarfield.• Migration script: backend/scripts/migrate_avatar_to_imageurl.py
• Test coverage: backend/tests/user/test_user_service.pyNo further changes needed—this can be deployed once the migration has been run and conflicts addressed.
backend/app/groups/service.py (3)
57-57: LGTM! Field name standardization implemented correctly.The change from
user.get("avatar")touser.get("imageUrl")correctly implements the field standardization objective. The code now consistently uses the canonicalimageUrlfield name.
63-63: Consistent field naming in fallback case.The change to
"imageUrl": Nonemaintains consistency with the standardized field naming when creating fallback user objects.
78-78: Exception handling maintains field consistency.The change ensures that even in error scenarios, the API response uses the standardized
imageUrlfield name, maintaining consistency across all code paths.backend/scripts/migrate_avatar_to_imageurl.py (3)
82-90: Conflict detection logic handles edge cases well.The conflict detection correctly identifies users with differing
avatarandimageUrlvalues and logs them appropriately. The logic also handles the case where both fields exist with the same value by counting them asusers_with_both_fields.
93-99: Bulk operation correctly combines set and unset operations.The
UpdateOneoperation properly sets the newimageUrlfield and removes the deprecatedavatarfield in a single atomic operation, which is efficient and maintains data consistency.
143-155: Excellent logging and user feedback.The main execution block provides comprehensive statistics logging and user-friendly feedback about the migration results and log file location. This helps with monitoring and debugging migration issues.
| MONGODB_URL = os.getenv("MONGODB_URL") | ||
| DATABASE_NAME = os.getenv("DATABASE_NAME") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify MongoDB connection environment variables.
The script relies on MONGODB_URL and DATABASE_NAME environment variables but doesn't validate their presence before proceeding.
🏁 Script executed:
#!/bin/bash
# Check if the environment variables are properly handled in other parts of the codebase
rg -A 3 "MONGODB_URL.*DATABASE_NAME" --type pyLength of output: 48
🏁 Script executed:
#!/bin/bash
# Display the first 100 lines of the backup_db.py script to check env var handling
head -n 100 backend/scripts/backup_db.pyLength of output: 2618
Add validation for required environment variables
The script loads MONGODB_URL and DATABASE_NAME but will fail with a less‐helpful error if they aren’t set. Before using them, explicitly check and raise an informative exception.
• File: backend/scripts/backup_db.py
• Around lines 17–22 (after loading env vars)
Suggested change:
load_dotenv(os.path.join(BACKEND_DIR, ".env"))
# Get MongoDB connection details from environment
MONGODB_URL = os.getenv("MONGODB_URL")
DATABASE_NAME = os.getenv("DATABASE_NAME")
+if not MONGODB_URL or not DATABASE_NAME:
+ missing = []
+ if not MONGODB_URL:
+ missing.append("MONGODB_URL")
+ if not DATABASE_NAME:
+ missing.append("DATABASE_NAME")
+ raise EnvironmentError(
+ f"Missing required environment variable(s): {', '.join(missing)}"
+ )This ensures the script fails fast with a clear message when required configuration is absent.
🤖 Prompt for AI Agents
In backend/scripts/backup_db.py around lines 17 to 22, the environment variables
MONGODB_URL and DATABASE_NAME are loaded but not validated. Add explicit checks
after loading these variables to verify they are not None or empty, and if they
are missing, raise a clear and informative exception indicating which variable
is required. This will ensure the script fails fast with a helpful error message
when the configuration is absent.
| # Convert ObjectId to string for JSON serialization | ||
| for doc in documents: | ||
| doc["_id"] = str(doc["_id"]) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Potential memory issue with large collections.
Using list(collection.find({})) loads all documents into memory at once, which could cause memory issues for large collections.
Consider using pagination or streaming approach:
- documents = list(collection.find({}))
-
- # Convert ObjectId to string for JSON serialization
- for doc in documents:
- doc["_id"] = str(doc["_id"])
+ documents = []
+ batch_size = 1000
+ cursor = collection.find({}).batch_size(batch_size)
+ async for doc in cursor:
+ doc["_id"] = str(doc["_id"])
+ documents.append(doc)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/scripts/backup_db.py around lines 46 to 49, loading all documents
from the collection into memory at once using list(collection.find({})) can
cause memory issues with large collections. Modify the code to iterate over the
cursor returned by collection.find({}) directly without converting it to a list,
processing each document one by one to avoid loading everything into memory
simultaneously. This streaming approach will reduce memory usage and improve
scalability.
| except Exception as e: | ||
| print(f"Backup failed: {str(e)}") | ||
| raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance error handling and cleanup.
The error handling prints but doesn't clean up partial backup files on failure.
except Exception as e:
print(f"Backup failed: {str(e)}")
+ # Clean up partial backup on failure
+ if 'backup_path' in locals() and os.path.exists(backup_path):
+ import shutil
+ shutil.rmtree(backup_path, ignore_errors=True)
+ print(f"Cleaned up partial backup at: {backup_path}")
raise📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| except Exception as e: | |
| print(f"Backup failed: {str(e)}") | |
| raise | |
| except Exception as e: | |
| print(f"Backup failed: {str(e)}") | |
| # Clean up partial backup on failure | |
| if 'backup_path' in locals() and os.path.exists(backup_path): | |
| import shutil | |
| shutil.rmtree(backup_path, ignore_errors=True) | |
| print(f"Cleaned up partial backup at: {backup_path}") | |
| raise |
🤖 Prompt for AI Agents
In backend/scripts/backup_db.py around lines 70 to 72, the current error
handling prints the exception but does not remove any partial backup files
created before the failure. Modify the except block to include cleanup logic
that deletes any incomplete backup files to avoid leaving corrupted or partial
backups on disk. Ensure the cleanup runs before re-raising the exception.
| 4. Logs migration statistics | ||
| """ | ||
|
|
||
| from backup_db import create_backup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify backup_db module availability.
The relative import assumes backup_db.py exists in the same directory. Consider adding error handling for import failures.
🏁 Script executed:
#!/bin/bash
# Check if backup_db.py exists in the scripts directory
fd -t f "backup_db.py" backend/scripts/Length of output: 68
🏁 Script executed:
#!/bin/bash
# Inspect environment variable loading, log directory creation, and MongoDB connection
rg -n "load_dotenv" backend/scripts/migrate_avatar_to_imageurl.py
rg -n "os.makedirs" backend/scripts/migrate_avatar_to_imageurl.py
rg -n "basicConfig" backend/scripts/migrate_avatar_to_imageurl.py
rg -n "MongoClient" backend/scripts/migrate_avatar_to_imageurl.pyLength of output: 566
Ensure robust error handling for imports, environment variables, logging setup, and MongoDB connections
- Wrap the
from backup_db import create_backup(line 10) in a try/except to catchImportErrorand emit a clear message if the module isn’t found. - After
load_dotenv(...)(line 28), explicitly validate required env vars (e.g.MONGODB_URL, any others your script needs):required = ["MONGODB_URL"] missing = [var for var in required if not os.getenv(var)] if missing: logging.error(f"Missing environment variables: {', '.join(missing)}") sys.exit(1)
- Surround
os.makedirs(log_dir, exist_ok=True)(line 40) with try/except to catchPermissionError, log it, and exit gracefully if the directory isn’t writable. - Before calling
MongoClient(MONGODB_URL)on lines 63 and 119, ensureMONGODB_URLis non‐empty and consider catchingpymongo.errors.ConfigurationErrorto handle malformed URIs.
These changes will prevent silent failures and make debugging easier.
|
@vrajpatelll great job, Have you ran backup & migrate script? Or that is planned on merge? |
…ajpatelll/main
Devasy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Overview
This PR implements the migration from the deprecated
avatarfield to the newimageUrlfield for user profile images across the Splitwiser backend. The goal is to standardize user image handling, improve API consistency, and prepare for future frontend integrations.✅ Completed
Phase 1: Code Standardization
auth/schemas.pyto useimageUrlinstead ofavatar.auth/service.pyto useimageUrlin all user creation and update operations.groups/service.pyto useimageUrlwhen enriching member data.imageUrl.imageUrl.Phase 2: Database Migration
avatarbut noimageUrl, copies values, and removesavatar.avatarif needed.Phase 3: Testing & Validation
imageUrlconsistently.Next Steps:
Focus on documentation, code cleanup, and final schema validation to complete the migration and ensure long-term maintainability.
Closes:
Summary by CodeRabbit
New Features
Bug Fixes
Tests