-
Notifications
You must be signed in to change notification settings - Fork 12
Modernize file storage with UUID-based naming, compression, and tests #109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Refactor file storage to use UUID-based filenames instead of timestamps - Add proper get() function that finds vCons by UUID - Add delete(), exists(), and list_vcons() functions - Add optional gzip compression support - Add date-based directory organization (YYYY/MM/DD) - Add configurable file size limits and permissions - Add Docker volume (vcon_files) for persistent file storage - Add comprehensive test suite with 30+ tests - Update documentation with configuration options and examples 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is being reviewed by Cursor Bugbot
Details
Your team is on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle for each member of your team.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
Comment @cursor review or bugbot run to trigger another review on this PR
server/storage/file/__init__.py
Outdated
| max_size = opts.get("max_file_size", default_options["max_file_size"]) | ||
| if len(vcon_data.encode("utf-8")) > max_size: | ||
| raise ValueError( | ||
| f"vCon data exceeds max file size: {len(vcon_data)} > {max_size}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Error message shows character count instead of byte count
The file size check correctly compares byte length using len(vcon_data.encode("utf-8")), but the error message incorrectly displays len(vcon_data) which is the character count. For vCon data containing non-ASCII characters, these values will differ, making the error message misleading and confusing for debugging.
example_config.yml
Outdated
| compression: false | ||
| max_file_size: 10485760 | ||
| file_permissions: 0644 | ||
| dir_permissions: 0755 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: YAML config permissions may parse incorrectly across versions
The file_permissions: 0644 and dir_permissions: 0755 values are ambiguous in YAML. In YAML 1.1 (older PyYAML), leading-zero numbers are octal, but in YAML 1.2 (newer behavior), only 0o prefix indicates octal. Depending on the PyYAML version, 0644 could be parsed as 420 (correct octal interpretation) or 644 (incorrect decimal interpretation), resulting in wrong file permissions. Using quoted strings like "0644" with explicit conversion, or the 0o644 notation in a way YAML understands, would be more reliable.
| # Check flat structure first | ||
| flat_path = base_path / f"{vcon_uuid}.{ext}" | ||
| if flat_path.exists(): | ||
| return flat_path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Path traversal allows reading and deleting arbitrary files
The vcon_uuid parameter is used directly in file path construction without validation for path traversal sequences like ../. An attacker passing a malicious UUID such as ../../etc/config could read, check existence of, or delete arbitrary .json or .json.gz files outside the storage directory via the get(), exists(), and delete() functions. The _find_vcon_file helper constructs paths using base_path / f"{vcon_uuid}.{ext}" which allows .. components to escape the intended directory when the path is resolved by filesystem operations.
Additional Locations (2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR modernizes the file storage backend by replacing timestamp-based filenames with UUID-based naming, adding comprehensive CRUD operations, optional gzip compression, date-based directory organization, and a full test suite with 30+ tests.
Key changes:
- Implement UUID-based file naming with optional YYYY/MM/DD directory structure
- Add new functions:
get(),delete(),exists(), andlist_vcons()with pagination support - Add optional gzip compression and configurable file size limits and permissions
- Add Docker volume configuration for persistent storage and comprehensive test coverage
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| server/storage/file/init.py | Complete refactor of file storage module with UUID-based naming, compression support, CRUD operations, and date-based organization |
| server/storage/file/test_file_storage.py | New comprehensive test suite covering all storage operations, compression, pagination, and edge cases |
| server/storage/file/README.md | Updated documentation with configuration options, API reference, usage examples, and best practices |
| example_docker-compose.yml | Added vcon_files volume mount to conserver and api services for persistent file storage |
| example_config.yml | Added file storage configuration section with all available options |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
server/storage/file/__init__.py
Outdated
| if len(vcon_data.encode("utf-8")) > max_size: | ||
| raise ValueError( | ||
| f"vCon data exceeds max file size: {len(vcon_data)} > {max_size}" |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message displays the length of the string (characters) but compares it to bytes. The size check on line 117 uses len(vcon_data.encode("utf-8")) which gives bytes, so the error message should show the actual byte size, not len(vcon_data) which is the character count.
| if len(vcon_data.encode("utf-8")) > max_size: | |
| raise ValueError( | |
| f"vCon data exceeds max file size: {len(vcon_data)} > {max_size}" | |
| vcon_data_bytes = vcon_data.encode("utf-8") | |
| if len(vcon_data_bytes) > max_size: | |
| raise ValueError( | |
| f"vCon data exceeds max file size: {len(vcon_data_bytes)} > {max_size} bytes" |
Co-authored-by: thomas.howe <[email protected]>
|
Cursor Agent can help with this pull request. Just |
Co-authored-by: thomas.howe <[email protected]>
|
Cursor Agent can help with this pull request. Just |
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Modernize file storage with UUID-based naming, compression, and tests
🤖 Generated with Claude Code
Note
Introduce a local file storage backend with UUID-based filenames, optional gzip and date-based organization, plus config, Docker volume, docs, and comprehensive tests.
server/storage/file/__init__.pywithsave,get,delete,exists,list_vconsand helpers; supports UUID-based filenames, optional gzip, date-based directories, size/permission controls, and cleanup.server/storage/file/test_file_storage.pycovering CRUD, compression, date organization, pagination, and edge cases.example_config.ymlwithstorages.fileoptions (path,organize_by_date,compression,max_file_size,file_permissions,dir_permissions).vcon_filesto/data/vconsinexample_docker-compose.yml; declarevcon_filesinvolumes.server/storage/file/README.mdwith configuration, features, Docker volume example, usage, API reference, and best practices.Written by Cursor Bugbot for commit 288adc4. Configure here.