Lal/dataset api v5 prod 14582 #58
Conversation
Pull Request Overview
This PR adds version-compatibility support for the cosmotech-api package, keeping backward compatibility with API v4 while supporting the newer v5. The changes implement version-aware functionality to handle breaking changes between the two API versions.
Key changes:
- Added a semantic version utility to detect the installed cosmotech-api version at runtime (a minimal sketch follows this list)
- Removed credentials parameter handling and updated function signatures across multiple modules
- Implemented version-specific dataset download functions for v4 and v5 compatibility
- Added comprehensive test skipping for version-specific functionality
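
The semver utility itself lives in cosmotech/coal/utils/semver.py and its code is not shown in this summary. As a rough illustration only, runtime detection and v4/v5 dispatch can be sketched as below; every name here (`get_semver`, `download_dataset`, the `_v4`/`_v5` helpers) and the use of `importlib.metadata` are assumptions for illustration, not the PR's actual code.

```python
# Minimal sketch, NOT the PR's implementation: detect the installed
# cosmotech-api version and dispatch to a version-specific code path.
import re
from importlib.metadata import PackageNotFoundError, version


def get_semver(package_name: str) -> tuple[int, int, int] | None:
    """Return (major, minor, patch) of an installed package, or None if absent."""
    try:
        raw = version(package_name)
    except PackageNotFoundError:
        return None
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", raw)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def _download_dataset_v4(dataset_id: str) -> dict:
    # Stand-in for the legacy v4 download path.
    return {"id": dataset_id, "api": "v4"}


def _download_dataset_v5(dataset_id: str) -> dict:
    # Stand-in for the newer v5 download path.
    return {"id": dataset_id, "api": "v5"}


def download_dataset(dataset_id: str) -> dict:
    """Route to the implementation matching the installed major version."""
    semver = get_semver("cosmotech-api")
    if semver is not None and semver[0] >= 5:
        return _download_dataset_v5(dataset_id)
    return _download_dataset_v4(dataset_id)
```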
Reviewed Changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 4 comments.
File | Description
---|---
cosmotech/coal/utils/semver.py | New utility for semantic version detection of packages
cosmotech/coal/utils/decorator.py | New timing decorator for performance monitoring (a minimal sketch follows this table)
cosmotech/coal/cosmotech_api/runner/datasets.py | Version-aware dataset download with separate v4/v5 implementations
cosmotech/coal/cosmotech_api/dataset/download/file.py | Refactored file processing with the new decorator and extracted helper functions
Multiple test files | Updated to remove credentials parameters and add version-specific test skipping
requirements.past.txt | New dependency file for legacy API support
pyproject.toml | Added past-dependencies configuration
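
The decorator's code is also not shown in this summary. Assuming it simply logs elapsed wall-clock time, a minimal sketch could look like the following; the name `timed` and the logging setup are illustrative, not taken from decorator.py.

```python
# Minimal sketch, NOT the PR's implementation: a timing decorator that
# logs how long each call to the wrapped function takes.
import functools
import logging
import time

LOGGER = logging.getLogger(__name__)


def timed(func):
    """Log the wall-clock duration of every call to the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            LOGGER.debug("%s took %.3f s", func.__name__, time.perf_counter() - start)
    return wrapper


@timed
def example_download():
    # Hypothetical decorated function, used here only to show the syntax.
    time.sleep(0.1)
```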
LOGGER.debug(T("coal.services.dataset.processing_text").format(file_name=target_file)) | ||
with open(target_file, "r") as _file: | ||
current_filename = os.path.basename(target_file) | ||
content[current_filename] = "".join(line for line in _file) |
Copilot AI commented on Aug 19, 2025:
The text processing function removes newlines by using "".join() instead of "\n".join(). This changes the original file content structure and may cause data loss.
Suggested change:

```diff
- content[current_filename] = "".join(line for line in _file)
+ content[current_filename] = _file.read()
```
```python
tmp_dataset_dir_path = Path(tmp_dataset_dir)
for part in dataset.parts:
    part_file_path = tmp_dataset_dir_path / part.source_name
    data_part = dataset_api_instance.download_dataset_part(organization_id, workspace_id, dataset_id, part.id)
```
Copilot AI commented on Aug 19, 2025:
The method call uses positional arguments instead of keyword arguments, which could lead to parameter mismatch if the API signature changes. Consider using keyword arguments for clarity and robustness.
Suggested change:

```diff
- data_part = dataset_api_instance.download_dataset_part(organization_id, workspace_id, dataset_id, part.id)
+ data_part = dataset_api_instance.download_dataset_part(
+     organization_id=organization_id,
+     workspace_id=workspace_id,
+     dataset_id=dataset_id,
+     dataset_part_id=part.id,
+ )
```
```python
    with open(part_file_path, "wb") as binary_file:
        binary_file.write(data_part)

    if read_files:
```
Copilot AI commented on Aug 19, 2025:
The file content reading is performed inside the loop for each dataset part, but the read_files check happens after file writing. Consider moving this check before the file writing operation to avoid unnecessary disk I/O when read_files is False.
Suggested change:

```diff
- if read_files:
+ if read_files:
+     part_file_path = tmp_dataset_dir_path / part.source_name
+     data_part = dataset_api_instance.download_dataset_part(organization_id, workspace_id, dataset_id, part.id)
+     with open(part_file_path, "wb") as binary_file:
+         binary_file.write(data_part)
```
LOGGER.debug(T("coal.services.dataset.processing_json").format(file_name=target_file)) | ||
with open(target_file, "r") as _file: | ||
current_filename = os.path.basename(target_file) | ||
content[current_filename] = json.load(_file) |
Copilot AI commented on Aug 19, 2025:
[nitpick] Using the same variable name '_file' for both the function parameter and the file handle in the with statement creates confusion. Consider using a more descriptive name like 'json_file' for the file handle.
Suggested change:

```diff
- content[current_filename] = json.load(_file)
+ with open(target_file, "r") as json_file:
+     current_filename = os.path.basename(target_file)
+     content[current_filename] = json.load(json_file)
```
Everything looks OK to me to merge into the feature branch. We will probably need to update the action that runs the tests so it runs against both the 3.x and 5.x versions of cosmotech-api.
The cosmotech-api v5 library is not released yet, so testing will require either installing a pre-release or pinning a fixed version. A pre-release is too risky, and a pinned version is too static.
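
As context for the multi-version testing concern above, one common pattern is to skip version-specific tests at collection time with a pytest marker. A minimal sketch, assuming pytest; the helper `_api_major` and the marker name are invented for illustration and are not taken from this repository:

```python
# Minimal sketch, NOT this repository's test code: skip tests that need
# cosmotech-api >= 5 when an older (or no) version is installed.
import pytest
from importlib.metadata import PackageNotFoundError, version


def _api_major() -> int:
    """Best-effort major version of cosmotech-api; 0 if unavailable."""
    try:
        return int(version("cosmotech-api").split(".")[0])
    except (PackageNotFoundError, ValueError):
        return 0


requires_api_v5 = pytest.mark.skipif(
    _api_major() < 5,
    reason="requires cosmotech-api >= 5",
)


@requires_api_v5
def test_download_dataset_v5():
    # A v5-only test body would go here.
    ...
```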