Skip to content

Conversation

lalepee
Copy link
Contributor

@lalepee lalepee commented Aug 19, 2025

No description provided.

@lalepee lalepee requested a review from neomatamune August 19, 2025 17:21
@neomatamune neomatamune requested a review from Copilot August 19, 2025 17:31
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds version compatibility support for the cosmotech-api package, introducing backward compatibility for API v4 while supporting the newer v5. The changes implement version-aware functionality to handle breaking changes between API versions.

Key changes:

  • Added semantic version utility to detect cosmotech-api version at runtime
  • Removed credentials parameter handling and updated function signatures across multiple modules
  • Implemented version-specific dataset download functions for v4 and v5 compatibility
  • Added comprehensive test skipping for version-specific functionality

Reviewed Changes

Copilot reviewed 26 out of 27 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
cosmotech/coal/utils/semver.py New utility for semantic version detection of packages
cosmotech/coal/utils/decorator.py New timing decorator for performance monitoring
cosmotech/coal/cosmotech_api/runner/datasets.py Version-aware dataset download with separate v4/v5 implementations
cosmotech/coal/cosmotech_api/dataset/download/file.py Refactored file processing with new decorator and extracted functions
Multiple test files Updated to remove credentials parameters and add version-specific test skipping
requirements.past.txt New dependency file for legacy API support
pyproject.toml Added past dependencies configuration

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

LOGGER.debug(T("coal.services.dataset.processing_text").format(file_name=target_file))
with open(target_file, "r") as _file:
current_filename = os.path.basename(target_file)
content[current_filename] = "".join(line for line in _file)
Copy link

Copilot AI Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text processing function removes newlines by using "".join() instead of "\n".join(). This changes the original file content structure and may cause data loss.

Suggested change
content[current_filename] = "".join(line for line in _file)
content[current_filename] = _file.read()

Copilot uses AI. Check for mistakes.

tmp_dataset_dir_path = Path(tmp_dataset_dir)
for part in dataset.parts:
part_file_path = tmp_dataset_dir_path / part.source_name
data_part = dataset_api_instance.download_dataset_part(organization_id, workspace_id, dataset_id, part.id)
Copy link

Copilot AI Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method call uses positional arguments instead of keyword arguments, which could lead to parameter mismatch if the API signature changes. Consider using keyword arguments for clarity and robustness.

Suggested change
data_part = dataset_api_instance.download_dataset_part(organization_id, workspace_id, dataset_id, part.id)
data_part = dataset_api_instance.download_dataset_part(
organization_id=organization_id,
workspace_id=workspace_id,
dataset_id=dataset_id,
dataset_part_id=part.id,
)

Copilot uses AI. Check for mistakes.

with open(part_file_path, "wb") as binary_file:
binary_file.write(data_part)

if read_files:
Copy link

Copilot AI Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file content reading is performed inside the loop for each dataset part, but the read_files check happens after file writing. Consider moving this check before the file writing operation to avoid unnecessary disk I/O when read_files is False.

Suggested change
if read_files:
if read_files:
part_file_path = tmp_dataset_dir_path / part.source_name
data_part = dataset_api_instance.download_dataset_part(organization_id, workspace_id, dataset_id, part.id)
with open(part_file_path, "wb") as binary_file:
binary_file.write(data_part)

Copilot uses AI. Check for mistakes.

LOGGER.debug(T("coal.services.dataset.processing_json").format(file_name=target_file))
with open(target_file, "r") as _file:
current_filename = os.path.basename(target_file)
content[current_filename] = json.load(_file)
Copy link

Copilot AI Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Using the same variable name '_file' for both the function parameter and the file handle in the with statement creates confusion. Consider using a more descriptive name like 'json_file' for the file handle.

Suggested change
content[current_filename] = json.load(_file)
with open(target_file, "r") as json_file:
current_filename = os.path.basename(target_file)
content[current_filename] = json.load(json_file)

Copilot uses AI. Check for mistakes.

@lalepee lalepee changed the base branch from main to feature/cosmotech_api_5.0 August 20, 2025 13:45
Copy link
Member

@neomatamune neomatamune left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything seems ok to me to merge in the feature branch, we probably will need to update the action that runs the tests to run for 3.x and 5.x version of cosmotech-api

@lalepee
Copy link
Contributor Author

lalepee commented Aug 22, 2025

Everything seems ok to me to merge in the feature branch, we probably will need to update the action that runs the tests to run for 3.x and 5.x version of cosmotech-api

The cosmotech-api v5 lib is not released yet so testing will need either installing pre-release or using fixed version. pre-release is way to risky and fixed is way to static.
I'll add a test for retro compatibility that install the [past] requirement.txt, this will execute the same test for the time being but at the cosmotech-api release it will test correctly.

@neomatamune neomatamune merged commit 84262d4 into feature/cosmotech_api_5.0 Sep 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants