feat(clp-package): Add utility function to validate dataset's existence in CLI scripts. #1036
Conversation
Sequence Diagram(s)
sequenceDiagram
participant Scheduler as Query Scheduler
participant DB as Metadata Database
Scheduler->>DB: fetch_existing_datasets()
DB-->>Scheduler: List of datasets
alt Dataset not in cache
Scheduler->>DB: fetch_existing_datasets()
DB-->>Scheduler: Updated list of datasets
alt Dataset still missing
Scheduler->>Scheduler: Mark job as failed and skip
else Dataset found
Scheduler->>Scheduler: Proceed with job
end
else Dataset in cache
Scheduler->>Scheduler: Proceed with job
end
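The flow in the diagram can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the scheduler's actual code: `fetch_existing_datasets` here is a hypothetical stand-in that reads an in-memory set instead of the metadata database, and `dispatch_job` is an invented name.

```python
from typing import Set, Tuple

# Hypothetical stand-in for the metadata database (assumption for illustration).
_FAKE_DB_DATASETS: Set[str] = {"default"}


def fetch_existing_datasets() -> Set[str]:
    """Stand-in for the database query; returns a copy of the fake table."""
    return set(_FAKE_DB_DATASETS)


def dispatch_job(dataset: str, cache: Set[str]) -> Tuple[str, Set[str]]:
    """Returns ("proceed" or "failed", possibly refreshed cache)."""
    if dataset in cache:
        # Dataset in cache: no database round trip needed.
        return "proceed", cache
    # Cache miss: refresh once from the database before giving up.
    cache = fetch_existing_datasets()
    if dataset in cache:
        return "proceed", cache
    # Still missing after the refresh: mark the job as failed and skip it.
    return "failed", cache


cache = fetch_existing_datasets()
status_known, cache = dispatch_job("default", cache)
_FAKE_DB_DATASETS.add("new-logs")  # dataset created after the cache was built
status_new, cache = dispatch_job("new-logs", cache)
status_missing, cache = dispatch_job("no-such-dataset", cache)
```

The database is only consulted on a cache miss, so a job that names a dataset created after startup costs one extra query, and a job that names a nonexistent dataset fails after a single refresh.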
haiqi96
left a comment
I feel the situation is a bit tricky here because this change depends on PR1023, and we also need to see how `validate_dataset` will be used before deciding what its argument type should be.
Instead of reviewing and merging this change alone, I think we should:
- Merge PR1023 first in any case.
Then, either:
- Combine this into PR1004 so it's clear how `validate_dataset` will be used.
- Or merge PR1004 first. It's OK if it doesn't fully support validating datasets in the archive manager or decompress.py scripts. Then update this PR to support the archive manager and decompress properly.
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/clp-package-utils/clp_package_utils/general.py (3 hunks)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1004
File: components/clp-package-utils/clp_package_utils/scripts/native/decompress.py:139-144
Timestamp: 2025-06-24T08:54:14.438Z
Learning: In the CLP codebase, the get_orig_file_id function signature was changed after a recent merge to no longer accept a dataset parameter, making previous suggestions that reference this parameter invalid.
Learnt from: kirkrodrigues
PR: y-scope/clp#881
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:35-41
Timestamp: 2025-05-06T09:48:55.408Z
Learning: For installation scripts in the CLP project, prefer explicit error handling over automatic dependency resolution (like `apt-get install -f`) when installing packages to give users more control over their system.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#831
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:0-0
Timestamp: 2025-04-17T16:55:06.658Z
Learning: In the compression scheduler, the team prefers initializing in-memory caches from the database at startup rather than performing repeated database queries for efficiency reasons. This approach maintains both performance and reliability across process restarts.
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (1)
Learnt from: Bill-hbrhbr
PR: y-scope/clp#868
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:141-144
Timestamp: 2025-05-05T16:32:55.163Z
Learning: The column metadata table (created by `_create_column_metadata_table`) is only needed for dataset-specific workflows in `CLP_S` and is obsolete for non-dataset workflows.
🧬 Code Graph Analysis (1)
components/clp-package-utils/clp_package_utils/general.py (3)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
- Database (88-167)
- get_clp_connection_params_and_type (146-167)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (1)
- validate_and_cache_dataset (160-179)
components/clp-py-utils/clp_py_utils/sql_adapter.py (3)
- SQL_Adapter (60-131)
- create_connection (99-105)
- create_connection (113-114)
🪛 Pylint (3.3.7)
components/clp-package-utils/clp_package_utils/general.py
[error] 28-28: Unable to import 'clp_py_utils.clp_metadata_db_utils' (E0401)
[error] 29-34: Unable to import 'clp_py_utils.core' (E0401)
[error] 35-35: Unable to import 'clp_py_utils.sql_adapter' (E0401)
[error] 592-592: Undefined variable 'closing' (E0602)
[error] 592-592: Undefined variable 'closing' (E0602)
🪛 Flake8 (7.2.0)
components/clp-package-utils/clp_package_utils/general.py
[error] 592-592: undefined name 'closing' (F821)
[error] 592-592: undefined name 'closing' (F821)
🔇 Additional comments (1)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (1)
160-179: Excellent caching implementation!
The function correctly implements the caching pattern with early return optimization when the dataset is already cached, avoiding unnecessary database queries. This aligns well with the team's preference for efficient in-memory caching as noted in the retrieved learnings.
Actionable comments posted: 1
♻️ Duplicate comments (1)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (1)
161-181: Add parameter validation and improve documentation.
The function implementation looks good, but consider these improvements:
- Parameter validation: Add validation for the `dataset` parameter to handle None or empty values gracefully, especially given the past review comment about avoiding calls when dataset is None.
- Documentation clarity: Fix the grammatical issue in the docstring.
 def validate_and_cache_dataset(
     db_cursor,
     table_prefix: str,
     dataset: str,
     existing_datasets: Set[str] | None = None,
 ) -> Tuple[bool, Set[str]]:
     """
-    Checks if the provided dataset currently exists in the metadata database and cache it locally.
+    Checks if the provided dataset currently exists in the metadata database and caches it locally.
     If the dataset already exists in the local cache, the database query is skipped and the cache is not updated.

     :param db_cursor:
     :param table_prefix:
     :param dataset:
     :param existing_datasets:
     :return: Whether the dataset exists, and a refreshed cache of datasets if a lookup is required.
     """
+    if not dataset:
+        return False, existing_datasets or set()
+
     if existing_datasets is not None and dataset in existing_datasets:
         return True, existing_datasets
     existing_datasets = fetch_existing_datasets(db_cursor, table_prefix)
     return dataset in existing_datasets, existing_datasets
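To try the suggested guard in isolation, the function can be exercised against an in-memory SQLite database. This is only a sketch: the `fetch_existing_datasets` body, the `<prefix>datasets` table, and its `name` column are assumptions made for illustration and may not match the real metadata schema or the project's SQL adapter.

```python
import sqlite3
from typing import Optional, Set, Tuple


def fetch_existing_datasets(db_cursor, table_prefix: str) -> Set[str]:
    # Assumed schema: a `<prefix>datasets` table with a `name` column.
    db_cursor.execute(f"SELECT name FROM {table_prefix}datasets")
    return {row[0] for row in db_cursor.fetchall()}


def validate_and_cache_dataset(
    db_cursor,
    table_prefix: str,
    dataset: Optional[str],
    existing_datasets: Optional[Set[str]] = None,
) -> Tuple[bool, Set[str]]:
    # Suggested guard: reject None or empty names without touching the database.
    if not dataset:
        return False, existing_datasets or set()
    # Cache hit: skip the query and return the cache unchanged.
    if existing_datasets is not None and dataset in existing_datasets:
        return True, existing_datasets
    # Cache miss: refresh the cache from the database and check again.
    existing_datasets = fetch_existing_datasets(db_cursor, table_prefix)
    return dataset in existing_datasets, existing_datasets


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE clp_datasets (name TEXT PRIMARY KEY)")
cursor.execute("INSERT INTO clp_datasets VALUES ('default')")

found, cache = validate_and_cache_dataset(cursor, "clp_", "default")
empty_found, _ = validate_and_cache_dataset(cursor, "clp_", "", cache)
missing_found, cache = validate_and_cache_dataset(cursor, "clp_", "absent", cache)
conn.close()
```

Note that this version returns the cache rather than mutating it, which follows the signature shown in the suggestion above.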
📜 Review details
📒 Files selected for processing (1)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-package-utils/clp_package_utils/general.py:564-579
Timestamp: 2025-06-28T07:10:47.237Z
Learning: The validate_dataset function in components/clp-package-utils/clp_package_utils/general.py is designed to be called once upon function startup for dataset validation, not repeatedly during execution, making caching optimizations unnecessary.
🪛 Flake8 (7.2.0)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py
[error] 3-3: 'contextlib.closing' imported but unused
(F401)
🪛 Pylint (3.3.7)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py
[warning] 3-3: Unused closing imported from contextlib
(W0611)
…taset when using CLP storage engine
Actionable comments posted: 1
📜 Review details
📒 Files selected for processing (3)
components/clp-package-utils/clp_package_utils/general.py (2 hunks)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (1 hunks)
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (2 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1050
File: components/clp-package-utils/clp_package_utils/scripts/native/search.py:239-244
Timestamp: 2025-06-29T22:01:05.569Z
Learning: In the CLP codebase, dataset validation is handled at the outer layer in wrapper scripts (like clp_package_utils/scripts/search.py) that call native implementation scripts (like clp_package_utils/scripts/native/search.py). The native scripts are internal components that receive pre-validated parameters, so they don't need their own dataset validation logic.
components/clp-package-utils/clp_package_utils/general.py (6)
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/native/del_archives.py:104-110
Timestamp: 2024-11-15T16:21:52.122Z
Learning: In `clp_package_utils/scripts/native/del_archives.py`, when deleting archives, the `archive` variable retrieved from the database is controlled and is always a single string without path components. Therefore, it's acceptable to skip additional validation checks for directory traversal in this context.
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (5)
Learnt from: haiqi96
PR: y-scope/clp#569
File: components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py:380-392
Timestamp: 2024-11-15T20:07:22.256Z
Learning: The current implementation assumes single-threaded execution, so race conditions in functions like `is_target_extracted` in `components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py` are not currently a concern.
🧬 Code Graph Analysis (2)
components/clp-package-utils/clp_package_utils/general.py (4)
components/webui/server/src/configConstants.ts (1)
- CLP_DEFAULT_DATASET_NAME (7-7)
components/clp-py-utils/clp_py_utils/clp_config.py (4)
- Database (80-159)
- StorageEngine (47-49)
- StorageType (52-54)
- get_clp_connection_params_and_type (138-159)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (1)
- validate_and_cache_dataset (188-211)
components/clp-py-utils/clp_py_utils/sql_adapter.py (3)
- SQL_Adapter (60-131)
- create_connection (99-105)
- create_connection (113-114)
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (1)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (1)
validate_and_cache_dataset(188-211)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: lint-check (macos-latest)
🔇 Additional comments (4)
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (2)
45-45: LGTM!
The import is correctly added for the new centralized validation function.
651-653: LGTM!
The integration correctly replaces the previous internal validation logic with the centralized utility function. The `existing_datasets` cache is properly maintained and passed to the function.
components/clp-package-utils/clp_package_utils/general.py (2)
10-10: LGTM!
The missing import for `closing` has been correctly added, addressing the previous review feedback.
567-592: Well-designed validation function with proper error handling.
The function correctly validates storage engine constraints and provides appropriate error messages. The logic for handling CLP_S vs non-CLP_S engines is sound, and the database connection management follows best practices.
Note: This function will be affected by the logic bug in `validate_and_cache_dataset` since it calls that function without the cache parameter.
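The connection handling praised here relies on `contextlib.closing`, which calls `close()` on exit even when an exception is raised. Below is a rough sketch of a one-shot existence check written in that style. It uses `sqlite3` in place of the project's SQL adapter, the function name `check_dataset_exists` and the table layout are hypothetical, and the storage engine checks are left out because their details are not shown in this thread.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def check_dataset_exists(db_path: str, table_prefix: str, dataset: str) -> None:
    """Raises ValueError if `dataset` is not registered (hypothetical helper)."""
    # Both the connection and the cursor are closed on exit, even on error.
    with closing(sqlite3.connect(db_path)) as conn, closing(conn.cursor()) as cursor:
        # The table name cannot be bound as a parameter, so the prefix must be trusted.
        cursor.execute(
            f"SELECT 1 FROM {table_prefix}datasets WHERE name = ?", (dataset,)
        )
        if cursor.fetchone() is None:
            raise ValueError(f"Dataset `{dataset}` doesn't exist.")


with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "metadata.db")
    with closing(sqlite3.connect(db_path)) as setup_conn:
        setup_conn.execute("CREATE TABLE clp_datasets (name TEXT PRIMARY KEY)")
        setup_conn.execute("INSERT INTO clp_datasets VALUES ('default')")
        setup_conn.commit()

    check_dataset_exists(db_path, "clp_", "default")  # passes silently
    try:
        check_dataset_exists(db_path, "clp_", "absent")
        error_message = None
    except ValueError as err:
        error_message = str(err)
```

Because a check like this runs once at script startup, it queries the database directly and keeps no cache.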
@coderabbitai
@Bill-hbrhbr I'll run a fresh analysis of all imports in your PR to check for any unused or missing imports.
Based on my comprehensive analysis, here's the current state of imports in your PR:
📋 Import Analysis Results
🔧 One Issue Found
🎯 Action Required
Remove this unused import:
    from clp_py_utils.clp_metadata_db_utils import (
        fetch_existing_datasets,  # ❌ Remove this line
        validate_and_cache_dataset,
        # ... keep other imports
    )
💡 Summary
Your import hygiene is excellent with just one leftover import to clean up.
Based on my detailed analysis, here's the accurate import status for your PR:
📋 Corrected Import Analysis Results
🔧 Action Required
Remove this one unused import:
    from clp_py_utils.clp_metadata_db_utils import (
        create_metadata_db_tables,
        fetch_existing_datasets,  # ❌ Remove this line - unused after refactoring
        get_archives_table_name,
        get_files_table_name,
        validate_and_cache_dataset,
    )
💡 Summary
Your import management is excellent with just one cleanup needed! The AST analyzer had some false positives, but manual verification confirms that only the import marked above needs to be removed.
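For readers who want to reproduce this kind of check locally, a small AST-based scan is sketched below. It is not the analyzer the bot ran; the function name `find_unused_imports` and the sample module are invented for illustration. It skips `from __future__` imports, which are compiler directives rather than runtime names and are a common source of false positives.

```python
import ast
from typing import List


def find_unused_imports(source: str) -> List[str]:
    """Returns imported names that are never referenced as a bare name."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module == "__future__":
                continue  # compiler directive, not a runtime name
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


# Invented sample module; it is parsed, never executed.
SAMPLE = '''
from __future__ import annotations
from contextlib import closing
from clp_py_utils.clp_metadata_db_utils import (
    fetch_existing_datasets,
    validate_and_cache_dataset,
)

def check(cursor, dataset: str | None):
    with closing(cursor):
        return validate_and_cache_dataset(cursor, "clp_", dataset)
'''

unused = find_unused_imports(SAMPLE)
```

A scan this simple still misreports names used only in string annotations, star imports, or re-exports via `__all__`, so its output should be confirmed by hand or with a linter such as Flake8 (F401).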
Actionable comments posted: 1
📜 Review details
📒 Files selected for processing (2)
components/clp-package-utils/clp_package_utils/general.py (3 hunks)
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:204-211
Timestamp: 2025-07-03T12:58:18.398Z
Learning: In the CLP codebase, the validate_and_cache_dataset function in components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py uses in-place updates of the existing_datasets set parameter rather than returning a new set, as preferred by the development team.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:188-212
Timestamp: 2025-07-03T12:59:46.571Z
Learning: In CLP schedulers (compression and query), runtime dataset validation is required; the helper validate_and_cache_dataset keeps a local cache and only queries the DB on cache misses because dataset additions are rare.
components/clp-package-utils/clp_package_utils/general.py (13)
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-07-03T13:33:27.447Z
Learning: In the CLP codebase, `from __future__ import annotations` imports are used to enable Python pipe notation for union types (e.g., `int | str`) in Python versions before 3.10, and should not be considered unused imports.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within `components/core/tools/scripts/lib_install/`, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like `readonly` declarations.
Learnt from: haiqi96
PR: y-scope/clp#651
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:0-0
Timestamp: 2025-01-16T16:58:43.190Z
Learning: In the clp-package compression flow, path validation and error handling is performed at the scheduler level rather than in the compress.py script to maintain simplicity and avoid code duplication.
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (8)
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:204-211
Timestamp: 2025-07-03T12:58:18.398Z
Learning: In the CLP codebase, the validate_and_cache_dataset function in components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py uses in-place updates of the existing_datasets set parameter rather than returning a new set, as preferred by the development team.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:188-212
Timestamp: 2025-07-03T12:59:46.571Z
Learning: In CLP schedulers (compression and query), runtime dataset validation is required; the helper validate_and_cache_dataset keeps a local cache and only queries the DB on cache misses because dataset additions are rare.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-package-utils/clp_package_utils/general.py:564-579
Timestamp: 2025-06-28T07:10:47.295Z
Learning: The validate_dataset function in components/clp-package-utils/clp_package_utils/general.py is designed to be called once upon function startup for dataset validation, not repeatedly during execution, making caching optimizations unnecessary.
Learnt from: haiqi96
PR: y-scope/clp#569
File: components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py:380-392
Timestamp: 2024-11-15T20:07:22.256Z
Learning: The current implementation assumes single-threaded execution, so race conditions in functions like `is_target_extracted` in `components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py` are not currently a concern.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1004
File: components/clp-package-utils/clp_package_utils/scripts/native/decompress.py:139-144
Timestamp: 2025-06-24T08:54:14.438Z
Learning: In the CLP codebase, the get_orig_file_id function signature was changed after a recent merge to no longer accept a dataset parameter, making previous suggestions that reference this parameter invalid.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1050
File: components/clp-package-utils/clp_package_utils/scripts/native/search.py:239-244
Timestamp: 2025-06-29T22:01:05.569Z
Learning: In the CLP codebase, dataset validation is handled at the outer layer in wrapper scripts (like clp_package_utils/scripts/search.py) that call native implementation scripts (like clp_package_utils/scripts/native/search.py). The native scripts are internal components that receive pre-validated parameters, so they don't need their own dataset validation logic.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#868
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:141-144
Timestamp: 2025-05-05T16:32:55.163Z
Learning: The column metadata table (created by `_create_column_metadata_table`) is only needed for dataset-specific workflows in `CLP_S` and is obsolete for non-dataset workflows.
Learnt from: haiqi96
PR: y-scope/clp#569
File: components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py:76-77
Timestamp: 2024-11-17T23:24:08.862Z
Learning: The `query_scheduler.py` file operates with single threading, and multithreading is not used.
🪛 Flake8 (7.2.0)
components/clp-package-utils/clp_package_utils/general.py
[error] 15-15: 'clp_py_utils.clp_config.CLP_DEFAULT_DATASET_NAME' imported but unused
(F401)
[error] 30-30: 'clp_py_utils.clp_metadata_db_utils.validate_and_cache_dataset' imported but unused
(F401)
🪛 Pylint (3.3.7)
components/clp-package-utils/clp_package_utils/general.py
[error] 30-30: Unable to import 'clp_py_utils.clp_metadata_db_utils'
(E0401)
[error] 31-36: Unable to import 'clp_py_utils.core'
(E0401)
[error] 37-37: Unable to import 'clp_py_utils.sql_adapter'
(E0401)
[warning] 30-30: Unused validate_and_cache_dataset imported from clp_py_utils.clp_metadata_db_utils
(W0611)
🔇 Additional comments (2)
components/clp-package-utils/clp_package_utils/general.py (1)
567-582: LGTM! The `validate_dataset` function implementation is well-structured:
- Proper use of context managers for database resource cleanup
- Clear error messaging for non-existent datasets
- Follows established CLP codebase patterns for database interactions
- Appropriate for single-use validation at function startup as intended
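The one-shot startup check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: it uses the standard-library `sqlite3` module in place of CLP's `SQL_Adapter`, a hypothetical `datasets` table with a `name` column, and a `db_path` parameter that the real function does not take.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def validate_dataset(db_path: str, dataset: str) -> None:
    """Raise ValueError if `dataset` is not listed in the (hypothetical) datasets table."""
    # closing() guarantees the connection and cursor are closed even if the query fails.
    with closing(sqlite3.connect(db_path)) as conn, closing(conn.cursor()) as cursor:
        cursor.execute("SELECT name FROM datasets")
        existing_datasets = {row[0] for row in cursor.fetchall()}
    if dataset not in existing_datasets:
        raise ValueError(f"Dataset `{dataset}` does not exist.")


# Build a throwaway metadata database containing a single dataset.
db_path = os.path.join(tempfile.mkdtemp(), "metadata.db")
with closing(sqlite3.connect(db_path)) as conn:
    conn.execute("CREATE TABLE datasets (name TEXT PRIMARY KEY)")
    conn.execute("INSERT INTO datasets VALUES ('default')")
    conn.commit()

validate_dataset(db_path, "default")  # Known dataset: returns without raising.

error_message = None
try:
    validate_dataset(db_path, "missing")
except ValueError as e:
    error_message = str(e)
```

Because the check runs once per script invocation, the sketch queries the database directly and keeps no cache.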
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (1)
650-665: Well-implemented dataset validation with efficient caching. The inline validation logic correctly handles runtime dataset validation:
- Uses cache-miss strategy to minimize database queries
- Properly updates the local cache when datasets are found
- Includes clear error logging and appropriate job failure handling
- The comment about assuming datasets are never deleted provides helpful context
This implementation aligns well with the CLP scheduler's requirements for efficient runtime validation.
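The cache-miss strategy can be sketched roughly as below. This is a self-contained illustration rather than the scheduler's code: `is_dataset_known` is an invented helper name, and `fetch_datasets_from_fake_db` is a hypothetical stand-in that reads from an in-memory list instead of the metadata database.

```python
from typing import Callable, List, Set


def is_dataset_known(
    dataset: str,
    cached_datasets: Set[str],
    fetch_datasets: Callable[[], Set[str]],
) -> bool:
    """Return True if `dataset` exists, querying the backing store only on a cache miss."""
    if dataset in cached_datasets:
        return True
    # Cache miss: re-fetch once and update the caller's set in place.
    # Assumes datasets are never deleted, so stale entries are not a concern.
    cached_datasets.update(fetch_datasets())
    return dataset in cached_datasets


# Hypothetical stand-in for the metadata database.
db_rows: List[str] = ["default"]
query_log: List[str] = []


def fetch_datasets_from_fake_db() -> Set[str]:
    query_log.append("fetch")  # Record each simulated database query.
    return set(db_rows)


cache: Set[str] = set()
first = is_dataset_known("default", cache, fetch_datasets_from_fake_db)  # miss, then found
second = is_dataset_known("default", cache, fetch_datasets_from_fake_db)  # served from cache
missing = is_dataset_known("missing", cache, fetch_datasets_from_fake_db)  # miss, still absent
```

In a scheduler loop, a `False` result would correspond to marking the job as failed and skipping it; only the first lookup and the lookup for the unknown dataset touch the backing store.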
Co-authored-by: haiqi96 <[email protected]>
@coderabbitai
Tip: For best results, initiate chat on the files or code changes.
🧩 Analysis chain
@Bill-hbrhbr I'll run a comprehensive analysis of all imports in your PR to check for unused or missing imports.
Based on my comprehensive analysis, here's the current state of imports in your PR:
📋 Import Analysis Results
✅ Clean Files (Perfect Import Hygiene)
1.
2.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/clp-package-utils/clp_package_utils/general.py(3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1050
File: components/clp-package-utils/clp_package_utils/scripts/native/search.py:239-244
Timestamp: 2025-06-29T22:01:05.569Z
Learning: In the CLP codebase, dataset validation is handled at the outer layer in wrapper scripts (like clp_package_utils/scripts/search.py) that call native implementation scripts (like clp_package_utils/scripts/native/search.py). The native scripts are internal components that receive pre-validated parameters, so they don't need their own dataset validation logic.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:204-211
Timestamp: 2025-07-03T12:58:18.407Z
Learning: In the CLP codebase, the validate_and_cache_dataset function in components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py uses in-place updates of the existing_datasets set parameter rather than returning a new set, as preferred by the development team.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-package-utils/clp_package_utils/general.py:564-579
Timestamp: 2025-06-28T07:10:47.295Z
Learning: The validate_dataset function in components/clp-package-utils/clp_package_utils/general.py is designed to be called once upon function startup for dataset validation, not repeatedly during execution, making caching optimizations unnecessary.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:188-212
Timestamp: 2025-07-03T12:59:46.638Z
Learning: In CLP schedulers (compression and query), runtime dataset validation is required; the helper validate_and_cache_dataset keeps a local cache and only queries the DB on cache misses because dataset additions are rare.
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1004
File: components/clp-package-utils/clp_package_utils/scripts/native/decompress.py:139-144
Timestamp: 2025-06-24T08:54:14.438Z
Learning: In the CLP codebase, the get_orig_file_id function signature was changed after a recent merge to no longer accept a dataset parameter, making previous suggestions that reference this parameter invalid.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#868
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:141-144
Timestamp: 2025-05-05T16:32:55.163Z
Learning: The column metadata table (created by `_create_column_metadata_table`) is only needed for dataset-specific workflows in `CLP_S` and is obsolete for non-dataset workflows.
components/clp-package-utils/clp_package_utils/general.py (13)
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-package-utils/clp_package_utils/general.py:564-579
Timestamp: 2025-06-28T07:10:47.295Z
Learning: The validate_dataset function in components/clp-package-utils/clp_package_utils/general.py is designed to be called once upon function startup for dataset validation, not repeatedly during execution, making caching optimizations unnecessary.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:204-211
Timestamp: 2025-07-03T12:58:18.407Z
Learning: In the CLP codebase, the validate_and_cache_dataset function in components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py uses in-place updates of the existing_datasets set parameter rather than returning a new set, as preferred by the development team.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1050
File: components/clp-package-utils/clp_package_utils/scripts/native/search.py:239-244
Timestamp: 2025-06-29T22:01:05.569Z
Learning: In the CLP codebase, dataset validation is handled at the outer layer in wrapper scripts (like clp_package_utils/scripts/search.py) that call native implementation scripts (like clp_package_utils/scripts/native/search.py). The native scripts are internal components that receive pre-validated parameters, so they don't need their own dataset validation logic.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:188-212
Timestamp: 2025-07-03T12:59:46.638Z
Learning: In CLP schedulers (compression and query), runtime dataset validation is required; the helper validate_and_cache_dataset keeps a local cache and only queries the DB on cache misses because dataset additions are rare.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1004
File: components/clp-package-utils/clp_package_utils/scripts/native/decompress.py:139-144
Timestamp: 2025-06-24T08:54:14.438Z
Learning: In the CLP codebase, the get_orig_file_id function signature was changed after a recent merge to no longer accept a dataset parameter, making previous suggestions that reference this parameter invalid.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-07-03T13:33:27.460Z
Learning: In the CLP codebase, `from __future__ import annotations` imports are used to enable Python pipe notation for union types (e.g., `int | str`) in Python versions before 3.10, and should not be considered unused imports.
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/native/del_archives.py:104-110
Timestamp: 2024-11-15T16:21:52.122Z
Learning: In `clp_package_utils/scripts/native/del_archives.py`, when deleting archives, the `archive` variable retrieved from the database is controlled and is always a single string without path components. Therefore, it's acceptable to skip additional validation checks for directory traversal in this context.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#868
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:141-144
Timestamp: 2025-05-05T16:32:55.163Z
Learning: The column metadata table (created by `_create_column_metadata_table`) is only needed for dataset-specific workflows in `CLP_S` and is obsolete for non-dataset workflows.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within `components/core/tools/scripts/lib_install/`, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like `readonly` declarations.
Learnt from: haiqi96
PR: y-scope/clp#651
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:0-0
Timestamp: 2025-01-16T16:58:43.190Z
Learning: In the clp-package compression flow, path validation and error handling is performed at the scheduler level rather than in the compress.py script to maintain simplicity and avoid code duplication.
🧬 Code Graph Analysis (1)
components/clp-package-utils/clp_package_utils/general.py (3)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
Database (80-159)
get_clp_connection_params_and_type (138-159)
components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py (1)
fetch_existing_datasets (173-185)
components/clp-py-utils/clp_py_utils/sql_adapter.py (3)
SQL_Adapter (60-131)
create_connection (99-105)
create_connection (113-114)
🪛 Pylint (3.3.7)
components/clp-package-utils/clp_package_utils/general.py
[error] 35-35: Unable to import 'clp_py_utils.sql_adapter'
(E0401)
[error] 572-572: Undefined variable 'clp_config'
(E0602)
🪛 Flake8 (7.2.0)
components/clp-package-utils/clp_package_utils/general.py
[error] 572-572: undefined name 'clp_config'
(F821)
🔇 Additional comments (1)
components/clp-package-utils/clp_package_utils/general.py (1)
10-10: LGTM! Imports are correctly added for the new functionality. All the new imports (`closing`, `Database`, `fetch_existing_datasets`, `SQL_Adapter`) are properly used in the new `validate_dataset` function.
Also applies to: 18-18, 20-20, 35-35
@coderabbitai
Tip: For best results, initiate chat on the files or code changes.
🧩 Analysis chain
@Bill-hbrhbr I'll conduct a comprehensive review of all imports in your PR to check for unused or missing imports, and verify proper import hygiene.
🏁 Script executed:
Length of output: 10675
Based on my comprehensive import analysis, here's the complete review:
📋 Import Review Results
✅ All Files Have Excellent Import Hygiene
1.
2.
3.
✅ Validation Results
💡 Summary
Outstanding import management! 🎉 Your PR demonstrates exemplary import hygiene:
No cleanup or changes needed - your import organization is perfect! This level of import discipline makes the codebase more maintainable and demonstrates excellent coding practices. 🎯
🧠 Learnings used
haiqi96
left a comment
Looks good to me.
kirkrodrigues
left a comment
For the PR title, how about:
feat(clp-package): Add utility function to validate dataset's existence from the CLI.
Co-authored-by: kirkrodrigues <[email protected]>
kirkrodrigues
left a comment
Modified the PR title directly.
…ce in CLI scripts. (y-scope#1036)
Description
Add utility functions to allow CLP package scripts to check whether the dataset name exists in the metadata, both upon script startup and during execution.
`CLP_S` storage engine.
This facilitates PR #1004 if we want to output error messages to users when they search for a dataset that does not currently exist in the metadata database.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit
Bug Fixes
Refactor