Description
Metadata
New Tests Added: Yes
Documentation Updated: No (CLI help text serves as documentation)
Change Log Entry: "Add CLI commands for uploading resources: openml upload dataset, openml upload flow, and openml upload run"
Details
What does this PR implement/fix?
This PR adds new CLI subcommands under openml upload that let users upload resources directly from the command line:
openml upload dataset <file_path> - Upload a dataset from ARFF or CSV file
openml upload flow <file_path> - Upload a flow from a serialized model file
openml upload run <run_file> - Upload a run from a run XML file
Why is this change necessary? What is the problem it solves?
Currently, users must write Python code to upload resources to OpenML. This creates friction for users who want to quickly share datasets, models, or experimental results. CLI commands remove that step and are especially useful in automation and scripting scenarios.
This directly addresses the ESoC 2025 goal of "Improving user experience in OpenML" by lowering the barrier to contribution.
How can I reproduce the issue this PR is solving and its solution?
Before (requires Python code):
import openml

dataset = openml.datasets.create_dataset(
    name="My Dataset",
    description="Dataset description",
    data_file="data.arff"
)
dataset.publish()
After (CLI commands):
# Upload a dataset
openml upload dataset data.arff --name "My Dataset" --description "Dataset description"
# Upload a flow
openml upload flow model.pkl --name "My Model" --description "Model description"
# Upload a run from file
openml upload run run_results.xml
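For reviewers who want a picture of how these commands parse, here is a minimal sketch using only argparse from the standard library. It is not the code in openml/cli.py: the function name build_parser, the dest names, and the choice to make --name and --description optional are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the `openml upload ...` commands above."""
    parser = argparse.ArgumentParser(prog="openml")
    subcommands = parser.add_subparsers(dest="command", required=True)

    upload = subcommands.add_parser("upload", help="Upload a resource to OpenML")
    resources = upload.add_subparsers(dest="resource", required=True)

    # openml upload dataset <file_path> [--name ...] [--description ...]
    dataset = resources.add_parser("dataset", help="Upload an ARFF or CSV file")
    dataset.add_argument("file_path")
    dataset.add_argument("--name", default=None)  # optionality is an assumption
    dataset.add_argument("--description", default="")

    # openml upload flow <file_path> [--name ...] [--description ...]
    flow = resources.add_parser("flow", help="Upload a serialized model file")
    flow.add_argument("file_path")
    flow.add_argument("--name", default=None)
    flow.add_argument("--description", default="")

    # openml upload run <run_file>
    run = resources.add_parser("run", help="Upload a run XML file")
    run.add_argument("run_file")

    return parser


# Parse the dataset example from above without touching the network.
args = build_parser().parse_args(
    ["upload", "dataset", "data.arff", "--name", "My Dataset",
     "--description", "Dataset description"]
)
print(args.resource, args.file_path, args.name)
```

Nesting a second add_subparsers call under upload keeps each resource's arguments separate, so the run command does not inherit --name or --description.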
Implementation Details:
Added upload functions in openml/cli.py: upload_dataset(), upload_flow(), and upload_run()
Integrated into main CLI parser with proper argument handling
Added comprehensive test suite in tests/test_openml/test_cli.py
The commands reuse the existing upload functions from the respective modules
An API key must be configured before uploading
All tests mock the API calls, so no request reaches the server
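The last two points can be illustrated with a small, self-contained sketch built on unittest.mock. The handler below is hypothetical: its signature, the load_run callable, and the error message are assumptions for illustration, not the upload_run() added in this PR. It only shows the general shape of checking for an API key and testing the publish step without contacting the server.

```python
from types import SimpleNamespace
from unittest import mock


def upload_run(args, load_run, api_key):
    """Hypothetical handler: require an API key, load the run, publish it."""
    if not api_key:
        # Fail early with a readable message instead of a server-side error.
        raise SystemExit("No API key configured; cannot upload.")
    run = load_run(args.run_file)
    run.publish()
    return run


def test_upload_run_publishes_once():
    fake_run = mock.Mock()                       # stands in for a run object
    load_run = mock.Mock(return_value=fake_run)  # stands in for XML loading
    args = SimpleNamespace(run_file="run_results.xml")

    result = upload_run(args, load_run=load_run, api_key="dummy-key")

    load_run.assert_called_once_with("run_results.xml")
    fake_run.publish.assert_called_once_with()
    assert result is fake_run


def test_upload_run_requires_api_key():
    load_run = mock.Mock()
    args = SimpleNamespace(run_file="run_results.xml")
    try:
        upload_run(args, load_run=load_run, api_key="")
    except SystemExit:
        pass
    else:
        raise AssertionError("expected SystemExit when no API key is set")
    # Nothing should be loaded or published without credentials.
    load_run.assert_not_called()


test_upload_run_publishes_once()
test_upload_run_requires_api_key()
```

Passing the loader in as a parameter keeps the sketch independent of the library; in a real test suite the same effect is usually achieved by patching the library function with mock.patch.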
Any other comments?
All pre-commit hooks pass (ruff, mypy, formatting)
No breaking changes
Follows project code style and patterns
Includes proper error handling for authentication
Ready for review