Skip to content

feat : Add CLI Command for Uploading Resources to OpenML #1507

@pankajbaid567

Description

@pankajbaid567

Metadata

New Tests Added: Yes

Documentation Updated: No (CLI help text serves as documentation)

Change Log Entry: "Add CLI commands for uploading resources: openml upload dataset, openml upload flow, and openml upload run"

Details
What does this PR implement/fix?

This PR adds new CLI subcommands under openml upload to enable users to upload resources directly from the command line:

openml upload dataset <file_path> - Upload a dataset from ARFF or CSV file
openml upload flow <file_path> - Upload a flow from a serialized model file
openml upload run <run_file> - Upload a run from a run XML file
Why is this change necessary? What is the problem it solves?
Currently, users must write Python code to upload resources to OpenML. This creates friction for users who want to quickly share datasets, models, or experimental results. Adding CLI commands makes it easier to upload resources directly from the command line, especially useful for automation and scripting scenarios.

This directly addresses the ESoC 2025 goal of "Improving user experience in OpenML" by lowering the barrier for contribution.

How can I reproduce the issue this PR is solving and its solution?
Before (requires Python code):

import openml
dataset = openml.datasets.create_dataset(
    name="My Dataset",
    description="Dataset description",
    data_file="data.arff"
)
dataset.publish()

After (CLI commands):


# Upload a dataset
openml upload dataset data.arff --name "My Dataset" --description "Dataset description"
# Upload a flow
openml upload flow model.pkl --name "My Model" --description "Model description"
# Upload a run from file
openml upload run run_results.xml

Implementation Details:
Added upload functions in openml/cli.py: upload_dataset(), upload_flow(), and upload_run()
Integrated into main CLI parser with proper argument handling
Added comprehensive test suite in tests/test_openml/test_cli.py
Uses existing upload functions from respective modules
Requires API key to be configured
All tests use mocked API calls
Any other comments?
All pre-commit hooks pass (ruff, mypy, formatting)
No breaking changes
Follows project code style and patterns
Includes proper error handling for authentication
Ready for review

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions