
🧠 AI-Studio-ClearML

This repository provides a minimal, reproducible example of how to use ClearML to build machine learning pipelines, track experiments, and manage datasets using both task-based pipelines and function-based pipelines.


📦 Project Structure

├── model_artifacts/                 # Example outputs or saved models
├── work_dataset/                    # Dataset samples and usage examples
├── demo_functions.py                # Base functions used in the ClearML demos
├── demo_using_artifacts_example.py  # Demonstrates artifact loading
├── main.py                          # Entry point
├── pipeline_from_tasks.py           # Pipeline built from existing ClearML Tasks
├── step1_dataset_artifact.py        # Step 1: Upload dataset as artifact
├── step2_data_preprocessing.py      # Step 2: Preprocess dataset
└── step3_train_model.py             # Step 3: Train model using preprocessed data

🧪 Features

  • ✅ Task-based pipeline using PipelineController.add_step(...)
  • [TBD] Function-based pipeline using PipelineController.add_function_step(...)
  • ✅ Reusable ClearML Task templates
  • ✅ Dataset and model artifact management with ClearML
  • ✅ End-to-end ML workflow: Dataset → Preprocessing → Training
  • ✅ Fully compatible with ClearML Hosted and ClearML Server
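The task-based pipeline above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the project name, base-task names, and step names are assumptions.

```python
"""Sketch of a task-based pipeline built with PipelineController.add_step(...).
Project/task names below are placeholders; adjust them to match your ClearML
dashboard. ClearML is imported lazily so the sketch reads without a server."""

PIPELINE_QUEUE = "pipeline"  # must match the queue served by your clearml-agent


def build_pipeline():
    from clearml.automation import PipelineController  # lazy import

    pipe = PipelineController(
        name="pipeline demo",
        project="AI-Studio-ClearML",   # assumed project name
        version="0.0.1",
    )
    # Each step clones a previously registered base task and enqueues it.
    pipe.add_step(
        name="stage_data",
        base_task_project="AI-Studio-ClearML",
        base_task_name="step1 dataset artifact",
    )
    pipe.add_step(
        name="stage_process",
        parents=["stage_data"],        # runs after the dataset step
        base_task_project="AI-Studio-ClearML",
        base_task_name="step2 data preprocessing",
    )
    pipe.add_step(
        name="stage_train",
        parents=["stage_process"],
        base_task_project="AI-Studio-ClearML",
        base_task_name="step3 train model",
    )
    return pipe

# Usage (requires a configured ClearML server and a running agent):
#   build_pipeline().start(queue=PIPELINE_QUEUE)
```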

🚀 Getting Started

1. Install Dependencies

pip install clearml

2. Configure ClearML

Set up ClearML by running:

clearml-init

You will be prompted to enter:

  • Your ClearML credentials (API key and secret)

If you do not have an account yet, register for a free one at https://app.clear.ml.
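As an alternative to the interactive clearml-init prompt, ClearML also reads its configuration from environment variables. A sketch with placeholder values (replace the keys with credentials generated in the ClearML web UI):

```shell
# Hosted ClearML endpoints; point these at your own server if self-hosting.
export CLEARML_API_HOST="https://api.clear.ml"
export CLEARML_WEB_HOST="https://app.clear.ml"
export CLEARML_FILES_HOST="https://files.clear.ml"
# Placeholder credentials -- substitute your real access key and secret.
export CLEARML_API_ACCESS_KEY="your-access-key"
export CLEARML_API_SECRET_KEY="your-secret-key"
```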


3. Create a ClearML Agent

Install the ClearML agent on your machine or server:

pip install clearml-agent

🛠️ How to Use

Using Colab: refer to ClearML_Pipeline_Demo.ipynb.

🔁 Option 1: Pipeline from Predefined ClearML Tasks

To use a task-based pipeline, follow these steps:

Step 1: Register the Base Tasks

Before running the pipeline, execute the following scripts **once** to create reusable ClearML Tasks:

Note: on the first run, comment out `task.execute_remotely()` in each of the three step scripts so that the task template is created successfully.

# Step 1: Upload dataset
python step1_dataset_artifact.py

# Step 2: Preprocess dataset
python step2_data_preprocessing.py

# Step 3: Train model
python step3_train_model.py

These will appear in your ClearML dashboard and serve as base tasks for the pipeline.
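For illustration, a base-task script like step1_dataset_artifact.py boils down to the following sketch: write a dataset file, register a ClearML task, and upload the file as an artifact. The dataset contents, project name, and task name are placeholders, not the repository's actual code.

```python
"""Sketch of a dataset-upload base task. ClearML is imported lazily so the
file-writing helper can run without a server; names are illustrative."""
import csv
import os
import tempfile


def write_sample_dataset(path):
    # A tiny stand-in dataset so the sketch is self-contained.
    rows = [("x", "y"), (1, 2), (3, 4), (5, 6)]
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


def register_dataset_task(csv_path, queue=None):
    from clearml import Task  # lazy import: needs clearml + credentials

    task = Task.init(
        project_name="AI-Studio-ClearML",       # assumed project name
        task_name="step1 dataset artifact",      # assumed task name
    )
    if queue:
        # Leave queue=None on the first run, mirroring the note above about
        # commenting out task.execute_remotely().
        task.execute_remotely(queue_name=queue)
    task.upload_artifact(name="dataset", artifact_object=csv_path)
    return task

# Usage (requires a configured ClearML server):
#   path = write_sample_dataset(os.path.join(tempfile.gettempdir(), "sample.csv"))
#   register_dataset_task(path)
```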

Step 1.5: Initialize the ClearML Queue

Create a queue named pipeline (or a custom name of your choice), and make sure the same name is used in pipeline_from_tasks.py:

pipe.start(queue="pipeline")


Then start an agent worker that serves the queue:

clearml-agent daemon --queue pipeline

Step 2: Run the Pipeline

Once all base tasks are registered, run the pipeline:

python main.py  # entry point that calls run_pipeline()

🔧 [TBD] Option 2: Pipeline from Local Python Functions

This version demonstrates using add_function_step(...) to wrap Python logic as pipeline steps.
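A minimal sketch of what such a function-based pipeline could look like; the step functions and names are illustrative assumptions, not the repository's code. Each function becomes a pipeline step, and a later step consumes an earlier step's return value via the `${step_name.return_name}` reference syntax.

```python
"""Sketch of a function-based pipeline using add_function_step(...).
The step logic is deliberately trivial; ClearML is imported lazily."""


def make_numbers():
    # First step: produce some data.
    return list(range(5))


def double(values):
    # Second step: transform the data from the first step.
    return [v * 2 for v in values]


def build_function_pipeline():
    from clearml.automation import PipelineController  # lazy import

    pipe = PipelineController(
        name="function pipeline demo",
        project="AI-Studio-ClearML",  # assumed project name
        version="0.0.1",
    )
    pipe.add_function_step(
        name="make_numbers",
        function=make_numbers,
        function_return=["numbers"],
    )
    pipe.add_function_step(
        name="double",
        function=double,
        # Wire the previous step's output into this step's argument.
        function_kwargs=dict(values="${make_numbers.numbers}"),
        function_return=["doubled"],
    )
    return pipe

# Usage (requires a configured ClearML server):
#   build_function_pipeline().start(queue="pipeline")
```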


🧩 Run Individual Pipeline Steps

You can run each task separately as well:

Note: on the first run, comment out task.execute_remotely() in each script so that the task template is created successfully.

# Step 1: Upload dataset
python step1_dataset_artifact.py

# Step 2: Preprocess data
python step2_data_preprocessing.py

# Step 3: Train model
python step3_train_model.py

📘 References


🙌 Acknowledgments

This project is developed and maintained by:


📄 License

This project is licensed under the MIT License. See the LICENSE file for details.