Build, experiment, and deploy ML pipelines with confidence
Documentation • Quick Start • Examples • Contributing
LabChain is a production-ready ML experimentation framework that combines the flexibility of research with the rigor of production deployment. Stop fighting with boilerplate code and focus on what matters: your models.
- Modular by Design
- Production Ready
- Reproducible
- Experimental Features
```bash
pip install framework3
```

```python
from labchain import Container, F3Pipeline
from labchain.plugins.filters import StandardScalerPlugin, KnnFilter
from labchain.plugins.metrics import F1, Precission, Recall
from labchain.base import XYData
from sklearn.datasets import load_iris

# Load data
iris = load_iris()
X = XYData.mock(iris.data)
y = XYData.mock(iris.target)

# Build pipeline
pipeline = F3Pipeline(
    filters=[
        StandardScalerPlugin(),
        KnnFilter(n_neighbors=5)
    ],
    metrics=[F1("weighted"), Precission("weighted"), Recall("weighted")]
)

# Train and evaluate
pipeline.fit(X, y)
predictions = pipeline.predict(X)
results = pipeline.evaluate(X, y, predictions)
print(results)
# {'F1': 0.95, 'Precision': 0.95, 'Recall': 0.95}
```

That's it! You just built, trained, and evaluated an ML pipeline.
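Note that the quick start scores the pipeline on the same data it was trained on. For a fairer estimate you would hold out a test split. Here is a minimal sketch using only the calls shown above plus scikit-learn's `train_test_split`; the variable names are illustrative:

```python
from sklearn.model_selection import train_test_split

# Hold out 25% of the iris samples for evaluation
X_tr, X_te, y_tr, y_te = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42
)

X_train, y_train = XYData.mock(X_tr), XYData.mock(y_tr)
X_test, y_test = XYData.mock(X_te), XYData.mock(y_te)

# Fit on the training split, evaluate on the held-out split
pipeline.fit(X_train, y_train)
test_preds = pipeline.predict(X_test)
print(pipeline.evaluate(X_test, y_test, test_preds))
```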
```python
# Mix and match components like LEGO blocks
from labchain.plugins.filters import (
    PCAPlugin,
    StandardScalerPlugin,
    ClassifierSVMPlugin
)

pipeline = F3Pipeline(
    filters=[
        StandardScalerPlugin(),
        PCAPlugin(n_components=2),
        ClassifierSVMPlugin(kernel='rbf')
    ]
)
```
```python
from labchain.plugins.filters import Cached

# Cache expensive operations automatically
pipeline = F3Pipeline(
    filters=[
        Cached(
            filter=ExpensivePreprocessor(),
            cache_data=True,
            cache_filter=True
        ),
        MyModel()
    ]
)
```
```python
from labchain import WandbOptimizer

# Optimize with Weights & Biases
optimizer = WandbOptimizer(
    project="my-experiment",
    scorer=F1(),
    method="bayes",
    n_trials=50
)

# Define search space
pipeline = F3Pipeline(
    filters=[
        KnnFilter().grid({
            'n_neighbors': [3, 5, 7, 9]
        })
    ]
)

optimizer.optimize(pipeline)
optimizer.fit(X_train, y_train)
```

Deploy pipelines without deploying code:
```python
# On your laptop
@Container.bind(persist=True)
class MyCustomFilter(BaseFilter):
    def predict(self, x):
        return x * 2

Container.storage = S3Storage(bucket="my-models")
Container.ppif.push_all()

# On production server (no source code needed!)
from labchain.base import BasePlugin

pipeline = BasePlugin.build_from_dump(config, Container.ppif)
predictions = pipeline.predict(data)  # Just works! ✨
```
```python
from labchain import HPCPipeline

# Automatic Spark distribution
pipeline = HPCPipeline(
    app_name="distributed-training",
    filters=[Filter1(), Filter2(), Filter3()]
)
pipeline.fit(large_dataset)
```

### Classification with Cross-Validation
```python
from labchain import F3Pipeline, KFoldSplitter
from labchain.plugins.filters import StandardScalerPlugin, ClassifierSVMPlugin
from labchain.plugins.metrics import F1, Precission, Recall

pipeline = F3Pipeline(
    filters=[
        StandardScalerPlugin(),
        ClassifierSVMPlugin(kernel='rbf', C=1.0)
    ],
    metrics=[F1(), Precission(), Recall()]
).splitter(
    KFoldSplitter(n_splits=5, shuffle=True, random_state=42)
)

pipeline.fit(X_train, y_train)
results = pipeline.evaluate(X_test, y_test, pipeline.predict(X_test))
```

### Parallel Processing
```python
from labchain import LocalThreadPipeline
from labchain.plugins.filters import Filter1, Filter2, Filter3

# Process filters in parallel
pipeline = LocalThreadPipeline(
    filters=[
        Filter1(),  # Runs in parallel
        Filter2(),  # Runs in parallel
        Filter3()   # Runs in parallel
    ]
)

# Results are concatenated automatically
predictions = pipeline.predict(X)
```

### Custom Components
```python
from labchain import Container
from labchain.base import BaseFilter, XYData

@Container.bind()
class MyCustomFilter(BaseFilter):
    def __init__(self, threshold: float = 0.5):
        super().__init__(threshold=threshold)

    def fit(self, x: XYData, y: XYData = None):
        # Your training logic
        pass

    def predict(self, x: XYData) -> XYData:
        # Your prediction logic
        return XYData.mock(x.value > self.threshold)

# Use it like any other filter
pipeline = F3Pipeline(filters=[MyCustomFilter(threshold=0.7)])
```

### Version Control & Rollback
```python
# Version 1
@Container.bind(persist=True)
class MyModel(BaseFilter):
    def predict(self, x):
        return x * 1

Container.ppif.push_all()
hash_v1 = Container.pcm.get_class_hash(MyModel)

# Version 2
@Container.bind(persist=True)
class MyModel(BaseFilter):
    def predict(self, x):
        return x * 2

Container.ppif.push_all()
hash_v2 = Container.pcm.get_class_hash(MyModel)

# Rollback to V1
ModelV1 = Container.ppif.get_version("MyModel", hash_v1)
```

| Resource | Description |
|---|---|
| Quick Start Guide | Get up and running in 5 minutes |
| Tutorials | Step-by-step guides and examples |
| API Reference | Complete API documentation |
| Remote Injection | Deploy without code (experimental) |
| Architecture | Deep dive into design principles |
| Best Practices | Production-ready patterns |
- [x] Core pipeline functionality
- [x] Automatic caching system
- [x] Hyperparameter optimization
- [x] Distributed processing (Spark)
- [x] Remote injection (experimental)
- [ ] Multi-cloud storage backends (GCS, Azure)
- [ ] Real-time inference API
- [ ] AutoML capabilities
- [ ] Model registry integration
- [ ] Kubernetes deployment templates
We ❤️ contributions! Here's how you can help:
- Report bugs by opening an issue
- Suggest features in discussions
- Improve documentation
- Submit pull requests
- Star the repo to show support
```bash
# Clone the repository
git clone https://github.com/manucouto1/LabChain.git
cd LabChain

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest tests/

# Build documentation
cd docs && mkdocs serve
```

- Follow the PEP 8 style guide
- Add tests for new features (see the sketch after this list)
- Update documentation
- Keep commits atomic and well-described
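For example, a new filter can be covered with a plain pytest case built only from the pieces shown in this README (`Container.bind`, `BaseFilter`, `XYData`, `F3Pipeline`). This is a rough sketch, not taken from the actual test suite; it assumes, as in the custom-components example above, that predictions come back as `XYData` exposing a `.value` array:

```python
import numpy as np

from labchain import Container, F3Pipeline
from labchain.base import BaseFilter, XYData


@Container.bind()
class DoubleFilter(BaseFilter):
    """Toy filter used only by this test: multiplies inputs by two."""

    def fit(self, x: XYData, y: XYData = None):
        pass  # nothing to learn

    def predict(self, x: XYData) -> XYData:
        return XYData.mock(x.value * 2)


def test_double_filter_doubles_values():
    x = XYData.mock(np.array([1.0, 2.0, 3.0]))
    y = XYData.mock(np.zeros(3))

    pipeline = F3Pipeline(filters=[DoubleFilter()])
    pipeline.fit(x, y)

    preds = pipeline.predict(x)
    assert np.allclose(preds.value, [2.0, 4.0, 6.0])
```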
- Issue Tracker - Report bugs and request features
- Email - Contact the maintainers
- Documentation - Comprehensive guides
This project is licensed under the AGPL-3.0 License - see the LICENSE file for details.
- ✅ Use LabChain for free in your projects
- ✅ Modify and distribute the code
⚠️ If you modify and distribute LabChain, you must release your changes under AGPL-3.0.
⚠️ If you use LabChain in a network service, you must make the source available.
Made with ❤️ and Python