This repository was archived by the owner on Mar 23, 2026. It is now read-only.

MLOps Guard — Real Scenarios

Examples of the PreToolUse hook auditing ML code writes: what it intercepts, what it approves, and what it skips entirely.


Scenario 1: Training Script Missing Seeds (INTERCEPTS)

File: train.py | Checks passing: 1/4 (tracking only)

Code written:

import torch
import wandb
from torch import nn
from torch.utils.data import DataLoader

wandb.init(project="my-exp")

model = nn.Linear(128, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

Hook fires:

MLOps gaps: • SEED → add torch.manual_seed(42); np.random.seed(42) before model init • DEVICE → add device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

→ Proceed as-is, or I'll add these automatically?

User: "add them" → Claude inserts both patterns before writing the file.
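The hook's actual implementation isn't shown in these scenarios, but the SEED and DEVICE checks could plausibly be simple regex scans over the code being written. A minimal sketch (function names, patterns, and gap labels are illustrative, not the hook's real API):

```python
import re

def check_seed(code: str) -> bool:
    """Return True if the code appears to set a random seed."""
    seed_patterns = [
        r"torch\.manual_seed\(",
        r"np\.random\.seed\(",
        r"jax\.random\.PRNGKey\(",
        r"random\.seed\(",
    ]
    return any(re.search(p, code) for p in seed_patterns)

def check_device(code: str) -> bool:
    """Return True if the code selects a compute device explicitly."""
    return bool(re.search(r"torch\.device\(|jax\.devices\(", code))

def audit(code: str) -> list[str]:
    """Collect the gap labels a hook like this would report."""
    gaps = []
    if not check_seed(code):
        gaps.append("SEED")
    if not check_device(code):
        gaps.append("DEVICE")
    return gaps
```

On the Scenario 1 snippet above, `audit` would return both `SEED` and `DEVICE`, matching the gaps the hook reports.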


Scenario 2: Full Training Script — Passes (APPROVES)

File: trainer.py | Checks passing: 4/4

import os
from pathlib import Path

import numpy as np
import torch
import wandb

# Seeds
torch.manual_seed(42)
np.random.seed(42)

# Tracking
cfg = {"lr": 1e-3, "epochs": 3}  # example config
wandb.init(project="my-exp", config=cfg)

# Device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Paths — no hardcoded absolute paths
data_dir = Path(os.environ.get("DATA_DIR", "data/"))

Hook: approves silently.


Scenario 3: Test File (SKIPS)

File: test_dataset.py

import pytest
import torch
from src.data.dataset import MyDataset

def test_dataset_length():
    ds = MyDataset(data_dir="tests/fixtures/")
    assert len(ds) == 10

Hook: filename matches test_ pattern → approves immediately, no audit.
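The skip rule described here is a filename check that runs before any audit. A minimal sketch of that gate, assuming the `test_` prefix convention the scenario mentions (the real hook's matching logic may differ):

```python
from pathlib import Path

def should_skip(filename: str) -> bool:
    """Skip the audit for pytest-style test files (test_*.py)."""
    return Path(filename).name.startswith("test_")
```

With this gate, `test_dataset.py` is approved immediately while `train.py` falls through to the full audit.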


Scenario 4: Hardcoded Path Detected (INTERCEPTS)

File: finetune.py | Checks passing: 2/4 (seed + tracking, but paths fail)

import torch
import mlflow

torch.manual_seed(42)
mlflow.start_run()

model_path = "/home/techknowmad/models/bert-base/"   # ← flagged
data_path = "/Users/admin/datasets/my_corpus/"         # ← flagged

Hook fires:

MLOps gaps: • PATHS → replace /home/techknowmad/models/bert-base/ with Path(os.environ["MODEL_DIR"]) or a config value

→ Proceed as-is, or I'll add these automatically?
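Detecting hardcoded absolute paths like the two flagged above could be done with a regex over quoted string literals. A hypothetical sketch (the prefixes and regex are illustrative; the hook may use a different heuristic):

```python
import re

# Match quoted strings starting with common absolute-path prefixes
# (Linux home dirs, macOS home dirs, Windows drive letters).
ABS_PATH = re.compile(r"""["'](/home/|/Users/|[A-Za-z]:\\)[^"']*["']""")

def find_hardcoded_paths(code: str) -> list[str]:
    """Return the absolute path literals found in the code."""
    return [m.group(0).strip("\"'") for m in ABS_PATH.finditer(code)]
```

Paths built from `os.environ` or `pathlib` pass untouched, since no quoted literal begins with an absolute prefix.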


Scenario 5: Utility/Helper File (SKIPS)

File: utils/logging_utils.py

import logging
from rich.logging import RichHandler

def setup_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    logging.basicConfig(handlers=[RichHandler()])
    return logging.getLogger(name)

Hook: no ML indicators in content → approves immediately.
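The "no ML indicators" gate implies a content scan for ML library names before auditing. A minimal sketch, assuming a substring heuristic (the indicator list and matching strategy are guesses, not the hook's actual ones):

```python
# Library names whose presence marks a file as ML code worth auditing.
ML_INDICATORS = ("torch", "tensorflow", "jax", "sklearn", "optax", "wandb", "mlflow")

def has_ml_indicators(code: str) -> bool:
    """Heuristic: does the file import or mention an ML library?"""
    return any(indicator in code for indicator in ML_INDICATORS)
```

The logging helper above mentions none of these names, so it is approved without an audit; any of the training scripts in Scenarios 1–4 would trip the gate.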


Scenario 6: JAX Fine-tuning Script (INTERCEPTS)

File: finetune_jax.py | Checks passing: 0/4

import jax
import jax.numpy as jnp
import optax
from transformers import FlaxBertForSequenceClassification

model = FlaxBertForSequenceClassification.from_pretrained("bert-base-uncased")
optimizer = optax.adam(learning_rate=2e-5)

Hook fires:

MLOps gaps: • SEED → add key = jax.random.PRNGKey(42) before model init • TRACKING → add wandb.init(project="...", config=cfg) or mlflow.start_run() • PATHS → confirm data paths use os.environ or pathlib • DEVICE → confirm device strategy (jax.devices() aware)

→ Proceed as-is, or I'll add these automatically?