Genie gives you AI inside Databricks. This skill gives you Databricks inside AI.
Run code on Databricks clusters while your agent orchestrates everything else — other skills, subagents, MCPs, local files, and parallel hypothesis validation. One session, no boundaries.
Databricks is powerful. But Databricks inside an AI agent that can parallelize work, compose tools, and cross every boundary? That's something else.
Works with Claude Code, Cursor, GitHub Copilot, and 40+ other agents.
Genie works inside one notebook, one workspace. When the real work crosses boundaries, it stops. Your AI agent doesn't.
| What you need | Genie | Your agent + this skill |
|---|---|---|
| Analyze a repo and cross-reference with Databricks logs | Workspace only | Reads repo + queries cluster in one session |
| Validate 3 hypotheses in parallel on different datasets | One notebook at a time | Spawns subagents, each running its own cluster query |
| Train on cluster, compare with local baselines, commit results | Can't access local files or git | Cluster compute + local files + git — same session |
| Use an MCP to enrich data before running Spark | No MCP support | Calls MCPs, APIs, other skills, then sends to cluster |
| Explore Python + Scala + SQL across multiple repos | Single-language notebooks | Subagents explore each language, agent synthesizes |
| Resume after cluster eviction | Start over | Append-only session log with replay |
The difference isn't features; it's architecture. Genie is an assistant scoped to Databricks. This skill makes Databricks one resource inside an orchestrator that can do anything: use GSD, superpowers, compose skills, spawn subagents, interact with MCPs, and parallelize work across tools.
```
/plugin marketplace add wedneyyuri/databricks-repl
/plugin install databricks-repl@wedneyyuri-databricks-repl
```
```
npx skills add wedneyyuri/databricks-repl
```

The CLI detects which agents you have and installs to each one automatically.
```text
You: "Load the customers table, train a classifier,
      compare with last quarter's local baseline,
      and open a PR with the results"

Claude:
  → creates a REPL session on your Databricks cluster
  → runs the training code, captures outputs as files
  → reads your local baseline for comparison
  → consolidates everything into a clean .py file
  → commits and opens the PR
```
Five tools, one session. No switching between terminal, notebooks, and browser.
- You describe the task — your agent decides what to run
- Scripts handle the plumbing — auth, sessions, polling, output capture
- Agent sees only metadata — file paths and status, never raw output
Context stays clean. Sessions stay productive for 50+ interactions.
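The metadata-only pattern can be sketched in a few lines of plain Python. This is an illustration, not the skill's actual scripts — the function and file names here are made up — but it shows the idea: raw output goes to disk, and only a small summary ever reaches the agent.

```python
import json
import tempfile
from pathlib import Path

def capture(raw_output: str, session_dir: str) -> dict:
    """Write raw REPL output to a file; hand the agent only metadata.

    Keeping bulky stdout out of the agent's context window is what lets
    a session stay useful across many interactions.
    """
    out = Path(session_dir) / "output_0001.txt"
    out.write_text(raw_output)
    return {"path": str(out), "status": "ok", "bytes": len(raw_output.encode())}

# The agent sees three small fields and opens the file only if it needs to.
meta = capture("epoch 1: loss=0.42\n" * 1000, tempfile.mkdtemp())
print(json.dumps(meta, indent=2))
```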
| Example | What It Shows |
|---|---|
| primes | Basic Python execution on a Databricks cluster |
| monte-carlo-pi | Distributed Spark — estimate π scaling from 100M to 10B samples |
| iris-classification | Full ML pipeline — load, train, evaluate, persist model to Volumes |
| Skill | What It Does |
|---|---|
| databricks-repl | Execute Python on Databricks via a stateful REPL session |
| databricks-repl-consolidate | Turn a REPL session into a single committable .py file |
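A stateful REPL session of this kind can be built on the Databricks SDK's command-execution API: an execution context keeps variables alive between calls. The sketch below shows the pattern, not the skill's actual implementation; the cluster id and demo snippets are placeholders.

```python
def open_context(w, cluster_id):
    """Create an execution context on the cluster; it holds state between calls."""
    from databricks.sdk.service import compute  # pip install databricks-sdk
    return w.command_execution.create(
        cluster_id=cluster_id, language=compute.Language.PYTHON
    ).result()

def run(w, cluster_id, context_id, code):
    """Run one snippet in the shared context and return its text result."""
    from databricks.sdk.service import compute
    resp = w.command_execution.execute(
        cluster_id=cluster_id,
        context_id=context_id,
        language=compute.Language.PYTHON,
        command=code,
    ).result()
    return resp.results.data

def main():
    from databricks.sdk import WorkspaceClient
    w = WorkspaceClient()  # reads auth from ~/.databrickscfg
    cluster_id = "0000-000000-example"  # placeholder: use your cluster's id
    ctx = open_context(w, cluster_id)
    run(w, cluster_id, ctx.id, "x = 40")
    print(run(w, cluster_id, ctx.id, "x + 2"))  # same context, so x persists

# Call main() once your profile and cluster are configured.
```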
- Databricks CLI with a profile in `~/.databrickscfg`
- Databricks SDK for Python (`pip install databricks-sdk`)
- A running classic all-purpose cluster
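To confirm the CLI profile and SDK are wired up before your agent first uses the skill, a quick preflight check is handy. This is an optional convenience, not part of the skill; it assumes the default profile in `~/.databrickscfg`.

```python
def preflight(profile: str = "DEFAULT"):
    """Verify the SDK can authenticate with the given CLI profile."""
    from databricks.sdk import WorkspaceClient  # pip install databricks-sdk
    w = WorkspaceClient(profile=profile)
    me = w.current_user.me()
    print(f"Authenticated to {w.config.host} as {me.user_name}")

# preflight()  # uncomment once your profile is configured
```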
These skills follow the Agent Skills Specification. If you prefer not to use the marketplace or npx skills, copy the skills manually:
```bash
git clone https://github.com/wedneyyuri/databricks-repl.git /tmp/databricks-repl

# Cursor
cp -r /tmp/databricks-repl/skills/databricks-repl .cursor/skills/
cp -r /tmp/databricks-repl/skills/databricks-repl-consolidate .cursor/skills/

# GitHub Copilot
mkdir -p .github/skills
cp -r /tmp/databricks-repl/skills/databricks-repl .github/skills/
cp -r /tmp/databricks-repl/skills/databricks-repl-consolidate .github/skills/
```