# Databricks MCP Server

A Model Context Protocol (MCP) server for generating production-ready Databricks applications with testing,
linting, and deployment setup from a single prompt. This agent relies heavily on scaffolding and
extensive validation to ensure high-quality outputs.

## TL;DR

**Primary Goal:** Create and deploy production-ready Databricks applications from a single natural language prompt. This MCP server combines scaffolding, validation, and deployment into a seamless workflow that goes from idea to running application.

**How it works:**
1. **Explore your data** - Query Databricks catalogs, schemas, and tables to understand your data
2. **Generate the app** - Scaffold a full-stack TypeScript application (tRPC + React) with proper structure
3. **Customize with AI** - Use workspace tools to read, write, and edit files naturally through conversation
4. **Validate rigorously** - Run builds, type checks, and tests to ensure quality
5. **Deploy confidently** - Push validated apps directly to the Databricks Apps platform

**Why use it:**
- **Speed**: Go from concept to deployed Databricks app in minutes, not hours or days
- **Quality**: Extensive validation ensures your app builds, passes tests, and is production-ready
- **Simplicity**: One natural language conversation handles the entire workflow

Perfect for data engineers and developers who want to build Databricks apps without the manual overhead of project setup, configuration, testing infrastructure, and deployment pipelines.

---

## Getting Started

### Quick Setup

1. **Set up Databricks credentials** (required for Databricks tools):
   ```bash
   export DATABRICKS_HOST="https://your-workspace.databricks.com"
   export DATABRICKS_TOKEN="dapi..."
   export DATABRICKS_WAREHOUSE_ID="your-warehouse-id"
   ```
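
   Once exported, these credentials are enough to reach the workspace directly. As a quick smoke test, the sketch below builds a request for the Databricks SQL Statement Execution REST API (`POST /api/2.0/sql/statements/`); the endpoint path and body fields come from the public REST API, while the helper name is ours:

   ```typescript
   interface StatementRequest {
     url: string;
     headers: Record<string, string>;
     body: { warehouse_id: string; statement: string; wait_timeout: string };
   }

   // Build (but do not send) a request for the SQL Statement Execution API.
   function buildStatementRequest(
     host: string,
     token: string,
     warehouseId: string,
     sql: string,
   ): StatementRequest {
     return {
       // strip any trailing slash so the path does not double up
       url: `${host.replace(/\/+$/, "")}/api/2.0/sql/statements/`,
       headers: {
         Authorization: `Bearer ${token}`,
         "Content-Type": "application/json",
       },
       body: { warehouse_id: warehouseId, statement: sql, wait_timeout: "30s" },
     };
   }

   // Usage with the env vars exported above:
   // const req = buildStatementRequest(
   //   process.env.DATABRICKS_HOST!, process.env.DATABRICKS_TOKEN!,
   //   process.env.DATABRICKS_WAREHOUSE_ID!, "SELECT 1");
   // await fetch(req.url, { method: "POST", headers: req.headers,
   //                        body: JSON.stringify(req.body) });
   ```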

2. **Configure your MCP client** (e.g., Claude Code):

   Add to your MCP config file (e.g., `~/.claude.json`):
   ```json
   {
     "mcpServers": {
       "databricks": {
         "command": "databricks",
         "args": ["experimental", "apps-mcp"],
         "env": {
           "DATABRICKS_HOST": "https://your-workspace.databricks.com",
           "DATABRICKS_TOKEN": "dapi...",
           "DATABRICKS_WAREHOUSE_ID": "your-warehouse-id"
         }
       }
     }
   }
   ```

3. **Create your first Databricks app:**

   Restart your MCP client and try:
   ```
   Create a Databricks app that shows sales data from main.sales.transactions
   with a chart showing revenue by region. Deploy it as "sales-dashboard".
   ```

   The AI will:
   - Explore your Databricks tables
   - Generate a full-stack application
   - Customize it based on your requirements
   - Validate that it passes all tests
   - Deploy it to Databricks Apps

---

## Features

All features are designed to support the end-to-end workflow of creating production-ready Databricks applications:

### 1. Data Exploration (Foundation)

Understand your Databricks data before building:

- **`databricks_list_catalogs`** - Discover available data catalogs
- **`databricks_list_schemas`** - Browse schemas in a catalog
- **`databricks_list_tables`** - Find tables in a schema
- **`databricks_describe_table`** - Get table details, columns, and sample data
- **`databricks_execute_query`** - Test queries and preview data

*These tools help the AI understand your data structure so it can generate relevant application code.*
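
Under the hood, each of these is a standard MCP tool invoked over JSON-RPC 2.0. A sketch of the `tools/call` message a client might send for `databricks_describe_table` follows; the argument name (`table`) is an assumption, so check the server's `tools/list` response for the real input schema:

```typescript
// Illustrative JSON-RPC 2.0 message for invoking an MCP tool.
// The argument name is assumed; the tool name is from the list above.
const toolCall = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "databricks_describe_table",
    arguments: { table: "main.sales.transactions" },
  },
};

console.log(JSON.stringify(toolCall));
```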

### 2. Application Generation (Core)

Create the application structure:

- **`scaffold_data_app`** - Generate a full-stack TypeScript application
  - Modern stack: Node.js, TypeScript, React, tRPC
  - Pre-configured build system, linting, and testing
  - Production-ready project structure
  - Databricks SDK integration

*This is the foundation of your application - a working, tested template ready for customization.*
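
To illustrate the shape of what gets generated: a tRPC router is, at its core, a map from procedure names to typed resolvers. This dependency-free sketch mimics that idea; the real scaffold wires it up with `@trpc/server` and its builder API, and the procedure name here is invented:

```typescript
// A tRPC-style router boiled down: procedure names mapped to resolvers.
type Resolver<I, O> = (input: I) => O;

const router: Record<string, Resolver<{ region: string }, { region: string; total: number }>> = {
  "revenue.byRegion": (input) => ({
    region: input.region,
    total: 0, // a real resolver would run a Databricks query here
  }),
};

const result = router["revenue.byRegion"]({ region: "EMEA" });
```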

### 3. Validation (Quality Assurance)

Ensure production-readiness before deployment:

- **`validate_data_app`** - Comprehensive validation
  - Build verification (`npm run build`)
  - Type checking (TypeScript compiler)
  - Test execution (full test suite)

*This step verifies that your application builds, type-checks, and passes its tests before it reaches production.*
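
Conceptually, validation amounts to running the scaffold's build, type-check, and test commands and reporting any failure. A rough sketch (the exact script names are assumptions about the scaffold's `package.json`):

```typescript
import { execSync } from "node:child_process";

// Run one validation step; report pass/fail instead of throwing.
function runStep(name: string, cmd: string): { name: string; ok: boolean } {
  try {
    execSync(cmd, { stdio: "pipe" });
    return { name, ok: true };
  } catch {
    return { name, ok: false };
  }
}

// Script names below are assumed, not taken from the actual scaffold.
const steps: Array<[string, string]> = [
  ["build", "npm run build"],
  ["typecheck", "npx tsc --noEmit"],
  ["test", "npm test"],
];
// const results = steps.map(([name, cmd]) => runStep(name, cmd));
```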

### 4. Deployment (Production Release)

Deploy validated applications to Databricks (enable with `--allow-deployment`):

- **`deploy_databricks_app`** - Push to the Databricks Apps platform
  - Automatic deployment configuration
  - Environment management
  - Production-grade setup

*The final step: your validated application running on Databricks.*

---

## Example Usage

Here are example conversations showing the end-to-end workflow for creating Databricks applications:

### Complete Workflow: Analytics Dashboard

This example shows how to go from data exploration to a deployed application:

**User:**
```
I want to create a Databricks app that visualizes customer purchases. The data is
in the main.sales catalog. Show me what tables are available and create a dashboard
with charts for total revenue by region and top products. Deploy it as "sales-insights".
```

**What happens:**
1. **Data Discovery** - The AI lists schemas and tables in main.sales
2. **Data Inspection** - The AI describes the purchases table structure
3. **App Generation** - The AI scaffolds a TypeScript application
4. **Customization** - The AI adds visualization components and queries
5. **Validation** - The AI runs the build, type check, and tests in a container
6. **Deployment** - The AI deploys to Databricks Apps as "sales-insights"

**Result:** A production-ready Databricks app running in minutes, with proper testing.
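
The same six steps can be written down as an ordered sequence of tool invocations. Tool names come from the Features section; the argument shapes are illustrative, not the tools' actual schemas:

```typescript
// The complete workflow as an ordered list of tool invocations.
const workflow = [
  { tool: "databricks_list_tables", args: { catalog: "main", schema: "sales" } },
  { tool: "databricks_describe_table", args: { table: "main.sales.purchases" } },
  { tool: "scaffold_data_app", args: { name: "sales-insights" } },
  // ...workspace edits customize the queries and charts here...
  { tool: "validate_data_app", args: {} },
  { tool: "deploy_databricks_app", args: { name: "sales-insights" } },
];
```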

---

### Quick Examples for Specific Use Cases

#### Data App from Scratch

```
Create a Databricks app in ~/projects/user-analytics that shows daily active users
from main.analytics.events. Include a line chart and data table.
```

#### Real-Time Monitoring Dashboard

```
Build a monitoring dashboard for the main.logs.system_metrics table. Show CPU,
memory, and disk usage over time. Add alerts for values above thresholds.
```

#### Report Generator

```
Create an app that generates weekly reports from main.sales.transactions.
Include revenue trends, top customers, and product performance. Add export to CSV.
```

#### Data Quality Dashboard

```
Build a data quality dashboard for main.warehouse.inventory. Check for nulls,
duplicates, and out-of-range values. Show data freshness metrics.
```

---

### Working with Existing Applications

Once an app is scaffolded, you can continue development through conversation:

```
Add a filter to show only transactions from the last 30 days
```

```
Update the chart to use a bar chart instead of a line chart
```

```
Add a new API endpoint to fetch customer details
```

```
Run the tests and fix any failures
```

```
Add error handling for failed database queries
```

---

### Iterative Development Workflow

**Initial Request:**
```
Create a simple dashboard for main.sales.orders
```

**Refinement:**
```
Add a date range picker to filter orders
```

**Enhancement:**
```
Include a summary card showing total orders and revenue
```

**Quality Check:**
```
Validate the app and show me any test failures
```

**Production:**
```
Deploy the app to Databricks as "orders-dashboard"
```

---

## Why This Approach Works

### Traditional Development vs. Databricks MCP

| Traditional Approach | With Databricks MCP |
|----------------------|---------------------|
| Manual project setup (hours) | Instant scaffolding (seconds) |
| Configure build tools manually | Pre-configured and tested |
| Set up testing infrastructure | Built-in test suite |
| Manual code changes and debugging | AI-powered development with validation |
| Local testing only | Containerized validation (reproducible) |
| Manual deployment setup | Automated deployment to Databricks |
| **Time to production: days/weeks** | **Time to production: minutes** |

### Key Advantages

**1. Scaffolding + Validation = Quality**
- Start with a working, tested template
- Every change is validated before deployment
- No broken builds reach production

**2. Natural Language = Productivity**
- Describe what you want, not how to build it
- AI handles implementation details
- Focus on requirements, not configuration

**3. End-to-End Workflow = Simplicity**
- Single tool for the entire lifecycle
- No context switching between tools
- Seamless progression from idea to deployment

### What Makes It Production-Ready

The Databricks MCP server doesn't just generate code: it ensures quality:

- ✅ **TypeScript** - Type safety catches errors early
- ✅ **Build verification** - Ensures code compiles
- ✅ **Test suite** - Validates functionality
- ✅ **Linting** - Enforces code quality
- ✅ **Databricks integration** - Native SDK usage

---

## Reference

### CLI Commands

```bash
# Start MCP server (default mode)
databricks experimental apps-mcp --warehouse-id <warehouse-id>

# Enable workspace tools
databricks experimental apps-mcp --warehouse-id <warehouse-id> --with-workspace-tools

# Enable deployment
databricks experimental apps-mcp --warehouse-id <warehouse-id> --allow-deployment
```

### CLI Flags

| Flag | Description | Default |
|------|-------------|---------|
| `--warehouse-id` | Databricks SQL Warehouse ID (required) | - |
| `--with-workspace-tools` | Enable workspace file operations | `false` |
| `--allow-deployment` | Enable deployment operations | `false` |
| `--help` | Show help | - |

### Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `DATABRICKS_HOST` | Databricks workspace URL | `https://your-workspace.databricks.com` |
| `DATABRICKS_TOKEN` | Databricks personal access token | `dapi...` |
| `WAREHOUSE_ID` | Databricks SQL warehouse ID (preferred) | `abc123def456` |
| `DATABRICKS_WAREHOUSE_ID` | Alternative name for the warehouse ID | `abc123def456` |
| `ALLOW_DEPLOYMENT` | Enable deployment operations | `true` or `false` |
| `WITH_WORKSPACE_TOOLS` | Enable workspace tools | `true` or `false` |
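
A sketch of how a server like this might resolve those variables, with `WAREHOUSE_ID` taking precedence and the boolean flags parsed loosely. The precedence order is an assumption; only the variable names come from the table above:

```typescript
// Resolve the warehouse ID, preferring WAREHOUSE_ID (precedence assumed).
function resolveWarehouseId(env: Record<string, string | undefined>): string | undefined {
  return env.WAREHOUSE_ID ?? env.DATABRICKS_WAREHOUSE_ID;
}

// Parse boolean-ish flags such as ALLOW_DEPLOYMENT: only "true" enables.
function parseFlag(value: string | undefined): boolean {
  return value?.toLowerCase() === "true";
}
```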

### Authentication

The MCP server uses standard Databricks CLI authentication methods:

1. **Environment variables** (as shown in the config above)
2. **Databricks CLI profiles** - Use the `--profile` flag or the `DATABRICKS_PROFILE` env var
3. **Default profile** - Uses the `~/.databrickscfg` default profile if available

For more details, see the [Databricks authentication documentation](https://docs.databricks.com/en/dev-tools/cli/authentication.html).
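
For reference, `~/.databrickscfg` is an INI-style file with sections such as `[DEFAULT]` containing `host` and `token` keys. A minimal parser, for illustration only (the CLI's real parser handles many more cases):

```typescript
// Extract key/value pairs from one section of an INI-style config string.
function parseProfile(cfg: string, profile: string): Record<string, string> {
  const out: Record<string, string> = {};
  let current = "";
  for (const raw of cfg.split("\n")) {
    const line = raw.trim();
    const section = line.match(/^\[(.+)\]$/);
    if (section) { current = section[1]; continue; }
    const kv = line.match(/^(\w+)\s*=\s*(.+)$/);
    if (kv && current === profile) out[kv[1]] = kv[2];
  }
  return out;
}

const sample = "[DEFAULT]\nhost = https://example.databricks.com\ntoken = dapi123\n";
const creds = parseProfile(sample, "DEFAULT");
```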

### Requirements

- **Databricks CLI** (this package)
- **Databricks workspace** with a SQL warehouse
- **MCP-compatible client** (Claude Desktop, Continue, etc.)

---

## License

See the main repository license.

## Contributing

Contributions welcome! Please see the main repository for development guidelines.

## Support

- **Issues**: https://github.com/databricks/cli/issues
- **Documentation**: https://docs.databricks.com/dev-tools/cli/databricks-cli.html