Embucket exposes a Snowflake-compatible API over lakehouse data. The repo currently ships two runtime artifacts:
- embucketd for local and self-hosted runs
- embucket-lambda for AWS Lambda deployments
- Start locally if you want the fastest test or evaluation loop.
- Run from source if you want to build embucketd yourself for local evaluation.
- Deploy on AWS Lambda if you want the current serverless runtime.
- Connect dbt if you want the recommended client path.
- Run Snowplow web analytics if you want a fuller example on the Lambda + dbt path.
- Use S3 Tables if you want the currently documented external catalog.
- Troubleshoot if your client, auth, or runtime setup does not behave as expected.
Relevant guides live under docs/src/content/docs/:
- essentials/quick-start.mdx
- essentials/runtime-modes.mdx
- guides/aws-lambda.mdx
- guides/dbt.mdx
- guides/self-hosted.mdx
- guides/snowplow.mdx
- guides/s3-tables.mdx
- guides/troubleshooting.mdx
If you want to build the local binary instead of using Docker, start with docs/src/content/docs/guides/self-hosted.mdx.
If you want a fuller example on the recommended client path, start with docs/src/content/docs/guides/snowplow.mdx.
The current docs should make these distinctions explicit:
- Local mode is the fastest path for tests and evaluation.
- AWS Lambda + dbt-embucket is verified and is the recommended client path.
- AWS Lambda + Snowflake CLI over Function URL is tested, but not production-ready because the Function URL is publicly reachable.
- Production-facing Lambda deployments should avoid a public Function URL. The AWS Lambda guide includes an anonymized private API Gateway example.
- AWS S3 Tables is the currently documented external catalog path.
Run Embucket locally:
docker run --name embucket --rm -p 3000:3000 embucket/embucket
Expected startup log:
{"timestamp":"2025-07-01T15:35:05.687807Z","level":"INFO","fields":{"message":"Listening on http://0.0.0.0:3000"},"target":"embucketd"}
Configure Snowflake CLI for the local endpoint:
snow --info
# Add this connection block to your Snowflake CLI config file.
[connections.local]
host = "localhost"
region = "us-east-2"
port = 3000
protocol = "http"
database = "embucket"
schema = "public"
warehouse = "em.wh"
account = "acc.local"
user = "embucket"
password = "embucket"Validate the connection and run a query:
snow connection test -c local
snow sql -c local -q "SELECT 1 AS ok"
You can also open http://127.0.0.1:3000/ to inspect the current Swagger/OpenAPI surface served by embucketd.
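If the `snow` commands above fail, it can help to first confirm that something is accepting TCP connections on the configured host and port. This is a small sketch using only the Python standard library; `port_is_open` is a hypothetical helper, and the host and port mirror the `[connections.local]` block rather than anything Embucket-specific.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Values taken from the local connection block (host = "localhost", port = 3000).
if port_is_open("localhost", 3000):
    print("Port 3000 is reachable; try `snow connection test -c local` next.")
else:
    print("Nothing is listening on localhost:3000; check that the container is running.")
```

An open port only shows that a listener exists. `snow connection test -c local` remains the actual end-to-end check.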
If you want the current serverless path, start with docs/src/content/docs/guides/aws-lambda.mdx.
The current runtime is built from crates/embucket-lambda and can be deployed with:
make -C crates/embucket-lambda deploy
For test-only validation, you can expose a Function URL and connect Snowflake CLI to it. For production-facing traffic, keep the Lambda private and put an API gateway layer in front of it.
If you want the recommended client workflow, start with docs/src/content/docs/guides/dbt.mdx.
The official adapter lives in the sibling repository Embucket/dbt-embucket and uses:
- type: embucket
- function_arn to reach the deployed Lambda
- dbt debug and dbt run as the verified end-to-end checks
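Before running `dbt debug`, a quick sanity check of the two settings named above can catch typos early. The sketch below is illustrative only: `check_target` is a hypothetical helper, the ARN value is a placeholder, and the pattern encodes the general AWS Lambda function ARN shape, not rules taken from the dbt-embucket adapter. See guides/dbt.mdx for the full profile layout.

```python
import re
from typing import List

# General shape of a Lambda function ARN, with an optional version or alias qualifier.
LAMBDA_ARN = re.compile(
    r"^arn:aws[a-z-]*:lambda:[a-z0-9-]+:\d{12}:function:[A-Za-z0-9_-]+(:[A-Za-z0-9$_-]+)?$"
)


def check_target(target: dict) -> List[str]:
    """Return a list of problems found in a dbt target's Embucket settings."""
    problems = []
    if target.get("type") != "embucket":
        problems.append("type must be 'embucket'")
    arn = target.get("function_arn", "")
    if not isinstance(arn, str) or not LAMBDA_ARN.match(arn):
        problems.append("function_arn does not look like a Lambda function ARN")
    return problems


# Placeholder values; substitute the ARN of your deployed function.
example_target = {
    "type": "embucket",
    "function_arn": "arn:aws:lambda:us-east-2:123456789012:function:embucket-example",
}
print(check_target(example_target))  # prints []
```

An empty list means both settings look plausible. It does not prove that the function exists or that your credentials can invoke it, which is what `dbt debug` verifies.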
The current docs treat AWS S3 Tables as the supported external catalog path. Start with docs/src/content/docs/guides/s3-tables.mdx for the YAML shape, AWS prerequisites, and query flow.