
Conversation

@ceholden ceholden commented Feb 25, 2025

Description

This PR attempts to address some of the cold start performance issues described in #59.
TL;DR: our Lambda downloads ~300MB of DuckDB extensions as part of each cold start, adding overhead in time and data ingress cost, and putting extra load on DuckDB's upstream extension repos.

The approach I tried was to initialize our DuckDB client during the build process, which necessitated moving to a Docker container based Lambda setup instead of a ZIP (the extensions alone are >250MB and stacrs is ~60MB).
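
Concretely, the build-time step boils down to running the client once during the image build (this is what the RUN line in the Dockerfile diff does), so the extension download happens when the image is built rather than at cold start:

```python
# One-shot warm-up executed during `docker build`: constructing the client
# makes DuckDB download and install its extensions into the image layer.
import stacrs

stacrs.DuckdbClient()
```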

Note

I'm not sure we should merge this, or maybe we can merge to test and revert!

I noticed that our cold starts are on the same order or slightly faster. Spot-checking, I saw ~0.7s to 1.5s for Docker, while ZIP-based was always >1s: even though loading the ZIP is faster, we still have to wait on extension downloading & installing. I haven't been able to figure out why, but the warm start performance of the Docker-based Lambda was much worse (3-4x slower for /collections/naip and /collections/naip/items).

There are one or two nuggets that I think are worth including:

  • ensuring that when we uv pip install, we install based on the lockfile
  • bumping stacrs to 0.5.5, since the version we point to in the lockfile has been yanked
  • specifying the "platform" in our CDK code so the container is built for the platform our Lambda actually runs on (see the sketch below)
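
A rough sketch of the platform pinning in CDK (construct names and the asset path are illustrative, not necessarily what our stack uses):

```python
# Sketch only: pin the container build platform so the image matches the
# architecture the Lambda runs on, regardless of the machine doing the build.
from aws_cdk import Stack, aws_lambda as _lambda
from aws_cdk.aws_ecr_assets import Platform
from constructs import Construct


class ApiStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        _lambda.DockerImageFunction(
            self,
            "Api",
            code=_lambda.DockerImageCode.from_image_asset(
                "infrastructure/aws",  # directory containing the Dockerfile (illustrative)
                platform=Platform.LINUX_AMD64,
            ),
            # keep the function architecture consistent with the image platform
            architecture=_lambda.Architecture.X86_64,
        )
```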

I built/deployed this from my laptop, so it might be nice to deploy via CI to be 100% sure that my local deployment environment isn't causing the warm start performance degradation. I redeployed using stacrs==0.5.5 with a ZIP and saw the same performance as with stacrs==0.4.0, so it's something about the Docker-based Lambda.

@ceholden ceholden requested review from gadomski and hrodmn and removed request for hrodmn February 25, 2025 15:12
Contributor

hrodmn commented Feb 25, 2025

Thanks for testing this out @ceholden!

> The approach I tried was to initialize our DuckDB client during the build process, which necessitated moving to a Docker container based Lambda setup instead of a ZIP (the extensions alone are >250MB and stacrs is ~60MB).

> I noticed that our cold starts are on the same order or slightly faster. Spot-checking, I saw ~0.7s to 1.5s for Docker, while ZIP-based was always >1s: even though loading the ZIP is faster, we still have to wait on extension downloading & installing. I haven't been able to figure out why, but the warm start performance of the Docker-based Lambda was much worse (3-4x slower for /collections/naip and /collections/naip/items).

Roughly what is the cold start time for the zip-based deployment? 2s, 3s, 10s? Never mind, I just saw the 8s estimate in the original issue. If it is less than 5 seconds, is it a big deal? Maybe a problem for clients with a timeout that bumps up against the cold start time.

I haven't used this feature before but it's possible we could accomplish the same goal with layers. I don't know how performance would compare to the Docker image Lambda, though.

> There are one or two nuggets that I think are worth including:
>
>   • ensuring that when we uv pip install, we install based on the lockfile
>   • bumping stacrs to 0.5.5, since the version we point to in the lockfile has been yanked
>   • specifying the "platform" in our CDK code so the container is built for the platform our Lambda actually runs on

That all sounds good - could you implement those in a separate PR? They are great improvements regardless of the outcome of the zip vs ECR throwdown 🥊

> I built/deployed this from my laptop, so it might be nice to deploy via CI to be 100% sure that my local deployment environment isn't causing the warm start performance degradation. I redeployed using stacrs==0.5.5 with a ZIP and saw the same performance as with stacrs==0.4.0, so it's something about the Docker-based Lambda.

I just added a new deployment environment called test - we could easily enable deploying to a test stack by making environment an input parameter, so that we could pick the branch/environment during a workflow_dispatch. It would make sense to use the input parameter for workflow_dispatch events and default to development for release-triggered runs. That would be a great thing to add to the separate PR with the other improvements you made.

@ceholden
Author

> Roughly what is the cold start time for the zip-based deployment? 2s, 3s, 10s? If it is less than 5 seconds, is it a big deal? Maybe a problem for clients with a timeout that bumps up against the cold start time.

Cold start was ~8-10 seconds. That seems not so nice for anything with a UI in front of it, but we can mitigate it by keeping some provisioned concurrency so there's a "warm" pool of Lambdas. If we stashed the DuckDB dependencies in our own S3 bucket in the same region, we'd also achieve the other goals: 1) isolating our service from any outages at DuckDB, 2) being nice to DuckDB, and 3) maaaybe downloading a bit quicker since it'd be within-region.
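
For the S3 idea, DuckDB supports pointing at a custom extension repository, so the mirror could look roughly like this (the bucket URL is made up, and this assumes the mirror keeps the directory layout DuckDB expects and that we can pass the setting through to the client):

```python
# Hypothetical sketch: install extensions from our own same-region mirror
# instead of the upstream DuckDB extension repository.
import duckdb

con = duckdb.connect()
con.execute(
    "SET custom_extension_repository = "
    "'http://our-duckdb-extensions.s3.us-west-2.amazonaws.com'"  # made-up bucket
)
con.execute("INSTALL httpfs")  # fetched from the mirror, not duckdb.org
con.execute("LOAD httpfs")
```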

> I haven't used this feature before but it's possible we could accomplish the same goal with layers. I don't know how performance would compare to the Docker image Lambda, though.

Layers unfortunately don't get you around the 250 MB limit:

> 250 MB – The maximum size of the contents of a deployment package, including layers and custom runtimes (unzipped).

https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html#function-configuration-deployment-and-execution

I think of layers as more useful for organization, reducing rebuild times (e.g., only redeploy app code on top of a stable layer), or sharing among related functions (e.g., if 10 Lambdas needed rasterio then it'd be quicker to deploy and we'd maybe get some cache reuse if there were a dedicated layer).
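
For illustration, the shared-layer pattern in CDK would look roughly like this (names, runtime, and asset paths are made up; the 250 MB unzipped limit above still counts the layer contents):

```python
# Sketch of a shared dependencies layer reused by a thin application function.
from aws_cdk import Stack, aws_lambda as _lambda
from constructs import Construct


class SharedLayerExample(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        deps_layer = _lambda.LayerVersion(
            self,
            "DepsLayer",
            code=_lambda.Code.from_asset("layers/deps"),  # pre-built site-packages
            compatible_runtimes=[_lambda.Runtime.PYTHON_3_12],
        )
        _lambda.Function(
            self,
            "Handler",
            runtime=_lambda.Runtime.PYTHON_3_12,
            handler="handler.handler",
            code=_lambda.Code.from_asset("app"),  # only the app code redeploys on change
            layers=[deps_layer],
        )
```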

> That all sounds good - could you implement those in a separate PR? They are great improvements regardless of the outcome of the zip vs ECR throwdown 🥊

💯 will do! I think there's maybe a question of uv export && uv pip install --target /asset versus uv sync && (and a bunch of copies) from Pete's question, but the "export to requirements" approach was recommended by the uv authors and it's also how it's done with Poetry, so it seems simpler to me to do the export as an intermediate step.

COPY --from=builder --chown=app:app /app/.venv /app/.venv

# install DuckDB extensions needed by DuckDB client at build time
RUN python -c "import stacrs; stacrs.DuckdbClient()"
Contributor

Since this is duplicated in the deploy dockerfile, can we refactor to something shared so we don't repeat ourselves? Not sure if that makes sense with how dockerfiles work, but just asking.
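
For example, both Dockerfiles could perhaps RUN a small shared script so the warm-up line lives in one place (the path and name here are hypothetical):

```python
# Hypothetical shared helper, e.g. scripts/install_duckdb_extensions.py,
# invoked from both Dockerfiles with `RUN python scripts/install_duckdb_extensions.py`.
import stacrs


def main() -> None:
    # Constructing the client triggers the DuckDB extension download/install,
    # baking the extensions into the image layer at build time.
    stacrs.DuckdbClient()


if __name__ == "__main__":
    main()
```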

COPY infrastructure/aws/lambda/handler.py ${LAMBDA_TASK_ROOT}/handler.py

# install DuckDB extensions needed by DuckDB client at build time
RUN python -c "import stacrs; stacrs.DuckdbClient()"
Contributor

@ceholden I'm going to take over stac-utils/rustac#655 and implement that this morning.

gadomski added a commit that referenced this pull request Mar 13, 2025
@ceholden I took your good work in #60 and reshaped it a bit to use the targeted extension fetching from stac-utils/rustac-py#81. Lambda sizes:

- Before (current **main**): 29.2 MB
- With the **uv** and other tweaks, but no DuckDB extensions: 26.2 MB
- With DuckDB extensions: 93.6 MB

Things are broken because I haven't wired up the server to actually use those extensions, but I _think_ this proves out using pre-fetched DuckDB extensions. The Lambda is here:
https://us-west-2.console.aws.amazon.com/lambda/home?region=us-west-2#/functions/stac-fastapi-geoparquet-labs-375-de-lambda8B5974B5-W1CWbXEHRA1Y?tab=code

> [!WARNING]
> When I tried to use the wheels directly, the size got too big. For now, I'm sticking with a build-in-the-container approach (which is slow, ~10 minutes on my machine), but I need to dig into why the build is smaller than the wheel.
@gadomski
Contributor

Closing as OBE on #75

@gadomski gadomski closed this Mar 13, 2025
@gadomski gadomski deleted the fix/fix-duckdb-dl branch April 22, 2025 23:43