fix: dfbench respects DATAFUSION_RUNTIME_MEMORY_LIMIT env var #20631

Open
adriangb wants to merge 1 commit into apache:main from pydantic:fix-dfbench-memory-limit-env-var

adriangb commented Mar 1, 2026

Which issue does this PR close?

N/A — discovered while running benchmarks with bench.sh.

Rationale for this change

When running benchmarks via bench.sh / dfbench, setting DATAFUSION_RUNTIME_MEMORY_LIMIT=2G has no effect on memory pool enforcement. Most DATAFUSION_* env vars work because SessionConfig::from_env() picks them up, but the memory limit is a special case: it requires constructing a MemoryPool in the RuntimeEnv, which dfbench only did when --memory-limit was passed as a CLI flag.

What changes are included in this PR?

In runtime_env_builder(), when self.memory_limit (CLI flag) is None, fall back to reading the DATAFUSION_RUNTIME_MEMORY_LIMIT env var using the existing parse_memory_limit() function. The CLI flag still takes precedence when provided.
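
The precedence described above can be sketched without any DataFusion dependency. This is an illustration only, not the code in the PR: `resolve_memory_limit` is a hypothetical helper, and the `parse_memory_limit` here is a simplified stand-in that assumes integer values with a `K`, `M`, or `G` suffix (the real function's signature and accepted inputs may differ). The env var value is passed in as a parameter so the logic can be exercised without mutating the process environment.

```rust
/// Simplified stand-in for dfbench's `parse_memory_limit()`.
/// Assumption: accepts an integer followed by K, M, or G (powers of 1024).
fn parse_memory_limit(raw: &str) -> Result<u64, String> {
    let raw = raw.trim();
    let (digits, multiplier) = if let Some(d) = raw.strip_suffix('K') {
        (d, 1u64 << 10)
    } else if let Some(d) = raw.strip_suffix('M') {
        (d, 1u64 << 20)
    } else if let Some(d) = raw.strip_suffix('G') {
        (d, 1u64 << 30)
    } else {
        return Err(format!("unsupported unit in '{raw}', expected K, M, or G"));
    };
    let count: u64 = digits
        .parse()
        .map_err(|_| format!("invalid number in '{raw}'"))?;
    count
        .checked_mul(multiplier)
        .ok_or_else(|| format!("memory limit '{raw}' overflows u64"))
}

/// Hypothetical helper showing the precedence: the CLI flag wins when
/// present; otherwise the env var value (if any) is parsed; otherwise
/// no limit is configured.
fn resolve_memory_limit(
    cli_flag: Option<u64>,
    env_value: Option<&str>,
) -> Result<Option<u64>, String> {
    match (cli_flag, env_value) {
        // --memory-limit was given: the env var is not consulted at all.
        (Some(limit), _) => Ok(Some(limit)),
        // No flag: fall back to the env var and surface parse errors.
        (None, Some(raw)) => parse_memory_limit(raw).map(Some),
        // Neither source set: leave the memory pool unbounded.
        (None, None) => Ok(None),
    }
}
```

In a real caller, `env_value` would presumably come from `std::env::var("DATAFUSION_RUNTIME_MEMORY_LIMIT").ok()`, and a `Some` result would then be used to configure the memory pool on the runtime environment builder.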

Are these changes tested?

Yes — added test_runtime_env_builder_reads_env_var which sets the env var, constructs a CommonOpt with no CLI memory limit, and verifies the resulting RuntimeEnv has a Finite(2GB) memory pool.

Are there any user-facing changes?

dfbench now honors the DATAFUSION_RUNTIME_MEMORY_LIMIT environment variable as a fallback when --memory-limit is not passed on the command line. No breaking changes — existing CLI flag behavior is unchanged.

🤖 Generated with Claude Code

When no `--memory-limit` CLI flag is provided, `runtime_env_builder()`
now falls back to the `DATAFUSION_RUNTIME_MEMORY_LIMIT` environment
variable. This makes the memory pool configuration consistent with how
other `DATAFUSION_*` env vars are handled via `SessionConfig::from_env()`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@adriangb adriangb force-pushed the fix-dfbench-memory-limit-env-var branch from 0e734ab to 1c8ec9b Compare March 1, 2026 15:20
adriangb commented Mar 1, 2026

@alamb a small fix
