fix: add buffered DMMF API to bypass V8 string length limit #5757

Open
chris-tophers wants to merge 1 commit into prisma:main from chris-tophers:fix/dmmf-v8-string-limit

@chris-tophers

Summary

Adds three new wasm_bindgen exports to prisma-schema-wasm that allow DMMF JSON to be returned as chunked Uint8Array data instead of a single JS string, bypassing V8's hard string length limit of 0x1fffffe8 characters (~536MB).

  • get_dmmf_buffered(params) -> usize — serializes DMMF to an internal buffer using serde_json::to_vec(), returns total byte count
  • read_dmmf_chunk(offset, length) -> Vec<u8> — returns a chunk as Uint8Array (no V8 string limit)
  • free_dmmf_buffer() — releases the internal buffer

The existing get_dmmf() is unchanged for backward compatibility.
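
As a rough illustration of how the three exports above fit together, here is a minimal std-only sketch. It leaves out the `#[wasm_bindgen]` attributes and the serialization step: `store_dmmf_bytes` is a hypothetical stand-in for `get_dmmf_buffered(params)` that accepts bytes already serialized elsewhere, and `read_all_chunks` is a hypothetical caller playing the role of the JS side. The clamping behavior for out-of-range offsets is an assumption, not something the PR specifies.

```rust
use std::sync::Mutex;

// Process-wide buffer holding the serialized DMMF bytes between calls.
static DMMF_BUFFER: Mutex<Vec<u8>> = Mutex::new(Vec::new());

/// Hypothetical stand-in for `get_dmmf_buffered`: stores bytes that were
/// serialized elsewhere and returns the total byte count.
fn store_dmmf_bytes(bytes: Vec<u8>) -> usize {
    let mut buf = DMMF_BUFFER.lock().unwrap();
    *buf = bytes;
    buf.len()
}

/// Returns up to `length` bytes starting at `offset`.
/// Out-of-range requests are clamped and yield an empty chunk (assumption).
fn read_dmmf_chunk(offset: usize, length: usize) -> Vec<u8> {
    let buf = DMMF_BUFFER.lock().unwrap();
    let start = offset.min(buf.len());
    let end = start.saturating_add(length).min(buf.len());
    buf[start..end].to_vec()
}

/// Releases the buffer's memory by replacing it with an empty Vec.
fn free_dmmf_buffer() {
    let mut buf = DMMF_BUFFER.lock().unwrap();
    *buf = Vec::new();
}

/// Hypothetical caller: reassembles the payload chunk by chunk.
fn read_all_chunks(total: usize, chunk_size: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(total);
    let mut offset = 0;
    while offset < total {
        let chunk = read_dmmf_chunk(offset, chunk_size);
        if chunk.is_empty() {
            break; // zero chunk size or buffer already freed
        }
        offset += chunk.len();
        out.extend_from_slice(&chunk);
    }
    out
}
```

A caller would store once, read in a loop until `total` bytes have been consumed, then call `free_dmmf_buffer()`. Each chunk is copied out of the buffer, so only one chunk-sized copy exists outside the buffer at a time.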

Root Cause

dmmf_json_from_validated_schema() in query-compiler/dmmf/src/lib.rs uses serde_json::to_string(), producing a single Rust String. When wasm-bindgen converts this to a JS string across the WASM FFI boundary, V8 rejects it if the string exceeds 0x1fffffe8 characters. No Node.js flags can change this limit — it's a V8 engine constant.
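
To make the threshold concrete: 0x1fffffe8 is 2^29 - 24, or 536,870,888. The sketch below encodes that constant and a hypothetical pre-check (`exceeds_v8_string_limit` is not part of the PR). It treats the payload length in bytes as the character count, which only holds if the JSON is effectively ASCII.

```rust
/// The limit quoted in V8's error message: 2^29 - 24 = 536_870_888.
const V8_MAX_STRING_LENGTH: usize = 0x1fff_ffe8;

/// Hypothetical helper: true when a payload of `len` characters is too
/// long to be materialized as a single JS string.
fn exceeds_v8_string_limit(len: usize) -> bool {
    len > V8_MAX_STRING_LENGTH
}
```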

Changes

| File | Change |
| --- | --- |
| query-compiler/dmmf/src/lib.rs | Add dmmf_json_bytes_from_validated_schema() using serde_json::to_vec() |
| prisma-fmt/src/get_dmmf.rs | Add get_dmmf_bytes() returning Vec<u8> |
| prisma-fmt/src/lib.rs | Export get_dmmf_bytes() |
| prisma-schema-wasm/src/lib.rs | Add 3 new wasm_bindgen exports with Mutex<Vec<u8>> buffer |

+76 lines, 4 files changed. No existing behavior modified.

Test Results

Tested with a production schema (1,600+ models, 1,100+ enums, 111K lines):

| Test | Result | Details |
| --- | --- | --- |
| Original get_dmmf() | FAILS | Cannot create a string longer than 0x1fffffe8 characters |
| get_dmmf_buffered() + chunked read | PASSES | 571MB DMMF, 35 chunks x 16MB each |
| Streaming JSON parse of chunks | PASSES | Full parse completed in ~21s |
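
The chunk count reported above can be reproduced with a small calculation. The sketch below assumes a payload of exactly 571,000,000 bytes and a chunk size of 16 MiB (16,777,216 bytes). The PR states neither figure that precisely, so both are illustrative. `chunk_ranges` is a hypothetical helper that yields the (offset, length) pairs a caller would pass to `read_dmmf_chunk`.

```rust
/// Splits `total` bytes into (offset, length) pairs of at most
/// `chunk_size` bytes each; the final pair may be shorter.
fn chunk_ranges(total: usize, chunk_size: usize) -> Vec<(usize, usize)> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    (0..total)
        .step_by(chunk_size)
        .map(|offset| (offset, chunk_size.min(total - offset)))
        .collect()
}
```

Under those assumptions the call `chunk_ranges(571_000_000, 16 * 1024 * 1024)` produces 35 ranges: 34 full chunks followed by a final chunk of 574,656 bytes.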

Limitations

This fix addresses the V8 string length limit but has an upper bound: WASM32 linear memory is capped at ~4GB. For schemas producing DMMF larger than ~4GB, the buffered approach would also fail. An alternative for extreme cases would be using the native schema engine binary path (which streams over stdio with no memory limit), but the buffered API should cover schemas well into the tens of thousands of models.

Companion Change Required

The TypeScript side (prisma/prisma, getDmmfWasm in packages/internals) needs to detect the V8 string limit error and fall back to the buffered API with a streaming JSON parser.
Both this PR and the companion TypeScript PR are required for a complete fix. We will submit the companion PR to prisma/prisma as well.

Fixes: prisma/prisma#29111

Add get_dmmf_buffered(), read_dmmf_chunk(), and free_dmmf_buffer() to
prisma-schema-wasm. These allow the DMMF JSON to be returned as chunked
Uint8Array data instead of a single JS string, bypassing V8's hard limit
of ~536MB (0x1fffffe8 characters).

For schemas generating DMMF larger than ~536MB, the existing get_dmmf()
throws 'Cannot create a string longer than 0x1fffffe8 characters' because
wasm-bindgen converts the Rust String to a JS string. The new buffered API
keeps the JSON as bytes in WASM linear memory and lets JS read chunks as
Uint8Array (which has no such limit).

Changes:
- query-compiler/dmmf/src/lib.rs: add dmmf_json_bytes_from_validated_schema()
  using serde_json::to_vec() instead of to_string()
- prisma-fmt/src/get_dmmf.rs: add get_dmmf_bytes() returning Vec<u8>
- prisma-fmt/src/lib.rs: export get_dmmf_bytes()
- prisma-schema-wasm/src/lib.rs: add 3 wasm_bindgen exports:
  - get_dmmf_buffered(params) -> byte count
  - read_dmmf_chunk(offset, length) -> Uint8Array
  - free_dmmf_buffer()

Tested with a 1,600+ model schema producing 571MB DMMF:
- Original get_dmmf: FAILS (V8 string limit)
- Buffered API: PASSES (35 chunks x 16MB, streamed successfully)

Fixes: prisma/prisma#29111
@CLAassistant

CLAassistant commented Feb 6, 2026

All committers have signed the CLA.

chris-tophers added a commit to chris-tophers/prisma that referenced this pull request Feb 6, 2026
When prisma generate processes schemas that produce DMMF larger than
~536MB, the existing get_dmmf() WASM call fails with V8's hard-coded
string length limit (0x1fffffe8 characters). This adds automatic
fallback to the buffered DMMF API (get_dmmf_buffered + read_dmmf_chunk)
which returns data as chunked Uint8Array, bypassing the V8 string limit.

The fallback is transparent — it only activates when the V8 string limit
error is detected, so there is no behavior change for schemas that work
with the existing API.

Companion to: prisma/prisma-engines#5757
Fixes: prisma#29111
chris-tophers added a commit to chris-tophers/prisma-engines that referenced this pull request Feb 6, 2026
Add a `get-dmmf` subcommand to the prisma-fmt CLI that streams DMMF JSON
directly to stdout via serde_json::to_writer(). This approach has no
memory ceiling — unlike WASM (limited to ~4GB linear memory), the native
binary can stream arbitrarily large DMMF with only 1x peak memory
(the in-memory DMMF struct, no serialized buffer).

Changes:
- dmmf crate: add dmmf_json_to_writer() using serde_json::to_writer()
- prisma-fmt: add get_dmmf_to_writer() that validates + streams
- prisma-fmt: export get_dmmf_to_writer() from lib.rs
- prisma-fmt: add GetDmmf CLI variant, reads stdin params, streams to stdout

Alternative approach to the buffered WASM API in prisma#5757.
Fixes: prisma/prisma#29111

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chris-tophers
Author

Alternative approach submitted: #5761

I've also submitted a companion binary streaming approach as an alternative to this WASM buffered API. The key difference:

| Aspect | WASM Buffered (this PR) | Binary Streaming (#5761) |
| --- | --- | --- |
| Memory ceiling | ~1.5-2GB (WASM32 limit) | Unlimited |
| Peak memory | 2x DMMF (struct + buffer) | 1x (streams, no buffer) |
| New binary | No | Yes (prisma-fmt) |
| Rust complexity | Medium (Mutex, chunks) | Low (to_writer + CLI cmd) |

The binary approach uses serde_json::to_writer() to stream DMMF JSON directly to stdout from a native prisma-fmt get-dmmf subcommand — no intermediate String or Vec<u8> allocation. The TypeScript side spawns the binary and stream-parses stdout via @streamparser/json.
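
The streaming idea can be illustrated without serde: the sketch below writes a JSON array to any `std::io::Write` one element at a time, so no complete serialized copy is ever held in memory. It is a hand-rolled stand-in for what `serde_json::to_writer()` does, not the code from #5761. `write_json_array` is a hypothetical name, and escaping is deliberately limited to quotes and backslashes, so it is not a complete JSON encoder.

```rust
use std::io::{self, Write};

/// Writes `names` as a JSON array of strings directly to `out`,
/// element by element, without building the full output in memory.
/// Only `"` and `\` are escaped; control characters are not handled.
fn write_json_array<W: Write>(mut out: W, names: &[&str]) -> io::Result<()> {
    out.write_all(b"[")?;
    for (i, name) in names.iter().enumerate() {
        if i > 0 {
            out.write_all(b",")?;
        }
        out.write_all(b"\"")?;
        for ch in name.chars() {
            match ch {
                '"' => out.write_all(b"\\\"")?,
                '\\' => out.write_all(b"\\\\")?,
                c => write!(out, "{c}")?,
            }
        }
        out.write_all(b"\"")?;
    }
    out.write_all(b"]")?;
    out.flush()
}
```

In a native binary the writer would typically be a buffered, locked stdout, for example `io::BufWriter::new(io::stdout().lock())`. Passing `&mut Vec<u8>` instead is convenient for testing.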

Both approaches solve the immediate V8 string limit issue. The Prisma team can choose whichever fits best, or combine both (WASM buffered as primary fallback, binary streaming as secondary for schemas that exceed WASM32's ~4GB limit).

Companion TypeScript PRs:

/// This allows JS to read the DMMF in chunks via Uint8Array, bypassing
/// V8's string length limit of ~536MB.
/// See: https://github.com/prisma/prisma/issues/29111
static DMMF_BUFFER: Mutex<Vec<u8>> = Mutex::new(Vec::new());
Contributor

@jacek-prisma jacek-prisma Feb 9, 2026

I think that, instead of using a static, the API should allow allocating and freeing your own buffer (and passing it as an argument), so that there's no implicit global state.
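
One possible reading of this suggestion, sketched with std only: the buffer becomes a value the caller owns rather than a `static`. `DmmfBuffer` and its methods are hypothetical names, not taken from the PR or from any later revision. wasm-bindgen can export a Rust struct as a JS class with a generated `free()` method, which is one way such a handle could reach the JS side; that wiring is omitted here.

```rust
/// Caller-owned buffer: no global state. The memory is released when
/// the value is dropped (or, from JS, when the handle is freed).
struct DmmfBuffer {
    bytes: Vec<u8>,
}

impl DmmfBuffer {
    /// Wraps bytes that were serialized elsewhere (serialization omitted).
    fn new(bytes: Vec<u8>) -> Self {
        DmmfBuffer { bytes }
    }

    /// Total number of bytes available to read.
    fn len(&self) -> usize {
        self.bytes.len()
    }

    /// Borrows up to `length` bytes starting at `offset`, clamped to
    /// the end of the buffer.
    fn read_chunk(&self, offset: usize, length: usize) -> &[u8] {
        let start = offset.min(self.bytes.len());
        let end = start.saturating_add(length).min(self.bytes.len());
        &self.bytes[start..end]
    }
}
```

With this shape, two buffers can be alive at once without interfering with each other, and the `Mutex` is no longer needed.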

@jacek-prisma
Contributor

jacek-prisma commented Feb 9, 2026

Thanks for submitting the PR! I think we can merge this change (along with the prisma part), conditional on my comment getting resolved

@codspeed-hq

codspeed-hq bot commented Feb 9, 2026

Merging this PR will not alter performance

✅ 11 untouched benchmarks
⏩ 11 skipped benchmarks [1]

Comparing chris-tophers:fix/dmmf-v8-string-limit (8eed767) with main (aa5ee09)

Footnotes

  1. 11 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Development

Successfully merging this pull request may close these issues.

Prisma 7.x WASM DMMF generation fails with "Cannot create a string longer than 0x1fffffe8 characters"
