fix(base64): chunk Uint8Array encoding to prevent stack overflow for large inputs #59
🧹 Nitpick comments (1)

test/index.test.ts (1)

125-133: Consider adding a round-trip assertion for stronger regression coverage.

The current check proves "no throw" and the expected size, but not exact content correctness. Adding a decode-and-compare assertion would make this test much harder to false-pass.
✅ Suggested test enhancement

```diff
-it("should encode large Uint8Array (>65535 bytes) without stack overflow", () => {
+it("should encode large Uint8Array (>65535 bytes) without stack overflow", async () => {
   // String.fromCodePoint(...data) crashes when data.length exceeds the JS
   // engine's max argument count. Use a 200 kB array to reliably reproduce.
   const large = new Uint8Array(200_000).fill(42);
   expect(() => uint8ArrayToBase64(large, { dataURL: false })).not.toThrow();
   const encoded = uint8ArrayToBase64(large, { dataURL: false });
   // btoa of 200 000 bytes of 0x2A ('*') → predictable base64 output length
   expect(encoded.length).toBe(Math.ceil(200_000 / 3) * 4);
+  const decoded = await convertTo("Uint8Array", encoded, "Base64");
+  expect(decoded).toEqual(large);
 });
```
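The round-trip idea can also be checked standalone. A minimal sketch using Node's `Buffer` in place of the project's `convertTo` helper (`Buffer` here is only an illustrative stand-in, not the project API):

```typescript
// Minimal round-trip sketch (not project code): encode 200 kB of 0x2A,
// decode it back with Node's Buffer, and compare byte-for-byte.
const large = new Uint8Array(200_000).fill(42);
const encoded = Buffer.from(large).toString("base64");
const decoded = new Uint8Array(Buffer.from(encoded, "base64"));
const roundTripOk =
  decoded.length === large.length && decoded.every((b, i) => b === large[i]);
console.log(roundTripOk); // true
```

A length-only assertion would pass even if bytes were reordered or corrupted in a length-preserving way; the byte-for-byte comparison is what rules that out.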
```ts
const CHUNK_SIZE = 0xff_ff;
let str = "";
for (let i = 0; i < data.length; i += CHUNK_SIZE) {
  str += String.fromCodePoint(...data.subarray(i, i + CHUNK_SIZE));
}
```
🛠️ Refactor suggestion | 🟠 Major
Use a more conservative chunk size and optimize string concatenation.

At line 19, `0xff_ff` (65,535) operates at the edge of JavaScript engines' argument limits: JavaScriptCore has an explicit 65,536-argument cap, leaving a margin of only one argument before failure. Additionally, the `str +=` pattern in the loop causes repeated string allocations, creating unnecessary overhead for large payloads. Use `0x8000` (32,768) as the chunk size, which is the broadly recommended safe threshold across V8, SpiderMonkey, and JavaScriptCore, and collect results in an array to join at the end.
Proposed refactor

```diff
-  const CHUNK_SIZE = 0xff_ff;
-  let str = "";
+  const CHUNK_SIZE = 0x8000;
+  const parts: string[] = [];
   for (let i = 0; i < data.length; i += CHUNK_SIZE) {
-    str += String.fromCodePoint(...data.subarray(i, i + CHUNK_SIZE));
+    parts.push(String.fromCodePoint(...data.subarray(i, i + CHUNK_SIZE)));
   }
-  let encoded = btoa(str);
+  let encoded = btoa(parts.join(""));
```
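Applied standalone, the suggested pattern can be sketched as follows (an illustrative function name, not the project's; assumes a runtime with a global `btoa`, e.g. browsers, workers, or Node >= 16):

```typescript
// Sketch of the array-accumulate-then-join chunking pattern.
function base64EncodeChunked(data: Uint8Array): string {
  const CHUNK_SIZE = 0x8000; // 32,768: safely under all engines' argument limits
  const parts: string[] = [];
  for (let i = 0; i < data.length; i += CHUNK_SIZE) {
    // subarray returns a view, so no bytes are copied per chunk
    parts.push(String.fromCodePoint(...data.subarray(i, i + CHUNK_SIZE)));
  }
  // Single join avoids the repeated allocations of str += in a loop.
  return btoa(parts.join(""));
}

// 300,000 bytes of 0x2A ('*') encodes without throwing;
// base64 output length is 4 * ceil(n / 3).
const encoded = base64EncodeChunked(new Uint8Array(300_000).fill(0x2a));
console.log(encoded.length === Math.ceil(300_000 / 3) * 4); // true
```

Because each chunk's bytes fit in one `btoa`-compatible binary string, the behavior is identical to the unchunked version for all inputs; only the call-argument count per `fromCodePoint` invocation changes.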
Closes #41
Problem

`_base64Encode` converts a `Uint8Array` to a base64 string by spreading every byte into a single `String.fromCodePoint(...data)` call. The spread operator passes every byte as a separate function argument, and JavaScript engines have a hard limit on the number of arguments a function call can receive (~65,536 in V8/SpiderMonkey). For large inputs such as PDF pages or images, this reliably throws a `RangeError`.
The bug only manifests in resource-constrained environments (small cloud containers, edge workers) where the engine's argument limit is hit sooner, which is why it works locally but crashes on DigitalOcean's smallest app tier.
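The failure mode is easy to reproduce standalone; a minimal sketch (illustrative only, not the project's code path):

```typescript
// Spreading a large Uint8Array into String.fromCodePoint passes every byte
// as a separate call argument, overflowing the engine's argument/stack limit.
const data = new Uint8Array(200_000).fill(42);

let error: unknown;
try {
  String.fromCodePoint(...data); // crashes for large inputs
} catch (e) {
  error = e;
}

console.log(error instanceof RangeError); // true on V8 (Node, Chrome)
```

The exact threshold depends on the engine and available stack, which is why smaller containers fail on inputs that pass locally.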
Fix

Process the array in 65,535-byte chunks, building the binary string incrementally before passing it to `btoa`. This is the standard pattern recommended by MDN for this exact scenario. It is fully cross-platform (no `Buffer`, no Node.js-only APIs), so it works in browsers, Deno, Cloudflare Workers, and Bun too.

Test

Added a regression test that encodes a 200 kB `Uint8Array` (well above the 65,535-argument threshold) and asserts that encoding does not throw and that the output has the expected base64 length.