Skip to content

Commit e7b1076

Browse files
committed
refactor: convert object resolution from async to sync
Remove unnecessary async/await from object resolution and related APIs. The resolver implementation was already synchronous (in-memory xref lookups), so the async interface only added overhead without benefit. This simplifies code patterns and enables cleaner type narrowing after resolving references. Async is preserved at load/save boundaries to allow future API flexibility. Key changes: - ObjectRegistry.resolve() and PDFContext.resolve() are now sync - Filter pipeline remains sync (was already, just removed misleading awaits) - Form field operations (setValue, check, etc.) are now sync - Page operations (getPages, getPage, drawField) are now sync - Font/image embedding is now sync - Added oxlint rules to catch async issues: await-thenable, no-floating-promises, require-await
1 parent 4f97193 commit e7b1076

File tree

192 files changed

+3023
-2500
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

192 files changed

+3023
-2500
lines changed

.agents/plans/040-sync-resolver.md

Lines changed: 263 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,263 @@
1+
# Synchronous Object Resolution Spec
2+
3+
## Summary
4+
5+
Change the `ObjectRegistry.resolve()` method and related infrastructure from async to sync. This enables code patterns like:
6+
7+
```typescript
8+
if (value instanceof PdfRef) {
9+
value = ctx.resolve(value);
10+
}
11+
12+
if (value instanceof PdfDict) {
13+
doSomething(value);
14+
}
15+
```
16+
17+
Without making the containing method async.
18+
19+
## Current State
20+
21+
### Filters: Already Sync
22+
23+
The filter system is **already synchronous**:
24+
25+
```typescript
26+
// src/filters/filter.ts
27+
interface Filter {
28+
decode(data: Uint8Array, params?: PdfDict): Uint8Array; // sync
29+
encode(data: Uint8Array, params?: PdfDict): Uint8Array; // sync
30+
}
31+
32+
// src/filters/filter-pipeline.ts
33+
static decode(data: Uint8Array, filters: FilterSpec[]): Uint8Array; // sync
34+
static encode(data: Uint8Array, filters: FilterSpec[]): Uint8Array; // sync
35+
36+
// src/objects/pdf-stream.ts
37+
getDecodedData(): Uint8Array; // sync
38+
getEncodedData(): Uint8Array; // sync
39+
```
40+
41+
However, call sites often use `await` unnecessarily (which works but is misleading).
42+
43+
### Resolver: Async Interface, Sync Implementation
44+
45+
The `ObjectResolver` type is async:
46+
47+
```typescript
48+
// src/document/object-registry.ts
49+
type ObjectResolver = (ref: PdfRef) => Promise<PdfObject | null>;
50+
51+
async resolve(ref: PdfRef): Promise<PdfObject | null> {
52+
const existing = this.getObject(ref);
53+
if (existing !== null) return existing;
54+
55+
if (this.resolver) {
56+
const obj = await this.resolver(ref); // async call
57+
if (obj !== null) this.addLoaded(ref, obj);
58+
return obj;
59+
}
60+
return null;
61+
}
62+
```
63+
64+
But the actual resolver implementation in `DocumentParser.buildDocument()` is **synchronous**:
65+
66+
```typescript
67+
// src/parser/document-parser.ts (line 573)
68+
const getObject = (ref: PdfRef): PdfObject | null => {
69+
// ... entirely synchronous: xref lookup, parsing, decryption
70+
return obj;
71+
};
72+
```
73+
74+
The async wrapper exists for historical reasons (possibly anticipating streaming/lazy file I/O), but all current operations are synchronous in-memory operations.
75+
76+
### Call Site Count
77+
78+
- ~60 `await ctx.resolve(...)` calls across 15 source files
79+
- ~96 `instanceof PdfRef` checks that often precede resolution
80+
- Many methods are async solely because they call `resolve()`
81+
82+
## Proposed Changes
83+
84+
### 1. Change `ObjectResolver` Type to Sync
85+
86+
```typescript
87+
// src/document/object-registry.ts
88+
export type ObjectResolver = (ref: PdfRef) => PdfObject | null; // was Promise<...>
89+
```
90+
91+
### 2. Change `ObjectRegistry.resolve()` to Sync
92+
93+
```typescript
94+
// src/document/object-registry.ts
95+
resolve(ref: PdfRef): PdfObject | null { // was async, returned Promise
96+
const existing = this.getObject(ref);
97+
if (existing !== null) return existing;
98+
99+
if (this.resolver) {
100+
const obj = this.resolver(ref); // sync call
101+
if (obj !== null) this.addLoaded(ref, obj);
102+
return obj;
103+
}
104+
return null;
105+
}
106+
```
107+
108+
### 3. Update `PDFContext.resolve()` to Sync
109+
110+
```typescript
111+
// src/api/pdf-context.ts
112+
resolve(ref: PdfRef): PdfObject | null { // was async
113+
return this.registry.resolve(ref);
114+
}
115+
```
116+
117+
### 4. Remove Unnecessary `await` from Call Sites
118+
119+
Transform ~60 call sites from:
120+
121+
```typescript
122+
const resolved = await ctx.resolve(ref);
123+
```
124+
125+
To:
126+
127+
```typescript
128+
const resolved = ctx.resolve(ref);
129+
```
130+
131+
### 5. Remove `async` from Methods That Only Awaited `resolve()`
132+
133+
Many methods are async only because they called `resolve()`. These can become sync:
134+
135+
```typescript
136+
// Before
137+
async getResources(): Promise<PdfDict | null> {
138+
const resources = await this.ctx.resolve(this.dict.getRef("Resources"));
139+
return resources instanceof PdfDict ? resources : null;
140+
}
141+
142+
// After
143+
getResources(): PdfDict | null {
144+
const resources = this.ctx.resolve(this.dict.getRef("Resources"));
145+
return resources instanceof PdfDict ? resources : null;
146+
}
147+
```
148+
149+
### 6. Update `NameTree` Resolver Type
150+
151+
```typescript
152+
// src/document/name-tree.ts
153+
export type Resolver = (ref: PdfRef) => PdfObject | null; // was Promise<...>
154+
```
155+
156+
### 7. Remove Unnecessary `await` from Filter Calls
157+
158+
While filters are already sync, many call sites use `await`:
159+
160+
```typescript
161+
// Before (works but misleading)
162+
const decoded = await stream.getDecodedData();
163+
164+
// After (explicit about sync nature)
165+
const decoded = stream.getDecodedData();
166+
```
167+
168+
## Files to Modify
169+
170+
### Core Infrastructure
171+
172+
| File | Changes |
173+
| --------------------------------- | ----------------------------------------------------------- |
174+
| `src/document/object-registry.ts` | Change `ObjectResolver` type and `resolve()` method to sync |
175+
| `src/api/pdf-context.ts` | Change `resolve()` to sync |
176+
| `src/document/name-tree.ts` | Change `Resolver` type and methods to sync |
177+
178+
### API Layer (~15 files)
179+
180+
| File | Estimated Changes |
181+
| ---------------------------- | --------------------------------------------------- |
182+
| `src/api/pdf.ts` | ~10 await removals, several methods can become sync |
183+
| `src/api/pdf-page.ts` | ~15 await removals, most methods can become sync |
184+
| `src/api/pdf-catalog.ts` | ~5 await removals |
185+
| `src/api/pdf-attachments.ts` | ~5 await removals |
186+
| `src/api/pdf-context.ts` | ~2 changes |
187+
188+
### Document Layer
189+
190+
| File | Estimated Changes |
191+
| ----------------------------------------- | ----------------- |
192+
| `src/document/forms/acro-form.ts` | ~8 await removals |
193+
| `src/document/forms/field-tree.ts` | ~5 await removals |
194+
| `src/document/forms/fields/base.ts` | ~3 await removals |
195+
| `src/document/forms/form-flattener.ts` | ~5 await removals |
196+
| `src/document/forms/widget-annotation.ts` | ~3 await removals |
197+
| `src/document/object-copier.ts` | ~5 await removals |
198+
199+
### Other
200+
201+
| File | Estimated Changes |
202+
| ----------------------------------- | ----------------- |
203+
| `src/layers/layers.ts` | ~6 await removals |
204+
| `src/annotations/*.ts` | ~5 await removals |
205+
| `src/signatures/ltv/dss-builder.ts` | ~3 await removals |
206+
207+
## Migration Strategy
208+
209+
### Phase 1: Core Changes
210+
211+
1. Update `ObjectResolver` type to sync
212+
2. Update `ObjectRegistry.resolve()` to sync
213+
3. Update `PDFContext.resolve()` to sync
214+
4. Update `NameTree` resolver
215+
216+
### Phase 2: Call Site Updates (can be parallelized)
217+
218+
For each file:
219+
220+
1. Remove `await` from `resolve()` calls
221+
2. Remove `await` from filter calls (optional but good cleanup)
222+
3. Check if method can become sync (no other awaits)
223+
4. If yes, change signature and update callers
224+
225+
### Phase 3: Test Updates
226+
227+
1. Update tests to use sync patterns
228+
2. Remove unnecessary `await` in test code
229+
230+
## Benefits
231+
232+
1. **Simpler code patterns** - No async/await gymnastics for simple type checks
233+
2. **Better TypeScript inference** - Sync narrowing works immediately
234+
3. **Performance** - Eliminates microtask queue overhead from Promise wrapping
235+
4. **Consistency** - Filters and resolution both sync; easier mental model
236+
5. **Enables sync APIs** - Can offer sync versions of common operations
237+
238+
## Risks and Mitigations
239+
240+
### Risk: Future Streaming Support
241+
242+
If we later want streaming/lazy file I/O, resolution would need to be async.
243+
244+
**Mitigation**: We can re-introduce async at the `PDF.load()` boundary. The resolver callback could do eager loading of the entire xref at load time, keeping `resolve()` sync. Alternatively, provide both sync and async APIs.
245+
246+
### Risk: Large Refactor
247+
248+
~60+ call sites need updating.
249+
250+
**Mitigation**: This is mechanical and can be done systematically. TypeScript will catch any missed sites (type errors from awaiting non-Promises or assigning `PdfObject | null` where `Promise<...>` was expected).
251+
252+
## Non-Goals
253+
254+
- Changing the public `PDF.load()` API (remains async for file I/O)
255+
- Changing the save/write pipeline (has legitimate async needs)
256+
- Making everything sync (some operations genuinely need async)
257+
258+
## Success Criteria
259+
260+
1. All tests pass
261+
2. No `async` methods that only awaited `resolve()`
262+
3. TypeScript compiles without errors
263+
4. `resolve()` calls no longer use `await`

.agents/scratch/text-extraction-api-design.ts renamed to .agents/scratch/text-extraction-api-design.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,6 @@
1+
```typescript
2+
/* oxlint-disable */
3+
14
/**
25
* Text Extraction API Design for @libpdf/core
36
*
@@ -475,3 +478,4 @@ function groupCharsIntoLines(chars: ExtractedChar[]): TextLine[] {
475478
// Implementation omitted for brevity
476479
return [];
477480
}
481+
```

.oxlintrc.json

Lines changed: 14 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -9,23 +9,35 @@
99
"curly": "error",
1010
"no-useless-concat": "off",
1111
"no-unused-vars": "off",
12-
"@typescript-eslint/no-extraneous-class": "off"
12+
"@typescript-eslint/no-extraneous-class": "off",
13+
"@typescript-eslint/await-thenable": "error",
14+
"@typescript-eslint/no-floating-promises": "error",
15+
"@typescript-eslint/require-await": "error"
1316
},
1417
"overrides": [
1518
{
1619
"files": ["**/*.test.ts", "**/*.spec.ts"],
1720
"rules": {
21+
"no-unsafe-optional-chaining": "off",
1822
"@typescript-eslint/no-non-null-assertion": "off",
1923
"@typescript-eslint/no-unused-vars": "off",
2024
"@typescript-eslint/no-explicit-any": "off",
21-
"no-unsafe-optional-chaining": "off"
25+
"@typescript-eslint/no-unsafe-type-assertion": "off",
26+
"@typescript-eslint/no-base-to-string": "off",
27+
"@typescript-eslint/require-await": "off"
2228
}
2329
},
2430
{
2531
"files": ["examples/**/*.ts"],
2632
"rules": {
2733
"no-unused-vars": "off"
2834
}
35+
},
36+
{
37+
"files": ["apps/docs/**/*.{ts,tsx,mjs}"],
38+
"rules": {
39+
"@typescript-eslint/require-await": "off"
40+
}
2941
}
3042
],
3143
"ignorePatterns": ["dist", "node_modules", "checkouts", ".docs"]

.vscode/settings.json

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44
"editor.codeActionsOnSave": {
55
"source.fixAll.oxc": "always"
66
},
7+
"oxc.typeAware": true,
78
"[json]": {
89
"editor.defaultFormatter": "oxc.oxc-vscode"
910
},

apps/docs/tsconfig.json

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,5 @@
11
{
22
"compilerOptions": {
3-
"baseUrl": ".",
43
"target": "ESNext",
54
"lib": ["dom", "dom.iterable", "esnext"],
65
"allowJs": true,
@@ -17,7 +16,7 @@
1716
"incremental": true,
1817
"paths": {
1918
"@/*": ["./*"],
20-
"fumadocs-mdx:collections/*": [".source/*"]
19+
"fumadocs-mdx:collections/*": ["./.source/*"]
2120
},
2221
"plugins": [
2322
{

bun.lock

Lines changed: 15 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

examples/01-basic/create-empty-pdf.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@ async function main() {
2525
console.log(`Page Count: ${pdf.getPageCount()}`);
2626

2727
// Get the page to show its dimensions
28-
const page = await pdf.getPage(0);
28+
const page = pdf.getPage(0);
2929
if (page) {
3030
console.log(`\nPage 1 Dimensions:`);
3131
console.log(` Width: ${page.width} points (${(page.width / 72).toFixed(2)} inches)`);

0 commit comments

Comments
 (0)