feat: schema codegen bucketing to help with tree-shaking #1672
Context
In the implementation of Schema serde #1600, schema objects correspond roughly 1-1 with modeled shapes, so their number grows linearly with the model size.
Initially, all schemas were generated into a single file called /schema/schemas.ts. Because schema objects call factory functions, they are not tree shaken correctly. Even adding @__PURE__ etc. annotations in the source code did not work, with both webpack and esbuild failing to tree shake correctly (rollup did fine).
PR
This PR adds a bucketing system that generates schema objects into separate files.
Solution / heuristic
We can't arbitrarily split the schemas into files of e.g. 100 schemas each. Schema objects form a dependency tree, and random bucketing would scatter each operation's dependencies across many files, creating poor tree-shaking outcomes.
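A rough sketch of why count-based bucketing hurts, using a made-up dependency graph. The shape names, the chunk size of 2, and the helpers `closure` and `chunkBuckets` are all hypothetical and not taken from the PR; the point is only that an operation's dependency closure ends up spread over several files, each of which drags in unrelated schemas.

```typescript
// Hypothetical dependency graph: each schema lists the schemas it references.
const deps: Record<string, string[]> = {
  GetObjectInput: ["ChecksumMode"],
  GetObjectOutput: ["ChecksumMode", "Metadata"],
  ChecksumMode: [],
  Metadata: [],
  PutBucketPolicyInput: [],
  ListBucketsOutput: ["BucketList"],
  BucketList: ["Bucket"],
  Bucket: [],
};

// Collect every schema reachable from the given roots (depth-first walk).
function closure(roots: string[]): Set<string> {
  const seen = new Set<string>();
  const stack = [...roots];
  while (stack.length > 0) {
    const name = stack.pop()!;
    if (seen.has(name)) continue;
    seen.add(name);
    stack.push(...(deps[name] ?? []));
  }
  return seen;
}

// Naive bucketing: sort the names, then cut them into fixed-size chunks.
// Returns the file index assigned to each schema.
function chunkBuckets(names: string[], size: number): Map<string, number> {
  const fileOf = new Map<string, number>();
  [...names].sort().forEach((name, i) => fileOf.set(name, Math.floor(i / size)));
  return fileOf;
}

const fileOf = chunkBuckets(Object.keys(deps), 2);

// Schemas a hypothetical GetObject operation actually needs.
const required = closure(["GetObjectInput", "GetObjectOutput"]);

// Files that must be loaded to get those schemas.
const files = new Set<number>();
required.forEach((name) => files.add(fileOf.get(name)!));

// Everything that comes along because it shares a file with a required schema.
const loaded = Object.keys(deps).filter((name) => files.has(fileOf.get(name)!));

console.log(`needed ${required.size} schemas, loaded ${loaded.length} from ${files.size} files`);
```

With this toy data the operation needs 4 schemas but loads 6, spread across 3 of the 4 files.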
The solution in the PR buckets schema objects by logical operation group. For example, in the AWS S3 model, the following schema objects are grouped together in a file called schemas_Payment.ts.
Although the bucketing doesn't create the absolute minimum import graph that one file per schema would, it balances file count against import size: each consumer imports a significantly smaller, logically related subset of the schema tree.
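As an illustration only, grouping by operation could be sketched as follows. This is not the PR's partitioning algorithm. It assumes each operation carries a hypothetical `group` label, that a schema reached by exactly one group goes into that group's file, and that a schema reached by several groups goes into a hypothetical `schemas_shared.ts`. The operations, shapes, and helper names are invented for the example.

```typescript
type Operation = { name: string; group: string; roots: string[] };

// Hypothetical dependency graph: each schema lists the schemas it references.
const deps: Record<string, string[]> = {
  GetObjectInput: ["ChecksumMode"],
  GetObjectOutput: ["ChecksumMode", "Metadata"],
  PutObjectInput: ["Metadata"],
  ListObjectsOutput: ["Owner"],
  ListBucketsOutput: ["BucketList", "Owner"],
  BucketList: [],
  ChecksumMode: [],
  Metadata: [],
  Owner: [],
};

// Hypothetical operations with an assumed group label on each.
const operations: Operation[] = [
  { name: "GetObject", group: "Object", roots: ["GetObjectInput", "GetObjectOutput"] },
  { name: "PutObject", group: "Object", roots: ["PutObjectInput"] },
  { name: "ListObjects", group: "Object", roots: ["ListObjectsOutput"] },
  { name: "ListBuckets", group: "Bucket", roots: ["ListBucketsOutput"] },
];

// Collect every schema reachable from the given roots (depth-first walk).
function reachable(roots: string[]): Set<string> {
  const seen = new Set<string>();
  const stack = [...roots];
  while (stack.length > 0) {
    const name = stack.pop()!;
    if (seen.has(name)) continue;
    seen.add(name);
    stack.push(...(deps[name] ?? []));
  }
  return seen;
}

// Assign each schema to a file based on which operation groups reach it.
function bucketByGroup(ops: Operation[]): Record<string, string[]> {
  // schema name -> groups whose operations depend on it
  const groupsOf = new Map<string, Set<string>>();
  for (const op of ops) {
    reachable(op.roots).forEach((schema) => {
      const groups = groupsOf.get(schema) ?? new Set<string>();
      groups.add(op.group);
      groupsOf.set(schema, groups);
    });
  }

  const buckets: Record<string, string[]> = {};
  groupsOf.forEach((groups, schema) => {
    // One owning group: that group's file. Several groups: the shared file (assumed name).
    const file =
      groups.size === 1 ? `schemas_${Array.from(groups)[0]}.ts` : "schemas_shared.ts";
    if (!buckets[file]) buckets[file] = [];
    buckets[file].push(schema);
  });
  return buckets;
}

const buckets = bucketByGroup(operations);
console.log(buckets);
```

In this toy model a consumer that only calls `ListBuckets` would import `schemas_Bucket.ts` and `schemas_shared.ts` and never touch the six schemas in `schemas_Object.ts`, at the cost of three files instead of nine.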
The graph partitioning details are in the PR code.