@fireairforce fireairforce commented Jan 15, 2026

This change sorts the module factories within each Turbopack chunk by their identifiers, following Webpack's approach: https://github.com/webpack/webpack/blob/main/lib/util/comparators.js#L312-L319

This modification can achieve approximately 100% actual size reduction for some Turbopack build artifacts (the improvement is more significant in larger projects):

Before:

image

After:

image

The reasoning is as follows (I haven't yet created a minimal reproduction). Umi's SPA applications load their routes via dynamic imports:

  • Route A asynchronously loads modules [X, Y, Z]
  • Route B asynchronously loads modules [Z, X, Y]

Then, when generating chunks:

  • modules = [X, Y, Z] -> ident hash = abc123 -> chunk_abc123.js
  • modules = [Z, X, Y] -> ident hash = def456 -> chunk_def456.js

However, these two chunks contain exactly the same modules, so they are effectively the same chunk emitted twice. We therefore sort each chunk's modules by module ID (or a similar key), so that the module order within a chunk is stable:

  • modules = [X, Y, Z] → sort → [X, Y, Z] → ident hash = abc123 → chunk_abc123.js
  • modules = [Z, X, Y] → sort → [X, Y, Z] → ident hash = abc123 → chunk_abc123.js

This way, both module lists produce a single chunk instead of two, which removes a significant amount of duplicated chunk output.
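The effect can be sketched in a few lines of Rust. This is only an illustration, not Turbopack's code: `chunk_ident`, `sorted_chunk_ident`, and `distinct_chunks` are hypothetical helpers, and the standard library's `DefaultHasher` stands in for whatever hash Turbopack actually uses for chunk idents.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a chunk ident: a hash over the module
/// identifiers, fed to the hasher in the order they are given.
fn chunk_ident(modules: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for module in modules {
        module.hash(&mut hasher);
    }
    hasher.finish()
}

/// Same hash, but computed over a sorted copy of the module identifiers,
/// so the result no longer depends on the input order.
fn sorted_chunk_ident(modules: &[&str]) -> u64 {
    let mut sorted = modules.to_vec();
    sorted.sort_unstable();
    chunk_ident(&sorted)
}

/// Counts how many distinct chunk idents a set of routes would produce,
/// with or without sorting each route's module list first.
fn distinct_chunks(routes: &[Vec<&str>], sort: bool) -> usize {
    routes
        .iter()
        .map(|modules| {
            if sort {
                sorted_chunk_ident(modules)
            } else {
                chunk_ident(modules)
            }
        })
        .collect::<HashSet<u64>>()
        .len()
}
```

Calling `distinct_chunks` on routes `[X, Y, Z]` and `[Z, X, Y]` should report two idents without sorting and one with sorting, which mirrors the `abc123` / `def456` example above.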

This should not affect existing behavior: the change only reorders the module factories emitted in a JS chunk, while the actual execution order is still determined by the dependency graph. For the same reason, it should not affect module merging.
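To illustrate why the order of factories in a chunk is independent of execution order, here is a simplified model in Rust. It is an assumption-laden sketch, not the Turbopack runtime: `Factory`, `execution_order`, and `visit` are hypothetical names, and evaluation is modeled as a depth-first walk over each module's imports in source order.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical factory entry: a module id plus the ids it imports,
/// in source order.
type Factory<'a> = (&'a str, Vec<&'a str>);

/// Returns the order in which modules would run when `entry` is evaluated.
/// The position of a factory in `factories` is never consulted; only the
/// import edges are.
fn execution_order<'a>(factories: &[Factory<'a>], entry: &'a str) -> Vec<&'a str> {
    let by_id: HashMap<&'a str, &[&'a str]> = factories
        .iter()
        .map(|(id, imports)| (*id, imports.as_slice()))
        .collect();
    let mut seen = HashSet::new();
    let mut order = Vec::new();
    visit(&by_id, entry, &mut seen, &mut order);
    order
}

/// Depth-first walk: a module runs after all of its imports have run.
fn visit<'a>(
    by_id: &HashMap<&'a str, &[&'a str]>,
    id: &'a str,
    seen: &mut HashSet<&'a str>,
    order: &mut Vec<&'a str>,
) {
    if !seen.insert(id) {
        return; // already evaluated
    }
    if let Some(imports) = by_id.get(id) {
        for dep in imports.iter() {
            visit(by_id, *dep, seen, order);
        }
    }
    order.push(id);
}
```

In this model, passing the factories in their original order or sorted by id yields the same execution order, because the walk only follows import edges starting from the entry module.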

Here's a Webpack example:

// index.js
import "./module-a.js";
import "./module-c.js";
import "./module-d.js";
import "./module-b.js";

console.log("main");

// module-a.js
console.log("module a");

// module-b.js
console.log("module b");

// module-c.js
console.log("module c");

// module-d.js
console.log("module d");

When Webpack generates the index.js chunk, the modules are emitted in the same order regardless of whether the import order is a, b, c, d or a, c, d, b:

image

The modules are sorted by their module identifiers (a -> b -> c -> d), so the same approach should also be applicable to Turbopack.

Related issue: utooland/utoo#2477

@gemini-code-assist
Summary of Changes

Hello @fireairforce, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

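This pull request introduces a deterministic ordering mechanism for module factories in Turbopack, based on each module's identifier. The improvement aims to reduce bundle size by identifying and merging chunks that have identical content but a different module order, and it is particularly effective with dynamic imports. It aligns Turbopack's chunk generation logic with Webpack's approach, ensuring stable output and reducing redundancy without affecting the actual execution order of the code.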

Highlights

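  • Module factory sorting: Module factories in Turbopack chunks are sorted by their identifier (ident) to make the output deterministic and reduce the generation of duplicate chunks.
  • Size optimization: With sorting, chunks that contain the same modules in a different order are recognized as the same chunk, which avoids generating duplicates and significantly reduces build artifact size, especially in dynamic import scenarios.
  • Behavior alignment: This change follows Webpack's approach, aiming to keep Turbopack's module ordering consistent with Webpack's behavior while leaving code execution order unaffected.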

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

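The goal of this PR is clear and valuable: by sorting the module factories within a chunk by ident, it fixes the problem of duplicate chunks being generated when modules are imported in different orders, which significantly reduces build artifact size. Judging by the screenshots in the description, the effect is substantial.

In terms of implementation, the core changes are in the turbopack-ecmascript crate: sorting logic was added to the create_chunk method in chunk_type.rs and to the ident method in mod.rs. Together, these two changes make both the generation of a chunk's content and the generation of the chunk's own identifier deterministic, which keeps chunks stable.

Overall, the code logic is clear and the implementation is correct. Some related code in chunking_context.rs was also nicely refactored, improving readability and performance. This is a high-quality optimization.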


let assets = chunk_items
// Sort chunk items by their content identifier for deterministic asset ordering
let mut assets_with_sort_key: Vec<_> = chunk_items
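The rest of the diff is not shown in this thread, so the following is only a rough sketch of the pattern the `assets_with_sort_key` name suggests: pair each asset with a sort key, sort by that key, then drop the key. `ChunkItem`, its fields, and `sorted_assets` are hypothetical; the real Turbopack types and the way the content identifier is obtained are assumptions here.

```rust
/// Hypothetical stand-in for a chunk item; the real Turbopack types
/// are not shown in this thread.
#[derive(Debug, Clone, PartialEq)]
struct ChunkItem {
    /// Assumed to be a string form of the item's content identifier.
    content_ident: String,
    /// Placeholder for the asset produced from this item.
    asset: String,
}

/// Returns the assets ordered by their content identifier.
fn sorted_assets(chunk_items: Vec<ChunkItem>) -> Vec<String> {
    // Pair every asset with its sort key up front, so the key is
    // computed once per item rather than once per comparison.
    let mut assets_with_sort_key: Vec<(String, String)> = chunk_items
        .into_iter()
        .map(|item| (item.content_ident, item.asset))
        .collect();

    // `sort_by` is a stable sort: items with equal keys keep their
    // original relative order.
    assets_with_sort_key.sort_by(|a, b| a.0.cmp(&b.0));

    // Drop the keys and keep only the assets, now in a deterministic order.
    assets_with_sort_key
        .into_iter()
        .map(|(_, asset)| asset)
        .collect()
}
```

Computing the key before sorting is useful when the key is expensive to obtain or has to be resolved ahead of time, since the comparison closure then only compares plain strings.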
I think this is ready to be submitted upstream.

@fireairforce fireairforce merged commit 739f84e into utoo Jan 16, 2026
14 of 27 checks passed