-
-
Notifications
You must be signed in to change notification settings - Fork 4.7k
Introduce canonicalizeCandidates
on the internal Design System
#19059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
94e1b8e
to
6a59311
Compare
I feel like this would better be present in another package or a separate and completely isolated file, not the core one. Specifically since core is part of the browser build which is pretty sensitive to code size and this is moving a bunch of code into it now. Have you thought about this? This PR also adds |
I see that you mentioned file size, how the hell does core have more than 3MB 🤔 That seems very unexpected. I think something with your measurement is wrong: https://bundlephobia.com/package/@tailwindcss/[email protected] |
// Parse a dimension such as `64rem` into `[64, 'rem']`. | ||
export const dimensions = new DefaultMap((input) => { | ||
let match = /^(?<value>-?(?:\d*\.)?\d+)(?<unit>[a-z]+|%)$/i.exec(input) | ||
let match = DIMENSION_REGEX.exec(input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason I touched this is because I noticed it before I wanted to move it from the upgrade package to core.
_userConfig: Config | null, | ||
rawCandidate: string, | ||
): Promise<string> { | ||
): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was annotating migrations to know whether it was a v3 → v4 upgrade, or if it was a v4 optimization. Noticed that this was using async which is not necessary in this case.
|
||
function themeToVar(designSystem: DesignSystem, rawCandidate: string): string { | ||
let convert = createConverter(designSystem) | ||
let convert = converterCache.get(designSystem) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at this commit, use the ?w=1
GitHub flag. The only real changes here are:
- Re-indent because of the wrapping
DefaultMap
- The removal of the
prettyPrint
flag, that was only used in the CSS part (which still exists in the CSS version where whitespace is fine.
} | ||
|
||
function pathToVariableName(path: string) { | ||
function pathToVariableName(path: string, shouldPrefix = true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually a bug we shipped and only noticed it because the canonicalization tests run all examples in multiple catagories:
- The candidate on its own
- The candidate with a variant in front
- The candidate with an important
!
at the end - The candidate with a
tw:
prefix in front
Then I noticed that this broke a test. Without this, the following candidate would be using --tw-
as well:
- tw:bg-[--theme(--color-red-500/50%)]/50
+ tw:bg-[--theme(--tw-color-red-500/50%)]/50
The problem is that the --theme(…)
does take the value without the --tw-
flag: https://play.tailwindcss.com/GMFKMFGLFz
You could argue that this is a --theme(…)
bug and that we should handle it there. But this is the behavior of --theme(…)
today, so didn't want to break it.
@@ -1,13 +1,13 @@ | |||
import { substituteAtApply } from '../../../../tailwindcss/src/apply' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was kind of a smell that all these imports in the upgrade package were using tailwindcss
internals 😅
return candidate | ||
} | ||
|
||
candidate.root = `bg-linear-to-${direction}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already cloned the candidate when it came out of the design system, so we can mutate it freely without causing issues.
} | ||
} | ||
|
||
function baseCandidate<T extends Candidate>(candidate: T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before running the canonicalization steps, we already drop the important
flag and variants
so each step is already dealing with the base candidate (utility).
for (let readonlyCandidate of designSystem.parseCandidate(rawCandidate)) { | ||
// The below logic makes use of mutation. Since candidates in the | ||
// DesignSystem are cached, we can't mutate them directly. | ||
let candidate = structuredClone(readonlyCandidate) as Writable<typeof readonlyCandidate> | ||
|
||
// Create a basic stripped candidate without variants or important flag. We | ||
// will re-add those later but they are irrelevant for what we are trying to | ||
// do here (and will increase cache hits because we only have to deal with | ||
// the base utility, nothing more). | ||
let targetCandidate = baseCandidate(candidate) | ||
let targetCandidateString = printUnprefixedCandidate(designSystem, targetCandidate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar here, we already are dealing with the base candidate so we can just drop all of this entirely.
|
||
for (let candidate of parseCandidate(designSystem, rawCandidate)) { | ||
let clone = structuredClone(candidate) as Writable<typeof candidate> | ||
let changed = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were tracking whether something changed or not, and if it did we would re-print the Candidate AST, otherwise we would just re-use the incoming rawCandidate
.
Now, we are directly mutating the incoming candidate: Candidate
which we will return as well, so this code can be dropped.
const ATTRIBUTE_REGEX = | ||
/\[\s*(?<attribute>[a-zA-Z_-][a-zA-Z0-9_-]*)\s*((?<operator>[*|~^$]?=)\s*(?<quote>['"])?\s*(?<value>.*?)\4\s*(?<sensitivity>[is])?\s*)?\]/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one isn't even 100% correct, but it's slower already so didn't want to make it more correct 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you balancing correctness, bundle size and performance? There could always be work done on a proper parser here but of course comes with more code as a cost
Yeah core isn't 3.5MB right now. It's about 680KB (which is still a lot but that's because CJS + ESM duplication). The build process doesn't clear the |
6a59311
to
ccbd48d
Compare
While going through the migrations, I noticed that this was still marked as async even though nothing is async about this. Simplified it to a simple sync function instead.
This is likely temporary, but needed to know which migrations were sync or async, and which migrations are an _actual_ migration from v3 → v4 or which ones are mere optimizations in v4 land. The v4 optimizations will be moved to the `canonicalizeCandidates` logic straight on the design system.
+ initial basic implementation
After all migrations happened, we run the canonicalization step to perform the last bit of migrations
We will likely clean this up later. But just trying to move things aroudn first.
If you use `canonicalizeCandidates(['bg-red-500', 'bg-red-500/100'])` they would both canonicalize to `bg-red-500`. No need to return duplicates here.
Note: this commit on its own _will_ have some failing tests because the `migrate-arbitrary-utilities.test.ts` file combined some other migrations as well. Eventually once those are ported over, will make the tests pass. Another thing to notice is that we updated some tests. The more we canonicalize candidates, the more tests will need upgrading to an even more canonical form.
This test setup already tests various cases including prefixes and important strategies.
We were also converting the current charcode back to a string, but we already had that information in `input[i]`
The part where we upgrade old v3 syntax like `group-[]:` still exists in the `@tailwindcss/upgrade` package. We were relying on `postcss-selector-parser`, but the copied part is now using our own `SelectorParser`. I also added a temporary attribute selector parser to make everything pass. Don't look at the code for that _too_ much because we will swap it out in a later commit 😅
A bunch of migrations were moved to `canonicalizeCandidates`, so we still have to run that at the end. Also used: ```css @theme { --*: initial; } ``` To make the design system a bit faster.
Note: we made sure that all the tests passed before this commit, and we also made sure to not touch any of the tests as part of this commit ensuring that everything still works as expected. Aka, an actual refactor without behavior change. This is a big commit and a big refactor but that's mainly because there are some re-indents that are happening. Will maybe split up this commit but then it won't be an atomic (working) commit anymore. One thing I noticed is that all our "migrations" essentially parsed a raw candidate, did some work and then stringified the candidate AST again. That's one of the reasons I inlined the migrations in previous commits in a single file. Looking at the file now, it's a little bit silly that we parse and print over and over again. Parsing a raw candidate also results in an an array of `Candidate` AST nodes, so we always had a loop going on. Another thing I noticed is that often we need to work with a "base" candidate, that's essentially the candidate without the variants and without the important flag. That's where this refactor comes in: 1. We parse each candidate _once_ 2. When there are variants, we canonicalize each variant separately and store the results in a separate cache. 3. When there are variants _or_ the important flag is present, we canonicalize the base utility separately and later re-attach the variants and important flag at the end. This means that not a single canonicalization step has to deal with variants or important flag. 4. Since we only want to parse the candidate once on the outside, we will pass a cloned `Candidate` to each canonicalize step to prevent parsing and printing over and over again. 5. Only at the end will we re-print the `Candidate` This process has a few benefits: 1. We are handling candidates and variants separately and caching them separately. This means that we can increase cache hits because variants from candidate A and candidate B can be re-used. E.g.: `[@media_print]:flex` and `[@media_print]:underline` will map to `print:flex` and `print:underline` where `[@media_print]` is only handled once. 2. This also means that we could simplify some canonicalization steps because they don't have to worry about variants and the important flag. 3. There is no parsing & looping over the parsed candidates array going on.
The the scenario of creating signatures for a component, we always want to respect the important flag if it is given. Therefore I think that we can drop the `designSystem.compileAstNodes(candidate, CompileAstFlags.None)` which saves us some time. Also removed the `structuredClone` call. Whenever we handle `@apply` which essentially happened the step before, we already start from a cloned node internally. This call just meant we were doing it again.
This implementation is also 2-3x faster than the original one. ``` AttributeSelectorParser.parse - src/attribute-selector-parser.bench.ts > parsing 2.24x faster than REGEX.test(…) 2.59x faster than ….match(REGEX) 2.84x faster than parseAttributeSelector ```
ccbd48d
to
226ae47
Compare
Yep, that was the reason why! I'm not that stupid and I did clear them manually before. But I'm stupid enough not to realise that Turborepo just put them all back when running a build as if nothing ever happened. 😅 |
ast.push(node) | ||
} | ||
} | ||
buffer = String.fromCharCode(currentChar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking we did this in purpose for perf reasons but it might not actually make a difference compared to everything else.
} | ||
|
||
// Update the candidate with the new value | ||
replaceObject(candidate, replacementCandidate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all the uses of replaceObject(candidate, replacementCandidate); return replacementCandidate
can be just return replacementCandidate
right?
It'd be cool if we could get rid of replaceObject
entirely for variants too but that's probably not realistic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know... very likely. Let me test that out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to merge this with the main selector parser or nah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't because we don't need it anywhere else. But we can make it part of the SelectorParser and introduce a new attribute
kind 🤔
I think the postcss-selector-parser
actually has a selector
kind that is just a group, each node is class
, id
, attribute
etc.
The only other spot in core where we use the selector parser is inside of the compat plugin API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kinda indifferent — it just feels odd to have a "selector parser" and a separate "attribute selector parser". Reads like the selector parser doesn't handle attributes at all. (it does just doesn't split out their "parts").
Maybe @philipp-spiess can weigh in on what he thinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall things look good 💯
Massive PR but will be awesome to have this stuff in core. If we can speed up the initialization step enough I'll be able to make this a "default on" diagnostic for intellisense which would be pretty cool.
build.onLoad({ filter: /intellisense.ts$/ }, () => { | ||
return { | ||
contents: ` | ||
export function getClassList() { return [] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean these APIs aren't available when running the browser version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These APIs are internal and were already inaccessible in the @tailwindcss/browser
package. They were on an object (i.e. not tree-shakable) which meant they still took up some space. This replaces those APIs so they take up much less space since they can't be used anyway.
basically this is just a code-size reduction and doesn't eliminate any actual functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can still import the tailwindcss
package directly — even in the browser — it'll work just fine and will have these APIs.
This is done to help eliminate some bundle size concerns with the CDN version (e.g. the one that scans class="…" attributes in HTML and compiles on the fly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that must be it great, thanks mate
These are globally inserted and don't super matter, at least I don't think so and it will free up a lot of memory. ~213 MB → ~154 MB
This PR introduces a new
canonicalizeCandidates
function on the internal Design System.The big motivation to moving this to the core
tailwindcss
package is that we can use this in various places:Here is a list of all the changes:
canonicalizeCandidates
function to the design systemprintCandidate
tests to thetailwindcss
packagecanonicalizeCandidates
based on the existing tests in the upgrade tool.I noticed that all the migrations followed a specific pattern:
Candidate[]
ASTCandidate
to a newCandidate
(this often handled both theCandidate
and itsVariant[]
)Candidate
back to a string, and pass it to the next migration step.While this makes sense in isolation, we are doing a lot of repeated work by parsing, modifying, and printing the candidate multiple times. This let me to introduce the
big refactor
commit. This changes the steps to:Candidate[]
once.utility
and not care about the variants or the important marker. We can re-attach these afterwards.rawCandidate: string
, each migration step receives an actualCandidate
object (or aVariant
object).Candidate
and theVariant[]
.All of this means that there is a lot less work that needs to be done. We can also cache results between migrations. So
[@media_print]:flex
and[@media_print]:block
will result inprint:flex
andprint:block
respectively, but the[@media_print]
part is only migrated once across both candidates.One migration step relied on the
postcss-selector-parser
package to parse selectors and attribute selectors. I didn't want to introduce a package just for this, so instead used our ownSelectorParser
in the migration and wrote a smallAttributeSelectorParser
that can parse the attribute selector into a little data structure we can work with instead.If we want, we can split this PR up into smaller pieces, but since the biggest chunk is moving existing code around, I think it's fairly doable to review as long as you go commit by commit.
With this new API, we can turn:
Into their canonicalized form:
The list is also unique, so we won't end up with
bg-red-500 bg-red-500
twice.While the canonicalization itself is fairly fast, we still pay a ~1s startup cost for some migrations (once, and cached for the entire lifetime of the design system). I would like to keep improving the performance and the kinds of migrations we do, but I think this is a good start.
The cost we pay is for:
getClassList
suggestions API.The canonicalization step for this list takes ~2.9ms on my machine.
Just for fun, if you use the
getClassList
API for intellisense that generates all the suggestions and their modifiers, you get a list of 263788 classes. If you canonicalize all of these, it takes ~500ms in total. So roughly ~1.9μs per candidate.This new API doesn't result in a performance difference for normal Tailwind CSS builds.
The other potential concern is file size of the package. The generated
tailwindcss.tgz
file changed like this:So the package increased by ~65 kB which I don't think is the end of the world, but it is important for the
@tailwindcss/browser
build which we don't want to grow unnecessarily. For this reason we remove some of the code for the design system conditionally such that you don't pay this cost in an environment where you will never need this API.The
@tailwindcss/browser
build looks like this: