
Commit 6af4835

RobinMalfait, thecrypticace, and adamwathan authored
Improve performance of scanning source files (#15270)
This PR improves file scanning by scanning chunks of the files in parallel. Each chunk is separated by newlines, since we can't use whitespace in classes anyway. This also means that we can use the power of your CPU to scan files faster. The extractor itself also has less state to worry about on these smaller chunks.

On a dedicated benchmark machine (Mac Mini, M1, 16 GB RAM):

```shellsession
❯ hyperfine --warmup 15 --runs 50 \
    -n NEW 'bun --bun /Users/ben/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.ts -i ./tailwind.css -o out.css' \
    -n CURRENT 'bun --bun /Users/ben/github.com/tailwindlabs/tailwindcss--next/packages/@tailwindcss-cli/src/index.ts -i ./tailwind.css -o out.css'

Benchmark 1: NEW
  Time (mean ± σ):     337.2 ms ±   2.9 ms    [User: 1376.6 ms, System: 80.9 ms]
  Range (min … max):   331.0 ms … 345.3 ms    50 runs

Benchmark 2: CURRENT
  Time (mean ± σ):     730.3 ms ±   3.8 ms    [User: 978.9 ms, System: 78.7 ms]
  Range (min … max):   722.0 ms … 741.8 ms    50 runs

Summary
  NEW ran
    2.17 ± 0.02 times faster than CURRENT
```

On a more powerful machine (MacBook Pro M1 Max, 64 GB RAM), the results look even more promising:

```shellsession
❯ hyperfine --warmup 15 --runs 50 \
    -n NEW 'bun --bun /Users/robin/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.ts -i ./tailwind.css -o out.css' \
    -n CURRENT 'bun --bun /Users/robin/github.com/tailwindlabs/tailwindcss--next/packages/@tailwindcss-cli/src/index.ts -i ./tailwind.css -o out.css'

Benchmark 1: NEW
  Time (mean ± σ):     307.8 ms ±  24.5 ms    [User: 1124.8 ms, System: 187.9 ms]
  Range (min … max):   291.7 ms … 397.9 ms    50 runs

Benchmark 2: CURRENT
  Time (mean ± σ):     754.7 ms ±  27.2 ms    [User: 934.9 ms, System: 217.6 ms]
  Range (min … max):   735.5 ms … 845.6 ms    50 runs

Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  NEW ran
    2.45 ± 0.21 times faster than CURRENT
```

> Note: This last benchmark ran on my main machine, which is "busier" than my dedicated benchmark machine. Because of this I had to increase `--runs` to get statistically better results. A warning is still present, but the overall numbers are still very promising.

---

These benchmarks were run on our Tailwind UI project, where we have >1000 files and >750 000 lines of code across those files.

| Before | After |
| --- | --- |
| <img width="385" alt="image" src="https://github.com/user-attachments/assets/4786b842-bedc-4456-a9ca-942f72ca738c"> | <img width="382" alt="image" src="https://github.com/user-attachments/assets/fb43cff8-95e7-453e-991e-d036c64659ba"> |

---

I am sure there is more we can do here, because reading all of these 1000 files only takes ~10ms, whereas parsing them takes ~180ms. But I'm still happy with these results as an incremental improvement.

For good measure, I also wanted to make sure that we didn't regress on smaller projects. Running this on Catalyst, we only have to deal with ~100 files and ~18 000 lines of code. In this case reading all the files takes ~890µs and parsing takes about ~4ms.

| Before | After |
| --- | --- |
| <img width="381" alt="image" src="https://github.com/user-attachments/assets/25d4859f-d058-4f57-a2f6-219d8c4b1804"> | <img width="390" alt="image" src="https://github.com/user-attachments/assets/f06d7536-337b-4dc0-a460-6a9f141c65f5"> |

Not a huge difference, but it's still better and there are definitely no regressions, which sounds like a win to me.

---

**Edit:** After talking to @thecrypticace, instead of splitting on any whitespace we now split on newlines only. This makes the chunks a bit larger, but it reduces the overhead of the extractor itself. This results in a 2.45× speedup on Tailwind UI, compared to a 1.94× speedup before.

---------

Co-authored-by: Jordan Pittman <[email protected]>
Co-authored-by: Adam Wathan <[email protected]>
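The split-scan-merge idea behind the PR can be sketched with just the standard library. This is a simplified illustration, not the actual implementation (which uses rayon's parallel iterators inside the oxide crate); `extract_candidates` here is a hypothetical stand-in for the real extractor:

```rust
use std::collections::HashSet;
use std::thread;

// Hypothetical stand-in for the real extractor: collect the unique
// whitespace-free "words" in one line.
fn extract_candidates(line: &str) -> HashSet<String> {
    line.split_whitespace().map(str::to_string).collect()
}

fn scan_parallel(input: &str, workers: usize) -> Vec<String> {
    // Splitting on newlines is safe because a class name can never
    // contain a newline, so no candidate straddles two chunks.
    let lines: Vec<&str> = input.lines().collect();
    let chunk_size = ((lines.len() + workers - 1) / workers).max(1);

    let mut set: HashSet<String> = HashSet::new();
    thread::scope(|s| {
        // One scoped thread per contiguous chunk of lines.
        let handles: Vec<_> = lines
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    let mut local = HashSet::new();
                    for line in chunk {
                        local.extend(extract_candidates(line));
                    }
                    local
                })
            })
            .collect();
        // Merge the per-thread sets, mirroring rayon's `reduce` step.
        for h in handles {
            set.extend(h.join().unwrap());
        }
    });

    let mut result: Vec<String> = set.into_iter().collect();
    result.sort();
    result
}

fn main() {
    let found = scan_parallel("flex p-4\np-4 text-sm\nflex", 4);
    assert_eq!(found, vec!["flex", "p-4", "text-sm"]);
    println!("{found:?}");
}
```

Because each worker produces an independent set, deduplication happens twice: locally per chunk, then once more during the merge, which is why the final sort is the only step that must see all candidates.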
1 parent e9426d0 commit 6af4835

File tree

2 files changed: +14 −14 lines changed


CHANGELOG.md

Lines changed: 3 additions & 1 deletion

```diff
@@ -7,7 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
-- Nothing yet!
+### Added
+
+- Parallelize parsing of individual source files ([#15270](https://github.com/tailwindlabs/tailwindcss/pull/15270))
 
 ## [4.0.0-beta.4] - 2024-11-29
 
```
crates/oxide/src/lib.rs

Lines changed: 11 additions & 13 deletions

```diff
@@ -102,13 +102,11 @@ impl Scanner {
     pub fn scan(&mut self) -> Vec<String> {
         init_tracing();
         self.prepare();
-        self.check_for_new_files();
         self.compute_candidates();
 
-        let mut candidates: Vec<String> = self.candidates.clone().into_iter().collect();
-
-        candidates.sort();
+        let mut candidates: Vec<String> = self.candidates.clone().into_par_iter().collect();
 
+        candidates.par_sort();
         candidates
     }
 
@@ -140,7 +138,7 @@ impl Scanner {
         let extractor = Extractor::with_positions(&content[..], Default::default());
 
         let candidates: Vec<(String, usize)> = extractor
-            .into_iter()
+            .into_par_iter()
             .map(|(s, i)| {
                 // SAFETY: When we parsed the candidates, we already guaranteed that the byte slices
                 // are valid, therefore we don't have to re-check here when we want to convert it back
@@ -156,7 +154,7 @@ impl Scanner {
         self.prepare();
 
         self.files
-            .iter()
+            .par_iter()
             .filter_map(|x| Path::from(x.clone()).canonicalize().ok())
             .map(|x| x.to_string())
             .collect()
@@ -201,14 +199,15 @@ impl Scanner {
 
         if !changed_content.is_empty() {
             let candidates = parse_all_blobs(read_all_files(changed_content));
-            self.candidates.extend(candidates);
+            self.candidates.par_extend(candidates);
         }
     }
 
     // Ensures that all files/globs are resolved and the scanner is ready to scan
     // content for candidates.
     fn prepare(&mut self) {
         if self.ready {
+            self.check_for_new_files();
             return;
         }
 
@@ -455,12 +454,10 @@ fn read_all_files(changed_content: Vec<ChangedContent>) -> Vec<Vec<u8>> {
 
 #[tracing::instrument(skip_all)]
 fn parse_all_blobs(blobs: Vec<Vec<u8>>) -> Vec<String> {
-    let input: Vec<_> = blobs.iter().map(|blob| &blob[..]).collect();
-    let input = &input[..];
-
-    let mut result: Vec<String> = input
+    let mut result: Vec<_> = blobs
         .par_iter()
-        .map(|input| Extractor::unique(input, Default::default()))
+        .flat_map(|blob| blob.par_split(|x| matches!(x, b'\n')))
+        .map(|blob| Extractor::unique(blob, Default::default()))
         .reduce(Default::default, |mut a, b| {
             a.extend(b);
             a
@@ -473,6 +470,7 @@ fn parse_all_blobs(blobs: Vec<Vec<u8>>) -> Vec<String> {
             unsafe { String::from_utf8_unchecked(s.to_vec()) }
         })
         .collect();
-    result.sort();
+
+    result.par_sort();
     result
 }
```
