Commit 0bca613
Use entry() to avoid redundant DashMap insertions to reduce CPU by 12%
Summary:
Optimize graph construction by using `entry().or_insert_with()` instead of
unconditional `insert()` in the `impl_string_storage!` macro. This avoids
redundant string allocations and write locks when the same key is inserted
multiple times (which happens frequently since targets are referenced from
multiple dependency edges).
Benchmarks show ~12% reduction in CPU time:
- Before: 1467s average user CPU time
- After: 1295s average user CPU time
- Improvement: 172s (11.8%)
**CPU TIME != WALL CLOCK TIME.**
**WALL CLOCK HAS A VERY HIGH VARIANCE, SO IMPROVEMENT WAS DIFFICULT TO MEASURE**
The change eliminates ~18% of CPU overhead that was spent specifically
in `DashMap::_insert` for TargetId and LabelId string maps.
### Before
`self.target_id_to_label.insert(id, s.to_string());`
Every single call:
1. s.to_string() → allocates a new String on the heap (malloc)
2. Acquires a write lock on the DashMap shard
3. Inserts the key-value pair (overwrites if exists)
4. Drops the old String if it existed (free)
5. Releases the write lock
With millions of edges and 30 threads, you get:
- Millions of unnecessary String allocations
- Millions of unnecessary write locks (which block other threads)
- Millions of unnecessary frees
### After
`self.target_id_to_label.entry(id).or_insert_with(|| s.to_string());`
First call for a key:
1. Acquires a read lock, checks if key exists → No
2. Upgrades to write lock
3. s.to_string() → allocates String
4. Inserts
5. Releases lock
Subsequent calls for same key:
1. Acquires a read lock, checks if key exists → Yes
2. Returns immediately (no allocation, no write lock)
3. Releases read lock
Specifically looking at CPU time changes, I ran 5 tests before the change
and after the change, recorded each one, and then calculated the average:
| Metric | Before | After | Change |
| --- | --- | --- | --- |
| Run 1 | 1644.61 s | 1287.90 s | −356.71 s |
| Run 2 | 1422.43 s | 1285.39 s | −137.04 s |
| Run 3 | 1422.16 s | 1291.19 s | −130.97 s |
| Run 4 | 1419.80 s | 1315.53 s | −104.27 s |
| Run 5 | 1428.01 s | 1292.94 s | −135.07 s |
| Average | 1467.40 s | 1294.59 s | −172.81 s (11.8%) |
Reviewed By: James534
Differential Revision: D92280926
fbshipit-source-id: 3db71cf48a28faec5da3bf78a66d2e7eca6f239f1 parent 965bcab commit 0bca613
1 file changed
+7
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
33 | 39 | | |
34 | 40 | | |
35 | | - | |
| 41 | + | |
36 | 42 | | |
37 | 43 | | |
38 | 44 | | |
| |||
0 commit comments