Skip to content

Commit a2a0144

Browse files
authored
Turbopack: use the same serialization method for lookup as for storing (#84754)
### What? This makes sure to use the same method for serializing CachedTaskType for storing and lookup. Before that we instantiated the serialization twice and that caused some weird behavior on linux. Before this fix I was seeing that serialization of CachedTaskType was not byte to byte identical between storing and lookup. This caused tasks to not be found, which will lead to very weird effects: stale tasks, cell not found, etc. I was expecting that pot serialization is deterministic, but turns out it isn't fully deterministic. It has some logic to deduplicate Symbols (like enum variant names). But it deduplicates these `&'static str`s based on the pointer address of the static string. Turns out that Rust doesn't guarantee that static strings are deduplicated in all cases. I assume that we are somehow running into that problem when having two variants of the serialization. They ended up serializing task types differently (one didn't have a symbol deduplicated in the serialized form. But that's more a wild guess. It did fix the bug, but I have a hard time understanding why this is happening. On long term, I think we should change pot to avoid using pointer addresses and do a real comparision instead.
1 parent 85217fe commit a2a0144

File tree

1 file changed

+23
-10
lines changed

1 file changed

+23
-10
lines changed

turbopack/crates/turbo-tasks-backend/src/kv_backing_storage.rs

Lines changed: 23 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -357,8 +357,12 @@ impl<T: KeyValueDatabase + Send + Sync + 'static> BackingStorageSealed
357357

358358
let mut task_type_bytes = Vec::new();
359359
for (task_type, task_id) in updates {
360+
serialize_task_type(
361+
&task_type,
362+
&mut task_type_bytes,
363+
Some(task_id),
364+
)?;
360365
let task_id: u32 = *task_id;
361-
serialize_task_type(&task_type, &mut task_type_bytes, task_id)?;
362366

363367
batch
364368
.put(
@@ -441,8 +445,8 @@ impl<T: KeyValueDatabase + Send + Sync + 'static> BackingStorageSealed
441445
.entered();
442446
let mut task_type_bytes = Vec::new();
443447
for (task_type, task_id) in task_cache_updates.into_iter().flatten() {
448+
serialize_task_type(&task_type, &mut task_type_bytes, Some(task_id))?;
444449
let task_id = *task_id;
445-
serialize_task_type(&task_type, &mut task_type_bytes, task_id)?;
446450

447451
batch
448452
.put(
@@ -499,8 +503,10 @@ impl<T: KeyValueDatabase + Send + Sync + 'static> BackingStorageSealed
499503
tx: &D::ReadTransaction<'_>,
500504
task_type: &CachedTaskType,
501505
) -> Result<Option<TaskId>> {
502-
let task_type = POT_CONFIG.serialize(task_type)?;
503-
let Some(bytes) = database.get(tx, KeySpace::ForwardTaskCache, &task_type)? else {
506+
let mut task_type_bytes = Vec::new();
507+
serialize_task_type(task_type, &mut task_type_bytes, None)?;
508+
let Some(bytes) = database.get(tx, KeySpace::ForwardTaskCache, &task_type_bytes)?
509+
else {
504510
return Ok(None);
505511
};
506512
let bytes = bytes.borrow().try_into()?;
@@ -645,23 +651,30 @@ where
645651
Ok(())
646652
}
647653

654+
#[inline(never)]
648655
fn serialize_task_type(
649-
task_type: &Arc<CachedTaskType>,
656+
task_type: &CachedTaskType,
650657
mut task_type_bytes: &mut Vec<u8>,
651-
task_id: u32,
658+
task_id: Option<TaskId>,
652659
) -> Result<()> {
653660
task_type_bytes.clear();
654661
POT_CONFIG
655-
.serialize_into(&**task_type, &mut task_type_bytes)
656-
.with_context(|| anyhow!("Unable to serialize task {task_id} cache key {task_type:?}"))?;
662+
.serialize_into(task_type, &mut task_type_bytes)
663+
.with_context(|| {
664+
if let Some(task_id) = task_id {
665+
anyhow!("Unable to serialize task {task_id} cache key {task_type:?}")
666+
} else {
667+
anyhow!("Unable to serialize task cache key {task_type:?}")
668+
}
669+
})?;
657670
#[cfg(feature = "verify_serialization")]
658671
{
659672
let deserialize: Result<CachedTaskType, _> = serde_path_to_error::deserialize(
660673
&mut pot_de_symbol_list().deserializer_for_slice(&*task_type_bytes)?,
661674
);
662675
if let Err(err) = deserialize {
663-
println!("Task type would not be deserializable {task_id}: {err:?}\n{task_type:#?}");
664-
panic!("Task type would not be deserializable {task_id}: {err:?}");
676+
println!("Task type would not be deserializable {task_id:?}: {err:?}\n{task_type:#?}");
677+
panic!("Task type would not be deserializable {task_id:?}: {err:?}");
665678
}
666679
}
667680
Ok(())

0 commit comments

Comments
 (0)