
Conversation

@yash-atreya (Member) commented Sep 23, 2025

Motivation

towards #8898

Solution

  • Replaces CorpusManager with WorkerCorpus
  • WorkerCorpus is the corpus used by parallel worker threads.
  • Each WorkerCorpus has an id: u32; the master worker has id = 0.
  • The WorkerCorpus instances share their in_memory_corpus with each other via the file system, using a star pattern with the master worker (id = 0) at the center.
  • The corpus_dir is now organized as:
corpus_dir/
    worker0/    # master
        sync/
        corpus/
    worker1/
        sync/
        corpus/
    worker2/
        sync/
        corpus/
  • Each non-master worker exports its corpus to worker0/sync/ - See fn export in 089f30b
  • The master worker distributes its worker0/corpus entries (which include entries from all workers once synced) to each worker's sync/ directory - See fn distribute in d4200e4
  • Each worker then pulls the new corpus entries from its corpus_dir/workerId/sync dir into corpus_dir/workerId/corpus if they lead to new coverage, and updates its history_map - See fn calibrate in 488d09d
  • In fn calibrate we fetch the new corpus entries from the worker's sync/ dir and replay the tx sequences to check whether they lead to new coverage for this particular worker. If they do, we update history_map.
  • The pub fn sync introduced in e9d8d3c handles all of the above; a sketch of the flow appears after this list.
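For orientation, here is a minimal sketch of that flow. The type, field, and method names mirror the PR description; the bodies are placeholder stubs, not the actual implementation:

use std::path::PathBuf;

// Hypothetical skeleton of the star-pattern sync described above.
struct WorkerCorpus {
    id: u32,                       // 0 == master worker
    corpus_dir: PathBuf,           // root containing worker{id}/{sync,corpus}
    new_entry_indices: Vec<usize>, // in-memory entries added since last sync
}

impl WorkerCorpus {
    /// Non-master: copy entries at `new_entry_indices` into worker0/sync/.
    fn export(&mut self) -> std::io::Result<()> { todo!() }

    /// Master: copy worker0/corpus/* into every other worker's sync/.
    fn distribute(&self) -> std::io::Result<()> { todo!() }

    /// All workers: replay entries from worker{id}/sync/, keep those that
    /// produce new coverage, and update history_map.
    fn calibrate(&mut self) -> std::io::Result<()> { todo!() }

    /// One sync round over the star topology.
    pub fn sync(&mut self) -> std::io::Result<()> {
        if self.id == 0 { self.distribute()?; } else { self.export()?; }
        self.calibrate()
    }
}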

Note: This PR does not parallelize the fuzz runs; it only prepares for that. Opened for initial feedback on the approach.

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

@yash-atreya yash-atreya changed the title [wip] feat(evm): SharedCorpus for multiple worker threads [wip] feat(fuzz): SharedCorpus for multiple worker threads Sep 23, 2025
@yash-atreya yash-atreya changed the title [wip] feat(fuzz): SharedCorpus for multiple worker threads [wip] feat(fuzz): WorkerCorpus for multiple worker threads Sep 25, 2025
Comment on lines +388 to +390
// Track in-memory corpus changes to update MasterWorker on sync
let new_index = self.in_memory_corpus.len();
self.new_entry_indices.push(new_index);
Contributor:

I think this is fine, but it may result in some corpus entries not getting synced, e.g. due to a crash or ctrl+c and restart. If you persist the last-synced timestamp, the worker can recover from restarts by checking for entries written after that timestamp but before a sync could occur.
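One possible shape for that recovery, purely as an illustration (the last_sync_file path and helper names are hypothetical):

use std::{fs, path::Path, time::{SystemTime, UNIX_EPOCH}};

// Hypothetical helpers: persist the last successful sync time so a restarted
// worker can re-export entries written after it.
fn record_sync(last_sync_file: &Path) -> std::io::Result<()> {
    let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
    fs::write(last_sync_file, now.to_string())
}

fn needs_reexport(last_sync_file: &Path, entry: &Path) -> std::io::Result<bool> {
    // Missing or unparseable file means nothing was ever synced.
    let last = fs::read_to_string(last_sync_file)
        .ok()
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(0);
    let modified = fs::metadata(entry)?.modified()?;
    // Sketch-level unwrap: assumes mtimes are not before the Unix epoch.
    let secs = modified.duration_since(UNIX_EPOCH).unwrap().as_secs();
    // Entries modified after the last recorded sync were never exported.
    Ok(secs > last)
}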

metrics.update_seen(is_edge);
}
if id == 0 && config.corpus_dir.is_some() {
// Master worker loads the initial corpus if it exists
Contributor:

Suggested change
// Master worker loads the initial corpus if it exists
// Master worker loads the initial corpus, if it exists. Then, [export]s to workers.

Contributor:

Actually I guess this is distribute

foundry_common::fs::write_json_gzip_file(
corpus_dir.join(format!("{corpus_uuid}{JSON_EXTENSION}.gz")).as_path(),
worker_corpus
.join(format!("{corpus_uuid}-{timestamp}{JSON_EXTENSION}.gz"))
@0xalpharush (Contributor) commented Sep 25, 2025:

Maybe the uuid is redundant if we prefix the worker id to the timestamp to avoid collisions.

yash-atreya (Member, Author):

Timestamps have one-second resolution; the corpus uuid avoids collisions within the same worker.

@yash-atreya yash-atreya changed the title [wip] feat(fuzz): WorkerCorpus for multiple worker threads feat(fuzz): WorkerCorpus for multiple worker threads Sep 29, 2025
@yash-atreya yash-atreya self-assigned this Sep 29, 2025
@yash-atreya yash-atreya moved this to Ready For Review in Foundry Sep 29, 2025
@yash-atreya yash-atreya added this to the v1.5.0 milestone Sep 29, 2025
@DaniPopes (Member) commented:

please merge master, this is still using old CI runners

@yash-atreya yash-atreya marked this pull request as ready for review October 3, 2025 14:35
let file_path = corpus_dir.join(&file_name);
let sync_path = master_sync_dir.join(&file_name);

let Ok(_) = foundry_common::fs::copy(file_path, sync_path) else {
@0xalpharush (Contributor) commented Oct 3, 2025:

I wonder if we could symlink and only copy if it is going to be deleted or moved later.

Contributor:

Or, once it's decided to import from the sync dir, we create the hardlink.
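A rough illustration of that idea, assuming entry files are immutable once written (the function name and its role are hypothetical):

use std::{fs, path::Path};

// Hypothetical: link the synced entry into the worker's corpus instead of
// copying; fall back to a real copy where hardlinks are unsupported
// (e.g. a cross-device link).
fn import_entry(sync_path: &Path, corpus_path: &Path) -> std::io::Result<()> {
    match fs::hard_link(sync_path, corpus_path) {
        Ok(()) => Ok(()),
        Err(_) => fs::copy(sync_path, corpus_path).map(|_| ()),
    }
}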

"persisted {} inputs for new coverage in {corpus_uuid} corpus",
&corpus.tx_seq.len()
"persisted {} inputs for new coverage in worker {} for {corpus_uuid} corpus",
self.id, &corpus.tx_seq.len()
Member:

We can avoid logging the worker id by using a span on the function; also, the argument order is incorrect in this specific log.
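For instance, tracing's #[instrument] attribute can attach the worker id to every event inside the function, so individual logs can drop it (a generic sketch, not the PR's code):

use tracing::{debug, instrument};

struct WorkerCorpus { id: u32 }

impl WorkerCorpus {
    // The span records `worker` once; every event in the function inherits it.
    #[instrument(skip(self), fields(worker = self.id))]
    fn evict(&self, uuid: &str) {
        debug!(target: "corpus", "evict corpus {uuid}");
    }
}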


let uuid = corpus.uuid;
debug!(target: "corpus", "evict corpus {uuid}");
debug!(target: "corpus", "evict corpus {uuid} in worker {}", self.id);
Member:

same

Comment on lines +235 to +236
'corpus_replay: for entry in std::fs::read_dir(corpus_dir)? {
let path = entry?.path();
Contributor:

Do we continually write to the corpus directory? This is very expensive: not only do we iterate a directory and read the files, but we also (if gzip is enabled) decompress the same file over and over. It feels like the corpus should default to in-memory, and we should only write at the end.

@0xalpharush (Contributor) commented Oct 3, 2025:

This only happens at startup. The entries are held in memory as long as they don't exceed a configurable limit, and are then flushed to disk (and compressed, if enabled).

Your point does still stand elsewhere. IIUC, workers share compressed corpus entries, so they potentially decompress the same files repeatedly. Moving compression to the very end would resolve this; a sketch of that idea follows.
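A sketch of deferring compression, assuming flate2 for gzip (the flush hook and the final_flush flag are hypothetical, not part of the PR):

use std::io::Write;
use std::path::Path;

// Hypothetical flush hook: entries are exchanged as plain JSON during the
// run, and gzip compression happens only once, at shutdown.
fn flush_entry(path: &Path, json: &str, final_flush: bool) -> std::io::Result<()> {
    if final_flush {
        // Compress once at the end of the run.
        let file = std::fs::File::create(path.with_extension("json.gz"))?;
        let mut enc = flate2::write::GzEncoder::new(file, flate2::Compression::default());
        enc.write_all(json.as_bytes())?;
        enc.finish().map(|_| ())
    } else {
        // During fuzzing, write uncompressed so other workers can read cheaply.
        std::fs::write(path.with_extension("json"), json)
    }
}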
