feat(fuzz): WorkerCorpus for multiple worker threads #11769
base: master
Conversation
(PR title changed from "SharedCorpus for multiple worker threads" to "WorkerCorpus for multiple worker threads".)
// Track in-memory corpus changes to update MasterWorker on sync
let new_index = self.in_memory_corpus.len();
self.new_entry_indices.push(new_index);
I think this is fine, but it may result in some corpus entries not getting synced, e.g. due to a crash or Ctrl+C and restart. If you persist the last synced timestamp, it can recover from restarts by checking whether there are newer entries written before a sync could occur.
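A rough sketch of that idea, for illustration only (the file name and helper functions below are hypothetical, not part of this PR): persist the time of the last successful sync next to the worker's directory, and on startup treat any entry modified after that time as not yet synced.

```rust
use std::{
    fs, io,
    path::{Path, PathBuf},
    time::{Duration, SystemTime, UNIX_EPOCH},
};

// Hypothetical helper: record the time of the last successful sync.
fn record_last_sync(worker_dir: &Path) -> io::Result<()> {
    let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
    fs::write(worker_dir.join("last_sync"), now.to_string())
}

// Hypothetical helper: after a restart, collect entries written after the last sync.
fn entries_since_last_sync(worker_dir: &Path, corpus_dir: &Path) -> io::Result<Vec<PathBuf>> {
    // No record yet (first run, or crash before the first sync): treat everything as unsynced.
    let last_sync = fs::read_to_string(worker_dir.join("last_sync"))
        .ok()
        .and_then(|s| s.trim().parse::<u64>().ok())
        .map(|secs| UNIX_EPOCH + Duration::from_secs(secs))
        .unwrap_or(UNIX_EPOCH);

    let mut pending = Vec::new();
    for entry in fs::read_dir(corpus_dir)? {
        let entry = entry?;
        if entry.metadata()?.modified()? > last_sync {
            pending.push(entry.path());
        }
    }
    Ok(pending)
}
```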
metrics.update_seen(is_edge);
}
if id == 0 && config.corpus_dir.is_some() {
// Master worker loads the initial corpus if it exists
Suggested change:
// Master worker loads the initial corpus, if it exists. Then, [export]s to workers.
Actually I guess this is distribute
foundry_common::fs::write_json_gzip_file(
corpus_dir.join(format!("{corpus_uuid}{JSON_EXTENSION}.gz")).as_path(),
worker_corpus
    .join(format!("{corpus_uuid}-{timestamp}{JSON_EXTENSION}.gz"))
Maybe the uuid is redundant if we prefix the worker id to the timestamp to avoid collisions.
Timestamps are in seconds; the corpus uuid avoids collisions within the same worker.
Please merge master, this is still using old CI runners.
let file_path = corpus_dir.join(&file_name);
let sync_path = master_sync_dir.join(&file_name);

let Ok(_) = foundry_common::fs::copy(file_path, sync_path) else {
I wonder if we could symlink and only copy if it is going to be deleted or moved later.
Or, once it's decided to import from the sync dir, we create the hardlink.
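For illustration, a minimal sketch of the link-then-fall-back idea (not the PR's implementation; the helper name is made up):

```rust
use std::{fs, io, path::Path};

// Try to hard-link the exported entry into the sync dir; fall back to a copy,
// e.g. when the directories live on different filesystems or the link fails.
fn share_entry(file_path: &Path, sync_path: &Path) -> io::Result<()> {
    match fs::hard_link(file_path, sync_path) {
        Ok(()) => Ok(()),
        Err(_) => fs::copy(file_path, sync_path).map(|_| ()),
    }
}
```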
"persisted {} inputs for new coverage in {corpus_uuid} corpus", | ||
&corpus.tx_seq.len() | ||
"persisted {} inputs for new coverage in worker {} for {corpus_uuid} corpus", | ||
self.id, &corpus.tx_seq.len() |
We can avoid logging the worker id by using a span on the function; also, the argument order is incorrect in this specific log.
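Roughly what that could look like with a tracing span (a sketch, not the PR's code):

```rust
use tracing::{debug, instrument};

struct WorkerCorpus {
    id: u32,
}

impl WorkerCorpus {
    // With the worker id recorded as a span field, every event emitted inside the
    // function carries it, so the individual log messages stay unchanged.
    #[instrument(name = "corpus", skip(self), fields(worker_id = self.id))]
    fn evict(&self, uuid: &str) {
        debug!(target: "corpus", "evict corpus {uuid}");
    }
}
```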
let uuid = corpus.uuid;
debug!(target: "corpus", "evict corpus {uuid}");
debug!(target: "corpus", "evict corpus {uuid} in worker {}", self.id);
same
'corpus_replay: for entry in std::fs::read_dir(corpus_dir)? {
let path = entry?.path();
Do we continually write to the corpus directory? This is very expensive, as we not only iterate a directory and read the files, but also (if gzip is enabled) do decompression over and over, potentially of the same file. It feels like the corpus should be in-memory by default, and we only write at the end.
This is just happening at startup. The entries are held in memory so long as they don't exceed a configurable limit, and are then flushed to disk (and compressed, if enabled).
Your point does still stand elsewhere: IIUC, workers share compressed corpus entries, so they potentially are repeatedly decompressing the same files. Moving compression to the very end would resolve this.
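A sketch of what "compression at the very end" could look like (the helper name is hypothetical): entries stay as plain JSON while the campaign runs, and each file is gzipped once during the final flush instead of on every sync round.

```rust
use std::{fs, io::{self, Write}, path::Path};
use flate2::{write::GzEncoder, Compression};

// Compress every plain-JSON corpus entry in place, once, at shutdown.
fn compress_corpus_on_shutdown(corpus_dir: &Path) -> io::Result<()> {
    for entry in fs::read_dir(corpus_dir)? {
        let path = entry?.path();
        if path.extension().is_some_and(|ext| ext == "json") {
            let bytes = fs::read(&path)?;
            let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
            encoder.write_all(&bytes)?;
            fs::write(path.with_extension("json.gz"), encoder.finish()?)?;
            fs::remove_file(&path)?;
        }
    }
    Ok(())
}
```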
Motivation

towards #8898

Solution

Replaces `CorpusManager` with `WorkerCorpus`.

- `WorkerCorpus` is the corpus to be used by parallel worker threads. Each `WorkerCorpus` has an `id: u32`; the master worker has `id = 0`.
- `WorkerCorpus` instances share their `in_memory_corpus` amongst each other via the file system, using a star pattern where the master worker (`id = 0`) is in the center.
- `corpus_dir` is now organized per worker, with a `corpus` and a `sync/` subdirectory for each worker, e.g. `worker0/corpus` and `worker0/sync/` (a path-layout sketch follows this description).
- Workers export their new corpus entries to the master's `worker0/sync/` directory - see `fn export` in 089f30b.
- The master distributes its `worker0/corpus` entries (which include entries from all workers once synced) to each worker's `sync/` directory - see `fn distribute` in d4200e4.
- Each worker imports entries from its `corpus_dir/workerId/sync` dir into `corpus_dir/workerId/corpus` if they lead to new coverage, and updates its `history_map` - see `fn calibrate` in 488d09d. In `fn calibrate` we fetch the new corpus entries from the worker's `sync/` dir and replay the tx sequences to check whether they lead to new coverage for this particular worker; if they do, we update `history_map`.
- `pub fn sync`, introduced in e9d8d3c, handles all of the above.

Note: This PR does not address parallelizing the fuzz runs, only prepares for it. Opened for initial feedback on the approach.
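For concreteness, a small self-contained sketch of the per-worker path layout described above (helper and directory names are assumptions based on this description, not the PR's API):

```rust
use std::path::{Path, PathBuf};

fn worker_root(corpus_dir: &Path, id: u32) -> PathBuf {
    corpus_dir.join(format!("worker{id}"))
}

// Entries that produced new coverage for this worker.
fn worker_corpus_dir(corpus_dir: &Path, id: u32) -> PathBuf {
    worker_root(corpus_dir, id).join("corpus")
}

// Entries waiting to be calibrated by this worker (or, for worker 0, to be merged).
fn worker_sync_dir(corpus_dir: &Path, id: u32) -> PathBuf {
    worker_root(corpus_dir, id).join("sync")
}

fn main() {
    let corpus_dir = Path::new("corpus");
    // A worker exports its new entries to the master's sync dir ...
    println!("export to:   {}", worker_sync_dir(corpus_dir, 0).display());
    // ... the master later distributes the merged corpus into each worker's sync dir,
    // and the worker imports what gives new coverage into its own corpus dir.
    println!("import from: {}", worker_sync_dir(corpus_dir, 3).display());
    println!("keep in:     {}", worker_corpus_dir(corpus_dir, 3).display());
}
```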
PR Checklist