When base results are rerun, they are keyed by commit hash, but results may already exist under the branch name "main". Because the two use different filenames, the old results are not overwritten. When results are matched, I think the lookup takes the first match found, which may be the old ones.
We need to match existing results by commit hash only, and delete any matches before writing, so that re-running actually replaces the results.
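
A rough sketch of the proposed behaviour, not the project's actual code. It assumes results are JSON files in a single directory and that each file records its commit in a `commit_hash` field; the function names `find_results_by_commit` and `replace_results` are hypothetical, as is the file layout.

```python
import json
import tempfile
from pathlib import Path


def find_results_by_commit(results_dir: Path, commit_hash: str) -> list[Path]:
    """Return every result file whose recorded commit hash equals commit_hash.

    Assumption: each result is a JSON object with a "commit_hash" field.
    The filename (branch name or hash) is deliberately ignored.
    """
    matches = []
    for path in sorted(results_dir.glob("*.json")):
        try:
            recorded = json.loads(path.read_text()).get("commit_hash")
        except (json.JSONDecodeError, AttributeError):
            continue  # skip unreadable files and files that are not JSON objects
        if recorded == commit_hash:
            matches.append(path)
    return matches


def replace_results(results_dir: Path, commit_hash: str, payload: dict) -> Path:
    """Delete all existing results for commit_hash, then write the new ones."""
    for stale in find_results_by_commit(results_dir, commit_hash):
        stale.unlink()
    target = results_dir / f"{commit_hash}.json"
    target.write_text(json.dumps({**payload, "commit_hash": commit_hash}))
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        results_dir = Path(tmp)
        # Old results stored under the branch name, for the same commit.
        (results_dir / "main.json").write_text(
            json.dumps({"commit_hash": "abc123", "score": 1})
        )
        new_path = replace_results(results_dir, "abc123", {"score": 2})
        print(sorted(p.name for p in results_dir.iterdir()))
        print(json.loads(new_path.read_text())["score"])
```

Because the lookup reads the hash from the file contents rather than the filename, results saved as `main.json` and results saved under the hash are treated as the same entry, so a "first found" match can no longer return a stale file.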