The scaffold filtering logic in mappingFilter.hpp has been updated to better remove "random off-diagonal shorter weaker scaffold chains" by applying the plane sweep algorithm to the merged chains before selecting anchors.

Previous flow:
- Merge mappings into chains
- Filter chains by minimum length
- Collect ALL mappings within chain bounds as anchors
- Apply plane sweep filter to anchors
- Use filtered anchors for distance-based filtering

New flow:
- Merge mappings into chains
- Filter chains by minimum length
- NEW: Apply plane sweep filter to merged chains (Step 4)
- Collect mappings within filtered chain bounds as anchors (Step 5)
- Use anchors for distance-based filtering
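
The length filter and the anchor collection in the new flow can be sketched as below. This is only an illustration, not the code in mappingFilter.hpp: `Span`, `filterByMinLength`, and `collectAnchors` are hypothetical names, and the sketch assumes a mapping counts as an anchor when its query interval lies entirely inside a surviving chain (the real check may also consider reference coordinates).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified record: a half-open query interval [qStart, qEnd).
struct Span {
    long qStart;
    long qEnd;
};

// Drop chains whose query span is shorter than minLen.
std::vector<Span> filterByMinLength(std::vector<Span> chains, long minLen) {
    chains.erase(std::remove_if(chains.begin(), chains.end(),
                                [minLen](const Span& c) {
                                    return c.qEnd - c.qStart < minLen;
                                }),
                 chains.end());
    return chains;
}

// Keep every mapping that lies fully inside at least one surviving chain.
std::vector<Span> collectAnchors(const std::vector<Span>& mappings,
                                 const std::vector<Span>& chains) {
    std::vector<Span> anchors;
    for (const Span& m : mappings) {
        const bool inside = std::any_of(
            chains.begin(), chains.end(), [&m](const Span& c) {
                return c.qStart <= m.qStart && m.qEnd <= c.qEnd;
            });
        if (inside) {
            anchors.push_back(m);
        }
    }
    return anchors;
}
```

Because anchors are collected only from chains that are still present at this point, any chain removed earlier contributes no anchors at all.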
The critical change is in Step 4, where we now filter the scaffold chains themselves:

    // Step 4: Apply plane sweep filter to the merged chains to remove off-diagonal/weaker scaffold chains
    if (!mergedChains.empty() && (param.filterMode == filter::MAP || param.filterMode == filter::ONETOONE)) {
        MappingResultsVector_t filteredChains;
        filterByGroup(mergedChains, filteredChains, param.numMappingsForSegment - 1,
                      false, idManager, param, progress);
        mergedChains = std::move(filteredChains);
    }

This ensures that:
- Weaker scaffold chains that would create off-diagonal noise are removed
- Only the best-scoring chains survive to contribute anchors
- The final anchor set is cleaner and more focused on true syntenic regions
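
To illustrate why a plane sweep removes weaker overlapping chains, here is a minimal sketch of the idea. It is not the `filterByGroup` implementation: `Chain`, `sweepFilter`, and the `keep` parameter are hypothetical, the sweep runs over query coordinates only, and it uses a simple quadratic scan rather than an efficient event structure. A chain survives if, at some query position, it ranks among the `keep` best-scoring chains covering that position.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified chain record (not the project's mapping type).
struct Chain {
    long qStart;   // query start, inclusive
    long qEnd;     // query end, exclusive
    double score;  // higher is better
};

// Keep a chain only if it is among the `keep` best-scoring chains
// at one or more positions along the query.
std::vector<Chain> sweepFilter(const std::vector<Chain>& chains, std::size_t keep) {
    assert(keep >= 1);

    // The set of active chains only changes at interval endpoints,
    // so it is enough to inspect those positions.
    std::vector<long> events;
    for (const Chain& c : chains) {
        events.push_back(c.qStart);
        events.push_back(c.qEnd);
    }
    std::sort(events.begin(), events.end());
    events.erase(std::unique(events.begin(), events.end()), events.end());

    std::vector<bool> survives(chains.size(), false);
    for (long pos : events) {
        // Collect the chains covering this position.
        std::vector<std::size_t> active;
        for (std::size_t i = 0; i < chains.size(); ++i) {
            if (chains[i].qStart <= pos && pos < chains[i].qEnd) {
                active.push_back(i);
            }
        }
        // Rank by score, best first; ties keep input order.
        std::stable_sort(active.begin(), active.end(),
                         [&chains](std::size_t a, std::size_t b) {
                             return chains[a].score > chains[b].score;
                         });
        for (std::size_t r = 0; r < active.size() && r < keep; ++r) {
            survives[active[r]] = true;
        }
    }

    std::vector<Chain> out;
    for (std::size_t i = 0; i < chains.size(); ++i) {
        if (survives[i]) {
            out.push_back(chains[i]);
        }
    }
    return out;
}
```

With `keep` set to 1, a short low-scoring chain that lies entirely under a longer high-scoring chain is never the best at any position and is dropped, while a chain covering an otherwise empty region is kept regardless of its score.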
The implementation has been tested with cerevisiae chrV data:
- Default parameters (S=5): 990 mappings retained, 666 scaffold chains
- Higher threshold (S=10): 876 mappings retained, 568 scaffold chains
The scaffold output file (--scaffold-out) now contains the filtered merged chains, showing which scaffolds survived the plane sweep filter.

Benefits:
- Better noise reduction: Off-diagonal chains are removed at the chain level
- Cleaner anchor sets: Only mappings from high-quality chains become anchors
- More focused scaffolds: The plane sweep ensures competing/overlapping chains are resolved
src/map/include/mappingFilter.hpp: Updated filterByScaffolds() function