Commit 6a69711

fix: regenerate .fai index before reading and guard empty edx samples
Regenerate the FASTA index in compute_seq_similarity.py so that a stale .fai cannot reference sequences that were collapsed away. Also guard the edx_concordance output on EDX samples actually existing, not just on the edx.enabled config flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent a32caf9 commit 6a69711

2 files changed: 4 additions and 1 deletion

workflow/rules/common.smk

Lines changed: 1 addition & 1 deletion
@@ -315,7 +315,7 @@ def pipeline_outputs():
     outs.append(os.path.join(outdir, "summary", "qc", "reference_similarity.tsv"))
 
     # EDX (3' adapter barcode) concordance table
-    if config.get("edx", {}).get("enabled", False):
+    if config.get("edx", {}).get("enabled", False) and get_edx_samples():
         outs.append(os.path.join(outdir, "summary", "edx", "edx_concordance.tsv.gz"))
 
     return outs
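
The effect of the added condition can be sketched in isolation. This is a minimal illustration under stated assumptions, not the pipeline's code: here `get_edx_samples` takes a hypothetical sample table as an argument (the real helper takes none), and the per-sample `edx` key, the `edx_outputs` wrapper, and the `results` output directory are invented for the example.

```python
import os


def get_edx_samples(samples):
    """Hypothetical stand-in: names of samples that carry an EDX barcode."""
    return [name for name, meta in samples.items() if meta.get("edx")]


def edx_outputs(config, samples, outdir="results"):
    """Return the EDX targets that should be requested for this run."""
    outs = []
    # Both conditions must hold: the feature flag is on AND at least one
    # sample actually uses EDX. An empty list is falsy, so a run with the
    # flag enabled but no EDX samples requests no concordance table.
    if config.get("edx", {}).get("enabled", False) and get_edx_samples(samples):
        outs.append(os.path.join(outdir, "summary", "edx", "edx_concordance.tsv.gz"))
    return outs


config = {"edx": {"enabled": True}}
no_edx = {"s1": {}, "s2": {}}                   # flag on, nothing to process
with_edx = {"s1": {"edx": "BC01"}, "s2": {}}    # flag on, one EDX sample

print(edx_outputs(config, no_edx))
print(edx_outputs(config, with_edx))
```

With the flag alone as the guard, the first call would still request `edx_concordance.tsv.gz`, a target that no EDX sample could feed.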

workflow/scripts/compute_seq_similarity.py

Lines changed: 3 additions & 0 deletions
@@ -32,6 +32,9 @@ def compute_similarity_matrix(fasta_path):
         - similarity_matrix: numpy array of percent identity values
         - sequence_names: list of sequence IDs
     """
+    # Regenerate .fai index to ensure it matches the current FASTA content
+    pysam.faidx(fasta_path)
+
     # Read all sequences from FASTA using pysam
     faidx = pysam.FastaFile(fasta_path)
     names = list(faidx.references)
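
The failure mode described in the commit message, an index that still lists sequences no longer present in the FASTA, can be reproduced without pysam. The sketch below is a standard-library diagnostic, not part of the commit: it relies only on the `.fai` layout (tab-separated, sequence name in the first column), and the helper names `fasta_names`, `fai_names`, and `stale_entries` are hypothetical.

```python
import os
import tempfile


def fasta_names(fasta_path):
    """Sequence IDs from FASTA header lines (text up to the first whitespace)."""
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.append(fields[0])
    return names


def fai_names(fai_path):
    """Sequence IDs from the first column of a .fai index."""
    with open(fai_path) as handle:
        return [line.split("\t")[0] for line in handle if line.strip()]


def stale_entries(fasta_path):
    """Names listed in <fasta>.fai that are missing from the FASTA itself."""
    fai_path = fasta_path + ".fai"
    if not os.path.exists(fai_path):
        return []
    present = set(fasta_names(fasta_path))
    return [name for name in fai_names(fai_path) if name not in present]


with tempfile.TemporaryDirectory() as tmp:
    fasta = os.path.join(tmp, "refs.fa")
    # The FASTA now holds one sequence; seqB was collapsed away...
    with open(fasta, "w") as handle:
        handle.write(">seqA\nACGT\n")
    # ...but the index written earlier still lists both.
    with open(fasta + ".fai", "w") as handle:
        handle.write("seqA\t4\t6\t4\t5\nseqB\t4\t17\t4\t5\n")
    stale = stale_entries(fasta)

print(stale)
```

Rebuilding the index unconditionally, as the commit does, avoids needing a check like this at all; the diagnostic is mainly useful for confirming that a stale index was the cause.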
