Commit e75b3ca

feat(rna): add ENA parallel processor for 10x faster FASTQ downloads
- New process_apis_mellifera_ena.py downloads pre-extracted FASTQs directly from ENA
- Uses ENA Portal API for reliable URL discovery
- 20 workers safe since no extraction bottleneck
- Observed: 19-42 sec/sample vs 50+ min via NCBI
- Reduce NCBI fallback pipeline to 10 workers
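
A minimal sketch of how URL discovery through the ENA Portal API with a 20-thread pool might look. This is not the code of process_apis_mellifera_ena.py, which is not shown in this diff. The `filereport` endpoint and the `fastq_ftp` field are part of the ENA Portal API; the function names, the injected `fetch_text` callable, and the choice to prefix returned paths with `https://` are assumptions made for illustration.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def build_filereport_url(run_accession: str) -> str:
    """Build a filereport query asking ENA for the FASTQ locations of one run."""
    query = urlencode({
        "accession": run_accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp,fastq_md5",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_fastq_urls(tsv_text: str, scheme: str = "https") -> dict:
    """Map each run accession in a filereport TSV to its list of FASTQ URLs.

    ENA returns scheme-less paths separated by ';' (one per file of the run).
    The default scheme here is an assumption, not something the commit states.
    """
    urls = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        paths = [p for p in (row.get("fastq_ftp") or "").split(";") if p]
        urls[row["run_accession"]] = [f"{scheme}://{p}" for p in paths]
    return urls


def discover_parallel(accessions, fetch_text, max_workers: int = 20) -> dict:
    """Resolve FASTQ URLs for many runs concurrently.

    fetch_text is a hypothetical callable (URL -> response body as str), for
    example a thin wrapper around urllib.request.urlopen. Injecting it keeps
    the sketch testable without network access.
    """
    def resolve(accession):
        return parse_fastq_urls(fetch_text(build_filereport_url(accession)))

    merged = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for result in pool.map(resolve, accessions):
            merged.update(result)
    return merged
```

The worker count of 20 mirrors the commit message: the work per sample is network-bound, so a thread pool is a reasonable fit, whereas a pipeline that must extract FASTQs locally is limited by CPU and disk.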
1 parent 668c25d commit e75b3ca

8 files changed, +767 -39 lines changed

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+{
+    "n_targets": 28355,
+    "n_bootstraps": 0,
+    "n_processed": 772651,
+    "n_pseudoaligned": 636494,
+    "n_unique": 349326,
+    "p_pseudoaligned": 82.4,
+    "p_unique": 45.2,
+    "kallisto_version": "0.51.1",
+    "index_version": 13,
+    "k-mer length": 31,
+    "start_time": "Thu Feb 5 06:10:14 2026",
+    "call": "kallisto quant -i output/amalgkit/apis_mellifera_all/work/index/Apis_mellifera_transcripts.idx -o output/amalgkit/apis_mellifera_all/work/quant/DRR129208 -t 2 --single -l 200 -s 30 /Volumes/blue/data/apis_mellifera/DRR129208/DRR129208.fastq.gz"
+}
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+{
+    "n_targets": 28355,
+    "n_bootstraps": 0,
+    "n_processed": 2068189,
+    "n_pseudoaligned": 375947,
+    "n_unique": 184261,
+    "p_pseudoaligned": 18.2,
+    "p_unique": 8.9,
+    "kallisto_version": "0.51.1",
+    "index_version": 13,
+    "k-mer length": 31,
+    "start_time": "Thu Feb 5 06:04:09 2026",
+    "call": "kallisto quant -i output/amalgkit/apis_mellifera_all/work/index/Apis_mellifera_transcripts.idx -o output/amalgkit/apis_mellifera_all/work/quant/SRR12584229 -t 2 --single -l 200 -s 30 /Volumes/blue/data/apis_mellifera/SRR12584229/SRR12584229.fastq.gz"
+}
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+{
+    "n_targets": 28355,
+    "n_bootstraps": 0,
+    "n_processed": 3746640,
+    "n_pseudoaligned": 0,
+    "n_unique": 0,
+    "p_pseudoaligned": 0.0,
+    "p_unique": 0.0,
+    "kallisto_version": "0.51.1",
+    "index_version": 13,
+    "k-mer length": 31,
+    "start_time": "Thu Feb 5 06:09:47 2026",
+    "call": "kallisto quant -i output/amalgkit/apis_mellifera_all/work/index/Apis_mellifera_transcripts.idx -o output/amalgkit/apis_mellifera_all/work/quant/SRR21601887 -t 2 --single -l 200 -s 30 /Volumes/blue/data/apis_mellifera/SRR21601887/SRR21601887.fastq.gz"
+}
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+{
+    "n_targets": 28355,
+    "n_bootstraps": 0,
+    "n_processed": 108758,
+    "n_pseudoaligned": 5,
+    "n_unique": 1,
+    "p_pseudoaligned": 0.0,
+    "p_unique": 0.0,
+    "kallisto_version": "0.51.1",
+    "index_version": 13,
+    "k-mer length": 31,
+    "start_time": "Thu Feb 5 06:09:56 2026",
+    "call": "kallisto quant -i output/amalgkit/apis_mellifera_all/work/index/Apis_mellifera_transcripts.idx -o output/amalgkit/apis_mellifera_all/work/quant/SRR7412464 -t 2 --single -l 200 -s 30 /Volumes/blue/data/apis_mellifera/SRR7412464/SRR7412464.fastq.gz"
+}
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+{
+    "n_targets": 28355,
+    "n_bootstraps": 0,
+    "n_processed": 77570,
+    "n_pseudoaligned": 7,
+    "n_unique": 2,
+    "p_pseudoaligned": 0.0,
+    "p_unique": 0.0,
+    "kallisto_version": "0.51.1",
+    "index_version": 13,
+    "k-mer length": 31,
+    "start_time": "Thu Feb 5 06:09:51 2026",
+    "call": "kallisto quant -i output/amalgkit/apis_mellifera_all/work/index/Apis_mellifera_transcripts.idx -o output/amalgkit/apis_mellifera_all/work/quant/SRR7412489 -t 2 --single -l 200 -s 30 /Volumes/blue/data/apis_mellifera/SRR7412489/SRR7412489.fastq.gz"
+}
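
The five kallisto run summaries in this diff differ sharply in pseudoalignment rate: 82.4% for DRR129208, 18.2% for SRR12584229, and 0.0% for the other three. A minimal sketch of how such summaries could be screened after quantification is shown below. The helper names and the 50% cut-off are hypothetical and are not part of this commit; only the field names (`n_processed`, `n_pseudoaligned`) and the counts are taken from the files shown.

```python
import json


def pseudoalignment_rate(run_info_text: str) -> float:
    """Recompute the pseudoalignment percentage from the raw read counts."""
    info = json.loads(run_info_text)
    processed = info["n_processed"]
    if processed == 0:
        return 0.0
    return round(100.0 * info["n_pseudoaligned"] / processed, 1)


def flag_low_mapping(rates: dict, min_pct: float) -> list:
    """Return sample IDs (sorted) whose rate falls below min_pct.

    min_pct is a hypothetical threshold chosen by the caller.
    """
    return sorted(sample for sample, rate in rates.items() if rate < min_pct)


if __name__ == "__main__":
    # Counts copied from the run summaries in this diff.
    counts = {
        "DRR129208": (772651, 636494),
        "SRR12584229": (2068189, 375947),
        "SRR21601887": (3746640, 0),
        "SRR7412464": (108758, 5),
        "SRR7412489": (77570, 7),
    }
    rates = {
        sample: pseudoalignment_rate(
            json.dumps({"n_processed": n, "n_pseudoaligned": m})
        )
        for sample, (n, m) in counts.items()
    }
    # With a 50% cut-off, every sample except DRR129208 is flagged.
    print(flag_low_mapping(rates, min_pct=50.0))
```

A rate near zero on millions of reads, as for SRR21601887, usually points to a mismatch between the reads and the index (wrong organism, non-RNA library, or wrong layout setting) rather than low expression, so flagged samples are worth inspecting before they enter downstream tables.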
