Commit fc881e2

Merge pull request #3 from sourmash-bio/update_tables
MRG: update with tables and other trickery
2 parents ddd6eb4 + 5cd701b commit fc881e2

18 files changed: +411 -169 lines changed

Snakefile

Lines changed: 4 additions & 4 deletions
@@ -7,16 +7,16 @@ Templates_To_Output = namedtuple("Templates_To_Output",
 
 templates = [
     Templates_To_Output('gtdb220',
-                        'complete',
+                        'gtdb',
                         'outputs/md/gtdb220.md'),
     Templates_To_Output('gtdb226',
-                        'complete',
+                        'gtdb+rocksdb',
                         'outputs/md/gtdb226.md'),
     Templates_To_Output('ncbi_viruses_2025_01',
-                        'complete',
+                        'ncbi',
                         'outputs/md/ncbi_viruses_2025_01.md'),
     Templates_To_Output('ncbi_euks_2025_01',
-                        'complete',
+                        'ncbi',
                         'outputs/md/ncbi_euks_2025_01.md'),
 ]
 

outputs/md/gtdb220.md

Lines changed: 23 additions & 11 deletions
@@ -1,5 +1,5 @@
 <!-- automatically generated by code in https://github.com/sourmash-bio/2025-sourmash-databases-doc-template/ -->
-<!-- template file: templates/complete.md -->
+<!-- template file: templates/gtdb.md -->
 
 # Collection: GTDB RS220
 
@@ -11,25 +11,37 @@ Links:
 
 ## Database files:
 
-Files:
+| K-mer size | GTDB reps | GTDB entire |
+| -------- | -------- | -------- |
+| 21 | [download (2.8 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k21.dna.zip) | [download (17.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k21.dna.zip) |
+| 31 | [download (2.8 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k31.dna.zip) | [download (17.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k31.dna.zip) |
+| 51 | [download (2.8 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k51.dna.zip) | [download (17.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k51.dna.zip) |
 
-* zip: [gtdb-rs220-k21.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k21.dna.zip) - all GTDB genomes. - DNA, k=21, scaled=1000 (17.0 GB)
-* zip: [gtdb-rs220-k31.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k31.dna.zip) - all GTDB genomes. - DNA, k=31, scaled=1000 (17.0 GB)
-* zip: [gtdb-rs220-k51.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k51.dna.zip) - all GTDB genomes. - DNA, k=51, scaled=1000 (17.0 GB)
 
+## Taxonomy files:
 
-* zip: [gtdb-reps-rs220-k21.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k21.dna.zip) - all GTDB species representative genomes. - DNA, k=21, scaled=1000 (2.8 GB)
-* zip: [gtdb-reps-rs220-k31.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k31.dna.zip) - all GTDB species representative genomes. - DNA, k=31, scaled=1000 (2.8 GB)
-* zip: [gtdb-reps-rs220-k51.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k51.dna.zip) - all GTDB species representative genomes. - DNA, k=51, scaled=1000 (2.8 GB)
+* [GTDB taxonomy for RS220.](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220.lineages.csv)
 
 
+## Advanced
 
-## Taxonomy files:
+<!-- automatically generated by code in https://github.com/sourmash-bio/2025-sourmash-databases-doc-template/ -->
+<!-- template file: templates/advanced.md -->
 
-* [GTDB taxonomy for RS220.](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220.lineages.csv)
+### Complete list of database files
+
+Files:
+
+* zip: [gtdb-rs220-k21.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k21.dna.zip) - all GTDB genomes - DNA, k=21, scaled=1000 (17.0 GB)
+* zip: [gtdb-rs220-k31.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k31.dna.zip) - all GTDB genomes - DNA, k=31, scaled=1000 (17.0 GB)
+* zip: [gtdb-rs220-k51.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-rs220-k51.dna.zip) - all GTDB genomes - DNA, k=51, scaled=1000 (17.0 GB)
+
+
+* zip: [gtdb-reps-rs220-k21.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k21.dna.zip) - GTDB species representative genomes - DNA, k=21, scaled=1000 (2.8 GB)
+* zip: [gtdb-reps-rs220-k31.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k31.dna.zip) - GTDB species representative genomes - DNA, k=31, scaled=1000 (2.8 GB)
+* zip: [gtdb-reps-rs220-k51.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs220/gtdb-reps-rs220-k51.dna.zip) - GTDB species representative genomes - DNA, k=51, scaled=1000 (2.8 GB)
 
 
-## Advanced
 
 ### Download via curl using the command line
 

outputs/md/gtdb226.md

Lines changed: 38 additions & 23 deletions
@@ -1,5 +1,5 @@
 <!-- automatically generated by code in https://github.com/sourmash-bio/2025-sourmash-databases-doc-template/ -->
-<!-- template file: templates/complete.md -->
+<!-- template file: templates/gtdb.md -->
 
 # Collection: GTDB RS226
 
@@ -11,30 +11,45 @@ Links:
 
 ## Database files:
 
-Files:
+| K-mer size | GTDB reps | GTDB entire | RocksDB index of entire |
+| -------- | -------- | -------- | ---- |
+| 21 | [download (3.7 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k21.dna.zip) | [download (21.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.zip) | [download (31.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.zip) |
+| 31 | [download (3.7 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k31.dna.zip) | [download (21.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.zip) | [download (32.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.zip) |
+| 51 | [download (3.7 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k51.dna.zip) | [download (21.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.zip) | [download (33.0 GB)](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.zip) |
 
-* zip: [gtdb-reps-rs226-k21.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k21.dna.zip) - all GTDB species representative genomes. - DNA, k=21, scaled=1000 (21.0 GB)
-* zip: [gtdb-reps-rs226-k31.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k31.dna.zip) - all GTDB species representative genomes. - DNA, k=31, scaled=1000 (21.0 GB)
-* zip: [gtdb-reps-rs226-k51.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k51.dna.zip) - all GTDB species representative genomes. - DNA, k=51, scaled=1000 (21.0 GB)
 
+Note: RocksDB indexes must be unzipped before use, while the other
+databases can be used directly as zip files.
 
-* zip: [gtdb-rs226-k21.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.zip) - all GTDB genomes - DNA, k=21, scaled=1000 (21.0 GB)
-* zip: [gtdb-rs226-k31.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.zip) - all GTDB genomes - DNA, k=31, scaled=1000 (21.0 GB)
-* zip: [gtdb-rs226-k51.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.zip) - all GTDB genomes - DNA, k=51, scaled=1000 (21.0 GB)
+## Taxonomy files:
 
+* [GTDB taxonomy for RS226.](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226.lineages.csv)
 
-* tar.gz: [gtdb-rs226-k21.dna.rocksdb.tar.gz](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.tar.gz) - all GTDB genomes, indexed with RocksDB - DNA, k=21, scaled=1000 (31.0 GB)
-* tar.gz: [gtdb-rs226-k31.dna.rocksdb.tar.gz](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.tar.gz) - all GTDB genomes, indexed with RocksDB - DNA, k=31, scaled=1000 (32.0 GB)
-* tar.gz: [gtdb-rs226-k51.dna.rocksdb.tar.gz](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.tar.gz) - all GTDB genomes, indexed with RocksDB - DNA, k=51, scaled=1000 (33.0 GB)
 
+## Advanced
 
+<!-- automatically generated by code in https://github.com/sourmash-bio/2025-sourmash-databases-doc-template/ -->
+<!-- template file: templates/advanced.md -->
 
-## Taxonomy files:
+### Complete list of database files
 
-* [GTDB taxonomy for RS226.](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226.lineages.csv)
+Files:
+
+* zip: [gtdb-reps-rs226-k21.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k21.dna.zip) - GTDB species representative genomes - DNA, k=21, scaled=1000 (3.7 GB)
+* zip: [gtdb-reps-rs226-k31.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k31.dna.zip) - GTDB species representative genomes - DNA, k=31, scaled=1000 (3.7 GB)
+* zip: [gtdb-reps-rs226-k51.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226-k51.dna.zip) - GTDB species representative genomes - DNA, k=51, scaled=1000 (3.7 GB)
+
+
+* zip: [gtdb-rs226-k21.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.zip) - all GTDB genomes - DNA, k=21, scaled=1000 (21.0 GB)
+* zip: [gtdb-rs226-k31.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.zip) - all GTDB genomes - DNA, k=31, scaled=1000 (21.0 GB)
+* zip: [gtdb-rs226-k51.dna.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.zip) - all GTDB genomes - DNA, k=51, scaled=1000 (21.0 GB)
+
+
+* tar.gz: [gtdb-rs226-k21.dna.rocksdb.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.zip) - all GTDB genomes, indexed with RocksDB - DNA, k=21, scaled=1000 (31.0 GB)
+* tar.gz: [gtdb-rs226-k31.dna.rocksdb.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.zip) - all GTDB genomes, indexed with RocksDB - DNA, k=31, scaled=1000 (32.0 GB)
+* tar.gz: [gtdb-rs226-k51.dna.rocksdb.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.zip) - all GTDB genomes, indexed with RocksDB - DNA, k=51, scaled=1000 (33.0 GB)
 
 
-## Advanced
 
 ### Download via curl using the command line
 
@@ -57,14 +72,14 @@ curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-
 # download gtdb-rs226-k51.dna.zip
 curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.zip
 
-# download gtdb-rs226-k21.dna.rocksdb.tar.gz
-curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.tar.gz
+# download gtdb-rs226-k21.dna.rocksdb.zip
+curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.zip
 
-# download gtdb-rs226-k31.dna.rocksdb.tar.gz
-curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.tar.gz
+# download gtdb-rs226-k31.dna.rocksdb.zip
+curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.zip
 
-# download gtdb-rs226-k51.dna.rocksdb.tar.gz
-curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.tar.gz
+# download gtdb-rs226-k51.dna.rocksdb.zip
+curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.zip
 
 # download taxonomy file
 curl -O --no-clobber https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226.lineages.csv
@@ -79,8 +94,8 @@ https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-reps-rs226
 https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.zip
 https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.zip
 https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.zip
-https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.tar.gz
-https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.tar.gz
-https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.tar.gz
+https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.zip
+https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.zip
+https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.zip
 https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226.lineages.csv
 ```
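
The note added to gtdb226.md says RocksDB indexes must be unzipped before use, while the other databases can be used directly as zip files. A minimal Python sketch of that unzip step follows. The helper name `unzip_rocksdb_index` and the destination directory are assumptions for illustration and are not part of this commit; the layout inside the archive is not shown in the diff, so the helper only reports what it extracted.

```python
import zipfile
from pathlib import Path


def unzip_rocksdb_index(zip_path, dest_dir="."):
    """Extract a downloaded *.rocksdb.zip archive into dest_dir.

    Hypothetical helper (not from the commit). Returns the top-level
    paths that were extracted, sorted by name.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
        zf.extractall(dest)
    # Keep only the first path component of each archive member so the
    # caller can locate the extracted index without knowing its layout.
    top_level = sorted({name.split("/", 1)[0] for name in names if name})
    return [dest / name for name in top_level]
```

For example, `unzip_rocksdb_index("gtdb-rs226-k21.dna.rocksdb.zip", "databases")` would extract the k=21 index listed in the table above into a `databases` directory (a hypothetical location).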

outputs/md/ncbi_euks_2025_01.md

Lines changed: 37 additions & 8 deletions
@@ -1,5 +1,5 @@
 <!-- automatically generated by code in https://github.com/sourmash-bio/2025-sourmash-databases-doc-template/ -->
-<!-- template file: templates/complete.md -->
+<!-- template file: templates/ncbi.md -->
 
 # Collection: NCBI Eukaryotes (Jan 2025)
 
@@ -11,6 +11,42 @@ Links:
 
 ## Database files:
 
+
+Indexed RocksDB collections:
+
+* zip: [ncbi-euks-all-2025.01.k51.rocksdb.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-euks-all-2025.01.k51.rocksdb.zip) - all NCBI eukaryotes, indexed in a RocksDB - DNA, k=51, scaled=10000 (19.0 GB)
+
+
+Note: RocksDB indexes must be unzipped before use.
+
+
+Zip collections:
+
+* zip: [ncbi-euks-vertebrates-2025.01.dna.k=51.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-euks-vertebrates-2025.01.dna.k=51.sig.zip) - vertebrate reference genomes (NCBI:txid7742) - DNA, k=51, scaled=10000 (4.0 GB)
+
+* zip: [ncbi-euks-bilateria-minus-vertebrates-2025.01.dna.k=51.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-euks-bilateria-minus-vertebrates-2025.01.dna.k=51.sig.zip) - bilateria minus the vertebrates - DNA, k=51, scaled=10000 (1.7 GB)
+
+* zip: [ncbi-euks-plants-2025.01.dna.k=51.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-euks-plants-2025.01.dna.k=51.sig.zip) - plant reference genomes (NCBI:txid33090) - DNA, k=51, scaled=10000 (1.3 GB)
+
+* zip: [ncbi-euks-fungi-2025.01.dna.k=51.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-euks-fungi-2025.01.dna.k=51.sig.zip) - fungal reference genomes (NCBI:txid4751) - DNA, k=51, scaled=10000 (0.2 GB)
+
+* zip: [ncbi-euks-metazoa-minus-bilateria-2025.01.dna.k=51.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-euks-metazoa-minus-bilateria-2025.01.dna.k=51.sig.zip) - metazoan reference genomes minus the bilateria - DNA, k=51, scaled=10000 (0.1 GB)
+
+* zip: [ncbi-euks-other-2025.01.dna.k=51.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-euks-other-2025.01.dna.k=51.sig.zip) - remaining eukaryotes (not plants, fungi, or metazoa) - DNA, k=51, scaled=10000 (0.1 GB)
+
+
+## Taxonomy files:
+
+* [NCBI taxonomy for eukaryotes (NCBI:txid2759) as of January 2025.](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-eukaryotes.2025.01.lineages.csv)
+
+
+## Advanced
+
+<!-- automatically generated by code in https://github.com/sourmash-bio/2025-sourmash-databases-doc-template/ -->
+<!-- template file: templates/advanced.md -->
+
+### Complete list of database files
+
 Files:
 
 * zip: [ncbi-euks-all-2025.01.k51.rocksdb.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-euks-all-2025.01.k51.rocksdb.zip) - all NCBI eukaryotes, indexed in a RocksDB - DNA, k=51, scaled=10000 (19.0 GB)
@@ -35,13 +71,6 @@ Files:
 
 
 
-## Taxonomy files:
-
-* [NCBI taxonomy for eukaryotes (NCBI:txid2759) as of January 2025.](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/genbank-euks-2025.01/ncbi-eukaryotes.2025.01.lineages.csv)
-
-
-## Advanced
-
 ### Download via curl using the command line
 
 ```shell

outputs/md/ncbi_viruses_2025_01.md

Lines changed: 21 additions & 5 deletions
@@ -1,5 +1,5 @@
 <!-- automatically generated by code in https://github.com/sourmash-bio/2025-sourmash-databases-doc-template/ -->
-<!-- template file: templates/complete.md -->
+<!-- template file: templates/ncbi.md -->
 
 # Collection: NCBI Viruses (Jan 2025)
 
@@ -11,12 +11,15 @@ Links:
 
 ## Database files:
 
-Files:
 
-* zip: [ncbi-viruses-2025.01.dna.k=21.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.dna.k=21.sig.zip) - all viral genomes. - DNA, k=21, scaled=50 (1.4 GB)
-* zip: [ncbi-viruses-2025.01.dna.k=31.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.dna.k=31.sig.zip) - all viral genomes. - DNA, k=31, scaled=50 (1.4 GB)
-* zip: [ncbi-viruses-2025.01.skip_m2n3.k=24.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.skip_m2n3.k=24.sig.zip) - all viral genomes. - skip_m2n3, k=24, scaled=50 (2.7 GB)
 
+Zip collections:
+
+* zip: [ncbi-viruses-2025.01.dna.k=21.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.dna.k=21.sig.zip) - all viral genomes - DNA, k=21, scaled=50 (1.4 GB)
+
+* zip: [ncbi-viruses-2025.01.dna.k=31.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.dna.k=31.sig.zip) - all viral genomes - DNA, k=31, scaled=50 (1.4 GB)
+
+* zip: [ncbi-viruses-2025.01.skip_m2n3.k=24.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.skip_m2n3.k=24.sig.zip) - all viral genomes - skip_m2n3, k=24, scaled=50 (2.7 GB)
 
 
 ## Taxonomy files:
@@ -26,6 +29,19 @@ Files:
 
 ## Advanced
 
+<!-- automatically generated by code in https://github.com/sourmash-bio/2025-sourmash-databases-doc-template/ -->
+<!-- template file: templates/advanced.md -->
+
+### Complete list of database files
+
+Files:
+
+* zip: [ncbi-viruses-2025.01.dna.k=21.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.dna.k=21.sig.zip) - all viral genomes - DNA, k=21, scaled=50 (1.4 GB)
+* zip: [ncbi-viruses-2025.01.dna.k=31.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.dna.k=31.sig.zip) - all viral genomes - DNA, k=31, scaled=50 (1.4 GB)
+* zip: [ncbi-viruses-2025.01.skip_m2n3.k=24.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.skip_m2n3.k=24.sig.zip) - all viral genomes - skip_m2n3, k=24, scaled=50 (2.7 GB)
+
+
+
 ### Download via curl using the command line
 
 ```shell

outputs/scripts/check-urls.py

Lines changed: 3 additions & 3 deletions
@@ -19,9 +19,9 @@
     'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.zip',
     'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.zip',
     'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.zip',
-    'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.tar.gz',
-    'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.tar.gz',
-    'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.tar.gz',
+    'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k21.dna.rocksdb.zip',
+    'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k31.dna.rocksdb.zip',
+    'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226-k51.dna.rocksdb.zip',
     'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/gtdb-rs226/gtdb-rs226.lineages.csv',
     'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.dna.k=21.sig.zip',
     'https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db.new/ncbi-viruses-2025.01/ncbi-viruses-2025.01.dna.k=31.sig.zip',
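
The hunk above shows only the URL list in check-urls.py, not the code that consumes it. One way such a list could be checked is sketched below, assuming one HTTP HEAD request per URL. The function names `head_status` and `find_broken_urls` and the injectable `fetch_status` parameter are hypothetical and are not taken from the script.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=30):
    """Return the HTTP status code for a HEAD request to url.

    Connection-level failures (urllib.error.URLError) are not caught
    here and propagate to the caller.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status code is still useful.
        return err.code


def find_broken_urls(urls, fetch_status=head_status):
    """Return (url, status) pairs for every URL that does not answer 200."""
    broken = []
    for url in urls:
        status = fetch_status(url)
        if status != 200:
            broken.append((url, status))
    return broken
```

Passing `fetch_status` as a parameter keeps the loop testable without network access; a HEAD request avoids downloading the multi-gigabyte archives just to confirm that a link resolves.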
