Commit 129a6c5

Fix phylo outbreak (#354)
* phylo: Update example data

  Prompted by release of mpox Nextclade dataset that changed the values in the `outbreak` column.

* phylo: Update `outbreak` filters in configs

  Latest mpox Nextclade dataset updated the values in the `outbreak` column such that outbreak hMPXV-1 is now called sh2017.¹

  Resolves #353

  ¹ <https://github.com/nextstrain/nextclade_data/releases/tag/2025-12-10--14-52-38Z>

* phylo: update hardcoded `hMPXV-1` values to `sh2017`

* Update changelog
1 parent ae250ef commit 129a6c5

8 files changed: 175420 additions, 181480 deletions


CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -10,6 +10,9 @@ Instead, changes appear below grouped by the date they were added to the workflo

 ## 2025

+* 10 December 2025: BREAKING CHANGE - the `outbreak` column values have been updated with the latest release of the Nextclade dataset.
+See [Nextclade release notes](https://github.com/nextstrain/nextclade_data/releases/tag/2025-12-10--14-52-38Z) for details.
+* phylogenetic workflow's default filter params have been updated to replace outbreak `hMPXV-1` with `sh2017`
 * 10 November 2025: Transition to Pathoplexus as data source
 * Ingest workflow updated to pull data from Pathoplexus API instead of directly from INSDC databases.
 * Phylogenetic workflow updated to take into account changes in metadata and sequence data from Pathoplexus.
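
For anyone maintaining a custom metadata file or build config, the renaming described in this changelog entry means that a filter such as `outbreak!=hMPXV-1` would presumably exclude every record once the metadata carries the new names. The following is a minimal sketch, assuming a tab-separated metadata file with an `outbreak` column, for counting rows that still carry the old value. The function name `count_outbreak_values` and the three-row sample are hypothetical and are not part of the workflow.

```python
import csv
import io
from collections import Counter


def count_outbreak_values(tsv_handle, column="outbreak"):
    """Count how often each value appears in the given metadata column."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    return Counter(row.get(column, "") for row in reader)


# Hypothetical sample, NOT taken from the real example data.
sample = io.StringIO(
    "strain\tclade\toutbreak\n"
    "s1\tIIb\tsh2017\n"
    "s2\tIIb\thMPXV-1\n"
    "s3\tI\t\n"
)
counts = count_outbreak_values(sample)
stale = counts.get("hMPXV-1", 0)
print(f"rows still labelled hMPXV-1: {stale}")
```

To check a real file, an opened file handle can be passed in place of the `io.StringIO` sample.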

phylogenetic/build-configs/ci/config.yaml

Lines changed: 2 additions & 2 deletions
@@ -34,7 +34,7 @@ subsample:
 --group-by lineage year country
 --sequences-per-group 50
 --exclude-where
-outbreak!=hMPXV-1
+outbreak!=sh2017
 clade!=IIb
 lineage=B.1
 lineage=B.1.1
@@ -85,7 +85,7 @@ subsample:
 b1: >-
 --group-by country year
 --subsample-max-sequences 300
---exclude-where outbreak!=hMPXV-1 clade!=IIb
+--exclude-where outbreak!=sh2017 clade!=IIb

 ## align
 max_indel: 10000
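
The `--exclude-where` arguments in this config pass several conditions to `augur filter`, which drops a record when it matches any one of them. The following is a simplified sketch of that OR behaviour, not augur's implementation: `is_excluded` and the sample records are hypothetical, and matching here is plain string comparison.

```python
def is_excluded(record, conditions):
    """Return True if the record matches ANY exclude condition.

    Each condition is written as "column=value" or "column!=value".
    """
    for condition in conditions:
        if "!=" in condition:
            column, value = condition.split("!=", 1)
            if record.get(column, "") != value:
                return True
        else:
            column, value = condition.split("=", 1)
            if record.get(column, "") == value:
                return True
    return False


# Conditions as they appear in the updated config.
conditions = ["outbreak!=sh2017", "clade!=IIb"]

# Hypothetical records for illustration only.
records = [
    {"strain": "a", "outbreak": "sh2017", "clade": "IIb"},
    {"strain": "b", "outbreak": "hMPXV-1", "clade": "IIb"},
    {"strain": "c", "outbreak": "", "clade": "I"},
]
kept = [r["strain"] for r in records if not is_excluded(r, conditions)]
print(kept)
```

Record `b` shows why the filter value had to change together with the data: a record labelled with one outbreak name is excluded by a filter written for the other.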

phylogenetic/defaults/color_ordering.tsv

Lines changed: 1 addition & 1 deletion
@@ -16894,7 +16894,7 @@ clade_membership IIb

 ################

-outbreak hMPXV-1
+outbreak sh2017

 ################

phylogenetic/defaults/hmpxv1/config.yaml

Lines changed: 2 additions & 2 deletions
@@ -34,7 +34,7 @@ subsample:
 --group-by lineage year country
 --sequences-per-group 75
 --exclude-where
-outbreak!=hMPXV-1
+outbreak!=sh2017
 clade!=IIb
 lineage=B.1
 lineage=B.1.1
@@ -86,7 +86,7 @@ subsample:
 --group-by country year
 --subsample-max-sequences 800
 --probabilistic-sampling
---exclude-where outbreak!=hMPXV-1 clade!=IIb
+--exclude-where outbreak!=sh2017 clade!=IIb

 ## align
 max_indel: 10000

phylogenetic/defaults/hmpxv1_big/config.yaml

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ subsample:
 --group-by year month country
 --subsample-max-sequences 5000
 --exclude-where
-outbreak!=hMPXV-1
+outbreak!=sh2017
 clade!=IIb
 lineage=A
 lineage=A.1

phylogenetic/example_data/metadata.tsv

Lines changed: 64 additions & 65 deletions
Large diffs are not rendered by default.

phylogenetic/example_data/sequences.fasta

Lines changed: 175344 additions & 181406 deletions
Large diffs are not rendered by default.

phylogenetic/scripts/clades_renaming.py

Lines changed: 3 additions & 3 deletions
@@ -30,7 +30,7 @@

 # if it starts with clade -> it's a clade
 # if it starts with outbreak -> it's outbreak, need to look up clade
-# if it starts with lineage -> it's clade IIb, outbreak hMPXV-1
+# if it starts with lineage -> it's clade IIb, outbreak sh2017
 if old_clade_name.startswith("clade"):
 clade_name = old_clade_name.split()[1]
 # Need to set up clade dictionary for when we have other outbreaks
@@ -41,7 +41,7 @@
 clade_name = args.outgroup_clade_name
 else:
 clade_name = "IIb"
-outbreak_name = "hMPXV-1"
+outbreak_name = "sh2017"
 lineage_name = old_clade_name

 new_node_data[name] = {
@@ -52,7 +52,7 @@
 if "clade_annotation" in node:
 new_node_data[name]["clade_annotation"] = node["clade_annotation"]
 if node["clade_annotation"] == "A":
-new_node_data[name]["clade_annotation"] = "hMPXV-1 A"
+new_node_data[name]["clade_annotation"] = "sh2017 A"

 data["nodes"] = new_node_data
 with open(args.output_node_data, "w") as fh:
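
The branch logic touched by these hunks can be sketched as a standalone function. This is a hedged reconstruction from the visible diff lines only. The middle branch is elided in the diff, so the version here (triggering on an `outbreak` prefix, taking the name after the prefix, and using a caller-supplied clade) is an assumption, as are the function name `split_clade_label` and the `outgroup_clade` parameter.

```python
def split_clade_label(old_clade_name, outgroup_clade=""):
    """Split a combined label into (clade, outbreak, lineage).

    Parts that do not apply are returned as empty strings.
    """
    clade_name, outbreak_name, lineage_name = "", "", ""
    if old_clade_name.startswith("clade"):
        # e.g. "clade IIb" -> clade "IIb"
        clade_name = old_clade_name.split()[1]
    elif old_clade_name.startswith("outbreak"):
        # ASSUMPTION: this branch is not shown in the diff; the real
        # script may look the clade up differently.
        outbreak_name = old_clade_name.split()[1]
        clade_name = outgroup_clade
    else:
        # Anything else is treated as a lineage within clade IIb,
        # outbreak sh2017 (formerly hMPXV-1).
        clade_name = "IIb"
        outbreak_name = "sh2017"
        lineage_name = old_clade_name
    return clade_name, outbreak_name, lineage_name


print(split_clade_label("B.1"))
```

Keeping the outbreak name in one place, for example as a module-level constant, would reduce the number of hardcoded strings to update the next time the dataset renames an outbreak.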
