@bencap bencap commented Nov 10, 2025

Features

Bug Fixes

Maintenance

Deployment Note
To migrate the database to support this branch, follow these steps:

1. Check out the commit 2cbf857d78ecd38caaa105eeecada9b6eda46f99 and upgrade to the alembic head.

```shell
git checkout 129d7daefb0c18d0f70f20ee633cb17a02a3cc37
DB_PORT=5434 DB_DATABASE_NAME=mavedb DB_USERNAME=postgres DB_PASSWORD=XXX poetry run alembic upgrade head
```

2. Run the manual migration.

```shell
DB_PORT=5434 DB_DATABASE_NAME=mavedb DB_USERNAME=postgres DB_PASSWORD=XXX poetry run python3 alembic/manual_migrations/migrate_score_ranges_to_calibrations.py
```

3. Check out the commit 3da28919a6f6173ad0d65c768b172dfffbf730fc and upgrade to the alembic head.

```shell
git checkout 3da28919a6f6173ad0d65c768b172dfffbf730fc
DB_PORT=5434 DB_DATABASE_NAME=mavedb DB_USERNAME=postgres DB_PASSWORD=XXX poetry run alembic upgrade head
```

4. Check out the head commit at main and upgrade to the alembic head.

```shell
git checkout main
DB_PORT=5434 DB_DATABASE_NAME=mavedb DB_USERNAME=postgres DB_PASSWORD=XXX poetry run alembic upgrade head
```

bencap and others added 30 commits October 6, 2025 13:40
Adds a standalone score calibration model to replace the `score_ranges` property of score sets.
This model better supports generically typed score ranges, publication identifiers directly associated with score ranges, and odds paths provided independently of functional classes.
Introduce a dataset_columns field on score sets, allowing definition of score and count column metadata.
Establish base plumbing for persisting and returning column structure with score set resources.
Add dataset_columns to score set updates

Allows two keys of dataset_columns to be set via score_set_update: score_columns_metadata and count_columns_metadata. These can be populated with JSON, keyed on column name, with values describing the fields listed in the score_columns and count_columns values of dataset_columns. Both score and count column metadata are optional, but if present they will be used by the UI to display more information about the custom columns present in each dataset.
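To make the shape concrete, here is a hypothetical payload for the dataset_columns portion of a score set update. The two top-level keys come from the change description above; the per-column fields (here just "description") are illustrative assumptions, as is the validation helper.

```python
# Hypothetical dataset_columns update payload. The top-level keys
# (score_columns_metadata, count_columns_metadata) come from the PR;
# the per-column "description" field is an illustrative assumption.
dataset_columns_update = {
    "score_columns_metadata": {
        "score": {"description": "Functional score from the primary assay"},
        "se": {"description": "Standard error of the score"},
    },
    "count_columns_metadata": {
        "c_0": {"description": "Pre-selection read count"},
    },
}

def unknown_metadata_keys(metadata: dict, dataset_columns: list) -> list:
    """Return metadata keys that do not name a column in the dataset."""
    return [name for name in metadata if name not in dataset_columns]
```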
…nts as part of updating variants

Reworks the previous implementation, which included scores and counts column metadata in the JSON body, to instead have them submitted as optional JSON files alongside the scores and counts CSV files. This change also validates the JSON files against the CSV files they correspond to. Adds pydantic models DatasetColumns and DatasetColumnMetadata corresponding to ScoreSet.dataset_columns and its score_columns_metadata and count_columns_metadata fields. Ensures background jobs have structured column annotations available for downstream processing.
Create an update model where all fields are optional, plus an as_form classmethod to parse multipart form data.
Enables PATCH/PUT endpoints to accept partial updates including JSON-encoded nested objects.
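The pattern described above, an all-optional update model with an as_form constructor that decodes JSON-encoded nested objects out of multipart form fields, might look roughly like this. The real models are pydantic and the form parsing is FastAPI's; this stdlib dataclass version is only a sketch with assumed field names.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScoreSetUpdate:
    # All fields optional so the model supports partial (PATCH) updates.
    title: Optional[str] = None
    dataset_columns: Optional[dict] = None

    @classmethod
    def as_form(cls, title: Optional[str] = None,
                dataset_columns: Optional[str] = None) -> "ScoreSetUpdate":
        # Nested objects arrive as JSON strings inside multipart form
        # data, so decode them before constructing the model.
        parsed = json.loads(dataset_columns) if dataset_columns else None
        return cls(title=title, dataset_columns=parsed)
```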
Extract dataset column models into view_models/score_set_dataset_columns for cohesion and reuse.
Improves maintainability by separating concerns from core score set models.
Remove test relying on camelization, since dataset_columns is now backed by explicit pydantic models.
Adjust wrapper function to satisfy mypy type checking.
Add recordType to SavedDatasetColumns for clearer differentiation of stored column metadata groups.
Update related constants and tests to reflect new attribute.
…pload

Implement /score-sets-with-variants/{urn} PATCH handling multipart form (files + metadata).
Supports simultaneous update of score set fields, scores CSV, counts CSV, and column metadata JSON.
Provide utility functions to avoid redundant target gene recreation on update.
Improves efficiency and correctness of target gene handling during score set modifications.
Rename internal variables for consistency (score_columns_metadata, count_columns_metadata).
Adjust route logic to accept both score/count files and their metadata JSON counterparts.
Integrate file parsing (scores, counts, column metadata) into PATCH workflow.
Adds validation + conditional variant regeneration based on uploaded content and target gene changes.
Revise existing tests and introduce new ones to cover optional update model and multipart handling.
Ensures regression coverage for newly added endpoint behaviors.
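The conditional-regeneration decision described above can be sketched minimally as follows; the function and parameter names are assumptions for illustration, not the PR's actual code.

```python
def should_regenerate_variants(scores_uploaded: bool,
                               counts_uploaded: bool,
                               target_genes_changed: bool) -> bool:
    """Regenerate variants only when new data files were uploaded or the
    target genes changed; plain metadata edits leave variants untouched."""
    return scores_uploaded or counts_uploaded or target_genes_changed
```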
… usage

Add targeted tests for optional update model.
Apply model validation within router context to assert coherent partial update semantics.
Introduce score_columns_metadata.json and count metadata test files.
Enable worker job tests to parse realistic column annotations during variant ingestion pipeline.
- New /api/v1/alphafold-files/version endpoint (CORS-safe)
- Fetches upstream XML index, handles optional namespace
- Extracts model version from NextMarker (model_vN) and returns {"version": "v6"} (lowercased)
- Error responses: upstream fetch failure (status passthrough), malformed ID (500), XML parse error (502)
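The version-extraction step could be sketched as below, assuming an S3-style XML listing whose NextMarker value contains a model_vN token; the exact upstream XML layout is an assumption.

```python
import re
import xml.etree.ElementTree as ET

def extract_model_version(xml_text: str) -> str:
    """Pull the model version (e.g. 'v6') out of an S3-style XML index.

    Handles an optional default namespace and expects the NextMarker
    value to contain a 'model_vN' token.
    """
    root = ET.fromstring(xml_text)
    # The listing may or may not carry a default namespace.
    ns_match = re.match(r"\{(.*)\}", root.tag)
    ns = {"s3": ns_match.group(1)} if ns_match else {}
    marker = root.find("s3:NextMarker", ns) if ns else root.find("NextMarker")
    if marker is None or not marker.text:
        raise ValueError("NextMarker not found")
    version = re.search(r"model_v(\d+)", marker.text)
    if version is None:
        raise ValueError(f"Malformed marker: {marker.text}")
    return f"v{version.group(1)}"
```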
…t scoreset URNs and count from response

Including the scoreset URNs and count for all score sets of the parent experiment is computationally expensive and not necessary in all contexts (e.g. the main score set search page). This adds an optional boolean parameter to the score set search parameters indicating whether to include this information in the response; it defaults to true to maintain existing behavior.
- Add a limit option to score set search queries. Require a limit of at most 100 for searches of all score sets, while searches for the user's own score sets can have no limit.
- Extract the score set search query filter clause logic into a new function.
- Use limit + 1 in the search query, and if limit is exceeded, run a second query to count available rows. In either case, return the number of available rows with the limited search result.
- Instead of searching all score sets and then replacing un-superseded ones with their successors, revise the database query to search only un-superseded score sets.
- In the main search endpoint (but not in the "my score sets" endpoint), mandate that the search be only for published score sets.
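The limit + 1 pattern in the third bullet can be sketched as a small helper (hypothetical names, not the actual MaveDB query code): fetch one extra row to detect overflow, and only run the expensive count query when the limit was actually exceeded.

```python
def search_with_limit(run_query, count_rows, limit: int):
    """Fetch limit + 1 rows to detect overflow; only run the (expensive)
    count query when more rows exist than the limit allows."""
    rows = run_query(limit + 1)
    if len(rows) > limit:
        return rows[:limit], count_rows()
    return rows, len(rows)
```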
…pecified, but limit the number of publication IDs.
sallybg and others added 27 commits November 12, 2025 15:33
The Rxiv server recently updated its API to remove the `preprint_` string prepended to various fields. This change removes the prepended string from our parsing, allowing us to parse these fields out again.
In newer versions of UTA, the gene table has become overrun with mRNAs, mitochondrial genes, etc. that pollute the result list. Since the real purpose of this endpoint is to list the genes for which we can link transcripts, this change edits the query to select distinct HGNC names from the transcripts table.
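The effect of selecting distinct gene symbols from the transcript table can be illustrated with an in-memory table; the table and column names here are simplified assumptions, not UTA's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (ac TEXT, hgnc TEXT)")
conn.executemany(
    "INSERT INTO transcript VALUES (?, ?)",
    [
        ("NM_000546.6", "TP53"),
        ("NM_001126112.3", "TP53"),  # duplicate gene symbol collapses
        ("NM_007294.4", "BRCA1"),
    ],
)
# Select distinct gene symbols from the transcript table, so only genes
# with at least one linkable transcript appear in the result list.
genes = [row[0] for row in conn.execute(
    "SELECT DISTINCT hgnc FROM transcript ORDER BY hgnc"
)]
```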
… and clarify error logging for variant updates
Populate and surface post-mapped HGVS expressions and VEP functional consequence, and surface gnomAD AF
@bencap bencap marked this pull request as ready for review November 14, 2025 03:51
@bencap bencap merged commit 37659db into main Nov 14, 2025
6 checks passed
@bencap bencap deleted the release-2025.5.0 branch November 14, 2025 19:41