Release 2025.5.0 #575
Merged
…single primary definition
For use with urn:mavedb:00000097-0-2
Adds a standalone score calibration model to replace the `score_ranges` property of score sets. This model better supports generically typed score ranges, publication identifiers associated directly with score ranges, and odds paths provided independently of functional classes.
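A minimal sketch of what such a standalone calibration object might look like; the class and field names here are illustrative assumptions, not the actual MaveDB schema:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ScoreRange:
    label: str                     # e.g. a functional classification
    lower: Optional[float] = None  # None represents an open-ended bound
    upper: Optional[float] = None

@dataclass
class ScoreCalibration:
    score_ranges: list[ScoreRange] = field(default_factory=list)
    # Publication identifiers tied directly to the ranges.
    publication_identifiers: list[str] = field(default_factory=list)
    # Odds paths supplied independently of functional classes.
    odds_paths: dict[str, float] = field(default_factory=dict)

calibration = ScoreCalibration(
    score_ranges=[ScoreRange("abnormal", upper=-2.0), ScoreRange("normal", lower=-0.5)],
    publication_identifiers=["PMID:12345678"],
    odds_paths={"OP1": 0.0029},
)
```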
…st to simplify build process
…null falsey values
…g multiple data types together.
Introduce a dataset_columns field on score sets, allowing definition of score and count column metadata, and establish the base plumbing for persisting and returning column structure with score set resources. Two keys of dataset_columns can be set via score_set_update: score_columns_metadata and count_columns_metadata. These can be populated with JSON, keyed on column name, with values describing the fields listed in the score_columns and count_columns values of dataset_columns. Both score and count column metadata are optional, but if present they will be used by the UI to display more information about the custom columns present in each dataset.
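A sketch of what such a column-metadata payload might look like, keyed on column name; the metadata field names ("title", "description") and column names are illustrative assumptions:

```python
import json

# Hypothetical per-column metadata, keyed on the column names that appear
# in the score_columns value of dataset_columns.
score_columns_metadata = {
    "score": {"title": "Fitness score", "description": "Log ratio of variant enrichment"},
    "se": {"title": "Standard error", "description": "Standard error of the fitness score"},
}

# Serialized form as it might be supplied with a score_set_update request.
payload = json.dumps({"score_columns_metadata": score_columns_metadata})
```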
…nts as part of updating variants Reworks the previous implementation, which included scores and counts column metadata as part of the JSON body, to instead have them submitted as optional JSON files along with the scores and counts CSV files. This change also validates the JSON files against the CSV files they correspond to. Adds pydantic models DatasetColumns and DatasetColumnMetadata corresponding to ScoreSet.dataset_columns and its score_columns_metadata and count_columns_metadata fields. Ensures background jobs have structured column annotations available for downstream processing.
Create an update model in which all fields are optional, plus an as_form classmethod to parse multipart form data. Enables PATCH/PUT endpoints to accept partial updates, including JSON-encoded nested objects.
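A rough sketch of the all-optional update model with an as_form parser. The actual MaveDB model is pydantic-based and integrated with FastAPI; the field names and the plain-dict form input here are simplifying assumptions:

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Optional

@dataclass
class ScoreSetUpdateSketch:
    # Every field is optional, so any subset may be supplied.
    title: Optional[str] = None
    short_description: Optional[str] = None
    score_columns_metadata: Optional[dict] = None  # JSON-encoded in the form

    @classmethod
    def as_form(cls, form: dict[str, str]) -> "ScoreSetUpdateSketch":
        kwargs: dict[str, Any] = {}
        for f in fields(cls):
            if f.name not in form:
                continue  # absent fields stay None, enabling partial updates
            raw = form[f.name]
            # Nested objects arrive as JSON strings in multipart form data.
            kwargs[f.name] = json.loads(raw) if f.name.endswith("_metadata") else raw
        return cls(**kwargs)

update = ScoreSetUpdateSketch.as_form(
    {"title": "New title", "score_columns_metadata": '{"score": {"title": "Fitness"}}'}
)
```

Fields missing from the form stay `None`, which is what lets the endpoint distinguish "not supplied" from "set to empty" during a partial update.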
Extract dataset column models into view_models/score_set_dataset_columns for cohesion and reuse. Improves maintainability by separating concerns from core score set models.
Remove a test relying on camelization, since dataset_columns is now backed by explicit pydantic models. Adjust the wrapper function to satisfy mypy type checking.
Add recordType to SavedDatasetColumns for clearer differentiation of stored column metadata groups. Update related constants and tests to reflect new attribute.
…pload
Implement /score-sets-with-variants/{urn} PATCH handling multipart form (files + metadata).
Supports simultaneous update of score set fields, scores CSV, counts CSV, and column metadata JSON.
Provide utility functions to avoid redundant target gene recreation on update. Improves efficiency and correctness of target gene handling during score set modifications.
Rename internal variables for consistency (score_columns_metadata, count_columns_metadata). Adjust route logic to accept both score/count files and their metadata JSON counterparts.
Integrate file parsing (scores, counts, column metadata) into PATCH workflow. Adds validation + conditional variant regeneration based on uploaded content and target gene changes.
Revise existing tests and introduce new ones to cover optional update model and multipart handling. Ensures regression coverage for newly added endpoint behaviors.
… usage Add targeted tests for optional update model. Apply model validation within router context to assert coherent partial update semantics.
Introduce score_columns_metadata.json and count metadata test files. Enable worker job tests to parse realistic column annotations during variant ingestion pipeline.
- New /api/v1/alphafold-files/version endpoint (CORS-safe)
- Fetches upstream XML index, handles optional namespace
- Extracts model version from NextMarker (model_vN) and returns {"version": "v6"} (lowercased)
- Error responses: upstream fetch failure (status passthrough), malformed ID (500), XML parse error (502)
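The version-extraction step described above can be sketched as follows. The sample XML, the function name, and the exact marker format are assumptions based on the description (an S3-style index whose NextMarker contains a `model_vN` filename):

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical upstream index fragment; the real response comes from the
# AlphaFold file index and may contain additional elements.
SAMPLE_XML = """<?xml version="1.0"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <NextMarker>AF-A0A000-F1-model_v6.cif</NextMarker>
</ListBucketResult>"""

def extract_model_version(xml_text: str) -> str:
    root = ET.fromstring(xml_text)
    # Handle the optional namespace: reuse whatever prefix the root carries.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    marker = root.findtext(f"{ns}NextMarker")
    if marker is None:
        raise ValueError("NextMarker not found in index")  # XML parse/shape error
    match = re.search(r"model_(v\d+)", marker)
    if match is None:
        raise ValueError(f"malformed model id: {marker}")  # malformed ID case
    return match.group(1).lower()
```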
…t scoreset URNs and count from response Including the scoreset URNs and count for all score sets of the parent experiment is computationally expensive and not necessary in all contexts (e.g. the main score set search page). Add an optional boolean parameter to the score set search parameters indicating whether to include this in the response; it defaults to true to maintain existing behavior.
- Add a limit option to score set search queries. Require a limit of at most 100 for searches of all score sets, while searches for the user's own score sets can have no limit.
- Extract the score set search query filter clause logic into a new function.
- Use limit + 1 in the search query, and if the limit is exceeded, run a second query to count the available rows. In either case, return the number of available rows with the limited search result.
- Instead of searching all score sets and then replacing superseded ones with their successors, revise the database query to search only un-superseded score sets.
- In the main search endpoint (but not in the "my score sets" endpoint), mandate that the search cover only published score sets.
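The limit + 1 technique above can be sketched in plain Python, with a list standing in for database rows; the function name and the second "count" query are illustrative:

```python
def search_with_limit(rows, limit):
    # Fetch one extra row (stands in for query.limit(limit + 1)) so we can
    # tell whether the result set was truncated without a COUNT up front.
    fetched = rows[: limit + 1]
    if len(fetched) > limit:
        total = len(rows)  # stands in for the second query counting available rows
        return fetched[:limit], total
    return fetched, len(fetched)  # no overflow: the fetch itself gives the count

results, total = search_with_limit(list(range(250)), 100)
```

The payoff is that the (potentially expensive) count query only runs when the result set actually exceeds the limit.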
…pecified, but limit the number of publication IDs.
The Rxiv server recently updated its API to remove the `preprint_` string prepended to various fields. This change removes the prepended string, allowing us to parse these fields out again.
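A minimal sketch of normalizing the two response shapes so old and new API payloads parse alike; the function name and the `preprint_doi` key are illustrative assumptions:

```python
def normalize_rxiv_fields(record: dict) -> dict:
    # Strip the legacy "preprint_" prefix so responses from before and after
    # the API change expose the same field names.
    return {key.removeprefix("preprint_"): value for key, value in record.items()}

normalized = normalize_rxiv_fields(
    {"preprint_doi": "10.1101/2024.01.01.000001", "title": "Example preprint"}
)
```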
…for lower points, citation information
In newer versions of UTA, the gene table has become overrun with mRNA, mitochondrial genes, etc. that pollute the result list. Since the real purpose of this endpoint is to list the genes for which we can link transcripts, this change edits the query to select distinct HGNC names from the transcripts table.
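The revised query amounts to a `SELECT DISTINCT` over the transcripts table. A runnable sketch using an in-memory SQLite stand-in; the table and column names are simplified assumptions about the UTA schema:

```python
import sqlite3

# In-memory stand-in for the UTA transcripts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (hgnc TEXT, ac TEXT)")
conn.executemany(
    "INSERT INTO transcript VALUES (?, ?)",
    [("BRCA1", "NM_007294.4"), ("BRCA1", "NM_007300.4"), ("TP53", "NM_000546.6")],
)
# Selecting distinct HGNC names from transcripts yields exactly the genes
# for which transcripts can be linked, skipping the polluted gene table.
genes = [row[0] for row in conn.execute("SELECT DISTINCT hgnc FROM transcript ORDER BY hgnc")]
```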
…in variant enqueue
… and clarify error logging for variant updates
…s boolean comparison
Populate and surface post-mapped HGVS expressions and VEP functional consequence, and surface gnomAD AF
… on VRS if populated HGVS is not available
Features
Bug Fixes
Maintenance
Deployment Note
To properly migrate the database to support this branch, follow these steps:
1. Check out the commit 2cbf857d78ecd38caaa105eeecada9b6eda46f99 and upgrade to the alembic head.
2. Run the manual migration.
3. Check out the commit 3da28919a6f6173ad0d65c768b172dfffbf730fc and upgrade to the alembic head.
4. Check out the head commit at main and upgrade to the alembic head.