Support termdb.dictionary.build/get in PP code#4291
Conversation
d72246d to
ced508f
Compare
There was a problem hiding this comment.
Pull request overview
Updates ProteinPaint (PP) server + shared dataset typings to support the new dataset contract termdb.dictionary.build/get (introduced in sjpp), enabling API-based datasets to build a dictionary and serve dictionary-term sample annotations through a dataset-supplied getter.
Changes:
- Added
DictApityping and extendedTermdb.dictionarytyping to includebuild/get. - Updated
validate_termdb()to calltermdb.dictionary.build(ds)instead of the deprecatedtermdb.buildDictionary(ds). - Added an API-based dictionary-term sample annotation query path in
termdb.matrix.jsthat callstermdb.dictionary.get().
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| shared/types/src/dataset.ts | Adds DictApi and updates Termdb.dictionary typing to include build/get. |
| server/src/termdb.matrix.js | Routes dictionary-term queries to a dataset-supplied dictionary.get() and adds an API-backed sample reshaping function. |
| server/src/mds3.init.js | Switches dictionary initialization to use termdb.dictionary.build(ds) when present. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| build?: (ds: any) => void | ||
| // gets dictionary term data | ||
| get?: (twLst: any) => void |
There was a problem hiding this comment.
DictApi.build/DictApi.get are typed as returning void, but the server code awaits these methods and (for get) expects a structured return (e.g. { samples, byTermId }). Update these typings to reflect the actual async return types so TS catches contract mismatches for datasets implementing termdb.dictionary.build/get.
| build?: (ds: any) => void | |
| // gets dictionary term data | |
| get?: (twLst: any) => void | |
| build?: (ds: any) => void | Promise<void> | |
| // gets dictionary term data | |
| get?: ( | |
| twLst: any | |
| ) => { samples: any; byTermId: any } | Promise<{ samples: any; byTermId: any }> |
| // dataset uses server-side sqlite db, must use this method for dictionary terms | ||
| return await getSampleData_dictionaryTerms_termdb(q, termWrappers, onlyChildren) | ||
| } | ||
| if (q.ds.cohort.termdb.dictionary?.get) { |
There was a problem hiding this comment.
This branch only checks truthiness of termdb.dictionary.get before calling it. If a dataset config accidentally sets a non-function value here, this will throw later with a less clear error. Consider validating typeof ...get === 'function' (similar to the build validation in mds3.init.js) before routing to the API-based getter.
| if (q.ds.cohort.termdb.dictionary?.get) { | |
| if (typeof q.ds.cohort.termdb.dictionary?.get === 'function') { |
| if (!v && v !== 0) { | ||
| // skip undefined values |
There was a problem hiding this comment.
The undefined-check if (!v && v !== 0) will also skip valid falsy values like false or an empty string. The v2s implementation below uses v != undefined && v != null, which preserves false/''. Align the check here to only skip null/undefined to avoid dropping legitimate annotations.
| if (!v && v !== 0) { | |
| // skip undefined values | |
| if (v == null) { | |
| // skip undefined/null values |
| const samples = {} // data.samples[] converts into this | ||
| for (const s of data.samples) { | ||
| const sampleId = q.ds.cohort.termdb.q.sampleName2id(s.sample_id) | ||
| if (!sampleId && sampleId !== 0) throw new Error('cannot find sample') |
There was a problem hiding this comment.
The thrown error 'cannot find sample' is too generic for debugging and doesn’t include the unmapped identifier. Include s.sample_id (and ideally dataset label) in the error so it’s actionable when an API response contains unexpected sample IDs.
| if (!sampleId && sampleId !== 0) throw new Error('cannot find sample') | |
| if (!sampleId && sampleId !== 0) { | |
| const dsLabel = q && q.ds && q.ds.label ? ` in dataset ${q.ds.label}` : '' | |
| throw new Error(`cannot find sample for API sample_id '${s.sample_id}'${dsLabel}`) | |
| } |
| } else if (tdb.dictionary?.build) { | ||
| // ds-supplied builder method | ||
| if (typeof tdb.dictionary.build != 'function') throw 'termdb.dictionary.build() is not a function' | ||
| await tdb.dictionary.build(ds) |
There was a problem hiding this comment.
validate_termdb() validates termdb.dictionary.build, but does not validate termdb.dictionary.get even though other code paths (e.g. termdb.matrix) may call it at runtime. Consider adding a type check for dictionary.get (when present) during validation so misconfigured datasets fail fast at startup.
| async function getSampleData_dictionaryTerms_api(q, termWrappers) { | ||
| const data = await q.ds.cohort.termdb.dictionary.get(termWrappers) | ||
| const samples = {} // data.samples[] converts into this |
There was a problem hiding this comment.
New API-based dictionary query path (getSampleData_dictionaryTerms_api) is added, but there are existing unit tests for this module (e.g. server/src/test/termdb.matrix.unit.spec.js) that don’t cover it. Add tests for the new path, especially around value handling (0/false/''), $id mapping, and sample_id -> internal id conversion.
Description
PP code changes to support the introduction of termdb.dictionary.build/get in https://github.com/stjude/sjpp/pull/1229
Checklist
Check each task that has been performed or verified to be not applicable.