Support termdb.dictionary.build/get in PP code #4291

Merged
xzhou82 merged 2 commits into master from mmrf_scrna.overlay on Mar 12, 2026

@gavrielm
Collaborator

Description

PP code changes to support the introduction of termdb.dictionary.build/get in https://github.com/stjude/sjpp/pull/1229

Checklist

Check each task that has been performed or verified to be not applicable.

  • Tests: Added and/or passed unit and integration tests, or N/A
  • Todos: Commented or documented, or N/A
  • Notable Changes: updated release.txt, prefixed a commit message with "fix:" or "feat:", added to an internal tracking document, or N/A
  • Rust: Checked to see whether Rust needs to be re-compiled because of this PR, or N/A

@gavrielm gavrielm requested review from josephpowi and xzhou82 March 10, 2026 16:09
@gavrielm gavrielm force-pushed the mmrf_scrna.overlay branch from d72246d to ced508f Compare March 12, 2026 01:32
@gavrielm gavrielm marked this pull request as ready for review March 12, 2026 01:41
Copilot AI review requested due to automatic review settings March 12, 2026 01:41
Contributor

Copilot AI left a comment

Pull request overview

Updates the ProteinPaint (PP) server and shared dataset typings to support the new dataset contract termdb.dictionary.build/get (introduced in sjpp). This lets API-based datasets build a dictionary and serve sample annotations for dictionary terms through a dataset-supplied getter.

Changes:

  • Added DictApi typing and extended Termdb.dictionary typing to include build/get.
  • Updated validate_termdb() to call termdb.dictionary.build(ds) instead of the deprecated termdb.buildDictionary(ds).
  • Added an API-based dictionary-term sample annotation query path in termdb.matrix.js that calls termdb.dictionary.get().

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File                          Description
shared/types/src/dataset.ts   Adds DictApi and updates Termdb.dictionary typing to include build/get.
server/src/termdb.matrix.js   Routes dictionary-term queries to a dataset-supplied dictionary.get() and adds an API-backed sample reshaping function.
server/src/mds3.init.js       Switches dictionary initialization to use termdb.dictionary.build(ds) when present.

Comment on lines +170 to +172
build?: (ds: any) => void
// gets dictionary term data
get?: (twLst: any) => void

Copilot AI Mar 12, 2026

DictApi.build/DictApi.get are typed as returning void, but the server code awaits these methods and (for get) expects a structured return (e.g. { samples, byTermId }). Update these typings to reflect the actual async return types so TS catches contract mismatches for datasets implementing termdb.dictionary.build/get.

Suggested change
- build?: (ds: any) => void
- // gets dictionary term data
- get?: (twLst: any) => void
+ build?: (ds: any) => void | Promise<void>
+ // gets dictionary term data
+ get?: (
+   twLst: any
+ ) => { samples: any; byTermId: any } | Promise<{ samples: any; byTermId: any }>

Copilot uses AI. Check for mistakes.
// dataset uses server-side sqlite db, must use this method for dictionary terms
return await getSampleData_dictionaryTerms_termdb(q, termWrappers, onlyChildren)
}
if (q.ds.cohort.termdb.dictionary?.get) {

Copilot AI Mar 12, 2026

This branch only checks truthiness of termdb.dictionary.get before calling it. If a dataset config accidentally sets a non-function value here, this will throw later with a less clear error. Consider validating typeof ...get === 'function' (similar to the build validation in mds3.init.js) before routing to the API-based getter.

Suggested change
- if (q.ds.cohort.termdb.dictionary?.get) {
+ if (typeof q.ds.cohort.termdb.dictionary?.get === 'function') {
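
To illustrate the difference between the two guards, here is a minimal sketch. The dataset objects (badTermdb, goodTermdb, noDict) and the helper names are hypothetical and exist only for this comparison; they are not taken from the PR.

```javascript
// Hypothetical termdb configs, for illustration only.
// badTermdb mimics a misconfiguration where `get` is a string instead of a function.
const badTermdb = { dictionary: { get: 'https://example.org/dict' } }
const goodTermdb = { dictionary: { get: twLst => ({ samples: [], byTermId: {} }) } }
const noDict = {}

// Truthiness-only guard, mirroring the original line.
const passesTruthy = tdb => Boolean(tdb.dictionary?.get)
// typeof guard, mirroring the suggested line.
const passesTypeof = tdb => typeof tdb.dictionary?.get === 'function'

const configs = [badTermdb, goodTermdb, noDict]
const truthyResults = configs.map(passesTruthy) // [true, true, false]
const typeofResults = configs.map(passesTypeof) // [false, true, false]
console.log({ truthyResults, typeofResults })
```

With the typeof guard, a misconfigured value no longer reaches the call site, but it also falls through to whatever branch follows without any error of its own.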

Comment on lines +693 to +694
if (!v && v !== 0) {
// skip undefined values

Copilot AI Mar 12, 2026

The undefined-check if (!v && v !== 0) will also skip valid falsy values like false or an empty string. The v2s implementation below uses v != undefined && v != null, which preserves false/''. Align the check here to only skip null/undefined to avoid dropping legitimate annotations.

Suggested change
- if (!v && v !== 0) {
-   // skip undefined values
+ if (v == null) {
+   // skip undefined/null values
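
The behavior of the two checks can be compared directly. This is a small standalone sketch; the function names skipsOriginal and skipsSuggested and the candidate values are made up for the comparison.

```javascript
// Predicate from the original line: true means the value is skipped.
const skipsOriginal = v => !v && v !== 0
// Predicate from the suggestion: only null and undefined are skipped.
const skipsSuggested = v => v == null

const candidates = [0, false, '', null, undefined, NaN, 'abc']
const original = candidates.map(skipsOriginal)
const suggested = candidates.map(skipsSuggested)

// original:  [false, true, true, true, true, true, false]
// suggested: [false, false, false, true, true, false, false]
console.log({ original, suggested })
```

Note that NaN is skipped by the original check but kept by v == null, so if NaN can appear in API responses it may need separate handling.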

const samples = {} // data.samples[] converts into this
for (const s of data.samples) {
const sampleId = q.ds.cohort.termdb.q.sampleName2id(s.sample_id)
if (!sampleId && sampleId !== 0) throw new Error('cannot find sample')

Copilot AI Mar 12, 2026

The thrown error 'cannot find sample' is too generic for debugging and doesn’t include the unmapped identifier. Include s.sample_id (and ideally dataset label) in the error so it’s actionable when an API response contains unexpected sample IDs.

Suggested change
- if (!sampleId && sampleId !== 0) throw new Error('cannot find sample')
+ if (!sampleId && sampleId !== 0) {
+   const dsLabel = q && q.ds && q.ds.label ? ` in dataset ${q.ds.label}` : ''
+   throw new Error(`cannot find sample for API sample_id '${s.sample_id}'${dsLabel}`)
+ }

Comment on lines +299 to +302
} else if (tdb.dictionary?.build) {
// ds-supplied builder method
if (typeof tdb.dictionary.build != 'function') throw 'termdb.dictionary.build() is not a function'
await tdb.dictionary.build(ds)

Copilot AI Mar 12, 2026

validate_termdb() validates termdb.dictionary.build, but does not validate termdb.dictionary.get even though other code paths (e.g. termdb.matrix) may call it at runtime. Consider adding a type check for dictionary.get (when present) during validation so misconfigured datasets fail fast at startup.
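
One possible shape for such a startup check is sketched below. The helper validateDictionaryApi is hypothetical and is not part of the PR or of validate_termdb(); it only assumes that build and get are optional properties of termdb.dictionary.

```javascript
// Hypothetical helper: fail fast if an optional dictionary method
// is present but is not callable.
function validateDictionaryApi(dictionary) {
  if (!dictionary) return // no dictionary config, nothing to check
  for (const name of ['build', 'get']) {
    const method = dictionary[name]
    // absent (undefined/null) is allowed; anything else must be a function
    if (method != null && typeof method !== 'function') {
      throw new Error(`termdb.dictionary.${name} is not a function`)
    }
  }
}

// Usage: a well-formed config passes, a misconfigured one throws.
validateDictionaryApi({ build: ds => {}, get: twLst => ({ samples: [], byTermId: {} }) })
let startupError = null
try {
  validateDictionaryApi({ get: 'not a function' })
} catch (e) {
  startupError = e.message
}
console.log(startupError) // termdb.dictionary.get is not a function
```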

Comment on lines +681 to +683
async function getSampleData_dictionaryTerms_api(q, termWrappers) {
const data = await q.ds.cohort.termdb.dictionary.get(termWrappers)
const samples = {} // data.samples[] converts into this

Copilot AI Mar 12, 2026

New API-based dictionary query path (getSampleData_dictionaryTerms_api) is added, but there are existing unit tests for this module (e.g. server/src/test/termdb.matrix.unit.spec.js) that don’t cover it. Add tests for the new path, especially around value handling (0/false/''), $id mapping, and sample_id -> internal id conversion.
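
As a rough sketch of what such tests could exercise, the snippet below uses a simplified, synchronous stand-in for the reshaping step. Everything here is an assumption for illustration: the function reshapeApiSamples, the response shape (each sample row keyed by term id), and the output shape ({ key, value } stored under tw.$id) are not confirmed by the PR, and the real getSampleData_dictionaryTerms_api may differ.

```javascript
// Hypothetical stand-in for the reshaping step; NOT the PR's implementation.
// Assumes each API sample row carries term values keyed by term id.
function reshapeApiSamples(data, termWrappers, sampleName2id) {
  const samples = {}
  for (const s of data.samples) {
    const sampleId = sampleName2id(s.sample_id)
    if (sampleId == null) {
      throw new Error(`cannot find sample for API sample_id '${s.sample_id}'`)
    }
    const row = { sample: sampleId }
    for (const tw of termWrappers) {
      const v = s[tw.term.id]
      if (v == null) continue // skip only undefined/null; keep 0, false, ''
      row[tw.$id] = { key: v, value: v }
    }
    samples[sampleId] = row
  }
  return samples
}

// Made-up fixtures covering 0 / false / '' values, $id mapping, and id conversion.
const name2id = new Map([['S1', 0], ['S2', 1]])
const tws = [
  { $id: 'a', term: { id: 'age' } },
  { $id: 'b', term: { id: 'relapsed' } },
  { $id: 'c', term: { id: 'note' } }
]
const apiData = {
  samples: [
    { sample_id: 'S1', age: 0, relapsed: false, note: '' },
    { sample_id: 'S2', age: 61, relapsed: null }
  ]
}
const reshaped = reshapeApiSamples(apiData, tws, name => name2id.get(name))
console.log(JSON.stringify(reshaped))
```

A real unit test would call the exported function with a mocked dictionary.get() rather than this stand-in.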

@xzhou82 xzhou82 merged commit 9f2e4c4 into master Mar 12, 2026
5 checks passed
@xzhou82 xzhou82 deleted the mmrf_scrna.overlay branch March 12, 2026 01:46

3 participants