feat: enhance logging for sample intersection to assist debugging "No gene exist." errors #24

KaishinShaw · 2025-12-13T05:28:23Z

Description

This PR improves the observability of the data preprocessing pipeline.

Currently, if a user encounters a "No gene exist." error, it is difficult to pinpoint whether the issue stems from:

- Sample ID mismatch during intersection.
- Mismatched Gene IDs in the provided gene list.
- Overly aggressive expression filtering.

Changes

Added logging.info statements to report gene and sample counts after key steps:
- Raw phenotype loading.
- User-provided gene list loading.
- create_readydata (intersection of genotype, phenotype, and covariates).
- Zero-expression filtering.
- Final gene list and expression threshold filtering.
Added a logging.warning if the dataset becomes empty immediately after intersection.
Added a logging.error if the final gene count is zero.

These changes allow users to easily debug data input issues by checking the log output.
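For readers who want a feel for the pattern, here is a minimal, self-contained sketch of count logging around intersection and filtering. It is not the code in this PR: the function names `intersect_samples` and `filter_genes`, the dict-of-lists expression layout, and the `min_mean` threshold are all hypothetical and chosen only for illustration.

```python
import logging

logger = logging.getLogger("preprocess_sketch")  # hypothetical logger name


def intersect_samples(geno_ids, pheno_ids, covar_ids):
    """Return the sample IDs shared by all three inputs, logging the counts."""
    geno, pheno, covar = set(geno_ids), set(pheno_ids), set(covar_ids)
    shared = geno & pheno & covar
    logger.info(
        "Samples: genotype=%d, phenotype=%d, covariates=%d, shared=%d",
        len(geno), len(pheno), len(covar), len(shared),
    )
    if not shared:
        # An empty intersection usually points to a sample ID mismatch.
        logger.warning("No samples left after intersection; check sample IDs.")
    return shared


def filter_genes(expr, gene_list, min_mean=0.0):
    """Filter genes step by step, logging the gene count after each step.

    `expr` maps gene ID -> list of expression values (assumed layout).
    """
    logger.info("Genes loaded: %d", len(expr))

    # Step 1: drop genes whose expression is zero in every sample.
    nonzero = {g: v for g, v in expr.items() if any(x != 0 for x in v)}
    logger.info("Genes after zero-expression filter: %d", len(nonzero))

    # Step 2: keep only genes named in the user-provided list.
    wanted = set(gene_list)
    listed = {g: v for g, v in nonzero.items() if g in wanted}
    logger.info("Genes after gene-list filter: %d", len(listed))

    # Step 3: keep genes whose mean expression exceeds the threshold.
    kept = {g: v for g, v in listed.items() if sum(v) / len(v) > min_mean}
    logger.info("Genes after expression threshold: %d", len(kept))

    if not kept:
        logger.error("Final gene count is zero; see the counts logged above.")
    return kept
```

With `logging.basicConfig(level=logging.INFO)` set, the step at which a count drops to zero indicates which of the three causes listed in the description is the likely culprit.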

zixuanzhang · 2025-12-13T15:36:55Z

Thank you for bringing this to our attention and making the edits accordingly! We are working on improving the software and will merge these changes in a future release!

quattro · 2025-12-15T16:55:47Z

Wow, thank you so much. As @zixuanzhang mentioned, we have an updated version on another branch that we're hoping to release soon, and it also includes the features you developed here.

Hoping to get them out before the holidays.

feat: add verbose logging for data loading and filtering steps

0f60bd0

