Add contamination and relatedness QC checks #13
iamh2o wants to merge 4 commits into codex/refactor-script-directives-to-shell
Codex Review: Here are some suggestions.
    samples_df = pd.read_csv(samples_path, sep="\t", dtype=str)

    if "patient" not in samples_df.columns:
        raise ValueError(
            "The samples sheet must contain a 'patient' column to assess relatedness."
        )
[P1] Avoid hard requirement for patient column in samples sheet
The new `attach_patient_metadata` function aborts with `raise ValueError(...)` when the samples sheet lacks a `patient` column. The schema in `workflow/schemas/samples.schema.yaml` still treats `patient` as optional, and the function already implements a sensible fallback: it logs a warning and defaults missing patients to the sample name. Because the default config now activates relatedness checks, existing projects that omit the optional `patient` column will fail before reaching that fallback logic. Consider removing the hard check and letting the existing warning/default behaviour handle this case so that backward-compatible sample sheets continue to run.
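A minimal sketch of the suggested behaviour, for illustration only. It is not the code in this PR: the helper name `fill_missing_patients` and the `sample_name` column are hypothetical, since the comment above does not show how the existing fallback identifies the sample name.

```python
import logging

import pandas as pd

logger = logging.getLogger(__name__)


def fill_missing_patients(
    samples_df: pd.DataFrame, sample_col: str = "sample_name"
) -> pd.DataFrame:
    """Return a copy in which every row has a patient value.

    Hypothetical helper: `sample_col` is an assumed column name,
    not one confirmed by the PR.
    """
    out = samples_df.copy()

    # No 'patient' column at all: warn and fall back instead of raising.
    if "patient" not in out.columns:
        logger.warning(
            "Samples sheet has no 'patient' column; defaulting to '%s'.",
            sample_col,
        )
        out["patient"] = out[sample_col]
        return out

    # Column exists but some values are missing or blank.
    missing = out["patient"].isna() | out["patient"].astype(str).str.strip().eq("")
    if missing.any():
        logger.warning(
            "%d sample(s) lack a patient; defaulting to '%s'.",
            int(missing.sum()),
            sample_col,
        )
        out.loc[missing, "patient"] = out.loc[missing, sample_col]
    return out
```

With this shape, a sheet that omits `patient` yields one patient per sample, so relatedness checks would treat every sample as a distinct individual rather than failing at load time.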
https://chatgpt.com/codex/tasks/task_e_68cc900403e48331b0692e842405ab9d