scripts to calculate DWPC directly #7

Open
lagillenwater wants to merge 4 commits into greenelab:main from lagillenwater:pr-direct-dwpc

@lagillenwater
Collaborator

This PR focuses on calculating DWPC scores for metapaths after null generation.

  • Updated production pipeline runner to include direct DWPC after null generation

  • Added direct DWPC execution script

  • Fixed permutation validation to iterate existing

  • Updated README and pyproject.toml to include instructions for DWPC calculation tasks

Contributor

Copilot AI left a comment

Pull request overview

Adds a direct, local DWPC computation path (via hetmatpy) intended to run after null dataset generation, reducing reliance on Docker/API lookups and enabling caching/parallelism.
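
For readers unfamiliar with the quantity being computed, here is a minimal NumPy sketch of a degree-weighted path count for a metapath whose node types are all distinct. In that case every walk is a path, so the score reduces to a product of degree-weighted adjacency matrices. This is not the code in this PR and does not use the hetmatpy API; the function names, the toy matrices, and the damping value of 0.5 are assumptions made for illustration, and metapaths that repeat a node type need extra handling that is not shown.

```python
import numpy as np


def degree_weight(adjacency, damping=0.5):
    """Scale each edge by (source degree * target degree) ** -damping."""
    adjacency = np.asarray(adjacency, dtype=float)
    row_degree = adjacency.sum(axis=1)
    col_degree = adjacency.sum(axis=0)
    # Nodes with degree zero have no edges, so their scale factor stays 0.
    row_scale = np.zeros_like(row_degree)
    col_scale = np.zeros_like(col_degree)
    row_scale[row_degree > 0] = row_degree[row_degree > 0] ** -damping
    col_scale[col_degree > 0] = col_degree[col_degree > 0] ** -damping
    return row_scale[:, None] * adjacency * col_scale[None, :]


def dwpc_distinct_metanodes(adjacencies, damping=0.5):
    """Multiply degree-weighted adjacency matrices along a metapath.

    Only valid when no node type repeats in the metapath (assumption).
    """
    result = degree_weight(adjacencies[0], damping)
    for adjacency in adjacencies[1:]:
        result = result @ degree_weight(adjacency, damping)
    return result


# Hypothetical toy graph: 2 compounds, 2 genes, 1 disease.
compound_gene = np.array([[1, 1], [0, 1]])
gene_disease = np.array([[1], [1]])

scores = dwpc_distinct_metanodes([compound_gene, gene_disease], damping=0.5)
path_counts = dwpc_distinct_metanodes([compound_gene, gene_disease], damping=0.0)
```

With damping set to 0 the result is the plain path count, which is a quick sanity check on any implementation.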

Changes:

  • Introduces a new src/dwpc_direct.py module to compute/cache DWPC matrices and extract DWPC values for pair lists.
  • Adds scripts/compute_dwpc_direct.py to run the direct DWPC computation across real/permuted/random datasets.
  • Updates pipeline/tasking/docs and tweaks permutation validation output formatting.
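
The pairwise extraction step mentioned above can be sketched as a vectorized lookup into an already computed matrix. This is a hedged illustration only: `extract_pair_values`, the identifiers, and the toy matrix are hypothetical and are not taken from src/dwpc_direct.py.

```python
import numpy as np


def extract_pair_values(matrix, row_ids, col_ids, pairs):
    """Return matrix[source, target] for each (source, target) pair."""
    # Map node identifiers to matrix positions once, then index in bulk.
    row_index = {identifier: i for i, identifier in enumerate(row_ids)}
    col_index = {identifier: j for j, identifier in enumerate(col_ids)}
    rows = np.array([row_index[source] for source, _ in pairs], dtype=int)
    cols = np.array([col_index[target] for _, target in pairs], dtype=int)
    return np.asarray(matrix)[rows, cols]


# Hypothetical cached matrix with 2 source nodes and 3 target nodes.
matrix = np.array([[0.0, 0.5, 0.1],
                   [0.2, 0.0, 0.7]])
row_ids = ["G1", "G2"]
col_ids = ["D1", "D2", "D3"]
pairs = [("G1", "D2"), ("G2", "D3"), ("G2", "D1")]

values = extract_pair_values(matrix, row_ids, col_ids, pairs)
```

Computing the matrix once per metapath and indexing it for each pair list is what makes caching worthwhile: the real, permuted, and random datasets can each reuse their own stored matrix instead of recomputing it per pair.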

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| src/dwpc_direct.py | New DWPC computation/caching module + helpers for pairwise extraction and parallel runs. |
| scripts/compute_dwpc_direct.py | New executable script (jupytext export) that precomputes matrices and writes per-dataset DWPC outputs. |
| scripts/pipeline_production.py | Alters subprocess invocation for pipeline steps. |
| scripts/permutation_null_datasets.py | Minor formatting changes in permutation validation output. |
| pyproject.toml | Adds/adjusts Poe tasks and optional dev dependencies. |
| README.md | Updates workflow/task documentation. |

Comments suppressed due to low confidence (1)

pyproject.toml:60

  • pipeline-publication is referenced in README.md but is not defined in the Poe tasks anymore (only pipeline-production / pipeline-null exist). Consider re-adding a pipeline-publication task (it looks like scripts/pipeline_publication.py still exists) or removing the README reference to avoid broken docs.
# Grouped tasks
pipeline-production = "python scripts/pipeline_production.py"
pipeline-null = ["gen-permutation", "gen-random"]


lagillenwater and others added 3 commits March 30, 2026 15:17
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…laps, and maintain consistent data structures

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@lagillenwater lagillenwater requested a review from emma2207 March 30, 2026 21:27
@lagillenwater
Collaborator Author

Hi @emma2207, I realized that I had an existing, unmerged PR to address. Let me know if you have any questions or if anything does not make sense.

@emma2207 emma2207 left a comment

🚀 I love to see you wrote a class with functions!

3 participants