@jhnwu3 commented Jan 1, 2026

This pull request adds two new benchmarking scripts for drug recommendation tasks using the MIMIC-IV dataset and updates the installation documentation to clarify Python version requirements and recommended package versions. The new scripts provide reproducible performance measurements for both single-threaded pandas processing and parallelized PyHealth task processing with configurable worker counts, including detailed tracking of memory usage and cache sizes.

Benchmarking scripts for drug recommendation:

  • Added examples/benchmark_perf/benchmark_pandas_drug_rec.py, a standalone pandas-based benchmark for the MIMIC-IV drug recommendation task, including cumulative visit history construction, memory tracking, and result reporting.
  • Added examples/benchmark_perf/benchmark_workers_n_drug_recommendation.py, a benchmarking script that measures PyHealth's drug recommendation task performance across multiple num_workers values. It tracks dataset/task cache sizes, peak memory usage (including child processes), and supports repeated runs for robust statistics. Results are written to CSV for analysis.
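A rough sketch of the kind of worker sweep described above: time a task for several `num_workers` values, repeat each setting, and emit CSV rows. This is not the code from the PR. `run_task`, `fake_task`, the column names, and the use of the Unix-only `resource` module for peak memory are all assumptions made for illustration; the actual script may measure memory and cache sizes differently.

```python
import csv
import io
import statistics
import sys
import time

try:
    import resource  # Unix only; unavailable on Windows
except ImportError:
    resource = None


def peak_rss_mb():
    """Approximate peak resident memory in MB for this process and its children.

    RUSAGE_CHILDREN reports the largest reaped child, not a sum over all
    children, so adding it to the parent's peak is only an estimate.
    """
    if resource is None:
        return float("nan")
    self_peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    child_peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return (self_peak + child_peak) / divisor


def benchmark(run_task, worker_counts, repeats=3):
    """Time run_task(num_workers) `repeats` times per worker count."""
    rows = []
    for n in worker_counts:
        times = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_task(n)
            times.append(time.perf_counter() - start)
        rows.append({
            "num_workers": n,
            "repeats": repeats,
            "mean_s": statistics.mean(times),
            "stdev_s": statistics.stdev(times) if repeats > 1 else 0.0,
            # High-water mark since process start, so it never decreases
            # from one worker setting to the next.
            "peak_rss_mb": peak_rss_mb(),
        })
    return rows


def to_csv(rows):
    """Serialize result rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def fake_task(num_workers):
    # Hypothetical stand-in workload; a real run would build the task
    # samples with the given number of workers.
    sum(i * i for i in range(10_000))


rows = benchmark(fake_task, worker_counts=[1, 4], repeats=2)
print(to_csv(rows))
```

Because the peak-memory figure is a process-lifetime high-water mark, a sweep that needs an independent number per setting would have to run each setting in a fresh process.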

Documentation updates:

  • Updated docs/install.rst to clarify that PyHealth 2.0 requires Python 3.12 or 3.13, reflecting a hard dependency on modern Python features. Also updated the installation instructions and notes for the recommended and legacy versions.
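For illustration, the stated version requirement could be checked before installing with something like the snippet below. It assumes the supported range is 3.12 through 3.13 inclusive; `MIN_VERSION`, `MAX_VERSION`, and `is_supported` are hypothetical names and are not part of PyHealth.

```python
import sys

# Assumed bounds, read from the requirement as stated in this PR.
MIN_VERSION = (3, 12)
MAX_VERSION = (3, 13)


def is_supported(version=None):
    """Return True if the (major, minor) version falls in the assumed range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if not is_supported():
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current} is outside the assumed supported range 3.12 to 3.13")
```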

@jhnwu3 requested a review from Logiquo January 1, 2026 18:07
@jhnwu3 merged commit a3750b0 into master Jan 1, 2026
1 check passed
3 participants