Commit 1794a35

Train system and general refactoring (Machine-Learning-for-Medical-Language#230)
* move examples to examples directory
* move models to models module
* create args and data modules
* add train command to cli
* update submodules tests
* wip train system refactor
* continue train_system refactor, other misc refactoring
* data processing, train system, logging
* model saving, logging, display, tests
* update ruff version
* simplify build process with newer uv version
* fix lint/format errors
* fix pre-commit error
* run pre-commit on all files in `make check`
* fix a few type issues
* simplify train system invocation from cli
* add logfile path to display
* move new tests to test directory
* move notebooks to examples
* add todos for evaluation and prediction
* fix some more type issues
* support python 3.12 and 3.13
* consistent naming for args classes
* remove todos
* remove redundant set call
* move old code to `cnlpt.legacy`, rework docs
* remove some refactoring comments
* restore legacy code
* fix REST api data preprocessing
* fix `cnlpt train` help message
* set default python version to 3.13 and test newer python versions in CI
* update README.md
* minor version bumps in lockfile
* set min torch version to 2.6
* drop python 3.13 support due to windows issue
* minor args refactoring
* fix pin memory warning
* only disable mps for tests in CI
* refactor metrics, include transformers logging in logfile
* display name of best checkpoint
* update example READMEs
* support averaging multiple selection metrics
* add some data tests
* add prediction and evaluation to CnlpTrainSystem
* close logfile in train system test
* shutdown logging after train system test, cache HF models in CI
* use close instead of shutdown
* file handler is on root logger, not train system logger
* oops
* add tokens to CnlpPredictions
* json serialization for CnlpPredictions
* ensure "None" label has id 0 for relations tasks
* overwrite predictions file if `overwrite_output_dir` is true
* add data.analysis module
* fix metric averaging
* extract relations in analysis dataframe
* fix chemprot preprocessing
* fix broken import
* fix a couple training display issues
* temporarily remove results and error analysis stuff from chemprot readme
* add a --version arg for docker builds
1 parent ac73090 commit 1794a35

File tree: 121 files changed (+7661, -2593 lines)


.github/workflows/build-and-test.yml

Lines changed: 11 additions & 3 deletions
@@ -23,7 +23,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.9", "3.10", "3.11"]
+        python-version: ["3.9", "3.10", "3.11", "3.12"]
         os: [ubuntu-latest, macos-latest, windows-latest]
 
     runs-on: ${{ matrix.os }}
@@ -32,9 +32,17 @@ jobs:
       - uses: actions/checkout@v4
       - name: Install uv
        id: setup-uv
-        uses: astral-sh/setup-uv@v3
+        uses: astral-sh/setup-uv@v6
         with:
+          version: "0.7.11"
           enable-cache: true
+      - name: Cache HF models
+        uses: actions/cache@v3
+        with:
+          path: ~/.cache/huggingface
+          key: ${{ runner.os }}-hf-${{ hashFiles('**/*.py') }}
+          restore-keys: |
+            ${{ runner.os }}-hf-
       - name: Test with pytest
        run: |
-          uv run --frozen --group test -p ${{ matrix.python-version }} pytest test/
+          uv run --frozen --group test -p ${{ matrix.python-version }} pytest -vv test/
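The HF model cache added above is keyed on a hash of every Python file, so any source change produces a new key while the `restore-keys` prefix still allows a partial fallback match. The idea behind such content-keyed caching can be sketched in Python (illustrative only; `actions/cache` computes `hashFiles` itself):

```python
import hashlib

def cache_key(os_name: str, file_contents: list[bytes]) -> str:
    # Like ${{ runner.os }}-hf-${{ hashFiles('**/*.py') }}: the key changes
    # whenever any input file changes, so an exact hit means the sources are
    # unchanged; the "<os>-hf-" prefix is what restore-keys falls back to.
    digest = hashlib.sha256()
    for content in file_contents:
        digest.update(content)
    return f"{os_name}-hf-{digest.hexdigest()[:16]}"

key = cache_key("Linux", [b"print('hello')"])
print(key.startswith("Linux-hf-"))  # True
```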

.github/workflows/lint-and-format.yml

Lines changed: 2 additions & 2 deletions
@@ -25,9 +25,9 @@ jobs:
       - uses: actions/checkout@v4
       - uses: astral-sh/ruff-action@v1
         with:
-          version: 0.7.0
+          version: 0.11.8 # same as in pyproject.toml and pre-commit hooks
           args: check
       - uses: astral-sh/ruff-action@v1
         with:
-          version: 0.7.0
+          version: 0.11.8 # same as in pyproject.toml and pre-commit hooks
           args: format --check

.pre-commit-config.yaml

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 repos:
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.7.0 # same as in pyproject.toml
+    rev: v0.11.8 # same as in pyproject.toml and CI
     hooks:
       # Run the linter.
       - id: ruff
@@ -15,4 +15,4 @@ repos:
       - id: check-yaml
       - id: debug-statements
       - id: detect-private-key
-      - id: end-of-file-fixer
\ No newline at end of file
+      - id: end-of-file-fixer

.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+3.12

.readthedocs.yaml

Lines changed: 13 additions & 23 deletions
@@ -1,31 +1,21 @@
-# .readthedocs.yaml
 # Read the Docs configuration file
 # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
 
-# Required
 version: 2
 
-# Set the version of Python and other tools you might need
-build:
-  os: ubuntu-20.04
-  tools:
-    python: "3.8"
-  # You can also specify other tool versions:
-  # nodejs: "16"
-  # rust: "1.55"
-  # golang: "1.17"
-
-# Build documentation in the docs/ directory with Sphinx
 sphinx:
   configuration: docs/conf.py
 
-# If using Sphinx, optionally build your docs in additional formats such as PDF
-# formats:
-#   - pdf
-
-# Optionally declare the Python requirements required to build your docs
-python:
-  install:
-    - method: pip
-      path: .
-    - requirements: dev-requirements.txt
+build:
+  os: ubuntu-24.04
+  tools:
+    python: "3.12"
+  jobs:
+    pre_create_environment:
+      - asdf plugin add uv
+      - asdf install uv latest
+      - asdf global uv latest
+    create_environment:
+      - uv venv "${READTHEDOCS_VIRTUALENV_PATH}"
+    install:
+      - UV_PROJECT_ENVIRONMENT="${READTHEDOCS_VIRTUALENV_PATH}" uv sync --frozen --group docs

CONTRIBUTING.md

Lines changed: 6 additions & 5 deletions
@@ -45,7 +45,9 @@ we have instructions for using [uv](https://github.com/astral-sh/uv)
 2. From the project's base directory, run:
 
    ```bash
-   uv sync --python 3.11 # 3.9 and 3.10 are also supported. uv will install dev dependencies by default.
+   # uv will use python 3.12 and install dev dependencies by default.
+   # If you prefer a different python version, use e.g. `uv sync -p 3.9`
+   uv sync
    source .venv/bin/activate # activate the virtual environment
    ```
 
@@ -56,7 +58,7 @@ we have instructions for using [uv](https://github.com/astral-sh/uv)
 2. Create a new conda environment:
 
    ```bash
-   conda create -n cnlpt python=3.11 # 3.9 and 3.10 are also supported
+   conda create -n cnlpt python=3.12 # 3.9, 3.10, and 3.11 are also supported
    conda activate cnlpt
    ```
 
@@ -111,7 +113,7 @@ The `lint-and-format` workflow should always pass if `make check` reports
 that everything is correct.
 
 The `build-and-test` workflow will run `pytest` on Linux, MacOS, and Windows,
-for each Python version this project supports (currently 3.9, 3.10, and 3.11).
+for each Python version this project supports (currently 3.9, 3.10, 3.11, and 3.12).
 
 You can see the structure of these CI runs in the
 [**Actions**](https://github.com/Machine-Learning-for-Medical-Language/cnlp_transformers/actions)
@@ -168,8 +170,7 @@ the Semantic Versioning guidelines. The key points are as follows:
 
 When the codebase is ready for release, run `python scripts/prepare_release.py <new version number>`.
 This will walk you through the last few changes that need to be made before release,
-including updating the changelog and setting the setuptools_scm fallback version,
-and will also update the lockfile and your venv with the new package version.
+including updating the changelog and the lockfile.
 
 > [!WARNING]
 > `prepare_release.py` requires uv to update the lockfile.

Makefile

Lines changed: 4 additions & 7 deletions
@@ -5,7 +5,7 @@ help:
 	@echo ' hooks - install pre-commit hooks'
 	@echo ' check - lint and format using ruff'
 	@echo ' test - run tests with pytest'
-	@echo ' docs - build the docs'
+	@echo ' docs - build the docs locally'
 	@echo ' build - build cnlp-transformers for distribution'
 
 .PHONY: hooks
@@ -16,19 +16,16 @@ hooks:
 check:
 	ruff check --fix
 	ruff format
+	pre-commit run -a
 
 .PHONY: test
 test:
 	pytest test/
 
 .PHONY: docs
 docs:
-	# this script is copied from the old build_doc_source.sh script
-	find docs -maxdepth 1 ! -name 'index.rst' -name '*.rst' -type f -exec rm -f {} +
-	rm -f transformer_objects.inv
-	sphobjinv convert zlib docs/transformer_objects.txt --quiet
-	SPHINX_APIDOC_OPTIONS=members,show-inheritance sphinx-apidoc -feT -o docs src/cnlpt
-	echo "    :noindex:" >> docs/cnlpt.rst
+	scripts/build_html_docs.sh docs/build
+	@echo "Point your browser at file://${PWD}/docs/build/html/index.html to view."
 
 .PHONY: build
 build:

README.md

Lines changed: 38 additions & 38 deletions
@@ -11,23 +11,28 @@ Primary use cases include
 This library is _not_ intended to serve as a place for clinical NLP applications to live. If you build something cool that uses transformer models that take advantage of our model definitions, the best practice is probably to rely on it as a library rather than treating it as your workspace. This library is also not intended as a deployment-ready tool for _scalable_ clinical NLP. There is a lot of interest in developing methods and tools that are smaller and can process millions of records, and this library can potentially be used for research along those line. But it will probably never be extremely optimized or shrink-wrapped for applications. However, there should be plenty of examples and useful code for people who are interested in that type of deployment.
 
 ## Install
-> [!WARNING]
-macOS support is currently experimental. We recommend using python3.10 for macOS installations.
 
-> [!NOTE]
-When installing the library's dependencies, `pip` will probably install
-PyTorch with CUDA 10.2 support by default. If you would like to run the
-library in CPU-only mode or with a newer version of CUDA, [install PyTorch
-to your desired specifications](https://pytorch.org/get-started/locally/)
-in your virtual environment first before installing `cnlp-transformers`.
+> [!IMPORTANT]
+> When installing the library's dependencies, PyTorch will probably be installed
+> with CUDA 12.6 support by default on linux, and without CUDA support on other platforms.
+> If you would like to run the library in CPU-only mode or with a specific version of CUDA,
+> [install PyTorch to your desired specifications](https://pytorch.org/get-started/locally/)
+> in your virtual environment first before installing `cnlp-transformers`.
+> [See here](https://docs.astral.sh/uv/guides/integration/pytorch/#the-uv-pip-interface) if
+> using uv.
 
 ### Static installation
 
 If you are installing just to fine-tune or run the REST APIs,
-you can install without cloning:
+you can install without cloning using [uv](https://docs.astral.sh/uv/):
+
+```sh
+uv pip install cnlp-transformers
+```
+
+Or with pip:
 
 ```sh
-# Note: if needed, install PyTorch first (see above)
 pip install cnlp-transformers
 ```
@@ -110,18 +115,18 @@ We provided the following step-by-step examples how to finetune in clinical NLP
 
 ### Fine-tuning options
 
-Run ```python -m cnlpt.train_system -h``` to see all the available options. In addition to inherited Huggingface Transformers options, there are options to do the following:
+Run `cnlpt train -h` to see all the available options. In addition to inherited Huggingface Transformers options, there are options to do the following:
 
-* Select different models: ```--model hier``` uses a hierarchical transformer layer on top of a specified encoder model. We recommend using a very small encoder: ```--encoder microsoft/xtremedistil-l6-h256-uncased``` so that the full model fits into memory.
+* Select different models: `--model hier` uses a hierarchical transformer layer on top of a specified encoder model. We recommend using a very small encoder: `--encoder microsoft/xtremedistil-l6-h256-uncased` so that the full model fits into memory.
 * Run simple baselines (use ``--model cnn|lstm --tokenizer_name roberta-base`` -- since there is no HF model then you must specify the tokenizer explicitly)
-* Use a different layer's CLS token for the classification (e.g., ```--layer 10```)
-* Probabilistically freeze weights of the encoder (leaving classifier weights all unfrozen) (```--freeze``` alone freezes all encoder weights, ```--freeze <float>``` when given a parameter between 0 and 1, freezes that percentage of encoder weights)
-* Classify based on a token embedding instead of the CLS embedding (```--token``` -- applies to the event/entity classification setting only, and requires the input to have xml-style tags (`<e>`, `</e>`) around the tokens representing the event/entity)
-* Use class-weighted loss function (```--class_weights```)
+* Use a different layer's CLS token for the classification (e.g., `--layer 10`)
+* Probabilistically freeze weights of the encoder (leaving classifier weights all unfrozen) (`--freeze` alone freezes all encoder weights, `--freeze <float>` when given a parameter between 0 and 1, freezes that percentage of encoder weights)
+* Classify based on a token embedding instead of the CLS embedding (`--token` -- applies to the event/entity classification setting only, and requires the input to have xml-style tags (`<e>`, `</e>`) around the tokens representing the event/entity)
+* Use class-weighted loss function (`--class_weights`)
 
 ## Running REST APIs
 
-There are existing REST APIs in the ```src/cnlpt/api``` folder for a few important clinical NLP tasks:
+There are existing REST APIs in the `src/cnlpt/api` folder for a few important clinical NLP tasks:
 
 1. Negation detection
 2. Time expression tagging (spans + time classes)
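The `--freeze <float>` option in the fine-tuning list above can be illustrated with a minimal sketch. This is a hypothetical helper, not the library's actual implementation; `Param` stands in for a torch parameter.

```python
import random

class Param:
    # Stand-in for a torch.nn.Parameter; only requires_grad matters here.
    def __init__(self) -> None:
        self.requires_grad = True

def freeze_encoder(params, freeze=1.0, seed=0):
    # --freeze alone (freeze=1.0) freezes every encoder weight; a float in
    # (0, 1) freezes roughly that fraction, chosen at random per parameter.
    # Classifier-head parameters would simply not be passed in.
    rng = random.Random(seed)
    for p in params:
        if rng.random() < freeze:
            p.requires_grad = False

params = [Param() for _ in range(8)]
freeze_encoder(params, freeze=1.0)
print(all(not p.requires_grad for p in params))  # True
```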
@@ -133,7 +138,7 @@ There are existing REST APIs in the ```src/cnlpt/api``` folder for a few importa
 To demo the negation API:
 
 1. Install the `cnlp-transformers` package.
-2. Run `cnlpt_negation_rest [-p PORT]`.
+2. Run `cnlpt rest --model-type negation [-p PORT]`.
 3. Open a python console and run the following commands:
 
 #### Setup variables for negation
@@ -167,7 +172,7 @@ The model correctly classifies both nausea and anosmia as negated.
 To demo the temporal API:
 
 1. Install the `cnlp-transformers` package.
-2. Run `cnlpt_temporal_rest [-p PORT]`
+2. Run `cnlpt rest --model-type temporal [-p PORT]`
 3. Open a python console and run the following commands to test:
 
 #### Setup variables for temporal
@@ -217,20 +222,14 @@ should return:
 
 This output indicates the token spans of events and timexes, and relations between events and timexes, where the suffixes are indices into the respective arrays (e.g., TIMEX-0 in a relation refers to the 0th time expression found, which begins at token 6 and ends at token 9 -- ["March 3, 2010"])
 
-To run only the time expression or event taggers, change the run command to:
-
-```uvicorn cnlpt.api.timex_rest:app --host 0.0.0.0``` or
-
-```uvicorn cnlpt.api.event_rest:app --host 0.0.0.0```
-
-then run the same process commands as above (including the same URL). You will get similar json output, but only one of the dictionary elements (timexes or events) will be populated.
-
 ## Citing cnlp_transformers
+
 Please use the following bibtex to cite cnlp_transformers if you use it in a publication:
-```
+
+```latex
 @misc{cnlp_transformers,
   author = {CNLPT},
-    title = {Clinical {NLP} {Transformers} (cnlp\_transformers)},
+  title = {Clinical {NLP} {Transformers} (cnlp\_transformers)},
   year = {2021},
   publisher = {GitHub},
   journal = {GitHub repository},
@@ -239,14 +238,15 @@ Please use the following bibtex to cite cnlp_transformers if you use it in a pub
 ```
 
 ## Publications using cnlp_transformers
+
 Please send us any citations that used this library!
 
-1. Chen S, Guevara M, Ramirez N, Murray A, Warner JL, Aerts HJWL, et al. Natural Language Processing to Automatically Extract the Presence and Severity of Esophagitis in Notes of Patients Undergoing Radiotherapy. JCO Clin Cancer Inform. 2023 Jul;(7):e2300048.
-2. Li Y, Miller T, Bethard S, Savova G. Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information [Internet]. arXiv.org. 2024 [cited 2025 May 22]. Available from: https://arxiv.org/abs/2410.12774v1
-3. Wang L, Li Y, Miller T, Bethard S, Savova G. Two-Stage Fine-Tuning for Improved Bias and Variance for Large Pretrained Language Models. In: Rogers A, Boyd-Graber J, Okazaki N, editors. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) [Internet]. Toronto, Canada: Association for Computational Linguistics; 2023 [cited 2025 May 22]. p. 15746–61. Available from: https://aclanthology.org/2023.acl-long.877/
-4. Miller T, Bethard S, Dligach D, Savova G. End-to-end clinical temporal information extraction with multi-head attention. Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:313–9.
-5. Yoon W, Ren B, Thomas S, Kim C, Savova G, Hall MH, et al. Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction [Internet]. arXiv; 2025 [cited 2025 May 22]. Available from: http://arxiv.org/abs/2502.10388
-6. Wang L, Zipursky AR, Geva A, McMurry AJ, Mandl KD, Miller TA. A computable case definition for patients with SARS-CoV2 testing that occurred outside the hospital. JAMIA Open. 2023 Oct 1;6(3):ooad047.
-7. Bitterman DS, Goldner E, Finan S, Harris D, Durbin EB, Hochheiser H, et al. An End-to-End Natural Language Processing System for Automatically Extracting Radiation Therapy Events From Clinical Texts. Int J Radiat Oncol Biol Phys. 2023 Sep 1;117(1):262–73.
-8. McMurry AJ, Gottlieb DI, Miller TA, Jones JR, Atreja A, Crago J, et al. Cumulus: A federated EHR-based learning system powered by FHIR and AI. medRxiv. 2024 Feb 6;2024.02.02.24301940.
-9. LCD benchmark: long clinical document benchmark on mortality prediction for language models | Journal of the American Medical Informatics Association | Oxford Academic [Internet]. [cited 2025 Jan 23]. Available from: https://academic.oup.com/jamia/article-abstract/32/2/285/7909835?redirectedFrom=fulltext
+1. Chen S, Guevara M, Ramirez N, Murray A, Warner JL, Aerts HJWL, et al. Natural Language Processing to Automatically Extract the Presence and Severity of Esophagitis in Notes of Patients Undergoing Radiotherapy. JCO Clin Cancer Inform. 2023 Jul;(7):e2300048.
+2. Li Y, Miller T, Bethard S, Savova G. Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information [Internet]. arXiv.org. 2024 [cited 2025 May 22]. Available from: <https://arxiv.org/abs/2410.12774v1>
+3. Wang L, Li Y, Miller T, Bethard S, Savova G. Two-Stage Fine-Tuning for Improved Bias and Variance for Large Pretrained Language Models. In: Rogers A, Boyd-Graber J, Okazaki N, editors. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) [Internet]. Toronto, Canada: Association for Computational Linguistics; 2023 [cited 2025 May 22]. p. 15746–61. Available from: <https://aclanthology.org/2023.acl-long.877/>
+4. Miller T, Bethard S, Dligach D, Savova G. End-to-end clinical temporal information extraction with multi-head attention. Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:313–9.
+5. Yoon W, Ren B, Thomas S, Kim C, Savova G, Hall MH, et al. Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction [Internet]. arXiv; 2025 [cited 2025 May 22]. Available from: <http://arxiv.org/abs/2502.10388>
+6. Wang L, Zipursky AR, Geva A, McMurry AJ, Mandl KD, Miller TA. A computable case definition for patients with SARS-CoV2 testing that occurred outside the hospital. JAMIA Open. 2023 Oct 1;6(3):ooad047.
+7. Bitterman DS, Goldner E, Finan S, Harris D, Durbin EB, Hochheiser H, et al. An End-to-End Natural Language Processing System for Automatically Extracting Radiation Therapy Events From Clinical Texts. Int J Radiat Oncol Biol Phys. 2023 Sep 1;117(1):262–73.
+8. McMurry AJ, Gottlieb DI, Miller TA, Jones JR, Atreja A, Crago J, et al. Cumulus: A federated EHR-based learning system powered by FHIR and AI. medRxiv. 2024 Feb 6;2024.02.02.24301940.
+9. LCD benchmark: long clinical document benchmark on mortality prediction for language models | Journal of the American Medical Informatics Association | Oxford Academic [Internet]. [cited 2025 Jan 23]. Available from: <https://academic.oup.com/jamia/article-abstract/32/2/285/7909835?redirectedFrom=fulltext>
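The temporal demo earlier in this README uses an index-suffix convention: a relation argument like `TIMEX-0` points at the 0th entry of the timexes array. That lookup can be shown with a small resolver; the tuple shapes here are hypothetical, not the API's exact schema:

```python
# Hypothetical span shape: (begin_token, end_token, label).
timexes = [(6, 9, "DATE")]   # e.g. "March 3, 2010" spans tokens 6-9
events = [(2, 2, "EVENT")]

def resolve(arg: str):
    # Split "TIMEX-0" into the array name and the index into that array.
    kind, _, index = arg.rpartition("-")
    pool = timexes if kind == "TIMEX" else events
    return pool[int(index)]

print(resolve("TIMEX-0"))  # (6, 9, 'DATE')
```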

docker/build.py

Lines changed: 4 additions & 3 deletions
@@ -16,9 +16,10 @@
 ]
 
 parser = argparse.ArgumentParser()
-parser.add_argument("--model", action="append", choices=["all"] + MODELS)
+parser.add_argument("--model", action="append", choices=["all", *MODELS])
 parser.add_argument("--processor", choices=["all", "cpu", "gpu"], default="all")
 parser.add_argument("--push", action="store_true", default=False)
+parser.add_argument("--cnlpt_version", default=None)
 args = parser.parse_args()
 
 
@@ -69,7 +70,7 @@ def build_one(model: str, processor: str, *, version: str, push: bool = False) -
     else:
         build_args.append("--load")  # to load into docker locally
 
-    subprocess.run(["docker", "buildx", "build"] + build_args, check=True)
+    subprocess.run(["docker", "buildx", "build", *build_args], check=True)
 
 
 if __name__ == "__main__":
@@ -86,7 +87,7 @@ def build_one(model: str, processor: str, *, version: str, push: bool = False) -
     # Our Dockerfiles pull directly from pip, so we want to be setting the same version as we'll install.
     # We don't want to pull the version from our sibling code in this repo, because it might not be released yet,
     # but we still want to be able to push new builds of the existing releases.
-    version = get_latest_pip_version("cnlp-transformers")
+    version = args.cnlpt_version or get_latest_pip_version("cnlp-transformers")
 
     for model in models:
         for processor in processors:
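The `--cnlpt_version` change in `docker/build.py` follows a common explicit-override-or-lookup pattern: an explicit version wins, otherwise the latest published release is used. A minimal sketch, with the PyPI lookup stubbed out:

```python
import argparse

def get_latest_pip_version(package: str) -> str:
    # Stub standing in for the real PyPI lookup in docker/build.py.
    return "0.9.0"

parser = argparse.ArgumentParser()
parser.add_argument("--cnlpt_version", default=None)

# An explicit version wins; otherwise fall back to the latest release,
# so Docker images can also be rebuilt for older published versions.
args = parser.parse_args(["--cnlpt_version", "1.2.3"])
version = args.cnlpt_version or get_latest_pip_version("cnlp-transformers")
print(version)  # 1.2.3
```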

docker/compose.yaml

Lines changed: 0 additions & 1 deletion
@@ -19,4 +19,3 @@ services:
     devices:
       - capabilities: [gpu]
     entrypoint: cnlpt_negation_rest -p 8000
-
