Commit 21e9b48


tests: Add case for /train_supervised

Due to the small size of the dataset used for test purposes, training on MedCAT fails intermittently. After the dataset is split into training and testing sets, the training set may end up empty. Observation shows that this happens ~40% of the time for our test dataset and the default test size of 0.2.

With that in mind, we rerun the flaky test up to 6 times before failing, based on the following calculation, which treats each run as failing independently:

    P(failure) = ~0.4
    P(n failures) = P(failure) x P(failure) x ... x P(failure) = 0.4^n

If we want to keep the probability of failure below 0.01:

    0.4^n < 0.01
    => log(0.4^n) < log(0.01)
    => n x log(0.4) < log(0.01)
    => n > log(0.01) / log(0.4)    (the inequality flips because log(0.4) < 0)
    => n > 5.03

Signed-off-by: Phoevos Kalemkeris <[email protected]>
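The retry count from the calculation above can be checked numerically. The following is a minimal sketch, not part of the commit: the helper name `min_attempts` is hypothetical, and it assumes each run fails independently with the same probability, as the commit message's calculation does.

```python
import math


def min_attempts(p_fail: float, max_overall_fail: float) -> int:
    """Return the smallest n such that p_fail ** n < max_overall_fail.

    Hypothetical helper for illustration; assumes independent runs.
    """
    if not (0.0 < p_fail < 1.0 and 0.0 < max_overall_fail < 1.0):
        raise ValueError("both probabilities must lie strictly between 0 and 1")
    n = 1
    # Increase n until the chance of n consecutive failures drops below the target.
    while p_fail ** n >= max_overall_fail:
        n += 1
    return n


# Closed-form bound from the commit message: n > log(0.01) / log(0.4)
bound = math.log(0.01) / math.log(0.4)
attempts = min_attempts(0.4, 0.01)

print(f"bound: n > {bound:.2f}")
print(f"attempts needed: {attempts}")
print(f"P(all {attempts} attempts fail) = {0.4 ** attempts:.6f}")
```

With the observed failure rate of ~0.4 and a target of 0.01, the loop stops at 6 attempts, matching the figure in the commit message. Five attempts would not be enough, since 0.4^5 is slightly above 0.01.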

1 parent 616ddb3, commit 21e9b48

3 files changed (+378 -331 lines)

poetry.lock
pyproject.toml
tests/integration/test_api.py

