@sfluegel05
Collaborator

Previously, the class-based weighting used the pkl-files from processed-main (untokenized). This has two disadvantages:

  1. If some instances get thrown out because they cannot be tokenized, the weights still count instances that never reach training, so the weighting will be a bit off
  2. If a user wants to provide their own tokenized dataset, they have to change the untokenized one as well
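
To illustrate the first point, here is a minimal sketch and not the code from this PR. It assumes a simple negatives-to-positives ratio as the per-class weight, which may differ from the weighting scheme actually used; `positive_class_weights` and the toy label lists are hypothetical.

```python
def positive_class_weights(label_rows):
    """Return one weight per class: (#negatives / #positives).

    Assumed weighting formula, for illustration only.
    """
    if not label_rows:
        raise ValueError("no instances to compute weights from")
    n = len(label_rows)
    n_classes = len(label_rows[0])
    weights = []
    for c in range(n_classes):
        n_pos = sum(1 for row in label_rows if row[c])
        # Fall back to a neutral weight for classes without positives.
        weights.append((n - n_pos) / n_pos if n_pos else 1.0)
    return weights


# Hypothetical data: four raw instances, two classes.
raw_labels = [[1, 0], [1, 0], [0, 1], [1, 1]]
# Suppose the last instance could not be tokenized and was dropped.
tokenized_labels = raw_labels[:3]

weights_raw = positive_class_weights(raw_labels)
weights_tokenized = positive_class_weights(tokenized_labels)
```

In this toy case the weight for the second class is 1.0 on the raw data but 2.0 on the tokenized data, so weights taken from the untokenized files would not match what the model is actually trained on.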

I also added a safeguard to avoid concatenating empty prediction lists, which would raise an error.
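
The kind of safeguard meant here can be sketched as follows. This is an illustration under assumptions, not the code from this PR: `strict_concat` is a hypothetical stand-in for a tensor concatenation that rejects empty input, and `collect_predictions` is a hypothetical helper name.

```python
def strict_concat(chunks):
    """Stand-in for a tensor concatenation that rejects empty input."""
    if not chunks:
        raise ValueError("need at least one chunk to concatenate")
    return [x for chunk in chunks for x in chunk]


def collect_predictions(batches):
    """Concatenate per-batch predictions, or return None if there are none."""
    if not batches:
        # Safeguard: skip the concatenation instead of raising.
        return None
    return strict_concat(batches)


all_preds = collect_predictions([[0.1, 0.9], [0.4]])
no_preds = collect_predictions([])
```

Callers then have to handle the `None` case explicitly, for example by skipping the metric update for that step.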

@sfluegel05 sfluegel05 merged commit 5661b64 into dev Mar 28, 2025
6 checks passed
@sfluegel05 sfluegel05 deleted the feature/weighted-bce-tokenized branch March 28, 2025 10:06