Re-ranking capability is already available with a default language model that produces normalized (0..1) scores that are consistent across a multi-algorithm candidate pool.
This task is to design a straightforward evaluation comparing the performance of our default re-ranking language model (LM) against one or more alternative LMs (e.g. a medically trained LM) on several validation datasets; a sketch of one such comparison follows the list below. Decision points:
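For reference, a minimal sketch of what this interface implies (the names `Candidate` and `rerank` are hypothetical, not the actual implementation):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """A candidate drawn from any of the retrieval algorithms."""
    doc_id: str
    source_algorithm: str  # e.g. "bm25", "dense" (illustrative names)
    raw_score: float       # algorithm-specific, not comparable across sources

def rerank(query: str, candidates: list[Candidate], lm_score) -> list[tuple[Candidate, float]]:
    """Score every candidate with the LM and sort descending.

    `lm_score` is assumed to return a normalized relevance score in [0, 1],
    which is what makes scores comparable across the multi-algorithm pool.
    """
    scored = [(c, lm_score(query, c.doc_id)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```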
- Is our current model good enough that we should make re-ranking available to all users?
- Do users need the option to enable/disable re-ranking, or can it always be enabled?
- Does a medically trained model give better results?
- Is one LM for all projects sufficient to start? Or do different projects require different LMs for re-ranking to be useful?
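One way to make the comparison concrete: run each candidate LM over the same validation datasets and compare a standard ranking metric such as NDCG@k per (model, dataset) cell. A minimal sketch, assuming each dataset provides queries, candidate pools, and graded relevance labels (all function and variable names here are illustrative):

```python
import math

def ndcg_at_k(ranked_ids: list[str], relevance: dict[str, int], k: int = 10) -> float:
    """NDCG@k (linear-gain variant): DCG normalized by the ideal ordering."""
    def dcg(ids):
        return sum(relevance.get(doc_id, 0) / math.log2(rank + 2)
                   for rank, doc_id in enumerate(ids[:k]))
    ideal = sorted(relevance, key=relevance.get, reverse=True)
    ideal_dcg = dcg(ideal)
    return dcg(ranked_ids) / ideal_dcg if ideal_dcg > 0 else 0.0

def evaluate(models: dict, datasets: dict) -> dict:
    """Mean NDCG@10 for every (model, dataset) combination.

    `models` maps a model name to a rerank function
    (query, candidates) -> ranked doc ids; `datasets` maps a dataset
    name to a non-empty list of (query, candidates, relevance) triples.
    """
    results = {}
    for model_name, rerank_fn in models.items():
        for ds_name, examples in datasets.items():
            scores = [ndcg_at_k(rerank_fn(q, cands), labels)
                      for q, cands, labels in examples]
            results[(model_name, ds_name)] = sum(scores) / len(scores)
    return results
```

A per-dataset breakdown like this speaks directly to the last decision point: if the medically trained LM only wins on medical datasets, that argues for per-project model selection rather than one LM for all projects.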