Benchmarking system extension (evaluation measures): evaluation of non-comparable, "original" CQs #4

@dersuchendee

Description

Assess which prompting technique, and which setup (MoE or a single LLM), is best suited to operationalize this evaluation.
Take inter-annotator agreement (IAA) into account.
Elicit agreement via a prompt that follows the advocacy-inquiry model.
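
As a starting point for the inter-annotator agreement item, here is a minimal sketch of Cohen's kappa between two annotators (for example, a human and an LLM judge, or two LLM judges). The issue does not specify the agreement measure or the label set, so the choice of Cohen's kappa, the function name `cohen_kappa`, and the `"ok"`/`"bad"` labels are all assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Hypothetical helper: the issue does not fix a metric, kappa is
    only one common option for two annotators and nominal labels.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    # Both annotators used one and the same label everywhere:
    # kappa is undefined (0/0), so report full agreement by convention.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1 - expected)


# Illustrative labels only (not real annotation data): one judgment per CQ.
annotator_1 = ["ok", "ok", "bad", "bad"]
annotator_2 = ["ok", "bad", "bad", "bad"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

With more than two annotators (for instance several LLM judges), a multi-rater measure such as Fleiss' kappa or Krippendorff's alpha would likely be a better fit than pairwise kappa.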
