Commit d1dd75c

Update evaluation README with metrics and API usage details, and add 'Contributing' section
1 parent 3c9a1c9 commit d1dd75c

1 file changed: +10 -7 lines changed

evaluation/README.md

Lines changed: 10 additions & 7 deletions
@@ -2,23 +2,23 @@

 ## `evals`: LLM evaluations to test and improve model outputs

-LLM evals test a prompt with a set of test data by scoring each item in the data set
-
-To test Balancer's structured text extraction of medication rules, `evals` computes:
+### Metrics

 [Extractiveness](https://huggingface.co/docs/lighteval/en/metric-list#automatic-metrics-for-generative-tasks):

+Natural Language Generation Performance:
+
 * Extractiveness Coverage:
 - Percentage of words in the summary that are part of an extractive fragment with the article
 * Extractiveness Density:
 - Average length of the extractive fragment to which each word in the summary belongs
 * Extractiveness Compression:
 - Word ratio between the article and the summary

-API usage:
+API Performance:

-* Token usage (input/output)
-* Estimated cost in USD
+* Token Usage (input/output)
+* Estimated Cost in USD
 * Duration (in seconds)

 ### Test Data
@@ -152,4 +152,7 @@ for i, metric in enumerate(all_metrics):

 plt.tight_layout()
 plt.show()
-```
+
+```
+
+### Contributing
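
For reviewers unfamiliar with the three extractiveness metrics named in the README hunk above, here is a minimal sketch of how they could be computed. It assumes the greedy extractive-fragment definitions from Grusky et al. (2018), which match the bullet descriptions, and plain whitespace tokenization. The function names `extractive_fragments` and `extractiveness` are hypothetical and are not taken from `evals` or lighteval.

```python
from typing import Dict, List


def extractive_fragments(article: List[str], summary: List[str]) -> List[List[str]]:
    """Greedily collect the longest token runs shared by summary and article."""
    fragments: List[List[str]] = []
    i = 0
    while i < len(summary):
        best = 0  # length of the longest article match starting at summary[i]
        j = 0
        while j < len(article):
            if summary[i] == article[j]:
                k = 0
                while (
                    i + k < len(summary)
                    and j + k < len(article)
                    and summary[i + k] == article[j + k]
                ):
                    k += 1
                best = max(best, k)
                j += k
            else:
                j += 1
        if best > 0:
            fragments.append(summary[i : i + best])
            i += best
        else:
            i += 1  # token does not appear in the article
    return fragments


def extractiveness(article_text: str, summary_text: str) -> Dict[str, float]:
    """Return coverage, density, and compression for one article/summary pair."""
    # Assumption: whitespace tokenization; a real pipeline may tokenize differently.
    article = article_text.split()
    summary = summary_text.split()
    if not summary:
        raise ValueError("summary must contain at least one token")
    lengths = [len(f) for f in extractive_fragments(article, summary)]
    return {
        # Share of summary words that sit inside an extractive fragment
        "coverage": sum(lengths) / len(summary),
        # Average fragment length seen by each summary word
        "density": sum(n * n for n in lengths) / len(summary),
        # Word ratio between the article and the summary
        "compression": len(article) / len(summary),
    }


if __name__ == "__main__":
    print(extractiveness("the cat sat on the mat", "the cat on the mat"))
```

In the usage example, the summary splits into two fragments ("the cat" and "on the mat"), so every summary word is covered and the density is the length-weighted average of fragment lengths 2 and 3.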
