`sotabencheval` is a framework-agnostic library containing a collection of deep learning benchmarks you can use to benchmark your models. It can be used in conjunction with the [sotabench](https://www.sotabench.com) service, which records results for models so the community can compare model performance on different tasks, and which acts as a continuous-integration-style service that benchmarks your repository's models on each commit.
You should read the [full documentation here](https://paperswithcode.github.io/sotabench-eval/index.html), which contains guidance on getting started and connecting to [sotabench](https://www.sotabench.com).
Integration is lightweight. For example, if you are evaluating an ImageNet model, you initialize an evaluator object and (optionally) link to the paper the model originated from so your results can be compared with the published ones:
```
from sotabencheval.image_classification import ImageNetEvaluator

evaluator = ImageNetEvaluator(
    model_name='ResNeXt-101-32x8d',
    paper_arxiv_id='1611.05431')
```
Then, for each batch of predictions your model makes on ImageNet, you pass a dictionary mapping image IDs to output predictions to the `evaluator.add` method:
```
evaluator.add(dict(zip(image_ids, batch_output)))
```
This logic just needs to live in a `sotabench.py` file (which contains whatever evaluation logic you need, e.g. loading and processing the data); sotabench will then run this file on each commit and record the results.
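To make this concrete, below is a rough sketch of what such a `sotabench.py` might look like for the ImageNet example above. The `imagenet_val_loader` and the torchvision model are placeholders for whatever data pipeline and model your repository provides, and the closing `evaluator.save()` call follows the pattern in the sotabencheval documentation; check the docs for the exact workflow your benchmark needs.

```
import torch
from torchvision.models import resnext101_32x8d
from sotabencheval.image_classification import ImageNetEvaluator

# Initialise the evaluator as shown above.
evaluator = ImageNetEvaluator(
    model_name='ResNeXt-101-32x8d',
    paper_arxiv_id='1611.05431')

# Placeholder model: swap in whichever model your repository benchmarks.
model = resnext101_32x8d(pretrained=True).eval()

with torch.no_grad():
    # `imagenet_val_loader` is a hypothetical DataLoader over the ImageNet
    # validation set that yields (image_ids, images) batches; sotabencheval
    # does not provide one, so build it with your own pipeline.
    for image_ids, images in imagenet_val_loader:
        output = model(images)
        # Map each image ID to its vector of class probabilities.
        evaluator.add(dict(zip(image_ids, output.softmax(dim=1).cpu().numpy())))

# Mark the run as finished so the results can be recorded.
evaluator.save()
```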
You can access the [full documentation here](https://paperswithcode.github.io/sotabench-eval/index.html).