Hey, I want to reproduce the results posted on https://crfm.stanford.edu/helm/arabic/latest/. However, the leading model, LLM-X, doesn't seem to have a public API. The website states that the results are reproducible. Does the HELM benchmark also cover private models?