30 changes: 30 additions & 0 deletions docs/inference-providers/register-as-a-provider.md
@@ -296,6 +296,36 @@ Here is an example of response:
}
```

### Automatic validation

Once a mapping is created through the API, Hugging Face performs periodic automated tests to ensure the mapped endpoint functions correctly.

Each model is tested every 6 hours by making API calls to your service. If the test is successful, the model remains active and continues to be tested periodically. However, if the test fails (e.g., your service returns an HTTP error status during an inference request), the provider will be temporarily removed from the list of active providers.

<div class="flex justify-center">
<picture>
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/automatic-validation-light.png">
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/automatic-validation-dark.png">
</picture>
</div>

A failed mapping undergoes retesting every hour. Additionally, updating the status of a model mapping triggers an immediate validation test.
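The scheduling rules above (6-hour retests for active mappings, hourly retests for failing ones, and an immediate test on status updates) can be sketched as a small state object. This is an illustrative model only, not Hugging Face's actual implementation; all names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Intervals described in the docs: active mappings are retested every
# 6 hours, failing ones every hour.
ACTIVE_RETEST = timedelta(hours=6)
FAILED_RETEST = timedelta(hours=1)

@dataclass
class MappingState:
    model_id: str
    active: bool = True
    last_test: datetime = field(default_factory=datetime.utcnow)

    def next_test_due(self) -> datetime:
        # Failing mappings are rechecked on the shorter, hourly cadence.
        interval = ACTIVE_RETEST if self.active else FAILED_RETEST
        return self.last_test + interval

    def record_result(self, success: bool, now: datetime) -> None:
        # A failed test temporarily removes the provider from the list of
        # active providers; a later successful test reinstates it. Updating
        # the mapping status would call this immediately.
        self.active = success
        self.last_test = now
```
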

The validation process checks the following:

- The Inference API is reachable, and the HTTP call succeeds.
- The output format is compatible with the Hugging Face JavaScript Inference Client.
- Latency requirements are met:
- For conversational and text models: under 5 seconds (time to first token in streaming mode).
- For other tasks: under 30 seconds.
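The three checks combine as shown in this minimal sketch. The task identifiers and function names here are assumptions for illustration; the real test harness is internal to Hugging Face.

```python
# Hypothetical task names standing in for "conversational and text models".
STREAMING_TASKS = {"conversational", "text-generation"}

def latency_ok(task: str, seconds: float) -> bool:
    # Conversational/text tasks: first token within 5 s (streaming mode).
    # Every other task: full response within 30 s.
    limit = 5.0 if task in STREAMING_TASKS else 30.0
    return seconds < limit

def validation_passes(http_ok: bool, output_compatible: bool,
                      task: str, seconds: float) -> bool:
    # All three checks must hold: the API is reachable and the call
    # succeeds, the output format is client-compatible, and latency
    # is within the task's limit.
    return http_ok and output_compatible and latency_ok(task, seconds)
```
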

For large language models (LLMs), additional behavioral tests are conducted:

- Tool calling support.
- Structured output support.
> **Comment on lines +322 to +325** (Member):
>
> does this mark the model as in error as well, or is it a different UI? (seems a bit heavy to mark it as Failing validation for this)

> **@SBrandeis** (Contributor, Author), Jul 7, 2025:
>
> It does not - for now it just feeds a map of booleans flagging whether or not the provider-LLM pair supports the feature.
>
> The tests are only carried out if the initial test succeeds (first token received in < 5 s).
>
> It's not displayed nor used anywhere yet.


These tests involve sending specific inference requests to the model and verifying that the responses meet the expected format.
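Per the author's comment above, these behavioral checks currently just feed a map of booleans per provider/model pair. A rough sketch of how such a check might parse responses, assuming a structured-output test that expects a JSON object back (all names hypothetical):

```python
import json

def check_structured_output(raw_response: str) -> bool:
    # The structured-output check passes if the model's reply parses as
    # a JSON object (a stand-in for "meets the expected format").
    try:
        return isinstance(json.loads(raw_response), dict)
    except json.JSONDecodeError:
        return False

def capability_map(tool_calls_seen: bool, structured_raw: str) -> dict:
    # Booleans flagging whether the provider-LLM pair supports each
    # feature; not displayed or used anywhere yet, per the thread above.
    return {
        "tool_calling": tool_calls_seen,
        "structured_output": check_structured_output(structured_raw),
    }
```
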

## 4. Billing

For routed requests (see figure below), i.e. when users authenticate via HF, our intent is that