Merged
docs/source/de/guides/inference.md (1 change: 0 additions & 1 deletion)
@@ -107,7 +107,6 @@ Das Ziel von [`InferenceClient`] ist es, die einfachste Schnittstelle zum Ausfü
| | [Feature Extraction](https://huggingface.co/tasks/feature-extraction) | ✅ | [`~InferenceClient.feature_extraction`] |
| | [Fill Mask](https://huggingface.co/tasks/fill-mask) | ✅ | [`~InferenceClient.fill_mask`] |
| | [Question Answering](https://huggingface.co/tasks/question-answering) | ✅ | [`~InferenceClient.question_answering`] |
| | [Sentence Similarity](https://huggingface.co/tasks/sentence-similarity) | ✅ | [`~InferenceClient.sentence_similarity`] |
| | [Summarization](https://huggingface.co/tasks/summarization) | ✅ | [`~InferenceClient.summarization`] |
| | [Table Question Answering](https://huggingface.co/tasks/table-question-answering) | ✅ | [`~InferenceClient.table_question_answering`] |
| | [Text Classification](https://huggingface.co/tasks/text-classification) | ✅ | [`~InferenceClient.text_classification`] |
docs/source/en/guides/inference.md (1 change: 0 additions & 1 deletion)
@@ -268,7 +268,6 @@ You might wonder why using [`InferenceClient`] instead of OpenAI's client? There
| | [`~InferenceClient.feature_extraction`] | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| | [`~InferenceClient.fill_mask`] | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| | [`~InferenceClient.question_answering`] | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| | [`~InferenceClient.sentence_similarity`] | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| | [`~InferenceClient.summarization`] | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| | [`~InferenceClient.table_question_answering`] | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| | [`~InferenceClient.text_classification`] | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
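The table above compares task coverage across providers. For reference, a minimal sketch of calling [`InferenceClient`] against one of the listed tasks (the provider name and model id are illustrative, not prescribed by this change):

```python
from huggingface_hub import InferenceClient

# Illustrative provider and model id; any chat-capable model works similarly.
client = InferenceClient(provider="hf-inference")
response = client.chat_completion(
    messages=[{"role": "user", "content": "Say hello!"}],
    model="HuggingFaceH4/zephyr-7b-beta",
    max_tokens=32,
)
print(response.choices[0].message.content)
```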
docs/source/ko/guides/inference.md (57 changes: 28 additions & 29 deletions)
@@ -89,35 +89,34 @@ Hugging Face Hub에는 20만 개가 넘는 모델이 있습니다! [`InferenceCl

[`InferenceClient`]의 목표는 Hugging Face 모델에서 추론을 실행하기 위한 가장 쉬운 인터페이스를 제공하는 것입니다. 이는 가장 일반적인 작업들을 지원하는 간단한 API를 가지고 있습니다. 현재 지원되는 작업 목록은 다음과 같습니다:

| 도메인 | 작업 | 지원 여부 | 문서 |
|--------|--------------------------------|--------------|------------------------------------|
| 오디오 | [오디오 분류](https://huggingface.co/tasks/audio-classification) | ✅ | [`~InferenceClient.audio_classification`] |
| 오디오 | [오디오 투 오디오](https://huggingface.co/tasks/audio-to-audio) | ✅ | [`~InferenceClient.audio_to_audio`] |
| | [자동 음성 인식](https://huggingface.co/tasks/automatic-speech-recognition) | ✅ | [`~InferenceClient.automatic_speech_recognition`] |
| | [텍스트 투 스피치](https://huggingface.co/tasks/text-to-speech) | ✅ | [`~InferenceClient.text_to_speech`] |
| 컴퓨터 비전 | [이미지 분류](https://huggingface.co/tasks/image-classification) | ✅ | [`~InferenceClient.image_classification`] |
| | [이미지 분할](https://huggingface.co/tasks/image-segmentation) | ✅ | [`~InferenceClient.image_segmentation`] |
| | [이미지 투 이미지](https://huggingface.co/tasks/image-to-image) | ✅ | [`~InferenceClient.image_to_image`] |
| | [이미지 투 텍스트](https://huggingface.co/tasks/image-to-text) | ✅ | [`~InferenceClient.image_to_text`] |
| | [객체 탐지](https://huggingface.co/tasks/object-detection) | ✅ | [`~InferenceClient.object_detection`] |
| | [텍스트 투 이미지](https://huggingface.co/tasks/text-to-image) | ✅ | [`~InferenceClient.text_to_image`] |
| | [제로샷 이미지 분류](https://huggingface.co/tasks/zero-shot-image-classification) | ✅ | [`~InferenceClient.zero_shot_image_classification`] |
| 멀티모달 | [문서 질의 응답](https://huggingface.co/tasks/document-question-answering) | ✅ | [`~InferenceClient.document_question_answering`] |
| | [시각적 질의 응답](https://huggingface.co/tasks/visual-question-answering) | ✅ | [`~InferenceClient.visual_question_answering`] |
| 자연어 처리 | [대화형](https://huggingface.co/tasks/conversational) | ✅ | [`~InferenceClient.conversational`] |
| | [특성 추출](https://huggingface.co/tasks/feature-extraction) | ✅ | [`~InferenceClient.feature_extraction`] |
| | [마스크 채우기](https://huggingface.co/tasks/fill-mask) | ✅ | [`~InferenceClient.fill_mask`] |
| | [질의 응답](https://huggingface.co/tasks/question-answering) | ✅ | [`~InferenceClient.question_answering`] |
| | [문장 유사도](https://huggingface.co/tasks/sentence-similarity) | ✅ | [`~InferenceClient.sentence_similarity`] |
| | [요약](https://huggingface.co/tasks/summarization) | ✅ | [`~InferenceClient.summarization`] |
| | [테이블 질의 응답](https://huggingface.co/tasks/table-question-answering) | ✅ | [`~InferenceClient.table_question_answering`] |
| | [텍스트 분류](https://huggingface.co/tasks/text-classification) | ✅ | [`~InferenceClient.text_classification`] |
| | [텍스트 생성](https://huggingface.co/tasks/text-generation) | ✅ | [`~InferenceClient.text_generation`] |
| | [토큰 분류](https://huggingface.co/tasks/token-classification) | ✅ | [`~InferenceClient.token_classification`] |
| | [번역](https://huggingface.co/tasks/translation) | ✅ | [`~InferenceClient.translation`] |
| | [제로샷 분류](https://huggingface.co/tasks/zero-shot-classification) | ✅ | [`~InferenceClient.zero_shot_classification`] |
| 타블로 | [타블로 작업 분류](https://huggingface.co/tasks/tabular-classification) | ✅ | [`~InferenceClient.tabular_classification`] |
| | [타블로 회귀](https://huggingface.co/tasks/tabular-regression) | ✅ | [`~InferenceClient.tabular_regression`] |
| 도메인 | 작업 | 지원 여부 | 문서 |
| ----------- | --------------------------------------------------------------------------------- | --------- | --------------------------------------------------- |
| 오디오 | [오디오 분류](https://huggingface.co/tasks/audio-classification) | ✅ | [`~InferenceClient.audio_classification`] |
| 오디오 | [오디오 투 오디오](https://huggingface.co/tasks/audio-to-audio) | ✅ | [`~InferenceClient.audio_to_audio`] |
| | [자동 음성 인식](https://huggingface.co/tasks/automatic-speech-recognition) | ✅ | [`~InferenceClient.automatic_speech_recognition`] |
| | [텍스트 투 스피치](https://huggingface.co/tasks/text-to-speech) | ✅ | [`~InferenceClient.text_to_speech`] |
| 컴퓨터 비전 | [이미지 분류](https://huggingface.co/tasks/image-classification) | ✅ | [`~InferenceClient.image_classification`] |
| | [이미지 분할](https://huggingface.co/tasks/image-segmentation) | ✅ | [`~InferenceClient.image_segmentation`] |
| | [이미지 투 이미지](https://huggingface.co/tasks/image-to-image) | ✅ | [`~InferenceClient.image_to_image`] |
| | [이미지 투 텍스트](https://huggingface.co/tasks/image-to-text) | ✅ | [`~InferenceClient.image_to_text`] |
| | [객체 탐지](https://huggingface.co/tasks/object-detection) | ✅ | [`~InferenceClient.object_detection`] |
| | [텍스트 투 이미지](https://huggingface.co/tasks/text-to-image) | ✅ | [`~InferenceClient.text_to_image`] |
| | [제로샷 이미지 분류](https://huggingface.co/tasks/zero-shot-image-classification) | ✅ | [`~InferenceClient.zero_shot_image_classification`] |
| 멀티모달 | [문서 질의 응답](https://huggingface.co/tasks/document-question-answering) | ✅ | [`~InferenceClient.document_question_answering`] |
| | [시각적 질의 응답](https://huggingface.co/tasks/visual-question-answering) | ✅ | [`~InferenceClient.visual_question_answering`] |
| 자연어 처리 | [대화형](https://huggingface.co/tasks/conversational) | ✅ | [`~InferenceClient.conversational`] |
| | [특성 추출](https://huggingface.co/tasks/feature-extraction) | ✅ | [`~InferenceClient.feature_extraction`] |
| | [마스크 채우기](https://huggingface.co/tasks/fill-mask) | ✅ | [`~InferenceClient.fill_mask`] |
| | [질의 응답](https://huggingface.co/tasks/question-answering) | ✅ | [`~InferenceClient.question_answering`] |
| | [요약](https://huggingface.co/tasks/summarization) | ✅ | [`~InferenceClient.summarization`] |
| | [테이블 질의 응답](https://huggingface.co/tasks/table-question-answering) | ✅ | [`~InferenceClient.table_question_answering`] |
| | [텍스트 분류](https://huggingface.co/tasks/text-classification) | ✅ | [`~InferenceClient.text_classification`] |
| | [텍스트 생성](https://huggingface.co/tasks/text-generation) | ✅ | [`~InferenceClient.text_generation`] |
| | [토큰 분류](https://huggingface.co/tasks/token-classification) | ✅ | [`~InferenceClient.token_classification`] |
| | [번역](https://huggingface.co/tasks/translation) | ✅ | [`~InferenceClient.translation`] |
| | [제로샷 분류](https://huggingface.co/tasks/zero-shot-classification) | ✅ | [`~InferenceClient.zero_shot_classification`] |
| 타블로 | [타블로 작업 분류](https://huggingface.co/tasks/tabular-classification) | ✅ | [`~InferenceClient.tabular_classification`] |
| | [타블로 회귀](https://huggingface.co/tasks/tabular-regression) | ✅ | [`~InferenceClient.tabular_regression`] |

<Tip>

src/huggingface_hub/inference/_client.py (41 changes: 17 additions & 24 deletions)
@@ -35,7 +35,6 @@
import base64
import logging
import re
import time
import warnings
from typing import TYPE_CHECKING, Any, Dict, Iterable, List, Literal, Optional, Union, overload

@@ -301,8 +300,6 @@ def _inner_post(
if request_parameters.task in TASKS_EXPECTING_IMAGES and "Accept" not in request_parameters.headers:
request_parameters.headers["Accept"] = "image/png"

t0 = time.time()
timeout = self.timeout
while True:
with _open_as_binary(request_parameters.data) as data_as_binary:
try:
@@ -326,30 +323,9 @@ def _inner_post(
except HTTPError as error:
if error.response.status_code == 422 and request_parameters.task != "unknown":
msg = str(error.args[0])
if len(error.response.text) > 0:
msg += f"\n{error.response.text}\n"
msg += f"\nMake sure '{request_parameters.task}' task is supported by the model."
error.args = (msg,) + error.args[1:]
if error.response.status_code == 503:
# If Model is unavailable, either raise a TimeoutError...
if timeout is not None and time.time() - t0 > timeout:
raise InferenceTimeoutError(
f"Model not loaded on the server: {request_parameters.url}. Please retry with a higher timeout (current:"
f" {self.timeout}).",
request=error.request,
response=error.response,
) from error
# ...or wait 1s and retry
logger.info(f"Waiting for model to be loaded on the server: {error}")
time.sleep(1)
if "X-wait-for-model" not in request_parameters.headers and request_parameters.url.startswith(
INFERENCE_ENDPOINT
):
request_parameters.headers["X-wait-for-model"] = "1"
if timeout is not None:
timeout = max(self.timeout - (time.time() - t0), 1) # type: ignore
continue
raise

def audio_classification(
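With the built-in 503 wait-and-retry loop removed above, callers that still target models which may be cold can retry around the client themselves. A minimal sketch, assuming the error surfaces as a `requests` `HTTPError` (the helper name and backoff policy are illustrative):

```python
import time

from requests import HTTPError


def call_with_retry(fn, *args, max_wait: float = 60.0, **kwargs):
    """Retry an InferenceClient call on HTTP 503 (model loading) with a 1s backoff."""
    t0 = time.time()
    while True:
        try:
            return fn(*args, **kwargs)
        except HTTPError as error:
            response = getattr(error, "response", None)
            if response is not None and response.status_code == 503 and time.time() - t0 < max_wait:
                time.sleep(1)  # wait and retry, mirroring the removed built-in behavior
                continue
            raise
```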
@@ -1569,6 +1545,9 @@ def question_answering(
output = QuestionAnsweringOutputElement.parse_obj(response)
return output

@_deprecate_method(
version="0.33.0", message="Use `feature_extraction` instead and compute the sentence similarity locally."
)
def sentence_similarity(
self, sentence: str, other_sentences: List[str], *, model: Optional[str] = None
) -> List[float]:
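The deprecation message points to `feature_extraction` plus a local similarity computation. A minimal sketch of that migration (the model id is illustrative, and the mean-pooling step is an assumption for models that return token-level embeddings):

```python
import numpy as np

from huggingface_hub import InferenceClient

client = InferenceClient()
model = "sentence-transformers/all-MiniLM-L6-v2"  # illustrative embedding model


def embed(text: str) -> np.ndarray:
    vector = np.asarray(client.feature_extraction(text, model=model), dtype=float)
    # Mean-pool token embeddings if the model returns a 2D array.
    return vector.mean(axis=0) if vector.ndim > 1 else vector


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


query = embed("That is a happy person")
scores = [cosine(query, embed(s)) for s in ["That is a happy dog", "Today is a sunny day"]]
```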
@@ -3261,6 +3240,13 @@ def zero_shot_image_classification(
response = self._inner_post(request_parameters)
return ZeroShotImageClassificationOutputElement.parse_obj_as_list(response)

@_deprecate_method(
version="0.33.0",
message=(
"HF Inference API is getting revamped and will only support warm models in the future (no cold start allowed)."
" Use `HfApi.list_models(..., inference_provider='...')` to list warm models per provider."
),
)
def list_deployed_models(
self, frameworks: Union[None, str, Literal["all"], List[str]] = None
) -> Dict[str, List[str]]:
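Per the deprecation message, warm models are now listed per provider through `HfApi`. A minimal sketch (the provider name and `limit` are illustrative):

```python
from huggingface_hub import HfApi

api = HfApi()
# List models currently warm on a given inference provider.
for model in api.list_models(inference_provider="hf-inference", limit=5):
    print(model.id)
```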
@@ -3444,6 +3430,13 @@ def health_check(self, model: Optional[str] = None) -> bool:
response = get_session().get(url, headers=build_hf_headers(token=self.token))
return response.status_code == 200

@_deprecate_method(
version="0.33.0",
message=(
"HF Inference API is getting revamped and will only support warm models in the future (no cold start allowed)."
" Use `HfApi.model_info` to get the model status both with HF Inference API and external providers."
),
)
def get_model_status(self, model: Optional[str] = None) -> ModelStatus:
"""
Get the status of a model hosted on the HF Inference API.
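Per the deprecation message, the model status is now read from `HfApi.model_info`. A minimal sketch (the `expand=["inference"]` argument and the `inference` attribute are assumptions about the Hub API response):

```python
from huggingface_hub import HfApi

api = HfApi()
info = api.model_info("gpt2", expand=["inference"])
print(info.inference)  # e.g. "warm" or "cold", assuming the field is populated
```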
src/huggingface_hub/inference/_generated/_async_client.py (41 changes: 17 additions & 24 deletions)
@@ -22,7 +22,6 @@
import base64
import logging
import re
import time
import warnings
from typing import TYPE_CHECKING, Any, AsyncIterable, Dict, List, Literal, Optional, Set, Union, overload

@@ -299,8 +298,6 @@ async def _inner_post(
if request_parameters.task in TASKS_EXPECTING_IMAGES and "Accept" not in request_parameters.headers:
request_parameters.headers["Accept"] = "image/png"

t0 = time.time()
timeout = self.timeout
while True:
with _open_as_binary(request_parameters.data) as data_as_binary:
# Do not use context manager as we don't want to close the connection immediately when returning
@@ -331,27 +328,6 @@
except aiohttp.ClientResponseError as error:
error.response_error_payload = response_error_payload
await session.close()
if response.status == 422 and request_parameters.task != "unknown":
error.message += f". Make sure '{request_parameters.task}' task is supported by the model."
if response.status == 503:
# If Model is unavailable, either raise a TimeoutError...
if timeout is not None and time.time() - t0 > timeout:
raise InferenceTimeoutError(
f"Model not loaded on the server: {request_parameters.url}. Please retry with a higher timeout"
f" (current: {self.timeout}).",
request=error.request,
response=error.response,
) from error
# ...or wait 1s and retry
logger.info(f"Waiting for model to be loaded on the server: {error}")
if "X-wait-for-model" not in request_parameters.headers and request_parameters.url.startswith(
INFERENCE_ENDPOINT
):
request_parameters.headers["X-wait-for-model"] = "1"
await asyncio.sleep(1)
if timeout is not None:
timeout = max(self.timeout - (time.time() - t0), 1) # type: ignore
continue
raise error
except Exception:
await session.close()
@@ -1618,6 +1594,9 @@ async def question_answering(
output = QuestionAnsweringOutputElement.parse_obj(response)
return output

@_deprecate_method(
version="0.33.0", message="Use `feature_extraction` instead and compute the sentence similarity locally."
)
async def sentence_similarity(
self, sentence: str, other_sentences: List[str], *, model: Optional[str] = None
) -> List[float]:
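The async client carries the same deprecation, and the migration is identical except that calls are awaited. A minimal sketch (the model id is illustrative):

```python
import asyncio

from huggingface_hub import AsyncInferenceClient


async def main() -> None:
    client = AsyncInferenceClient()
    # embedding is a numpy array returned by the HF Inference API
    embedding = await client.feature_extraction(
        "hello world", model="sentence-transformers/all-MiniLM-L6-v2"
    )
    print(embedding.shape)


asyncio.run(main())
```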
@@ -3325,6 +3304,13 @@ async def zero_shot_image_classification(
response = await self._inner_post(request_parameters)
return ZeroShotImageClassificationOutputElement.parse_obj_as_list(response)

@_deprecate_method(
version="0.33.0",
message=(
"HF Inference API is getting revamped and will only support warm models in the future (no cold start allowed)."
" Use `HfApi.list_models(..., inference_provider='...')` to list warm models per provider."
),
)
async def list_deployed_models(
self, frameworks: Union[None, str, Literal["all"], List[str]] = None
) -> Dict[str, List[str]]:
@@ -3554,6 +3540,13 @@ async def health_check(self, model: Optional[str] = None) -> bool:
response = await client.get(url, proxy=self.proxies)
return response.status == 200

@_deprecate_method(
version="0.33.0",
message=(
"HF Inference API is getting revamped and will only support warm models in the future (no cold start allowed)."
" Use `HfApi.model_info` to get the model status both with HF Inference API and external providers."
),
)
async def get_model_status(self, model: Optional[str] = None) -> ModelStatus:
"""
Get the status of a model hosted on the HF Inference API.