Rocketknight1 (Member) commented Jan 16, 2026

More V5 pipeline cleanup, follow-up to #43256 and #43306:

  • feature-extraction renamed to text-embedding (keeping the old name as an alias)
  • image-feature-extraction renamed to image-embedding (keeping the old name as an alias)
  • question-answering and visual-question-answering removed
  • fill-mask removed
  • Updated the default text-generation and image-text-to-text models
  • Updated the migration guide to explain all of this!

Rocketknight1 marked this pull request as ready for review January 16, 2026 15:03
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Rocketknight1 changed the title from "Rename the feature extraction pipelines and remove question-answering" to "More V5 pipeline cleanup" Jan 16, 2026
vasqu (Contributor) commented Jan 16, 2026

Just commenting before more of the docs in other languages get broken:

github-actions (Contributor) commented

[For maintainers] Suggested jobs to run (before merge)

run-slow: afmoe, aimv2, albert, align, altclip, audio_spectrogram_transformer, autoformer, bamba, bart

Rocketknight1 (Member, Author) commented

@vasqu I think this should be ready for review now! I chased down the references in the other language docs too

github-actions (Contributor) commented

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43325&sha=3571cb

vasqu (Contributor) left a comment

Nice cleanup 🧹 mostly nits, but I think we need to take another look at the tests to properly rename some stuff there

Comment on lines +583 to +584
`Text2TextGenerationPipeline`, as well as the related `SummarizationPipeline` and `TranslationPipeline`, were deprecated and will now be removed. The
`question-answering` pipeline has also been removed. `pipeline` classes are intended as a high-level beginner-friendly API,
Contributor commented:

Suggested change
- `Text2TextGenerationPipeline`, as well as the related `SummarizationPipeline` and `TranslationPipeline`, were deprecated and will now be removed. The
- `question-answering` pipeline has also been removed. `pipeline` classes are intended as a high-level beginner-friendly API,
+ `question-answering` and `Text2TextGenerationPipeline`, including its related `SummarizationPipeline` and `TranslationPipeline`, were deprecated and will now be removed. `pipeline` classes are intended as a high-level beginner-friendly API,

More of a nit, the first 2 sentences just don't read super well

Similarly, the `image-to-text` pipeline has been removed. This pipeline was used for early image captioning models, but these
no longer offer competitive performance. Instead, for image captioning tasks we recommend using a modern vision-language chat model
via the `image-text-to-text` pipeline. For example:
The above example can be adapted for translation or question answering simply by changing the prompt.
Contributor commented:

Suggested change
- The above example can be adapted for translation or question answering simply by changing the prompt.
+ The above example can be adapted for other tasks, e.g. translation or question answering, simply by changing the prompt.


### Other changes

- The `feature-extraction` pipeline has now been renamed to `text-embedding` and the `image-feature-extraction` pipeline has been renamed to `image-embedding`. The older names are still usable as aliases, so this should not impact your existing code.
Contributor commented:

Do we want to note that these aliases won't be around forever? (We should deprecate them later on.)

Member replied:

Unless we make changes on the Hub side, people will forever have feature-extraction models they'll want to run.
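
For reference, the alias behaviour discussed in this thread can be sketched in plain Python. This is only an illustration under stated assumptions: `LEGACY_TASK_NAMES` and `resolve_task` are hypothetical names, not the actual transformers internals, and the only facts taken from this PR are the two old-to-new name pairs.

```python
# Hypothetical lookup table; only the two name pairs come from this PR.
LEGACY_TASK_NAMES = {
    "feature-extraction": "text-embedding",
    "image-feature-extraction": "image-embedding",
}


def resolve_task(task: str) -> str:
    """Map a legacy task name to its new name; pass other names through unchanged."""
    return LEGACY_TASK_NAMES.get(task, task)


if __name__ == "__main__":
    for name in ("feature-extraction", "text-embedding", "image-feature-extraction"):
        print(name, "->", resolve_task(name))
```

Under a scheme like this, a Hub model still tagged `feature-extraction` would keep resolving to a runnable task, which is the concern raised above about removing the aliases.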

"FillMaskPipeline",
"ImageClassificationPipeline",
"ImageFeatureExtractionPipeline",
"ImageEmbeddingPipeline",
Contributor commented:

We should be able to build on top of #42564 for getting modality-specific embeddings, just as a note (to myself)

"impl": TextEmbeddingPipeline,
"pt": (AutoModel,) if is_torch_available() else (),
"default": {"model": ("distilbert/distilbert-base-cased", "6ea8117")},
"type": "multimodal",
Contributor commented:

So this was wrong?

Comment on lines -909 to -922
    @slow
    def test_base_mask_filling(self):
        pbase = pipeline(task="fill-mask", model="facebook/bart-base")
        src_text = [" I went to the <mask>."]
        results = [x["token_str"] for x in pbase(src_text)]
        assert " bathroom" in results

    @slow
    def test_large_mask_filling(self):
        plarge = pipeline(task="fill-mask", model="facebook/bart-large")
        src_text = [" I went to the <mask>."]
        results = [x["token_str"] for x in plarge(src_text)]
        expected_results = [" bathroom", " gym", " wrong", " movies", " hospital"]
        self.assertListEqual(results, expected_results)
Contributor commented:

These are essentially integration tests; can we rewrite them instead of removing them?

Comment on lines -537 to -539
    @is_pipeline_test
    def test_pipeline_fill_mask(self):
        self.run_task_tests(task="fill-mask")
Contributor commented:

I'm only seeing tests removed, but no renames or additions for the new naming. E.g. `self.run_task_tests(task="text-embedding")` should exist, no?

"document-question-answering": {"test": DocumentQuestionAnsweringPipelineTests},
"feature-extraction": {"test": FeatureExtractionPipelineTests},
"fill-mask": {"test": FillMaskPipelineTests},
"text-embedding": {"test": FeatureExtractionPipelineTests},
Contributor commented:

Nit: We should also rename FeatureExtractionPipelineTests (for image as well)
