@larryliu0820
Contributor

Fixes #11618

Add sentencepiece tokenizer support

@pytorch-bot

pytorch-bot bot commented Jun 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11645

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 9df5f1a with merge base a1dec07:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jun 13, 2025
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Contributor

@jackzhxng jackzhxng left a comment

I think @kimishpatel said he had to make some additional changes on top of this to get it to work with Gemma3

@larryliu0820
Contributor Author

> I think @kimishpatel said he had to make some additional changes on top of this to get it to work with Gemma3

Yep, synced with him; will update the PR.

@facebook-github-bot
Contributor

@larryliu0820 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit to meta-pytorch/tokenizers that referenced this pull request Jun 17, 2025
Summary:
Fixes pytorch/executorch#11618

Add sentencepiece tokenizer support

X-link: pytorch/executorch#11645

Differential Revision: D76789606

Pulled By: larryliu0820
larryliu0820 added a commit to meta-pytorch/tokenizers that referenced this pull request Jun 17, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D76789606

larryliu0820 added a commit that referenced this pull request Jun 17, 2025
Summary:
X-link: meta-pytorch/tokenizers#85

Fixes #11618

Add sentencepiece tokenizer support

Pull Request resolved: #11645

Differential Revision: D76789606

Pulled By: larryliu0820
@larryliu0820 larryliu0820 force-pushed the larryliu0820-patch-10 branch from 9cf4b64 to 0c9f7f6 Compare June 17, 2025 04:51
larryliu0820 added a commit to meta-pytorch/tokenizers that referenced this pull request Jun 17, 2025
Summary:
Pull Request resolved: #85

Fixes pytorch/executorch#11618

Add sentencepiece tokenizer support

X-link: pytorch/executorch#11645

Differential Revision: D76789606

Pulled By: larryliu0820
Contributor

@guangy10 guangy10 left a comment

Will it be added automatically to the apps, e.g. the benchmark apps? Particularly for the iOS app, do we need to add the source to the project tree?

facebook-github-bot pushed a commit to meta-pytorch/tokenizers that referenced this pull request Jun 17, 2025
Summary:

Fixes pytorch/executorch#11618

Add sentencepiece tokenizer support

X-link: pytorch/executorch#11645

Reviewed By: guangy10

Differential Revision: D76789606

Pulled By: larryliu0820
@larryliu0820
Contributor Author

> Will it be added automatically to the apps, e.g. the benchmark apps? Particularly for the iOS app, do we need to add the source to the project tree?

Not sure, will check CI

larryliu0820 added a commit that referenced this pull request Jun 17, 2025
Summary:
X-link: meta-pytorch/tokenizers#85

Fixes #11618

Add sentencepiece tokenizer support

Pull Request resolved: #11645

Reviewed By: guangy10

Differential Revision: D76789606

Pulled By: larryliu0820
@larryliu0820 larryliu0820 force-pushed the larryliu0820-patch-10 branch from 0c9f7f6 to 71e8288 Compare June 17, 2025 20:42
@guangy10
Contributor

> > Will it be added automatically to the apps, e.g. the benchmark apps? Particularly for the iOS app, do we need to add the source to the project tree?
>
> Not sure, will check CI

Gemma-3 is disabled from running on the app. To check on CI, you can add gemma-3 back here:

models: ${{ inputs.models || github.event_name == 'schedule' && 'Qwen/Qwen3-0.6B,HuggingFaceTB/SmolLM2-135M,meta-llama/Llama-3.2-1B,allenai/OLMo-1B-hf' || 'Qwen/Qwen3-0.6B' }}

Then schedule an on-demand run on your PR.

facebook-github-bot pushed a commit to meta-pytorch/tokenizers that referenced this pull request Jun 17, 2025
larryliu0820 added a commit to meta-pytorch/tokenizers that referenced this pull request Jun 17, 2025
larryliu0820 added a commit to meta-pytorch/tokenizers that referenced this pull request Jun 17, 2025
Summary:
Pull Request resolved: #85

Fixes pytorch/executorch#11618

Add sentencepiece tokenizer support

X-link: pytorch/executorch#11645

Reviewed By: guangy10

Differential Revision: D76789606

Pulled By: larryliu0820
larryliu0820 added a commit that referenced this pull request Jun 17, 2025
@larryliu0820 larryliu0820 force-pushed the larryliu0820-patch-10 branch from 71e8288 to a7b9512 Compare June 17, 2025 21:37
@larryliu0820 larryliu0820 temporarily deployed to upload-benchmark-results June 17, 2025 22:59 — with GitHub Actions Inactive
@kimishpatel
Contributor

Do validate against the HF tokenizer and the model's output.
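
A minimal sketch of the kind of parity check suggested above, assuming each tokenizer can be wrapped as a callable that maps a string to a list of token IDs. The names `compare_token_ids`, `toy_reference_encode`, and `toy_candidate_encode` are hypothetical and are not part of ExecuTorch or the tokenizers repo; the toy encoders only stand in for a Hugging Face tokenizer and the new SentencePiece path so the example runs without any model files.

```python
from typing import Callable, Iterable, List, Tuple

# An encoder is any callable that turns text into token IDs (assumption:
# both the reference and the candidate tokenizer can be wrapped this way).
Encoder = Callable[[str], List[int]]


def compare_token_ids(
    reference_encode: Encoder,
    candidate_encode: Encoder,
    prompts: Iterable[str],
) -> List[Tuple[str, List[int], List[int]]]:
    """Return (prompt, expected_ids, actual_ids) for every prompt that differs."""
    mismatches = []
    for prompt in prompts:
        expected = reference_encode(prompt)
        actual = candidate_encode(prompt)
        if expected != actual:
            mismatches.append((prompt, expected, actual))
    return mismatches


def toy_reference_encode(text: str) -> List[int]:
    # Hypothetical stand-in for the reference tokenizer: one ID per character.
    return [ord(ch) for ch in text]


def toy_candidate_encode(text: str) -> List[int]:
    # Hypothetical stand-in for the tokenizer under test. It drops leading
    # whitespace to simulate a pre-tokenizer discrepancy.
    return [ord(ch) for ch in text.lstrip()]


if __name__ == "__main__":
    prompts = ["hello", " hello"]
    for prompt, expected, actual in compare_token_ids(
        toy_reference_encode, toy_candidate_encode, prompts
    ):
        print(f"mismatch for {prompt!r}: expected {expected}, got {actual}")
```

Matching token IDs covers only the tokenizer half of the request; comparing the model's generated output against a reference run would still need a separate check.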

@larryliu0820 larryliu0820 requested a review from lucylq as a code owner June 18, 2025 20:17
larryliu0820 added a commit that referenced this pull request Jun 19, 2025
@larryliu0820 larryliu0820 force-pushed the larryliu0820-patch-10 branch from 96fd3c2 to cd8b708 Compare June 19, 2025 02:24
facebook-github-bot pushed a commit to meta-pytorch/tokenizers that referenced this pull request Jun 19, 2025
facebook-github-bot pushed a commit to meta-pytorch/tokenizers that referenced this pull request Jun 19, 2025
@larryliu0820 larryliu0820 force-pushed the larryliu0820-patch-10 branch from cd8b708 to 9df5f1a Compare June 19, 2025 08:52
@facebook-github-bot facebook-github-bot merged commit 496cb05 into main Jun 20, 2025
108 of 111 checks passed
@facebook-github-bot facebook-github-bot deleted the larryliu0820-patch-10 branch June 20, 2025 06:18
hinriksnaer pushed a commit to hinriksnaer/executorch that referenced this pull request Jun 26, 2025
Differential Revision: D76789606

Pull Request resolved: pytorch#11645
Labels

CLA Signed, fb-exported

Development

Successfully merging this pull request may close these issues.

Runtime crash in PreTokenizer when running optimum-et generated gemma3 via llama runner

7 participants