
Conversation

@lucylq lucylq commented Oct 17, 2025

Summary:
Also see: #15215

Currently:

  • default eos/bos tokens are embedded into the PTE
  • llama3 instruct uses a different set of eos/bos tokens
  • users must manually specify the llama3 instruct eos/bos tokens at export time, because the runner overrides the tokenizer's eos/bos with the values stored in the PTE

This diff:

  • removes the defaults
  • relies on the tokenizer for eos/bos unless the user explicitly specifies them in the export metadata, in which case the eos/bos saved in the PTE are used (see the sketch after this list)
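
For illustration, here is a minimal Python sketch of the resolution order this diff describes (the same idea applies to bos). The metadata key name `get_eos_ids` and the dict-shaped metadata are assumptions made for the sketch; the actual runner is C++ and its API may differ:

```python
from typing import Optional, Set


def resolve_eos_ids(pte_metadata: dict, tokenizer_eos_ids: Set[int]) -> Set[int]:
    """Prefer eos ids explicitly embedded in the PTE at export time;
    otherwise fall back to the tokenizer (no hard-coded default)."""
    # Hypothetical metadata key; only present if the user passed eos ids at export.
    explicit: Optional[list] = pte_metadata.get("get_eos_ids")
    if explicit is not None:
        # User explicitly specified eos ids in the export metadata: honor them.
        return set(explicit)
    # No explicit metadata: trust the tokenizer, so tokenizers such as
    # llama3 instruct work without extra export-time configuration.
    return tokenizer_eos_ids
```

Under the previous behavior the metadata branch was effectively always taken, because default eos/bos values were baked into every PTE; that is why llama3 instruct exports needed manual eos/bos overrides at export time.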

Differential Revision: D84942718

@lucylq lucylq requested a review from jackzhxng as a code owner October 17, 2025 19:37

pytorch-bot bot commented Oct 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15231

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 5 Unrelated Failures

As of commit 0e547c1 with merge base aeee757:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Oct 17, 2025

meta-codesync bot commented Oct 17, 2025

@lucylq has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84942718.

@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@lucylq lucylq requested a review from larryliu0820 October 17, 2025 19:39

@jackzhxng jackzhxng left a comment

Glad to get rid of fairseq

@meta-codesync meta-codesync bot merged commit 5ec4872 into pytorch:main Oct 18, 2025
142 of 153 checks passed

Labels

CLA Signed, fb-exported, meta-exported
