Update on "update llama runner to decode single token"
Right now, the eager runner does not print the generated response until all tokens have been generated. This is a poor experience, since the user must wait for the entire generation to finish before seeing any output.
This PR updates it to decode each new token immediately after it is generated.
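The idea can be sketched as a streaming decode loop: decode and print each token's text the moment it is produced, instead of buffering token ids and decoding once at the end. This is a minimal illustrative sketch, not the actual ExecuTorch runner code; `Tokenizer` and `generate_tokens` are hypothetical stand-ins for the real tokenizer and autoregressive generation loop.

```python
class Tokenizer:
    """Toy stand-in tokenizer: each token id maps to a fixed text piece."""
    vocab = {0: "Hello", 1: ",", 2: " world", 3: "!"}

    def decode(self, token_ids):
        return "".join(self.vocab[t] for t in token_ids)


def generate_tokens():
    """Hypothetical stand-in for the model's autoregressive loop,
    yielding one new token id per step."""
    yield from [0, 1, 2, 3]


def run_streaming(tokenizer):
    """Decode each new token immediately and emit it, rather than
    waiting until all tokens are generated."""
    pieces = []
    for token_id in generate_tokens():
        # Decode the single new token and flush it to the console right away.
        piece = tokenizer.decode([token_id])
        print(piece, end="", flush=True)
        pieces.append(piece)
    print()
    return "".join(pieces)


text = run_streaming(Tokenizer())
```

The key change is decoding `[token_id]` per step inside the loop (with a flushed, newline-free `print`) instead of calling `decode` once on the full sequence after generation completes.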
Differential Revision: [D65578306](https://our.internmc.facebook.com/intern/diff/D65578306/)
[ghstack-poisoned]
README.md (5 additions, 0 deletions)
```diff
@@ -43,6 +43,11 @@ We recommend using the latest release tag from the
 See [CONTRIBUTING.md](CONTRIBUTING.md) for details about issues, PRs, code
 style, CI jobs, and other development topics.
 
+
+To connect with us and other community members, we invite you to join the PyTorch Slack community by filling out this [form](https://docs.google.com/forms/d/e/1FAIpQLSeADnUNW36fjKjYzyHDOzEB_abKQE9b6gqqW9NXse6O0MWh0A/viewform). Once you've joined, you can:
+
+* Head to the `#executorch-general` channel for general questions, discussion, and community support.
+* Join the `#executorch-contributors` channel if you're interested in contributing directly to project development.
```