Hello,

You are right, both are the same here. I presume Sebastian just did that as a quick dummy example to test the class, but it won't always be the case in practice.

Later in the book, you'll see that, for example in supervised fine-tuning (SFT), the sequences in a batch are often shorter than the model's context_length, so num_tokens and context_length won't be the same. I don't want to spoil it or go into more detail here, because it will make more sense as you progress, and it's well explained there.
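
In case a concrete illustration helps, here's a minimal sketch (not the book's exact code; the toy token IDs and variable names are just for illustration) contrasting the two cases:

```python
import torch

context_length = 8

# Stand-in for tokenizer output
token_ids = list(range(20))

# Pretraining-style sample: a sliding window of exactly context_length
# tokens, so num_tokens per sequence == context_length.
pretrain_input = torch.tensor(token_ids[0:context_length])
pretrain_target = torch.tensor(token_ids[1:context_length + 1])
print(pretrain_input.shape)  # torch.Size([8]) -> num_tokens == context_length

# SFT-style sample: instruction/response pairs have varying lengths,
# usually shorter than context_length, so num_tokens < context_length
# (padding or a custom collate function handles the difference at batch time).
sft_sequence = torch.tensor(token_ids[0:5])  # only 5 tokens here
print(sft_sequence.shape)  # torch.Size([5]) -> num_tokens < context_length
```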

Hope that helps!
