
Commit 8c7d266

lead with practical info, clarify chat/embedding input/output token distinction
1 parent fc7a9d9 commit 8c7d266

File tree

  • explore-analyze/elastic-inference/eis.md

1 file changed: +5 −4 lines

explore-analyze/elastic-inference/eis.md

Lines changed: 5 additions & 4 deletions
```diff
@@ -57,13 +57,14 @@ All models on EIS incur a charge per million tokens. The pricing details are at
 
 ### Token-based billing
 
-EIS is billed per million tokens used. Tokens are the fundamental units that language models process for both input and output.
+EIS is billed per million tokens used:
 
-Tokenizers convert text into numerical data by segmenting it into subword units. A token may be a complete word, part of a word, or a punctuation mark, depending on the model's trained tokenizer and the frequency patterns in its training data.
+- For chat models, input and output tokens are billed. Longer conversations with extensive context or detailed responses will consume more tokens.
+- For embeddings models, only input tokens are billed.
 
-For example, the sentence "It was the best of times, it was the worst of times." contains 52 characters but would tokenize into approximately 14 tokens with a typical word-based approach, though the exact count varies by tokenizer.
+Tokens are the fundamental units that language models process for both input and output. Tokenizers convert text into numerical data by segmenting it into subword units. A token may be a complete word, part of a word, or a punctuation mark, depending on the model's trained tokenizer and the frequency patterns in its training data.
 
-Both input tokens (your prompts and any context provided) and output tokens (the model's responses) count toward usage. Longer conversations with extensive context or detailed responses will consume more tokens.
+For example, the sentence "It was the best of times, it was the worst of times." contains 52 characters but would tokenize into approximately 14 tokens with a typical word-based approach, though the exact count varies by tokenizer.
 
 ## Rate Limits
```
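The updated text distinguishes chat billing (input and output tokens) from embeddings billing (input tokens only). A minimal sketch of that arithmetic, using made-up placeholder prices rather than actual EIS rates (see the pricing page referenced above):

```python
# Placeholder per-million-token prices for illustration only;
# these are NOT actual EIS rates.
CHAT_PRICE_PER_MTOK = 0.50    # hypothetical: applies to input + output
EMBED_PRICE_PER_MTOK = 0.10   # hypothetical: applies to input only

def chat_cost(input_tokens: int, output_tokens: int) -> float:
    """Chat models bill both input and output tokens."""
    return (input_tokens + output_tokens) / 1_000_000 * CHAT_PRICE_PER_MTOK

def embedding_cost(input_tokens: int) -> float:
    """Embeddings models bill input tokens only."""
    return input_tokens / 1_000_000 * EMBED_PRICE_PER_MTOK

# Longer conversations consume more tokens and therefore cost more.
print(chat_cost(input_tokens=8_000, output_tokens=2_000))  # 0.005
print(embedding_cost(input_tokens=10_000))                 # 0.001
```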

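The 52-character, roughly 14-token example can be reproduced with a naive word-and-punctuation tokenizer. This only illustrates the "typical word-based approach" the text mentions; a model's trained subword tokenizer will generally produce a different count:

```python
import re

sentence = "It was the best of times, it was the worst of times."

# Naive word-based tokenization: each word and each punctuation mark
# becomes one token. Real models use trained subword tokenizers
# (e.g. BPE), so exact counts vary by model.
tokens = re.findall(r"\w+|[^\w\s]", sentence)

print(len(sentence))  # 52 characters
print(len(tokens))    # 14 tokens: 12 words + comma + period
print(tokens)
```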