
Commit 2d56513

[EIS] Adding more info on tokens (elastic#3673)
Co-authored-by: Liam Thompson <[email protected]>
1 parent 8d49fc4 commit 2d56513

File tree

1 file changed: +11 -0 lines changed

  • explore-analyze/elastic-inference


explore-analyze/elastic-inference/eis.md

Lines changed: 11 additions & 0 deletions
@@ -57,6 +57,17 @@ All models on EIS incur a charge per million tokens. The pricing details are at
Note that this pricing model differs from the existing [Machine Learning Nodes](https://www.elastic.co/docs/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models), which are billed via VCUs consumed.
### Token-based billing

EIS is billed per million tokens used:
- For **chat** models, input and output tokens are billed. Longer conversations with extensive context or detailed responses will consume more tokens.
- For **embeddings** models, only input tokens are billed (a rough cost sketch follows this list).
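As an illustration of how per-million-token billing adds up, the sketch below estimates a charge from token counts. The function and the prices are hypothetical placeholders, not actual EIS rates; see the pricing page linked above for the real figures.

```python
# Rough sketch of estimating a token-based charge.
# All prices below are hypothetical placeholders, not real EIS rates.

def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Return an estimated charge for the given token counts."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_million
    output_cost = (output_tokens / 1_000_000) * output_price_per_million
    return input_cost + output_cost

# Chat model: both input and output tokens are billed.
chat_cost = estimate_cost(input_tokens=250_000, output_tokens=75_000,
                          input_price_per_million=1.0,   # placeholder rate
                          output_price_per_million=4.0)  # placeholder rate

# Embeddings model: only input tokens are billed.
embed_cost = estimate_cost(input_tokens=1_200_000, output_tokens=0,
                           input_price_per_million=0.1,  # placeholder rate
                           output_price_per_million=0.0)

print(f"chat: {chat_cost:.4f}, embeddings: {embed_cost:.4f}")
```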
Tokens are the fundamental units that language models process for both input and output. Tokenizers convert text into numerical data by segmenting it into subword units. A token may be a complete word, part of a word, or a punctuation mark, depending on the model's trained tokenizer and the frequency patterns in its training data.

For example, the sentence "It was the best of times, it was the worst of times." contains 52 characters but would tokenize into approximately 14 tokens with a typical word-based approach, though the exact count varies by tokenizer.
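To make the character-versus-token distinction concrete, the small sketch below counts tokens for that sentence. It uses a simple word-and-punctuation split plus the open-source `tiktoken` tokenizer purely as examples; neither is the tokenizer EIS models actually use, so the counts are only illustrative.

```python
import re

import tiktoken  # pip install tiktoken; used only as an example tokenizer

text = "It was the best of times, it was the worst of times."

# Naive word-and-punctuation split: the "word-based approach" mentioned above.
word_tokens = re.findall(r"\w+|[^\w\s]", text)
print(len(text), "characters")                # 52
print(len(word_tokens), "word-level tokens")  # 14

# A subword tokenizer segments the same text differently and may return a
# different count; EIS models ship their own tokenizers.
encoding = tiktoken.get_encoding("cl100k_base")
print(len(encoding.encode(text)), "subword tokens (cl100k_base)")
```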
## Rate Limits
The service enforces rate limits on an ongoing basis. Exceeding a limit will result in HTTP 429 responses from the server until the sliding window moves on and part of the limit resets.
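A common way to handle these responses is to retry with backoff until the window moves on. The sketch below is a generic illustration, not part of any Elastic client; the URL, headers, and payload are hypothetical placeholders.

```python
# Generic sketch of retrying on HTTP 429 with exponential backoff.
import time

import requests


def post_with_backoff(url: str, payload: dict, headers: dict,
                      max_retries: int = 5) -> requests.Response:
    """POST `payload`, retrying on HTTP 429 with exponential backoff."""
    delay = 1.0
    resp = None
    for _ in range(max_retries + 1):
        resp = requests.post(url, json=payload, headers=headers, timeout=30)
        if resp.status_code != 429:
            return resp
        # Honor Retry-After if the server sends it; otherwise back off
        # exponentially until the sliding window lets requests through again.
        retry_after = resp.headers.get("Retry-After")
        time.sleep(float(retry_after) if retry_after else delay)
        delay = min(delay * 2, 60.0)
    return resp  # still rate limited after max_retries attempts
```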

0 commit comments
