/*
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
/* Markdown (render)
# Gemini API: All about tokens

An understanding of tokens is central to using the Gemini API. This guide provides an interactive introduction to what tokens are and how they are used in the Gemini API.

## About tokens

LLMs break up their input and produce their output at a granularity that is smaller than a word but larger than a single character or code point.

These **tokens** can be single characters, like `z`, or whole words, like `the`. Long words may be broken up into several tokens. The set of all tokens used by the model is called the vocabulary, and the process of breaking text down into tokens is called tokenization.

For Gemini models, a token is equivalent to about 4 characters. **100 tokens are about 60-80 English words**.

When billing is enabled, the price of a paid request is determined by the [number of input and output tokens](https://ai.google.dev/pricing), so knowing how to count your tokens is important.
*/
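/* Markdown (render)
As a rough illustration of the 4-characters-per-token rule of thumb, you can pre-estimate token counts (and request cost) locally before calling the API. This is only a sketch with a hypothetical price; `ai.models.countTokens` is the real source of truth, and [ai.google.dev/pricing](https://ai.google.dev/pricing) has the actual rates:

```js
// Rough local estimate using the ~4 characters/token rule of thumb.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

const prompt = "The quick brown fox jumps over the lazy dog.";
console.log("Estimated tokens:", estimateTokens(prompt)); // 44 chars -> 11

// Illustrative cost estimate, assuming a hypothetical $0.30 per 1M input tokens.
console.log("Estimated cost (USD):", estimateTokens(prompt) * 0.30 / 1_000_000);
```
*/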
/* Markdown (render)
## Setup
### Install SDK and set up the client

### API Key Configuration

To ensure security, avoid hardcoding the API key in frontend code. Instead, set it as an environment variable on the server or local machine.

When using the Gemini API client libraries, the key will be automatically detected if set as either `GEMINI_API_KEY` or `GOOGLE_API_KEY`. If both are set, `GOOGLE_API_KEY` takes precedence.

For instructions on setting environment variables across different operating systems, refer to the official documentation: [Set API Key as Environment Variable](https://ai.google.dev/gemini-api/docs/api-key#set-api-env-var)

In code, the key can then be accessed as:

```js
import { GoogleGenAI } from "@google/genai";

ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
```
*/
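/* Markdown (render)
For reference, a minimal sketch of setting the variable in a macOS/Linux shell (replace the placeholder with your real key):

```shell
# Make the key available to the current shell session
export GEMINI_API_KEY="YOUR_API_KEY"
```
*/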
/* Markdown (render)
## Context windows

The models available through the Gemini API have context windows that are measured in tokens. These define how much input you can provide and how much output the model can generate; combined, they are referred to as the "context window". This information is available directly through [the API](https://ai.google.dev/api/rest/v1/models/get) and in the [models](https://ai.google.dev/models/gemini) documentation.

In this example you can see that the `gemini-2.5-flash` model has a 1M-token context window. If you need more, Pro models have an even bigger 2M-token context window.
*/
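/* Markdown (render)
A sketch of reading these limits through the SDK (assuming the `ai` client from the setup above; `ai.models.get` and the `inputTokenLimit`/`outputTokenLimit` fields follow the js-genai reference):

```js
// Fetch model metadata and read the token limits that make up the context window.
async function contextWindow(ai, modelId) {
  const info = await ai.models.get({ model: modelId });
  return { input: info.inputTokenLimit, output: info.outputTokenLimit };
}

// Usage: const limits = await contextWindow(ai, "gemini-2.5-flash");
```
*/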
/* Markdown (render)
## Counting tokens

The API provides an endpoint for counting the number of tokens in a request: [`ai.models.countTokens`](https://googleapis.github.io/js-genai/release_docs/classes/models.Models.html#counttokens). You pass the same arguments as you would to [`ai.models.generateContent`](https://googleapis.github.io/js-genai/release_docs/classes/models.Models.html#generatecontent) and the service will return the number of tokens in that request.
*/
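/* Markdown (render)
For example, counting the tokens of a plain text prompt might look like this (a sketch assuming the `ai` client and the `MODEL_ID` chosen below):

```js
// Ask the service how many tokens a request would consume, without generating anything.
async function countPromptTokens(ai, model, contents) {
  const res = await ai.models.countTokens({ model, contents });
  return res.totalTokens;
}

// Usage: await countPromptTokens(ai, MODEL_ID, "The quick brown fox jumps over the lazy dog.");
```
*/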
/* Markdown (render)
### Choose a model

Now select the model you want to use in this guide, either by selecting one in the list or writing it down. Keep in mind that some models, like the 2.5 ones, are thinking models and thus take slightly more time to respond (cf. the [thinking notebook](https://github.com/google-gemini/cookbook/blob/main/quickstarts-js/Get_started_thinking.js) for more details, and in particular to learn how to switch thinking off).

Tokenization should be more or less the same for each of the Gemini models, but you can still switch between them to double-check.

For more information about all Gemini models, check the [documentation](https://ai.google.dev/gemini-api/docs/models/gemini) for extended information on each of them.
*/

/* Markdown (render)
When you call `ai.models.generateContent` (or `chat.sendMessage` on a chat session), the response object has a `usageMetadata` attribute containing both the input and output token counts (`promptTokenCount` and `candidatesTokenCount`):
*/
// [CODE STARTS]
genResponse = await ai.models.generateContent({
  model: MODEL_ID,
  contents: "The quick brown fox jumps over the lazy dog."
});

console.log(genResponse.text);
// [CODE ENDS]
/* Output Sample

Indeed it is!

This is a classic example of a **pangram**—a sentence that contains every letter of the alphabet at least once.

It's famously used for testing typewriters, keyboards, and fonts, as it allows you to see all the letters in action. It's remarkably efficient and widely recognized!
*/
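/* Markdown (render)
A sketch of pulling those counts out of the response (a small helper defined here for illustration; the field names follow the SDK's `usageMetadata`):

```js
// Extract input/output token counts from a generateContent response.
function tokenUsage(response) {
  const usage = response.usageMetadata ?? {};
  return {
    input: usage.promptTokenCount,
    output: usage.candidatesTokenCount
  };
}

// Usage: console.log(tokenUsage(genResponse));
```
*/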
/* Markdown (render)
### Multimodal tokens

Media objects can be sent to the API inline with the request:
*/
// [CODE STARTS]
countTokensResponse = await ai.models.countTokens({
  model: MODEL_ID,
  contents: [
    {
      inlineData: {
        data: imageDataUrl,
        mimeType: imageBlob.type
      }
    }
  ]
});

console.log("Prompt with image tokens:", countTokensResponse.totalTokens);
// [CODE ENDS]
/* Output Sample

Prompt with image tokens: 288661
*/
/* Markdown (render)
You can try with different images and you should always get the same number of tokens, independent of their display or file size. Note that an extra token seems to be added, representing the empty prompt.
*/
/* Markdown (render)
#### Files API

The model sees identical tokens if you upload parts of the prompt through the Files API instead:
*/
// [CODE STARTS]
organUpload = await ai.files.upload({
  file: imageBlob,
  mimeType: imageBlob.type,
  displayName: "organ.jpg"
});

countTokensResponse = await ai.models.countTokens({
  model: MODEL_ID,
  contents: [
    { fileData: { fileUri: organUpload.uri } }
  ]
});

console.log("Prompt with image tokens:", countTokensResponse.totalTokens);
// [CODE ENDS]
/* Output Sample

Prompt with image tokens: 259
*/
/* Markdown (render)
#### Audio and video

Audio and video are each converted to tokens at a fixed rate of tokens per second:
*/

// [CODE STARTS]
// Read the clip length from the browser's audio metadata
// (assumes `audioBlob` holds the audio file fetched above).
audio = new Audio(URL.createObjectURL(audioBlob));
await new Promise((resolve) => {
  audio.addEventListener("loadedmetadata", () => {
    console.log("Duration (in seconds):", audio.duration);
    resolve();
  });
});
// [CODE ENDS]
/* Output Sample

Duration (in seconds): 2610.128938
*/
/* Markdown (render)
As you can see, this audio file is about 2,610 seconds (roughly 43.5 minutes) long.
*/
// [CODE STARTS]
uploadedAudio = await ai.files.upload({
  file: audioBlob,
  displayName: "sample.mp3",
  mimeType: "audio/mpeg"
});
// [CODE ENDS]
// [CODE STARTS]
countTokensResponse = await ai.models.countTokens({
  model: MODEL_ID,
  contents: [
    {
      fileData: {
        fileUri: uploadedAudio.uri,
        mimeType: uploadedAudio.mimeType
      }
    }
  ]
});

console.log("Prompt with audio tokens:", countTokensResponse.totalTokens);
console.log("Tokens per second:", countTokensResponse.totalTokens / 2610);
// [CODE ENDS]
/* Output Sample

Prompt with audio tokens: 83528

Tokens per second: 32.003065134099614
*/
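/* Markdown (render)
Given the measured rate of roughly 32 tokens per second, you can estimate an audio prompt's token count from its duration alone (a local rule-of-thumb sketch; `countTokens` remains the source of truth):

```js
// Estimate audio tokens from duration, using the ~32 tokens/second rate observed above.
const AUDIO_TOKENS_PER_SECOND = 32;

function estimateAudioTokens(durationSeconds) {
  return Math.round(durationSeconds * AUDIO_TOKENS_PER_SECOND);
}

console.log(estimateAudioTokens(2610)); // 83520, close to the measured 83528
```
*/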
/* Markdown (render)
### Chat, tools and cache

Chat, tools and cache are currently not supported by the unified SDK's `countTokens` method. This notebook will be updated when they are.

In the meantime, you can still check the tokens used after the call using the `usageMetadata` from the response. Check the [Python caching notebook](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Caching.ipynb) for more details.
*/
/* Markdown (render)
## Further reading

For more on token counting, check out the [documentation](https://ai.google.dev/gemini-api/docs/tokens?lang=node#multimodal-tokens) or the API reference:

* [`countTokens`](https://ai.google.dev/api/rest/v1/models/countTokens) REST API reference,
* [`countTokens`](https://googleapis.github.io/js-genai/release_docs/classes/models.Models.html#counttokens) JavaScript API reference.
*/