-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathClaude API DOCS
More file actions
8185 lines (6555 loc) · 209 KB
/
Claude API DOCS
File metadata and controls
8185 lines (6555 loc) · 209 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
Using the API
Getting started
Accessing the API
The API is made available via our web Console. You can use the Workbench to try out the API in the browser and then generate API keys in Account Settings. Use workspaces to segment your API keys and control spend by use case.
Authentication
All requests to the Anthropic API must include an x-api-key header with your API key. If you are using the Client SDKs, you will set the API when constructing a client, and then the SDK will send the header on your behalf with every request. If integrating directly with the API, you’ll need to send this header yourself.
Content types
The Anthropic API always accepts JSON in request bodies and returns JSON in response bodies. You will need to send the content-type: application/json header in requests. If you are using the Client SDKs, this will be taken care of automatically.
Response Headers
The Anthropic API includes the following headers in every response:
request-id: A globally unique identifier for the request.
anthropic-organization-id: The organization ID associated with the API key used in the request.
Examples
curl
Python
TypeScript
Shell
curl https://api.anthropic.com/v1/messages \
--header "x-api-key: $ANTHROPIC_API_KEY" \
--header "anthropic-version: 2023-06-01" \
--header "content-type: application/json" \
--data \
'{
"model": "claude-3-7-sonnet-20250219",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "Hello, world"}
]
}'
Was this page helpful?
Yes
No
IP addresses
x
linkedin
Using the API
IP addresses
Anthropic services live at a fixed range of IP addresses. You can add these to your firewall to open the minimum amount of surface area for egress traffic when accessing the Anthropic API and Console. These ranges will not change without notice.
IPv4
160.79.104.0/23
IPv6
2607:6bc0::/48
Was this page helpful?
Yes
No
Getting started
Versions
x
linkedin
Using the API
Versions
When making API requests, you must send an anthropic-version request header. For example, anthropic-version: 2023-06-01. If you are using our client libraries, this is handled for you automatically.
For any given API version, we will preserve:
Existing input parameters
Existing output parameters
However, we may do the following:
Add additional optional inputs
Add additional values to the output
Change conditions for specific error types
Add new variants to enum-like output values (for example, streaming event types)
Generally, if you are using the API as documented in this reference, we will not break your usage.
Version history
We always recommend using the latest API version whenever possible. Previous versions are considered deprecated and may be unavailable for new users.
2023-06-01
New format for streaming server-sent events (SSE):
Completions are incremental. For example, " Hello", " my", " name", " is", " Claude." instead of " Hello", " Hello my", " Hello my name", " Hello my name is", " Hello my name is Claude.".
All events are named events, rather than data-only events.
Removed unnecessary data: [DONE] event.
Removed legacy exception and truncated values in responses.
2023-01-01: Initial release.
Using the API
Errors
HTTP errors
Our API follows a predictable HTTP error code format:
400 - invalid_request_error: There was an issue with the format or content of your request. We may also use this error type for other 4XX status codes not listed below.
401 - authentication_error: There’s an issue with your API key.
403 - permission_error: Your API key does not have permission to use the specified resource.
404 - not_found_error: The requested resource was not found.
413 - request_too_large: Request exceeds the maximum allowed number of bytes.
429 - rate_limit_error: Your account has hit a rate limit.
500 - api_error: An unexpected error has occurred internal to Anthropic’s systems.
529 - overloaded_error: Anthropic’s API is temporarily overloaded.
Sudden large increases in usage may lead to an increased rate of 529 errors. We recommend ramping up gradually and maintaining consistent usage patterns.
When receiving a streaming response via SSE, it’s possible that an error can occur after returning a 200 response, in which case error handling wouldn’t follow these standard mechanisms.
Error shapes
Errors are always returned as JSON, with a top-level error object that always includes a type and message value. For example:
JSON
{
"type": "error",
"error": {
"type": "not_found_error",
"message": "The requested resource could not be found."
}
}
In accordance with our versioning policy, we may expand the values within these objects, and it is possible that the type values will grow over time.
Request id
Every API response includes a unique request-id header. This header contains a value such as req_018EeWyXxfu5pfWkrYcMdjWG. When contacting support about a specific request, please include this ID to help us quickly resolve your issue.
Our official SDKs provide this value as a property on top-level response objects, containing the value of the x-request-id header:
Python
TypeScript
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-3-7-sonnet-20250219",
max_tokens=1024,
messages=[
{"role": "user", "content": "Hello, Claude"}
]
)
print(f"Request ID: {message._request_id}")
Long requests
We highly encourage using the streaming Messages API or Message Batches API for long running requests, especially those over 10 minutes.
We do not recommend setting a large max_tokens values without using our streaming Messages API or Message Batches API:
Some networks may drop idle connections after a variable period of time, which can cause the request to fail or timeout without receiving a response from Anthropic.
Networks differ in reliablity; our Message Batches API can help you manage the risk of network issues by allowing you to poll for results rather than requiring an uninterrupted network connection.
If you are building a direct API integration, you should be aware that setting a TCP socket keep-alive can reduce the impact of idle connection timeouts on some networks.
Our SDKs will validate that your non-streaming Messages API requests are not expected to exceed a 10 minute timeout and also will set a socket option for TCP keep-alive.
Using the API
Rate limits
To mitigate misuse and manage capacity on our API, we have implemented limits on how much an organization can use the Claude API.
We have two types of limits:
Spend limits set a maximum monthly cost an organization can incur for API usage.
Rate limits set the maximum number of API requests an organization can make over a defined period of time.
We enforce service-configured limits at the organization level, but you may also set user-configurable limits for your organization’s workspaces.
About our limits
Limits are designed to prevent API abuse, while minimizing impact on common customer usage patterns.
Limits are defined by usage tier, where each tier is associated with a different set of spend and rate limits.
Your organization will increase tiers automatically as you reach certain thresholds while using the API. Limits are set at the organization level. You can see your organization’s limits in the Limits page in the Anthropic Console.
You may hit rate limits over shorter time intervals. For instance, a rate of 60 requests per minute (RPM) may be enforced as 1 request per second. Short bursts of requests at a high volume can surpass the rate limit and result in rate limit errors.
The limits outlined below are our standard limits. If you’re seeking higher, custom limits, contact sales through the Anthropic Console.
We use the token bucket algorithm to do rate limiting. This means that your capacity is continuously replenished up to your maximum limit, rather than being reset at fixed intervals.
All limits described here represent maximum allowed usage, not guaranteed minimums. These limits are intended to reduce unintentional overspend and ensure fair distribution of resources among users.
Spend limits
Each usage tier has a limit on how much you can spend on the API each calendar month. Once you reach the spend limit of your tier, until you qualify for the next tier, you will have to wait until the next month to be able to use the API again.
To qualify for the next tier, you must meet a deposit requirement. To minimize the risk of overfunding your account, you cannot deposit more than your monthly spend limit.
Requirements to advance tier
Usage Tier Credit Purchase Max Usage per Month
Tier 1 $5 $100
Tier 2 $40 $500
Tier 3 $200 $1,000
Tier 4 $400 $5,000
Monthly Invoicing N/A N/A
Rate limits
Our rate limits for the Messages API are measured in requests per minute (RPM), input tokens per minute (ITPM), and output tokens per minute (OTPM) for each model class. If you exceed any of the rate limits you will get a 429 error describing which rate limit was exceeded, along with a retry-after header indicating how long to wait.
ITPM rate limits are estimated at the beginning of each request, and the estimate is adjusted during the request to reflect the actual number of input tokens used. The final adjustment counts input_tokens and cache_creation_input_tokens towards ITPM rate limits, while cache_read_input_tokens are not (though they are still billed). In some instances, cache_read_input_tokens are counted towards ITPM rate limits.
OTPM rate limits are estimated based on max_tokens at the beginning of each request, and the estimate is adjusted at the end of the request to reflect the actual number of output tokens used. If you’re hitting OTPM limits earlier than expected, try reducing max_tokens to better approximate the size of your completions.
Rate limits are applied separately for each model; therefore you can use different models up to their respective limits simultaneously. You can check your current rate limits and behavior in the Anthropic Console.
Tier 1
Tier 2
Tier 3
Tier 4
Custom
Model Maximum requests per minute (RPM) Maximum input tokens per minute (ITPM) Maximum output tokens per minute (OTPM)
Claude 3.7 Sonnet 50 20,000 8,000
Claude 3.5 Sonnet
2024-10-22 50 40,000* 8,000
Claude 3.5 Sonnet
2024-06-20 50 40,000* 8,000
Claude 3.5 Haiku 50 50,000* 10,000
Claude 3 Opus 50 20,000* 4,000
Claude 3 Sonnet 50 40,000* 8,000
Claude 3 Haiku 50 50,000* 10,000
Limits marked with asterisks (*) count cache_read_input_tokens towards ITPM usage.
Message Batches API
The Message Batches API has its own set of rate limits which are shared across all models. These include a requests per minute (RPM) limit to all API endpoints and a limit on the number of batch requests that can be in the processing queue at the same time. A “batch request” here refers to part of a Message Batch. You may create a Message Batch containing thousands of batch requests, each of which count towards this limit. A batch request is considered part of the processing queue when it has yet to be successfully processed by the model.
Tier 1
Tier 2
Tier 3
Tier 4
Custom
Maximum requests per minute (RPM) Maximum batch requests in processing queue Maximum batch requests per batch
50 100,000 100,000
Setting lower limits for Workspaces
In order to protect Workspaces in your Organization from potential overuse, you can set custom spend and rate limits per Workspace.
Example: If your Organization’s limit is 40,000 input tokens per minute and 8,000 output tokens per minute, you might limit one Workspace to 30,000 total tokens per minute. This protects other Workspaces from potential overuse and ensures a more equitable distribution of resources across your Organization. The remaining unused tokens per minute (or more, if that Workspace doesn’t use the limit) are then available for other Workspaces to use.
Note:
You can’t set limits on the default Workspace.
If not set, Workspace limits match the Organization’s limit.
Organization-wide limits always apply, even if Workspace limits add up to more.
Support for input and output token limits will be added to Workspaces in the future.
Response headers
The API response includes headers that show you the rate limit enforced, current usage, and when the limit will be reset.
The following headers are returned:
Header Description
retry-after The number of seconds to wait until you can retry the request. Earlier retries will fail.
anthropic-ratelimit-requests-limit The maximum number of requests allowed within any rate limit period.
anthropic-ratelimit-requests-remaining The number of requests remaining before being rate limited.
anthropic-ratelimit-requests-reset The time when the request rate limit will be fully replenished, provided in RFC 3339 format.
anthropic-ratelimit-tokens-limit The maximum number of tokens allowed within any rate limit period.
anthropic-ratelimit-tokens-remaining The number of tokens remaining (rounded to the nearest thousand) before being rate limited.
anthropic-ratelimit-tokens-reset The time when the token rate limit will be fully replenished, provided in RFC 3339 format.
anthropic-ratelimit-input-tokens-limit The maximum number of input tokens allowed within any rate limit period.
anthropic-ratelimit-input-tokens-remaining The number of input tokens remaining (rounded to the nearest thousand) before being rate limited.
anthropic-ratelimit-input-tokens-reset The time when the input token rate limit will be fully replenished, provided in RFC 3339 format.
anthropic-ratelimit-output-tokens-limit The maximum number of output tokens allowed within any rate limit period.
anthropic-ratelimit-output-tokens-remaining The number of output tokens remaining (rounded to the nearest thousand) before being rate limited.
anthropic-ratelimit-output-tokens-reset The time when the output token rate limit will be fully replenished, provided in RFC 3339 format.
The anthropic-ratelimit-tokens-* headers display the values for the most restrictive limit currently in effect. For instance, if you have exceeded the Workspace per-minute token limit, the headers will contain the Workspace per-minute token rate limit values. If Workspace limits do not apply, the headers will return the total tokens remaining, where total is the sum of input and output tokens. This approach ensures that you have visibility into the most relevant constraint on your current API usage.
Using the API
Client SDKs
We provide libraries in Python and TypeScript that make it easier to work with the Anthropic API.
Additional configuration is needed to use Anthropic’s Client SDKs through a partner platform. If you are using Amazon Bedrock, see this guide; if you are using Google Cloud Vertex AI, see this guide.
Python
Python library GitHub repo
Example:
Python
import anthropic
client = anthropic.Anthropic(
# defaults to os.environ.get("ANTHROPIC_API_KEY")
api_key="my_api_key",
)
message = client.messages.create(
model="claude-3-7-sonnet-20250219",
max_tokens=1024,
messages=[
{"role": "user", "content": "Hello, Claude"}
]
)
print(message.content)
TypeScript
TypeScript library GitHub repo
While this library is in TypeScript, it can also be used in JavaScript libraries.
Example:
TypeScript
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({
apiKey: 'my_api_key', // defaults to process.env["ANTHROPIC_API_KEY"]
});
const msg = await anthropic.messages.create({
model: "claude-3-7-sonnet-20250219",
max_tokens: 1024,
messages: [{ role: "user", content: "Hello, Claude" }],
});
console.log(msg);
Java
Java library GitHub repo
Example:
Java
import com.anthropic.models.Message;
import com.anthropic.models.MessageCreateParams;
import com.anthropic.models.Model;
MessageCreateParams params = MessageCreateParams.builder()
.maxTokens(1024L)
.addUserMessage("Hello, Claude")
.model(Model.CLAUDE_3_7_SONNET)
.build();
Message message = client.messages().create(params);
Go
Go library GitHub repo
The Anthropic Go SDK is currently in beta. If you see any issues with it, please file an issue on GitHub!
Example:
Go
package main
import (
"context"
"fmt"
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/option"
)
func main() {
client := anthropic.NewClient(
option.WithAPIKey("my-anthropic-api-key"),
)
message, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{
Model: anthropic.F(anthropic.ModelClaude3_7Sonnet),
MaxTokens: anthropic.F(int64(1024)),
Messages: anthropic.F([]anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("What is a quaternion?")),
}),
})
if err != nil {
panic(err.Error())
}
fmt.Printf("%+v\n", message.Content)
}
Using the API
Supported regions
Here are the countries, regions, and territories we can currently support access from:
Albania
Algeria
Andorra
Angola
Antigua and Barbuda
Argentina
Armenia
Australia
Austria
Azerbaijan
Bahamas
Bahrain
Bangladesh
Barbados
Belgium
Belize
Benin
Bhutan
Bolivia
Bosnia and Herzegovina
Botswana
Brazil
Brunei
Bulgaria
Burkina Faso
Burundi
Cabo Verde
Cambodia
Cameroon
Canada
Chad
Chile
Colombia
Comoros
Congo, Republic of the
Costa Rica
Côte d’Ivoire
Croatia
Cyprus
Czechia (Czech Republic)
Denmark
Djibouti
Dominica
Dominican Republic
Ecuador
Egypt
El Salvador
Equatorial Guinea
Estonia
Eswatini
Fiji
Finland
France
Gabon
Gambia
Georgia
Germany
Ghana
Greece
Grenada
Guatemala
Guinea
Guinea-Bissau
Guyana
Haiti
Holy See (Vatican City)
Honduras
Hungary
Iceland
India
Indonesia
Iraq
Ireland
Israel
Italy
Jamaica
Japan
Jordan
Kazakhstan
Kenya
Kiribati
Kuwait
Kyrgyzstan
Laos
Latvia
Lebanon
Lesotho
Liberia
Liechtenstein
Lithuania
Luxembourg
Madagascar
Malawi
Malaysia
Maldives
Malta
Marshall Islands
Mauritania
Mauritius
Mexico
Micronesia
Moldova
Monaco
Mongolia
Montenegro
Morocco
Mozambique
Namibia
Nauru
Nepal
Netherlands
New Zealand
Niger
Nigeria
North Macedonia
Norway
Oman
Pakistan
Palau
Palestine
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Poland
Portugal
Qatar
Romania
Rwanda
Saint Kitts and Nevis
Saint Lucia
Saint Vincent and the Grenadines
Samoa
San Marino
Sao Tome and Principe
Saudi Arabia
Senegal
Serbia
Seychelles
Sierra Leone
Singapore
Slovakia
Slovenia
Solomon Islands
South Africa
South Korea
Spain
Sri Lanka
Suriname
Sweden
Switzerland
Taiwan
Tajikistan
Tanzania
Thailand
Timor-Leste, Democratic Republic of
Togo
Tonga
Trinidad and Tobago
Tunisia
Turkey
Turkmenistan
Tuvalu
Uganda
Ukraine (except Crimea, Donetsk, and Luhansk regions)
United Arab Emirates
United Kingdom
United States of America
Uruguay
Uzbekistan
Vanuatu
Vietnam
Zambia
Zimbabwe
Messages
Messages
Send a structured list of input messages with text and/or image content, and the model will generate the next message in the conversation.
The Messages API can be used for either single queries or stateless multi-turn conversations.
Learn more about the Messages API in our user guide
POST
/
v1
/
messages
Headers
anthropic-beta
string[]
Optional header to specify the beta version(s) you want to use.
To use multiple betas, use a comma separated list like beta1,beta2 or specify the header multiple times for each beta.
anthropic-version
string
required
The version of the Anthropic API you want to use.
Read more about versioning and our version history here.
x-api-key
string
required
Your unique API key for authentication.
This key is required in the header of all API requests, to authenticate your account and access Anthropic's services. Get your API key through the Console. Each key is scoped to a Workspace.
Body
application/json
max_tokens
integer
required
The maximum number of tokens to generate before stopping.
Note that our models may stop before reaching this maximum. This parameter only specifies the absolute maximum number of tokens to generate.
Different models have different maximum values for this parameter. See models for details.
Required range: x > 1
messages
object[]
required
Input messages.
Our models are trained to operate on alternating user and assistant conversational turns. When creating a new Message, you specify the prior conversational turns with the messages parameter, and the model then generates the next Message in the conversation. Consecutive user or assistant turns in your request will be combined into a single turn.
Each input message must be an object with a role and content. You can specify a single user-role message, or you can include multiple user and assistant messages.
If the final message uses the assistant role, the response content will continue immediately from the content in that message. This can be used to constrain part of the model's response.
Example with a single user message:
[{"role": "user", "content": "Hello, Claude"}]
Example with multiple conversational turns:
[
{"role": "user", "content": "Hello there."},
{"role": "assistant", "content": "Hi, I'm Claude. How can I help you?"},
{"role": "user", "content": "Can you explain LLMs in plain English?"},
]
Example with a partially-filled response from Claude:
[
{"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
{"role": "assistant", "content": "The best answer is ("},
]
Each input message content may be either a single string or an array of content blocks, where each block has a specific type. Using a string for content is shorthand for an array of one content block of type "text". The following input messages are equivalent:
{"role": "user", "content": "Hello, Claude"}
{"role": "user", "content": [{"type": "text", "text": "Hello, Claude"}]}
Starting with Claude 3 models, you can also send image content blocks:
{"role": "user", "content": [
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/jpeg",
"data": "/9j/4AAQSkZJRg...",
}
},
{"type": "text", "text": "What is in this image?"}
]}
We currently support the base64 source type for images, and the image/jpeg, image/png, image/gif, and image/webp media types.
See examples for more input examples.
Note that if you want to include a system prompt, you can use the top-level system parameter — there is no "system" role for input messages in the Messages API.
There is a limit of 100000 messages in a single request.
Show child attributes
model
string
required
The model that will complete your prompt.
See models for additional details and options.
Required string length: 1 - 256
metadata
object
An object describing metadata about the request.
Show child attributes
stop_sequences
string[]
Custom text sequences that will cause the model to stop generating.
Our models will normally stop when they have naturally completed their turn, which will result in a response stop_reason of "end_turn".
If you want the model to stop generating when it encounters custom strings of text, you can use the stop_sequences parameter. If the model encounters one of the custom sequences, the response stop_reason value will be "stop_sequence" and the response stop_sequence value will contain the matched stop sequence.
stream
boolean
Whether to incrementally stream the response using server-sent events.
See streaming for details.
system
string
System prompt.
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role. See our guide to system prompts.
temperature
number
Amount of randomness injected into the response.
Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks.
Note that even with temperature of 0.0, the results will not be fully deterministic.
Required range: 0 < x < 1
thinking
object
Configuration for enabling Claude's extended thinking.
When enabled, responses include thinking content blocks showing Claude's thinking process before the final answer. Requires a minimum budget of 1,024 tokens and counts towards your max_tokens limit.
See extended thinking for details.
Enabled
Disabled
Show child attributes
tool_choice
object
How the model should use the provided tools. The model can use a specific tool, any available tool, decide by itself, or not use tools at all.
Auto
Any
Tool
ToolChoiceNone
Show child attributes
tools
object[]
Definitions of tools that the model may use.
If you include tools in your API request, the model may return tool_use content blocks that represent the model's use of those tools. You can then run those tools using the tool input generated by the model and then optionally return results back to the model using tool_result content blocks.
Each tool definition includes:
name: Name of the tool.
description: Optional, but strongly-recommended description of the tool.
input_schema: JSON schema for the tool input shape that the model will produce in tool_use output content blocks.
For example, if you defined tools as:
[
{
"name": "get_stock_price",
"description": "Get the current stock price for a given ticker symbol.",
"input_schema": {
"type": "object",
"properties": {
"ticker": {
"type": "string",
"description": "The stock ticker symbol, e.g. AAPL for Apple Inc."
}
},
"required": ["ticker"]
}
}
]
And then asked the model "What's the S&P 500 at today?", the model might produce tool_use content blocks in the response like this:
[
{
"type": "tool_use",
"id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
"name": "get_stock_price",
"input": { "ticker": "^GSPC" }
}
]
You might then run your get_stock_price tool with {"ticker": "^GSPC"} as an input, and return the following back to the model in a subsequent user message:
[
{
"type": "tool_result",
"tool_use_id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
"content": "259.75 USD"
}
]
Tools can be used for workflows that include running client-side tools and functions, or more generally whenever you want the model to produce a particular JSON structure of output.
See our guide for more details.
Custom Tool
ComputerUseTool_20241022
BashTool_20241022
TextEditor_20241022
ComputerUseTool_20250124
BashTool_20250124
TextEditor_20250124
Show child attributes
top_k
integer
Only sample from the top K options for each subsequent token.
Used to remove "long tail" low probability responses. Learn more technical details here.
Recommended for advanced use cases only. You usually only need to use temperature.
Required range: x > 0
top_p
number
Use nucleus sampling.
In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches a particular probability specified by top_p. You should either alter temperature or top_p, but not both.
Recommended for advanced use cases only. You usually only need to use temperature.
Required range: 0 < x < 1
Response
200 - application/json
content
object[]
required
Content generated by the model.
This is an array of content blocks, each of which has a type that determines its shape.
Example:
[{"type": "text", "text": "Hi, I'm Claude."}]
If the request input messages ended with an assistant turn, then the response content will continue directly from that last turn. You can use this to constrain the model's output.
For example, if the input messages were:
[
{"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
{"role": "assistant", "content": "The best answer is ("}
]
Then the response content might be:
[{"type": "text", "text": "B)"}]
Text
Tool Use
Thinking
Redacted Thinking
Show child attributes
id
string
required
Unique object identifier.
The format and length of IDs may change over time.
model
string
required
The model that handled the request.
Required string length: 1 - 256
role
enum<string>
default:
assistant
required
Conversational role of the generated message.
This will always be "assistant".
Available options: assistant
stop_reason
enum<string> | null
required
The reason that we stopped.
This may be one the following values:
"end_turn": the model reached a natural stopping point
"max_tokens": we exceeded the requested max_tokens or the model's maximum
"stop_sequence": one of your provided custom stop_sequences was generated
"tool_use": the model invoked one or more tools
In non-streaming mode this value is always non-null. In streaming mode, it is null in the message_start event and non-null otherwise.
Available options: end_turn, max_tokens, stop_sequence, tool_use
stop_sequence
string | null
required
Which custom stop sequence was generated, if any.
This value will be a non-null string if one of your custom stop sequences was generated.
type
enum<string>
default:
message
required
Object type.
For Messages, this is always "message".
Available options: message
usage
object
required
Billing and rate-limit usage.
Anthropic's API bills and rate-limits by token counts, as tokens represent the underlying cost to our systems.
Under the hood, the API transforms requests into a format suitable for the model. The model's output then goes through a parsing stage before becoming an API response. As a result, the token counts in usage will not match one-to-one with the exact visible content of an API request or response.
For example, output_tokens will be non-zero, even for an empty string response from Claude.
Total input tokens in a request is the summation of input_tokens, cache_creation_input_tokens, and cache_read_input_tokens.
Show child attributes
Was this page helpful?
Yes
No
Getting help
Count Message tokens
x
linkedin
cURL
Python
JavaScript
curl https://api.anthropic.com/v1/messages \
--header "x-api-key: $ANTHROPIC_API_KEY" \
--header "anthropic-version: 2023-06-01" \
--header "content-type: application/json" \
--data \
'{
"model": "claude-3-7-sonnet-20250219",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "Hello, world"}
]
}'
200
4XX
{
"content": [
{
"text": "Hi! My name is Claude.",
"type": "text"
}
],
"id": "msg_013Zva2CMHLNnXjNJJKqJ2EF",
"model": "claude-3-7-sonnet-20250219",
"role": "assistant",
"stop_reason": "end_turn",
"stop_sequence": null,
"type": "message",
"usage": {
"input_tokens": 2095,
"output_tokens": 503
}
}
Messages - Anthropic
Messages
Count Message tokens
Count the number of tokens in a Message.
The Token Count API can be used to count the number of tokens in a Message, including tools, images, and documents, without creating it.