)
```

## Token Usage Tracking

LLM-based guardrails (Jailbreak, Custom Prompt Check, etc.) consume tokens. You can track token usage across all guardrail calls using the unified `total_guardrail_token_usage` function:

```python
from guardrails import GuardrailsAsyncOpenAI, total_guardrail_token_usage

client = GuardrailsAsyncOpenAI(config="config.json")
response = await client.responses.create(model="gpt-4o", input="Hello")

# Get aggregated token usage from all guardrails
tokens = total_guardrail_token_usage(response)
print(f"Guardrail tokens used: {tokens['total_tokens']}")
# Output: Guardrail tokens used: 425
```

The function returns a dictionary:

```python
{
    "prompt_tokens": 300,      # Sum of prompt tokens across all LLM guardrails
    "completion_tokens": 125,  # Sum of completion tokens
    "total_tokens": 425,       # Total tokens used by guardrails
}
```
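If you make several guardrailed calls in one session, you may want a running total. A minimal sketch, assuming each per-call usage dict has the shape above — the `accumulate_usage` helper is illustrative, not part of the library:

```python
# Hypothetical helper: fold one call's guardrail token usage into a running total.
def accumulate_usage(running: dict, usage: dict) -> dict:
    for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
        running[key] = running.get(key, 0) + usage.get(key, 0)
    return running

session_total = {}
for usage in [
    # Stand-ins for what total_guardrail_token_usage would return per call
    {"prompt_tokens": 300, "completion_tokens": 125, "total_tokens": 425},
    {"prompt_tokens": 280, "completion_tokens": 90, "total_tokens": 370},
]:
    accumulate_usage(session_total, usage)

print(session_total)
# {'prompt_tokens': 580, 'completion_tokens': 215, 'total_tokens': 795}
```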

### Works Across All Surfaces

`total_guardrail_token_usage` works with any guardrails result type:

```python
# OpenAI client responses
response = await client.responses.create(...)
tokens = total_guardrail_token_usage(response)

# Streaming (use the last chunk)
async for chunk in stream:
    last_chunk = chunk
tokens = total_guardrail_token_usage(last_chunk)

# Agents SDK
result = await Runner.run(agent, input)
tokens = total_guardrail_token_usage(result)
```

### Per-Guardrail Token Usage

Each guardrail result includes its own token usage in the `info` dict:

**OpenAI Clients (GuardrailsAsyncOpenAI, etc.)**:

```python
response = await client.responses.create(model="gpt-4.1", input="Hello")

for gr in response.guardrail_results.all_results:
    usage = gr.info.get("token_usage")
    if usage:
        print(f"{gr.info['guardrail_name']}: {usage['total_tokens']} tokens")
```

**Agents SDK** - access token usage per stage via `RunResult`:

```python
result = await Runner.run(agent, "Hello")

# Input guardrails
for gr in result.input_guardrail_results:
    usage = gr.output.output_info.get("token_usage") if gr.output.output_info else None
    if usage:
        print(f"Input: {usage['total_tokens']} tokens")

# Output guardrails
for gr in result.output_guardrail_results:
    usage = gr.output.output_info.get("token_usage") if gr.output.output_info else None
    if usage:
        print(f"Output: {usage['total_tokens']} tokens")

# Tool guardrails: result.tool_input_guardrail_results, result.tool_output_guardrail_results
```

Non-LLM guardrails (URL Filter, Moderation, PII) don't consume tokens and won't have `token_usage` in their info.

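A missing `token_usage` key is therefore expected, not an error. A minimal sketch of a per-guardrail breakdown that tolerates both kinds of result — plain dicts stand in for the real result objects, and `usage_breakdown` is an illustrative helper, not a library function:

```python
# Illustrative info dicts: LLM guardrails carry token_usage, non-LLM ones don't.
results = [
    {"guardrail_name": "Jailbreak", "token_usage": {"total_tokens": 310}},
    {"guardrail_name": "Moderation"},  # non-LLM check: no token_usage key
    {"guardrail_name": "Custom Prompt Check", "token_usage": {"total_tokens": 115}},
]

def usage_breakdown(infos: list) -> dict:
    """Map each LLM guardrail to its token count, skipping non-LLM checks."""
    return {
        info["guardrail_name"]: info["token_usage"]["total_tokens"]
        for info in infos
        if info.get("token_usage")  # absent for URL Filter, Moderation, PII
    }

print(usage_breakdown(results))
# {'Jailbreak': 310, 'Custom Prompt Check': 115}
```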
## Next Steps

- Explore [examples](./examples.md) for advanced patterns