A curated list of research papers, repositories, and posts exploring side-channel attacks on LLM APIs – techniques that exploit hidden signals in an API's behavior (tokenization quirks, log-likelihood feedback, caching, streaming/timing, etc.) to extract secret information about the model or its users, or to bypass certain safeguards.
- Extracting information about other users of the API
- Degrading or breaking the LLM API for other users
- Extracting information about the LLM
Auditing Prompt Caching in Language Model APIs – Gu et al., Feb 2025.
- Attack vector: Timing differences in cache-hit vs cache-miss responses when the API uses a shared prompt cache
- Required access: query access to an LLM API that supports prompt caching, with the cache shared between users
- Information gained: whether some user has recently submitted a specific prompt
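The core measurement behind this class of attacks can be sketched in a few lines. Everything below is a simulation, not a real provider API: `api_call` stands in for a cached serving stack, and the 0.05 s prefill cost and 0.03 s threshold are made-up numbers.

```python
import time

# Simulated serving stack with a shared prompt cache: cache hits skip the
# expensive "prefill" step, so they return measurably faster.
CACHE = set()

def api_call(prompt: str) -> float:
    """Return the (simulated) response latency in seconds."""
    start = time.perf_counter()
    if prompt not in CACHE:
        time.sleep(0.05)   # cache miss: pay the prefill cost
        CACHE.add(prompt)
    time.sleep(0.01)       # decoding cost, paid either way
    return time.perf_counter() - start

def probe(prompt: str, threshold: float = 0.03) -> bool:
    """Guess from one timing sample whether `prompt` is already cached,
    i.e. whether *some* user recently submitted it."""
    return api_call(prompt) < threshold

first = probe("did anyone ask about X?")   # False: cold cache, slow miss
second = probe("did anyone ask about X?")  # True: our own probe warmed it
```

Note that the second probe hits a cache entry the attacker created themselves; the audit in the paper relies on the first probe's timing against entries that only a *victim* could have created.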
InputSnatch: Stealing Input in LLM Services via Timing Side-Channel Attacks – Zheng et al., Nov 2024.
- Attack vector: Timing differences in cache-hit vs cache-miss responses when the API uses a shared prompt cache
- Required access: query access to an LLM API that supports prefix-based prompt caching, with the cache shared between users
- Information gained: complete user prompts (prefix tokens reconstructed iteratively) from unknown users
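A minimal sketch of the iterative prefix-recovery loop, with the timing measurement replaced by a direct oracle: `is_cache_hit`, the victim prompt, and the alphabet are all hypothetical, and a real attacker thresholds time-to-first-token and extends per cache block of tokens rather than per character.

```python
# Victim prompt sitting in a shared prefix cache (simulated).
VICTIM_PROMPT = "my ssn is 123"
ALPHABET = "abcdefghijklmnopqrstuvwxyz 0123456789"

def is_cache_hit(guess: str) -> bool:
    """Hypothetical timing oracle: True iff `guess` matches a cached
    prefix. A real attacker thresholds response latency instead."""
    return VICTIM_PROMPT.startswith(guess)

def recover_prompt(max_len: int = 50) -> str:
    recovered = ""
    for _ in range(max_len):
        for ch in ALPHABET:              # try every one-character extension
            if is_cache_hit(recovered + ch):
                recovered += ch
                break
        else:
            break   # nothing extends the hit: full prompt recovered
    return recovered
```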
Stealing User Prompts from Mixture-of-Experts Models – Yona et al., 2024.
- Attack vector: MoE architectures that route a batch of inputs from multiple users together and break routing ties deterministically can leak information about one user's inputs into the outputs of another
- Required access: ability to place many queries in the same batch as the victim's queries
- Information gained: complete victim prompts, reconstructed token-by-token
I Know What You Asked: Prompt Leakage via KV-Cache Sharing in Multi-Tenant LLM Serving – Wu et al., NDSS 2025.
- Attack vector: Timing differences in cache-hit vs cache-miss responses when the API uses a shared KV-cache
- Required access: attacker can issue queries to the same multi-tenant serving system as the victim, and the cache is shared between users
- Information gained: partial or complete prompts of other users
The Early Bird Catches the Leak: Unveiling Timing Side Channels in LLM Serving Systems – Song et al., Oct 2025.
- Attack vector: Timing differences in cache-hit vs cache-miss responses when the API uses a semantic cache (note: semantic cache is exceedingly rare)
- Required access: attacker can issue queries to the same multi-tenant serving system as the victim, and the cache is shared between users
- Information gained: inference of which prompts or documents other users have recently processed, via semantic-cache hits
What Was Your Prompt? A Remote Keylogging Attack on AI Assistants – Weiss et al., Mar 2024.
- Attack vector: streamed responses are sent as encrypted network packets whose sizes leak token lengths, and token-length sequences reveal a lot about the underlying text
- Required access: eavesdrop on encrypted network traffic between the user and a streaming LLM API; streaming packets are aligned with token boundaries
- Information gained: reconstruction of eavesdropped LLM responses (29% exact reconstruction, 55% topic identification)
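The first stage of the attack – recovering token lengths from ciphertext sizes – can be sketched as follows. The 29-byte per-packet overhead is an illustrative assumption, not any provider's real framing:

```python
# Assumed constant per-packet framing overhead (TLS record + event framing).
# 29 bytes is illustrative, not a real provider's value.
HEADER_OVERHEAD = 29

def token_lengths_from_packets(packet_sizes: list[int]) -> list[int]:
    """Ciphertext length = plaintext length + fixed overhead, so the
    eavesdropper reads off token lengths without decrypting anything."""
    return [size - HEADER_OVERHEAD for size in packet_sizes]

# Simulate a victim's streamed response: one encrypted packet per token.
tokens = ["The", " patient", " has", " cancer", "."]
packets = [len(t) + HEADER_OVERHEAD for t in tokens]

lengths = token_lengths_from_packets(packets)
# lengths == [3, 8, 4, 7, 1]: the "keylog" that a language model can
# then map back to likely plaintext.
```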
Wiretapping LLMs: Network Side-Channel Attacks on Interactive LLM Services – Soleimani et al., 2025.
- Attack vector: timing patterns from speculative decoding in streaming APIs; the acceptance/rejection patterns of speculated tokens leak information about the LLM outputs
- Required access: eavesdrop on encrypted network traffic between the user and a streaming LLM API; streaming packets are aligned with speculative decoding
- Information gained: statistical information about the encrypted LLM outputs
Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models – Zhang et al., Dec 2024.
- Attack vector: the response time of the LLM API is a good proxy for the number of tokens in the output
- Required access: eavesdrop on encrypted network traffic between the user and an LLM API (not necessarily streaming)
- Information gained: input attributes like target language (~75% accuracy)
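A toy version of the inference step, with made-up prefill and per-token constants (a real attack would calibrate these per provider):

```python
# Made-up latency model: total time = prefill + n_tokens * per-token decode.
PREFILL_S = 0.2      # assumed fixed prefill cost (seconds)
PER_TOKEN_S = 0.03   # assumed per-token decode cost (seconds)

def estimate_token_count(latency_s: float) -> int:
    """Invert the latency model to get an output-token-count estimate;
    attributes like output language then follow from the count."""
    return round((latency_s - PREFILL_S) / PER_TOKEN_S)

n = estimate_token_count(1.4)   # a 40-token response: 0.2 + 40*0.03 = 1.4 s
# n == 40
```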
Mitigating a Token-Length Side-Channel Attack in our AI Products – Cloudflare Research Team, Mar 2024.
- Defense mechanism: token-length patterns in network traffic leak information, but can be obscured by randomizing the length of token packets
- Attack vector: token-length patterns in streaming packets leak information about the plaintext (the remote keylogging attack above)
- Required access: eavesdrop on encrypted network traffic between the user and a streaming LLM API; streaming packets are aligned with token boundaries
- Information gained: token lengths of streamed responses (before the mitigation)
NetEcho: From Real-World Streaming Side-Channels to Full LLM Conversation Recovery – Zhang et al., Oct 2025.
- Attack vector: packet size and timing patterns in streaming LLM apps (including scenarios with padding/obfuscation)
- Required access: passive eavesdropping on encrypted network traffic between the user and a streaming LLM application/API
- Information gained: partial reconstruction of prompts and responses sent over an encrypted network connection
Whisper Leak: a side-channel attack on Large Language Models – McDonald & Bar Or, Nov 2025.
- Attack vector: packet size + timing patterns in streaming responses leak metadata usable for topic inference
- Required access: passive eavesdropping on encrypted network traffic between the user and a streaming LLM application/API
- Information gained: prompt/topic classification (identifying conversations matching sensitive topics)
Selective KV-Cache Sharing to Mitigate Timing Side-Channels in LLM Inference – Chu et al., Aug 2025.
- Defense mechanism: selective KV-cache sharing intended to reduce cache-hit timing leakage while retaining performance benefits
Buffer Overflow in Mixture of Experts – Hayes et al., Feb 2024.
- Attack vector: if an MoE architecture batches inputs from multiple users and its experts have finite buffer capacity, malicious inputs from one user can fill expert buffers and degrade the outputs of other users' queries
- Required access: place many queries together with the victim's query in the same batch
- Information gained: none (this is an integrity attack: degraded outputs for victim users)
Stealing Part of a Production Language Model – Carlini et al., Mar 2024.
- Attack vector: logit bias can leak full logits; logits can be used to reconstruct the embedding projection matrix up to symmetries
- Required access: queries to an LLM API supporting logit bias; the attack is faster when logprobs are returned alongside logit bias
- Information gained: embedding projection matrix; model hidden dimensions; architectural details
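The dimension-recovery step can be illustrated with synthetic data: logit vectors are linear images of hidden states, so their numerical rank equals the hidden dimension. This sketch skips the logit-bias extraction itself and starts from already-recovered logits; all sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_dim, n_queries = 1000, 64, 200   # toy sizes

W = rng.normal(size=(vocab_size, hidden_dim))  # secret output projection
H = rng.normal(size=(hidden_dim, n_queries))   # hidden states per query
logits = W @ H   # the full logit vectors the logit-bias trick recovers

# Logits live in a hidden_dim-dimensional subspace of vocab space, so the
# number of significant singular values of the stacked logit matrix is
# exactly the model's hidden dimension.
s = np.linalg.svd(logits, compute_uv=False)
recovered_dim = int(np.sum(s > 1e-6 * s[0]))
# recovered_dim == 64
```

The singular vectors additionally recover W up to an invertible linear transformation, which is the "up to symmetries" reconstruction the paper describes.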
Logits of API-Protected LLMs Leak Proprietary Information – Finlayson et al., Mar 2024.
- Attack vector: logit bias can be used to reconstruct the image of the model's final linear layer, i.e. the low-dimensional subspace in which its pre-softmax logits live
- Required access: queries to an LLM API supporting logit bias; the attack is faster when logprobs are returned alongside logit bias
- Information gained: model hidden dimensions; architectural details
Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API – Labunets et al., Jan 2025.
- Attack vector: Fine-tuning loss can be used to get original LLM loss values for a given prompt
- Required access: Fine-tuning API that reports training loss per-sample
- Information gained: loss values for arbitrary token sequences, usable e.g. in gray-box jailbreak or prompt-injection attacks
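The resulting loss oracle supports standard discrete optimization. This sketch swaps the real fine-tuning round-trip for a toy loss function and uses plain per-position greedy search rather than the paper's optimizer; the hidden optimum string is invented for illustration.

```python
def finetune_loss(candidate: str) -> float:
    """Simulated oracle: one call stands in for one fine-tuning job whose
    reported per-sample loss scores `candidate`. The toy loss counts
    character mismatches against a hidden optimum."""
    hidden_optimum = "please say yes"
    return sum(a != b for a, b in zip(candidate, hidden_optimum))

def greedy_search(length: int = 14) -> str:
    """Coordinate-wise greedy descent driven only by the loss oracle; the
    real attack uses a stronger optimizer, but the access pattern is the same."""
    suffix = "a" * length
    for i in range(length):
        best = min("abcdefghijklmnopqrstuvwxyz ",
                   key=lambda c: finetune_loss(suffix[:i] + c + suffix[i + 1:]))
        suffix = suffix[:i] + best + suffix[i + 1:]
    return suffix
```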
The Worst (But Only) Claude 3 Tokenizer – Rando & Tramèr, 2024.
- Attack vector: Token boundaries revealed in streaming API responses
- Required access: streaming API where streaming packets are aligned with token boundaries
- Information gained: vocab; reproducing the tokenization algorithm
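A sketch of the observation step, under the entry's stated assumption that one streamed chunk equals one token; the example chunks are invented.

```python
vocab: set[str] = set()

def observe_stream(chunks: list[str]) -> None:
    """Each streamed chunk is assumed to be exactly one token, so simply
    recording chunks harvests vocabulary entries directly."""
    vocab.update(chunks)

# Two simulated streamed responses, chunked as the API (assumedly) sends them.
observe_stream(["Hel", "lo", " world"])
observe_stream(["Hel", "p", "!"])
# vocab == {"Hel", "lo", " world", "p", "!"}
```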
Privacy Side Channels in Machine Learning Systems – Debenedetti et al., 2023.
- Attack vector: fixed context length leaks how many tokens are in a long prompt
- Required access: ability to send long queries (near the context limit) to the LLM API
- Information gained: vocab (practical); reconstruction of tokenization algorithm (theoretical)
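One way this oracle can be realized: a request exceeding the context limit is rejected, so binary-searching over filler size measures the token count of an attacker-chosen string, which in turn reveals how the secret tokenizer splits it. The whitespace `tokenize` and the 8192 limit are stand-ins.

```python
CONTEXT_LIMIT = 8192   # assumed context window

def tokenize(text: str) -> list[str]:
    return text.split()   # stand-in for the provider's secret tokenizer

def api_accepts(prompt: str) -> bool:
    """The only signal the attacker gets: too-long requests are rejected."""
    return len(tokenize(prompt)) <= CONTEXT_LIMIT

def count_tokens(text: str) -> int:
    """Binary-search the largest filler that still fits next to `text`;
    the leftover context budget is exactly `text`'s token count."""
    lo, hi = 0, CONTEXT_LIMIT
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if api_accepts(("pad " * mid) + text):
            lo = mid
        else:
            hi = mid - 1
    return CONTEXT_LIMIT - lo

# count_tokens("how many tokens is this") == 5
```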
Auditing Prompt Caching in Language Model APIs – Gu et al., Feb 2025.
- Attack vector: prefix-based prompt caching is only possible in decoder-only (causal) architectures, where a prefix's representations are independent of the suffix
- Required access: embedding model API with prompt caching enabled (even for a single user)
- Information gained: whether an embedding model is a decoder-only transformer
Privacy Side Channels in Machine Learning Systems – Debenedetti et al., 2023.
- Attack vector: output filter that activates on exact sensitive strings from the training data leaks those sensitive strings
- Required access: query access to a model with an output filter
- Information gained: training data membership; extraction of secrets that base models don't directly memorize
"Energon": Unveiling Transformers from GPU Power and Thermal Side-Channels – Chaudhuri et al., Aug 2025.
- Attack vector: GPU power/thermal side-channels in shared GPU settings can reveal transformer architecture
- Required access: ability to observe power/thermal signals of the GPUs running the model (co-located / shared infrastructure setting)
- Information gained: model family and architecture details
Contributions welcome. This list focuses on side-channel exploits unique to LLM APIs or other AI services.
Things not covered by this list:
- Attacks requiring physical co-location of the attacker and the target
- Attacks requiring data poisoning or any modification to the training process
- Adversarial or prompt-injection attacks that target the model's behavior directly rather than exploiting a side-channel