
Clarification needed on prefill phase identification in models/base.py #8

@m-alqblawi


Hello,

I've been looking into the implementation of ShadowKV and reading your excellent paper "ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference".
The paper clearly describes the pre-filling phase as handling the entire input context to build the initial KV cache, distinct from the decoding phase where new tokens are generated one by one.
While reviewing the code in models/base.py (linked below), I noticed the check `if q_len > 4*1024: # prefill`. This suggests that the prefill phase is identified by q_len (the query length) exceeding a threshold of 4096 tokens.
This raises two questions about how the code relates to the paper's description:

1. The paper defines prefill by the processing of the full context length, not a condition on q_len. How does this code logic (if q_len > 4*1024) relate to or implement the pre-filling phase described in the paper (e.g., processing sequence length s as in Algorithm 1)?

2. The number 4096 appears in the paper (e.g., Figure 5) as a chunk index (with a chunk size of 8), not a direct length threshold for sequence processing or query length. Is the 4*1024 (4096) value in the code intentionally related to the chunking strategy or Figure 5 in some way, or does it serve a different heuristic purpose?
I'm trying to fully understand how the system distinguishes between these phases and why this specific condition based on q_len might be used, especially since q_len is typically 1 during step-by-step decoding.
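
To make the question concrete, here is a minimal sketch of how I currently read the condition. It is not taken from models/base.py: the helper names (`classify_by_threshold`, `classify_by_cache`, `chunk_start_token`) are hypothetical, and the cache-based variant is only my assumption about what an alternative check could look like. The chunk arithmetic also assumes contiguous, zero-indexed chunks, which may not match Figure 5.

```python
PREFILL_THRESHOLD = 4 * 1024  # the value that appears in the q_len check


def classify_by_threshold(q_len: int, threshold: int = PREFILL_THRESHOLD) -> str:
    """Hypothetical helper: label a forward pass the way `q_len > threshold` would."""
    return "prefill" if q_len > threshold else "decode"


def classify_by_cache(q_len: int, cached_len: int) -> str:
    """Assumed alternative: treat a pass as prefill when nothing is cached yet."""
    return "prefill" if cached_len == 0 else "decode"


def chunk_start_token(chunk_index: int, chunk_size: int) -> int:
    """First token position of a chunk, assuming contiguous zero-indexed chunks."""
    return chunk_index * chunk_size


# A long prompt handled in one pass, followed by step-by-step decoding.
print(classify_by_threshold(32 * 1024))  # prefill
print(classify_by_threshold(1))          # decode

# A short prompt (at most 4096 tokens) is also labeled "decode" by the
# threshold check, even though it is the first pass over the context.
print(classify_by_threshold(2048))       # decode
print(classify_by_cache(2048, 0))        # prefill

# Under my reading of Figure 5, chunk index 4096 with chunk size 8
# would start at token position 32768, not 4096.
print(chunk_start_token(4096, 8))        # 32768
```

If this reading is right, the threshold check and the paper's definition agree only for prompts longer than 4096 tokens, which is the case I would like to confirm.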
Any clarification you could provide would be greatly appreciated! Thank you for sharing this work.

https://github.com/ByteDance-Seed/ShadowKV/blob/main/models/base.py#L102


Labels: question
