[Performance] Add PrefixStore -> KVBlocks Caching #9

@vMaroon

Description

In the current implementation, on every external request, the indexer performs the following:

	// 1. get available tokens of longest prefix
	tokens := k.tokensIndexer.FindLongestContainedTokens(prompt, modelName)
	...

	// 2. get block keys
	blockKeys := k.tokensProcessor.TokensToKVBlockKeys(tokens, modelName)
	...

	// 3. query kvblock indexer for pods
	strBlockKeys, keyToPods, err := k.kvBlockIndexer.GetPodsForKeys(ctx, blockKeys, sets.New(podIdentifiers...))
	...

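It should be possible to cache the result of (2) directly in (1), so that the prefix lookup returns the block keys and the per-request key computation is skipped, provided the cached entries can be self-contained.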
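A minimal sketch of the idea, not the project's actual implementation: a toy prefix store that computes block keys once, when a prefix is added, and returns them directly on lookup. All names here (`PrefixStore`, `BlockKey`, `KeyFunc`, `FindLongestContainedBlockKeys`) are hypothetical, the token and key types are assumptions, and the linear scan over prefix lengths only stands in for whatever structure the real tokens indexer uses.

```go
package main

import (
	"fmt"
	"sync"
)

// BlockKey is a hypothetical stand-in for the indexer's KV-block key type.
type BlockKey uint64

// KeyFunc is a hypothetical stand-in for the step-2 computation
// (tokens -> KV-block keys).
type KeyFunc func(tokens []uint32, modelName string) []BlockKey

// cachedPrefix keeps the block keys next to the tokens they were derived from.
type cachedPrefix struct {
	tokens    []uint32
	blockKeys []BlockKey
}

// PrefixStore is a toy prefix store keyed by (model, prefix text).
type PrefixStore struct {
	mu      sync.RWMutex
	toKeys  KeyFunc
	entries map[string]cachedPrefix
}

func NewPrefixStore(toKeys KeyFunc) *PrefixStore {
	return &PrefixStore{toKeys: toKeys, entries: make(map[string]cachedPrefix)}
}

// entryKey scopes entries by model so that keys are never shared across models.
func entryKey(modelName, prefix string) string {
	return modelName + "\x00" + prefix
}

// Add stores a tokenized prefix and computes its block keys once, at write time.
func (s *PrefixStore) Add(modelName, prefix string, tokens []uint32) {
	keys := s.toKeys(tokens, modelName)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.entries[entryKey(modelName, prefix)] = cachedPrefix{tokens: tokens, blockKeys: keys}
}

// FindLongestContainedBlockKeys returns the cached block keys of the longest
// stored prefix of prompt, without recomputing them.
func (s *PrefixStore) FindLongestContainedBlockKeys(prompt, modelName string) ([]BlockKey, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	// Toy lookup: try every prefix length, longest first.
	for end := len(prompt); end > 0; end-- {
		if e, ok := s.entries[entryKey(modelName, prompt[:end])]; ok {
			return e.blockKeys, true
		}
	}
	return nil, false
}

func main() {
	calls := 0
	// Toy key function: one chained key per block of two tokens.
	toyKeys := func(tokens []uint32, _ string) []BlockKey {
		calls++
		keys := make([]BlockKey, 0, len(tokens)/2)
		var h uint64
		for i := 0; i+2 <= len(tokens); i += 2 {
			h = h*31 + uint64(tokens[i])*7 + uint64(tokens[i+1])
			keys = append(keys, BlockKey(h))
		}
		return keys
	}

	store := NewPrefixStore(toyKeys)
	store.Add("model-a", "hello world", []uint32{1, 2, 3, 4})

	// Repeated requests reuse the cached keys.
	for i := 0; i < 3; i++ {
		keys, ok := store.FindLongestContainedBlockKeys("hello world, how are you?", "model-a")
		fmt.Println(keys, ok)
	}
	fmt.Println("key computations:", calls)
}
```

In this sketch the key computation runs once per stored prefix instead of once per request. Whether that pays off depends on how expensive step 2 is relative to the lookup, and the cached keys are only valid as long as everything they depend on (model, block size, hashing scheme) is captured in the entry, which is the self-containment condition mentioned above.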

[Should be profiling driven]
