[Performance] Add PrefixStore -> KVBlocks Caching

In the current implementation, on every external request, the indexer performs the following:

```
// 1. Get the available tokens of the longest prefix.
tokens := k.tokensIndexer.FindLongestContainedTokens(prompt, modelName)
...

// 2. Get the block keys.
blockKeys := k.tokensProcessor.TokensToKVBlockKeys(tokens, modelName)
...

// 3. Query the kvblock indexer for pods.
strBlockKeys, keyToPods, err := k.kvBlockIndexer.GetPodsForKeys(ctx, blockKeys, sets.New(podIdentifiers...))
...
```

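It should be possible to cache the result of step 2 (the KV-block keys) directly in the prefix store used by step 1, so that the keys are not recomputed on every request. This only works if the cached entry can be self-contained.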
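As a rough illustration of the idea, here is a minimal sketch of memoizing step 2. It is not the project's implementation: `BlockKey`, `ComputeFunc`, `BlockKeyCache`, and the `[]uint32` token type are all hypothetical stand-ins, and the sketch uses a standalone wrapper keyed by model name and token sequence rather than storing keys inside the prefix store itself. It also assumes model names contain no NUL byte and leaves out eviction, which a real cache would need.

```go
package main

import (
	"encoding/binary"
	"sync"
)

// BlockKey is a hypothetical stand-in for the real KV-block key type.
type BlockKey struct {
	ModelName string
	ChunkHash uint64
}

// ComputeFunc has the assumed shape of a tokens-to-block-keys conversion.
type ComputeFunc func(tokens []uint32, modelName string) []BlockKey

// BlockKeyCache memoizes block keys per (model name, token sequence).
// It is unbounded; a real cache would need an eviction policy.
type BlockKeyCache struct {
	mu      sync.RWMutex
	entries map[string][]BlockKey
	compute ComputeFunc
}

// NewBlockKeyCache wraps compute with an empty cache.
func NewBlockKeyCache(compute ComputeFunc) *BlockKeyCache {
	return &BlockKeyCache{entries: make(map[string][]BlockKey), compute: compute}
}

// cacheKey encodes the model name, a NUL separator, and each token as
// four little-endian bytes, so equal inputs always map to the same key.
func cacheKey(tokens []uint32, modelName string) string {
	buf := make([]byte, 0, len(modelName)+1+4*len(tokens))
	buf = append(buf, modelName...)
	buf = append(buf, 0)
	var tmp [4]byte
	for _, t := range tokens {
		binary.LittleEndian.PutUint32(tmp[:], t)
		buf = append(buf, tmp[:]...)
	}
	return string(buf)
}

// Get returns the cached block keys, computing and storing them on a miss.
// Concurrent misses for the same key may each call compute; that is harmless
// as long as compute is deterministic. Callers must not mutate the result.
func (c *BlockKeyCache) Get(tokens []uint32, modelName string) []BlockKey {
	key := cacheKey(tokens, modelName)

	c.mu.RLock()
	keys, ok := c.entries[key]
	c.mu.RUnlock()
	if ok {
		return keys
	}

	keys = c.compute(tokens, modelName)

	c.mu.Lock()
	c.entries[key] = keys
	c.mu.Unlock()
	return keys
}
```

Whether a wrapper like this helps depends on how expensive the key computation is relative to building and looking up the cache key, which is exactly what profiling would need to show.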

[This should be profiling-driven.]

[Performance] Add PrefixStore -> KVBlocks Caching #9

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Performance] Add PrefixStore -> KVBlocks Caching #9

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions