[Platform][Ollama] Add prompt cache #416
Conversation
$result = $agent->call($messages, [
    'prompt_cache_key' => 'chat',
]);

echo $result->getContent().\PHP_EOL;

$secondResult = $agent->call($messages, [
    'prompt_cache_key' => 'chat',
]);

echo $secondResult->getContent().\PHP_EOL;
How can we ensure that it really uses the cache and does not just return the exact same answer twice?
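One way to check this (a sketch only: it assumes the cached and cached_prompt_count metadata keys added in this PR are populated on a cache hit, and that the agent result exposes the same metadata as the platform result) is to assert on the second result's metadata instead of comparing the generated contents:

$secondResult = $agent->call($messages, [
    'prompt_cache_key' => 'chat',
]);

// The metadata, not the content, is the reliable signal of a cache hit.
assert(true === $secondResult->getMetadata()->get('cached'));
assert($secondResult->getMetadata()->get('cached_prompt_count') > 0);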
->arrayNode('ollama')
    ->children()
        ->scalarNode('host_url')->defaultValue('http://127.0.0.1:11434')->end()
        ->scalarNode('cache')->end()
We might end up with the same cache repeated again and again in every platform.
Should we introduce a cache config key at a higher level?
That's a good question. IMHO we should introduce a "root" key for it and allow overriding it per platform. @OskarStark @chr-hertel, any idea?
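For illustration, the configuration tree could look like this (a sketch only; apart from the ollama, host_url and cache nodes already in this PR, the surrounding structure and the $rootNode variable are assumptions):

// Root-level default cache pool, overridable per platform.
$rootNode
    ->children()
        ->scalarNode('cache')->info('Default cache service used by every platform')->end()
        ->arrayNode('ollama')
            ->children()
                ->scalarNode('host_url')->defaultValue('http://127.0.0.1:11434')->end()
                ->scalarNode('cache')->info('Overrides the root-level cache for Ollama only')->end()
            ->end()
        ->end()
    ->end();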
$metadata->add('cached', true);
$metadata->add('prompt_cache_key', $options['prompt_cache_key']);
$metadata->add('cached_prompt_count', $data['prompt_eval_count']);
$metadata->add('cached_completion_count', $data['eval_count']);
Wouldn't it make sense to group this data into a DTO, like there is TokenUsage, and then add that DTO to the metadata, or perhaps even reuse said DTO?
Not convinced about the benefits of using an object here; we're only storing an integer, so I don't see the benefit, to be honest 🤔
@OskarStark @chr-hertel Any thoughts?
I agree, it would be great to have an object like CacheUsage, similar to TokenUsage.
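For illustration, a minimal sketch of such a value object (the CacheUsage name and its fields are assumptions, loosely modelled on the TokenUsage idea):

// Hypothetical value object grouping the cache-related metadata in one entry.
final readonly class CacheUsage
{
    public function __construct(
        public string $promptCacheKey,
        public ?int $cachedPromptCount = null,
        public ?int $cachedCompletionCount = null,
    ) {
    }
}

// It could then replace the separate metadata keys with a single entry:
$metadata->add('cache_usage', new CacheUsage(
    $options['prompt_cache_key'],
    $data['prompt_eval_count'] ?? null,
    $data['eval_count'] ?? null,
));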
$firstCall = $platform->invoke(new Ollama(Ollama::LLAMA_3_2), [
    'messages' => [
        [
            'role' => 'user',
            'content' => 'Say hello world',
        ],
    ],
    'model' => 'llama3.2',
], [
    'prompt_cache_key' => 'foo',
]);

$result = $firstCall->getResult();

$this->assertSame('Hello world', $result->getContent());
$this->assertSame(10, $result->getMetadata()->get('cached_prompt_count'));
$this->assertSame(10, $result->getMetadata()->get('cached_completion_count'));

$secondCall = $platform->invoke(new Ollama(Ollama::LLAMA_3_2), [
    'messages' => [
        [
            'role' => 'user',
            'content' => 'Say hello world',
        ],
    ],
    'model' => 'llama3.2',
], [
    'prompt_cache_key' => 'foo',
]);

$secondResult = $secondCall->getResult();

$this->assertSame('Hello world', $secondResult->getContent());
Suggested change: invoke both calls first, then fetch the results and assert:

$firstCall = $platform->invoke(new Ollama(Ollama::LLAMA_3_2), [
    'messages' => [
        [
            'role' => 'user',
            'content' => 'Say hello world',
        ],
    ],
    'model' => 'llama3.2',
], [
    'prompt_cache_key' => 'foo',
]);

$secondCall = $platform->invoke(new Ollama(Ollama::LLAMA_3_2), [
    'messages' => [
        [
            'role' => 'user',
            'content' => 'Say hello world',
        ],
    ],
    'model' => 'llama3.2',
], [
    'prompt_cache_key' => 'foo',
]);

$firstResult = $firstCall->getResult();
$secondResult = $secondCall->getResult();

$this->assertSame('Hello world', $firstResult->getContent());
$this->assertSame(10, $firstResult->getMetadata()->get('cached_prompt_count'));
$this->assertSame(10, $firstResult->getMetadata()->get('cached_completion_count'));
$this->assertSame('Hello world', $secondResult->getContent());
Let's zoom a bit out here, for two reasons:
Ollama does "context caching" and/or K/V caching: it stores the X latest messages for the model window (or pending tokens to speed up TTFT). It is not a cache that returns the generated response if the request already exists.
Well, because that's the one that I use the most and the easiest to implement first, but we can integrate it for every platform if that's the question; we just need to use the API contract, and both Anthropic and OpenAI already do it natively 🤔 If the question is "could we implement it at the platform layer for every platform without relying on API calls?", that's not a big deal to be honest, and we could easily integrate it 🙂
What do you think about having it as a decorator?
I like the idea of
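For illustration, such a decorator could look roughly like this (a sketch only: the PlatformInterface name and the invoke() signature are assumptions, not the real contract, and a plain PSR-6 pool stands in for the cache):

use Psr\Cache\CacheItemPoolInterface;

// Hypothetical bridge-agnostic decorator: caches results by prompt_cache_key.
final readonly class CachedPlatform implements PlatformInterface
{
    public function __construct(
        private PlatformInterface $inner,
        private CacheItemPoolInterface $cache,
    ) {
    }

    public function invoke(object $model, array|string|object $input, array $options = []): mixed
    {
        // Without a cache key, fall through to the decorated platform.
        if (!isset($options['prompt_cache_key'])) {
            return $this->inner->invoke($model, $input, $options);
        }

        $item = $this->cache->getItem($options['prompt_cache_key']);

        if ($item->isHit()) {
            return $item->get();
        }

        $result = $this->inner->invoke($model, $input, $options);
        $this->cache->save($item->set($result));

        return $result;
    }
}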
if ('ollama' === $type) {
    $arguments = [
        $platform['host_url'],
        new Reference('http_client', ContainerInterface::NULL_ON_INVALID_REFERENCE),
Suggested change: also pass the model catalog reference:

new Reference('http_client', ContainerInterface::NULL_ON_INVALID_REFERENCE),
new Reference('ai.platform.model_catalog.ollama'),
if (\array_key_exists('cache', $platform)) {
    $arguments[] = new Reference($platform['cache'], ContainerInterface::NULL_ON_INVALID_REFERENCE);
}
$arguments is not used
Changes here would also belong in CachedPlatform, so every bridge can benefit from this decorator.
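In that spirit, the wiring could stay bridge-agnostic, e.g. (hypothetical factory and variable names, for illustration only):

// Any bridge's platform gets wrapped by the same decorator.
$ollama = new CachedPlatform(PlatformFactory::create($hostUrl, $httpClient), $cachePool);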
Hi 👋🏻
This PR aims to introduce a caching layer for the Ollama platform (as OpenAI, Anthropic, and others already do).