diff --git a/content/manuals/ai/model-runner/_index.md b/content/manuals/ai/model-runner/_index.md
index bda21d4d1980..b8491af037ec 100644
--- a/content/manuals/ai/model-runner/_index.md
+++ b/content/manuals/ai/model-runner/_index.md
@@ -385,3 +385,7 @@ The Docker Model CLI currently lacks consistent support for specifying models by
 ## Share feedback
 
 Thanks for trying out Docker Model Runner. Give feedback or report any bugs you may find through the **Give feedback** link next to the **Enable Docker Model Runner** setting.
+
+## Related pages
+
+- [Use Model Runner with Compose](/manuals/compose/how-tos/model-runner.md)
diff --git a/content/manuals/compose/how-tos/model-runner.md b/content/manuals/compose/how-tos/model-runner.md
index 2a7fca43ca83..d64886b61f49 100644
--- a/content/manuals/compose/how-tos/model-runner.md
+++ b/content/manuals/compose/how-tos/model-runner.md
@@ -40,15 +40,33 @@ services:
       type: model
       options:
         model: ai/smollm2
+        context-size: 1024
+        runtime-flags: "--no-prefill-assistant"
 ```
 
-Notice the dedicated `provider` attribute in the `ai_runner` service.
-This attribute specifies that the service is a model provider and lets you define options such as the name of the model to be used.
-
-There is also a `depends_on` attribute in the `chat` service.
-This attribute specifies that the `chat` service depends on the `ai_runner` service.
-This means that the `ai_runner` service will be started before the `chat` service to allow injection of model information to the `chat` service.
-
+Notice the following:
+
+In the `ai_runner` service:
+
+- `provider.type`: Specifies that the service is a `model` provider.
+- `provider.options`: Specifies the options of the model:
+  - We want to use the `ai/smollm2` model.
+  - We set the context size to `1024` tokens.
+
+    > [!NOTE]
+    > Each model has its own maximum context size. When increasing the context length,
+    > consider your hardware constraints. In general, try to use the smallest context size
+    > possible for your use case.
+
+  - We pass the `--no-prefill-assistant` parameter to the llama.cpp server,
+    see [the available parameters](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md).
+
+In the `chat` service:
+
+- `depends_on` specifies that the `chat` service depends on the `ai_runner` service.
+  The `ai_runner` service is started before the `chat` service so that model information
+  can be injected into the `chat` service.
+
 ## How it works
 
 During the `docker compose up` process, Docker Model Runner automatically pulls and runs the specified model.
@@ -61,6 +79,6 @@ In the example above, the `chat` service receives 2 environment variables prefix
 
 This lets the `chat` service to interact with the model and use it for its own purposes.
 
-## Reference
+## Related pages
 
 - [Docker Model Runner documentation](/manuals/ai/model-runner.md)
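
The hunk above shows only the tail of the Compose example on the page being edited. For reference, a complete `compose.yaml` using the new options might look like the sketch below; the `chat` service's `build: .` setup and the overall file layout are assumptions rather than part of this diff.

```yaml
services:
  chat:
    build: .                # assumption: the chat app is built from a local Dockerfile
    depends_on:
      - ai_runner           # start the model provider before the chat app

  ai_runner:
    provider:
      type: model           # marks this service as a model provider
      options:
        model: ai/smollm2
        context-size: 1024                        # in tokens; keep as small as your use case allows
        runtime-flags: "--no-prefill-assistant"   # passed through to the llama.cpp server
```

With a file like this, `docker compose up` pulls and runs `ai/smollm2` and injects the model's connection details into the `chat` service as environment variables, as described in the "How it works" section of the page.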