content/learning-paths/laptops-and-desktops/docker-models/compose.md (+21 -17)
@@ -4,15 +4,13 @@ weight: 3
 layout: "learningpathall"
 ---
 
-Docker Compose makes it easy to run multi-container applications. Docker Compose can also include AI models in your project.
+Docker Compose makes it easy to run multi-container applications, and those applications can also include local AI inference services.
 
-In this section, you'll learn how to use Docker Compose to deploy a web-based AI chat application that uses Docker Model Runner as the backend for AI inference.
+In this section, you'll use Docker Compose to deploy a simple web-based AI chat application. The frontend is a Flask web app, and the backend uses Docker Model Runner to serve AI responses.
 
 ## Clone the example project
 
-The example project, named [docker-model-runner-chat](https://github.com/jasonrandrews/docker-model-runner-chat) is available on GitHub. It provides a simple web interface to interact with local AI models such as Llama 3.2 or Gemma 3.
-
-First, clone the example repository:
+Clone the [docker-model-runner-chat](https://github.com/jasonrandrews/docker-model-runner-chat) repository from GitHub. This project provides a simple web interface to interact with local AI models such as Llama 3.2 or Gemma 3.
 
-The `compose.yaml` file defines how the application is deployed using Docker Compose.
+The `compose.yaml` file defines how Docker Compose sets up and connects the services.
 
 It sets up two services:
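The two service definitions themselves are collapsed in this view. As a rough sketch, a `compose.yaml` for this kind of app might look like the following; the service names `web` and `ai-runner` come from this page, while the build context, port mapping, and `provider` options are assumptions, so the repository's actual file may differ:

```yaml
# Sketch only; the real compose.yaml in the example repository may differ.
services:
  web:
    build: .                # Flask chat frontend
    ports:
      - "5000:5000"         # assumed; matches http://localhost:5000 used below
    env_file:
      - vars.env            # MODEL and endpoint settings read by the web app
    depends_on:
      - ai-runner

  ai-runner:
    provider:
      type: model           # Docker Compose's Model Runner service provider
      options:
        model: ai/llama3.2  # assumed default; keep in sync with vars.env
```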
@@ -60,21 +58,21 @@ From the project directory, start the app with:
 docker compose up --build
 ```
 
-Docker Compose will build the web app image and start both services.
+Docker Compose builds the web app image and starts both services.
 
 ## Access the chat interface
 
-Open your browser and copy and paste the local URL below:
+Once the app is running, open your browser and go to the local URL below:
 
 ```console
 http://localhost:5000
 ```
 
-You can now chat with the AI model using the web interface. Enter your prompt and view the response in real time.
+You'll see a simple chat UI. Enter a prompt and get real-time responses from the AI model.
 
-
+
 
-## Configuration
+## Configure the model
 
 You can change the AI model or endpoint by editing the `vars.env` file before starting the containers. The file contains environment variables used by the web application:
-To use a different model, change the `MODEL` value. For example:
+To use a different model or API endpoint, change the `MODEL` value. For example:
 
 ```console
 MODEL=ai/llama3.2
 ```
 
-Make sure to change the model in the `compose.yaml` file also.
+Be sure to also update the model name in `compose.yaml` under the `ai-runner` service.
+
+## Optional: customize generation parameters
+
+You can edit `app.py` to adjust parameters such as the two below (see the sketch after this list):
 
-You can also change the `temperature` and `max_tokens` values in `app.py` to further customize the application.
+* `temperature`: controls randomness (higher is more creative)
+* `max_tokens`: controls the length of responses
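The diff doesn't show `app.py` itself, so the snippet below is only a sketch of how these two parameters are typically passed to an OpenAI-compatible endpoint. The client library, base URL, and model name are all assumptions; the real app may be structured differently:

```python
# Sketch only: shows temperature and max_tokens in a chat completion call
# against an OpenAI-compatible endpoint. The real app.py may differ.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:12434/engines/v1",  # assumed Model Runner endpoint
    api_key="unused",                              # local endpoint needs no key
)

response = client.chat.completions.create(
    model="ai/llama3.2",   # assumed; keep in sync with MODEL in vars.env
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,       # higher values produce more varied, creative output
    max_tokens=256,        # upper bound on the length of the generated reply
)
print(response.choices[0].message.content)
```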
 ## Stop the application
 
@@ -112,12 +115,13 @@ docker compose down
 
 Use the steps below if you have any issues running the application:
 
-- Ensure Docker and Docker Compose are installed and running
-- Make sure port 5000 is not in use by another application
-- Check logs with:
+* Ensure Docker and Docker Compose are installed and running
+* Make sure port 5000 is not in use by another application
+* Check logs with:
 
 ```console
 docker compose logs
 ```
 
+## What you've learned
 
 In this section, you learned how to use Docker Compose to run a containerized AI chat application with a web interface and local model inference from Docker Model Runner.
content/learning-paths/laptops-and-desktops/docker-models/models.md (+32 -19)
@@ -4,11 +4,13 @@ weight: 2
 layout: "learningpathall"
 ---
 
-Docker Model Runner is an official Docker extension that allows you to run Large Language Models (LLMs) on your local computer. It provides a convenient way to deploy and use AI models across different environments, including Arm-based systems, without complex setup or cloud dependencies.
+## Simplified local LLM inference
+
+Docker Model Runner is an official Docker extension that allows you to run Large Language Models (LLMs) directly on your local computer. It provides a convenient way to deploy and use AI models across different environments, including Arm-based systems, without complex framework setup or cloud dependencies.
 
 Docker uses [llama.cpp](https://github.com/ggml-org/llama.cpp), an open source C/C++ project developed by Georgi Gerganov that enables efficient LLM inference on a variety of hardware, but you do not need to download, build, or install any LLM frameworks.
 
-Docker Model Runner provides a easy to use CLI that is familiar to Docker users.
+Docker Model Runner provides an easy-to-use CLI that is familiar to Docker users.
 
 ## Before you begin
@@ -18,21 +20,21 @@ Verify Docker is running with:
 docker version
 ```
 
-You should see output showing your Docker version.
+You should see your Docker version shown in the output.
 
-Confirm the Docker Desktop version is 4.40 or above, for example:
+Confirm that Docker Desktop is version 4.40 or above, for example:
 
 ```output
 Server: Docker Desktop 4.41.2 (191736)
 ```
 
-Make sure the Docker Model Runner is enabled.
+Make sure the Docker Model Runner is enabled:
 
 ```console
 docker model --help
 ```
 
-You should see the usage message:
+You should see this output:
 
 ```output
 Usage: docker model COMMAND
@@ -52,27 +54,28 @@ Commands:
 version Show the Docker Model Runner version
 ```
 
-If Docker Model Runner is not enabled, enable it using the [Docker Model Runner documentation](https://docs.docker.com/model-runner/).
+If Docker Model Runner is not enabled, enable it by following the [Docker Model Runner documentation](https://docs.docker.com/model-runner/).
 
-You should also see the Models icon in your Docker Desktop sidebar.
+You should also see the **Models** tab and icon appear in your Docker Desktop sidebar.
 
-## Running your first AI model with Docker Model Runner
+## Run your first AI model with Docker Model Runner
 
 Docker Model Runner is an extension for Docker Desktop that simplifies running AI models locally.
 
 Docker Model Runner automatically selects compatible model versions and optimizes performance for the Arm architecture.
 
-You can try Docker Model Runner by using an LLM from Docker Hub.
+You can try Model Runner by downloading and running a model from Docker Hub.
 
-The example below uses the [SmolLM2 model](https://hub.docker.com/r/ai/smollm2), a compact language model with 360 million parameters, designed to run efficiently on-device while performing a wide range of language tasks. You can explore additional [models in Docker Hub](https://hub.docker.com/u/ai).
+The example below uses the [SmolLM2 model](https://hub.docker.com/r/ai/smollm2), a compact LLM with ~360 million parameters, designed for efficient on-device inference while performing a wide range of language tasks. You can explore further models in [Docker Hub](https://hub.docker.com/u/ai).
 
-Download the model using:
+1. Download the model:
 
 ```console
 docker model pull ai/smollm2
 ```
+
+2. Run the model interactively:
 
 For a simple chat interface, run the model:
@@ -96,10 +99,9 @@ int main() {
 return 0;
 }
 ```
+
+To exit the chat, use the `/bye` command.
 
-You can ask more questions and continue to chat.
-
-To exit the chat use the `/bye` command.
+3. View downloaded models:
 
 You can print the list of models on your computer using:
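The command is collapsed in this view; based on the `docker model --help` listing shown earlier, it is presumably:

```console
docker model list
```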
@@ -119,7 +121,9 @@ ai/llama3.2 3.21 B IQ2_XXS/Q4_K_M llama 436bb282b419 2 months ag
 
 ## Use the OpenAI endpoint to call the model
 
-From your host computer you can access the model using the OpenAI endpoint and a TCP port.
+Docker Model Runner exposes a REST endpoint compatible with OpenAI's API spec.
+
+From your host computer, you can access the model using the OpenAI endpoint and a TCP port.
 
 First, enable the TCP port to connect with the model:
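The enabling command is also collapsed here. One way to do this with a recent Docker Desktop is the following (treat the flag and port as assumptions and check the Model Runner documentation for your version):

```console
docker desktop enable model-runner --tcp 12434
```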
@@ -155,7 +159,7 @@ Run the shell script:
 bash ./curl-test.sh | jq
 ```
 
-If you don't have `jq` installed, you eliminate piping the output.
+If you don't have `jq` installed, you can omit piping the output.
 
 The output, including the performance information, is shown below:
 
@@ -193,5 +197,14 @@
 }
 }
 ```
+
+You now have a fully functioning OpenAI-compatible inference endpoint running locally.
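The contents of `curl-test.sh` are not shown in this diff. A request along these lines would exercise the endpoint; the port and path assume the default Model Runner TCP setup from the step above:

```console
curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2",
    "messages": [
      {"role": "user", "content": "Write a hello world program in C++"}
    ]
  }'
```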
+
+## What you've learned
+
+In this section, you learned:
+
+* How to verify and use Docker Model Runner on Docker Desktop
+* How to run a model interactively from the CLI
+* How to connect to a model using a local OpenAI-compatible API
 
-In this section you learned how to run AI models using Docker Model Runner. Continue to see how to use Docker Compose to build an application with a built-in AI model.
+In the next section, you'll use Docker Compose to deploy a web-based AI chat interface powered by Docker Model Runner.