
Commit a9521c8

Merge pull request #2023 from jasonrandrews/review
Docker Model Runner editorial review
2 parents 84f3ce0 + 08a2f21 commit a9521c8

File tree: 3 files changed (+59, -45 lines)

content/learning-paths/laptops-and-desktops/docker-models/_index.md

Lines changed: 6 additions & 9 deletions

````diff
@@ -1,22 +1,19 @@
 ---
-title: Learn how to use Docker Model Runner in AI applications
+title: Run AI models with Docker Model Runner
 
-draft: true
-cascade:
-  draft: true
 
 minutes_to_complete: 45
 
-who_is_this_for: This is for software developers and AI enthusiasts who want to run AI models using Docker Model Runner.
+who_is_this_for: This is for software developers and AI enthusiasts who want to run pre-trained AI models locally using Docker Model Runner.
 
 learning_objectives:
 - Run AI models locally using Docker Model Runner.
-- Easily build containerized applications with LLMs.
+- Build containerized applications that integrate Large Language Models (LLMs).
 
 prerequisites:
-- A computer with at least 16GB of RAM (recommended) and Docker Desktop installed (version 4.40 or later).
-- Basic understanding of Docker.
-- Familiarity with Large Language Model (LLM) concepts.
+- Docker Desktop (version 4.40 or later) installed on a system with at least 16GB of RAM (recommended).
+- Basic understanding of Docker CLI and concepts.
+- Familiarity with LLM concepts.
 
 author: Jason Andrews
 
````

content/learning-paths/laptops-and-desktops/docker-models/compose.md

Lines changed: 21 additions & 17 deletions

````diff
@@ -4,15 +4,13 @@ weight: 3
 layout: "learningpathall"
 ---
 
-Docker Compose makes it easy to run multi-container applications. Docker Compose can also include AI models in your project.
+Docker Compose makes it easy to run multi-container applications, and it can also include local AI inference services in your project.
 
-In this section, you'll learn how to use Docker Compose to deploy a web-based AI chat application that uses Docker Model Runner as the backend for AI inference.
+In this section, you'll use Docker Compose to deploy a simple web-based AI chat application. The frontend is a Flask web app, and the backend uses Docker Model Runner to serve AI responses.
 
 ## Clone the example project
 
-The example project, named [docker-model-runner-chat](https://github.com/jasonrandrews/docker-model-runner-chat) is available on GitHub. It provides a simple web interface to interact with local AI models such as Llama 3.2 or Gemma 3.
-
-First, clone the example repository:
+Clone the [docker-model-runner-chat](https://github.com/jasonrandrews/docker-model-runner-chat) repository from GitHub. This project provides a simple web interface to interact with local AI models such as Llama 3.2 or Gemma 3.
 
 ```console
 git clone https://github.com/jasonrandrews/docker-model-runner-chat.git
````
````diff
@@ -21,7 +19,7 @@ cd docker-model-runner-chat
 
 ## Review the Docker Compose file
 
-The `compose.yaml` file defines how the application is deployed using Docker Compose.
+The `compose.yaml` file defines how Docker Compose sets up and connects the services.
 
 It sets up two services:
 
````

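The contents of `compose.yaml` are elided from this diff view. As a rough, hypothetical sketch only (service names and options are illustrative; the actual file in the example repository is the reference), a two-service setup pairing a web frontend with a Docker Model Runner model provider can look like:

```yaml
services:
  web:                # Flask frontend of the chat app
    build: .
    ports:
      - "5000:5000"
    env_file:
      - vars.env      # supplies BASE_URL and MODEL to the app
  ai-runner:          # model provider backed by Docker Model Runner
    provider:
      type: model
      options:
        model: ai/gemma3
```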
````diff
@@ -60,21 +58,21 @@ From the project directory, start the app with:
 docker compose up --build
 ```
 
-Docker Compose will build the web app image and start both services.
+Docker Compose builds the web app image and starts both services.
 
 ## Access the chat interface
 
-Open your browser and copy and paste the local URL below:
+Once running, open your browser and go to the local URL below:
 
 ```console
 http://localhost:5000
 ```
 
-You can now chat with the AI model using the web interface. Enter your prompt and view the response in real time.
+You'll see a simple chat UI. Enter a prompt and get real-time responses from the AI model.
 
-![Compose #center](compose-app.png)
+![Compose #center](compose-app.png "Docker Model Chat")
 
-## Configuration
+## Configure the model
 
 You can change the AI model or endpoint by editing the `vars.env` file before starting the containers. The file contains environment variables used by the web application:
 
````
````diff
@@ -88,15 +86,20 @@ BASE_URL=http://model-runner.docker.internal/engines/v1/
 MODEL=ai/gemma3
 ```
 
-To use a different model, change the `MODEL` value. For example:
+To use a different model or API endpoint, change the `MODEL` value. For example:
 
 ```console
 MODEL=ai/llama3.2
 ```
 
-Make sure to change the model in the `compose.yaml` file also.
+Be sure to also update the model name in the `compose.yaml` file under the `ai-runner` service.
+
+## Optional: customize generation parameters
+
+You can edit `app.py` to adjust parameters such as:
 
-You can also change the `temperature` and `max_tokens` values in `app.py` to further customize the application.
+* `temperature`: controls randomness (higher is more creative)
+* `max_tokens`: controls the length of responses
 
 ## Stop the application
 
````

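The `vars.env` values above are consumed by the web application. As a minimal sketch of how that wiring works (this is not the repository's actual `app.py`; the helper name `build_chat_request` and the default parameter values are illustrative), a client can build an OpenAI-style chat request from `BASE_URL` and `MODEL` using only the Python standard library:

```python
import json
import os
import urllib.request


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request from vars.env-style variables."""
    # Defaults mirror the values shown in vars.env above.
    base_url = os.environ.get("BASE_URL", "http://model-runner.docker.internal/engines/v1/")
    model = os.environ.get("MODEL", "ai/gemma3")
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,  # higher is more creative (illustrative value)
        "max_tokens": 256,   # limits response length (illustrative value)
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # The internal BASE_URL only resolves from a container on the compose network.
    req = build_chat_request("Hello!")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The network call at the bottom only succeeds from inside the compose network; the request-building part runs anywhere.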
````diff
@@ -112,12 +115,13 @@ docker compose down
 
 Use the steps below if you have any issues running the application:
 
-- Ensure Docker and Docker Compose are installed and running
-- Make sure port 5000 is not in use by another application
-- Check logs with:
+* Ensure Docker and Docker Compose are installed and running
+* Make sure port 5000 is not in use by another application
+* Check logs with:
 
 ```console
 docker compose logs
 ```
 
+## What you've learned
 In this section, you learned how to use Docker Compose to run a containerized AI chat application with a web interface and local model inference from Docker Model Runner.
````

content/learning-paths/laptops-and-desktops/docker-models/models.md

Lines changed: 32 additions & 19 deletions

````diff
@@ -4,11 +4,13 @@ weight: 2
 layout: "learningpathall"
 ---
 
-Docker Model Runner is an official Docker extension that allows you to run Large Language Models (LLMs) on your local computer. It provides a convenient way to deploy and use AI models across different environments, including Arm-based systems, without complex setup or cloud dependencies.
+## Simplified local LLM inference
+
+Docker Model Runner is an official Docker extension that allows you to run Large Language Models (LLMs) directly on your local computer. It provides a convenient way to deploy and use AI models across different environments, including Arm-based systems, without complex framework setup or cloud dependencies.
 
 Docker uses [llama.cpp](https://github.com/ggml-org/llama.cpp), an open source C/C++ project developed by Georgi Gerganov that enables efficient LLM inference on a variety of hardware, but you do not need to download, build, or install any LLM frameworks.
 
-Docker Model Runner provides a easy to use CLI that is familiar to Docker users.
+Docker Model Runner provides an easy-to-use CLI that is familiar to Docker users.
 
 ## Before you begin
 
````
````diff
@@ -18,21 +20,21 @@ Verify Docker is running with:
 docker version
 ```
 
-You should see output showing your Docker version.
+You should see your Docker version shown in the output.
 
-Confirm the Docker Desktop version is 4.40 or above, for example:
+Confirm that Docker Desktop is version 4.40 or above, for example:
 
 ```output
 Server: Docker Desktop 4.41.2 (191736)
 ```
 
-Make sure the Docker Model Runner is enabled.
+Make sure Docker Model Runner is enabled:
 
 ```console
 docker model --help
 ```
 
-You should see the usage message:
+You should see this output:
 
 ```output
 Usage: docker model COMMAND
````
````diff
@@ -52,27 +54,28 @@ Commands:
   version     Show the Docker Model Runner version
 ```
 
-If Docker Model Runner is not enabled, enable it using the [Docker Model Runner documentation](https://docs.docker.com/model-runner/).
+If Docker Model Runner is not enabled, enable it by following the [Docker Model Runner documentation](https://docs.docker.com/model-runner/).
 
-You should also see the Models icon in your Docker Desktop sidebar.
+You should also see the **Models** tab appear in your Docker Desktop sidebar.
 
-![Models #center](models-tab.png)
+![Models #center](models-tab.png "Docker Models UI")
 
-## Running your first AI model with Docker Model Runner
+## Run your first AI model with Docker Model Runner
 
 Docker Model Runner is an extension for Docker Desktop that simplifies running AI models locally.
 
 Docker Model Runner automatically selects compatible model versions and optimizes performance for the Arm architecture.
 
-You can try Docker Model Runner by using an LLM from Docker Hub.
+You can try Model Runner by downloading and running a model from Docker Hub.
 
-The example below uses the [SmolLM2 model](https://hub.docker.com/r/ai/smollm2), a compact language model with 360 million parameters, designed to run efficiently on-device while performing a wide range of language tasks. You can explore additional [models in Docker Hub](https://hub.docker.com/u/ai).
+The example below uses the [SmolLM2 model](https://hub.docker.com/r/ai/smollm2), a compact LLM with ~360 million parameters, designed for efficient on-device inference while performing a wide range of language tasks. You can explore further models in [Docker Hub](https://hub.docker.com/u/ai).
 
-Download the model using:
+1. Download the model
 
 ```console
 docker model pull ai/smollm2
 ```
+2. Run the model interactively
 
 For a simple chat interface, run the model:
 
````
````diff
@@ -96,10 +99,9 @@ int main() {
   return 0;
 }
 ```
+To exit the chat, use the `/bye` command.
 
-You can ask more questions and continue to chat.
-
-To exit the chat use the `/bye` command.
+3. View downloaded models
 
 You can print the list of models on your computer using:
 
````
````diff
@@ -119,7 +121,9 @@ ai/llama3.2    3.21 B    IQ2_XXS/Q4_K_M    llama    436bb282b419    2 months ag
 
 ## Use the OpenAI endpoint to call the model
 
-From your host computer you can access the model using the OpenAI endpoint and a TCP port.
+Docker Model Runner exposes a REST endpoint compatible with OpenAI's API spec.
+
+From your host computer, you can access the model using the OpenAI endpoint and a TCP port.
 
 First, enable the TCP port to connect with the model:
 
````
````diff
@@ -155,7 +159,7 @@ Run the shell script:
 bash ./curl-test.sh | jq
 ```
 
-If you don't have `jq` installed, you eliminate piping the output.
+If you don't have `jq` installed, you can omit piping the output through it.
 
 The output, including the performance information, is shown below:
 
````

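The `curl-test.sh` request can also be made from Python with only the standard library. This is a hedged sketch, not the repository's script: the port `12434` is assumed here to be the Docker Model Runner default for host TCP access (use whichever port you enabled), and `build_payload`/`extract_reply` are illustrative helper names:

```python
import json
import urllib.request

# Assumed default TCP port for Docker Model Runner host access; adjust as needed.
URL = "http://localhost:12434/engines/v1/chat/completions"


def build_payload(model: str, prompt: str) -> bytes:
    """Encode an OpenAI-style chat completion request body."""
    return json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")


def extract_reply(response_body: dict) -> str:
    """Pull the assistant's message text out of a chat completion response."""
    return response_body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    req = urllib.request.Request(
        URL,
        data=build_payload("ai/smollm2", "Give me a fact about the Arm architecture."),
        headers={"Content-Type": "application/json"},
    )
    # Requires the TCP endpoint enabled and the model pulled.
    with urllib.request.urlopen(req) as resp:
        print(extract_reply(json.load(resp)))
```

The response body has the same shape as the `curl` output shown below, so `extract_reply` simply indexes into `choices[0].message.content`.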
````diff
@@ -193,5 +197,14 @@ The output, including the performance information, is shown below:
   }
 }
 ```
+You now have a fully functioning OpenAI-compatible inference endpoint running locally.
+
+## What you've learned
+
+In this section, you learned:
+
+* How to verify and use Docker Model Runner on Docker Desktop
+* How to run a model interactively from the CLI
+* How to connect to a model using a local OpenAI-compatible API
 
-In this section you learned how to run AI models using Docker Model Runner. Continue to see how to use Docker Compose to build an application with a built-in AI model.
+In the next section, you'll use Docker Compose to deploy a web-based AI chat interface powered by Docker Model Runner.
````
