Commit fe9b065

Merge pull request #430 from Paraphraser/20211015-telegraf-influx-container-master
20211015 Telegraf - defaults - master branch - PR 1 of 3
2 parents 651a30b + ad3bcd4 commit fe9b065

5 files changed, +202 -46 lines changed

.templates/telegraf/Dockerfile

Lines changed: 17 additions & 3 deletions
@@ -7,9 +7,23 @@ RUN apt update && apt install -y rsync
 # where IOTstack template files are stored
 ENV IOTSTACK_DEFAULTS_DIR="iotstack_defaults"

-# make a copy of the default config file
-RUN mkdir -p /${IOTSTACK_DEFAULTS_DIR} && \
-cp /etc/telegraf/telegraf.conf /${IOTSTACK_DEFAULTS_DIR}/
+# copy template files to image
+COPY ${IOTSTACK_DEFAULTS_DIR} /${IOTSTACK_DEFAULTS_DIR}
+
+# 1. copy the default configuration file that ships with the image as
+# a baseline reference for the user, and make it read-only.
+# 2. strip comment lines and blank lines from the baseline reference to
+# use as the starting point for the IOTstack default configuration.
+# 3. edit the IOTstack default configuration to insert an appropriate
+# URL for influxdb running in another container in the same stack.
+ENV BASELINE_CONFIG=/${IOTSTACK_DEFAULTS_DIR}/telegraf-reference.conf
+ENV IOTSTACK_CONFIG=/${IOTSTACK_DEFAULTS_DIR}/telegraf.conf
+RUN cp /etc/telegraf/telegraf.conf ${BASELINE_CONFIG} && \
+chmod 444 ${BASELINE_CONFIG} && \
+grep -v -e "^[ ]*#" -e "^[ ]*$" ${BASELINE_CONFIG} >${IOTSTACK_CONFIG} && \
+sed -i '/^\[\[outputs.influxdb\]\]/a\ \ urls = ["http://influxdb:8086"]' ${IOTSTACK_CONFIG}
+ENV BASELINE_CONFIG=
+ENV IOTSTACK_CONFIG=

 # replace the docker entry-point script with a self-repairing version
 ENV IOTSTACK_ENTRY_POINT="entrypoint.sh"
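
To see what steps 1 to 3 of the new `RUN` instruction do without building the image, the following minimal sketch replays the same `cp`, `chmod`, `grep` and `sed` commands against a tiny stand-in configuration in a temporary directory. The sample file contents and the `WORK` directory are assumptions made for illustration (the real default file is far longer), and the one-line `a` command relies on GNU sed.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area standing in for /iotstack_defaults in the image.
WORK=$(mktemp -d)
BASELINE_CONFIG="$WORK/telegraf-reference.conf"
IOTSTACK_CONFIG="$WORK/telegraf.conf"

# Stand-in for /etc/telegraf/telegraf.conf (assumed sample, heavily shortened).
cat >"$WORK/shipped.conf" <<'EOF'
# Telegraf Configuration
[agent]
  interval = "10s"

# Configuration for sending metrics to InfluxDB
[[outputs.influxdb]]
  # urls = ["http://127.0.0.1:8086"]

[[inputs.cpu]]
  percpu = true
EOF

# 1. keep a read-only baseline reference
cp "$WORK/shipped.conf" "$BASELINE_CONFIG"
chmod 444 "$BASELINE_CONFIG"

# 2. strip comment lines and blank lines to get the working configuration
grep -v -e "^[ ]*#" -e "^[ ]*$" "$BASELINE_CONFIG" >"$IOTSTACK_CONFIG"

# 3. add the InfluxDB URL directly after the [[outputs.influxdb]] header (GNU sed)
sed -i '/^\[\[outputs.influxdb\]\]/a\ \ urls = ["http://influxdb:8086"]' "$IOTSTACK_CONFIG"

# show the result
cat "$IOTSTACK_CONFIG"
```

Only the active options survive the `grep`, which is why the generated `telegraf.conf` is so much shorter than the reference copy.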
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Read metrics about docker containers
+# Credit: @tablatronix
+[[inputs.docker]]
+endpoint = "unix:///var/run/docker.sock"
+gather_services = false
+container_names = []
+source_tag = false
+container_name_include = []
+container_name_exclude = []
+timeout = "5s"
+perdevice = false
+total = true
+docker_label_include = []
+docker_label_exclude = []
+tag_env = ["HEAP_SIZE"]
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+# Read metrics from MQTT topic(s)
+# Credit: https://github.com/gcgarner/IOTstack/blob/master/.templates/telegraf/telegraf.conf
+[[inputs.mqtt_consumer]]
+servers = ["tcp://mosquitto:1883"]
+topics = [
+"telegraf/host01/cpu",
+"telegraf/+/mem",
+"sensors/#",
+]
+data_format = "json"
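
The `topics` list mixes one literal topic with the two MQTT wildcards: `+` matches exactly one topic level and `#` matches all remaining levels. Mosquitto and Telegraf do this matching themselves; the sketch below is only an illustration of which published topics the three filters would pick up. The `mqtt_match` helper and the sample topic names are hypothetical, and edge cases such as empty topic levels are ignored.

```shell
#!/bin/sh

# mqtt_match FILTER TOPIC
# Returns 0 when TOPIC would be delivered to a subscription on FILTER.
# Hypothetical helper for illustration only; empty topic levels are not handled.
mqtt_match() {
    mm_filter=$1
    mm_topic=$2
    while [ -n "$mm_filter" ]; do
        mm_f=${mm_filter%%/*}                 # next level of the filter
        [ "$mm_f" = "#" ] && return 0         # "#" swallows all remaining levels
        [ -n "$mm_topic" ] || return 1        # topic has run out of levels
        mm_t=${mm_topic%%/*}                  # next level of the topic
        [ "$mm_f" = "+" ] || [ "$mm_f" = "$mm_t" ] || return 1
        # drop the level just compared, plus its separator if there is one
        case $mm_filter in */*) mm_filter=${mm_filter#*/} ;; *) mm_filter= ;; esac
        case $mm_topic  in */*) mm_topic=${mm_topic#*/}   ;; *) mm_topic=  ;; esac
    done
    [ -z "$mm_topic" ]                        # match only if the topic is used up too
}

# Try a few sample topics against the three filters from the addition.
for t in telegraf/host01/cpu telegraf/host02/cpu telegraf/host02/mem sensors/kitchen/temp; do
    for f in "telegraf/host01/cpu" "telegraf/+/mem" "sensors/#"; do
        if mqtt_match "$f" "$t"; then
            echo "$t is matched by $f"
        fi
    done
done
```

Under these rules `telegraf/host02/cpu` is the only sample topic that no filter picks up, because the first filter names `host01` literally.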

.templates/telegraf/service.yml

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ telegraf:
 - "8125:8125/udp"
 volumes:
 - ./volumes/telegraf/:/etc/telegraf
+- /var/run/docker.sock:/var/run/docker.sock:ro
 depends_on:
 - influxdb
 - mosquitto

docs/Containers/Telegraf.md

Lines changed: 159 additions & 43 deletions
@@ -2,7 +2,10 @@

 This document discusses an IOTstack-specific version of Telegraf built on top of [influxdata/influxdata-docker/telegraf](https://github.com/influxdata/influxdata-docker/tree/master/telegraf) using a *Dockerfile*.

-The purpose of the Dockerfile is to enable the container to perform self-repair if the `telegraf.conf ` configuration file disappears.
+The purpose of the Dockerfile is to:
+
+* tailor the default configuration to be IOTstack-ready; and
+* enable the container to perform self-repair if essential elements of the persistent storage area disappear.

 ## <a name="references"> References </a>

@@ -18,23 +21,35 @@ The purpose of the Dockerfile is to enable the container to perform self-repair
 │   └── telegraf
 │      ├── Dockerfile ❶
 │      ├── entrypoint.sh ❷
-│      └── service.yml ❸
+│ ├── iotstack_defaults
+│   │ └── additions ❸
+│      └── service.yml ❹
 ├── services
 │ └── telegraf
-│ └── service.yml
-├── docker-compose.yml
+│ └── service.yml
+├── docker-compose.yml
 └── volumes
-└── telegraf ❻
-   └── telegraf.conf ❼
+└── telegraf ❼
+├── additions ❽
+├── telegraf-reference.conf ➒
+└── telegraf.conf ➓
 ```

 1. The *Dockerfile* used to customise Telegraf for IOTstack.
 2. A replacement for the `telegraf` container script of the same name, extended to handle container self-repair.
-3. The *template service definition*.
-4. The *working service definition* (only relevant to old-menu, copied from ❸).
-5. The *Compose* file (includes ❸).
-6. The *persistent storage area* for the `telegraf` container.
-7. The configuration file. Will be created by default if not present in ❻ when the container starts but will not be overwritten if customised by you.
+3. The *additions folder*. See [Applying optional additions](#optionalAdditions).
+4. The *template service definition*.
+5. The *working service definition* (only relevant to old-menu, copied from ❹).
+6. The *Compose* file (includes ❹).
+7. The *persistent storage area* for the `telegraf` container.
+8. A working copy of the *additions folder* (copied from ❸). See [Applying optional additions](#optionalAdditions).
+9. The *reference configuration file*. See [Changing Telegraf's configuration](#editConfiguration).
+10. The *active configuration file*. A subset of ➒ altered to support communication with InfluxDB running in a container in the same IOTstack instance.
+
+Everything in the persistent storage area ❼:
+
+* will be replaced if it is not present when the container starts; but
+* will never be overwritten if altered by you.

 ## <a name="howTelegrafIOTstackGetsBuilt"> How Telegraf gets built for IOTstack </a>

@@ -88,16 +103,18 @@ The `FROM` statement tells the build process to pull down the ***base image*** f

 > It is a ***base*** image in the sense that it never actually runs as a container on your Raspberry Pi.

-The remaining instructions in the *Dockerfile* customise the *base image* to produce a ***local image***. The customisations are:
+The remaining instructions in the *Dockerfile* customise the ***base image*** to produce a ***local image***. The customisations are:

 1. Add the `rsync` package. This helps the container perform self-repair.
-2. Make a backup copy of the default `telegraf.conf`. The backup is used to re-create the working copy if that ever gets removed from the persistent storage area.
-3. Replace `entrypoint.sh` with a version which:
+2. Copy the *default configuration file* that comes with the DockerHub image (so it will be available as a fully-commented reference for the user) and make it read-only.
+3. Make a *working version* of the *default configuration file* from which comment lines and blank lines have been removed.
+4. Patch the *working version* to support communications with InfluxDB running in another container in the same IOTstack instance.
+5. Replace `entrypoint.sh` with a version which:

 * calls `rsync` to perform self-repair if `telegraf.conf` goes missing; and
 * enforces root:root ownership in `~/IOTstack/volumes/telegraf`.

-The *local image* is instantiated to become your running container.
+The ***local image*** is instantiated to become your running container.

 When you run the `docker images` command after Telegraf has been built, you will see two rows for Telegraf:

@@ -108,10 +125,10 @@ iotstack_telegraf latest 59861b7fe9ed 2 hours ago 292MB
 telegraf latest a721ac170fad 3 days ago 273MB
 ```

-* `telegraf ` is the *base image*; and
-* `iotstack_telegraf ` is the *local image*.
+* `telegraf ` is the ***base image***; and
+* `iotstack_telegraf ` is the ***local image***.

-You will see the same pattern in *Portainer*, which reports the *base image* as "unused". You should not remove the *base* image, even though it appears to be unused.
+You will see the same pattern in *Portainer*, which reports the ***base image*** as "unused". You should not remove the ***base*** image, even though it appears to be unused.

 ### <a name="migration"> Migration considerations </a>

@@ -129,39 +146,138 @@ Under this implementation of Telegraf, the configuration file has moved to:

 > The change of location is one of the things that allows self-repair to work properly.

-The default version the configuration file supplied with earlier versions of IOTstack only contained 237 lines. At the time of writing (August 2021), the default version supplied with the Telegraf image downloaded from *DockerHub* contains 8641 lines.
+With one exception, all prior and current versions of the default configuration file are identical in terms of their semantics.

-> That is a **significant** difference. It is not clear why the version supplied with the original [gcgarner/IOTstack](https://github.com/gcgarner/IOTstack/blob/master/.templates/telegraf/telegraf.conf) was so short. Nevertheless, that file was inherited by [SensorsIot/IOTstack](https://github.com/SensorsIot/IOTstack/blob/master/.templates/telegraf/telegraf.conf) and has never been changed.
+> In other words, once you strip away comments and blank lines, and remove any "active" configuration options that simply repeat their default setting, you get the same subset of "active" configuration options. The default configuration file supplied with gcgarner/IOTstack is available [here](https://github.com/gcgarner/IOTstack/blob/master/.templates/telegraf/telegraf.conf) if you wish to refer to it.

-If you did not need to alter the 237-line file when you were running the original IOTstack implementation of Telegraf, it is *readonably* likely that the 8641-line default will also work, and that there will be no change in Telegraf's behaviour when it is built from a *Dockerfile*.
+The exception is `[[inputs.mqtt_consumer]]` which is now provided as an optional addition. If your existing Telegraf configuration depends on that input, you will need to apply it. See [applying optional additions](#optionalAdditions).

-However, if you experience problems then you have two choices:
+## <a name="logging"> Logging </a>

-1. Use your old `telegraf.conf`:
+You can inspect Telegraf's log by:

-```bash
-$ cd ~/IOTstack
-$ docker-compose rm --force --stop -v telegraf
-$ sudo cp ./services/telegraf/telegraf.conf ./volumes/telegraf/telegraf.conf
-$ docker-compose up -d telegraf
-```
-
-2. Work out which options you need to change in the 8641-line version. You can use your favourite Unix text editor. To cause Telegraf to notice your changes:
+```
+$ docker logs telegraf
+```
+
+These logs are ephemeral and will disappear when your Telegraf container is rebuilt.
+
+### <a name="logTelegrafDB"> log message: *database "telegraf" creation failed* </a>
+
+The following log message can be misleading:
+
+```
+W! [outputs.influxdb] When writing to [http://influxdb:8086]: database "telegraf" creation failed: Post "http://influxdb:8086/query": dial tcp 172.30.0.9:8086: connect: connection refused
+```
+
+If InfluxDB is not running when Telegraf starts, the `depends_on:` clause in Telegraf's service definition tells Docker to start InfluxDB (and Mosquitto) before starting Telegraf. Although it can launch the InfluxDB *container* first, Docker has no way of knowing when the `influxd` *process* running inside the InfluxDB container will start listening to port 8086.
+
+What this error message *usually* means is that Telegraf has tried to communicate with InfluxDB before the latter is ready to accept connections. Telegraf typically retries after a short delay and is then able to communicate with InfluxDB.
+
+## <a name="editConfiguration"> Changing Telegraf's configuration </a>
+
+The first time you launch the Telegraf container, the following structure will be created in the persistent storage area:
+
+```
+~/IOTstack/volumes/telegraf
+├── [drwxr-xr-x root ] additions
+│   ├── [-rw-r--r-- root ] inputs.docker.conf
+│   └── [-rw-r--r-- root ] inputs.mqtt_consumer.conf
+├── [-rw-r--r-- root ] telegraf.conf
+└── [-r--r--r-- root ] telegraf-reference.conf
+```
+
+The file:
+
+* `telegraf-reference.conf`:
+
+- is a *reference* copy of the default configuration file that ships with the ***base image*** for Telegraf when it is downloaded from DockerHub. It is nearly 9000 lines long and is mostly comments.
+- is **not** used by Telegraf but will be replaced if you delete it.
+- is marked "read-only" (even for root) as a reminder that it is only for your reference. Any changes you make will be ignored.
+
+* `telegraf.conf`:
+
+- is created by removing all comment lines and blank lines from `telegraf-reference.conf`, leaving only the "active" configuration options, and then adding options necessary for IOTstack.
+- is less than 30 lines and is significantly easier to understand than `telegraf-reference.conf`.
+
+* `inputs.docker.conf` – see [Applying optional additions](#optionalAdditions) below.
+
+The intention of this structure is that you:
+
+1. search `telegraf-reference.conf` to find the configuration option you need;
+2. read the comments to understand what the option does and how to use it; and then
+3. import the option into the correct section of `telegraf.conf`.
+
+When you make a change to `telegraf.conf`, you activate it by restarting the container:
+
+```
+$ cd ~/IOTstack
+$ docker-compose restart telegraf
+```
+
+### <a name="optionalAdditions"> Applying optional additions </a>
+
+The *additions folder* (see [Significant directories and files](#significantFiles)) is a mechanism for additional *IOTstack-ready* configuration options to be provided for Telegraf.
+
+At the time of writing (October 2021), two additions are provided:
+
+1. `inputs.docker.conf` provided by @tablatronix, which instructs Telegraf to collect metrics from Docker.
+2. `inputs.mqtt_consumer.conf` which formed part of the [gcgarner/IOTstack telegraf configuration](https://github.com/gcgarner/IOTstack/blob/master/.templates/telegraf/telegraf.conf) and instructs Telegraf to subscribe to a metric feed from the Mosquitto broker. This assumes, of course, that something is publishing those metrics.
+
+Using `inputs.docker.conf` as the example, applying that addition to your Telegraf configuration file involves:
+
+```
+$ cd ~/IOTstack/volumes/telegraf
+$ grep -v "^#" additions/inputs.docker.conf | sudo tee -a telegraf.conf >/dev/null
+$ cd ~/IOTstack
+$ docker-compose restart telegraf
+```
+
+The `grep` strips comment lines and the `sudo tee` is a safe way of appending the result to `telegraf.conf`. The `restart` causes Telegraf to notice the change.
+
+## <a name="cleanSlate"> Getting a clean slate </a>
+
+### <a name="resetDB"> Erasing the persistent storage area </a>
+
+Erasing Telegraf's persistent storage area triggers self-healing and restores known defaults:
+
+```
+$ cd ~/IOTstack
+$ docker-compose rm --force --stop -v telegraf
+$ sudo rm -rf ./volumes/telegraf
+$ docker-compose up -d telegraf
+```
+
+Note:
+
+* You can also remove individual files within the persistent storage area and then trigger self-healing. For example, if you decide to edit `telegraf-reference.conf` and make a mess, you can restore the original version like this:

 ```
 $ cd ~/IOTstack
+$ sudo rm ./volumes/telegraf/telegraf-reference.conf
 $ docker-compose restart telegraf
 ```

-## <a name="logging"> Logging </a>
+### <a name="resetDB"> Resetting the InfluxDB database </a>

-You can inspect Telegraf's log by:
+To reset the InfluxDB database that Telegraf writes into, proceed like this:

 ```
-$ docker logs telegraf
+$ cd ~/IOTstack
+$ docker-compose rm --force --stop -v telegraf
+$ docker exec -it influxdb influx -precision=rfc3339
+> drop database telegraf
+> exit
+$ docker-compose up -d telegraf
 ```

-These logs are ephemeral and will disappear when your Telegraf container is rebuilt.
+In words:
+
+* Be in the right directory.
+* Stop the Telegraf container (while leaving the InfluxDB container running).
+* Launch the Influx CLI inside the InfluxDB container.
+* Delete the `telegraf` database, and then exit the CLI.
+* Start the Telegraf container. This re-creates the database automatically.

 ## <a name="upgradingTelegraf"> Upgrading Telegraf </a>
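
One thing the `grep -v "^#" ... | sudo tee -a` recipe in the hunk above does not guard against is being run twice, which would append the same section to `telegraf.conf` a second time. The following is a minimal sketch of a guarded version, run against a scratch directory so that it needs neither `sudo` nor a running stack. The `apply_addition` helper, the `WORK` directory and the shortened sample files are assumptions for illustration; in the real persistent storage area the files are owned by root, so the append would still need `sudo tee -a` as in the documented command.

```shell
#!/bin/sh
set -e

# Hypothetical scratch copy of the persistent storage area (no sudo needed here).
WORK=$(mktemp -d)
mkdir -p "$WORK/additions"
printf '%s\n' '[agent]' '  interval = "10s"' >"$WORK/telegraf.conf"
cat >"$WORK/additions/inputs.docker.conf" <<'EOF'
# Read metrics about docker containers
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
EOF

# apply_addition FILE
# Append FILE (minus comment lines) to telegraf.conf unless the first
# [[section]] header in FILE is already present in telegraf.conf.
apply_addition() {
    header=$(sed -n '/^\[\[/{p;q;}' "$1")
    if grep -qxF -- "$header" "$WORK/telegraf.conf"; then
        echo "already applied: $header"
    else
        grep -v "^#" "$1" >>"$WORK/telegraf.conf"
        echo "applied: $header"
    fi
}

# The second call finds the header and leaves the file alone.
apply_addition "$WORK/additions/inputs.docker.conf"
apply_addition "$WORK/additions/inputs.docker.conf"

cat "$WORK/telegraf.conf"
```

The check compares whole lines, so it only recognises a header that appears in `telegraf.conf` exactly as it does in the addition file.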

@@ -180,7 +296,7 @@ In words:
 * `docker-compose up -d` causes any newly-downloaded images to be instantiated as containers (replacing the old containers); and
 * the `prune` gets rid of the outdated images.

-This strategy doesn't work when a *Dockerfile* is used to build a *local image* on top of a *base image* downloaded from [*DockerHub*](https://hub.docker.com). The *local image* is what is running so there is no way for the `pull` to sense when a newer version becomes available.
+This strategy doesn't work when a *Dockerfile* is used to build a ***local image*** on top of a ***base image*** downloaded from [*DockerHub*](https://hub.docker.com). The ***local image*** is what is running so there is no way for the `pull` to sense when a newer version becomes available.

 The only way to know when an update to Telegraf is available is to check the [Telegraf tags page](https://hub.docker.com/_/telegraf?tab=tags&page=1&ordering=last_updated) on *DockerHub*.

@@ -197,13 +313,13 @@ $ docker system prune
 Breaking it down into parts:

 * `build` causes the named container to be rebuilt;
-* `--no-cache` tells the *Dockerfile* process that it must not take any shortcuts. It really **must** rebuild the *local image*;
-* `--pull` tells the *Dockerfile* process to actually check with [*DockerHub*](https://hub.docker.com) to see if there is a later version of the *base image* and, if so, to download it before starting the build;
+* `--no-cache` tells the *Dockerfile* process that it must not take any shortcuts. It really **must** rebuild the ***local image***;
+* `--pull` tells the *Dockerfile* process to actually check with [*DockerHub*](https://hub.docker.com) to see if there is a later version of the ***base image*** and, if so, to download it before starting the build;
 * `telegraf` is the named container argument required by the `build` command.

-Your existing Telegraf container continues to run while the rebuild proceeds. Once the freshly-built *local image* is ready, the `up` tells `docker-compose` to do a new-for-old swap. There is barely any downtime for your service.
+Your existing Telegraf container continues to run while the rebuild proceeds. Once the freshly-built ***local image*** is ready, the `up` tells `docker-compose` to do a new-for-old swap. There is barely any downtime for your service.

-The `prune` is the simplest way of cleaning up. The first call removes the old *local image*. The second call cleans up the old *base image*.
+The `prune` is the simplest way of cleaning up. The first call removes the old ***local image***. The second call cleans up the old ***base image***.

 ### <a name="versionPinning"> Telegraf version pinning </a>

@@ -227,16 +343,16 @@ If you need to pin Telegraf to a particular version:
 FROM telegraf:1.19.3
 ```

-4. Save the file and tell `docker-compose` to rebuild the local image:
+4. Save the file and tell `docker-compose` to rebuild the ***local image***:

 ```bash
 $ cd ~/IOTstack
 $ docker-compose up -d --build telegraf
 $ docker system prune
 ```

-The new *local image* is built, then the new container is instantiated based on that image. The `prune` deletes the old *local image*.
+The new ***local image*** is built, then the new container is instantiated based on that image. The `prune` deletes the old ***local image***.

 Note:

-* As well as preventing Docker from updating the *base image*, pinning will also block incoming updates to the *Dockerfile* from a `git pull`. Nothing will change until you decide to remove the pin.
+* As well as preventing Docker from updating the ***base image***, pinning will also block incoming updates to the *Dockerfile* from a `git pull`. Nothing will change until you decide to remove the pin.
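
The pinning edit described in the last hunk can also be made non-interactively. The sketch below assumes the unpinned line reads `FROM telegraf:latest` (the original `FROM` line is not shown in this diff) and works on a stand-in *Dockerfile* in a temporary directory rather than the real one; the `WORK` and `PIN` names are illustrative only.

```shell
#!/bin/sh
set -e

# Hypothetical stand-in for the Telegraf Dockerfile; the first line is an assumption.
WORK=$(mktemp -d)
printf '%s\n' 'FROM telegraf:latest' \
              'RUN apt update && apt install -y rsync' >"$WORK/Dockerfile"

# Version tag taken from the example in the hunk above.
PIN="1.19.3"

# Rewrite only the FROM line. Writing to a new file and moving it into
# place avoids depending on the non-portable "sed -i".
sed "s/^FROM telegraf:.*/FROM telegraf:${PIN}/" "$WORK/Dockerfile" >"$WORK/Dockerfile.new"
mv "$WORK/Dockerfile.new" "$WORK/Dockerfile"

# Show the pinned line.
grep '^FROM' "$WORK/Dockerfile"
```

After an edit like this, the rebuild commands shown in step 4 apply unchanged.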
