This document discusses an IOTstack-specific version of Telegraf built on top of [influxdata/influxdata-docker/telegraf](https://github.com/influxdata/influxdata-docker/tree/master/telegraf) using a *Dockerfile*.
4
4
5
-
The purpose of the Dockerfile is to enable the container to perform self-repair if the `telegraf.conf` configuration file disappears.
5
+
The purpose of the Dockerfile is to:
6
+
7
+
* tailor the default configuration to be IOTstack-ready; and
8
+
* enable the container to perform self-repair if essential elements of the persistent storage area disappear.
6
9
7
10
## <a name="references"> References </a>
8
11
@@ -18,23 +21,35 @@ The purpose of the Dockerfile is to enable the container to perform self-repair
18
21
│ └── telegraf
19
22
│ ├── Dockerfile ❶
20
23
│ ├── entrypoint.sh ❷
21
-
│ └── service.yml ❸
24
+
│ ├── iotstack_defaults
25
+
│ │ └── additions ❸
26
+
│ └── service.yml ❹
22
27
├── services
23
28
│ └── telegraf
24
-
│ └── service.yml ❹
25
-
├── docker-compose.yml ❺
29
+
│ └── service.yml ❺
30
+
├── docker-compose.yml ❻
26
31
└── volumes
27
-
└── telegraf ❻
28
-
└── telegraf.conf ❼
32
+
└── telegraf ❼
33
+
├── additions ❽
34
+
├── telegraf-reference.conf ➒
35
+
└── telegraf.conf ➓
29
36
```
30
37
31
38
1. The *Dockerfile* used to customise Telegraf for IOTstack.
32
39
2. A replacement for the `telegraf` container script of the same name, extended to handle container self-repair.
33
-
3. The *template service definition*.
34
-
4. The *working service definition* (only relevant to old-menu, copied from ❸).
35
-
5. The *Compose* file (includes ❸).
36
-
6. The *persistent storage area* for the `telegraf` container.
37
-
7. The configuration file. Will be created by default if not present in ❻ when the container starts but will not be overwritten if customised by you.
40
+
3. The *additions folder*. See [Applying optional additions](#optionalAdditions).
41
+
4. The *template service definition*.
42
+
5. The *working service definition* (only relevant to old-menu, copied from ❹).
43
+
6. The *Compose* file (includes ❹).
44
+
7. The *persistent storage area* for the `telegraf` container.
45
+
8. A working copy of the *additions folder* (copied from ❸). See [Applying optional additions](#optionalAdditions).
46
+
9. The *reference configuration file*. See [Changing Telegraf's configuration](#editConfiguration).
47
+
10. The *active configuration file*. A subset of ➒ altered to support communication with InfluxDB running in a container in the same IOTstack instance.
48
+
49
+
Everything in the persistent storage area ❼:
50
+
51
+
* will be replaced if it is not present when the container starts; but
52
+
* will never be overwritten if altered by you.
38
53
39
54
## <a name="howTelegrafIOTstackGetsBuilt"> How Telegraf gets built for IOTstack </a>
40
55
@@ -88,16 +103,18 @@ The `FROM` statement tells the build process to pull down the ***base image*** f
88
103
89
104
> It is a ***base*** image in the sense that it never actually runs as a container on your Raspberry Pi.
90
105
91
-
The remaining instructions in the *Dockerfile* customise the *base image* to produce a ***local image***. The customisations are:
106
+
The remaining instructions in the *Dockerfile* customise the ***base image*** to produce a ***local image***. The customisations are:
92
107
93
108
1. Add the `rsync` package. This helps the container perform self-repair.
94
-
2. Make a backup copy of the default `telegraf.conf`. The backup is used to re-create the working copy if that ever gets removed from the persistent storage area.
95
-
3. Replace `entrypoint.sh` with a version which:
109
+
2. Copy the *default configuration file* that comes with the DockerHub image (so it will be available as a fully-commented reference for the user) and make it read-only.
110
+
3. Make a *working version* of the *default configuration file* from which comment lines and blank lines have been removed.
111
+
4. Patch the *working version* to support communications with InfluxDB running in another container in the same IOTstack instance.
112
+
5. Replace `entrypoint.sh` with a version which:
96
113
97
114
* calls `rsync` to perform self-repair if `telegraf.conf` goes missing; and
98
115
* enforces root:root ownership in `~/IOTstack/volumes/telegraf`.
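Stripping comment lines and blank lines from the default configuration file can be done with a simple filter. The following is a minimal sketch only: the actual *Dockerfile* commands may differ, and the scratch directory and sample file contents are invented for illustration.

```bash
# Scratch directory so nothing in ~/IOTstack is touched (hypothetical demo).
work="$(mktemp -d)"

# Stand-in for the fully-commented default file; contents are illustrative.
cat > "$work/telegraf-reference.conf" <<'EOF'
# Configuration for telegraf agent

[agent]
  ## Default data collection interval
  interval = "10s"

[[outputs.influxdb]]
EOF

# Drop comment lines (optionally indented) and blank lines,
# leaving only the "active" configuration options.
grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' \
  "$work/telegraf-reference.conf" > "$work/telegraf.conf"

cat "$work/telegraf.conf"
```

Only the `[agent]` header, its `interval` setting and the `[[outputs.influxdb]]` header survive the filter, which is why the working version is so much shorter than the reference.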
99
116
100
-
The *local image* is instantiated to become your running container.
117
+
The ***local image*** is instantiated to become your running container.
101
118
102
119
When you run the `docker images` command after Telegraf has been built, you will see two rows for Telegraf:
You will see the same pattern in *Portainer*, which reports the *base image* as "unused". You should not remove the *base* image, even though it appears to be unused.
131
+
You will see the same pattern in *Portainer*, which reports the ***base image*** as "unused". You should not remove the ***base*** image, even though it appears to be unused.
@@ -129,39 +146,138 @@ Under this implementation of Telegraf, the configuration file has moved to:
129
146
130
147
> The change of location is one of the things that allows self-repair to work properly.
131
148
132
-
The default version of the configuration file supplied with earlier versions of IOTstack only contained 237 lines. At the time of writing (August 2021), the default version supplied with the Telegraf image downloaded from *DockerHub* contains 8641 lines.
149
+
With one exception, all prior and current versions of the default configuration file are identical in terms of their semantics.
133
150
134
-
> That is a **significant** difference. It is not clear why the version supplied with the original [gcgarner/IOTstack](https://github.com/gcgarner/IOTstack/blob/master/.templates/telegraf/telegraf.conf) was so short. Nevertheless, that file was inherited by [SensorsIot/IOTstack](https://github.com/SensorsIot/IOTstack/blob/master/.templates/telegraf/telegraf.conf) and has never been changed.
151
+
> In other words, once you strip away comments and blank lines, and remove any "active" configuration options that simply repeat their default setting, you get the same subset of "active" configuration options. The default configuration file supplied with gcgarner/IOTstack is available [here](https://github.com/gcgarner/IOTstack/blob/master/.templates/telegraf/telegraf.conf) if you wish to refer to it.
135
152
136
-
If you did not need to alter the 237-line file when you were running the original IOTstack implementation of Telegraf, it is *reasonably* likely that the 8641-line default will also work, and that there will be no change in Telegraf's behaviour when it is built from a *Dockerfile*.
153
+
The exception is `[[inputs.mqtt_consumer]]` which is now provided as an optional addition. If your existing Telegraf configuration depends on that input, you will need to apply it. See [applying optional additions](#optionalAdditions).
137
154
138
-
However, if you experience problems then you have two choices:
2. Work out which options you need to change in the 8641-line version. You can use your favourite Unix text editor. To cause Telegraf to notice your changes:
159
+
```
160
+
$ docker logs telegraf
161
+
```
162
+
163
+
These logs are ephemeral and will disappear when your Telegraf container is rebuilt.
W! [outputs.influxdb] When writing to [http://influxdb:8086]: database "telegraf" creation failed: Post "http://influxdb:8086/query": dial tcp 172.30.0.9:8086: connect: connection refused
171
+
```
172
+
173
+
If InfluxDB is not running when Telegraf starts, the `depends_on:` clause in Telegraf's service definition tells Docker to start InfluxDB (and Mosquitto) before starting Telegraf. Although it can launch the InfluxDB *container* first, Docker has no way of knowing when the `influxd` *process* running inside the InfluxDB container will start listening on port 8086.
174
+
175
+
What this error message *usually* means is that Telegraf has tried to communicate with InfluxDB before the latter is ready to accept connections. Telegraf typically retries after a short delay and is then able to communicate with InfluxDB.
- is a *reference* copy of the default configuration file that ships with the ***base image*** for Telegraf when it is downloaded from DockerHub. It is nearly 9000 lines long and is mostly comments.
195
+
- is **not** used by Telegraf but will be replaced if you delete it.
196
+
- is marked "read-only" (even for root) as a reminder that it is only for your reference. Any changes you make will be ignored.
197
+
198
+
* `telegraf.conf`:
199
+
200
+
- is created by removing all comment lines and blank lines from `telegraf-reference.conf`, leaving only the "active" configuration options, and then adding options necessary for IOTstack.
201
+
- is fewer than 30 lines long and is significantly easier to understand than `telegraf-reference.conf`.
202
+
203
+
* `inputs.docker.conf` – see [Applying optional additions](#optionalAdditions) below.
204
+
205
+
The intention of this structure is that you:
206
+
207
+
1. search `telegraf-reference.conf` to find the configuration option you need;
208
+
2. read the comments to understand what the option does and how to use it; and then
209
+
3. import the option into the correct section of `telegraf.conf`.
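As a rough sketch of the search step, `grep` can locate an option together with the comments that follow it. The stand-in file below is invented for illustration. On a real system you would point `grep` at `~/IOTstack/volumes/telegraf/telegraf-reference.conf` instead.

```bash
# Stand-in reference file in a scratch directory (hypothetical contents).
ref="$(mktemp -d)/telegraf-reference.conf"
cat > "$ref" <<'EOF'
# Read metrics about cpu usage
# [[inputs.cpu]]
#   ## Whether to report per-cpu stats or not
#   percpu = true
EOF

# -n shows line numbers, -F treats the pattern as a literal string,
# -A 3 prints up to three lines of context after each match.
matches="$(grep -n -F -A 3 '[[inputs.cpu]]' "$ref")"
echo "$matches"
```

The line numbers make it easy to jump to the same place in a text editor and read the surrounding comments before copying the option into `telegraf.conf`.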
210
+
211
+
When you make a change to `telegraf.conf`, you activate it by restarting the container:
The *additions folder* (see [Significant directories and files](#significantFiles)) is a mechanism for additional *IOTstack-ready* configuration options to be provided for Telegraf.
221
+
222
+
At the time of writing (October 2021), two additions are provided:
223
+
224
+
1. `inputs.docker.conf` provided by @tablatronix, which instructs Telegraf to collect metrics from Docker.
225
+
2. `inputs.mqtt_consumer.conf` which formed part of the [gcgarner/IOTstack telegraf configuration](https://github.com/gcgarner/IOTstack/blob/master/.templates/telegraf/telegraf.conf) and instructs Telegraf to subscribe to a metric feed from the Mosquitto broker. This assumes, of course, that something is publishing those metrics.
226
+
227
+
Using `inputs.docker.conf` as the example, applying that addition to your Telegraf configuration file involves:
228
+
229
+
```
230
+
$ cd ~/IOTstack/volumes/telegraf
231
+
$ grep -v "^#" additions/inputs.docker.conf | sudo tee -a telegraf.conf >/dev/null
232
+
$ cd ~/IOTstack
233
+
$ docker-compose restart telegraf
234
+
```
235
+
236
+
The `grep` strips comment lines and the `sudo tee` is a safe way of appending the result to `telegraf.conf`. The `restart` causes Telegraf to notice the change.
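To see what that pipeline does without touching your live configuration, here is a small sketch that runs the same `grep` and `tee -a` against scratch files. The file contents are invented for illustration, and `sudo` is omitted because the scratch directory is owned by you.

```bash
# Scratch directory standing in for ~/IOTstack/volumes/telegraf (hypothetical).
demo="$(mktemp -d)"
mkdir "$demo/additions"

# Illustrative addition: one comment line plus two active lines.
cat > "$demo/additions/inputs.docker.conf" <<'EOF'
# Collect metrics from Docker
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
EOF

# Illustrative existing configuration.
cat > "$demo/telegraf.conf" <<'EOF'
[agent]
  interval = "10s"
EOF

# Same pipeline as above: strip comment lines, then append to telegraf.conf.
cd "$demo"
grep -v "^#" additions/inputs.docker.conf | tee -a telegraf.conf >/dev/null

cat telegraf.conf
```

The existing lines in `telegraf.conf` are preserved and only the non-comment lines of the addition are appended. Running the pipeline twice would append the addition twice, so check `telegraf.conf` before repeating it.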
237
+
238
+
## <a name="cleanSlate"> Getting a clean slate </a>
239
+
240
+
### <a name="resetDB"> Erasing the persistent storage area </a>
241
+
242
+
Erasing Telegraf's persistent storage area triggers self-healing and restores known defaults:
243
+
244
+
```
245
+
$ cd ~/IOTstack
246
+
$ docker-compose rm --force --stop -v telegraf
247
+
$ sudo rm -rf ./volumes/telegraf
248
+
$ docker-compose up -d telegraf
249
+
```
250
+
251
+
Note:
252
+
253
+
* You can also remove individual files within the persistent storage area and then trigger self-healing. For example, if you decide to edit `telegraf-reference.conf` and make a mess, you can restore the original version like this:
* `docker-compose up -d` causes any newly-downloaded images to be instantiated as containers (replacing the old containers); and
181
297
* the `prune` gets rid of the outdated images.
182
298
183
-
This strategy doesn't work when a *Dockerfile* is used to build a *local image* on top of a *base image* downloaded from [*DockerHub*](https://hub.docker.com). The *local image* is what is running so there is no way for the `pull` to sense when a newer version becomes available.
299
+
This strategy doesn't work when a *Dockerfile* is used to build a ***local image*** on top of a ***base image*** downloaded from [*DockerHub*](https://hub.docker.com). The ***local image*** is what is running so there is no way for the `pull` to sense when a newer version becomes available.
184
300
185
301
The only way to know when an update to Telegraf is available is to check the [Telegraf tags page](https://hub.docker.com/_/telegraf?tab=tags&page=1&ordering=last_updated) on *DockerHub*.
186
302
@@ -197,13 +313,13 @@ $ docker system prune
197
313
Breaking it down into parts:
198
314
199
315
* `build` causes the named container to be rebuilt;
200
-
* `--no-cache` tells the *Dockerfile* process that it must not take any shortcuts. It really **must** rebuild the *local image*;
201
-
* `--pull` tells the *Dockerfile* process to actually check with [*DockerHub*](https://hub.docker.com) to see if there is a later version of the *base image* and, if so, to download it before starting the build;
316
+
* `--no-cache` tells the *Dockerfile* process that it must not take any shortcuts. It really **must** rebuild the ***local image***;
317
+
* `--pull` tells the *Dockerfile* process to actually check with [*DockerHub*](https://hub.docker.com) to see if there is a later version of the ***base image*** and, if so, to download it before starting the build;
202
318
* `telegraf` is the named container argument required by the `build` command.
203
319
204
-
Your existing Telegraf container continues to run while the rebuild proceeds. Once the freshly-built *local image* is ready, the `up` tells `docker-compose` to do a new-for-old swap. There is barely any downtime for your service.
320
+
Your existing Telegraf container continues to run while the rebuild proceeds. Once the freshly-built ***local image*** is ready, the `up` tells `docker-compose` to do a new-for-old swap. There is barely any downtime for your service.
205
321
206
-
The `prune` is the simplest way of cleaning up. The first call removes the old *local image*. The second call cleans up the old *base image*.
322
+
The `prune` is the simplest way of cleaning up. The first call removes the old ***local image***. The second call cleans up the old ***base image***.
207
323
208
324
### <a name="versionPinning"> Telegraf version pinning </a>
209
325
@@ -227,16 +343,16 @@ If you need to pin Telegraf to a particular version:
227
343
FROM telegraf:1.19.3
228
344
```
229
345
230
-
4. Save the file and tell `docker-compose` to rebuild the local image:
346
+
4. Save the file and tell `docker-compose` to rebuild the ***local image***:
231
347
232
348
```bash
233
349
$ cd ~/IOTstack
234
350
$ docker-compose up -d --build telegraf
235
351
$ docker system prune
236
352
```
237
353
238
-
The new *local image* is built, then the new container is instantiated based on that image. The `prune` deletes the old *local image*.
354
+
The new ***local image*** is built, then the new container is instantiated based on that image. The `prune` deletes the old ***local image***.
239
355
240
356
Note:
241
357
242
-
* As well as preventing Docker from updating the *base image*, pinning will also block incoming updates to the *Dockerfile* from a `git pull`. Nothing will change until you decide to remove the pin.
358
+
* As well as preventing Docker from updating the ***base image***, pinning will also block incoming updates to the *Dockerfile* from a `git pull`. Nothing will change until you decide to remove the pin.