Commit 0f0e5ef

Adding stackdriver exporter
Author: Xuewei Zhang
1 parent 9e789b5, commit 0f0e5ef

17 files changed: +705 -23 lines changed

.travis.yml

Lines changed: 2 additions & 0 deletions
@@ -27,3 +27,5 @@ script:
   - BUILD_TAGS="disable_system_log_monitor" make test
   - make clean && BUILD_TAGS="disable_system_stats_monitor" make
   - BUILD_TAGS="disable_system_stats_monitor" make test
+  - make clean && BUILD_TAGS="disable_stackdriver_exporter" make
+  - BUILD_TAGS="disable_stackdriver_exporter" make test

Makefile

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ PKG:=k8s.io/node-problem-detector
 PKG_SOURCES:=$(shell find pkg cmd -name '*.go')
 
 # TARBALL is the name of release tar. Include binary version by default.
-TARBALL:=node-problem-detector-$(VERSION).tar.gz
+TARBALL?=node-problem-detector-$(VERSION).tar.gz
 
 # IMAGE is the image name of the node problem detector container image.
 IMAGE:=$(REGISTRY)/node-problem-detector:$(TAG)

README.md

Lines changed: 33 additions & 8 deletions
@@ -69,27 +69,44 @@ List of supported problem daemons:
 # Exporter
 
 An exporter is a component of node-problem-detector. It reports node problems and/or metrics to
-certain back end (e.g. Kubernetes API server, or Prometheus scrape endpoint).
+certain back end. Some of them can be disabled at compile time using a build tag. List of supported exporters:
+
+| Exporter | Description | Disabling Build Tag |
+|----------|:-----------|:--------------------|
+| Kubernetes exporter | Kubernetes exporter reports node problems to Kubernetes API server: temporary problems get reported as Events, and permanent problems get reported as Node Conditions. | |
+| Prometheus exporter | Prometheus exporter reports node problems and metrics locally as Prometheus metrics. | |
+| [Stackdriver exporter](https://github.com/kubernetes/node-problem-detector/blob/master/config/exporter/stackdriver-exporter.json) | Stackdriver exporter reports node problems and metrics to Stackdriver Monitoring API. | disable_stackdriver_exporter |
 
 # Usage
 
 ## Flags
 
 * `--version`: Print current version of node-problem-detector.
-* `--address`: The address to bind the node problem detector server.
-* `--port`: The port to bind the node problem detector server. Use 0 to disable.
+* `--hostname-override`: A customized node name used for node-problem-detector to update conditions and emit events. node-problem-detector gets node name first from `hostname-override`, then `NODE_NAME` environment variable and finally fall back to `os.Hostname`.
+
+#### For System Log Monitor
+
 * `--config.system-log-monitor`: List of paths to system log monitor configuration files, comma separated, e.g.
 [config/kernel-monitor.json](https://github.com/kubernetes/node-problem-detector/blob/master/config/kernel-monitor.json).
 Node problem detector will start a separate log monitor for each configuration. You can
 use different log monitors to monitor different system log.
-* `--config.custom-plugin-monitor`: List of paths to custom plugin monitor config files, comma separated, e.g.
-[config/custom-plugin-monitor.json](https://github.com/kubernetes/node-problem-detector/blob/master/config/custom-plugin-monitor.json).
-Node problem detector will start a separate custom plugin monitor for each configuration. You can
-use different custom plugin monitors to monitor different node problems.
+
+#### For System Stats Monitor
+
 * `--config.system-stats-monitor`: List of paths to system stats monitor config files, comma separated, e.g.
 [config/system-stats-monitor.json](https://github.com/kubernetes/node-problem-detector/blob/master/config/system-stats-monitor.json).
 Node problem detector will start a separate system stats monitor for each configuration. You can
 use different system stats monitors to monitor different problem-related system stats.
+
+#### For Custom Plugin Monitor
+
+* `--config.custom-plugin-monitor`: List of paths to custom plugin monitor config files, comma separated, e.g.
+[config/custom-plugin-monitor.json](https://github.com/kubernetes/node-problem-detector/blob/master/config/custom-plugin-monitor.json).
+Node problem detector will start a separate custom plugin monitor for each configuration. You can
+use different custom plugin monitors to monitor different node problems.
+
+#### For Kubernetes exporter
+
 * `--enable-k8s-exporter`: Enables reporting to Kubernetes API server, default to `true`.
 * `--apiserver-override`: A URI parameter used to customize how node-problem-detector
 connects the apiserver. This is ignored if `--enable-k8s-exporter` is `false`. The format is same as the
@@ -100,10 +117,18 @@ For example, to run without auth, use the following config:
 http://APISERVER_IP:APISERVER_PORT?inClusterConfig=false
 ```
 Refer [heapster docs](https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md#kubernetes) for a complete list of available options.
-* `--hostname-override`: A customized node name used for node-problem-detector to update conditions and emit events. node-problem-detector gets node name first from `hostname-override`, then `NODE_NAME` environment variable and finally fall back to `os.Hostname`.
+* `--address`: The address to bind the node problem detector server.
+* `--port`: The port to bind the node problem detector server. Use 0 to disable.
+
+#### For Prometheus exporter
+
 * `--prometheus-address`: The address to bind the Prometheus scrape endpoint, default to `127.0.0.1`.
 * `--prometheus-port`: The port to bind the Prometheus scrape endpoint, default to 20257. Use 0 to disable.
 
+#### For Stackdriver exporter
+
+* `--exporter.stackdriver`: Path to a Stackdriver exporter config file, e.g. [config/exporter/stackdriver-exporter.json](https://github.com/kubernetes/node-problem-detector/blob/master/config/exporter/stackdriver-exporter.json), default to empty string. Set to empty string to disable.
+
 ### Deprecated Flags
 
 * `--system-log-monitors`: List of paths to system log monitor config files, comma separated. This option is deprecated, replaced by `--config.system-log-monitor`, and will be removed. NPD will panic if both `--system-log-monitors` and `--config.system-log-monitor` are set.
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+/*
+Copyright 2019 The Kubernetes Authors All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+*/
+
+package exporterplugins
+
+// This file is necessary to make sure the exporterplugins package is non-empty
+// under any build tags.
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+// +build !disable_stackdriver_exporter
+
+/*
+Copyright 2019 The Kubernetes Authors All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+*/
+
+package exporterplugins
+
+import (
+	_ "k8s.io/node-problem-detector/pkg/exporters/stackdriver"
+)
+
+// The stackdriver plugin takes about 6MB in the NPD binary.

cmd/nodeproblemdetector/node_problem_detector.go

Lines changed: 14 additions & 5 deletions
@@ -22,8 +22,10 @@ import (
 	"github.com/golang/glog"
 	"github.com/spf13/pflag"
 
+	_ "k8s.io/node-problem-detector/cmd/nodeproblemdetector/exporterplugins"
 	_ "k8s.io/node-problem-detector/cmd/nodeproblemdetector/problemdaemonplugins"
 	"k8s.io/node-problem-detector/cmd/options"
+	"k8s.io/node-problem-detector/pkg/exporters"
 	"k8s.io/node-problem-detector/pkg/exporters/k8sexporter"
 	"k8s.io/node-problem-detector/pkg/exporters/prometheusexporter"
 	"k8s.io/node-problem-detector/pkg/problemdaemon"
@@ -54,21 +56,28 @@ func main() {
 	}
 
 	// Initialize exporters.
-	exporters := []types.Exporter{}
+	defaultExporters := []types.Exporter{}
 	if ke := k8sexporter.NewExporterOrDie(npdo); ke != nil {
-		exporters = append(exporters, ke)
+		defaultExporters = append(defaultExporters, ke)
 		glog.Info("K8s exporter started.")
 	}
 	if pe := prometheusexporter.NewExporterOrDie(npdo); pe != nil {
-		exporters = append(exporters, pe)
+		defaultExporters = append(defaultExporters, pe)
 		glog.Info("Prometheus exporter started.")
 	}
-	if len(exporters) == 0 {
+
+	plugableExporters := exporters.NewExporters()
+
+	npdExporters := []types.Exporter{}
+	npdExporters = append(npdExporters, defaultExporters...)
+	npdExporters = append(npdExporters, plugableExporters...)
+
+	if len(npdExporters) == 0 {
 		glog.Fatalf("No exporter is successfully setup")
 	}
 
 	// Initialize NPD core.
-	p := problemdetector.NewProblemDetector(problemDaemons, exporters)
+	p := problemdetector.NewProblemDetector(problemDaemons, npdExporters)
 	if err := p.Run(); err != nil {
 		glog.Fatalf("Problem detector failed with error: %v", err)
 	}

cmd/options/options.go

Lines changed: 6 additions & 0 deletions
@@ -26,6 +26,7 @@ import (
 
 	"github.com/spf13/pflag"
 
+	"k8s.io/node-problem-detector/pkg/exporters"
 	"k8s.io/node-problem-detector/pkg/problemdaemon"
 	"k8s.io/node-problem-detector/pkg/types"
 )
@@ -86,6 +87,7 @@ type NodeProblemDetectorOptions struct {
 
 func NewNodeProblemDetectorOptions() *NodeProblemDetectorOptions {
 	npdo := &NodeProblemDetectorOptions{MonitorConfigPaths: types.ProblemDaemonConfigPathMap{}}
+
 	for _, problemDaemonName := range problemdaemon.GetProblemDaemonNames() {
 		npdo.MonitorConfigPaths[problemDaemonName] = &[]string{}
 	}
@@ -118,6 +120,10 @@ func (npdo *NodeProblemDetectorOptions) AddFlags(fs *pflag.FlagSet) {
 	fs.StringVar(&npdo.PrometheusServerAddress, "prometheus-address",
 		"127.0.0.1", "The address to bind the Prometheus scrape endpoint.")
 
+	for _, exporterName := range exporters.GetExporterNames() {
+		exporterHandler := exporters.GetExporterHandlerOrDie(exporterName)
+		exporterHandler.Options.SetFlags(fs)
+	}
 	for _, problemDaemonName := range problemdaemon.GetProblemDaemonNames() {
 		fs.StringSliceVar(
 			npdo.MonitorConfigPaths[problemDaemonName],
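
The new loop in `AddFlags` lets each registered exporter add its own flags to the shared flag set. A minimal sketch of that idea follows. It uses the standard library `flag` package instead of `pflag` so that it runs without dependencies, and the `flagSetter` interface, the `stackdriverOptions` struct, and the `parseArgs` helper are all hypothetical names made up for illustration. The diff only shows that the handler's `Options` value has a `SetFlags` method; the flag name and its empty default come from the README change.

```go
package main

import (
	"flag"
	"fmt"
)

// flagSetter is a hypothetical stand-in for the Options value stored in each
// exporter handler. The diff only shows that it exposes SetFlags.
type flagSetter interface {
	SetFlags(fs *flag.FlagSet)
}

// stackdriverOptions is a hypothetical options struct used for illustration.
type stackdriverOptions struct {
	configPath string
}

// SetFlags registers the exporter's own flag on the shared flag set.
func (o *stackdriverOptions) SetFlags(fs *flag.FlagSet) {
	fs.StringVar(&o.configPath, "exporter.stackdriver", "",
		"Path to a Stackdriver exporter config file. Empty disables the exporter.")
}

// parseArgs lets every option set add its flags, then parses args once.
func parseArgs(args []string, opts ...flagSetter) error {
	fs := flag.NewFlagSet("node-problem-detector", flag.ContinueOnError)
	for _, o := range opts {
		o.SetFlags(fs)
	}
	return fs.Parse(args)
}

func main() {
	opts := &stackdriverOptions{}
	args := []string{"--exporter.stackdriver=/etc/npd/stackdriver-exporter.json"}
	if err := parseArgs(args, opts); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("stackdriver config path:", opts.configPath)
}
```

The point of the design is that the core options code never needs to know which exporters were compiled in: a build with `disable_stackdriver_exporter` simply has no handler registered, so no flag is added.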
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+{
+  "apiEndpoint": "monitoring.googleapis.com:443",
+  "exportPeriod": "60s",
+  "metadataFetchTimeout": "600s",
+  "metadataFetchInterval": "10s",
+  "panicOnMetadataFetchFailure": false,
+  "customMetricPrefix": ""
+}
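
The durations in this config file are written as strings such as `"60s"`. Go's `encoding/json` does not decode such strings into `time.Duration` on its own, so a reader of the file needs an explicit conversion step. The sketch below shows one way to do that with `time.ParseDuration`. The `exporterConfig` struct and `parseExportPeriod` function are hypothetical and only mirror the JSON keys above; the exporter's actual config type is not part of the lines shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// exporterConfig is a hypothetical struct mirroring the JSON keys in the
// config file above. Durations stay strings until parsed explicitly.
type exporterConfig struct {
	APIEndpoint                 string `json:"apiEndpoint"`
	ExportPeriod                string `json:"exportPeriod"`
	MetadataFetchTimeout        string `json:"metadataFetchTimeout"`
	MetadataFetchInterval       string `json:"metadataFetchInterval"`
	PanicOnMetadataFetchFailure bool   `json:"panicOnMetadataFetchFailure"`
	CustomMetricPrefix          string `json:"customMetricPrefix"`
}

// parseExportPeriod decodes the config and converts exportPeriod to a Duration.
func parseExportPeriod(raw []byte) (time.Duration, error) {
	var cfg exporterConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return 0, fmt.Errorf("decoding config: %w", err)
	}
	d, err := time.ParseDuration(cfg.ExportPeriod)
	if err != nil {
		return 0, fmt.Errorf("parsing exportPeriod %q: %w", cfg.ExportPeriod, err)
	}
	return d, nil
}

func main() {
	raw := []byte(`{
  "apiEndpoint": "monitoring.googleapis.com:443",
  "exportPeriod": "60s",
  "metadataFetchTimeout": "600s",
  "metadataFetchInterval": "10s",
  "panicOnMetadataFetchFailure": false,
  "customMetricPrefix": ""
}`)
	d, err := parseExportPeriod(raw)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("export period:", d)
}
```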

config/systemd/node-problem-detector-metric-only.service

Lines changed: 3 additions & 2 deletions
@@ -1,12 +1,13 @@
 [Unit]
 Description=Node problem detector
-Wants=local-fs.target
-After=local-fs.target
+Wants=network-online.target
+After=network-online.target
 
 [Service]
 Restart=always
 RestartSec=10
 ExecStart=/home/kubernetes/bin/node-problem-detector --v=2 --logtostderr --enable-k8s-exporter=false \
+    --exporter.stackdriver=/home/kubernetes/node-problem-detector/config/exporter/stackdriver-exporter.json \
     --config.system-log-monitor=/home/kubernetes/node-problem-detector/config/kernel-monitor.json,/home/kubernetes/node-problem-detector/config/docker-monitor.json,/home/kubernetes/node-problem-detector/config/systemd-monitor.json \
     --config.custom-plugin-monitor=/home/kubernetes/node-problem-detector/config/kernel-monitor-counter.json,/home/kubernetes/node-problem-detector/config/systemd-monitor-counter.json \
     --config.system-stats-monitor=/home/kubernetes/node-problem-detector/config/system-stats-monitor.json

pkg/exporters/register.go

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+/*
+Copyright 2019 The Kubernetes Authors All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+*/
+
+package exporters
+
+import (
+	"fmt"
+
+	"k8s.io/node-problem-detector/pkg/types"
+)
+
+var (
+	handlers = make(map[types.ExporterType]types.ExporterHandler)
+)
+
+// Register registers an exporter factory method, which will be used to create the exporter.
+func Register(exporterType types.ExporterType, handler types.ExporterHandler) {
+	handlers[exporterType] = handler
+}
+
+// GetExporterNames retrieves all available exporter types.
+func GetExporterNames() []types.ExporterType {
+	exporterTypes := []types.ExporterType{}
+	for exporterType := range handlers {
+		exporterTypes = append(exporterTypes, exporterType)
+	}
+	return exporterTypes
+}
+
+// GetExporterHandlerOrDie retrieves the ExporterHandler for a specific type of exporter, and panics if an error occurs.
+func GetExporterHandlerOrDie(exporterType types.ExporterType) types.ExporterHandler {
+	handler, ok := handlers[exporterType]
+	if !ok {
+		panic(fmt.Sprintf("Exporter handler for %v does not exist", exporterType))
+	}
+	return handler
+}
+
+// NewExporters creates all exporters based on the configurations initialized.
+func NewExporters() []types.Exporter {
+	exporters := []types.Exporter{}
+	for _, handler := range handlers {
+		exporter := handler.CreateExporterOrDie(handler.Options)
+		if exporter == nil {
+			continue
+		}
+		exporters = append(exporters, exporter)
+	}
+	return exporters
+}
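
To make the registration flow concrete, here is a simplified, self-contained sketch of how a plugin package might hook into a registry like the one above from its `init` function. The types are reduced stand-ins, not the definitions from `pkg/types`: in particular, `CreateExporterOrDie` here takes no arguments, whereas the real handler is called with `handler.Options`. The `fakeExporter` type and the `"fake"` and `"unconfigured"` names are hypothetical.

```go
package main

import "fmt"

// Exporter, ExporterType and ExporterHandler are simplified stand-ins for the
// types in pkg/types, whose real definitions are not shown in this diff.
type Exporter interface {
	Name() string
}

type ExporterType string

type ExporterHandler struct {
	// CreateExporterOrDie returns nil when the exporter is not configured.
	CreateExporterOrDie func() Exporter
}

var handlers = make(map[ExporterType]ExporterHandler)

// Register stores the factory for an exporter type.
func Register(t ExporterType, h ExporterHandler) {
	handlers[t] = h
}

// NewExporters builds every registered exporter and skips nil results,
// following the same shape as NewExporters in register.go.
func NewExporters() []Exporter {
	exporters := []Exporter{}
	for _, h := range handlers {
		if e := h.CreateExporterOrDie(); e != nil {
			exporters = append(exporters, e)
		}
	}
	return exporters
}

// fakeExporter is a hypothetical exporter used only for this example.
type fakeExporter struct{ name string }

func (f fakeExporter) Name() string { return f.name }

// init plays the role of a plugin package's init: importing the package for
// side effects is enough to make its exporter available.
func init() {
	Register("fake", ExporterHandler{
		CreateExporterOrDie: func() Exporter { return fakeExporter{name: "fake"} },
	})
	// An exporter whose config flag was left empty reports nil and is skipped.
	Register("unconfigured", ExporterHandler{
		CreateExporterOrDie: func() Exporter { return nil },
	})
}

func main() {
	for _, e := range NewExporters() {
		fmt.Println("started exporter:", e.Name())
	}
}
```

Because registration happens in `init`, the blank import guarded by the `!disable_stackdriver_exporter` build tag is the only switch needed to include or exclude an exporter from the binary.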
