Commit f3e7ace

roffe authored and murali-reddy committed
Metrics + Logging update (#294)
* - added protocol & port label to metrics - removed some redundant code
* added example dashboard
* added dashboard screenshot
* updated dashboard json & screenshot
* ammend bad dashboard export
* first new metric
* .
* more metrics: controller_publish_metrics_time & controller_iptables_sync_time
* namespace redeclared
* fix typo in name
* smal fixes
* new metric controller_bgp_peers & controller_bgp_internal_peers_sync_time
* typo fix
* new metric controller_ipvs_service_sync_time
* fix
* register metric
* fix
* fix
* added more metrics
* service controller log levels
* fix
* fix
* added metrics controller
* fixes
* fix
* fix
* fixed more log levels
* server and graceful shutdown
* fix
* fix
* fix
* code cleanup
* docs
* move metrics exporting to controller
* fix
* fix
* fixes
* fix
* fix missing
* fix
* fix
* test
* test
* fix
* fix
* fix
* updated dashboard
* updates to metric controller
* fixed order in newmetricscontroller
* err declared and not used
* updated dashboard
* updated dashboard screenshot
* removed --metrics & changed --metrics-port to enable / disable metrics
* #271
* cannot use config.MetricsPort (type uint16) as type int in assignment
* cannot use mc.MetricsPort (type uint16) as type int in argument to strconv.Itoa
* updated docs
* changed default metric port to 0, disabled
* added missing newline to .dockerignore
* add lag parse to pickup on -v directives
* test
* test
* test
* fix regression
* syntax error: non-declaration statement outside function body
* fix
* changed nsc to mc
* updated docs
* markdown fix
* moved metrics registration out to respective controller so only metrics for running parts will be exposed
* removed junk that came from visual studio code
* fixed some typos
* Moved the metrics back into each controller and added expose behaviour so only the running components metrics would be published
* removed to much, added back instanciation of metricscontroller
* fixed some invalid variable names
* fixed last typos on config name
* fixed order in newnetworkservicecontroller
* updated metrics docs & removed the metrics sync period as it will obey the controllers sync period
* forgott to save options.go
* cleanup
* Updated metric name & docs
* updated metrics.md
* fixed a high cpu usage bug in the metrics_controller's wait loop
1 parent 1492f0b commit f3e7ace

13 files changed: +1098 -285 lines changed

.dockerignore

Lines changed: 11 additions & 0 deletions
@@ -1 +1,12 @@
+.git
 **/_cache
+app
+build-image
+cni
+contrib
+daemonset
+dashboard
+Documentation
+hack
+utils
+vendor

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
+.vscode
 /kube-router
 /gobgp
 _output

Documentation/generic.md

Lines changed: 5 additions & 2 deletions
@@ -51,7 +51,6 @@ Any iptables rules kube-proxy left around will also need to be cleaned up. This
 
 docker run --privileged --net=host gcr.io/google_containers/kube-proxy-amd64:v1.7.3 kube-proxy --cleanup-iptables
 
-
 ## Running kube-router without the service proxy
 
 This runs kube-router with pod/service networking and the network policy firewall. The Services proxy is disabled.
@@ -60,4 +59,8 @@ This runs kube-router with pod/service networking and the network policy firewal
 
 In this mode kube-router relies on, for example, [kube-proxy](https://kubernetes.io/docs/reference/generated/kube-proxy/) to provide service networking.
 
-When service proxy is disabled kube-router will use [in-cluster configuration](https://github.com/kubernetes/client-go/tree/master/examples/in-cluster-client-configuration) to access APIserver through cluster-ip. Service networking must therefore be set up before deploying kube-router.
+When service proxy is disabled kube-router will use [in-cluster configuration](https://github.com/kubernetes/client-go/tree/master/examples/in-cluster-client-configuration) to access APIserver through cluster-ip. Service networking must therefore be set up before deploying kube-router.
+
+## Debugging
+
+kube-router supports setting the log level via the command-line flags `-v` or `--v`. To get maximal debug output from kube-router, start it with `--v=3`.
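
The `-v` / `--v` equivalence mentioned in the Debugging section matches how Go's standard `flag` package parses options: one or two leading dashes are accepted interchangeably (glog registers its `v` flag through that package). The following is a minimal standalone sketch of that behaviour, not kube-router's actual option handling; the `parseVerbosity` helper and the flag set name are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseVerbosity is a hypothetical helper: it parses args and returns the
// value of the "v" flag, defaulting to 0 when the flag is absent.
func parseVerbosity(args []string) (int, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	v := fs.Int("v", 0, "log level for verbose messages")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *v, nil
}

func main() {
	// The standard flag package treats "-v" and "--v" identically.
	for _, args := range [][]string{{"-v=3"}, {"--v=3"}, {"-v", "3"}, {}} {
		level, err := parseVerbosity(args)
		fmt.Println(args, level, err)
	}
}
```

All three spellings yield the same level, and omitting the flag falls back to the default of 0 chosen for this sketch.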

Documentation/metrics.md

Lines changed: 40 additions & 8 deletions
@@ -5,16 +5,24 @@
 The scope of this document is to describe how to set up the [annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) needed for [Prometheus](https://prometheus.io/) to use [Kubernetes SD](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#<kubernetes_sd_config>) to discover & scrape kube-router [pods](https://kubernetes.io/docs/concepts/workloads/pods/pod/).
 For help with installing Prometheus please see their [docs](https://prometheus.io/docs/introduction/overview/)
 
-By default kube-router will export Prometheus metrics on port `8080` under the path `/metrics`.
-If running kube-router as [daemonset](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) this port might collide with other applications running on the host network and must be changed.
+Metrics options:
 
-kube-router 0.1.0-rc2 and upwards supports the following runtime configuration for controlling where to expose the metrics.
-If you are using an older version, metrics path & port are locked to `/metrics` & `8080`.
+--metrics-path string Path to serve Prometheus metrics on ( default: /metrics )
+--metrics-port uint16 <0-65535> Prometheus metrics port to use ( default: 0, disabled )
 
---metrics-port int Prometheus metrics port to use ( default 8080 )
---metrics-path string Path to serve Prometheus metrics on ( default /metrics )
+To enable kube-router metrics, start kube-router with `--metrics-port` and provide a port greater than 0
 
-By enabling [Kubernetes SD](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#<kubernetes_sd_config>) in Prometheus configuration & adding required annotations it can automatically discover & scrape kube-router metrics.
+Metrics are generally exported at the same rate as the sync period for each service.
+
+The default values, unless otherwise specified, are:
+iptables-sync-period - 1 min
+ipvs-sync-period - 1 min
+routes-sync-period - 1 min
+
+By enabling [Kubernetes SD](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#<kubernetes_sd_config>) in Prometheus configuration & adding required annotations Prometheus can automatically discover & scrape kube-router metrics
+
+## Version notes
+kube-router 0.1.0-rc2 and upwards supports the runtime configuration for controlling where to expose the metrics. If you are using an older version, metrics path & port are locked to `/metrics` & `8080`
 
 ## Supported annotations
 

@@ -39,8 +47,32 @@ For example:
 
 ## Avail metrics
 
+If metrics are enabled, only the metrics of the running services are exposed
+
 The following metrics are exposed by kube-router, prefixed by `kube_router_`
 
+### run-router = true
+
+* controller_bgp_peers
+Number of BGP peers of the instance
+* controller_bgp_advertisements_received
+Number of total BGP advertisements received since kube-router start
+* controller_bgp_internal_peers_sync_time
+Time it took for the BGP internal peer sync loop to complete
+
+### run-firewall = true
+
+* controller_iptables_sync_time
+Time it took for the iptables sync loop to complete
+
+### run-service-proxy = true
+
+* controller_ipvs_services_sync_time
+Time it took for the ipvs sync loop to complete
+* controller_ipvs_services
+The number of ipvs services in the instance
+* controller_ipvs_metrics_export_time
+The time it took to run the metrics export for IPVS services
 * service_total_connections
 Total connections made to the service since creation
 * service_packets_in
@@ -68,4 +100,4 @@ To get a grouped list of CPS for each service a Prometheus query could look like
 ## Grafana Dashboard
 
 This repo contains an example [Grafana dashboard](https://raw.githubusercontent.com/cloudnativelabs/kube-router/master/dashboard/kube-router.json) utilizing all the above exposed metrics from kube-router.
-![dashboard](https://raw.githubusercontent.com/cloudnativelabs/kube-router/master/dashboard/dashboard.png)
+![dashboard](https://raw.githubusercontent.com/cloudnativelabs/kube-router/master/dashboard/dashboard.png)
Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
+package controllers
+
+import (
+	"net"
+	"net/http"
+	"strconv"
+	"sync"
+
+	"github.com/cloudnativelabs/kube-router/app/options"
+	"github.com/golang/glog"
+	"github.com/prometheus/client_golang/prometheus"
+	"github.com/prometheus/client_golang/prometheus/promhttp"
+	"golang.org/x/net/context"
+	"k8s.io/client-go/kubernetes"
+)
+
+var (
+	serviceTotalConn = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_total_connections",
+		Help:      "Total incoming connections made",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	servicePacketsIn = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_packets_in",
+		Help:      "Total incoming packets",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	servicePacketsOut = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_packets_out",
+		Help:      "Total outgoing packets",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	serviceBytesIn = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_bytes_in",
+		Help:      "Total incoming bytes",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	serviceBytesOut = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_bytes_out",
+		Help:      "Total outgoing bytes",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	servicePpsIn = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_pps_in",
+		Help:      "Incoming packets per second",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	servicePpsOut = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_pps_out",
+		Help:      "Outgoing packets per second",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	serviceCPS = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_cps",
+		Help:      "Service connections per second",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	serviceBpsIn = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_bps_in",
+		Help:      "Incoming bytes per second",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	serviceBpsOut = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "service_bps_out",
+		Help:      "Outgoing bytes per second",
+	}, []string{"namespace", "service_name", "service_vip", "protocol", "port"})
+	controllerIpvsServices = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "controller_ipvs_services",
+		Help:      "Number of ipvs services in the instance",
+	}, []string{})
+	controllerIptablesSyncTime = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "controller_iptables_sync_time",
+		Help:      "Time it took for controller to sync iptables",
+	}, []string{})
+	controllerPublishMetricsTime = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "controller_publish_metrics_time",
+		Help:      "Time it took to publish metrics",
+	}, []string{})
+	controllerIpvsServicesSyncTime = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "controller_ipvs_services_sync_time",
+		Help:      "Time it took for controller to sync ipvs services",
+	}, []string{})
+	controllerBPGpeers = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "controller_bgp_peers",
+		Help:      "BGP peers in the runtime configuration",
+	}, []string{})
+	controllerBGPInternalPeersSyncTime = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "controller_bgp_internal_peers_sync_time",
+		Help:      "Time it took to sync internal bgp peers",
+	}, []string{})
+	controllerBGPadvertisementsReceived = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "controller_bgp_advertisements_received",
+		Help:      "Number of total BGP advertisements received since kube-router start",
+	}, []string{})
+	controllerIpvsMetricsExportTime = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+		Namespace: namespace,
+		Name:      "controller_ipvs_metrics_export_time",
+		Help:      "Time it took to export metrics",
+	}, []string{})
+)
+
+// MetricsController holds settings for the metrics controller
+type MetricsController struct {
+	endpointsMap endpointsInfoMap
+	MetricsPath  string
+	MetricsPort  uint16
+	mu           sync.Mutex
+	nodeIP       net.IP
+	serviceMap   serviceInfoMap
+}
+
+// Run starts the Prometheus metrics controller
+func (mc *MetricsController) Run(stopCh <-chan struct{}, wg *sync.WaitGroup) error {
+	defer wg.Done()
+	glog.Info("Starting metrics controller")
+
+	// register metrics for this controller
+	prometheus.MustRegister(controllerIpvsMetricsExportTime)
+
+	srv := &http.Server{Addr: ":" + strconv.Itoa(int(mc.MetricsPort)), Handler: http.DefaultServeMux}
+
+	// add prometheus handler on metrics path
+	http.Handle(mc.MetricsPath, promhttp.Handler())
+
+	go func() {
+		if err := srv.ListenAndServe(); err != nil {
+			// cannot panic, because this probably is an intentional close
+			glog.Errorf("Metrics controller error: %s", err)
+		}
+	}()
+
+	<-stopCh
+	glog.Infof("Shutting down metrics controller")
+	if err := srv.Shutdown(context.Background()); err != nil {
+		glog.Errorf("could not shutdown: %v", err)
+	}
+	return nil
+}
+
+// NewMetricsController returns a new MetricsController object
+func NewMetricsController(clientset *kubernetes.Clientset, config *options.KubeRouterConfig) (*MetricsController, error) {
+	mc := MetricsController{}
+	mc.MetricsPath = config.MetricsPath
+	mc.MetricsPort = config.MetricsPort
+	return &mc, nil
+}
