From 3c688fc883b4c0dc1b5a5e3bd3b1729592488bdb Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Fri, 1 Nov 2019 20:17:02 +0900 Subject: [PATCH 1/9] Add KEP for Request ID --- .../20191101-add-request-id-to-k8s-logs.md | 87 +++++++++++++++++++ 1 file changed, 87 insertions(+) create mode 100644 keps/sig-auth/20191101-add-request-id-to-k8s-logs.md diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md new file mode 100644 index 00000000000..fc6bdb488b8 --- /dev/null +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -0,0 +1,87 @@ +--- +title: Add Request-ID to each k8s component log +authors: + - "@hase1128" + - "@sshukun" + - "@furukawa3" + - "@vanou" +owning-sig: sig-auth +participating-sigs: + - sig-auth +reviewers: + - TBD +approvers: + - TBD +editor: TBD +creation-date: 2019-11-01 +last-updated: 2019-11-01 +status: provisional +--- + +# Add Request-ID to each k8s component log + +## Table of Contents + + + - [Summary](#summary) + - [Motivation](#motivation) + - [Target User](#target-user) + - [Target User's objective](#target-users-objective) + - [Case 1](#case-2) + - [Case 2](#case-1) + - [Goals](#goals) + - [Non-Goals](#non-goals) + + + +## Summary + +This KEP proposes a new unique logging meta-data into all K8s logs. It makes us +more easy to identfy specfic logs related to a single API request (such as +`kubectl apply -f `). This feature is similar to +[Request-ID](https://docs.openstack.org/api-guide/compute/faults.html) for +OpenStack. It greatly reduces the investigation cost. + +## Motivation + +### Target User + +Support team in k8s Service Provider + +### Target User's objective + +We'd like to resolve quickly for end users' problem. + +Tracking logs among each k8s component related to specific an API request is +very tough work. It is necessary to match logs with timestamps as hints. If +multiple users throw many API requests at the same time, it is very difficult to +track logs across each k8s component log. It is difficult that target user can +not resolve end user's problem quickly in the above. Therefore, we'd like to add +a new identifier which is unique to each API request. This feature is useful for +the following use cases: + +#### Case 1 + +In case of insecure or unauthorized operation happens, it is necessary to +identify what effect that operation caused. This proposed feature helps identify +what happened at each component or server by each insecure / unauthorized API +request. We can collect these logs as an evidence. + +#### Case 2 + +If the container is terminated by OOM killer, there is a case to break down the +issue into parts(Pod or k8s) from the messages related OOM killer on host logs +and the API processing just before OOM killer. If the cause is that some unknown +pod creations, it is helpful to detect the root API request and who called this +request. + +### Goals + +Adding a Request-ID into all K8s component logs. The Request-ID is unique to an +operation. + +### Non-Goals + +To centrally manage the logs of each k8s component with `Request-ID` (This can +be realized with existing OSS such as Kibana, so no need to implement into K8s +components). From 541863b37a85690a19c839dc2dd41e0126b4aea4 Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Thu, 12 Dec 2019 18:46:20 +0900 Subject: [PATCH 2/9] Update 20191101-add-request-id-to-k8s-logs.md --- .../20191101-add-request-id-to-k8s-logs.md | 50 +++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md index fc6bdb488b8..1227fdbf31f 100644 --- a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -85,3 +85,53 @@ operation. To centrally manage the logs of each k8s component with `Request-ID` (This can be realized with existing OSS such as Kibana, so no need to implement into K8s components). + +## Proposal + +The following two features are necessary to realize log tracking + +### Propagate ID information + +- Add Request-ID and Parent-ID information to objects related to a single API request +- Example: When creating a deployment resource, objects with the following Request-ID and Parent-ID are created + +| Object | Object-UUID | Parent-ID | Request-ID | +| ------ | ------ | ------ | ------ | +| Deployment | 1000 | null | 100 | +| ReplicaSet | 1010 | 1000 | 100 | +| Pod1 | 1020 | 1000 | 100 | +| Pod2 | 1030 | 1000 | 100 | +| Pod3 | 1040 | 1000 | 100 | + +- Replicasets and pods related to the above Deployment are linked by Object-UUID and Parent-ID. +- In the above example, Replicaset and Pods have “1000” as the Parent-ID. In this case, these objects are linked to an object which has “1000” as Object-UUID +- Also, Reuest-ID should be associated with Request-ID in Audit log + +### Add ID information to the logs + + - Add the above Object-UUID and Parent-ID information to the logs of each k8s component + - Current log example: + +``` +Kube-apiserver +I1028 11:21:02.700432 11190 httplog.go:90] POST /api/v1/namespaces/kube-system/pods/kube-dns-68496566b5-24cwk/binding: (6.841722ms) 201 [hyperkube/v1.16.3 (linux/amd64) kubernetes/e76a12b/scheduler [::1]:35762] + +Kube-scheduler +I1028 11:21:02.692701 11463 scheduler.go:530] Attempting to schedule pod: kube-system/kube-dns-68496566b5-24cwk +I1028 11:21:02.693154 11463 factory.go:610] Attempting to bind kube-dns-68496566b5-24cwk to 127.0.0.1 +I1028 11:21:02.700681 11463 scheduler.go:667] pod kube-system/kube-dns-68496566b5-24cwk is bound successfully on node "127.0.0.1", 1 nodes evaluated, 1 nodes were found feasible. Bound node resource: "Capacity: CPU<4>|Memory<8037268Ki>|Pods<110>|StorageEphemeral<71724152Ki>. + +Kubelet +I1028 11:21:02.699986 11594 kubelet.go:1901] SyncLoop (ADD, "api"): "kube-dns-68496566b5-24cwk_kube-system(7b0e128d-2a58-4c2c-8374-1ef872eefa65)" +``` + + - New log example: Add ID information to the head of each log + +``` +I1028 11:21:02.700432 11190 httplog.go:90] [RequestID, ObjectID, ParentID] POST /api/v1/namespaces/kube-system/pods/kube-dns-68496566b5-24cwk/binding:(6.841722ms) 201 [hyperkube/v1.16.3 (linux/amd64) kubernetes/e76a12b/scheduler [::1]:35762] +I1028 11:21:02.699986 11594 kubelet.go:1901] [RequestID, ObjectID, ParentID] Attempting to bind kube-dns-68496566b5-24cwk to 127.0.0.1 +I1028 11:21:02.692701 11463 scheduler.go:530] [RequestID, ObjectID , ParentID] Attempting to schedule pod: kube-system/kube-dns-68496566b5-24cwk +``` + +- By combining the above two functions, we can easily search for logs related to a single API by using Request-ID information as a query key. +- In addition, we can also know more detail information about API request (e.g. when and who requested what API) by linking to Audit logs. From 02098a0fa496ac6739ea80671da58ea454a7c116 Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Thu, 12 Dec 2019 18:55:28 +0900 Subject: [PATCH 3/9] Update 20191101-add-request-id-to-k8s-logs.md --- keps/sig-auth/20191101-add-request-id-to-k8s-logs.md | 1 + 1 file changed, 1 insertion(+) diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md index 1227fdbf31f..24e57d0e1f4 100644 --- a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -31,6 +31,7 @@ status: provisional - [Case 2](#case-1) - [Goals](#goals) - [Non-Goals](#non-goals) + - [Proposal](#proposal) From 41fa2e0ad5a2f25f32ec9248f83562ca4992e872 Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Thu, 9 Jan 2020 16:03:56 +0900 Subject: [PATCH 4/9] Update 20191101-add-request-id-to-k8s-logs.md --- .../20191101-add-request-id-to-k8s-logs.md | 142 ++++++++++++++---- 1 file changed, 110 insertions(+), 32 deletions(-) diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md index 24e57d0e1f4..c831f1e7a14 100644 --- a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -14,7 +14,7 @@ approvers: - TBD editor: TBD creation-date: 2019-11-01 -last-updated: 2019-11-01 +last-updated: 2020-01-09 status: provisional --- @@ -89,50 +89,128 @@ components). ## Proposal -The following two features are necessary to realize log tracking +### Basic policy of Request-ID -### Propagate ID information + - Minimize the impact to existing features + - Simple implementation + - Collaborate with related KEPs. This means as follows. + - Ensure consistent metadata(This may need to be considered and managed across the Kubernetes beyond the related KEPs). + - Implementation which does not interfere related KEP's implementation -- Add Request-ID and Parent-ID information to objects related to a single API request -- Example: When creating a deployment resource, objects with the following Request-ID and Parent-ID are created +### Design Overview -| Object | Object-UUID | Parent-ID | Request-ID | -| ------ | ------ | ------ | ------ | -| Deployment | 1000 | null | 100 | -| ReplicaSet | 1010 | 1000 | 100 | -| Pod1 | 1020 | 1000 | 100 | -| Pod2 | 1030 | 1000 | 100 | -| Pod3 | 1040 | 1000 | 100 | +Request-ID feature consists the following two features. + - Export Request-ID to each kubernetes component log + - Propagate Request-ID to related objects -- Replicasets and pods related to the above Deployment are linked by Object-UUID and Parent-ID. -- In the above example, Replicaset and Pods have “1000” as the Parent-ID. In this case, these objects are linked to an object which has “1000” as Object-UUID -- Also, Reuest-ID should be associated with Request-ID in Audit log +#### Design overview of Export Request-ID -### Add ID information to the logs +**Collaboration with related KEP** +There is the existing KEP(structured logging feature) that related log exporting. +Main concept of this feature is structuring the log format and replace existing klog with logr for structuring. +Structured format is attractive, however this replacement is very tough work and may be required troublesome migration steps. +In this situation, it is expect to take a long time to migration completely. +In the meantime, Request-ID feature takes more simple way to export ID information to log files. +(It is possibility that Request-ID feature may take structured logging feature in the future) - - Add the above Object-UUID and Parent-ID information to the logs of each k8s component - - Current log example: +There is another related KEP(distributed tracing) that propagate Trace-ID(and etc.) to related kubernetes objects. +This feature adds these information(e.g. Trace-ID) to Annotations of objects. +Annotations is existing feature of kubernetes, and Request-ID feature adopts same method. +**Idea of design** +As a result, exporting Request-ID is implementable by just reading Annotations from objects when using klog. +Note that this is no impact to existing klog feature. + +##### Example of source code and log output + + - Original source code(scheduler.go) + +``` +func (sched *Scheduler) scheduleOne(ctx context.Context) { + fwk := sched.Framework + + podInfo := sched.NextPod() + + pod := podInfo.Pod + + klog.V(3).Infof("Attempting to schedule pod: %v/%v", pod.Namespace, pod.Name) ``` -Kube-apiserver -I1028 11:21:02.700432 11190 httplog.go:90] POST /api/v1/namespaces/kube-system/pods/kube-dns-68496566b5-24cwk/binding: (6.841722ms) 201 [hyperkube/v1.16.3 (linux/amd64) kubernetes/e76a12b/scheduler [::1]:35762] -Kube-scheduler -I1028 11:21:02.692701 11463 scheduler.go:530] Attempting to schedule pod: kube-system/kube-dns-68496566b5-24cwk -I1028 11:21:02.693154 11463 factory.go:610] Attempting to bind kube-dns-68496566b5-24cwk to 127.0.0.1 -I1028 11:21:02.700681 11463 scheduler.go:667] pod kube-system/kube-dns-68496566b5-24cwk is bound successfully on node "127.0.0.1", 1 nodes evaluated, 1 nodes were found feasible. Bound node resource: "Capacity: CPU<4>|Memory<8037268Ki>|Pods<110>|StorageEphemeral<71724152Ki>. + - Add Request-ID exportation into klog -Kubelet -I1028 11:21:02.699986 11594 kubelet.go:1901] SyncLoop (ADD, "api"): "kube-dns-68496566b5-24cwk_kube-system(7b0e128d-2a58-4c2c-8374-1ef872eefa65)" +``` +func (sched *Scheduler) scheduleOne(ctx context.Context) { + fwk := sched.Framework + + podInfo := sched.NextPod() + + pod := podInfo.Pod + + klog.V(3).Infof("Request-ID: %v Attempting to schedule pod: %v/%v", pod., pod.Namespace, pod.Name) ``` - - New log example: Add ID information to the head of each log + - Original log output +``` +I1220 08:58:31.000196 6869 scheduler.go:564] Attempting to schedule pod: default/nginx ``` -I1028 11:21:02.700432 11190 httplog.go:90] [RequestID, ObjectID, ParentID] POST /api/v1/namespaces/kube-system/pods/kube-dns-68496566b5-24cwk/binding:(6.841722ms) 201 [hyperkube/v1.16.3 (linux/amd64) kubernetes/e76a12b/scheduler [::1]:35762] -I1028 11:21:02.699986 11594 kubelet.go:1901] [RequestID, ObjectID, ParentID] Attempting to bind kube-dns-68496566b5-24cwk to 127.0.0.1 -I1028 11:21:02.692701 11463 scheduler.go:530] [RequestID, ObjectID , ParentID] Attempting to schedule pod: kube-system/kube-dns-68496566b5-24cwk + + - Request-ID log output ``` +I1220 08:58:31.000196 6869 scheduler.go:564] Request-ID : d0ac7061-d9fc-43d0-957f-dbc7306d3ace Attempting to schedule pod: default/nginx + +``` + +**Pros of this method** + - No impact to existing klog feature + - As a result, no interfere with structured logging implementation + +**Restricts** + - Only objects that can be obtained within the function scope calling klog can read annotations + +#### Step to implementation of Export Request-ID + +##### Step1 + +Target klogs: Only klogs that satisfy both of the following requirements. + - klogs that are called during typical kubectl operations + - e.g. kubectl operation: create, apply, delete, etc. + - e.g. kubernetes object: deployment, pod, etc. + - klogs that can get object's annotation in the scope which calls the klog. + +##### Step2 + +Expand the range of operations and resources from Step1. + - e.g. kubectl operation: rollout, scale, drain, etc. + - e.g. kubernetes object: service, secret, pv, etc. + +##### Step3(TBD.) + + - Considering a mechanism that can acquire annotations from any objects at any scopes, and then the target is klog that could not be done in Step1 and 2. + +#### Design overview of Propagate Request-ID + +**Collaboration with related KEP** +There is an idea to use `distributed context` of the existing KEP(distributed tracing). +In this case, use of OpenTelemetry is prerequisite. +However, Request-ID feature does not need Exporter of OpenTelemetry because Request-ID is exported to only kubernetes log file by klog. +Request-ID collaborates with distributed tracing KEP in terms of adding context(e.g. Trace-ID such as Request-ID) to Annotations and propagation feature(distributed context of OpenTelemetry). +At first(in Alpha stage), OpenTelemetry is prerequisite of Request-ID feature. +Eventually, propagating feature is implemented by in-tree code change only(not use OpenTelemetry).This idea is TBD. + +**Idea of design** +TBD. + + +### Summary of collaboration idea of related KEPs + +#### distributed tracing + + - Adding context information to kubernetes object's Annotations + - Propagate Annotations by distributed context of OpenTelemetry(Eventually, propagating feature is implemented by in-tree code change only) + - Consider consistent of metadata + +#### structured logging -- By combining the above two functions, we can easily search for logs related to a single API by using Request-ID information as a query key. -- In addition, we can also know more detail information about API request (e.g. when and who requested what API) by linking to Audit logs. + - Request-ID implementation does not interfere with structured logging implementation + - Consider consistent of metadata From 1383a3402229cd8f824686dc6ec394f321c35a8a Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Thu, 9 Jan 2020 16:43:28 +0900 Subject: [PATCH 5/9] Update 20191101-add-request-id-to-k8s-logs.md --- keps/sig-auth/20191101-add-request-id-to-k8s-logs.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md index c831f1e7a14..1d4a6080847 100644 --- a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -32,6 +32,9 @@ status: provisional - [Goals](#goals) - [Non-Goals](#non-goals) - [Proposal](#proposal) + - [Basic policy of Request-ID](#basic-policy-of-request-id) + - [Design Overview](#design-overview) + - [Summary of collaboration idea of related KEPs](#summary-of-collaboration-idea-of-related-keps) @@ -163,7 +166,7 @@ I1220 08:58:31.000196 6869 scheduler.go:564] Request-ID : d0ac7061-d9fc-43d0- **Pros of this method** - No impact to existing klog feature - - As a result, no interfere with structured logging implementation + - As a result, no interfere with structured logging implementation during migration to logr **Restricts** - Only objects that can be obtained within the function scope calling klog can read annotations From d76022826be86edb8d8dbad2404a5c41b9240aaa Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Wed, 26 Feb 2020 20:03:14 +0900 Subject: [PATCH 6/9] Update 20191101-add-request-id-to-k8s-logs.md --- .../20191101-add-request-id-to-k8s-logs.md | 208 +++++++++--------- 1 file changed, 98 insertions(+), 110 deletions(-) diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md index 1d4a6080847..cbe7d7a75b0 100644 --- a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -5,20 +5,19 @@ authors: - "@sshukun" - "@furukawa3" - "@vanou" -owning-sig: sig-auth +owning-sig: sig-instrumentation participating-sigs: - - sig-auth reviewers: - TBD approvers: - TBD editor: TBD creation-date: 2019-11-01 -last-updated: 2020-01-09 +last-updated: 2020-02-26 status: provisional --- -# Add Request-ID to each k8s component log +# Add Request-ID to each Kubernetes component log ## Table of Contents @@ -34,14 +33,24 @@ status: provisional - [Proposal](#proposal) - [Basic policy of Request-ID](#basic-policy-of-request-id) - [Design Overview](#design-overview) - - [Summary of collaboration idea of related KEPs](#summary-of-collaboration-idea-of-related-keps) + - [Design overview of Propagate Request-ID](#design-overview-of-propagate-request-id) + - [Detail design of Propagate Request-ID](#detail-design-of-propagate-request-id) + - [Design overview of Export Request-ID](#design-overview-of-export-request-id) + - [Detail design of Export Request-ID](#detail-design-of-export-request-id) + - [Design overview of Control Request-ID](#design-overview-of-control-request-id) + - [Detail design of Control Request-ID](#detail-design-of-control-request-id) + - [Test Plan](#test-plan) + - [Migration / Graduation Criteria](#migration--graduation-criteria) + - [Alpha](#alpha) + - [Beta](#beta) + - [GA](#ga) ## Summary -This KEP proposes a new unique logging meta-data into all K8s logs. It makes us -more easy to identfy specfic logs related to a single API request (such as +This KEP proposes a new unique logging meta-data into all Kubernetes logs. It makes us +more easy to identify specific logs related to a single user operation (such as `kubectl apply -f `). This feature is similar to [Request-ID](https://docs.openstack.org/api-guide/compute/faults.html) for OpenStack. It greatly reduces the investigation cost. @@ -56,13 +65,9 @@ Support team in k8s Service Provider We'd like to resolve quickly for end users' problem. -Tracking logs among each k8s component related to specific an API request is -very tough work. It is necessary to match logs with timestamps as hints. If -multiple users throw many API requests at the same time, it is very difficult to -track logs across each k8s component log. It is difficult that target user can -not resolve end user's problem quickly in the above. Therefore, we'd like to add -a new identifier which is unique to each API request. This feature is useful for -the following use cases: +Tracking logs among each Kubernetes component related to specific an user operation is very tough work. It is necessary to match logs with timestamps as hints. If multiple users throw many API requests at the same time, it is very difficult to track logs across each Kubernetes component log. + +It is difficult that support team in k8s Service Provider resolve end user's problem quickly in the above. Therefore, we'd like to add a new identifier which is unique to each user operation. This feature is useful for the following use cases: #### Case 1 @@ -74,146 +79,129 @@ request. We can collect these logs as an evidence. #### Case 2 If the container is terminated by OOM killer, there is a case to break down the -issue into parts(Pod or k8s) from the messages related OOM killer on host logs +issue into parts(Pod or Kubernetes) from the messages related OOM killer on host logs and the API processing just before OOM killer. If the cause is that some unknown pod creations, it is helpful to detect the root API request and who called this request. ### Goals -Adding a Request-ID into all K8s component logs. The Request-ID is unique to an -operation. + - Adding a Request-ID into each K8s component log. + - The Request-ID is unique to an operation. + - Control enabled/disabled Request-ID feature(Request-ID feature is disabled on default to avoid an impact for existing user). ### Non-Goals -To centrally manage the logs of each k8s component with `Request-ID` (This can -be realized with existing OSS such as Kibana, so no need to implement into K8s + - To centrally manage the logs of each Kubernetes component with Request-ID (This can +be realized with existing OSS such as Kibana, so no need to implement into Kubernetes components). + - We don't associate Request-ID to all of operations(Our target is important operations such as `kubectl create/delete/etc.`). ## Proposal ### Basic policy of Request-ID - - Minimize the impact to existing features - - Simple implementation - - Collaborate with related KEPs. This means as follows. - - Ensure consistent metadata(This may need to be considered and managed across the Kubernetes beyond the related KEPs). - - Implementation which does not interfere related KEP's implementation + - Minimize the impact to existing users who are retrieving logs and analyzing with existing log format. + - So we disabled Request-ID feature on default. + - Collaborate with related KEPs to avoid unnecessary conflict to them regarding implementation and feature. + - Use existing KEP's feature as much as possible. + - Therefore, we will merge Request-ID feature after related KEP features are merged. ### Design Overview -Request-ID feature consists the following two features. - - Export Request-ID to each kubernetes component log - - Propagate Request-ID to related objects - -#### Design overview of Export Request-ID - -**Collaboration with related KEP** -There is the existing KEP(structured logging feature) that related log exporting. -Main concept of this feature is structuring the log format and replace existing klog with logr for structuring. -Structured format is attractive, however this replacement is very tough work and may be required troublesome migration steps. -In this situation, it is expect to take a long time to migration completely. -In the meantime, Request-ID feature takes more simple way to export ID information to log files. -(It is possibility that Request-ID feature may take structured logging feature in the future) - -There is another related KEP(distributed tracing) that propagate Trace-ID(and etc.) to related kubernetes objects. -This feature adds these information(e.g. Trace-ID) to Annotations of objects. -Annotations is existing feature of kubernetes, and Request-ID feature adopts same method. +Request-ID feature consists the three features. + - Propagate Request-ID to related objects. + - Export Request-ID to each kubernetes component log. + - Control enabled/disabled the above two features. -**Idea of design** -As a result, exporting Request-ID is implementable by just reading Annotations from objects when using klog. -Note that this is no impact to existing klog feature. - -##### Example of source code and log output +#### Design overview of Propagate Request-ID - - Original source code(scheduler.go) +There is an idea to use `distributed context` of the existing KEP([Distributed Tracing](https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/0034-distributed-tracing-kep.md)). We use `Distributed Tracing` feature for propagation. -``` -func (sched *Scheduler) scheduleOne(ctx context.Context) { - fwk := sched.Framework + - We use Trace-ID as Request-ID + - Trace-ID is contained in Annotation of Kubernetes objects. (This is a part of `Distributed Tracing` feature) + - Trace-ID is removed when tracing is finished. (This is a part of `Distributed Tracing` feature) - podInfo := sched.NextPod() - - pod := podInfo.Pod - - klog.V(3).Infof("Attempting to schedule pod: %v/%v", pod.Namespace, pod.Name) -``` +We have two notes should be considered when using `Distributed Tracing` feature for Request-ID feature. - - Add Request-ID exportation into klog +**NOTE1:** +`Distributed Tracing` feature is only enabled when `--trace` option is added to `kubectl` command. However, in order for the Kubernetes service provider's support team to troubleshoot from customer's Kubernetes log files, we would like to ensure that Request-ID is always added to the log file regardless of `--trace` option. So we need to implement additional parameter which called `--request-id`. This parameter can control(enabled/disabled) Request-ID feature(See [Design overview of Control Request-ID](#design-overview-of-control-request-id)). As a result, the customers who want to use Request-ID can always use this feature, and does not affect other users who does not want to use this feature. Below is a table showing the relationship between the `--trace option` and the `--request-id` parameter. -``` -func (sched *Scheduler) scheduleOne(ctx context.Context) { - fwk := sched.Framework +| | --trace: OFF | --trace: ON | +| ------ | ------ | ------ | +| --request-id = 0 | No Tracing / No export Request-ID | Tracing / No export Request-ID | +| --request-id > 0 | Tracing / Export Request-ID | Tracing / Export Request-ID | - podInfo := sched.NextPod() - - pod := podInfo.Pod - - klog.V(3).Infof("Request-ID: %v Attempting to schedule pod: %v/%v", pod., pod.Namespace, pod.Name) -``` +**NOTE2:** +To trace each Kubernetes function, we need to add codes into related k8s function. So the following implementation is needed. + - Case1. The function which is added Tracing codes by `Distributed Tracing` KEP + - We add Request-ID codes over Tracing codes. + - Case2. The function which is not added Tracing codes by `Distributed Tracing` KEP + - We add both of Tracing codes and Request-id codes. - - Original log output -``` -I1220 08:58:31.000196 6869 scheduler.go:564] Attempting to schedule pod: default/nginx +#### Detail design of Propagate Request-ID -``` +TBD. I will write down the following things. + - List up the target function that we add tracing and Request-id codes + - Sample codes of case 1 and 2 of the above `NOTE2` - - Request-ID log output -``` -I1220 08:58:31.000196 6869 scheduler.go:564] Request-ID : d0ac7061-d9fc-43d0-957f-dbc7306d3ace Attempting to schedule pod: default/nginx +#### Design overview of Export Request-ID -``` +We add Request-ID information into klog calls. Note that we don't associate Request-ID to all of operations. Our target is important operations such as `kubectl create/delete/etc.`, and our target klog calls is the only klogs which is called via such important operations. Request-ID feature does not change existing klog function, but changes each klog calls and their log format. Currently, there is [Structured logging](https://github.com/serathius/enhancements/blob/structured-logging/keps/sig-instrumentation/20191115-structured-logging.md) KEP, and this KEP also change specific klog calls. We will merge Request-ID feature after Structured logging KEP is merged. -**Pros of this method** - - No impact to existing klog feature - - As a result, no interfere with structured logging implementation during migration to logr +#### Detail design of Export Request-ID -**Restricts** - - Only objects that can be obtained within the function scope calling klog can read annotations +TBD. I will write down the following things. + - List up the target klogs that we add Request-id + - Sample codes of klog calls that we changes + - Sample logs which is added Request-ID -#### Step to implementation of Export Request-ID +#### Design overview of Control Request-ID -##### Step1 +We should control Request-ID feature to avoid an impact to existing users who are retrieving logs and analyzing with existing log format. So we introduce `--request-id` parameter which enables/disables Request-ID feature. We also manage the range of operations which are added Request-ID. The effect of each parameter of `--request-id` is as follows. -Target klogs: Only klogs that satisfy both of the following requirements. - - klogs that are called during typical kubectl operations - - e.g. kubectl operation: create, apply, delete, etc. - - e.g. kubernetes object: deployment, pod, etc. - - klogs that can get object's annotation in the scope which calls the klog. +| parameter | efficient | +| ------ | ------ | +| --request-id=0 | Request-ID feature is disabled (Default) | +| --request-id=1 | Request-ID feature is enabled, and Request-ID is added to klogs related to the `Alpha` target operations | +| --request-id=2 | Request-ID feature is enabled, and Request-ID is added to klogs related to the `Alpha and Beta` target operations | -##### Step2 +Alpha and Beta target operations are described in Migration / Graduation Criteria section. -Expand the range of operations and resources from Step1. - - e.g. kubectl operation: rollout, scale, drain, etc. - - e.g. kubernetes object: service, secret, pv, etc. +#### Detail design of Control Request-ID -##### Step3(TBD.) +TBD. I will write down the following things. + - How to realize `--request-id` parameter in each Kubernetes component. + - Sample codes which is used with `--request-id` and `--trace` option. - - Considering a mechanism that can acquire annotations from any objects at any scopes, and then the target is klog that could not be done in Step1 and 2. +### Test Plan -#### Design overview of Propagate Request-ID + - test against the combination of following patterns. + - --trace(OFF/ON) / --request-id(0/1/2) -**Collaboration with related KEP** -There is an idea to use `distributed context` of the existing KEP(distributed tracing). -In this case, use of OpenTelemetry is prerequisite. -However, Request-ID feature does not need Exporter of OpenTelemetry because Request-ID is exported to only kubernetes log file by klog. -Request-ID collaborates with distributed tracing KEP in terms of adding context(e.g. Trace-ID such as Request-ID) to Annotations and propagation feature(distributed context of OpenTelemetry). -At first(in Alpha stage), OpenTelemetry is prerequisite of Request-ID feature. -Eventually, propagating feature is implemented by in-tree code change only(not use OpenTelemetry).This idea is TBD. +### Migration / Graduation Criteria -**Idea of design** -TBD. +#### Alpha + - Add Request-ID against the following operations: + - kubectl create/apply/delete + - target resources: pod/deployment + - kubectl drain + - target resources: node + - Implement `--request-id` parameter + - E2e testing + - User-facing documentation -### Summary of collaboration idea of related KEPs +#### Beta -#### distributed tracing + - Add Request-ID against the following operations: + - kubectl create/apply/delete + - target resources: daemonset/pv/pvc/svc + - kubectl scale/rollout + - Update E2e testing + - Update documentation - - Adding context information to kubernetes object's Annotations - - Propagate Annotations by distributed context of OpenTelemetry(Eventually, propagating feature is implemented by in-tree code change only) - - Consider consistent of metadata +#### GA -#### structured logging + - All feedback is addressed. - - Request-ID implementation does not interfere with structured logging implementation - - Consider consistent of metadata From c39b428a1d4aa1204e3faddbc38d6dacf2487356 Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Wed, 11 Mar 2020 17:04:18 +0900 Subject: [PATCH 7/9] Update 20191101-add-request-id-to-k8s-logs.md --- .../20191101-add-request-id-to-k8s-logs.md | 187 ++++++++++++++---- 1 file changed, 144 insertions(+), 43 deletions(-) diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md index cbe7d7a75b0..c51aa43989c 100644 --- a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -33,10 +33,8 @@ status: provisional - [Proposal](#proposal) - [Basic policy of Request-ID](#basic-policy-of-request-id) - [Design Overview](#design-overview) - - [Design overview of Propagate Request-ID](#design-overview-of-propagate-request-id) - - [Detail design of Propagate Request-ID](#detail-design-of-propagate-request-id) - - [Design overview of Export Request-ID](#design-overview-of-export-request-id) - - [Detail design of Export Request-ID](#detail-design-of-export-request-id) + - [Design of Propagate Request-ID](#design-of-propagate-request-id) + - [Design of Export Request-ID](#design-of-export-request-id) - [Design overview of Control Request-ID](#design-overview-of-control-request-id) - [Detail design of Control Request-ID](#detail-design-of-control-request-id) - [Test Plan](#test-plan) @@ -53,7 +51,7 @@ This KEP proposes a new unique logging meta-data into all Kubernetes logs. It ma more easy to identify specific logs related to a single user operation (such as `kubectl apply -f `). This feature is similar to [Request-ID](https://docs.openstack.org/api-guide/compute/faults.html) for -OpenStack. It greatly reduces the investigation cost. +OpenStack. It greatly reduces investigation cost. ## Motivation @@ -65,7 +63,7 @@ Support team in k8s Service Provider We'd like to resolve quickly for end users' problem. -Tracking logs among each Kubernetes component related to specific an user operation is very tough work. It is necessary to match logs with timestamps as hints. If multiple users throw many API requests at the same time, it is very difficult to track logs across each Kubernetes component log. +Tracking logs among each Kubernetes component related to specific an user operation is very tough work. It is necessary to match logs by basically using timestamps and object's name as hints. If multiple users throw many API requests at the same time, it is very difficult to track logs across each Kubernetes component log. It is difficult that support team in k8s Service Provider resolve end user's problem quickly in the above. Therefore, we'd like to add a new identifier which is unique to each user operation. This feature is useful for the following use cases: @@ -111,51 +109,154 @@ components). Request-ID feature consists the three features. - Propagate Request-ID to related objects. - - Export Request-ID to each kubernetes component log. - - Control enabled/disabled the above two features. + - Export Request-ID to each Kubernetes component log. + - Control enabled/disabled exporting Request-ID. -#### Design overview of Propagate Request-ID +#### Design of Propagate Request-ID There is an idea to use `distributed context` of the existing KEP([Distributed Tracing](https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/0034-distributed-tracing-kep.md)). We use `Distributed Tracing` feature for propagation. - - We use Trace-ID as Request-ID - - Trace-ID is contained in Annotation of Kubernetes objects. (This is a part of `Distributed Tracing` feature) - - Trace-ID is removed when tracing is finished. (This is a part of `Distributed Tracing` feature) + - We just use Tracing codes provided by `Distributed Tracing` feature. + - We use Trace-ID as Request-ID (This does not interfere Tracing codes). -We have two notes should be considered when using `Distributed Tracing` feature for Request-ID feature. +The target functions that we add Request-id codes are the following tables. The target functions include the klog call which called via important `kubectl` operations. These operations are listed in [Migration / Graduation Criteria](#migration--graduation-criteria) section. As a result, we can propagate Request-ID regarding important `kubectl` operations. -**NOTE1:** -`Distributed Tracing` feature is only enabled when `--trace` option is added to `kubectl` command. However, in order for the Kubernetes service provider's support team to troubleshoot from customer's Kubernetes log files, we would like to ensure that Request-ID is always added to the log file regardless of `--trace` option. So we need to implement additional parameter which called `--request-id`. This parameter can control(enabled/disabled) Request-ID feature(See [Design overview of Control Request-ID](#design-overview-of-control-request-id)). As a result, the customers who want to use Request-ID can always use this feature, and does not affect other users who does not want to use this feature. Below is a table showing the relationship between the `--trace option` and the `--request-id` parameter. +**kube-apiserver.log** -| | --trace: OFF | --trace: ON | -| ------ | ------ | ------ | -| --request-id = 0 | No Tracing / No export Request-ID | Tracing / No export Request-ID | -| --request-id > 0 | Tracing / Export Request-ID | Tracing / Export Request-ID | - -**NOTE2:** -To trace each Kubernetes function, we need to add codes into related k8s function. So the following implementation is needed. - - Case1. The function which is added Tracing codes by `Distributed Tracing` KEP - - We add Request-ID codes over Tracing codes. - - Case2. The function which is not added Tracing codes by `Distributed Tracing` KEP - - We add both of Tracing codes and Request-id codes. - -#### Detail design of Propagate Request-ID - -TBD. I will write down the following things. - - List up the target function that we add tracing and Request-id codes - - Sample codes of case 1 and 2 of the above `NOTE2` - -#### Design overview of Export Request-ID +| source file | function name | +| ------ | ------ | +| staging/src/k8s.io/apiserver/pkg/endpoints/handlers/get.go | ListResource | +| staging/src/k8s.io/apiserver/pkg/server/httplog/httplog.go | WithLogging | -We add Request-ID information into klog calls. Note that we don't associate Request-ID to all of operations. Our target is important operations such as `kubectl create/delete/etc.`, and our target klog calls is the only klogs which is called via such important operations. Request-ID feature does not change existing klog function, but changes each klog calls and their log format. Currently, there is [Structured logging](https://github.com/serathius/enhancements/blob/structured-logging/keps/sig-instrumentation/20191115-structured-logging.md) KEP, and this KEP also change specific klog calls. We will merge Request-ID feature after Structured logging KEP is merged. +**kube-controller-manager.log** -#### Detail design of Export Request-ID +| source file | function name | +| ------ | ------ | +| pkg/controller/deployment/deployment_controller.go | syncDeployment | +| pkg/controller/deployment/deployment_controller.go | handleErr | +| pkg/controller/garbagecollector/garbagecollector.go | attemptToDeleteItem | +| pkg/controller/replicaset/replica_set.go | manageReplicas | -TBD. I will write down the following things. - - List up the target klogs that we add Request-id - - Sample codes of klog calls that we changes - - Sample logs which is added Request-ID +**kubelet.log** +| source file | function name | +| ------ | ------ | +| pkg/volume/emptydir/empty_dir.go | setupTmpfs | +| pkg/kubelet/kubelet.go | syncLoopIteration | +| pkg/kubelet/kubelet.go | HandlePodAdditions | +| pkg/kubelet/kubelet.go | HandlePodUpdates | +| pkg/kubelet/kubelet.go | HandlePodRemoves | +| pkg/kubelet/kubelet.go | HandlePodReconcile | +| pkg/kubelet/kubelet_pods.go | cleanupOrphanedPodCgroups | | +| pkg/kubelet/kubelet_pods.go | makeMounts | +| pkg/kubelet/kubelet_pods.go | PodResourcesAreReclaimed | +| pkg/kubelet/kubelet_pods.go | podKiller | +| pkg/kubelet/kubelet_pods.go | generateAPIPodStatus | +| pkg/kubelet/kubelet_pods.go | cleanupOrphanedPodCgroups | +| pkg/kubelet/kubelet_volumes.go | cleanupOrphanedPodDirs | +| pkg/kubelet/kuberuntime/kuberuntime_container.go | killContainer | +| pkg/kubelet/kuberuntime/kuberuntime_image.go | PullImage | +| pkg/kubelet/kuberuntime/kuberuntime_manager.go | podSandboxChanged | +| pkg/kubelet/kuberuntime/kuberuntime_manager.go | SyncPod | +| pkg/kubelet/dockershim/docker_service.go | GenerateExpectedCgroupParent | +| pkg/kubelet/dockershim/libdocker/kube_docker_client.go | start | +| pkg/volume/util/operationexecutor/operation_generator.go | GenerateMountVolumeFunc | +| pkg/volume/util/operationexecutor/operation_generator.go | GenerateUnmountVolumeFunc | +| pkg/kubelet/dockershim/network/plugins.go | SetUpPod | +| pkg/kubelet/dockershim/network/plugins.go | TearDownPod | +| pkg/kubelet/pod_container_deletor.go:75 | getContainersToDeleteInPod | +| pkg/kubelet/volumemanager/reconciler/reconciler.go | unmountVolumes | +| pkg/kubelet/volumemanager/reconciler/reconciler.go | mountAttachVolumes | +| pkg/kubelet/volumemanager/reconciler/reconciler.go | unmountDetachDevices | +| pkg/kubelet/remote/remote_runtime.go | ContainerStatus | +| pkg/volume/secret/secret.go | SetUpAt | +| pkg/kubelet/status/status_manager.go | updateStatusInternal | +| pkg/kubelet/status/status_manager.go | syncPod | +| pkg/volume/util/util.go | UnmountViaEmptyDir | +| pkg/kubelet/volumemanager/volume_manager.go | WaitForAttachAndMount | + +**kube-proxy** + +``` +NONE +``` + +**kube-scheduler** + +| source file | function name | +| ------ | ------ | +| pkg/scheduler/framework/plugins/defaultbinder/default_binder.go | Bind | +| pkg/scheduler/eventhandlers.go | addPodToSchedulingQueue | +| pkg/scheduler/eventhandlers.go | updatePodInSchedulingQueue | +| pkg/scheduler/eventhandlers.go | deletePodFromSchedulingQueue | +| pkg/scheduler/eventhandlers.go | addPodToCache | +| pkg/scheduler/eventhandlers.go | updatePodInCache | +| pkg/scheduler/eventhandlers.go | deletePodFromCache | +| pkg/scheduler/scheduler.go | scheduleOne | + +**NOTE1:** + +We need context or object or http request to get Trace-ID(This is specification of `Distributed Tracing` feature). So, we may exclude function which does not have such resource(context, object, http request) from our target. + +**NOTE2:** + +We use Tracing codes by `Distributed Tracing` KEP for propagation. So the following implementation is required. + - Case1. The function which contains Tracing codes by `Distributed Tracing` KEP + - We just add Request-ID codes (Request-ID codes don't interfere Tracing codes). + - Case2. The function which does not contain Tracing codes by `Distributed Tracing` KEP + - We add both of Tracing codes and Request-id codes (Request-ID codes don't interfere Tracing codes). + +#### Design of Export Request-ID + +We get Trace-ID by using `Distributed Tracing` feature in each Kubernetes function that we add Tracing codes. And then, we add Trace-ID information into **each klog call** as Request-ID. Note that we don't associate Request-ID to all of klog calls. Our target is important operations such as `kubectl create/delete/etc.`, and our target klog calls are the only klogs which is called via such important operations. Request-ID feature does not change existing klog function/method, but changes each klog calls and their log format. Currently, there is [Structured logging](https://github.com/serathius/enhancements/blob/structured-logging/keps/sig-instrumentation/20191115-structured-logging.md) KEP, and this KEP also change specific klog calls. We will merge Request-ID feature after Structured logging KEP is merged. + +**Examples (We quote some parts from `Structured logging` KEP)** + +Source Code + +Original +```go +klog.Infof("Updated pod %s status to ready", pod.name) +``` + +Structured format +```go +klog.InfoS("Pod status updated", "pod", pod, "status", "ready") +``` +Request-ID with structured format +```go +klog.InfoS("Request-ID", trace-id, "Pod status updated", "pod", pod, "status", "ready") +``` + +Expected Log + +Structured format +```json +{ + "ts": 1580306777.04728, + "v": 4, + "msg": "Pod status updated", + "pod":{ + "name": "nginx-1", + "namespace": "default" + }, + "status": "ready" +} + +Request-ID with structured format +```json +{ + "ts": 1580306777.04728, + "request-id": 5acf2a4d258157e06402fb734186b684 + "v": 4, + "msg": "Pod status updated", + "pod":{ + "name": "nginx-1", + "namespace": "default" + }, + "status": "ready" +} +``` #### Design overview of Control Request-ID We should control Request-ID feature to avoid an impact to existing users who are retrieving logs and analyzing with existing log format. So we introduce `--request-id` parameter which enables/disables Request-ID feature. We also manage the range of operations which are added Request-ID. The effect of each parameter of `--request-id` is as follows. @@ -166,18 +267,18 @@ We should control Request-ID feature to avoid an impact to existing users who ar | --request-id=1 | Request-ID feature is enabled, and Request-ID is added to klogs related to the `Alpha` target operations | | --request-id=2 | Request-ID feature is enabled, and Request-ID is added to klogs related to the `Alpha and Beta` target operations | -Alpha and Beta target operations are described in Migration / Graduation Criteria section. +Alpha and Beta target operations are described in [Migration / Graduation Criteria](#migration--graduation-criteria) section. #### Detail design of Control Request-ID TBD. I will write down the following things. - How to realize `--request-id` parameter in each Kubernetes component. - - Sample codes which is used with `--request-id` and `--trace` option. + - Sample codes which is used with `--request-id` option. ### Test Plan - test against the combination of following patterns. - - --trace(OFF/ON) / --request-id(0/1/2) + - --request-id(0/1/2) ### Migration / Graduation Criteria From afa7106ecf62f0dfbe90f8d54f578fb112a7b7dd Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Tue, 17 Mar 2020 17:45:23 +0900 Subject: [PATCH 8/9] Update 20191101-add-request-id-to-k8s-logs.md --- .../20191101-add-request-id-to-k8s-logs.md | 34 +++++++++++++------ 1 file changed, 24 insertions(+), 10 deletions(-) diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md index c51aa43989c..b751af71643 100644 --- a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -1,5 +1,5 @@ --- -title: Add Request-ID to each k8s component log +title: Add Request-ID to each Kubernetes component log authors: - "@hase1128" - "@sshukun" @@ -13,7 +13,7 @@ approvers: - TBD editor: TBD creation-date: 2019-11-01 -last-updated: 2020-02-26 +last-updated: 2020-03-17 status: provisional --- @@ -50,7 +50,7 @@ status: provisional This KEP proposes a new unique logging meta-data into all Kubernetes logs. It makes us more easy to identify specific logs related to a single user operation (such as `kubectl apply -f `). This feature is similar to -[Request-ID](https://docs.openstack.org/api-guide/compute/faults.html) for +[Global request ID](https://docs.openstack.org/api-guide/compute/faults.html) for OpenStack. It greatly reduces investigation cost. ## Motivation @@ -72,7 +72,13 @@ It is difficult that support team in k8s Service Provider resolve end user's pro In case of insecure or unauthorized operation happens, it is necessary to identify what effect that operation caused. This proposed feature helps identify what happened at each component or server by each insecure / unauthorized API -request. We can collect these logs as an evidence. +request. We can collect these logs as an evidence. This is similar to the +[Auditing](https://kubernetes.io/docs/tasks/debug-application-cluster/audit/), +except for the following points. + + - Audit only collects information about http request sending and receiving in kube-apiserver, so it can't track internal work of each component. + - Audit logs can't be associated to logs related to user operation (kubectl operation), because auditID is different for each http request. + #### Case 2 @@ -85,7 +91,8 @@ request. ### Goals - Adding a Request-ID into each K8s component log. - - The Request-ID is unique to an operation. + - The Request-ID is unique to a kubectl operation. + - (One kubectl operation by user causes multiple API requests and klog calls. Request-ID has same value in these klog calls.) - Control enabled/disabled Request-ID feature(Request-ID feature is disabled on default to avoid an impact for existing user). ### Non-Goals @@ -119,7 +126,14 @@ There is an idea to use `distributed context` of the existing KEP([Distributed T - We just use Tracing codes provided by `Distributed Tracing` feature. - We use Trace-ID as Request-ID (This does not interfere Tracing codes). -The target functions that we add Request-id codes are the following tables. The target functions include the klog call which called via important `kubectl` operations. These operations are listed in [Migration / Graduation Criteria](#migration--graduation-criteria) section. As a result, we can propagate Request-ID regarding important `kubectl` operations. +The target functions that we add Request-id codes are the following tables. The target functions include the klog call which called via important `kubectl` operations. We prioritize `kubectl` operations and target resources by the following criteria. + + - Operations that are frequently used by user + - Resources that are user's application and asset + +Operations against Pod(Deployment) are the most frequently used by user. PV / PVC stores user assets. In addition to this, network is relatively troublesome for orchestration. These operations are listed in [Migration / Graduation Criteria](#migration--graduation-criteria) section. As a result, we can propagate Request-ID regarding important `kubectl` operations. + +Here is the target functions that we add Request-id codes in **Alpha implementation**. **kube-apiserver.log** @@ -142,7 +156,6 @@ The target functions that we add Request-id codes are the following tables. The | source file | function name | | ------ | ------ | | pkg/volume/emptydir/empty_dir.go | setupTmpfs | -| pkg/kubelet/kubelet.go | syncLoopIteration | | pkg/kubelet/kubelet.go | HandlePodAdditions | | pkg/kubelet/kubelet.go | HandlePodUpdates | | pkg/kubelet/kubelet.go | HandlePodRemoves | @@ -157,7 +170,8 @@ The target functions that we add Request-id codes are the following tables. The | pkg/kubelet/kuberuntime/kuberuntime_container.go | killContainer | | pkg/kubelet/kuberuntime/kuberuntime_image.go | PullImage | | pkg/kubelet/kuberuntime/kuberuntime_manager.go | podSandboxChanged | -| pkg/kubelet/kuberuntime/kuberuntime_manager.go | SyncPod | +| pkg/kubelet/kuberuntime/kuberuntime_manager.go | killPodWithSyncResult | +| pkg/kubelet/kuberuntime/kuberuntime_sandbox.go | createPodSandbox | | pkg/kubelet/dockershim/docker_service.go | GenerateExpectedCgroupParent | | pkg/kubelet/dockershim/libdocker/kube_docker_client.go | start | | pkg/volume/util/operationexecutor/operation_generator.go | GenerateMountVolumeFunc | @@ -225,7 +239,7 @@ klog.InfoS("Pod status updated", "pod", pod, "status", "ready") ``` Request-ID with structured format ```go -klog.InfoS("Request-ID", trace-id, "Pod status updated", "pod", pod, "status", "ready") +klog.InfoS("Pod status updated", "pod", pod, "status", "ready", "Request-ID", trace-id) ``` Expected Log @@ -247,7 +261,6 @@ Request-ID with structured format ```json { "ts": 1580306777.04728, - "request-id": 5acf2a4d258157e06402fb734186b684 "v": 4, "msg": "Pod status updated", "pod":{ @@ -255,6 +268,7 @@ Request-ID with structured format "namespace": "default" }, "status": "ready" + "request-id": 5acf2a4d258157e06402fb734186b684 } ``` #### Design overview of Control Request-ID From c31aca0d41075b1b09787a4530174a293234d662 Mon Sep 17 00:00:00 2001 From: Jin Hase Date: Mon, 23 Mar 2020 14:14:35 +0900 Subject: [PATCH 9/9] Update 20191101-add-request-id-to-k8s-logs.md --- .../20191101-add-request-id-to-k8s-logs.md | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md index b751af71643..9c8d99e52a2 100644 --- a/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md +++ b/keps/sig-auth/20191101-add-request-id-to-k8s-logs.md @@ -47,7 +47,7 @@ status: provisional ## Summary -This KEP proposes a new unique logging meta-data into all Kubernetes logs. It makes us +This KEP proposes a new unique logging meta-data into key Kubernetes logs(Details are described in [Design of Propagate Request-ID section](#design-of-propagate-request-id)). It makes us more easy to identify specific logs related to a single user operation (such as `kubectl apply -f `). This feature is similar to [Global request ID](https://docs.openstack.org/api-guide/compute/faults.html) for @@ -88,19 +88,27 @@ and the API processing just before OOM killer. If the cause is that some unknown pod creations, it is helpful to detect the root API request and who called this request. +#### Summary of Cases + + - Given a component log(such as error log), find the API request that caused this (error) log. + - Given an API Request(such as suspicious API request), find the resulting component logs. + ### Goals - Adding a Request-ID into each K8s component log. - The Request-ID is unique to a kubectl operation. - (One kubectl operation by user causes multiple API requests and klog calls. Request-ID has same value in these klog calls.) - - Control enabled/disabled Request-ID feature(Request-ID feature is disabled on default to avoid an impact for existing user). + - Control enabled/disabled Request-ID feature + - Request-ID feature is disabled on default to avoid an impact for existing user. The expected impacts are as follows. + - Increasing log size + - Changing log format. (If there is a user who collects various log texts based on specific rules by 3rd party tools etc., it may not be possible to extract it with existing rules.) ### Non-Goals - To centrally manage the logs of each Kubernetes component with Request-ID (This can be realized with existing OSS such as Kibana, so no need to implement into Kubernetes components). - - We don't associate Request-ID to all of operations(Our target is important operations such as `kubectl create/delete/etc.`). + - We don't associate Request-ID to all of operations(Our target is important operations such as `kubectl create/delete/etc.` which cause changes in the system, and read operations such as `kubectl get/describe/etc.` are not our target). ## Proposal