gRPC load balancing in a k8s cluster: Experiment
The client exposes a single endpoint:

`/experiment/client?type={client_type}&n={concurrent_reqs}`

- `client_type`: the gRPC client implementation used to send requests to the greeter server (see the Go sketch after these definitions)
  - `Client`: creates a new channel for every request
  - `ClientWithReuseConn`: creates one channel up front and reuses it for all subsequent requests
  - `ClientWithKubeResolver`: uses kubeResolver as a gRPC name resolver; it watches the greeter Service and dynamically resolves the pod IPs behind it
- `concurrent_reqs`: the number of concurrent requests sent to greeter-server in the experiment
- `minikube`: a tool that runs a local Kubernetes cluster for the experiment
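To make the three variants concrete, here is a hedged sketch of how each client might build its connection. It assumes grpc-go, the hello-world Greeter stubs, and the third-party `sercand/kuberesolver` package; the repo's actual kubeResolver implementation, service name, and port may differ:

```go
package client

import (
	"context"

	"github.com/sercand/kuberesolver/v3"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	pb "google.golang.org/grpc/examples/helloworld/helloworld"
)

// Client: open a fresh channel per request and tear it down afterwards.
func callWithNewChannel(ctx context.Context, target string) (*pb.HelloReply, error) {
	conn, err := grpc.Dial(target, grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	return pb.NewGreeterClient(conn).SayHello(ctx, &pb.HelloRequest{Name: "world"})
}

// ClientWithReuseConn: dial once, then share this channel across requests.
func dialOnce(target string) (*grpc.ClientConn, error) {
	return grpc.Dial(target, grpc.WithTransportCredentials(insecure.NewCredentials()))
}

// ClientWithKubeResolver: register a kubernetes-aware resolver and let
// round_robin spread requests across the resolved pod IPs.
func dialWithKubeResolver() (*grpc.ClientConn, error) {
	kuberesolver.RegisterInCluster() // registers the "kubernetes" scheme
	return grpc.Dial(
		"kubernetes:///greeter-server:50051", // hypothetical service name/port
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig":[{"round_robin":{}}]}`),
	)
}
```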
For kubeResolver, the service account used by the greeter-client pod needs GET and WATCH access to the `endpoints` resource.
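A minimal Role/RoleBinding granting that access could look like the sketch below; names and namespace are illustrative, and the repo's actual `k8s/client-rbac.yaml` may differ:

```yaml
# Hypothetical sketch; see k8s/client-rbac.yaml for the real manifest.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: endpoints-reader
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: endpoints-reader-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: default
    namespace: default
roleRef:
  kind: Role
  name: endpoints-reader
  apiGroup: rbac.authorization.k8s.io
```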
How to start the greeter-client & greeter-server

1. Start the cluster: `minikube start`
2. (Optional) Enable local image builds: run `eval $(minikube -p minikube docker-env)` to point your terminal's docker CLI at the Docker Engine inside minikube (see the minikube handbook).
3. Build the greeter-client & greeter-server images: `make build-image`
4. Deploy the server: `kubectl apply -f k8s/server.yaml`
5. Deploy the client: `kubectl apply -f k8s/client.yaml`
6. For kubeResolver, also grant the default service account RBAC access to endpoints: `kubectl apply -f k8s/client-rbac.yaml`
7. Open a tunnel to the greeter-client NodePort: in another terminal, run `minikube service greeter-client`
8. Run the experiment: `curl "localhost:{port_from_step_7}/experiment/client?type={client_type}&n={concurrent_reqs}"`
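For example, to fire 1000 concurrent requests through the channel-reusing client (keeping the port placeholder from step 7):

`curl "localhost:{port_from_step_7}/experiment/client?type=ClientWithReuseConn&n=1000"`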
Client vs. ClientWithReuseConn

| n (concurrent requests) | Client | ClientWithReuseConn |
| --- | --- | --- |
| 1000 | 687.692001ms | 522.581208ms |
| 2000 | 1.036497709s | 547.844917ms |
| 3000 | Crash | 558.110917ms |
| 10000 | Crash | 735.631459ms |
| count_per_ip (n = 2000) | `{"10.244.0.103":684,"10.244.0.105":662,"10.244.0.113":654}` | `{"10.244.0.113":2000}` |
Comment

- Reconnecting for every request is slow: once we subtract the server's fixed 500ms sleep, Client takes roughly 8.5x-10x as long as ClientWithReuseConn.
- At 3000 concurrent requests, Client crashes under the resource surge of opening that many connections, while ClientWithReuseConn absorbs the same load with little overhead.
- Client's load-balancing behavior is more complicated than round-robin (each new channel is assigned a backend by the Service at connection time), but it is still close to evenly distributed in our workload.
- ClientWithReuseConn has no load balancing at all: every request rides the single channel, so one pod receives all the traffic.
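For context on how `count_per_ip` could be produced, here is a hypothetical sketch of the request fan-out; it assumes each greeter reply carries the serving pod's IP in its message, and `runExperiment` is an illustrative name, not necessarily the repo's:

```go
package experiment

import (
	"context"
	"sync"

	pb "google.golang.org/grpc/examples/helloworld/helloworld"
)

// runExperiment fires n concurrent requests and tallies replies per server IP.
func runExperiment(ctx context.Context, client pb.GreeterClient, n int) map[string]int {
	var (
		mu     sync.Mutex
		counts = make(map[string]int)
		wg     sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			reply, err := client.SayHello(ctx, &pb.HelloRequest{Name: "load-test"})
			if err != nil {
				return // failed calls simply drop out of the tally
			}
			mu.Lock()
			counts[reply.GetMessage()]++ // Message assumed to hold the pod IP
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counts
}
```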
ClientWithReuseConn vs. ClientWithKubeResolver (replicas = 3)

| n (concurrent requests) | ClientWithReuseConn | ClientWithKubeResolver |
| --- | --- | --- |
| 1000 | 522.581208ms | 513.53575ms |
| 2000 | 547.844917ms | 548.33675ms |
| 3000 | 558.110917ms | 533.248667ms |
| 10000 | 735.631459ms | 699.490834ms |
| count_per_ip (n = 10000) | `{"10.244.0.113":10000}` | `{"10.244.0.103":3334,"10.244.0.105":3333,"10.244.0.113":3333}` |
Comment

- HTTP/2 and Protobuf are efficient: in our workload a single multiplexed connection performs as well as client-side load balancing, and no new connection needs to be opened even at 10000 concurrent requests.
- KubeResolver load-balances successfully using round-robin, but round-robin does not take server state (e.g. per-pod load) into account.
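The scaling behavior in the next section follows from how such a resolver works: it keeps a WATCH open on the Service's Endpoints and pushes every change to gRPC, which then re-balances across the new address set. A simplified, hypothetical sketch using client-go (the `resolver.Builder` registration, error handling, and reconnection are omitted):

```go
package kuberes

import (
	"context"
	"fmt"

	"google.golang.org/grpc/resolver"
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// kubeResolver watches a Service's Endpoints and feeds the pod
// addresses to gRPC, so round_robin always sees the live set.
type kubeResolver struct {
	cc     resolver.ClientConn
	cancel context.CancelFunc
}

func (r *kubeResolver) ResolveNow(resolver.ResolveNowOptions) {}
func (r *kubeResolver) Close()                                { r.cancel() }

// watch streams Endpoints events for one Service and pushes each
// fresh address list into the gRPC channel.
func (r *kubeResolver) watch(ctx context.Context, cs kubernetes.Interface, ns, svc string) {
	w, err := cs.CoreV1().Endpoints(ns).Watch(ctx, metav1.ListOptions{
		FieldSelector: "metadata.name=" + svc, // needs GET/WATCH on endpoints
	})
	if err != nil {
		return
	}
	for ev := range w.ResultChan() {
		ep, ok := ev.Object.(*v1.Endpoints)
		if !ok {
			continue
		}
		var addrs []resolver.Address
		for _, subset := range ep.Subsets {
			for _, addr := range subset.Addresses {
				for _, port := range subset.Ports {
					addrs = append(addrs, resolver.Address{
						Addr: fmt.Sprintf("%s:%d", addr.IP, port.Port),
					})
				}
			}
		}
		r.cc.UpdateState(resolver.State{Addresses: addrs}) // gRPC re-balances
	}
}
```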
ClientWithKubeResolver with scaling up or down
Scale up with `kubectl scale deployments/greeter-server --replicas=10`. During the rollout, repeated runs with n = 2000 see different endpoint sets, depending on how many of the 10 pods are ready and already resolved at that moment:

```
"count_per_ip":{
  "10.244.0.103":500,
  "10.244.0.105":500,
  "10.244.0.113":500,
  "10.244.0.122":500
}
```

```
"count_per_ip":{
  "10.244.0.103":200,
  "10.244.0.105":200,
  "10.244.0.113":200,
  "10.244.0.121":200,
  "10.244.0.122":200,
  "10.244.0.123":200,
  "10.244.0.124":200,
  "10.244.0.125":200,
  "10.244.0.126":200,
  "10.244.0.127":200
}
```

```
"count_per_ip":{
  "10.244.0.103":666,
  "10.244.0.105":667,
  "10.244.0.113":667
}
```
After we kill a pod with `kubectl delete pod {pod_name}`, one of the IPs changes from "10.244.0.105" to "10.244.0.128" as the Deployment replaces the pod:

```
"count_per_ip":{
  "10.244.0.103":667,
  "10.244.0.113":666,
  "10.244.0.128":667
}
```
After scaling back down with `kubectl scale deployments/greeter-server --replicas=2`, traffic is split between the two remaining pods:

```
"count_per_ip":{
  "10.244.0.103":1000,
  "10.244.0.113":1000
}
```