
Commit bc39a00

Add kubernetes-based development documentation (#144)
* feat: add kubernetes-based development recommendations

File tree: 2 files changed (+138, -1 lines)

README.md

Lines changed: 1 addition & 1 deletion

@@ -65,7 +65,7 @@ with only a subset of sections having recommendations):
65 65   - Development
66 66
67 67   - [Typical Development Workflows](./docs/development/dev-flows.md)
68    -  - Development Environment
   68 +  - [Kubernetes-based Development Environment](./docs/kubernetes-dev-environment.md)
69 69   - [Choosing and vetting dependencies](./docs/development/dependencies.md)
70 70   - [Building good containers](./docs/development/building-good-containers.md)
71 71   - [Static Assets](./docs/functional-components/static-assets.md)
docs/kubernetes-dev-environment.md

Lines changed: 137 additions & 0 deletions

@@ -0,0 +1,137 @@

# Kubernetes-based Development Environment Recommendations

Development and local-testing flows should ideally be the same for all the developers of a project, but it's
common for developers on a team to have heterogeneous development platforms (e.g. Linux, Intel-based macOS, M1-based macOS, Windows).
Using Kubernetes as a homogenized execution/configuration layer can be key to ensuring that testing/deployment
use the same configuration settings for all developers. Building a Docker image is the same "docker build ..." command
regardless of development platform, and Kubernetes configuration settings, ideally built through yaml configuration files and
helm charts, are platform-independent.

In the past the most common approach for building/running containers on Windows and macOS
was Docker Desktop. Now that Docker Desktop is no longer free, development teams are exploring other
options, which include:
* VirtualBox with a Linux VM (the disadvantage is that this requires developers to become more
familiar with Linux). The Docker client is still free and it can connect to a remote virtual
machine. However, this has not worked well with recent versions of macOS Ventura.
* podman and Podman Desktop, which provide similar functionality to Docker Desktop but are
open source and free. The podman virtual machine does have a couple of stability issues (it doesn't resync its
clock after a sleep, and sometimes the minikube client on it becomes unresponsive), but usually it can run for multiple
days before that happens. Often it is fastest to just delete/rebuild the podman virtual machine, which can take 5-10 minutes;
a sketch of that rebuild flow is shown just below.

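As a rough sketch, assuming the default podman machine and a minikube cluster using the podman driver (resource
sizes are illustrative), the delete/rebuild cycle looks something like this:
```
# Tear down the existing podman virtual machine (the default machine name is assumed).
podman machine stop
podman machine rm --force
# Recreate it with enough resources for a small local Kubernetes footprint.
podman machine init --cpus 4 --memory 8192 --disk-size 60
podman machine start
# Recreate the local minikube cluster on top of podman.
minikube delete
minikube start --driver=podman
```
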
# Sharing development-zone configurations
When developing microservices within a multi-microservice app, it's ideal for the microservice architecture to not require all the
microservices to be deployed locally, just the microservices and databases/queues a specific microservice depends upon. When it comes to
supporting a local Kubernetes running your development/tests, having a microservice only need to rely upon 3-6 pods vs. 15+ is key to
fitting within an 8GB memory footprint for your Kubernetes deployment. Another powerful trick is to identify those dependent services that
provide stateless support, in that calls to them behave the same regardless of where they are deployed, such as identity-lookup/token-exchange
APIs, and utilize the deployments in your shared development-zone cluster (what is deployed to when your Git PRs are merged) for your
local-microservice needs. If your microservice APIs have to support multiple tenants/organizations, it is recommended that your
development-zone cluster and local cluster share definitions of test users and organizations, but DO NOT share them with other CI pipeline
clusters. Specifically, test org display-names can still be the same in other clusters, but the GUIDs and credentials for them should be
different.

We've had good success with defining our configurations for microservices in a hierarchical set of yaml files where the overrides
work in ascending order:
1. cross-zone app-generic settings,
2. cross-zone app-specific settings,
3. zone-specific app-generic settings,
4. zone-specific app-specific settings,
5. **local development settings (if local)**

This structure helps avoid redundancy of configuration settings and also allows our local development yaml to just override a few settings,
like external hostnames and references to local databases/queues, but otherwise inherit our 'development' zone settings. Typically, the
hierarchy above is established by the order of yaml includes into the helm template (or helm install) command. Example command:
```
helm template ../helm-charts/app1 --name-template app1-dev \
  --output-dir ../kubectl-specs/dev --namespace dev \
  -f ../k8s-yamls/common.yaml \
  -f ../k8s-yamls/app1.common.yaml \
  -f ../k8s-yamls/dev/common.yaml \
  -f ../k8s-yamls/dev/app1.yaml \
  -f ../k8s-yamls/local/common.yaml \
  -f ../k8s-yamls/local/app1.yaml
```

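As an illustration of how little the local layer typically needs to carry, a hypothetical ../k8s-yamls/local/app1.yaml
(the key names and hostnames below are made up) might only override the external hostname, point a stateless dependency such as
the identity/token-exchange service at the shared development-zone cluster, and reference locally deployed databases/queues:
```
# Hypothetical contents for ../k8s-yamls/local/app1.yaml -- key names and hosts are illustrative only.
cat > ../k8s-yamls/local/app1.yaml <<'EOF'
externalHostname: app1.localtest.me
identityService:
  # reuse the stateless deployment in the shared development-zone cluster
  url: https://identity.dev.example.com
database:
  # locally deployed database instead of a shared one
  host: app1-postgresql.dev.svc.cluster.local
EOF
```
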
# Replicating non-stateless DBs/queues locally
If possible, avoid using shared databases and use local k8s deployments of the databases/queues to avoid test case collision between
developers. As long as you are careful to make sure your development zone does not contain customer data, then making backups of your
development databases and loading those backups into local development copies can help speed up your testcase creation process and
leverage "end to end" tests that may use such data. It's common to use public helm charts (or make a copy of them) to deploy these
databases in your local or CICD environments; a sketch of installing a couple of them follows the list below. Example charts:
* MinIO to locally replicate S3 APIs and storage: https://github.com/minio/minio/tree/master/helm/minio
* A huge selection of others can be found from Bitnami's collection at https://github.com/bitnami/charts/tree/main/bitnami

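For example, a rough sketch of pulling such charts into a local or CICD cluster (release and namespace names are
illustrative, and the chart repository URLs should be checked against each project's own documentation):
```
# MinIO chart for S3-compatible local storage
helm repo add minio https://charts.min.io/
helm install local-minio minio/minio --namespace dev

# Bitnami charts for common databases/queues, e.g. PostgreSQL
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install local-postgres bitnami/postgresql --namespace dev
```
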
# Doing interactive nodejs debugging in containers
The Kubernetes pod's spec.containers.args field can be configured in a helm chart to conditionally override the primary command the
container uses when starting up. This can be used to override the regular node (my-app-entry).js command to also pass in the
--inspect-brk flag, making the node (or nodemon) process wait for a remote debugger connection after starting up. Remember, the same
yaml-config value used to turn on this debugging mode should also be used to disable liveness/readiness checks in the pod, or else the pod
will get restarted by such checks. One can then use the "kubectl port-forward" command to establish a tunnel to the remote-debug port the
nodejs process is listening to. IDEs like Visual Studio Code can then connect to that tunnel-port to reach the backend nodejs process you
want to interactively debug.

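A minimal sketch of the flow, assuming the debug-mode args add --inspect-brk=0.0.0.0:9229 (9229 is Node's default inspector
port) and that the pod/namespace names below are placeholders:
```
# Override sketch: in debug mode the container args become something like
#   node --inspect-brk=0.0.0.0:9229 my-app-entry.js
# and the liveness/readiness probes are disabled by the same yaml-config value.

# Tunnel the inspector port from the pod to the developer machine.
kubectl port-forward -n dev pod/app1-6d8f9c7b5d-x2x2x 9229:9229

# In Visual Studio Code, attach with a launch configuration of type "node" / request "attach"
# pointing at localhost:9229.
```
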
# Using nodemon in local containers
Similar to changing the startup command to set the --inspect-brk flag, local containers can also replace node with 'nodemon'. Nodemon is
like node, but watches whether any of the source .js files change and restarts a second or two after changes have stabilized. This can
allow developers to modify the js code running on a container without having to rebuild and redeploy the container. Docker build caching
helps to accelerate the re-build process, but just copying over a few changed files can save 30 seconds or so in the iteration process.
Nodemon should NEVER be used in production, but it's fine for it to be installed yet unused in a production image.

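As a sketch, the conditional args override for a local container might swap the entrypoint to nodemon (the entry file and
watch path here are placeholders):
```
# Instead of:   node my-app-entry.js
# a local container's args might run:
npx nodemon --watch . my-app-entry.js
# nodemon forwards node flags, so --inspect can be combined with it for interactive debugging.
```
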
Here is a sample bash shell script to copy over updated .ts files and re-run 'tsc' (via npm run build) in the deployed container, which often only takes
about 5 seconds to run:
```
updatePod(){
  POD_NAME="$1"
  echo "*****UPDATING POD: $POD_NAME******"
  # determine when the last in-container build happened from the timestamp of the main built file
  LAST_BUILD_TIMESTAMP=`kubectl exec -n $NAME_SPACE $POD_NAME -- ls --full-time $DOCKER_WORKDIR/$MAIN_BUILT_FILE | awk '{ print \$6, \$7 }'`
  echo "LAST_BUILD_TIMESTAMP = $LAST_BUILD_TIMESTAMP"
  LAST_BUILD_IN_SECONDS=`kubectl exec -n $NAME_SPACE $POD_NAME -- date +%s -d"$LAST_BUILD_TIMESTAMP"`
  echo "LAST_BUILD_IN_SECONDS = $LAST_BUILD_IN_SECONDS"
  NOW_IN_SECONDS=`kubectl exec -n $NAME_SPACE $POD_NAME -- date +%s`
  echo "NOW_IN_SECONDS = $NOW_IN_SECONDS"
  SECONDS_SINCE_LAST_BUILD=$(( $NOW_IN_SECONDS - $LAST_BUILD_IN_SECONDS ))
  echo "Last build was $SECONDS_SINCE_LAST_BUILD seconds ago"

  # now iterate through source dirs and copy in any files changed since the last build
  export UPDATES_MADE=false
  declare -a SOURCE_DIRS_ARRAY=($SOURCE_DIRS)
  for SOURCE_DIR in "${SOURCE_DIRS_ARRAY[@]}"
  do
    UPDATED_FILES=`find ${SOURCE_DIR} -type f -newerct "${SECONDS_SINCE_LAST_BUILD} seconds ago"`
    echo "UPDATED_FILES = $UPDATED_FILES"
    if [[ "$UPDATED_FILES" ]]; then
      export UPDATES_MADE=true
      declare -a UPDATED_FILES_ARRAY=($UPDATED_FILES)
      for UPDATED_FILE in "${UPDATED_FILES_ARRAY[@]}"
      do
        echo "copying in $UPDATED_FILE into ${POD_NAME}..."
        kubectl cp $UPDATED_FILE $NAME_SPACE/$POD_NAME:$DOCKER_WORKDIR/$UPDATED_FILE
      done
    fi
  done

  # only rebuild inside the container if something was actually copied in
  if [[ "$UPDATES_MADE" == "true" ]]; then
    kubectl exec -n $NAME_SPACE $POD_NAME -- npm run build
    kubectl exec -n $NAME_SPACE $POD_NAME -- touch $MAIN_BUILT_FILE
    echo "npm run build - finished inside $POD_NAME"
  else
    echo "No updates found, npm run build not executed in $POD_NAME"
  fi
}
```

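The variables used by the function are expected to be set by the caller; a hypothetical invocation might look like
this (all values are illustrative and must match your image and helm deployment):
```
export NAME_SPACE=dev
export DOCKER_WORKDIR=/usr/src/app
export MAIN_BUILT_FILE=dist/index.js
export SOURCE_DIRS="src lib"

updatePod "$(kubectl get pods -n $NAME_SPACE -l app=app1 -o jsonpath='{.items[0].metadata.name}')"
```
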
# Extracting test results from Kubernetes pods
Just like debugger flags and nodemon can be conditionally turned on in a pod deployment, the same goes for starting a unit +
integration test run. When doing this, one should run nyc around the execution of the unit + integration tests to capture the code
coverage from the combination of the two types of tests. However, one can't have the pod test-script immediately finish once the tests
finish, or else the test results will disappear when the test pod shuts down. Normally, local developer testing can then just look at
the logs of the shut-down pod to see if it was successful or not, but when debugging failures in a CICD pipeline like Travis/Jenkins, it's
very useful to run the same logic locally on one's development system, and those flows will typically need the full test result files.

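For example, the test entrypoint in the pod might wrap the combined run with nyc so that one coverage report covers both
test types (the npm script name is a placeholder):
```
# run unit + integration tests under nyc so a single coverage report covers both
npx nyc --reporter=lcov --reporter=text npm run test:unit-and-integration
```
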
There are two general solutions for this:
* For tests run in a non-production environment like minikube, a Kubernetes Persistent Volume (PV) can be created and
mounted in the test container for the post-test script to copy its result files into. The higher level test-logic flow on one's
development system (or CICD engine like Travis/Jenkins) can then copy out the test result and coverage files.

* Utilize a state-handshake between the high level test scripting (e.g. Travis/Jenkins/laptop shell) and the process in the
test pod. This can be done with a file in the pod, where the test script in the pod writes the process return-code from the tests to that file and
loops (i.e. stays alive) until the file is deleted. Example post-test script logic in the test-pod:
```
; TEST_RESULT=$?; echo $TEST_RESULT > /tmp/test.report.generated.semaphore; echo 'Waiting for test reports to be fetched...'; (while [ -f /tmp/test.report.generated.semaphore ]; do sleep 2; done); echo 'Test reports pulled!'; exit $TEST_RESULT
```
The parent layer can then use "kubectl exec ... -- ls /tmp/test.report.generated.semaphore" to check for the presence of the semaphore file
that tells when the tests have finished. It can then copy out the results with "kubectl cp ...", and delete the semaphore file with "kubectl exec
... -- rm /tmp/test.report.generated.semaphore".
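
A rough sketch of that parent-layer logic (the namespace/pod names and the report path inside the pod are placeholders):
```
# Wait for the in-pod test script to signal completion via the semaphore file.
until kubectl exec -n dev test-pod -- ls /tmp/test.report.generated.semaphore >/dev/null 2>&1; do
  sleep 5
done

# Pull out the result/coverage files, then delete the semaphore so the test pod can exit.
kubectl cp dev/test-pod:/usr/src/app/coverage ./coverage
kubectl exec -n dev test-pod -- rm /tmp/test.report.generated.semaphore
```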
