summary: "Meet the mentee: Yash Agarwal worked with the project maintainers on adding chaos testing to CloudNativePG, as part of the LFX mentorship program."
17
+
summary: "Meet the mentee: Yash Agarwal worked with the project maintainers on adding
18
+
chaos testing to CloudNativePG, as part of the LFX mentorship program."
18
19
---
19
20
20
-
In the summer we wrote about how CloudNativePG was back for the September-October-November LFX term with [several projects for mentoring](https://cloudnative-pg.io/blog/2025-term3-lfx-cncf-mentorship/). One of them was around Chaos Testing.
21
+
In the summer we wrote about how CloudNativePG was back for the September-
22
+
October-November LFX term with [several projects for mentoring](https://cloudnative-pg.io/blog/2025-term3-lfx-cncf-mentorship/). One of them was
23
+
around Chaos Testing.
21
24
22
-
Yash Agarwal worked with mentors and CloudNativePG maintainers Gabriele Bartolini, Marco Nenciarini, Francesco Canovai, and Jonathan Gonzalez, to enhance the project's test coverage. Introducing LitmusChaos, a comprehensive chaos testing framework, the team designed automated chaos experiments for common failure scenarios, integrated them into CI/CD workflows, and collected observability metrics like failover time and data consistency. I had a chat with Yash about his work, and about how he got into Tech in the first place.
25
+
Yash Agarwal worked with mentors and CloudNativePG maintainers Gabriele Bartolini,
26
+
Marco Nenciarini, Francesco Canovai, and Jonathan Gonzalez, to enhance the
27
+
project's test coverage. Introducing LitmusChaos, a comprehensive chaos testing
28
+
framework, the team designed automated chaos experiments for common failure
29
+
scenarios, integrated them into CI/CD workflows, and collected observability
30
+
metrics like failover time and data consistency. I had a chat with Yash about
31
+
his work, and about how he got into Tech in the first place.
23
32
24
33
## Start at the beginning
25
34
26
-
Yash's venture into programming started when he got introduced to Python in 11th grade. He was always fascinated by technology, and whenever he and his cousin Amit (now a software developer as well) met, he asked him a lot of questions "about everything".
35
+
Yash's venture into programming started when he got introduced to Python in 11th
36
+
grade. He was always fascinated by technology, and got further inspired to pursue
37
+
a career as a programmer by his cousin Amit, a software developer.
27
38
28
-
Today Yash is a full stack developer intern at Seeqlo, where he, among other things, focuses on streamlining cloud operations and optimizing performance. Based in Bengaluru, India, Yash is a member of Point Blank, a student-run tech community dedicated to learning together.
39
+
Today Yash is a full stack developer intern at Seeqlo, where he, among other
40
+
things, focuses on streamlining cloud operations and optimizing performance.
41
+
Based in Bengaluru, India, Yash is a member of Point Blank, a student-run tech
42
+
community dedicated to learning together.
29
43
30
-
He looks back at working with the CloudNativePG team as a "great learning experience". They met twice a week for 30 minutes to discuss the progress of the project. One thing that Yash says he learned from Jonathan is to have more patience. When he was ready to give up on gaining access to the Litmus Chaos Slack workspace, Jonathan hand-held him through the process.
44
+
He looks back at working with the CloudNativePG team as a "great learning experience".
45
+
They met twice a week for 30 minutes to discuss the progress of the project.
46
+
One thing that Yash says he learned is to have more patience.
31
47
32
48
## Chaos testing
33
49
34
-
The new [chaos-testing repository](https://github.com/cloudnative-pg/chaos-testing) Yash worked on provides automated tools to validate PostgreSQL cluster resilience under failure conditions. It combines two testing approaches:
50
+
The new [chaos-testing repository](https://github.com/cloudnative-pg/chaos-testing) Yash worked on provides automated tools to validate
51
+
PostgreSQL cluster resilience under failure conditions. It combines two testing
52
+
approaches:
35
53
36
-
* Jepsen Consistency Testing - Uses the famous Jepsen framework to perform mathematical proofs of database consistency. It continuously runs database operations (50 ops/sec) and validates that no data is lost or corrupted during failures.
37
-
* LitmusChaos Fault Injection - Uses LitmusChaos to simulate real-world failures by repeatedly deleting the PostgreSQL primary pod (every 60-180 seconds), forcing CloudNativePG to perform automatic failover.
54
+
* Jepsen Consistency Testing - Uses the famous Jepsen framework to perform
55
+
mathematical proofs of database consistency. It continuously runs database
56
+
operations (50 ops/sec) and validates that no data is lost or corrupted during
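57
+
failures.
58
+
* LitmusChaos Fault Injection - Uses LitmusChaos to simulate real-world failures
59
+
by repeatedly deleting the PostgreSQL primary pod (every 60-180 seconds),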
60
+
forcing CloudNativePG to perform automatic failover.
38
61
39
-
You can read more about the project in the repository's [README](https://github.com/cloudnative-pg/chaos-testing/blob/main/README.md). And, in case you're curious, here's Yash's PR: https://github.com/cloudnative-pg/chaos-testing/pull/3
62
+
You can read more about the project in the repository's [README](https://github.com/cloudnative-pg/chaos-testing/blob/main/README.md). And, in case
63
+
you're curious, here's Yash's PR: https://github.com/cloudnative-pg/chaos-testing/pull/3
Yash wasn't able to find how to get the chaos engine to target the primary pods since the appKind CloudNativePG uses isn't natively supported by Litmus. "I tried many things, but when I tried AppKind as "Cluster" with capital C it worked! I read the Litmus code and found that there were some validations which prevented "cluster" from working. This behavior was not described in Litmus' documentation, which meant I could submit a PR and prevent the next person from running into the same issue!"
68
+
Yash wasn't able to find how to get the chaos engine to target the primary pods
69
+
since the appKind CloudNativePG uses isn't natively supported by Litmus. "I tried
70
+
many things, but when I tried AppKind as "cluster" it worked! I read the Litmus
71
+
code and found that there were some validations which prevented "Cluster" (capital
72
+
"C") from working. This behavior was not described in Litmus' documentation,
73
+
which meant I could submit a PR and prevent the next person from running into
74
+
the same issue!"
45
75
46
76
## What's next?
47
77
48
-
In the second half of his 3rd year, Yash is exploring opportunities in the field of backend and DevOps. "I will surely try to contribute more towards CloudNativePG when time permits!" You can follow Yash's work on [GitHub](https://github.com/XploY04).
78
+
In the second half of his 3rd year, Yash is exploring opportunities in the field
79
+
of backend and DevOps. "I will surely try to contribute more towards CloudNativePG
80
+
when time permits!" You can follow Yash's work on [GitHub](https://github.com/XploY04).