content/posts/20250501-challenge-1-step-by-step/index.md (+40 -15 lines)

---
title: "Challenge 1 - step by step revealed"
date: 2025-05-02T16:30:54+02:00
pin: true

resources:

> [!CAUTION]
> **Spoiler alert**
> This page contains a step by step explanation of [challenge 1](../aocc-challenge-01).

At the start of the [Application Observability Code Challenges](../application-observability-code-challenges) I described observability as:
This is a screenshot of a part of the dashboard showing the request rate, duration…

The marked blocks show when the 3 scripts were executed.

> [!TIP]
> What can we learn from these metrics?

## Request rate panel (on top)
I did not take the full screenshot, but after this it even goes way up, which are…

## JVM Threads
The thread panel shows a very consistent number of threads. Even with 40 concurrent users during the second test run, the number of threads does not increase much.
If a thread had been created per user, there would be many more threads, but that is not the case.

## Tomcat threads
This is also supported by the panel with Tomcat threads.
And this is also expected, as our application uses `async` message handling. In…

Can we see any special behaviour regarding memory usage? Not really, it behaves normally. The garbage collector kicks in from time to time and is able to free the memory as expected.
A trace has also been created for each request. What can we learn from those traces?
This is an example trace:

Unfortunately we cannot see much based on this: we can see that it took 2 seconds to process this request, but no further details...

> [!TIP]
> What can we do to improve the observability to get more insight?

What kind of additional information can we extract from the application to see what is going on?
```java
public class ManagedAsyncExecutorProvider extends ThreadPoolExecutorProvider {
    // …
}
```
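For context, this is roughly how a resource method ends up on that executor. The resource below is a hypothetical example, not taken from the challenge code, and the `jakarta.ws.rs` imports may be `javax.ws.rs` depending on the Jersey version:

```java
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.container.AsyncResponse;
import jakarta.ws.rs.container.Suspended;
import org.glassfish.jersey.server.ManagedAsync;

// Hypothetical resource, for illustration only: a method annotated with
// @ManagedAsync is handed over to the Jersey-managed async executor
// (the thread pool created by the provider above) instead of being
// handled on the Tomcat request thread.
@Path("work")
public class WorkResource {

    @GET
    @ManagedAsync
    public void doWork(@Suspended final AsyncResponse asyncResponse) {
        String result = "done"; // placeholder for the real (slow) work
        asyncResponse.resume(result);
    }
}
```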
**But wait...** we did not see the number of threads increase during the runs. How does the executor handle the tasks? Does it scale properly?

> [!TIP]
> What can we do to improve the observability of the async executor?

## Extending instrumentation with the OpenTelemetry API
The results of the K6 test will not be different, but we will have more metrics…
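The instrumentation code itself is not part of this diff. As a rough sketch of how such executor metrics could be registered with the OpenTelemetry metrics API (the `ExecutorMetrics` helper, the meter name and the metric names are made up for illustration):

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.metrics.Meter;
import java.util.concurrent.ThreadPoolExecutor;

public class ExecutorMetrics {

    // Registers observable gauges that sample the state of the thread pool
    // each time the metrics are collected.
    public static void register(ThreadPoolExecutor executor) {
        Meter meter = GlobalOpenTelemetry.get().getMeter("managed-async-executor");

        meter.gaugeBuilder("executor.pool.core").ofLongs()
                .buildWithCallback(m -> m.record(executor.getCorePoolSize()));
        meter.gaugeBuilder("executor.pool.max").ofLongs()
                .buildWithCallback(m -> m.record(executor.getMaximumPoolSize()));
        meter.gaugeBuilder("executor.threads.active").ofLongs()
                .buildWithCallback(m -> m.record(executor.getActiveCount()));
        meter.gaugeBuilder("executor.queue.size").ofLongs()
                .buildWithCallback(m -> m.record(executor.getQueue().size()));
        meter.gaugeBuilder("executor.tasks.total").ofLongs()
                .buildWithCallback(m -> m.record(executor.getTaskCount()));
        meter.gaugeBuilder("executor.tasks.completed").ofLongs()
                .buildWithCallback(m -> m.record(executor.getCompletedTaskCount()));
    }
}
```

Observable gauges are only read at collection time, which also explains the sampling effect on the queue metric mentioned further down.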
We can use Grafana's 'Explore' feature or the metrics drilldown feature, as long as we do not add these metrics to a dashboard.

In this graph we see that the core size is 4 and the maximum pool size is 40. This is expected.
The green line shows the number of active threads. It is going up, but not above the core size.

The total number of tasks created and the number of tasks completed follow each other closely. This is expected, as all requests have been completed.

The queue of tasks to be executed is growing, especially during the phase when 40 concurrent users were active. To process the tasks faster, more threads could have been created, but this is not done. Why not?
## Why is it not creating more threads?
Ok ... and what is more in the Java doc ...

Ah ... this is exactly the behaviour we see. The queue is growing and at these moments the duration of requests is high.
The core threads continue to process the requests, but no new threads are created.
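To see this behaviour in isolation, here is a small standalone sketch (not part of the challenge code) using a plain `ThreadPoolExecutor` with an unbounded queue and the same core and maximum size as in the dashboard:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueBeforeThreadsDemo {
    public static void main(String[] args) throws InterruptedException {
        // Core size 4, maximum size 40, backed by an unbounded queue.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                4, 40, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

        // Submit far more slow tasks than the core threads can keep up with.
        for (int i = 0; i < 100; i++) {
            executor.submit(() -> {
                try {
                    Thread.sleep(500); // simulate slow work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The pool size stays at the core size (4) while the queue grows,
        // because extra threads are only created when the queue rejects a task.
        System.out.printf("pool size: %d, queue size: %d%n",
                executor.getPoolSize(), executor.getQueue().size());

        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

Running it prints a pool size of 4 and a large queue size: threads beyond the core size are only started when the queue refuses a task, and an unbounded queue never does.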

## How to solve this?

> [!TIP]
> How to solve this? Can we prove that our solution solves it?

There is no one-size-fits-all solution; it depends on what you are trying to achieve, and in certain situations on the architecture of the whole environment and what is expected of that service.
For all solutions, it is important to think in advance about what you want to achieve, whether the solution fits and, importantly(!), to validate that the solution works as expected.

Even for this simple setup, there are several solutions. To name a few:

- Just increase the `corePoolSize`.
  - Just increasing the core size is an easy change.
  - Consideration: This does not change the behaviour of the solution, it just delays the tipping point.
- Remove the use of `@ManagedAsync`.
  - This is also a simple change. Instead of using a separate thread pool to handle the requests, the Tomcat thread pool handles each request in its own thread.
  - Consideration: This is a more fundamental change to your application. It changes from async to sync. In this example it is not a problem, but in other cases the impact may be greater.
- Tune the `ManagedAsyncExecutorProvider` to make it work (see the sketch after this list).
  - By tuning the `ManagedAsyncExecutorProvider` we keep the architecture of the application the same; we just make better use of the up- and down-scaling of the thread pool.
  - Consideration: This requires more tuning and testing to see if it behaves as expected.
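As a sketch of that tuning direction only: this is not the challenge's actual `ManagedAsyncExecutorProvider`, the exact methods to override depend on the Jersey version, and the queue capacity of 2 is borrowed from the queue limit mentioned further down. The idea is to give the executor a small bounded queue so that extra threads are started as soon as the queue is full:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TunedExecutorSketch {

    // Illustrative configuration: with a bounded queue, the pool grows beyond
    // the core size as soon as 2 tasks are waiting, and the extra threads die
    // again after the keep-alive time once the load drops (the up- and
    // down-scaling mentioned above). If the maximum size is also reached,
    // further tasks are rejected, so the maximum must fit the expected load.
    public static ThreadPoolExecutor createTunedExecutor() {
        return new ThreadPoolExecutor(
                4,                             // core pool size
                40,                            // maximum pool size
                60, TimeUnit.SECONDS,          // keep-alive for extra threads
                new ArrayBlockingQueue<>(2));  // small bounded queue
    }
}
```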
> [!TIP]
> You can try some solutions yourself and see what the results are when you run the tests again!

Now we can redeploy the change and run the test again.

Now we see the active threads go up above the core size. Once the test is finished, you see the number of threads going down again after a while.

We see that the queue is almost empty the whole time. The metric is a sampled value. Most likely it reached the limit of 2 more often, as more threads have been created.
But due to the sampling, you cannot see that.
And when we have a look at the results of K6 we see results as we hoped to see a…

# Summary

> #### Observability: Every Engineer’s Job, Not Just Ops’ Problem
> Observability is more about software engineering than it is about operations.
> by Martin Thwaites on [The New Stack](https://thenewstack.io/observability-every-engineers-job-not-just-ops-problem/)

I totally agree with this and with what Martin wrote in the article. I also think that this code challenge shows that this is not just an Ops problem.
As a developer, you are the person who created and maintains the software. You should know how it works and which libraries are used.
For the important parts of these libraries, observability should be activated or created to make sure you can see the actual behaviour.

Practising your skills will also help you to quickly see what is going on. If you see the graphs for the first time, you need time to understand what they mean.
If you have seen them quite often, you can quickly see where they deviate from normal situations and where problems might be.

I have some more challenges in mind, so **stay posted if you want to practice more**. I just need to find time to create the next challenge :smile: