Review as part of non-critical support

johnsimons · johnsimons · commit 40e229cadda5 · 2025-08-22T14:57:44.000+10:00
diff --git a/tutorials/monitoring-demo/index.md b/tutorials/monitoring-demo/index.md
@@ -1,50 +1,60 @@
 ---
 title: "NServiceBus monitoring demo"
-reviewed: 2023-11-05
-summary: A self-contained demo solution that you can run to explore the monitoring features of the Particular Service Platform.
+reviewed: 2025-08-22
+summary: A self-contained demo solution to explore the monitoring features of the Particular Service Platform.
 suppressRelated: true
 redirects:
 - tutorials/monitoring/demo
 ---
 
-See how to use the monitoring features in the Particular Service Platform by trying them out in ServicePulse with a real system. This downloadable sample contains all of the necessary parts of the platform, already configured and ready to run, including four sample endpoints that communicate with each other by exchanging messages.
+Experience the monitoring features of the Particular Service Platform by running a real-world demo in ServicePulse. This downloadable sample includes all required platform components, pre-configured and ready to use, with four sample endpoints that communicate by exchanging messages.
 
-<div class="text-center inline-download hidden-xs"><a id='download-demo' href='https://s3.amazonaws.com/particular.downloads/MonitoringDemo/Particular.MonitoringDemo.zip' class="btn btn-primary btn-lg"><span class="glyphicon glyphicon-download-alt" aria-hidden="true"></span> Download demo</a>
+<div class="text-center inline-download hidden-xs">
+  <a id='download-demo' href='https://s3.amazonaws.com/particular.downloads/MonitoringDemo/Particular.MonitoringDemo.zip' class="btn btn-primary btn-lg">
+    <span class="glyphicon glyphicon-download-alt" aria-hidden="true"></span> Download demo
+  </a>
 </div>
 
 ## Prerequisites
 
-To run the downloaded sample, you will need the following prerequisites:
- 
-- .NET 8 runtime must be installed.
-- Windows operating system, the Particular Service Platform requires the Windows operating system
-  - Desktop: Windows 8 or higher
-  - Server: Windows Server 2016 or higher
+To run the sample, ensure you have:
 
-## Running the sample
+- [.NET 8 runtime](https://dotnet.microsoft.com/en-us/download/dotnet/8.0) installed
+- Windows operating system:
+  - Desktop: Windows 8 or later
+  - Server: Windows Server 2016 or later
 
-Once you have downloaded and extracted the zip package, open the extracted folder and double-click on `MonitoringDemo`.
+> **Note:** The Particular Service Platform requires Windows.
 
-The details on how the demo is set up can be found in the demo [setup walkthrough](walkthrough-setup.md).
+## How to run the sample
 
-## Demo walk-through
+1. Download and extract the zip package.
+2. Open the extracted folder.
+3. Double-click `MonitoringDemo` to start the demo.
 
-Once everything is running, you will have 4 endpoints which are configured like this:
+For more details on the demo setup, see the [setup walkthrough](walkthrough-setup.md).
 
-![Solution Diagram](diagram.svg "width=680")
+## Demo overview
 
-By default, the ClientUI endpoint sends a steady stream of 1 `PlaceOrder` message every second.
+When running, the demo starts four endpoints configured as shown below:
 
-The endpoints are also configured to send monitoring data to the Particular Software Platform, which you can see in ServicePulse.
+![Solution Diagram showing four endpoints](diagram.svg "width=680")
 
-![Service Pulse monitoring tab showing sample endpoints](servicepulse-monitoring-tab-sample-low-throughput.png "width=500")
+By default, the ClientUI endpoint sends one `PlaceOrder` message per second.
+
+All endpoints are configured to send monitoring data to the Particular Service Platform, which you can view in ServicePulse.
+
+![ServicePulse monitoring tab showing sample endpoints](servicepulse-monitoring-tab-sample-low-throughput.png "width=500")
 
 ## Explore the demo further
 
-See how monitoring tools in ServicePulse help answer the following questions:
+Use the monitoring tools in ServicePulse to investigate:
 
-- **[Which message types take the longest to process?](walkthrough-1.md)** Take a look at individual endpoint performance and decide where to optimize.
-- **[Which endpoints have the most work to do?](walkthrough-2.md)** Look for peaks of traffic and decide when to scale out. 
-- **[Are any of the endpoints struggling?](walkthrough-3.md)** Find hidden problems and fix them before messages start to fail.
+- **[Which message types take the longest to process?](walkthrough-1.md)**
+  Analyze individual endpoint performance to identify optimization opportunities.
+- **[Which endpoints have the most work to do?](walkthrough-2.md)**
+  Detect traffic peaks to make informed scaling decisions.
+- **[Are any of the endpoints struggling?](walkthrough-3.md)**
+  Uncover and resolve hidden issues before they cause message processing failures.
 
 include: monitoring-demo-next-steps
diff --git a/tutorials/monitoring-demo/walkthrough-3.md b/tutorials/monitoring-demo/walkthrough-3.md
@@ -1,102 +1,107 @@
 ---
-title: "Monitoring NServiceBus Demo - Struggling endpoints"
-reviewed: 2023-11-07
-summary: Use the Particular Service Platform to find hidden problems in your solution.
+title: "Monitoring NServiceBus Demo - Struggling Endpoints"
+reviewed: 2025-08-22
+summary: Use the Particular Service Platform to identify and diagnose hidden problems in your solution.
 suppressRelated: true
 ---
 
 _Are any of the endpoints struggling?_
 
-NServiceBus endpoints are designed to tolerate several types of failure. There are some early warning signs to be aware of that indicate that an endpoint is going to have a problem.
-
-This part of the tutorial guides you through how to use monitoring data to spot hidden problems in your NServiceBus system.
+This tutorial demonstrates how to use monitoring data in the Particular Service Platform to detect early warning signs and hidden issues in your NServiceBus system. You will learn how to spot struggling endpoints before they become critical problems.
 
 include: monitoring-demo-walkthrough-solution
 
+## Key metrics
 
-## Metrics
-
-One of the benefits of NServiceBus is that it can [handle transient errors](https://particular.net/blog/but-all-my-errors-are-severe) for you. If a network switch is being restarted or a web server is temporarily too busy to service requests, then an NServiceBus endpoint will roll the message it is processing back to its input queue and try again later. If the problem was short-lived and has since been corrected, then the message will process successfully when it is retried. If the problem is more permanent, the endpoint will eventually forward the message to an error queue.
-
-_Scheduled retry rate_ measures how often messages are failing and are marked to be retried.
-
-_Processing time_ is the time it takes for the endpoint to process a single message. A higher processing time indicates a slower endpoint and a lower processing time indicates a faster endpoint. Processing time is only measured for messages that are successfully processed.
+NServiceBus is designed to handle transient errors automatically. For example, if a network switch is restarted or a web server is temporarily unavailable, the endpoint will roll back the message to its input queue and retry later. If the issue is resolved quickly, the message will process successfully on retry. If the problem persists, the message will eventually be forwarded to the error queue.
 
+- **Scheduled retry rate**: Measures how often messages fail and are scheduled for retry.
+- **Processing time**: The time taken to process a single message. Higher processing times may indicate a struggling endpoint, while lower times suggest healthy performance. Only successfully processed messages are measured.
 
-## Sample walkthrough
+## Walkthrough: Identifying struggling endpoints
 
-The following walkthrough uses the sample solution to simulate problems with endpoints.
+Follow these steps to simulate and observe endpoint issues using the sample solution:
 
-**Run the sample solution. Open ServicePulse to the Monitoring tab.**
+1. Run the sample solution.
+2. Open ServicePulse and navigate to the Monitoring tab.
 
-![Service Pulse monitoring tab showing sample endpoints](servicepulse-monitoring-tab-sample-low-throughput.png "width=500")
+   ![ServicePulse Monitoring tab showing sample endpoints](servicepulse-monitoring-tab-sample-low-throughput.png "width=500")
 
-NServiceBus endpoints frequently rely on other resources to do their work. This might take the form of a database server that holds persisted data or a web server that hosts an API that the endpoint needs to call. The endpoints themselves are designed to tolerate failure, but there are some early indicators that failure is coming.
+Endpoints often depend on external resources, such as databases or web APIs. While endpoints are resilient to failures, monitoring can reveal early indicators of trouble.
 
+### Detecting slow message processing
 
-### Processing messages is getting slower
+A common early warning sign is an increase in message processing time. This may indicate that database queries or web API calls are taking longer than usual, signaling potential issues with dependent resources.
 
-The first indication that an endpoint is going to run into trouble is when processing messages starts to slow down. This is indicated by an increase in processing time. This means that database queries and web API calls are taking longer to process than they were before.
+**Simulate resource degradation:**
 
-**Find the Shipping endpoint windows and toggle the resource degradation simulation.**
+Find the Shipping endpoint window and toggle the resource degradation simulation.
 
 ![ServicePulse Monitoring tab showing resource degradation on Shipping endpoint](servicepulse-monitoring-tab-resource-degradation.png "width=500")
 
-Watch the processing time on the shipping endpoint. As the (simulated) third-party resources slow down, processing the messages takes longer and processing time goes up. To find the root cause, you need to know which message types are causing the problem.
+As the (simulated) third-party resources slow down, processing time for the Shipping endpoint increases. To diagnose the root cause, it's essential to identify which message types are affected.
 
-**In the ServicePulse UI, click the Shipping endpoint to open a detailed view.**
+**Analyze processing time by message type:**
+
+In the ServicePulse UI, click the Shipping endpoint to open a detailed view.
 
 ![ServicePulse Details tab showing resource degradation on OrderPlaced events](servicepulse-monitoring-details-resource-degradation.png "width=500")
 
-This screen shows a breakdown of processing time by message type. Even though the Shipping endpoint processes two types of message, only one of them is slowing down. There is something that is slowing down the processing of `OrderPlaced` events that is not affecting the processing of `OrderBilled` events.
+This view breaks down processing time by message type. In this case, only the `OrderPlaced` events are experiencing increased processing times, indicating an issue specific to that message type.
 
 > [!NOTE]
-> This example is a simulation, and there isn't a third party resource that is failing.  We're just simulating it with `Task.Delay`.
+> This example uses simulation to mimic resource degradation (e.g., `Task.Delay`).
 
-**Find the Shipping endpoint window and toggle the resource degradation simulation off. Return the ServicePulse Monitoring tab.**
+**Observe recovery:**
 
-Now look at the processing time for the Shipping endpoint again. As soon as the remote resource recovers, the processing time snaps back to where it was before. This is what it looks like when a failing resource is restarted.
+Find the Shipping endpoint window and toggle the resource degradation simulation off. Return to the ServicePulse Monitoring tab.
 
+Once the simulated remote resource recovers, the processing time for the Shipping endpoint returns to its previous level. This is what it looks like when a failing resource is restarted.
 
-### Messages are being retried
+### Monitoring scheduled retry rate
 
-The second indication that an endpoint is running into problems is that message processing starts to fail, and the endpoint starts scheduling messages to be retried. When an exception is thrown in a message handler, NServiceBus will remove the message being processed from the queue that it came from and try to handle that message again at a later time. If the exception is caused by a temporary problem, then waiting for a small period and re-processing the message will succeed.
+Another critical metric is the scheduled retry rate, which indicates how often messages are failing and being retried. A sudden increase in this rate may suggest that an endpoint is struggling to process messages successfully.
 
-If there are occasional network outages or database deadlocks, this works well. The message still gets processed successfully, and the system continues as if nothing happened. When the rate of these errors starts to increase, it might mask a broader issue.
+**Simulate increased failure rate:**
 
-**Find the Billing endpoint UI and increase the failure rate to 30%.**
+Find the Billing endpoint UI and increase the failure rate to 30%.
 
-Now look at the scheduled retry rate for the Billing endpoint in the ServicePulse monitoring tab. Notice that even though the endpoint is encountering difficulties processing roughly a third of its messages, it is still able to process every message successfully after a couple of retries.
+Monitor the scheduled retry rate for the Billing endpoint in the ServicePulse Monitoring tab. Despite the increased failure rate, the endpoint may still process messages successfully after a few retries.
 
 > [!NOTE]
-> As the endpoint is wasting resources attempting to process a message that fails, the number of successfully processed messages (throughput) goes down. This has the effect of forcing messages to spend longer in the input queue which can impact queue length and critical time as well (to find out why, see [Which endpoints have the most work to do?](./walkthrough-2.md)).
+> A higher failure rate can lead to decreased throughput, as the endpoint spends resources retrying failed messages. This may also impact queue length and critical time, as explained in [Which endpoints have the most work to do?](./walkthrough-2.md).
 
-If you are concerned about the number of messages that are being retried, check the endpoint logs. When messages are scheduled to be retried, details about the message and the failure are logged at the WARN log level.
+Check the endpoint logs for details about retried messages. When a message is scheduled for retry, details about the message and the failure are logged at the WARN log level.
 
+### Identifying failed messages
 
-### Messages are failing, even after being retried
+The final indicator of a struggling endpoint is when messages consistently fail to process, even after being retried. NServiceBus will forward these messages to ServiceControl for manual intervention.
 
-The final indication that an endpoint is having problems is when messages fail to process. If, after some retry attempts, NServiceBus is still not able to successfully process a message, it will send the message to ServiceControl for manual intervention in ServicePulse.
+**Increase the failure rate to 90%:**
 
-**Find the Billing endpoint UI and increase the failure rate to 90%.**
+Find the Billing endpoint UI and increase the failure rate to 90%.
 
-With such a high failure rate, it won't take long before messages begin exceeding the number of retries configured for the Billing endpoint. When this happens, these failed messages will appear in the Failed Messages tab in ServicePulse.
+With a high failure rate, messages will quickly exceed the configured retry attempts and appear in the Failed Messages tab in ServicePulse.
 
 ![ServicePulse failed messages tab](servicepulse-failed-messages.png "width=500")
 
-When ServiceControl receives failed messages from an endpoint, it will group them according to the Exception Type and the place in the code where the exception is thrown. In ServicePulse you can open up an exception group and look at each failed individually. This includes a full stack-trace, as well as access to the message headers and the body of the message.
+ServiceControl groups failed messages by exception type and the location in the code where the exception occurred. In ServicePulse, you can examine each failed message individually, including the stack trace, message headers, and body.
+
+Once the underlying issue is resolved, you can retry all failed messages in bulk from ServicePulse.
 
-Once the conditions that led to the error are resolved, you can retry all of the messages in bulk from ServicePulse.
+**Retry failed messages:**
 
-**Find the Billing endpoint UI and decrease the failure rate back down to 0%. In the ServicePulse Failed Messages tab, click the Request retry button. Confirm that you are ready to retry the messages.**
+Find the Billing endpoint UI and decrease the failure rate back down to 0%. In the ServicePulse Failed Messages tab, click the Request retry button. Confirm that you are ready to retry the messages.
 
-ServiceControl will stage the messages to be retried and then return them to the Billing endpoint where they will be successfully processed.
+ServiceControl will stage the messages for retry and return them to the Billing endpoint, where they will be processed successfully.
 
 ![ServicePulse failed messages retried](servicepulse-failed-messages-retried.png "width=500")
 
-## Keep exploring the demo
+## Next steps
+
+After identifying and resolving issues with struggling endpoints, consider exploring the following:
 
-- **[Which message types are taking the longest to process?](./walkthrough-1.md):** take a look at individual endpoint performance and decide where to optimize.
-- **[Which endpoints have the most work to do?](./walkthrough-2.md):** look for peaks of traffic and decide when to scale out.
+- **[Which message types are taking the longest to process?](./walkthrough-1.md):** Analyze individual endpoint performance to identify optimization opportunities.
+- **[Which endpoints have the most work to do?](./walkthrough-2.md):** Examine traffic patterns and determine optimal scaling strategies.
 
 include: monitoring-demo-next-steps
diff --git a/tutorials/monitoring-demo/walkthrough-setup.md b/tutorials/monitoring-demo/walkthrough-setup.md
@@ -29,4 +29,4 @@ After downloading the zip file, extract its contents into a folder. Open the fol
 
 ## Explore the demo
 
-Once you have the demo up and running, [begin exploring the demo](/tutorials/monitoring-demo/#demo-walk-through).
+Once you have the demo up and running, [begin exploring the demo](/tutorials/monitoring-demo/#demo-overview).
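
The 30% and 90% failure-rate steps in `walkthrough-3.md` can be made concrete with a small calculation. The sketch below is not part of the demo or of NServiceBus. It is a plain Python model that assumes each processing attempt fails independently with the same probability, and the value of five attempts per message is hypothetical, since the retry count configured for the Billing endpoint is not stated in this diff. The function name `retry_outcomes` is likewise made up for illustration.

```python
def retry_outcomes(failure_rate: float, max_attempts: int) -> tuple[float, float]:
    """Model a message that is attempted up to max_attempts times.

    Returns a pair:
      - probability that every attempt fails (the message ends up as a failed message)
      - expected number of retries scheduled per message
    Assumes attempts fail independently with probability failure_rate.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # The message fails permanently only if all attempts fail.
    p_exhausted = failure_rate ** max_attempts

    # A retry is scheduled after attempt k fails, for k = 1 .. max_attempts - 1;
    # reaching that point requires k consecutive failures.
    expected_retries = sum(failure_rate ** k for k in range(1, max_attempts))

    return p_exhausted, expected_retries


# Hypothetical retry budget of five attempts per message.
for rate in (0.3, 0.9):
    p_failed, retries = retry_outcomes(rate, max_attempts=5)
    print(f"failure rate {rate:.0%}: "
          f"{p_failed:.1%} of messages exhaust their attempts, "
          f"{retries:.2f} retries scheduled per message on average")
```

Under these assumptions, a 30% failure rate mostly shows up as scheduled retries, with very few messages exhausting their attempts, while a 90% failure rate sends a majority of messages to the Failed Messages tab. This matches the qualitative behavior the walkthrough describes, although the exact numbers depend on the retry configuration.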
