articles/service-fabric/monitor-service-fabric-reference.md
Lines changed: 67 additions & 3 deletions
Original file line number
Diff line number
Diff line change
@@ -18,7 +18,7 @@ See [Monitor Service Fabric](monitor-service-fabric.md) for details on the data
18
18
Azure Monitor doesn't collect any platform metrics or resource logs for Service Fabric. You can monitor and collect:
19
19
20
20
- Service Fabric system, node, and application events. For the full event listing, see [List of Service Fabric events](service-fabric-diagnostics-event-generation-operational.md).
21
-
- Windows performance counters on nodes and applications. For the list of performance counters, see [Performance metrics](service-fabric-diagnostics-event-generation-perf.md).
21
+
- Windows performance counters on nodes and applications. For the list of performance counters, see [Performance metrics](#performance-metrics).
22
22
- Cluster, node, and system service health data. You can use the [FabricClient.HealthManager property](/dotnet/api/system.fabric.fabricclient.healthmanager) to get the health client to use for health related operations, like report health or get entity health.
23
23
- Metrics for the guest operating system (OS) that runs on a cluster node, through one or more agents that run on the guest OS.
24
24
@@ -27,9 +27,74 @@ Azure Monitor doesn't collect any platform metrics or resource logs for Service
27
27
> [!NOTE]
28
28
> The Azure Monitor agent replaces the previously-used Azure Diagnostics extension and Log Analytics agent. For more information, see [Overview of Azure Monitor agents](/azure/azure-monitor/agents/agents-overview).
29
29
30
+
## Performance metrics
31
+
32
+
Collect metrics to understand the performance of your cluster and of the applications running in it. For Service Fabric clusters, we recommend collecting the following performance counters.
33
+
34
+
### Nodes
35
+
36
+
For the machines in your cluster, consider collecting the following performance counters to better understand the load on each machine and make appropriate cluster scaling decisions.
37
+
38
+
| Counter Category | Counter Name |
39
+
| --- | --- |
40
+
| Logical Disk | Logical Disk Free Space |
41
+
| PhysicalDisk(per Disk) | Avg. Disk Read Queue Length |
42
+
| PhysicalDisk(per Disk) | Avg. Disk Write Queue Length |
43
+
| PhysicalDisk(per Disk) | Avg. Disk sec/Read |
44
+
| PhysicalDisk(per Disk) | Avg. Disk sec/Write |
45
+
| PhysicalDisk(per Disk) | Disk Reads/sec |
46
+
| PhysicalDisk(per Disk) | Disk Read Bytes/sec |
47
+
| PhysicalDisk(per Disk) | Disk Writes/sec |
48
+
| PhysicalDisk(per Disk) | Disk Write Bytes/sec |
49
+
| Memory | Available MBytes |
50
+
| PagingFile | % Usage |
51
+
| Processor(Total) | % Processor Time |
52
+
| Process (per service) | % Processor Time |
53
+
| Process (per service) | ID Process |
54
+
| Process (per service) | Private Bytes |
55
+
| Process (per service) | Thread Count |
56
+
| Process (per service) | Virtual Bytes |
57
+
| Process (per service) | Working Set |
58
+
| Process (per service) | Working Set - Private |
59
+
| Network Interface(all-instances) | Bytes Received/sec |
60
+
| Network Interface(all-instances) | Bytes Sent/sec |
61
+
| Network Interface(all-instances) | Bytes Total/sec |
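As a rough illustration of how rows in the node table map onto Windows counter paths, the sketch below builds strings of the form `\Category(instance)\Counter`, the path syntax accepted by tools such as `typeperf`. The helper name `counter_path` and the chosen subset of counters are assumptions made for this example. `_Total` and `*` are the usual instance names for an aggregate and a wildcard, but confirm the exact names in Performance Monitor on your own nodes.

```python
from typing import Optional


def counter_path(category: str, counter: str, instance: Optional[str] = None) -> str:
    """Build a Windows performance counter path such as \\Memory\\Available MBytes."""
    if instance is None:
        return f"\\{category}\\{counter}"
    return f"\\{category}({instance})\\{counter}"


# Illustrative subset of the node counters in the table (hypothetical selection).
NODE_COUNTERS = [
    ("Memory", None, "Available MBytes"),
    ("Processor", "_Total", "% Processor Time"),
    ("PhysicalDisk", "*", "Avg. Disk sec/Read"),
    ("PhysicalDisk", "*", "Avg. Disk sec/Write"),
]

paths = [counter_path(cat, name, inst) for cat, inst, name in NODE_COUNTERS]

for path in paths:
    print(path)
```

Keeping the category, instance, and counter name separate until the last step makes it easier to reuse the same list for different collectors.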
Service Fabric generates a substantial number of custom performance counters. If you have the SDK installed, you can see the comprehensive list on your Windows machine in the Performance Monitor application (Start > Performance Monitor).
88
+
89
+
If the applications you deploy to your cluster use Reliable Actors, add counters from the `Service Fabric Actor` and `Service Fabric Actor Method` categories. For more information, see [Service Fabric Reliable Actors Diagnostics](service-fabric-reliable-actors-diagnostics.md).
90
+
91
+
If you use Reliable Services or Service Remoting, collect counters from the `Service Fabric Service` and `Service Fabric Service Method` counter categories. For more information, see [monitoring with service remoting](service-fabric-reliable-serviceremoting-diagnostics.md) and [reliable services performance counters](service-fabric-reliable-services-diagnostics.md#performance-counters).
92
+
93
+
If you use Reliable Collections, we recommend adding the `Avg. Transaction ms/Commit` counter from the `Service Fabric Transactional Replicator` category to collect the average commit latency per transaction.
- See [Monitor Service Fabric](monitor-service-fabric.md) for a description of monitoring Service Fabric.
45
110
- See [Monitor Azure resources with Azure Monitor](/azure/azure-monitor/essentials/monitor-azure-resource) for details on monitoring Azure resources.
46
111
- See [List of Service Fabric events](service-fabric-diagnostics-event-generation-operational.md) for the list of Service Fabric system, node, and application events.
47
-
- See [Performance metrics](service-fabric-diagnostics-event-generation-perf.md) for the list of Windows performance counters on nodes and applications.
articles/service-fabric/monitor-service-fabric.md
Lines changed: 17 additions & 8 deletions
@@ -31,10 +31,6 @@ You can monitor how your applications are used, the actions taken by the Service
31
31
32
32
[Service Fabric Explorer](service-fabric-visualizing-your-cluster.md), a desktop application for Windows, macOS, and Linux, is an open-source tool for inspecting and managing Azure Service Fabric clusters. To enable automation, every action that can be taken through Service Fabric Explorer can also be done through PowerShell or a REST API.
33
33
34
-
### Application Insights
35
-
36
-
Application Insights integrates with Service Fabric to provide Service Fabric specific metrics and tooling experiences for Visual Studio and Azure portal. Application Insights provides a comprehensive out-of-the-box logging experience. For more information, see [Event analysis and visualization with Application Insights](service-fabric-diagnostics-event-analysis-appinsights.md).
37
-
38
34
## Application monitoring
39
35
40
36
Application monitoring tracks how features and components of your application are being used. You want to monitor your applications to make sure issues that impact users are caught. The responsibility of application monitoring is on the users developing an application and its services since it is unique to the business logic of your application. Monitoring your applications can be useful in the following scenarios:
@@ -45,9 +41,24 @@ Application monitoring tracks how features and components of your application ar
45
41
* What is happening within the services running inside my containers?
46
42
47
43
The great thing about application monitoring is that developers can use whatever tools and framework they'd like since it lives within the context of your application! You can learn more about the Azure solution for application monitoring with Azure Monitor Application Insights in [Event analysis with Application Insights](service-fabric-diagnostics-event-analysis-appinsights.md).
44
+
48
45
We also have a tutorial on how to [set this up for .NET Applications](service-fabric-tutorial-monitoring-aspnet.md). This tutorial covers how to install the right tools, how to write custom telemetry in your application, and how to view the application diagnostics and telemetry in the Azure portal.
49
46
50
-
For more information on application monitoring, see [Application logging](service-fabric-diagnostics-event-generation-app.md).
47
+
### Application logging
48
+
49
+
Instrumenting your code is not only a way to gain insights about your users, but also the only way you can know whether something is wrong in your application, and to diagnose what needs to be fixed. Although technically it's possible to connect a debugger to a production service, it's not a common practice. So, having detailed instrumentation data is important.
50
+
51
+
Some products automatically instrument your code. Although these solutions can work well, manual instrumentation is almost always required to be specific to your business logic. In the end, you must have enough information to forensically debug the application. Service Fabric applications can be instrumented with any logging framework. This section describes a few different approaches to instrumenting your code, and when to choose one approach over another.
52
+
53
+
- **Application Insights SDK**: Application Insights has a rich integration with Service Fabric out of the box. Users can add the Application Insights Service Fabric NuGet packages and view the collected data and logs in the Azure portal. Additionally, users are encouraged to add their own telemetry to diagnose and debug their applications and to track which services and parts of their application are used the most. The [TelemetryClient](/dotnet/api/microsoft.applicationinsights.telemetryclient) class in the SDK provides many ways to track telemetry in your applications. For more information, see [Event analysis and visualization with Application Insights](service-fabric-diagnostics-event-analysis-appinsights.md).
54
+
55
+
Check out an example of how to instrument your application and add Application Insights to it in our tutorial for [monitoring and diagnosing a .NET application](service-fabric-tutorial-monitoring-aspnet.md).
56
+
57
+
- **EventSource**: When you create a Service Fabric solution from a template in Visual Studio, an **EventSource**-derived class (**ServiceEventSource** or **ActorEventSource**) is generated. The generated class is a template to which you can add events for your application or service. The **EventSource** name **must** be unique, and should be renamed from the default template string `MyCompany-<solution>-<project>`. Having multiple **EventSource** definitions that use the same name causes an issue at run time. Each defined event must have a unique identifier. If an identifier is not unique, a runtime failure occurs. Some organizations preassign ranges of values for identifiers to avoid conflicts between separate development teams. For more information, see [Vance's blog](/archive/blogs/vancem/introduction-tutorial-logging-etw-events-in-c-system-diagnostics-tracing-eventsource) or the [MSDN documentation](/previous-versions/msp-n-p/dn774985(v=pandp.20)).
58
+
59
+
- **ASP.NET Core logging**: It's important to carefully plan how you will instrument your code. The right instrumentation plan can help you avoid potentially destabilizing your code base, and then needing to reinstrument the code. To reduce risk, you can choose an instrumentation library like [Microsoft.Extensions.Logging](https://www.nuget.org/packages/Microsoft.Extensions.Logging/), which is part of Microsoft ASP.NET Core. ASP.NET Core has an [ILogger](/dotnet/api/microsoft.extensions.logging.ilogger) interface that you can use with the provider of your choice, while minimizing the effect on existing code. You can use the code in ASP.NET Core on Windows and Linux, and in the full .NET Framework, so your instrumentation code is standardized.
60
+
61
+
For examples of how to apply these suggestions, see [Add logging to your Service Fabric application](service-fabric-how-to-diagnostics-log.md).
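The EventSource guidance above mentions preassigning identifier ranges so that separate teams don't collide. The following is a minimal, language-neutral sketch of that bookkeeping, written in Python rather than .NET. The team names, ranges, event names, and the `find_conflicts` helper are all hypothetical and are not part of any Service Fabric or EventSource API.

```python
from typing import Dict, List, Tuple

# Hypothetical per-team ranges; names and bounds are made up for illustration.
TEAM_RANGES: Dict[str, Tuple[int, int]] = {
    "orders": (1000, 1999),
    "billing": (2000, 2999),
}


def find_conflicts(events: Dict[str, Dict[str, int]]) -> List[str]:
    """Return problems found: out-of-range or duplicate event identifiers."""
    problems: List[str] = []
    seen: Dict[int, str] = {}  # event id -> first "team.event" that claimed it
    for team, team_events in events.items():
        low, high = TEAM_RANGES[team]
        for name, event_id in team_events.items():
            if not low <= event_id <= high:
                problems.append(f"{team}.{name}: id {event_id} outside {low}-{high}")
            if event_id in seen:
                problems.append(
                    f"{team}.{name}: id {event_id} already used by {seen[event_id]}"
                )
            else:
                seen[event_id] = f"{team}.{name}"
    return problems


events = {
    "orders": {"OrderPlaced": 1000, "OrderFailed": 1001},
    "billing": {"InvoiceSent": 2000, "InvoiceFailed": 1001},
}
problems = find_conflicts(events)
for problem in problems:
    print(problem)
```

A check like this could run at build time, so an identifier clash is caught before it surfaces as a runtime failure.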
51
62
52
63
## Platform (cluster) monitoring
53
64
@@ -86,9 +97,7 @@ Now that we've covered the diagnostics in your application and the platform, how
86
97
* Am I utilizing my hardware efficiently? Do you want to use your hardware at 90% CPU or 10% CPU? This comes in handy when scaling your cluster, or optimizing your application's processes.
87
98
* Can I predict infrastructure issues proactively? Many issues are preceded by sudden changes (drops) in performance, so you can use performance counters such as network I/O and CPU utilization to predict and diagnose the issues proactively.
88
99
89
-
A list of performance counters that should be collected at the infrastructure level can be found at [Performance metrics](service-fabric-diagnostics-event-generation-perf.md).
90
-
91
-
Service Fabric also provides a set of performance counters for the Reliable Services and Actors programming models. If you are using either of these models, these performance counters can provide information to ensure that your actors are spinning up and down correctly, or that your reliable service requests are being handled fast enough. For more information, see [Monitoring for Reliable Service Remoting](service-fabric-reliable-serviceremoting-diagnostics.md#performance-counters) and [Performance monitoring for Reliable Actors](service-fabric-reliable-actors-diagnostics.md#performance-counters).
100
+
A list of performance counters that should be collected at the infrastructure level can be found at [Performance metrics](monitor-service-fabric-reference.md#performance-metrics).
92
101
93
102
Azure Monitor Logs is recommended for monitoring cluster level events. After you configure the [Log Analytics agent](service-fabric-diagnostics-oms-agent.md) with your workspace, you can collect:
articles/service-fabric/service-fabric-best-practices-applications.md
Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ Service Fabric Reliable Actors enables you to easily create stateful, virtual ac
70
70
71
71
72
72
## Application diagnostics
73
-
Be thorough about adding [application logging](./service-fabric-diagnostics-event-generation-app.md) in service calls. It will help you diagnose scenarios in which services call each other. For example, when A calls B calls C calls D, the call could fail anywhere. If you don't have enough logging, failures are hard to diagnose. If the services are logging too much because of call volumes, be sure to at least log errors and warnings.
73
+
Be thorough about adding [application logging](monitor-service-fabric.md#application-logging) in service calls. It will help you diagnose scenarios in which services call each other. For example, when A calls B calls C calls D, the call could fail anywhere. If you don't have enough logging, failures are hard to diagnose. If the services are logging too much because of call volumes, be sure to at least log errors and warnings.
74
74
75
75
## Design guidance on Azure
76
76
* Visit the [Azure architecture center](/azure/architecture/microservices/) for design guidance on [building microservices on Azure](/azure/architecture/microservices/).
articles/service-fabric/service-fabric-best-practices-monitoring.md
Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ Generally, a watchdog is a separate service that watches health and load across
35
35
36
36
## Next steps
37
37
38
-
* Get started instrumenting your applications: [Application level event and log generation](service-fabric-diagnostics-event-generation-app.md).
38
+
* Get started instrumenting your applications: [Application level event and log generation](monitor-service-fabric.md#application-logging).
39
39
* Go through the steps to set up Application Insights for your application with [Monitor and diagnose an ASP.NET Core application on Service Fabric](service-fabric-tutorial-monitoring-aspnet.md).
40
40
* Learn more about monitoring the platform and the events Service Fabric provides for you: [Platform level event and log generation](service-fabric-diagnostics-event-generation-infra.md).
41
41
* Configure Azure Monitor logs integration with Service Fabric: [Set up Azure Monitor logs for a cluster](service-fabric-diagnostics-oms-setup.md)
articles/service-fabric/service-fabric-diagnostics-event-aggregation-wad.md
Lines changed: 1 addition & 1 deletion
@@ -299,7 +299,7 @@ To collect performance counters or event logs, modify the Resource Manager templ
299
299
300
300
## Collect Performance Counters
301
301
302
-
To collect performance metrics from your cluster, add the performance counters to your "WadCfg > DiagnosticMonitorConfiguration" in the Resource Manager template for your cluster. See [Performance monitoring with WAD](service-fabric-diagnostics-perf-wad.md) for steps on modifying your `WadCfg` to collect specific performance counters. Reference [Service Fabric Performance Counters](service-fabric-diagnostics-event-generation-perf.md) for a list of performance counters that we recommend collecting.
302
+
To collect performance metrics from your cluster, add the performance counters to your "WadCfg > DiagnosticMonitorConfiguration" in the Resource Manager template for your cluster. See [Performance monitoring with WAD](service-fabric-diagnostics-perf-wad.md) for steps on modifying your `WadCfg` to collect specific performance counters. Reference [Performance metrics](monitor-service-fabric-reference.md#performance-metrics) for a list of performance counters that we recommend collecting.
303
303
304
304
If you are using an Application Insights sink, as described in the section below, and want these metrics to show up in Application Insights, then make sure to add the sink name in the "sinks" section as shown above. This will automatically send the performance counters that are individually configured to your Application Insights resource.
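To make the "WadCfg > DiagnosticMonitorConfiguration" step more concrete, here is a small sketch that generates a `PerformanceCounters` fragment from a list of counter paths. The property names (`scheduledTransferPeriod`, `PerformanceCounterConfiguration`, `counterSpecifier`, `sampleRate`) reflect the Azure Diagnostics extension schema as best understood here, and the one-minute ISO 8601 durations are arbitrary. Treat both as assumptions to verify against the linked WAD article. The helper `perf_counter_config` is hypothetical.

```python
import json


def perf_counter_config(specifiers, sample_rate="PT1M", transfer_period="PT1M"):
    """Build a PerformanceCounters fragment for DiagnosticMonitorConfiguration."""
    return {
        "PerformanceCounters": {
            # How often collected samples are transferred (ISO 8601 duration).
            "scheduledTransferPeriod": transfer_period,
            "PerformanceCounterConfiguration": [
                {"counterSpecifier": spec, "sampleRate": sample_rate}
                for spec in specifiers
            ],
        }
    }


fragment = perf_counter_config([
    "\\Processor(_Total)\\% Processor Time",
    "\\Memory\\Available MBytes",
])
print(json.dumps(fragment, indent=2))
```

Generating the fragment from a list keeps the Resource Manager template in sync with whichever set of recommended counters you choose to collect.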