Alert Fine Tuning

Emre Guclu edited this page Nov 24, 2022 · 3 revisions

Top 20 Alerts

The following command lists the 20 most frequent open alerts. Alerts with IsMonitorAlert set to True must be closed by resetting the monitor that raised them.

Get-SCOMAlert | Where-Object ResolutionState -eq 0 | Group-Object -Property Name,IsMonitorAlert | Sort-Object -Property Count -Descending | Select-Object -First 20 -Property Count, @{Name="AlertName";Expression={$_.Group[0].Name}}, @{Name="IsMonitorAlert";Expression={$_.Group[0].IsMonitorAlert}}

Sample Result

Count AlertName                                                                         IsMonitorAlert
----- ---------                                                                         --------------
   80 Microsoft.SystemCenter.Agent.MonitoringHost.PrivateBytesThreshold                 True          
   76 Power Shell Script failed to run                                                  False         
   65 MSSQL on Windows: Discovery error                                                 False         
   17 Microsoft.SystemCenter.Agent.MonitoringHost.HandleCountThreshold                  True          
   15 Application Pool worker process is unresponsive                                   False         
   12 Alert generation was temporarily suspended due to too many alerts.                False         
    9 Exchange Health Set                                                               True          
    9 Alert subscription data source module encountered errors while running            False         
    7 Logical Disk Free Space in MBytes is low                                          True          
    6 Operations Manager failed to start a process                                      False         
    4 MSSQL on Windows: Database is in offline/recovery pending/suspect/emergency state True          
    2 Memory Pages Per Second is too High.                                              True          
    2 MSSQL on Windows: Monitoring error                                                False         
    2 Logical Disk Free Space is low                                                    True          
    2 Failed to send notification using server/device                                   False         
    2 Failed to send notification                                                       False         
    2 Windows DNS - Conditional Forward Forwarder - All IP Addresses Failing NSLookup   True          
    1 MSSQL on Windows: Filegroup is running out of space                               True          
    1 IIS 8 Web Server is unavailable                                                   True          
    1 Health Service Heartbeat Failure                                                  True          
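
For the alerts marked True above, resetting the monitor can be scripted. The following is a minimal sketch, not a definitive procedure: it assumes the OperationsManager module is loaded and a management group connection is already open, and `$alertName` is only an illustrative value taken from the sample result. Test it against a single alert before running it broadly.

```powershell
# Sketch: reset the monitors behind open, monitor-based alerts with a given name.
# Assumption: OperationsManager module is loaded and connected to the management group.
# $alertName is an illustrative value; replace it with the alert you are tuning.
$alertName = 'Logical Disk Free Space is low'

$monitorAlerts = Get-SCOMAlert |
    Where-Object { $_.ResolutionState -eq 0 -and $_.IsMonitorAlert -and $_.Name -eq $alertName }

foreach ($alert in $monitorAlerts) {
    # For monitor alerts, MonitoringRuleId holds the Id of the monitor that raised the alert.
    $monitor  = Get-SCOMMonitor -Id $alert.MonitoringRuleId
    # The monitored object whose health state the monitor tracks.
    $instance = Get-SCOMClassInstance -Id $alert.MonitoringObjectId
    # Reset the monitor to healthy; the alert closes if the monitor is set to auto-resolve.
    $instance.ResetMonitoringState($monitor) | Out-Null
}
```

If the underlying condition still exists, the monitor turns unhealthy again on its next check and raises a new alert, so a reset is only useful after the cause has been fixed or the threshold has been tuned.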

Alert Counts Per Day

You can check how a selected alert is distributed by day to decide whether the high count is due to an operation performed on a specific day.

Get-SCOMAlert | where {$_.Name -eq 'Microsoft.SystemCenter.Agent.MonitoringHost.PrivateBytesThreshold' -and $_.ResolutionState -eq 0} | Select-Object -Property @{Name="Date";Expression={"{0:yyyy-MM-dd}" -f $_.TimeRaised}}, NetbiosComputerName | Group-Object -Property Date | Sort-Object -Property Name -Descending | Select-Object -Property Name,Count

Sample Result is as follows

Name       Count
----       -----
2020-12-01     5
2020-11-30     3
2020-11-28     5
2020-11-27     7
2020-11-26    11
2020-11-25     7
2020-11-22     1
2020-11-21     1
2020-11-19     1
2020-11-16     1
2020-11-14     1
2020-11-10     1
2020-11-07     1
2020-11-03     1

Which servers are generating the selected alert?

Grouping the alert by computer shows whether one or a few problematic servers generate most of the alerts, so that you know where to focus.

Get-SCOMAlert | where {$_.Name -eq 'MSSQL on Windows: Discovery error' -and $_.ResolutionState -eq 0}  | Group-Object -Property NetbiosComputerName | Sort-Object -Property Count -Descending |Select-Object -Property Name,Count

Sample result is as follows.

Name            Count
----            -----
Server1         50
Server2         1
Server3         1
Server4         1

Run the following to see the alert counts per hour for the selected date.

$Alerts = Get-SCOMAlert
$Perfalerts = $Alerts | where {$_.Name -like '*Performance data collection process unable to write data*'}
$HourofDay = @{Name='HourofDay';Expression={$_.TimeRaised.hour}}
$Perfalerts | where {$_.TimeRaised.ToString('yyyy-MM-dd') -eq '2022-11-19'} | Select-Object $HourofDay | Group-Object -Property HourofDay -NoElement | Format-Table
