Commit 851e5ad

Merge pull request #31485 from aliabuckner/patch-2
Add troubshooting section for OMI 100% CPU
2 parents c43c643 + e067c4e commit 851e5ad

1 file changed: +27 -0 lines changed

articles/azure-monitor/platform/agent-linux-troubleshoot.md

Lines changed: 27 additions & 0 deletions
@@ -183,6 +183,33 @@ Below the output plugin, uncomment the following section by removing the `#` in
## Issue: You see a 500 and 404 error in the log file right after onboarding
This is a known issue that occurs on first upload of Linux data into a Log Analytics workspace. This does not affect data being sent or service experience.

## Issue: You see omiagent using 100% CPU

### Probable causes
A regression in the nss-pem package [v1.0.3-5.el7](https://centos.pkgs.org/7/centos-x86_64/nss-pem-1.0.3-5.el7.x86_64.rpm.html) causes a severe performance issue that occurs frequently on Red Hat/CentOS 7.x distributions. To learn more about this issue, see [Bug 1667121: Performance regression in libcurl](https://bugzilla.redhat.com/show_bug.cgi?id=1667121).

Performance-related bugs don't happen all the time, and they are very difficult to reproduce. If you experience this issue with omiagent, use the script omiHighCPUDiagnostics.sh, which collects a stack trace of omiagent when its CPU usage exceeds a given threshold.

1. Download the script: <br/>
`wget https://raw.githubusercontent.com/microsoft/OMS-Agent-for-Linux/master/tools/LogCollector/source/omiHighCPUDiagnostics.sh`

2. Run diagnostics for 24 hours with a 30% CPU threshold: <br/>
`bash omiHighCPUDiagnostics.sh --runtime-in-min 1440 --cpu-threshold 30`

3. The call stack is dumped to the omiagent_trace file. If you notice many Curl and NSS function calls, follow the resolution steps below.
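
To get a quick sense of how many Curl and NSS calls the trace contains, a small helper like the following may be useful. This is only a sketch: the function name `count_curl_nss_frames` is hypothetical, and it assumes the trace file is plain text with roughly one stack frame per line, which the article does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count trace lines that mention Curl or NSS symbols.
# Assumption: the trace is plain text with about one stack frame per line.
count_curl_nss_frames() {
    local trace_file="$1"
    # grep -c prints the number of matching lines; -i ignores case.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function's exit status clean while still printing 0.
    grep -ciE 'curl|nss' "$trace_file" || true
}
```

For example, `count_curl_nss_frames omiagent_trace` prints the number of matching lines. A high count relative to the total number of lines (`wc -l < omiagent_trace`) suggests the Curl/NSS regression described above, although reading the trace itself remains the more reliable check.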

### Resolution (step by step)

1. Upgrade the nss-pem package to [v1.0.3-5.el7_6.1](https://centos.pkgs.org/7/centos-updates-x86_64/nss-pem-1.0.3-5.el7_6.1.x86_64.rpm.html). <br/>
`sudo yum upgrade nss-pem`

2. If nss-pem is not available for upgrade (this mostly happens on CentOS), downgrade curl to 7.29.0-46. If you later run `yum update` by mistake, curl is upgraded to 7.29.0-51 and the issue occurs again. <br/>
`sudo yum downgrade curl libcurl`

3. Restart OMI: <br/>
`sudo scxadmin -restart`
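
Before and after applying these steps, it can help to confirm which nss-pem build is installed. The following is a minimal sketch, not part of the agent tooling: the function name `is_affected_nss_pem` is hypothetical, and it treats only the exact build named above (1.0.3-5.el7) as affected, which is an assumption because other builds may also carry the regression.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given version-release string
# is the nss-pem build named in this article as containing the regression.
# Assumption: only the exact string 1.0.3-5.el7 is treated as affected.
is_affected_nss_pem() {
    local version_release="$1"
    [ "$version_release" = "1.0.3-5.el7" ]
}
```

On an RPM-based system, the installed version-release string can be read with `rpm -q --queryformat '%{VERSION}-%{RELEASE}' nss-pem` and passed to the function, for example `is_affected_nss_pem "$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' nss-pem)" && echo "affected build installed"`. Running `rpm -q curl libcurl` in the same way shows whether the curl downgrade from step 2 is still in place.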
## Issue: You are not seeing any data in the Azure portal

### Probable causes
