Description
Environment
- Telemetry Streaming Version: v1.33.0 and v1.34.0
- BIG-IP Version: BIG-IP 16.1.4.1 Build 0.58.5 Engineering Hotfix
- Resource provisioning: MGMT: Large, LTM: Nominal
- Platform: C117 (4 CPUs, 16 GB RAM)
Summary
I configured a Prometheus Pull Consumer declaration and uploaded it to the API.
When I try to pull the metrics, I get "HTTP ERROR 500 AsyncContext Timeout" after about 30 seconds.
The icrd_child processes consume very high CPU and keep running long after the error is displayed in the web browser.
The error always occurs with "ABC_Pull_Consumer" (default metrics set by F5, nothing changed) and sometimes occurs with "ABC_Pull_Consumer_2". This is on the standby system, with no load from user traffic.
{
    "class": "Telemetry",
    "controls": {
        "class": "Controls",
        "logLevel": "info",
        "debug": false
    },
    "ABC_System": {
        "class": "Telemetry_System",
        "enable": "true",
        "systemPoller": [ "ABC_System_Poller1", "ABC_System_Poller2" ]
    },
    "ABC_System_Poller1": {
        "class": "Telemetry_System_Poller",
        "trace": false,
        "interval": 0,
        "enable": true,
        "host": "localhost",
        "port": 8100,
        "protocol": "http",
        "allowSelfSignedCert": true
    },
    "ABC_System_Poller2": {
        "class": "Telemetry_System_Poller",
        "trace": false,
        "interval": 0,
        "host": "localhost",
        "port": 8100,
        "protocol": "http",
        "allowSelfSignedCert": true,
        "endpointList": [ "Endpoints_ABC" ],
        "enable": true
    },
    "Endpoints_ABC": {
        "class": "Telemetry_Endpoints",
        "items": {
            "profileHttpStats": {
                "path": "/mgmt/tm/ltm/profile/http/stats"
            },
            "system_performanceConnections": {
                "path": "/mgmt/tm/sys/performance/connections/stats?options=detail"
            },
            "system_tmmTraffic": {
                "path": "/mgmt/tm/sys/tmm-traffic"
            },
            "system_performanceThroughput": {
                "path": "/mgmt/tm/sys/performance/throughput/stats?options=detail"
            },
            "nodeStats": {
                "path": "/mgmt/tm/ltm/node/stats"
            },
            "node": {
                "path": "/mgmt/tm/ltm/node"
            }
        }
    },
    "ABC_Pull_Consumer": {
        "class": "Telemetry_Pull_Consumer",
        "trace": false,
        "type": "Prometheus",
        "enable": true,
        "systemPoller": [ "ABC_System_Poller1" ]
    },
    "ABC_Pull_Consumer_2": {
        "class": "Telemetry_Pull_Consumer",
        "trace": false,
        "type": "Prometheus",
        "enable": true,
        "systemPoller": [ "ABC_System_Poller2" ]
    }
}
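To make the difference between the two consumers easier to see, here is a small Python sketch that lists which poller each pull consumer references and whether that poller has a custom endpointList. The helper name `pollers_by_consumer` and the "default stats" label are illustrative and not part of Telemetry Streaming. The dictionary is a trimmed copy of the declaration above, and the sketch only handles pollers referenced by name.

```python
def pollers_by_consumer(declaration):
    """Map each Telemetry_Pull_Consumer to its pollers and their endpoint lists.

    Hypothetical helper for reading a declaration; it only follows pollers
    that are referenced by name (strings), as in the declaration above.
    """
    summary = {}
    for name, obj in declaration.items():
        if not isinstance(obj, dict) or obj.get("class") != "Telemetry_Pull_Consumer":
            continue
        pollers = obj.get("systemPoller", [])
        if isinstance(pollers, str):
            pollers = [pollers]
        summary[name] = {
            # A poller without "endpointList" is labeled "default stats" here.
            poller: declaration.get(poller, {}).get("endpointList", "default stats")
            for poller in pollers
            if isinstance(poller, str)
        }
    return summary


# Trimmed copy of the declaration above (only the keys this sketch reads).
declaration = {
    "class": "Telemetry",
    "ABC_System_Poller1": {"class": "Telemetry_System_Poller", "interval": 0},
    "ABC_System_Poller2": {
        "class": "Telemetry_System_Poller",
        "interval": 0,
        "endpointList": ["Endpoints_ABC"],
    },
    "ABC_Pull_Consumer": {
        "class": "Telemetry_Pull_Consumer",
        "type": "Prometheus",
        "systemPoller": ["ABC_System_Poller1"],
    },
    "ABC_Pull_Consumer_2": {
        "class": "Telemetry_Pull_Consumer",
        "type": "Prometheus",
        "systemPoller": ["ABC_System_Poller2"],
    },
}

for consumer, pollers in pollers_by_consumer(declaration).items():
    print(consumer, pollers)
```

In this declaration, "ABC_Pull_Consumer" maps to a poller with no endpointList (the default metric set), while "ABC_Pull_Consumer_2" maps to a poller restricted to "Endpoints_ABC".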
Steps To Reproduce
Upload the declaration, then access the pull consumer endpoints using a web browser (or Prometheus).
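The access step can also be scripted so that the response time and status are recorded. The following is a minimal Python sketch, not an official tool. It assumes the pull consumer is reachable at `/mgmt/shared/telemetry/pullconsumer/<name>` on the management address and that HTTP basic authentication is accepted. The host name, credentials, and the helper names `pull_consumer_url`, `make_http_fetch`, and `timed_fetch` are placeholders chosen for illustration.

```python
import base64
import ssl
import time
import urllib.error
import urllib.request


def pull_consumer_url(host, consumer_name):
    """Build the pull consumer URL (path assumed from the Telemetry Streaming docs)."""
    return f"https://{host}/mgmt/shared/telemetry/pullconsumer/{consumer_name}"


def make_http_fetch(user, password, timeout=60.0):
    """Return a fetch(url) -> (status, body) function using basic auth.

    Certificate verification is disabled here only because lab BIG-IP systems
    often use self-signed management certificates (an assumption).
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{password}".encode()).decode()

    def fetch(url):
        req = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
        try:
            with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
                return resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # Non-2xx responses such as the HTTP 500 in this report end up here.
            return err.code, err.read()

    return fetch


def timed_fetch(url, fetch, clock=time.monotonic):
    """Call fetch(url) and report status, body size, and elapsed seconds."""
    start = clock()
    try:
        status, body = fetch(url)
    except Exception as exc:  # e.g. socket timeout or connection refused
        return {"ok": False, "error": str(exc), "seconds": clock() - start}
    return {
        "ok": status == 200,
        "status": status,
        "bytes": len(body),
        "seconds": clock() - start,
    }


# Example usage (placeholder host and credentials; uncomment to run):
# fetch = make_http_fetch("admin", "password")
# for name in ("ABC_Pull_Consumer", "ABC_Pull_Consumer_2"):
#     print(name, timed_fetch(pull_consumer_url("bigip.example.com", name), fetch))
```

Running this in a loop against both consumers should show whether the roughly 30-second failure is consistent for "ABC_Pull_Consumer" and intermittent for "ABC_Pull_Consumer_2".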
Error Log
/var/log/restjavad.0.log shows this error:
[WARNING][5296][18 Jan 2024 08:51:26 UTC][8100/shared/iapp/build-package BuildRpmTaskCollectionWorker] Failed to execute the build command 'rpmbuild -bb --define '_tmppath /var/system/tmp' --define 'main /var/config/rest/iapps/f5-telemetry' --define '_topdir /var/config/rest/node/tmp' '/var/config/rest/node/tmp/8fb6a962-332b-4029-adcf-02f29cfd1e60.spec'', Threw:com.f5.rest.workers.shell.CommandExecuteException: Command execution process killed
at com.f5.rest.workers.shell.ShellExecutor.finishExecution(ShellExecutor.java:282)
at com.f5.rest.workers.shell.ShellExecutor.access$000(ShellExecutor.java:34)
at com.f5.rest.workers.shell.ShellExecutor$1.onProcessFailed(ShellExecutor.java:321)
at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:203)
at java.lang.Thread.run(Thread.java:748)
Expected Behavior
Telemetry Streaming should provide the metrics in Prometheus format within a few seconds, without consuming all CPU.
Actual Behavior
No metrics are returned, and CPU usage stays high for a long time (5-10 minutes).
