How to troubleshoot why a Flink job crashed #2544
Unanswered
ZeYuYang1024 asked this question in Q&A
Replies: 0 comments
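I started a Flink streaming job on a server. After it had been running for nearly a month, a few of its task manager processes suddenly died.
The job handles a fairly high data volume. The current resource configuration of the task manager is: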
Physical Memory: 4GB
JVM Heap Size: 1.56GB
Flink Managed Memory: 1.35GB
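As a rough sketch of the arithmetic behind these figures: subtracting the heap and managed memory from the physical memory shows how much is left for everything else in the process (JVM metaspace, thread stacks, direct buffers and other native allocations). This assumes the 4GB is the ceiling for the whole task manager process, which the figures above do not confirm; the function name `remaining_memory_gb` is only illustrative.

```python
# Figures as reported for the task manager, in GB.
physical_memory_gb = 4.0
jvm_heap_gb = 1.56
managed_memory_gb = 1.35


def remaining_memory_gb(total, *components):
    """Return what is left of `total` after subtracting the listed components."""
    return round(total - sum(components), 2)


# Assumption: physical memory is the hard limit for the whole process.
remainder = remaining_memory_gb(physical_memory_gb, jvm_heap_gb, managed_memory_gb)
print(f"Left outside heap and managed memory: {remainder} GB")
```

If native usage grows past that remainder, the process can be terminated from outside the JVM, which would not necessarily leave an OOM exception in the task manager log.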
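After the task manager died, I checked its log files but found no obvious exception such as an OOM. The cloud platform's monitoring did not show any abnormal CPU or memory usage either.
Troubleshooting steps taken so far:
Checked the task manager log files: no obvious exceptions
Checked the cloud platform's CPU and memory monitoring: nothing abnormal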