https://www.jianshu.com/p/b93342d43e13
有一台机器,在某个时间点OS类似无响应,造成使用者感觉在该时间点机器应该发生重启,就此问题进行分析。
--BMC日志确认机器在该时间点没有发生重启
--OS日志在该时间点也没有记录到重启,但日志记录了一些异常,如下所示
Oct 23 12:46:59 localhost kernel: device ens3f1 left promiscuous mode
Oct 23 13:11:32 localhost rsyslogd: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Oct 23 13:11:32 localhost rsyslogd: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Oct 23 13:46:25 localhost kernel: device ens6f1 entered promiscuous mode
Oct 23 13:46:28 localhost kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 907875.014 msecs
Oct 23 13:46:28 localhost kernel: perf: interrupt took too long (120883 > 7575), lowering kernel.perf_event_max_sample_rate to 1000
Oct 23 13:46:31 localhost kernel: INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 982076.016 msecs
Oct 23 13:46:31 localhost kernel: INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 982837.016 msecs
Oct 23 13:46:37 localhost kernel: device ens6f1 left promiscuous mode
Oct 23 13:46:38 localhost kernel: device ens3f1 entered promiscuous mode
Oct 23 13:46:59 localhost kernel: device ens3f1 left promiscuous mode
Oct 23 13:50:19 localhost kernel: perf: interrupt took too long (151272 > 151103), lowering kernel.perf_event_max_sample_rate to 1000
Oct 23 13:52:39 localhost start_filebeat.sh: /opt/pamon/filebeat-redis/start_filebeat.sh: line 14: 72199 Killed $filebeat_exe -c $filebeat_config
Oct 23 13:55:26 localhost kernel: perf: interrupt took too long (189323 > 189090), lowering kernel.perf_event_max_sample_rate to 1000
Oct 23 14:17:02 localhost auditd[1623]: Audit daemon rotating log files
Oct 23 14:30:17 localhost kernel: perf: interrupt took too long (237320 > 236653), lowering kernel.perf_event_max_sample_rate to 1000
Oct 23 14:37:39 localhost su: