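A colleague asked today: one of the servers on a project takes a long time to boot its operating system. How can that be analyzed?
Sure enough, the good questions come from practice.
After some searching: on any systemd-based system, systemd-analyze can be used to analyze boot time. According to the man page, systemd-analyze blame and systemd-analyze critical-chain are the two subcommands that help most here.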
systemd-analyze blame:
This command prints a list of all running units, ordered by the time they took to initialize. This information may be used to optimize boot-up times.
systemd-analyze critical-chain:
This command prints a tree of the time-critical chain of units (for each of the specified UNITs or for the default target otherwise). The time after the unit is active or started is printed after the "@" character. The time the unit takes to start is printed after the "+" character.
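The blame listing mixes time formats (813ms, 50.902s, 1min 41.449s), which makes it awkward to filter or compare by eye. Below is a minimal Python sketch, not part of systemd, that converts such lines to seconds. The helper names parse_duration and parse_blame and the 10-second threshold are my own assumptions for illustration, and only the suffixes that show up in listings like these (us, ms, s, min, h) are handled.

```python
import re

# Multipliers to seconds for the time suffixes handled by this sketch.
_UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1.0, "min": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(min|ms|us|h|s)$")


def parse_duration(text):
    """Convert a span such as '1min 41.449s' or '813ms' to seconds."""
    total = 0.0
    for token in text.split():
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognized duration token: {token!r}")
        value, suffix = match.groups()
        total += float(value) * _UNIT_SECONDS[suffix]
    return total


def parse_blame(output):
    """Return (seconds, unit) pairs from blame-style text, slowest first."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The unit name is the last whitespace-separated field;
        # everything before it is the duration.
        duration, unit = line.rsplit(None, 1)
        rows.append((parse_duration(duration), unit))
    return sorted(rows, reverse=True)


# A few lines in the format printed by "systemd-analyze blame".
sample = """\
1min 41.449s network.service
50.468s unbound-anchor.service
814ms postfix.service
"""

for seconds, unit in parse_blame(sample):
    if seconds >= 10:  # hypothetical threshold for "worth a look"
        print(f"{seconds:8.3f}s  {unit}")
```

In practice the sample string would be replaced by the saved output of systemd-analyze blame from the machine under investigation.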
Configuration: 4-socket x86 server
Operating system: CentOS Linux release 8.5.2111
# systemd-analyze blame|head -n 15
50.902s unbound-anchor.service
21.032s mysql-monitor-agent.service
15.143s DmServiceDMSERVER.service
15.137s DmAPService.service
10.305s kdump.service
8.214s systemd-udev-settle.service
6.170s NetworkManager-wait-online.service
6.032s mysql-monitor-server.service
4.536s mysqld.service
2.194s dnf-makecache.service
2.117s tuned.service
1.328s dracut-initqueue.service
1.226s lvm2-monitor.service
813ms rdma-load-modules@rdma.service
672ms lvm2-pvscan@8:3.service
# systemd-analyze critical-chain unbound-anchor.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.
unbound-anchor.service +50.902s
└─basic.target @9.108s
  └─sockets.target @9.108s
    └─spice-vdagentd.socket @9.108s
      └─sysinit.target @9.105s
        └─systemd-update-utmp.service @9.100s +5ms
          └─auditd.service @9.053s +45ms
            └─systemd-tmpfiles-setup.service @9.024s +24ms
              └─import-state.service @9.010s +13ms
                └─local-fs.target @9.009s
                  └─boot.mount @8.956s +53ms
                    └─systemd-fsck@dev-disk-by\x2duuid-e2c118b0\x2dc161\x2d4599\x2dacfe\x2dd574ba48cb64.service >
                      └─local-fs-pre.target @8.923s
                        └─lvm2-monitor.service @245ms +1.226s
                          └─dm-event.socket @234ms
                            └─-.mount
                              └─system.slice
                                └─-.slice
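Each line of the chain above carries up to three pieces of information: the unit name, the activation time after "@", and the startup duration after "+". As a rough sketch of how those fields could be pulled apart for further processing, the hypothetical helper parse_chain_line below splits one line with a regular expression. It is my own illustration rather than anything systemd ships, it keeps the times as raw strings, and it does not handle lines that the pager has truncated with a trailing ">".

```python
import re

# Optional tree prefix, unit name, optional "@<time>", optional "+<time>".
_LINE = re.compile(
    r"^[\s└─]*(?P<unit>\S+)"
    r"(?:\s+@(?P<at>[^+]+?))?"
    r"(?:\s+\+(?P<took>.+?))?\s*$"
)


def parse_chain_line(line):
    """Split one critical-chain line into (unit, activated_at, took).

    Parts that are absent from the line come back as None.
    """
    match = _LINE.match(line)
    if match is None:
        raise ValueError(f"not a critical-chain line: {line!r}")
    return match.group("unit"), match.group("at"), match.group("took")


# Lines copied from the critical-chain output above.
chain = [
    "unbound-anchor.service +50.902s",
    "└─basic.target @9.108s",
    "└─auditd.service @9.053s +45ms",
    "└─lvm2-monitor.service @245ms +1.226s",
    "└─-.mount",
]

for entry in chain:
    unit, activated_at, took = parse_chain_line(entry)
    print(f"{unit:32} at={activated_at} took={took}")
```

Only the units with a "+" value spent measurable time starting; targets and slices show just an "@" time or nothing at all, which is why both fields are optional in the pattern.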
Configuration: 4-socket x86 server
Operating system: CentOS Linux release 7.9.2009 (Core)
# systemd-analyze blame|head -n 15
1min 41.449s network.service
50.468s unbound-anchor.service
38.996s initial-setup.service
27.898s kdump.service
6.214s NetworkManager-wait-online.service
4.439s systemd-udev-settle.service
3.237s tuned.service
2.922s dev-mapper-centos\x2droot.device
1.715s lvm2-pvscan@8:3.service
1.112s containerd.service
1.005s docker.socket
1.004s lvm2-monitor.service
859ms dracut-initqueue.service
814ms postfix.service
756ms fwupd.service
# systemd-analyze critical-chain network.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.
network.service +1min 41.449s
└─network-pre.target @7.418s
Configuration: 2-socket Phytium server
Operating system: Kylin Linux Advanced Server release V10 (Sword)
# systemd-analyze blame|head -n 15
19.884s NetworkManager-wait-online.service
14.006s rasdaemon.service
6.587s dracut-initqueue.service
3.324s kdump.service
2.607s plymouth-quit-wait.service
1.930s network.service
1.525s tuned.service
1.475s lvm2-monitor.service
995ms pmcd.service
791ms hdd.mount
637ms lvm2-pvscan@8:3.service
574ms initrd-switch-root.service
560ms systemd-sysctl.service
540ms dracut-pre-pivot.service
510ms udisks2.service
# systemd-analyze critical-chain NetworkManager-wait-online.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.
NetworkManager-wait-online.service +19.884s
└─NetworkManager.service @4.107s +66ms
  └─network-pre.target @4.104s
Configuration: 2-socket Kunpeng server
Operating system: NFS Server release 3.1 (RTM4-A1)
# systemd-analyze blame|head -n 15
47.635s systemd-fsck-root.service
26.005s NetworkManager-wait-online.service
8.968s plymouth-quit-wait.service
2.028s lvm2-pvscan@8:3.service
1.709s docker.service
1.650s fwupd.service
1.081s postfix.service
1.038s systemd-udev-settle.service
935ms lvm2-monitor.service
877ms dev-mapper-nfs\x2d\x2d3.1\x2droot.device
614ms tuned.service
590ms bolt.service
505ms rdma.service
496ms containerd.service
427ms network.service
# systemd-analyze critical-chain systemd-fsck-root.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.
systemd-fsck-root.service +47.635s
└─systemd-readahead-replay.service @1.257s +39ms
  └─system.slice
    └─-.slice
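When the same two reports have to be gathered from several machines, as in the comparisons above, running the commands from a small script saves retyping. The following is a minimal sketch assuming Python 3.7 or later on the target host. The function names analyze_command and run_analyze are hypothetical, and --no-pager is passed so that a pager cannot cut long lines short.

```python
import subprocess


def analyze_command(verb, unit=None):
    """Build the argv for 'systemd-analyze --no-pager <verb> [unit]'."""
    argv = ["systemd-analyze", "--no-pager", verb]
    if unit is not None:
        argv.append(unit)
    return argv


def run_analyze(verb, unit=None):
    """Run systemd-analyze on the local machine and return its stdout."""
    result = subprocess.run(
        analyze_command(verb, unit),
        capture_output=True,  # needs Python 3.7+
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Example use on a systemd host (left as comments so that importing
# this sketch never runs a command):
#   print(run_analyze("blame"))
#   print(run_analyze("critical-chain", "network.service"))
```

Keeping the command construction separate from the execution makes it possible to check the arguments without a systemd host at hand.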