[1036] Linux Boot Time Analysis


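Editor's summary

**A colleague asked today: one server in a project takes a long time to boot its operating system. How can this be analyzed?**

**Problem analysis:**

* The server takes a long time to boot.
* The `systemd-analyze blame` and `systemd-analyze critical-chain` commands show which units dominate the boot process and how long each one takes.

**Solution:**

1. Run `systemd-analyze blame` to list units by initialization time.
2. Run `systemd-analyze critical-chain` to display the time-critical chain of units.
3. Combine both results to decide where the boot time can be reduced.

**Steps:**

**1. List units by initialization time with `systemd-analyze blame`:**

```
systemd-analyze blame
```

**2. Display the critical chain with `systemd-analyze critical-chain`:**

```
systemd-analyze critical-chain
```

**3. Interpret the results:**

* `blame` lists units ordered by the time they took to initialize, not by the order in which they started.
* `critical-chain` shows each unit in the time-critical chain, with the time it became active after "@" and the time it took to start after "+".

**4. Optimize the boot time:** Based on the analysis, target specific services or steps, for example:

* Cut down on services that delay startup.
* Optimize the resources required during startup.
* Reduce waiting time during the boot process.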

Main text

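Overview

A colleague asked today: one server in a project takes a long time to boot its operating system. How can this be analyzed?

Sure enough, good questions come from practice.

After some research: on any systemd-based system, systemd-analyze can be used to analyze boot time. According to the man page, systemd-analyze blame and systemd-analyze critical-chain are the two subcommands best suited to this analysis: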

 systemd-analyze blame:
     This command prints a list of all running units, ordered by the time they took to initialize. This information may be used to optimize boot-up times.

 systemd-analyze critical-chain:
     This command prints a tree of the time-critical chain of units (for each of the specified UNITs or for the default target otherwise). The time after the unit is active or started is printed after the "@" character. The time the unit takes to start is printed after the "+" character.
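
The blame output mixes units such as `ms`, `s`, and `min` (for example `1min 41.449s`), which makes it awkward to filter or compare numerically. The sketch below normalizes each line to plain seconds. The function name `blame_seconds` is hypothetical, and the parser assumes the line layout shown in the samples in this article (duration tokens followed by the unit name); it is an illustration, not part of systemd.

```shell
# Hypothetical helper: convert `systemd-analyze blame` lines read from stdin
# into "<seconds><TAB><unit>" lines.
# Assumes each line is one or more duration tokens followed by the unit name.
blame_seconds() {
    LC_ALL=C awk '
    # Convert one duration token such as "813ms", "41.449s" or "1min" to seconds.
    function to_seconds(tok) {
        if (tok ~ /^[0-9.]+ms$/)  return substr(tok, 1, length(tok) - 2) / 1000
        if (tok ~ /^[0-9.]+us$/)  return substr(tok, 1, length(tok) - 2) / 1000000
        if (tok ~ /^[0-9.]+min$/) return substr(tok, 1, length(tok) - 3) * 60
        if (tok ~ /^[0-9.]+h$/)   return substr(tok, 1, length(tok) - 1) * 3600
        if (tok ~ /^[0-9.]+s$/)   return substr(tok, 1, length(tok) - 1) + 0
        return 0
    }
    NF >= 2 {
        total = 0
        # Every field except the last one is part of the duration.
        for (i = 1; i < NF; i++) total += to_seconds($i)
        printf "%.3f\t%s\n", total, $NF
    }'
}
```

On a live system this could be used as `systemd-analyze blame | blame_seconds | awk '$1 >= 5'` to keep only units that took five seconds or longer.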

Sample analysis

Server 1

Configuration: four-socket x86 server

Operating system: CentOS Linux release 8.5.2111

#  systemd-analyze blame|head -n 15
         50.902s unbound-anchor.service
         21.032s mysql-monitor-agent.service
         15.143s DmServiceDMSERVER.service
         15.137s DmAPService.service
         10.305s kdump.service
          8.214s systemd-udev-settle.service
          6.170s NetworkManager-wait-online.service
          6.032s mysql-monitor-server.service
          4.536s mysqld.service
          2.194s dnf-makecache.service
          2.117s tuned.service
          1.328s dracut-initqueue.service
          1.226s lvm2-monitor.service
           813ms rdma-load-modules@rdma.service
           672ms lvm2-pvscan@8:3.service
           
           
# systemd-analyze critical-chain unbound-anchor.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

unbound-anchor.service +50.902s
└─basic.target @9.108s
  └─sockets.target @9.108s
    └─spice-vdagentd.socket @9.108s
      └─sysinit.target @9.105s
        └─systemd-update-utmp.service @9.100s +5ms
          └─auditd.service @9.053s +45ms
            └─systemd-tmpfiles-setup.service @9.024s +24ms
              └─import-state.service @9.010s +13ms
                └─local-fs.target @9.009s
                  └─boot.mount @8.956s +53ms
                    └─systemd-fsck@dev-disk-by\x2duuid-e2c118b0\x2dc161\x2d4599\x2dacfe\x2dd574ba48cb64.service >
                      └─local-fs-pre.target @8.923s
                        └─lvm2-monitor.service @245ms +1.226s
                          └─dm-event.socket @234ms
                            └─-.mount
                              └─system.slice
                                └─-.slice
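
In a long chain like the one above, most entries take only milliseconds, so it can help to keep just the units whose "+" time exceeds a threshold. The following is a minimal sketch under the assumption that the chain uses the layout shown here (tree prefix, unit name, optional "@" and "+" times). The function name `chain_slow_units` and its threshold argument are hypothetical. Note also that, as the systemd-analyze man page warns, a unit may appear slow only because it waits for another unit, so the result is a starting point rather than a verdict.

```shell
# Hypothetical helper: read `systemd-analyze critical-chain` output from stdin
# and print "<seconds><TAB><unit>" for units whose "+" time is at least $1
# seconds (default: 1).
chain_slow_units() {
    LC_ALL=C awk -v min="${1:-1}" '
    # Convert one duration token such as "53ms", "1.226s" or "1min" to seconds.
    function to_seconds(tok) {
        if (tok ~ /^[0-9.]+ms$/)  return substr(tok, 1, length(tok) - 2) / 1000
        if (tok ~ /^[0-9.]+us$/)  return substr(tok, 1, length(tok) - 2) / 1000000
        if (tok ~ /^[0-9.]+min$/) return substr(tok, 1, length(tok) - 3) * 60
        if (tok ~ /^[0-9.]+h$/)   return substr(tok, 1, length(tok) - 1) * 3600
        if (tok ~ /^[0-9.]+s$/)   return substr(tok, 1, length(tok) - 1) + 0
        return 0
    }
    NF >= 2 {
        name = $1
        # Drop the tree-drawing prefix in front of the unit name.
        sub(/^[^A-Za-z0-9_.-]+/, "", name)
        mode = ""; took = 0; has_plus = 0
        for (i = 2; i <= NF; i++) {
            tok = $i
            c = substr(tok, 1, 1)
            if (c == "@")      { mode = "at";   tok = substr(tok, 2) }
            else if (c == "+") { mode = "plus"; tok = substr(tok, 2); has_plus = 1 }
            # Only tokens belonging to the "+" part count as start-up time.
            if (mode == "plus") took += to_seconds(tok)
        }
        if (has_plus && took >= min) printf "%.3f\t%s\n", took, name
    }'
}
```

A possible invocation is `systemd-analyze critical-chain unbound-anchor.service | chain_slow_units 1`, which for the chain above would leave only unbound-anchor.service and lvm2-monitor.service.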
           

Server 2

Configuration: four-socket x86 server

Operating system: CentOS Linux release 7.9.2009 (Core)

# systemd-analyze blame|head -n 15
    1min 41.449s network.service
         50.468s unbound-anchor.service
         38.996s initial-setup.service
         27.898s kdump.service
          6.214s NetworkManager-wait-online.service
          4.439s systemd-udev-settle.service
          3.237s tuned.service
          2.922s dev-mapper-centos\x2droot.device
          1.715s lvm2-pvscan@8:3.service
          1.112s containerd.service
          1.005s docker.socket
          1.004s lvm2-monitor.service
           859ms dracut-initqueue.service
           814ms postfix.service
           756ms fwupd.service

# systemd-analyze critical-chain network.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

network.service +1min 41.449s
└─network-pre.target @7.418s

Server 3

Configuration: two-socket Phytium server

Operating system: Kylin Linux Advanced Server release V10 (Sword)

# systemd-analyze blame|head -n 15
19.884s NetworkManager-wait-online.service
14.006s rasdaemon.service
 6.587s dracut-initqueue.service
 3.324s kdump.service
 2.607s plymouth-quit-wait.service
 1.930s network.service
 1.525s tuned.service
 1.475s lvm2-monitor.service
  995ms pmcd.service
  791ms hdd.mount
  637ms lvm2-pvscan@8:3.service
  574ms initrd-switch-root.service
  560ms systemd-sysctl.service
  540ms dracut-pre-pivot.service
  510ms udisks2.service
  
# systemd-analyze critical-chain NetworkManager-wait-online.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

NetworkManager-wait-online.service +19.884s
└─NetworkManager.service @4.107s +66ms
  └─network-pre.target @4.104s

Server 4

Configuration: two-socket Kunpeng server

Operating system: NFS Server release 3.1 (RTM4-A1)

# systemd-analyze blame|head -n 15
         47.635s systemd-fsck-root.service
         26.005s NetworkManager-wait-online.service
          8.968s plymouth-quit-wait.service
          2.028s lvm2-pvscan@8:3.service
          1.709s docker.service
          1.650s fwupd.service
          1.081s postfix.service
          1.038s systemd-udev-settle.service
           935ms lvm2-monitor.service
           877ms dev-mapper-nfs\x2d\x2d3.1\x2droot.device
           614ms tuned.service
           590ms bolt.service
           505ms rdma.service
           496ms containerd.service
           427ms network.service

# systemd-analyze critical-chain systemd-fsck-root.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

systemd-fsck-root.service +47.635s
└─systemd-readahead-replay.service @1.257s +39ms
  └─system.slice
    └─-.slice
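
With blame output collected from several servers, one way to spot recurring offenders is to count on how many machines each unit exceeds a threshold. In the samples above, for instance, NetworkManager-wait-online.service takes more than five seconds on all four servers. The sketch below assumes each server's blame output has been saved to its own file; the function name `slow_on_many` and the file layout are hypothetical.

```shell
# Hypothetical helper: slow_on_many <min_seconds> <blame_file>...
# Prints "<count> <unit>" for every unit that took at least <min_seconds>
# in one or more of the given files, most frequent first.
slow_on_many() {
    min=$1
    shift
    LC_ALL=C awk -v min="$min" '
    # Convert one duration token such as "813ms", "6.170s" or "1min" to seconds.
    function to_seconds(tok) {
        if (tok ~ /^[0-9.]+ms$/)  return substr(tok, 1, length(tok) - 2) / 1000
        if (tok ~ /^[0-9.]+us$/)  return substr(tok, 1, length(tok) - 2) / 1000000
        if (tok ~ /^[0-9.]+min$/) return substr(tok, 1, length(tok) - 3) * 60
        if (tok ~ /^[0-9.]+h$/)   return substr(tok, 1, length(tok) - 1) * 3600
        if (tok ~ /^[0-9.]+s$/)   return substr(tok, 1, length(tok) - 1) + 0
        return 0
    }
    NF >= 2 {
        total = 0
        for (i = 1; i < NF; i++) total += to_seconds($i)
        # Count each unit at most once per input file.
        if (total >= min && !((FILENAME, $NF) in seen)) {
            seen[FILENAME, $NF] = 1
            hits[$NF]++
        }
    }
    END { for (u in hits) printf "%d %s\n", hits[u], u }
    ' "$@" | LC_ALL=C sort -k1,1nr -k2,2
}
```

For example, after running `systemd-analyze blame > server1.txt` on each machine, `slow_on_many 5 server1.txt server2.txt server3.txt server4.txt` would list the units that cost five seconds or more, ranked by how many servers they affect.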
