https://docs.openeuler.org/zh/docs/22.03_LTS_SP1/docs/A-Tune/%E4%BD%BF%E7%94%A8%E6%96%B9%E6%B3%95.html
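Users access A-Tune's functions through the command-line client atune-adm. This chapter describes the functions of the A-Tune client and how to use them.
Using A-Tune requires root privileges.
To list the commands that atune-adm supports, run atune-adm help, atune-adm --help, or atune-adm -h.
All command examples in this chapter assume single-node deployment. In distributed deployment, specify the server IP address and port number, for example:
# atune-adm -a 192.168.3.196 -p 60001 list
The define, update, undefine, collection, train, and upgrade commands cannot be executed remotely.
In the command formats, [ ] marks an optional parameter and <> marks a required parameter; the actual values depend on your environment.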
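The remote-execution rule described above can be sketched as a small Python helper that assembles an atune-adm argument list. This is a minimal sketch, not part of A-Tune: the names build_atune_cmd and LOCAL_ONLY are hypothetical, and only the -a and -p flags and the list of commands that cannot run remotely come from this chapter.

```python
# Hypothetical helper (not part of A-Tune): assembles an atune-adm
# argument list and enforces the remote-execution restriction.

# Commands that this chapter says cannot be executed remotely.
LOCAL_ONLY = {"define", "update", "undefine", "collection", "train", "upgrade"}


def build_atune_cmd(command, *args, address=None, port=None):
    """Return the argv list for an atune-adm call.

    address and port map to the -a and -p flags used in distributed mode.
    """
    remote = address is not None or port is not None
    if remote and command in LOCAL_ONLY:
        raise ValueError(f"'{command}' does not support remote execution")

    argv = ["atune-adm"]
    if address is not None:
        argv += ["-a", str(address)]
    if port is not None:
        argv += ["-p", str(port)]
    argv.append(command)
    argv += [str(a) for a in args]
    return argv


if __name__ == "__main__":
    # Mirrors the distributed example: atune-adm -a 192.168.3.196 -p 60001 list
    print(" ".join(build_atune_cmd("list", address="192.168.3.196", port=60001)))
```

On a host where atune-adm is installed, the returned list could be passed to subprocess.run by a process running as root.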
Lists the profiles that the system currently supports and shows which profile is active.
atune-adm list
# atune-adm list
Support profiles:
+------------------------------------------------+-----------+
| ProfileName | Active |
+================================================+===========+
| arm-native-android-container-robox | false |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-fio | false |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-lmbench | false |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-netperf | false |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-stream | false |
+------------------------------------------------+-----------+
| basic-test-suite-euleros-baseline-unixbench | false |
+------------------------------------------------+-----------+
| basic-test-suite-speccpu-speccpu2006 | false |
+------------------------------------------------+-----------+
| basic-test-suite-specjbb-specjbb2015 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-hdfs-dfsio-hdd | false |
+------------------------------------------------+-----------+
| big-data-hadoop-hdfs-dfsio-ssd | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-bayesian | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-kmeans | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql1 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql10 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql2 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql3 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql4 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql5 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql6 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql7 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql8 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-sql9 | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-tersort | false |
+------------------------------------------------+-----------+
| big-data-hadoop-spark-wordcount | false |
+------------------------------------------------+-----------+
| cloud-compute-kvm-host | false |
+------------------------------------------------+-----------+
| database-mariadb-2p-tpcc-c3 | false |
+------------------------------------------------+-----------+
| database-mariadb-4p-tpcc-c3 | false |
+------------------------------------------------+-----------+
| database-mongodb-2p-sysbench | false |
+------------------------------------------------+-----------+
| database-mysql-2p-sysbench-hdd | false |
+------------------------------------------------+-----------+
| database-mysql-2p-sysbench-ssd | false |
+------------------------------------------------+-----------+
| database-postgresql-2p-sysbench-hdd | false |
+------------------------------------------------+-----------+
| database-postgresql-2p-sysbench-ssd | false |
+------------------------------------------------+-----------+
| default-default | false |
+------------------------------------------------+-----------+
| docker-mariadb-2p-tpcc-c3 | false |
+------------------------------------------------+-----------+
| docker-mariadb-4p-tpcc-c3 | false |
+------------------------------------------------+-----------+
| hpc-gatk4-human-genome | false |
+------------------------------------------------+-----------+
| in-memory-database-redis-redis-benchmark | false |
+------------------------------------------------+-----------+
| middleware-dubbo-dubbo-benchmark | false |
+------------------------------------------------+-----------+
| storage-ceph-vdbench-hdd | false |
+------------------------------------------------+-----------+
| storage-ceph-vdbench-ssd | false |
+------------------------------------------------+-----------+
| virtualization-consumer-cloud-olc | false |
+------------------------------------------------+-----------+
| virtualization-mariadb-2p-tpcc-c3 | false |
+------------------------------------------------+-----------+
| virtualization-mariadb-4p-tpcc-c3 | false |
+------------------------------------------------+-----------+
| web-apache-traffic-server-spirent-pingpo | false |
+------------------------------------------------+-----------+
| web-nginx-http-long-connection | true |
+------------------------------------------------+-----------+
| web-nginx-https-short-connection | false |
+------------------------------------------------+-----------+
Note:
An Active value of true marks the currently active profile. In this example, the active profile is web-nginx-http-long-connection.
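When the list output has to be consumed by a script, the active profile can be picked out of the table text. The following is a minimal sketch that assumes the two-column layout shown above; the function name active_profiles is hypothetical, and the sample is a shortened copy of the output in this section.

```python
def active_profiles(list_output):
    """Return the profile names whose Active column is 'true'."""
    active = []
    for line in list_output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip table borders and the "Support profiles:" title
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[1] == "true":
            active.append(cells[0])
    return active


# Shortened copy of the atune-adm list output shown in this section.
SAMPLE = """\
Support profiles:
+------------------------------------------------+-----------+
| ProfileName | Active |
+================================================+===========+
| web-apache-traffic-server-spirent-pingpo | false |
+------------------------------------------------+-----------+
| web-nginx-http-long-connection | true |
+------------------------------------------------+-----------+
"""

if __name__ == "__main__":
    print(active_profiles(SAMPLE))
```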
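Collects real-time system statistics to identify the workload type, then performs automatic optimization.
atune-adm analysis [OPTIONS]
Identify the application using the default model:
# atune-adm analysis --characterization
Identify the application using the default model, then optimize automatically:
# atune-adm analysis
Identify the application using a self-trained model:
# atune-adm analysis --model /usr/libexec/atuned/analysis/models/new-model.m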
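A-Tune lets users define and train new models. Defining a new model involves the define, collection, and train commands described below.
Adds a user-defined application scenario and its corresponding profile optimization items.
atune-adm define <service_type> <application_name> <scenario_name> <profile_path>
Add a profile whose service_type is test_service, application_name is test_app, and scenario_name is test_scenario, with the optimization items configured in example.conf:
# atune-adm define test_service test_app test_scenario ./example.conf
example.conf can be written as follows (none of the optimization items is mandatory; they are for reference only). You can also run atune-adm info to see how existing profiles are written.
[main]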
# list its parent profile
[kernel_config]
# to change the kernel config
[bios]
# to change the bios config
[bootloader.grub2]
# to change the grub2 config
[sysfs]
# to change the /sys/* config
[systemctl]
# to change the system service status
[sysctl]
# to change the /proc/sys/* config
[script]
# the script extension of cpi
[ulimit]
# to change the resources limit of user
[schedule_policy]
# to change the schedule policy
[check]
# check the environment
[tip]
# the recommended optimization, which should be performed manually
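A profile file in the layout above can also be generated from a script. The sketch below uses Python's standard configparser module to render a few sections; render_profile is a hypothetical helper, and the sample keys and values are borrowed from the web-nginx-http-long-connection profile shown later in this chapter, not recommended settings.

```python
import configparser
import io


def render_profile(sections):
    """Render {section: {key: value}} as profile text in INI layout."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep keys exactly as written
    for name, options in sections.items():
        parser[name] = options
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Illustrative content only; values are copied from a profile in this chapter.
EXAMPLE_SECTIONS = {
    "main": {"include": "default-default"},
    "systemctl": {"irqbalance": "stop"},
    "sysctl": {"net.core.somaxconn": "65535"},
}

if __name__ == "__main__":
    # Write the text to ./example.conf before calling atune-adm define.
    print(render_profile(EXAMPLE_SECTIONS))
```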
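Collects the system's global resource usage and OS status information while the workload is running, and saves the results to a CSV output file that serves as the input dataset for model training.
Note:
- This command depends on the sampling tools perf, mpstat, vmstat, iostat, and sar.
- Only the Kunpeng 920 CPU model is currently supported. Run dmidecode -t processor to check the CPU model.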
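Because collection depends on the sampling tools named in the note, it can help to confirm they are on PATH before starting a long collection run. The following is a minimal sketch using shutil.which from the Python standard library; missing_tools is a hypothetical helper, and the tool list is taken from the note above.

```python
import shutil

# Sampling tools that the collection command depends on (from the note above).
REQUIRED_TOOLS = ("perf", "mpstat", "vmstat", "iostat", "sar")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing sampling tools:", ", ".join(missing))
    else:
        print("all sampling tools found")
```

The which parameter exists only so the lookup can be replaced in tests; normal use calls missing_tools() with no arguments.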
atune-adm collection <OPTIONS>
OPTIONS
# atune-adm collection --filename name --interval 5 --duration 1200 --output_path /home/data --disk sda --network eth0 --app_type test_type
Trains a model using the collected data. Collect data for at least two application types before training; otherwise training fails.
atune-adm train <OPTIONS>
Use the CSV files in the data directory as training input and store the generated model, new-model.m, in the model directory:
# atune-adm train --data_path /home/data --output_file /usr/libexec/atuned/analysis/models/new-model.m
Deletes a user-defined profile.
atune-adm undefine <profile>
Delete a user-defined profile:
# atune-adm undefine test_service-test_app-test_scenario
Displays the content of a profile.
atune-adm info <profile>
Display the content of the web-nginx-http-long-connection profile:
# atune-adm info web-nginx-http-long-connection
*** web-nginx-http-long-connection:
#
# nginx http long connection A-Tune configuration
#
[main]
include = default-default
[kernel_config]
#TODO CONFIG
[bios]
#TODO CONFIG
[bootloader.grub2]
iommu.passthrough = 1
[sysfs]
#TODO CONFIG
[systemctl]
sysmonitor = stop
irqbalance = stop
[sysctl]
fs.file-max = 6553600
fs.suid_dumpable = 1
fs.aio-max-nr = 1048576
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_local_port_range = 1024 65500
net.ipv4.tcp_max_tw_buckets = 5000
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 262144
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_mem = 362619 483495 725238
net.ipv4.tcp_rmem = 4096 87380 6291456
net.ipv4.tcp_wmem = 4096 16384 4194304
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
[script]
prefetch = off
ethtool = -X {network} hfunc toeplitz
[ulimit]
{user}.hard.nofile = 102400
{user}.soft.nofile = 102400
[schedule_policy]
#TODO CONFIG
[check]
#TODO CONFIG
[tip]
SELinux provides extra control and security features to linux kernel. Disabling SELinux will improve the performance but may cause security risks. = kernel
disable the nginx log = application
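Since the profile body follows an INI-style layout, the info output can be read programmatically. The sketch below parses it with Python's standard configparser after dropping the leading banner line; parse_profile is a hypothetical helper, and the sample is a shortened copy of the output above.

```python
import configparser


def parse_profile(info_output):
    """Parse the text printed by atune-adm info into a ConfigParser."""
    # Drop the leading "*** <profile-name>:" banner, which is not INI syntax.
    body = "\n".join(
        line for line in info_output.splitlines() if not line.startswith("***")
    )
    parser = configparser.ConfigParser(
        interpolation=None,       # keep placeholders such as {user} verbatim
        delimiters=("=",),        # only "=" separates keys from values
        comment_prefixes=("#",),  # "#TODO CONFIG" lines are comments
    )
    parser.optionxform = str      # preserve key case
    parser.read_string(body)
    return parser


# Shortened copy of the atune-adm info output shown above.
SAMPLE = """\
*** web-nginx-http-long-connection:
#
# nginx http long connection A-Tune configuration
#
[main]
include = default-default
[kernel_config]
#TODO CONFIG
[systemctl]
sysmonitor = stop
irqbalance = stop
[sysctl]
fs.file-max = 6553600
kernel.sem = 250 32000 100 128
net.core.somaxconn = 65535
[ulimit]
{user}.hard.nofile = 102400
"""

if __name__ == "__main__":
    profile = parse_profile(SAMPLE)
    for key, value in profile["sysctl"].items():
        print(f"{key} -> {value}")
```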
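Updates an existing profile as needed.
Replaces the optimization items in an existing profile with the content of new.conf.
atune-adm update <profile> <profile_path>
Update the optimization items of the profile named test_service-test_app-test_scenario with new.conf:
# atune-adm update test_service-test_app-test_scenario ./new.conf
Manually activates a profile, putting it in the active state.
atune-adm profile <profile>
For profile names, see the output of the list command.
Activate the profile configuration for web-nginx-http-long-connection:
# atune-adm profile web-nginx-http-long-connection
Rolls the current configuration back to the system's initial configuration.
atune-adm rollback
# atune-adm rollback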
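Updates the system database.
atune-adm upgrade <DB_FILE>
DB_FILE
Path to the new database file.
Update the database to new_sqlite.db:
# atune-adm upgrade ./new_sqlite.db
Checks the system's current CPU, BIOS, OS, and network interface card information.
atune-adm check
# atune-adm check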
cpu information:
cpu:0 version: Kunpeng 920-6426 speed: 2600000000 HZ cores: 64
cpu:1 version: Kunpeng 920-6426 speed: 2600000000 HZ cores: 64
system information:
DMIBIOSVersion: 0.59
OSRelease: 4.19.36-vhulk1906.3.0.h356.eulerosv2r8.aarch64
network information:
name: eth0 product: HNS GE/10GE/25GE RDMA Network Controller
name: eth1 product: HNS GE/10GE/25GE Network Controller
name: eth2 product: HNS GE/10GE/25GE RDMA Network Controller
name: eth3 product: HNS GE/10GE/25GE Network Controller
name: eth4 product: HNS GE/10GE/25GE RDMA Network Controller
name: eth5 product: HNS GE/10GE/25GE Network Controller
name: eth6 product: HNS GE/10GE/25GE RDMA Network Controller
name: eth7 product: HNS GE/10GE/25GE Network Controller
name: docker0 product:
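A-Tune can search for the best configuration automatically. This removes the need for repeated manual parameter adjustment and performance evaluation, and speeds up the search for the optimal configuration.
Uses the specified project file to search the parameter space dynamically and find the optimal solution for the current environment.
Note:
Before running the command, ensure that the following conditions are met:
- The server yaml configuration file has been edited and placed in the **/etc/atuned/tuning/** directory of the atuned service.
- The client yaml configuration file has been edited and placed in any directory on the atuned client.
atune-adm tuning [OPTIONS] <PROJECT_YAML>
OPTIONS
Note:
When options are used, -p must be followed by the specific project name, and the yaml file of that project must be specified.
PROJECT_YAML: client yaml configuration file.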
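Table 1 Server yaml file
maxiterations: maximum number of tuning iterations, used to cap the number of iterations the client runs. In general, more iterations give better optimization results but take longer. Configure this value according to the actual workload scenario.
For the object configuration items, see Table 2.
Table 2 object configuration items
Table 3 Client yaml file configuration
For the evaluations configuration items, see Table 4.
Table 4 evaluations configuration items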
Example server yaml configuration:
project: "compress"
maxiterations: 500
startworkload: ""
stopworkload: ""
object :
  -
    name : "compressLevel"
    info :
        desc : "The compresslevel parameter is an integer from 1 to 9 controlling the level of compression"
        get : "cat /root/A-Tune/examples/tuning/compress/compress.py | grep 'compressLevel=' | awk -F '=' '{print $2}'"
        set : "sed -i 's/compressLevel=\\s*[0-9]*/compressLevel=$value/g' /root/A-Tune/examples/tuning/compress/compress.py"
        needrestart : "false"
        type : "continuous"
        scope :
          - 1
          - 9
        dtype : "int"
  -
    name : "compressMethod"
    info :
        desc : "The compressMethod parameter is a string controlling the compression method"
        get : "cat /root/A-Tune/examples/tuning/compress/compress.py | grep 'compressMethod=' | awk -F '=' '{print $2}' | sed 's/\"//g'"
        set : "sed -i 's/compressMethod=\\s*[0-9,a-z,\"]*/compressMethod=\"$value\"/g' /root/A-Tune/examples/tuning/compress/compress.py"
        needrestart : "false"
        type : "discrete"
        options :
          - "bz2"
          - "zlib"
          - "gzip"
        dtype : "string"
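In the server example, the get and set entries of compressLevel read and rewrite one assignment in the benchmark script. The sketch below reproduces both operations in Python to show what the patterns match. It assumes that $value is replaced with the candidate value before the set command runs, which the example implies but this section does not state; read_level, apply_set, and the sample script text are hypothetical.

```python
import re

# Pattern from the `set` entry, after YAML unescapes "\\s" to "\s".
LEVEL_PATTERN = re.compile(r"compressLevel=\s*[0-9]*")


def read_level(script_text):
    """Equivalent of: grep 'compressLevel=' | awk -F '=' '{print $2}'."""
    for line in script_text.splitlines():
        if "compressLevel=" in line:
            return line.split("=")[1]
    return None


def apply_set(script_text, value):
    """Equivalent of the sed substitution in the `set` entry."""
    return LEVEL_PATTERN.sub(f"compressLevel={value}", script_text)


# Hypothetical two-line stand-in for compress.py.
SAMPLE_SCRIPT = 'compressLevel=6\ncompressMethod="zlib"\n'

if __name__ == "__main__":
    updated = apply_set(SAMPLE_SCRIPT, 9)
    print(read_level(SAMPLE_SCRIPT), "->", read_level(updated))
```

Unlike sed, which works line by line, the Python pattern would let \s* run across a newline if the assignment had no value, so the sketch is only faithful for well-formed lines like the sample.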
Example client yaml configuration:
project: "compress"
engine : "gbrt"
iterations : 20
random_starts : 10
benchmark : "python3 /root/A-Tune/examples/tuning/compress/compress.py"
evaluations :
  -
    name: "time"
    info:
        get: "echo '$out' | grep 'time' | awk '{print $3}'"
        type: "positive"
        weight: 20
  -
    name: "compress_ratio"
    info:
        get: "echo '$out' | grep 'compress_ratio' | awk '{print $3}'"
        type: "negative"
        weight: 80
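Each evaluations entry extracts one metric from the benchmark output with a grep and awk pipeline. The sketch below mirrors that pipeline in Python to show which field is selected. It assumes $out holds the benchmark's standard output; extract_metric is a hypothetical helper, and the sample output format is an assumption, since the real output of compress.py is not shown in this section.

```python
def extract_metric(benchmark_output, keyword):
    """Equivalent of: echo '$out' | grep '<keyword>' | awk '{print $3}'."""
    values = []
    for line in benchmark_output.splitlines():
        if keyword in line:        # grep keeps lines containing the keyword
            fields = line.split()  # awk splits on runs of whitespace
            # awk prints an empty string when the third field is missing
            values.append(fields[2] if len(fields) >= 3 else "")
    return values


# Hypothetical benchmark output; the real compress.py output may differ.
SAMPLE_OUT = "time = 0.52\ncompress_ratio = 3.41\n"

if __name__ == "__main__":
    print(extract_metric(SAMPLE_OUT, "time"))
    print(extract_metric(SAMPLE_OUT, "compress_ratio"))
```

Because grep matches substrings, a keyword that appears in more than one output line yields several values, so metric names in the benchmark output should be distinct.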
Run tuning:
# atune-adm tuning --project compress --detail compress_client.yaml
Restore the initial configuration from before tuning, where compress is the project name in the yaml file:
# atune-adm tuning --restore --project compress