[Repost] Working Set Size (WSS) Tools for Linux


Editor's summary

## Summary of the WSS Tools

This document describes three experimental tools for estimating working set size (WSS):

**1. wss.pl:**
* A Perl script for Linux 2.6.22+ that uses the referenced page flag.
* It resets the flag via /proc/PID/clear_refs, waits for the given duration, then reads back how much memory was referenced.
* It is the fastest of the three, but resetting the referenced flag may confuse kernel page reclaim, especially if the system is swapping.

**2. wss-v1.c:**
* A proof-of-concept C program for Linux 4.3+ that uses idle page tracking.
* It walks page structures one by one, so it suits small processes only. On processes larger than 100 Gbytes it can take several minutes.

**3. wss-v2.c:**
* A proof-of-concept C program for Linux 4.3+ that also uses idle page tracking.
* It takes a snapshot of the system's idle page flags, which makes it about 50x faster than wss-v1 on large processes, but not on small ones.

**Key differences:**

| Feature | wss.pl | wss-v1.c | wss-v2.c |
|---|---|---|---|
| Kernel | Linux 2.6.22+ | Linux 4.3+ | Linux 4.3+ |
| Measurement method | Referenced page flag | Idle page flag, page by page | Idle page flag, system snapshot |
| Target process size | Not stated | Small | Large |
| Est(s) on the 20 Gbyte example | 0.080 | 44.571 | 0.806 |
| Main risk | Resetting the referenced flag may confuse kernel reclaim | Several minutes of CPU time on large processes | Over one second of CPU time |

**Recommendation:**
* All three tools are experimental. Test in a lab environment for your kernel versions, and use at your own risk.

Main text

https://github.com/brendangregg/wss

These are experimental tools for doing working set size estimation, using different Linux facilities. See WARNINGs.

Main website: http://www.brendangregg.com/wss.html

Tools:

  • wss.pl: For Linux 2.6.22+. Uses the referenced page flag for a page-based WSS estimation.
  • wss-v1: For Linux 4.3+, and small processes. Uses the idle page flag for a page-based WSS estimation.
  • wss-v2: For Linux 4.3+, and large processes. Uses the idle page flag for a page-based WSS estimation.

wss.pl (referenced page flag)

This tool should work on Linux 2.6.22+, although with the caveats described below. It resets the PG_referenced page flags via /proc/PID/clear_refs, then checks referenced memory after a duration. For example:

# ./wss.pl 23593 0.1
Watching PID 23593 page references during 0.1 seconds...
Est(s)     RSS(MB)    PSS(MB)    Ref(MB)
0.100       201.18     200.10      10.41

The output shows that the process had 201 Mbytes of RSS (main memory), and that during 0.1 seconds only 10.41 Mbytes (worth of pages) were touched (read or written).
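
The mechanism described above (reset the referenced flags, wait, then read them back) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the wss.pl source: the function names `sum_smaps_kb` and `measure_wss` are hypothetical, the caller needs permission to write /proc/PID/clear_refs, and running it carries the same overheads and risks described under WARNINGs.

```python
import re
import time

def sum_smaps_kb(smaps_text):
    """Sum the Rss, Pss and Referenced fields (in kB) over all mappings."""
    totals = {"Rss": 0, "Pss": 0, "Referenced": 0}
    for line in smaps_text.splitlines():
        # Each mapping in /proc/PID/smaps has lines such as "Referenced:  40 kB".
        m = re.match(r"(Rss|Pss|Referenced):\s+(\d+) kB", line)
        if m:
            totals[m.group(1)] += int(m.group(2))
    return totals

def measure_wss(pid, duration):
    """Hypothetical helper: clear referenced flags, sleep, then read smaps."""
    # Writing "1" to clear_refs resets the referenced bits for the process.
    with open(f"/proc/{pid}/clear_refs", "w") as f:
        f.write("1")
    time.sleep(duration)
    with open(f"/proc/{pid}/smaps") as f:
        return sum_smaps_kb(f.read())
```

Dividing the `Referenced` total by 1024 gives a value comparable to the Ref(MB) column. The parsing is kept separate from the /proc access so that it can be checked against saved smaps text without touching a live process.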

Columns:

  • Est(s): Estimated WSS measurement duration: this accounts for delays with setting and reading pagemap data, which inflates the intended sleep duration.
  • RSS(MB): Resident Set Size (Mbytes). The main memory size.
  • PSS(MB): Proportional Set Size (Mbytes). Accounting for shared pages.
  • Ref(MB): Referenced (Mbytes) during the specified duration. This is the working set size metric.
  • Dur(s): Full duration of measurement (seconds), from beginning to set page flags to completing reading them.
  • Slp(s): Total sleep time.

USAGE:

# ./wss.pl -h
USAGE: wss [options] PID duration(s)
	-C         # show cumulative output every duration(s)
	-s secs    # take duration(s) snapshots after secs pauses
	-d secs    # total duration of measuremnt (for -s or -C)
	-P steps   # profile run (cumulative), from duration(s)
	-t         # show additional timestamp columns
   eg,
	wss 181 0.01       # measure PID 181 WSS for 10 milliseconds
	wss 181 5          # measure PID 181 WSS for 5 seconds (same overhead)
	wss -C 181 5       # show PID 181 growth every 5 seconds
	wss -C -d 10 181 1 # PID 181 growth each second for 10 seconds total
	wss -s 1 181 0.01  # show a 10 ms WSS snapshot every 1 second
	wss -s 0 181 1     # measure WSS every 1 second (not cumulative)
	wss -P 10 181 0.01 # 10 step power-of-2 profile, starting with 0.01s
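
As a rough illustration of the last example, the durations of a 10-step profile can be listed as follows. This assumes that "power-of-2 profile" means each step doubles the previous duration, which is a reading of the usage text and has not been checked against the wss.pl source. The name `profile_durations` is hypothetical.

```python
def profile_durations(start, steps):
    """Return `steps` durations, each double the previous one, beginning at `start`."""
    return [start * 2 ** i for i in range(steps)]

# Assumed schedule for "wss -P 10 181 0.01"
for d in profile_durations(0.01, 10):
    print(f"{d:.2f}")
```

Under that assumption, a 10-step profile starting at 0.01 seconds ends with a measurement of roughly 5 seconds.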

WARNINGs:

This tool uses /proc/PID/clear_refs and /proc/PID/smaps, which can cause slightly higher application latency (eg, 10%) while the kernel walks page structures. For large processes (> 100 Gbytes) this duration of higher latency can last over 1 second, during which this tool is consuming system CPU time. Consider these overheads. This also resets the referenced flag, which might confuse the kernel as to which pages to reclaim, especially if swapping is active. This also activates some old kernel code that may not have been used in your environment before, and which modifies page flags: I'd guess there is a risk of an undiscovered kernel panic (the Linux mm community may be able to say how real this risk is). Test in a lab environment for your kernel versions, and consider this experimental: use at your own risk.

wss-v1.c (idle page flag: small process)

This is a proof-of-concept tool that uses idle page tracking, which was added to Linux 4.3. This is considered safer than modifying the referenced page flag, since the referenced page flag may confuse the kernel reclaim code, especially if the system is swapping.

This version of this tool walks page structures one by one, and is suited for small processes only. On large processes (>100 Gbytes), this tool can take several minutes to write. See wss-v2.c, which uses page data snapshots and is much faster for large processes (50x), as well as wss.pl, which is even faster (although uses the referenced page flag).
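
To make "walks page structures one by one" more concrete, the sketch below shows the addressing arithmetic involved in going from a virtual address to a bit in the idle page bitmap. It is not taken from wss-v1.c. It relies on the layouts given in the kernel documentation: /proc/PID/pagemap holds one 64-bit entry per virtual page, with the page frame number (PFN) in bits 0-54 and a "present" flag in bit 63, and /sys/kernel/mm/page_idle/bitmap is an array of 8-byte words with one bit per PFN. The function names and the 4096-byte default page size are assumptions for illustration.

```python
import sys

PAGEMAP_ENTRY_BYTES = 8        # one 64-bit pagemap entry per virtual page
PFN_MASK = (1 << 55) - 1       # bits 0-54 hold the page frame number
PRESENT_BIT = 1 << 63          # bit 63 is set when the page is present in RAM

def pagemap_offset(vaddr, page_size=4096):
    """Byte offset in /proc/PID/pagemap of the entry for a virtual address."""
    return (vaddr // page_size) * PAGEMAP_ENTRY_BYTES

def decode_pagemap_entry(raw):
    """Return the PFN from an 8-byte pagemap entry, or None if not present."""
    entry = int.from_bytes(raw, sys.byteorder)  # entries use native byte order
    if not entry & PRESENT_BIT:
        return None
    return entry & PFN_MASK

def idle_bitmap_location(pfn):
    """Return (byte offset, bit number) of a PFN in the idle page bitmap."""
    return (pfn // 64) * 8, pfn % 64
```

Doing a seek, read, and write per page in this way is why the per-page approach scales poorly: a 100 Gbyte process has roughly 26 million 4 Kbyte pages.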

Here is some example output, comparing this tool to the earlier wss.pl:

# ./wss-v1 33583 0.01
Watching PID 33583 page references during 0.01 seconds...
Est(s)     Ref(MB)
0.055        10.00

# ./wss.pl 33583 0.01
Watching PID 33583 page references during 0.01 seconds...
Est(s)     RSS(MB)    PSS(MB)    Ref(MB)
0.011        21.07      20.10      10.03

The output shows that the process referenced 10 Mbytes of data (this is correct: it's a synthetic workload).

Columns:

  • Est(s): Estimated WSS measurement duration: this accounts for delays with setting and reading pagemap data, which inflates the intended sleep duration.
  • Ref(MB): Referenced (Mbytes) during the specified duration. This is the working set size metric.

WARNINGs:

This tool sets and reads process page flags, which for large processes (> 100 Gbytes) can take several minutes (use wss-v2 for those instead). During that time, this tool consumes one CPU, and the application may experience slightly higher latency (eg, 5%). Consider these overheads. Also, this is activating some new kernel code added in Linux 4.3 that you may have never executed before. As is the case for any such code, there is the risk of undiscovered kernel panics (I have no specific reason to worry, just being paranoid). Test in a lab environment for your kernel versions, and consider this experimental: use at your own risk.

wss-v2.c (idle page flag: large process)

This is a proof-of-concept tool that uses idle page tracking, which was added to Linux 4.3. This is considered safer than modifying the referenced page flag, since the referenced page flag may confuse the kernel reclaim code, especially if the system is swapping.

This version of this tool takes a snapshot of the system's idle page flags, which speeds up analysis of large processes, but not small ones. See wss-v1.c, which may be faster for small processes, as well as wss.pl, which is even faster (although uses the referenced page flag).
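
The snapshot idea can be sketched as follows: read the idle page bitmap into memory once, then test each of the process's page frame numbers (PFNs) against that in-memory copy instead of reading the kernel file per page. This is an illustration under stated assumptions, not the wss-v2.c source. It assumes the documented bitmap layout (8-byte words in native byte order, one bit per PFN, bit set while the page is still idle) and a 4096-byte page size, and the names `is_idle` and `referenced_mb` are hypothetical.

```python
import sys

def is_idle(snapshot, pfn):
    """Check a PFN's idle bit in an in-memory copy of the idle page bitmap."""
    offset = (pfn // 64) * 8
    word = int.from_bytes(snapshot[offset:offset + 8], sys.byteorder)
    return bool((word >> (pfn % 64)) & 1)

def referenced_mb(snapshot, pfns, page_size=4096):
    """Mbytes of pages whose idle bit was cleared, that is, pages that were accessed."""
    referenced = sum(1 for pfn in pfns if not is_idle(snapshot, pfn))
    return referenced * page_size / (1024 * 1024)
```

The cost of reading the whole bitmap is fixed by the amount of system memory rather than by the process size, which is consistent with the statement that the snapshot speeds up large processes but not small ones.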

Here is some example output, comparing this tool to wss-v1 (which runs much slower), and the earlier wss.pl:

# ./wss-v2 27357 0.01
Watching PID 27357 page references during 0.01 seconds...
Est(s)     Ref(MB)
0.806        15.00

# ./wss-v1 27357 0.01
Watching PID 27357 page references during 0.01 seconds...
Est(s)     Ref(MB)
44.571       16.00

# ./wss.pl 27357 0.01
Watching PID 27357 page references during 0.01 seconds...
Est(s)     RSS(MB)    PSS(MB)    Ref(MB)
0.080     20001.12   20000.14      15.03

The output shows that the process referenced 15 Mbytes of data (this is correct: it's a synthetic workload).

Columns:

  • Est(s): Estimated WSS measurement duration: this accounts for delays with setting and reading pagemap data, which inflates the intended sleep duration.
  • Ref(MB): Referenced (Mbytes) during the specified duration. This is the working set size metric.

WARNINGs:

This tool sets and reads system and process page flags, which can take over one second of CPU time, during which the application may experience slightly higher latency (eg, 5%). Consider these overheads. Also, this is activating some new kernel code added in Linux 4.3 that you may have never executed before. As is the case for any such code, there is the risk of undiscovered kernel panics (I have no specific reason to worry, just being paranoid). Test in a lab environment for your kernel versions, and consider this experimental: use at your own risk.
