[Repost] Working Set Size (WSS) Tools for Linux


Editor's summary

## Summary of the WSS Tools

This document describes three experimental tools for estimating working set size (WSS): one Perl script (wss.pl) and two C programs (wss-v1.c and wss-v2.c).

**1. wss.pl:**
* For Linux 2.6.22+. Resets the referenced page flags via /proc/PID/clear_refs, then checks referenced memory after a duration.
* The fastest of the three tools, but resetting the referenced flag may confuse the kernel's reclaim code, especially if swapping is active.

**2. wss-v1.c:**
* For Linux 4.3+ and small processes. Uses idle page tracking and walks page structures one by one.
* On large processes (> 100 Gbytes) it can take several minutes.

**3. wss-v2.c:**
* For Linux 4.3+ and large processes. Takes a snapshot of the system's idle page flags.
* Much faster than wss-v1 on large processes (50x), but not on small ones.

**Key differences:**

| Feature | wss.pl | wss-v1.c | wss-v2.c |
|---|---|---|---|
| Kernel requirement | Linux 2.6.22+ | Linux 4.3+ | Linux 4.3+ |
| Page flag used | Referenced | Idle | Idle |
| Measurement method | Reset flags via /proc/PID/clear_refs, read back later | Walk page structures one by one | Snapshot of the system's idle page flags |
| Suited to | Not stated | Small processes | Large processes |
| Est(s) for a 0.01 s run on the 20 Gbyte example process | 0.080 | 44.571 | 0.806 |
| Main risk | May confuse kernel reclaim; old, rarely used kernel code | Newer kernel code; minutes of CPU on large processes | Newer kernel code; over one second of CPU time |

**Recommendation:**
* All three tools are experimental. Test in a lab environment for your kernel versions before using them elsewhere.
* Consider the overheads described in each WARNINGs section.

Main text

https://github.com/brendangregg/wss

These are experimental tools for doing working set size estimation, using different Linux facilities. See WARNINGs.

Main website: http://www.brendangregg.com/wss.html

Tools:

wss.pl: For Linux 2.6.22+. Uses the referenced page flag for a page-based WSS estimation.
wss-v1: For Linux 4.3+, and small processes. Uses the idle page flag for a page-based WSS estimation.
wss-v2: For Linux 4.3+, and large processes. Uses the idle page flag for a page-based WSS estimation.

wss.pl (referenced page flag)

This tool should work on Linux 2.6.22+, although with the caveats described below. It resets the PG_referenced page flags via /proc/PID/clear_refs, then checks referenced memory after a duration. For example:

# ./wss.pl 23593 0.1
Watching PID 23593 page references during 0.1 seconds...
Est(s)     RSS(MB)    PSS(MB)    Ref(MB)
0.100       201.18     200.10      10.41

The output shows that the process had 201 Mbytes of RSS (main memory), and during 0.1 seconds only 10.41 Mbytes (worth of pages) was touched (read/written).
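The reset-wait-read cycle described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function names `sum_smaps_mb` and `measure_wss` are hypothetical, and it assumes the Rss, Pss and Referenced lines of /proc/PID/smaps are summed and converted from kB to Mbytes by dividing by 1024.

```python
import re
import time

FIELDS = ("Rss", "Pss", "Referenced")

def sum_smaps_mb(smaps_text):
    """Sum the Rss, Pss and Referenced lines (reported in kB) and return Mbytes."""
    totals = dict.fromkeys(FIELDS, 0)
    for line in smaps_text.splitlines():
        m = re.match(r"(\w+):\s+(\d+) kB", line)
        if m and m.group(1) in totals:
            totals[m.group(1)] += int(m.group(2))
    return {name: kb / 1024 for name, kb in totals.items()}

def measure_wss(pid, duration):
    """Reset the referenced flags, wait, then read back what was touched.

    Hypothetical helper: needs sufficient privileges, and carries the same
    overheads and risks as described in the WARNINGs section.
    """
    with open(f"/proc/{pid}/clear_refs", "w") as f:
        f.write("1")  # 1 = clear the referenced bits for all pages
    time.sleep(duration)
    with open(f"/proc/{pid}/smaps") as f:
        return sum_smaps_mb(f.read())

# Offline demonstration on made-up smaps lines (two mappings).
sample = """\
Rss:                2048 kB
Pss:                1024 kB
Referenced:          512 kB
Rss:                1024 kB
Pss:                1024 kB
Referenced:            0 kB
"""
print(sum_smaps_mb(sample))
```

Only the parsing step runs here; `measure_wss` is left uncalled so the sketch can be tried without touching a live process.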

Columns:

Est(s): Estimated WSS measurement duration: this accounts for delays with setting and reading pagemap data, which inflates the intended sleep duration.
RSS(MB): Resident Set Size (Mbytes). The main memory size.
PSS(MB): Proportional Set Size (Mbytes). Accounting for shared pages.
Ref(MB): Referenced (Mbytes) during the specified duration. This is the working set size metric.
Dur(s): Full duration of measurement (seconds), from beginning to set page flags to completing reading them.
Slp(s): Total sleep time.
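One plausible way to derive the three timing columns from four timestamps is sketched below. The halving rule is an assumption made for illustration, not something stated in this document: it reasons that each page has its flag cleared partway through the set phase and is read partway through the read phase, so on average a page is observed for the sleep time plus half of each phase. The function name `timing_columns` is hypothetical.

```python
def timing_columns(t_set_start, t_set_end, t_read_start, t_read_end):
    """Return (est, dur, slp) in seconds from four timestamps."""
    set_time = t_set_end - t_set_start      # time spent resetting flags
    read_time = t_read_end - t_read_start   # time spent reading them back
    dur = t_read_end - t_set_start          # Dur(s): start of set to end of read
    slp = t_read_start - t_set_end          # Slp(s): time asleep in between
    est = dur - set_time / 2 - read_time / 2  # Est(s): assumed halving rule
    return est, dur, slp

est, dur, slp = timing_columns(0.0, 0.02, 0.12, 0.16)
print(f"Est(s)={est:.3f} Dur(s)={dur:.3f} Slp(s)={slp:.3f}")
```

Under this assumption Est(s) always falls between Slp(s) and Dur(s), which is consistent with the description of the estimate as an inflated sleep duration.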

USAGE:

# ./wss.pl -h
USAGE: wss [options] PID duration(s)
	-C         # show cumulative output every duration(s)
	-s secs    # take duration(s) snapshots after secs pauses
	-d secs    # total duration of measuremnt (for -s or -C)
	-P steps   # profile run (cumulative), from duration(s)
	-t         # show additional timestamp columns
   eg,
	wss 181 0.01       # measure PID 181 WSS for 10 milliseconds
	wss 181 5          # measure PID 181 WSS for 5 seconds (same overhead)
	wss -C 181 5       # show PID 181 growth every 5 seconds
	wss -C -d 10 181 1 # PID 181 growth each second for 10 seconds total
	wss -s 1 181 0.01  # show a 10 ms WSS snapshot every 1 second
	wss -s 0 181 1     # measure WSS every 1 second (not cumulative)
	wss -P 10 181 0.01 # 10 step power-of-2 profile, starting with 0.01s
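As an illustration of the last example, a "10 step power-of-2 profile, starting with 0.01s" suggests a sequence of durations that doubles at each step. The sketch below only lists that sequence; the helper name `profile_durations` is hypothetical, and how the tool applies the durations internally is not shown here.

```python
def profile_durations(start, steps):
    """Return `steps` durations, each double the previous, beginning at `start`."""
    return [start * 2 ** i for i in range(steps)]

for d in profile_durations(0.01, 10):
    print(f"{d:.2f}")
```

With these arguments the sequence runs from 0.01 seconds up to 5.12 seconds, so a 10 step profile covers a little over two orders of magnitude of measurement duration.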

WARNINGs:

This tool uses /proc/PID/clear_refs and /proc/PID/smaps, which can cause slightly higher application latency (eg, 10%) while the kernel walks page structures. For large processes (> 100 Gbytes) this duration of higher latency can last over 1 second, during which this tool is consuming system CPU time. Consider these overheads. This also resets the referenced flag, which might confuse the kernel as to which pages to reclaim, especially if swapping is active. This also activates some old kernel code that may not have been used in your environment before, and which modifies page flags: I'd guess there is a risk of an undiscovered kernel panic (the Linux mm community may be able to say how real this risk is). Test in a lab environment for your kernel versions, and consider this experimental: use at your own risk.

wss-v1.c (idle page flag: small process)

This is a proof-of-concept tool that uses idle page tracking, which was added to Linux 4.3. This is considered safer than modifying the referenced page flag, since the referenced page flag may confuse the kernel reclaim code, especially if the system is swapping.

This version of this tool walks page structures one by one, and is suited for small processes only. On large processes (> 100 Gbytes), this tool can take several minutes to set and read the page flags. See wss-v2.c, which uses page data snapshots and is much faster for large processes (50x), as well as wss.pl, which is even faster (although it uses the referenced page flag).
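To make "walks page structures one by one" more concrete, the arithmetic behind a per-page lookup can be sketched as below. It follows the kernel's documented layouts: /proc/PID/pagemap holds one 8-byte entry per virtual page (bit 63 is the present bit, bits 0-54 the page frame number), and /sys/kernel/mm/page_idle/bitmap is an array of 8-byte words in which PFN i maps to bit i % 64 of word i / 64. The function names and the 4096-byte page size are assumptions for illustration, and this is not the tool's actual code.

```python
PAGEMAP_ENTRY_BYTES = 8
PFN_MASK = (1 << 55) - 1   # bits 0-54 hold the PFN when the page is present
PRESENT_BIT = 1 << 63      # bit 63: page is present in RAM

def pagemap_offset(vaddr, page_size=4096):
    """Byte offset in /proc/PID/pagemap of the entry for a virtual address."""
    return (vaddr // page_size) * PAGEMAP_ENTRY_BYTES

def decode_pagemap_entry(entry):
    """Return (present, pfn); pfn is None when the page is not present."""
    present = bool(entry & PRESENT_BIT)
    return present, (entry & PFN_MASK if present else None)

def idle_bitmap_position(pfn):
    """Return (byte offset of the 8-byte word, bit index within that word)."""
    return (pfn // 64) * 8, pfn % 64

print(pagemap_offset(0x7000))
print(decode_pagemap_entry((1 << 63) | 0x1234))
print(idle_bitmap_position(130))
```

Doing a seek and an 8-byte read or write for every page is what makes this approach slow on very large processes.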

Here is some example output, comparing this tool to the earlier wss.pl:

# ./wss-v1 33583 0.01
Watching PID 33583 page references during 0.01 seconds...
Est(s)     Ref(MB)
0.055        10.00

# ./wss.pl 33583 0.01
Watching PID 33583 page references during 0.01 seconds...
Est(s)     RSS(MB)    PSS(MB)    Ref(MB)
0.011        21.07      20.10      10.03

The output shows that the process referenced 10 Mbytes of data (this is correct: it's a synthetic workload).

Columns:

Est(s): Estimated WSS measurement duration: this accounts for delays with setting and reading pagemap data, which inflates the intended sleep duration.
Ref(MB): Referenced (Mbytes) during the specified duration. This is the working set size metric.

WARNINGs:

This tool sets and reads process page flags, which for large processes (> 100 Gbytes) can take several minutes (use wss-v2 for those instead). During that time, this tool consumes one CPU, and the application may experience slightly higher latency (eg, 5%). Consider these overheads. Also, this is activating some new kernel code added in Linux 4.3 that you may have never executed before. As is the case for any such code, there is the risk of undiscovered kernel panics (I have no specific reason to worry, just being paranoid). Test in a lab environment for your kernel versions, and consider this experimental: use at your own risk.

wss-v2.c (idle page flag: large process)

This version of this tool takes a snapshot of the system's idle page flags, which speeds up analysis of large processes, but not small ones. See wss-v1.c, which may be faster for small processes, as well as wss.pl, which is even faster (although it uses the referenced page flag).
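The snapshot idea can be sketched as follows: read the whole idle bitmap into memory once, then test each of the process's page frame numbers against that in-memory copy instead of issuing one read per page. This is a minimal illustration with hypothetical names (`is_idle`, `referenced_mb`), an assumed 4096-byte page size, and a made-up snapshot; it relies on the kernel's documented bitmap layout (native-endian 8-byte words, PFN i at bit i % 64 of word i / 64) and is not the tool's actual code.

```python
import sys

def is_idle(snapshot, pfn):
    """True if the idle bit for `pfn` is still set in the snapshot bytes."""
    word_start = (pfn // 64) * 8
    word = int.from_bytes(snapshot[word_start:word_start + 8], sys.byteorder)
    return bool((word >> (pfn % 64)) & 1)

def referenced_mb(snapshot, pfns, page_size=4096):
    """Mbytes of pages whose idle bit was cleared, meaning they were touched."""
    active = sum(1 for pfn in pfns if not is_idle(snapshot, pfn))
    return active * page_size / (1024 * 1024)

# Made-up snapshot of two words: PFNs 0, 1 and 64 are still idle.
snapshot = (0b11).to_bytes(8, sys.byteorder) + (0b1).to_bytes(8, sys.byteorder)
process_pfns = [0, 1, 2, 64, 65]   # hypothetical PFNs mapped by the process
print(referenced_mb(snapshot, process_pfns))
```

The trade-off is that the snapshot covers the whole system's bitmap, so its cost is paid even for a small process, which matches the note that this version is not faster for small ones.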

Here is some example output, comparing this tool to wss-v1 (which runs much slower), and the earlier wss.pl:

# ./wss-v2 27357 0.01
Watching PID 27357 page references during 0.01 seconds...
Est(s)     Ref(MB)
0.806        15.00

# ./wss-v1 27357 0.01
Watching PID 27357 page references during 0.01 seconds...
Est(s)     Ref(MB)
44.571       16.00

# ./wss.pl 27357 0.01
Watching PID 27357 page references during 0.01 seconds...
Est(s)     RSS(MB)    PSS(MB)    Ref(MB)
0.080     20001.12   20000.14      15.03

The output shows that the process referenced 15 Mbytes of data (this is correct: it's a synthetic workload).

Columns:

Est(s): Estimated WSS measurement duration: this accounts for delays with setting and reading pagemap data, which inflates the intended sleep duration.
Ref(MB): Referenced (Mbytes) during the specified duration. This is the working set size metric.

WARNINGs:

This tool sets and reads system and process page flags, which can take over one second of CPU time, during which the application may experience slightly higher latency (eg, 5%). Consider these overheads. Also, this is activating some new kernel code added in Linux 4.3 that you may have never executed before. As is the case for any such code, there is the risk of undiscovered kernel panics (I have no specific reason to worry, just being paranoid). Test in a lab environment for your kernel versions, and consider this experimental: use at your own risk.