[Repost] Translating existing DTrace scripts into SystemTap scripts




https://sourceware.org/systemtap/wiki/PortingDTracetoSystemTap

 

If you are familiar with DTrace and have existing DTrace scripts to diagnose performance problems, it is not difficult to translate those scripts into equivalent SystemTap scripts. The outline of the process is:

  • Adapt the command line, options, and file name
  • Match up DTrace providers to SystemTap probe points
  • Map the DTrace built-in variables into SystemTap context variables and functions
  • Convert DTrace predicates into SystemTap conditional statements
  • Convert thread-local variables into associative arrays
  • Modify the DTrace printout code

These steps will be described in greater detail while converting some very simple DTrace examples from:

http://www.brendangregg.com/DTrace/dtrace_oneliners.txt

 

First Example: Successful Signal Send Details

One example in the DTrace one-liners prints out detailed information on signals:

 

dtrace -n 'proc:::signal-send /pid/ { printf("%s -%d %d",execname,args[2],args[1]->pr_pid); }'

This command-line DTrace script prints the executable name of the sender, the signal number, and the pid of the receiving process each time a user process sends a signal.

 

Use the stap command and options

The first step is to use the proper command and option to run a SystemTap script from the command line ("stap -e"):

 

stap -e 'proc:::signal-send /pid/ { printf("%s -%d %d",execname,args[2],args[1]->pr_pid); }'

Look at "man stap" for more details on the available options for the stap command.

 

Match up the DTrace providers and SystemTap probe points

There is not a one-to-one correspondence between DTrace providers and SystemTap probe points, but in most cases matches can be found. To understand what a particular DTrace provider supplies, look it up in the DTrace documentation.

SystemTap has similar information describing the probe points and supporting functions in its tapset reference documentation.

For this particular example we find that the SystemTap signal.send probe point is a good match for proc:::signal-send and the script is now written as:

 

stap -e 'probe signal.send /pid/ { printf("%s -%d %d",execname,args[2],args[1]->pr_pid); }'

 

Map the DTrace built-in variables into SystemTap context variables and functions

SystemTap probe points and supporting functions are implemented as tapsets. These tapsets provide the equivalent of the DTrace built-in variables and provider arguments. The DTrace example uses pid and execname; these can be mapped to the pid() and execname() functions respectively. For the DTrace proc:::signal-send probe, args[2] is the signal number and args[1]->pr_pid is the pid of the process receiving the signal. As described in the SystemTap documentation, the signal.send probe point provides similar variables: sig and sig_pid. Thus, the script is now:

 

stap -e 'probe signal.send /pid()/ { printf("%s -%d %d",execname(), sig, sig_pid); }'
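The same mapping applies to other DTrace built-ins. The sketch below assumes the usual correspondences of DTrace tid, uid, and cpu to the SystemTap functions tid(), uid(), and cpu(). These extra fields are not part of the original one-liner; they are added only to illustrate the function-call form.

```systemtap
# Sketch only: tid, uid, and cpu are illustrative additions,
# not part of the original DTrace one-liner.
probe signal.send {
  # execname(), pid(), tid(), uid(), cpu() describe the sending context;
  # sig and sig_pid are supplied by the signal.send probe point.
  printf("%s pid=%d tid=%d uid=%d cpu=%d sig=%d target=%d\n",
         execname(), pid(), tid(), uid(), cpu(), sig, sig_pid)
}
```

Saved to a file (for example under the hypothetical name sigctx.stp), this could be run with "stap sigctx.stp".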

 

Convert DTrace predicates into SystemTap conditional statements

DTrace has a more restrictive execution model for probe handlers than SystemTap; as a result, most DTrace scripts use predicates. SystemTap is a bit more flexible and allows conditional code inside the probe handler. The direct translation of a predicate is to negate it and use the next statement to skip the rest of the SystemTap probe handler:

 

stap -e 'probe signal.send { if (!pid()) next; printf("%s -%d %d",execname(), sig, sig_pid); }'

In this case it would be clearer to simply write the code as:

 

stap -e 'probe signal.send { if (pid()) printf("%s -%d %d",execname(), sig, sig_pid); }'
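When a predicate combines several conditions, the negate-and-next form can be easier to read than nested if statements. The sketch below assumes a hypothetical DTrace predicate that also filters on the executable name "bash"; that filter is an illustration and does not appear in the original one-liner.

```systemtap
# Hypothetical DTrace original (the execname test is an assumption):
#   proc:::signal-send /pid != 0 && execname == "bash"/ { ... }
probe signal.send {
  # Negated predicate: leave the handler early unless both conditions hold.
  if (pid() == 0 || execname() != "bash") next;
  # "\n" is included so that each event prints on its own line.
  printf("%s -%d %d\n", execname(), sig, sig_pid);
}
```

By De Morgan's law, the DTrace condition "A && B" becomes "!A || !B" in the early-exit test.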

 

Convert thread-local variables into associative arrays

In this particular case the example doesn't use any thread-local storage, so nothing needs to be done for this step.

 

Modify the DTrace printout code

There are many differences between DTrace and SystemTap output. DTrace has more default rules for printing data without explicit code in the script, and in its default mode it starts the output of each probe firing on a new line. SystemTap's printf prints only what the format string contains, so to keep this example from running all of its output together on a single line you need to add "\n" to the format string. The command line below is the completely translated script, suitable for use with SystemTap:

 

stap -e 'probe signal.send { if (pid()) printf("%s -%d %d\n",execname(), sig, sig_pid); }'

 

Second Example: Write size distribution by Executable name

Another of the DTrace one-liners prints the distribution of the size of data written by each executable:

 

dtrace -n 'sysinfo:::writech { @dist[execname] = quantize(arg0); }'

 

Use the stap command and options

You need to change "dtrace -n" into "stap -e", yielding:

 

stap -e 'sysinfo:::writech { @dist[execname] = quantize(arg0); }'

 

Match up DTrace providers to SystemTap probe points

The DTrace sysinfo:::writech provider instruments the write, writev and pwrite syscalls. The same syscalls exist in Linux. The script becomes:

 

stap -e 'probe syscall.write.return, syscall.writev.return, syscall.pwrite.return { @dist[execname] = quantize(arg0); }'

SystemTap allows multiple probe events to share the same probe handler. The probe events can be specified with wildcards or enumerated and separated by commas. For this particular example we must determine how much data was actually written and whether the write was successful, so the probes are placed on syscall.write.return, syscall.writev.return, and syscall.pwrite.return rather than on syscall.write, syscall.writev, and syscall.pwrite.

 

Map the DTrace built-in variables into SystemTap context variables and functions

The DTrace execname is equivalent to the SystemTap execname() function. Each *.return probe event includes a $return context variable, which is the return value for the probe point. In this case that is the number of bytes actually written.

Like DTrace, SystemTap provides associative arrays and aggregates. However, SystemTap requires associative arrays to be declared as global variables, so you need to add "global dist" for the array that stores the information. Indexing of associative arrays is similar in SystemTap. SystemTap uses the statistical operator "<<<" to add a sample to an aggregate. The data can later be printed as histograms or used to provide averages, counts, minimums, and maximums.

After modifying the script we now have:

 

stap -e 'global dist; probe syscall.write.return, syscall.writev.return, syscall.pwrite.return
{ dist[execname()] <<< $return; }
'

 

Convert DTrace predicates into SystemTap conditional statements

All of these probe events fire whether the write was successful or not. You need to put a test of the $return value to ensure that negative error values are not included in the data.

 

stap -e 'global dist; probe syscall.write.return, syscall.writev.return, syscall.pwrite.return
  { if ($return >=0) dist[execname()] <<< $return; }
'

 

Convert thread-local variables into associative arrays

There are no thread local variables in this example, so nothing needs to be done for this step.

 

Modify the DTrace printout code

DTrace and SystemTap differ significantly in how they produce output. DTrace automatically selects the output format and prints its aggregations when the script exits. SystemTap needs a "probe end" handler to print the data in the desired format. In this case you want to print @hist_log for each entry in the associative array, which is implemented with a "foreach" statement. You also want to label each histogram with its execname, so a printf precedes the printing of the histogram. The final SystemTap script, with the array renamed from dist to bytes, is:

 

stap -e ' global bytes; probe syscall.write.return, syscall.writev.return, syscall.pwrite.return
  { if ($return>=0) bytes[execname()] <<< $return }
probe end
  {foreach (e in bytes) {printf("%s\n", e); print(@hist_log(bytes[e]))}}
'

This script prints the histograms when it exits, for example when it is interrupted with Ctrl-C.
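Besides @hist_log, an aggregate filled with "<<<" can be read back with the extractor functions @count, @avg, @min, and @max. The variant below is a sketch of a compact per-executable summary instead of a histogram. It keeps the article's use of $return and, to stay short, probes only syscall.write.return.

```systemtap
global bytes

probe syscall.write.return {
  # Record only successful writes; negative values are error codes.
  if ($return >= 0) bytes[execname()] <<< $return
}

probe end {
  # One summary line per executable that performed at least one write.
  foreach (e in bytes)
    printf("%s: count=%d avg=%d min=%d max=%d\n", e,
           @count(bytes[e]), @avg(bytes[e]), @min(bytes[e]), @max(bytes[e]))
}
```

Because foreach visits only keys that already hold samples, the extractors are never applied to an empty aggregate here.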

 

Third Example: Translating scripts with thread-local variables

This example is from:

http://www.tablespace.net/quicksheet/dtrace-quickstart.html

Let's assume that the example is called read_time.d and contains:

 

syscall::read:entry {
  self->stime = timestamp;
}

syscall::read:return /self->stime != 0/ {
  printf("%s read() %d nsecs\n",
  execname,
  timestamp - self->stime);
}

It prints the executable name followed by the wallclock time in nanoseconds for each read syscall.

 

Use the stap command and options

Rename the script so that it has the ".stp" extension: read_time.stp. The finished script can then be run with "stap read_time.stp".

Match up DTrace providers to SystemTap probe points

The DTrace providers used in this example directly match SystemTap syscall.read and syscall.read.return. The current script is:

 

probe syscall.read {
  self->stime = timestamp;
}

probe syscall.read.return /self->stime != 0/ {
  printf("%s read() %d nsecs\n",
  execname,
  timestamp - self->stime);
}

 

Map the DTrace built-in variables into SystemTap context variables and functions

SystemTap does not implement thread-local variables in the same manner as DTrace. Instead, you use a global array indexed by the thread ID (tid()) to hold each thread's value. When a thread-local value is no longer needed, it should be deleted to avoid filling the associative array with dead values. In this example the global array stime holds the thread-local values.

The DTrace timestamp and execname variables map to the SystemTap gettimeofday_ns() and execname() functions. This yields the following intermediate version of the script:

 

global stime

probe syscall.read {
  stime[tid()] = gettimeofday_ns();
}

probe syscall.read.return /self->stime != 0/ {
  printf("%s read() %d nsecs\n",
  execname(), gettimeofday_ns() - stime[tid()]);
  delete stime[tid()];
}

 

Convert DTrace predicates into SystemTap conditional statements

In the original DTrace script, the predicate limited execution of the syscall::read:return handler to threads that had a matching syscall::read:entry timestamp. The SystemTap version of the script needs to do the same. By default, reading an associative array entry that does not exist yields 0, and subtracting 0 from the current time would give a very large and incorrect elapsed time. The predicate is therefore implemented with the "in" operator, which checks whether the current tid() has an entry in the associative array:

 

global stime

probe syscall.read {
  stime[tid()] = gettimeofday_ns();
}

probe syscall.read.return {
  if (tid() in stime) {
    printf("%s read() %d nsecs\n",
    execname(), gettimeofday_ns() - stime[tid()]);
    delete stime[tid()];
  }
}

 

Eliminate stapio read syscalls from the output

The SystemTap script instruments all read syscalls, including those made by SystemTap's own stapio process. Those can be filtered out with a conditional statement in the syscall.read event handler. This yields the following script:

 

global stime

probe syscall.read {
  if (pid() != stp_pid())
    stime[tid()] = gettimeofday_ns();
}

probe syscall.read.return {
  if (tid() in stime) {
    printf("%s read() %d nsecs\n",
    execname(), gettimeofday_ns() - stime[tid()]);
    delete stime[tid()];
  }
}
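Printing one line per read can produce a large amount of output on a busy system. As a sketch that combines the thread-local pattern above with the aggregates from the second example, the variant below records each elapsed time in a per-executable aggregate and prints a summary at exit. The array name latency is an assumption introduced for illustration.

```systemtap
global stime, latency

probe syscall.read {
  # Skip reads issued by SystemTap's own stapio process.
  if (pid() != stp_pid())
    stime[tid()] = gettimeofday_ns()
}

probe syscall.read.return {
  if (tid() in stime) {
    # Add the elapsed nanoseconds to this executable's aggregate.
    latency[execname()] <<< gettimeofday_ns() - stime[tid()]
    delete stime[tid()]
  }
}

probe end {
  foreach (name in latency)
    printf("%s: %d reads, avg %d ns, max %d ns\n", name,
           @count(latency[name]), @avg(latency[name]), @max(latency[name]))
}
```

The entry for each thread is still deleted as soon as it has been used, so the stime array holds only reads that are currently in flight.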
