[Repost] perf-map-agent



https://github.com/jvm-profiling-tools/perf-map-agent

 

A Java agent that generates /tmp/perf-<pid>.map files for just-in-time (JIT) compiled methods, for use with the Linux perf tools.

Build

Make sure JAVA_HOME is configured to point to a JDK. You need cmake >= 2.8.6 (see #30). Then run the following on the command line:

cmake .
make

# will create links to run scripts in <somedir>
bin/create-links-in <somedir>

Architecture

Linux perf tools will expect symbols for code executed from unknown memory regions at /tmp/perf-<pid>.map. This allows runtimes that generate code on the fly to supply dynamic symbol mappings to be used with the perf suite of tools.

perf-map-agent is an agent that will generate such a mapping file for Java applications. It consists of a Java agent written in C and a small Java bootstrap application which attaches the agent to a running Java process.

When the agent is attached it instructs the JVM to report code blobs generated by the JVM at runtime for various purposes. Most importantly, this includes JIT-compiled methods but also various dynamically-generated infrastructure parts like the dynamically created interpreter, adaptors, and jump tables for virtual dispatch (see vtable and itable entries). The agent creates a /tmp/perf-<pid>.map file which it fills with one line per code blob that maps a memory location to a code blob name.
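To make the mapping concrete, here is a minimal sketch of how an address could be resolved against such a file. It assumes the perf map convention of one "START SIZE NAME" entry per line, with START and SIZE written in hex without a 0x prefix. resolve_addr is a hypothetical helper for illustration, not something shipped with perf-map-agent or perf.

```shell
# resolve_addr <map-file> <hex-address>
# Hypothetical helper: prints the name of the code blob whose address range
# [START, START+SIZE) contains the given address, or returns 1 if none does.
# The function body runs in a subshell so its variables stay local.
resolve_addr() (
  map_file=$1
  addr=$((0x${2#0x}))            # accept the address with or without "0x"
  while read -r start size name || [ -n "$start" ]; do
    # skip blank or incomplete lines
    [ -n "$start" ] && [ -n "$size" ] || continue
    lo=$((0x$start))
    hi=$(($lo + 0x$size))
    if [ "$addr" -ge "$lo" ] && [ "$addr" -lt "$hi" ]; then
      printf '%s\n' "$name"      # name may contain spaces (e.g. with msig)
      exit 0
    fi
  done < "$map_file"
  exit 1
)

# Usage (the PID and address are placeholders):
#   resolve_addr /tmp/perf-1234.map 7f0000001010
```

perf performs this lookup itself; the sketch is only meant to show why one line per code blob is enough to symbolize samples from JIT-compiled code.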

The Java application takes the PID of a Java process as an argument and an arbitrary number of additional arguments which it passes to the agent. It then attaches to the target process and instructs it to load the agent library.

Command line scripts

The bin directory contains a set of shell scripts that combine common perf / dtrace operations with creating the map file.

  • create-java-perf-map.sh <pid> <options*> takes a PID and options. It knows where to find libraries relative to the bin directory.
  • perf-java-top <pid> <perf-top-options> takes a PID and additional options to pass to perf top. Uses the agent to create a new /tmp/perf-<pid>.map and then calls perf top with the given options.
  • perf-java-record-stack <pid> <perf-record-options> takes a PID and additional options to pass to perf record. Runs perf record -g -p <pid> <perf-record-options> to collect performance data including stack traces. Afterwards it uses the agent to create a new /tmp/perf-<pid>.map file.
  • perf-java-report-stack <pid> <perf-record-options> first calls perf-java-record-stack <pid> <perf-record-options> and then runs perf report to directly analyze the captured data. After the script has exited, you can call perf report -i /tmp/perf-<pid>.data again with any options to further analyze the data from the previous run.
  • perf-java-flames <pid> <perf-record-options> collects data with perf-java-record-stack and then creates a visualization using @brendangregg's FlameGraph tools. To get meaningful stack traces spanning several JIT-compiled methods, you need to run your JVM with -XX:+PreserveFramePointer (available starting from JDK8 update 60 build 19), as detailed in a Netflix blog entry.
  • create-links-in <targetdir> will install symbolic links to the above scripts into <targetdir>.
  • dtrace-java-record-stack <pid> takes a PID. Runs dtrace to collect performance data including stack traces. Afterwards it uses the agent to create a new /tmp/perf-<pid>.map file.
  • dtrace-java-flames <pid> collects data with dtrace-java-record-stack and then creates a visualization using @brendangregg's FlameGraph tools. To get meaningful stack traces spanning several JIT-compiled methods, you need to run your JVM with -XX:+PreserveFramePointer (available starting from JDK8 update 60 build 19), as detailed in a Netflix blog entry.

Environment variables:

  • PERF_MAP_OPTIONS: a string of additional options to pass to the agent as described below.
  • PERF_RECORD_SECONDS: the number of seconds for which perf-java-report-stack and similar tools will record performance data
  • PERF_RECORD_FREQ: the sampling frequency as passed to perf record -F
  • FLAMEGRAPH_DIR: the directory into which @brendangregg's FlameGraph has been checked out
  • PERF_JAVA_TMP: the directory to put temporary files in, the default is /tmp
  • PERF_DATA_FILE: the file name where perf-java-record-stack will output performance data into, the default is $PERF_JAVA_TMP/perf-<pid>.data
  • PERF_COLLAPSE_OPTS: a string of additional flags to pass to stackcollapse-perf.pl (found in FLAMEGRAPH_DIR); add --inline when the perf map was generated with the unfoldall option
  • PERF_FLAME_OUTPUT: the file name to which the flamegraph SVG will be written, the default is flamegraph-<pid>.svg
  • PERF_FLAME_OPTS: options to pass to flamegraph.pl (found in FLAMEGRAPH_DIR), the default is --color java
  • DTRACE_SECONDS: the number of seconds for which dtrace and similar tools will record performance data
  • DTRACE_FREQ: the sampling frequency as passed to dtrace
  • DTRACE_JAVA_TMP: the directory to put temporary files in, the default is /tmp
  • DTRACE_DATA_FILE: the file name where dtrace-java-record-stack will output performance data into, the default is $DTRACE_JAVA_TMP/dtrace-<pid>.data
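
As an illustration of how these variables combine, the sketch below wraps perf-java-flames in a small function. flame_for_pid is a hypothetical name, the values 30 and 99 are arbitrary example settings rather than documented defaults, and the function assumes perf-java-flames is on the PATH (for example via create-links-in) and that FlameGraph has been checked out into the directory passed as the second argument.

```shell
# flame_for_pid <pid> <flamegraph-dir>
# Hypothetical wrapper: records for 30 seconds at 99 Hz (example values) and
# writes the SVG to flamegraph-<pid>.svg in the current directory.
# The variables are set only for this one perf-java-flames invocation.
flame_for_pid() (
  pid=$1
  flamegraph_dir=$2
  PERF_RECORD_SECONDS=30 \
  PERF_RECORD_FREQ=99 \
  FLAMEGRAPH_DIR="$flamegraph_dir" \
  PERF_FLAME_OUTPUT="flamegraph-$pid.svg" \
  perf-java-flames "$pid"
)

# Usage (the PID and path are placeholders):
#   flame_for_pid 1234 "$HOME/FlameGraph"
```

Passing the variables as a per-command prefix keeps them out of the calling shell, so a later run with different settings does not inherit them.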

Options

You can add a comma-separated list of options to perf-java (or the AttachOnce runner). These options are currently supported:

  • unfold: Create extra entries for every code block inside a method that was inlined from elsewhere (named <inlined_method> in <root_method>). Be aware of the effects of 'skid' when unfolding, and of inaccurate inlining information; both are described under Known Issues below.
  • unfoldall: Similar to unfold but will include the complete inlined stack at a code location in the form root_method->inlined method 1->inlined method 2->...->inlined method on top.
  • unfoldsimple: similar to unfold, however, the extra entries do not include the " in <root_method>" part
  • msig: include full method signature in the name string
  • dottedclass: convert class signature (Ljava/lang/Class;) to the usual class names with segments separated by dots (java.lang.Class). NOTE: this currently breaks coloring when used in combination with flamegraphs.
  • sourcepos: Adds the name of the source file and the line number on which it is declared for each method. Useful when profiling Scala applications that create a lot of synthetic classes and methods. Does not work with native methods.
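
Since the options are passed as a single comma-separated string (for instance through PERF_MAP_OPTIONS), a small helper can assemble and sanity-check that string. The sketch below is an assumption-laden illustration: perf_map_options is a hypothetical function, and it only accepts the option names listed above.

```shell
# perf_map_options <option>...
# Hypothetical helper: joins the given agent options with commas and rejects
# any name that is not in the list documented above.
perf_map_options() (
  result=
  for opt in "$@"; do
    case $opt in
      unfold|unfoldall|unfoldsimple|msig|dottedclass|sourcepos) ;;
      *) echo "unknown perf-map-agent option: $opt" >&2; exit 1 ;;
    esac
    result=${result:+$result,}$opt   # prepend a comma except for the first
  done
  printf '%s\n' "$result"
)

# Usage (the PID is a placeholder):
#   PERF_MAP_OPTIONS=$(perf_map_options unfoldall msig) perf-java-flames 1234
```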

Known Issues

Skid

You should be aware that instruction level profiling is not absolutely accurate but suffers from 'skid'. 'skid' means that the actual instruction pointer may already have moved a bit further when a sample is recorded. In that case, (possibly hot) code is reported at an address shortly after the actual hot instruction. See this sample from one of Brendan's presentations demonstrating this issue.

If using unfold, perf-map-agent will report sections that contain code inlined from other methods as separate entries. Unfolded entries can be quite short, e.g. an inlined getter may consist of only a few instructions that now live inside another method's JITed code. The next few instructions may then already belong to another entry. In such a case, it is more likely that skid will not only affect the instruction pointer inside a method entry but may also affect which entry is chosen in the first place.

Skid that occurs inside a method is only visible when analyzing the actual assembler code (as with perf annotate). Skid that affects symbol resolution, so that a wrong entry is chosen, will be much more visible, because wrong entries will be reported by tools that operate on the symbol level, like the standard views of perf report, perf top, or flame graphs.

So, while it is tempting to enable unfolded entries for the perceived extra resolution, this extra information is sometimes just noise which will not only clutter the overall view but may also be misleading or wrong.

Inaccurate mappings using the unfold* options

Hotspot does not retain line number and other debug information for inlined code at other places than safepoints. This makes sense because you don't usually observe code running between safepoints from the JVM's perspective. This is different when observing a process from the outside like with perf. For observed code locations outside of safepoints, the JVM will not report any inlining information and perf-map-agent will assign those areas to the host method of the inlining.

For more fidelity, Hotspot can be instructed to include debug information for non-safepoints as well. Use -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints when running the target process. Note, however, that this will produce a lot more information with the generated perf-<pid>.map file potentially growing to MBs of size.
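
A minimal sketch of how the JVM flags mentioned in this document could be kept in one place when launching the target process. profiling_jvm_flags is a hypothetical helper and app.jar is a placeholder; the flags themselves are the ones named in the text, with -XX:+UnlockDiagnosticVMOptions placed before -XX:+DebugNonSafepoints because the diagnostic option must be unlocked first.

```shell
# profiling_jvm_flags
# Hypothetical helper: prints the JVM flags discussed in this document.
#   -XX:+PreserveFramePointer      stack traces across JIT-compiled frames
#   -XX:+UnlockDiagnosticVMOptions required before diagnostic options
#   -XX:+DebugNonSafepoints        debug info between safepoints (larger map)
profiling_jvm_flags() {
  printf '%s\n' "-XX:+PreserveFramePointer -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints"
}

# Usage (app.jar is a placeholder; the substitution is left unquoted on
# purpose so that the shell splits it into separate flags):
#   java $(profiling_jvm_flags) -jar app.jar
```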

Agent Library Unloading

Unloading or reloading of a changed agent library is not supported by the JVM (but re-attaching is). Therefore, if you make changes to the agent and recompile it you need to restart a target process that has an older version loaded to use the newer version.

Missing symbols for libjvm.so

libjvm.so is the runtime component of the JVM. It is not covered by perf-map-agent but perf will use debug symbols as provided by the distribution. If symbols for libjvm.so are missing see instructions for your Linux distribution to install debug symbols for the JVM. See also issue #39 which contains a few pointers about how to install these.

Disclaimer

I'm not a professional C code writer. The code is very "experimental"; for example, it is missing checks for error conditions. Use it at your own risk. You have been warned!

License

This library is licensed under GPLv2. See the LICENSE file.
