path: root/tools/perf/ui/browsers
Age | Commit message | Author | Files | Lines
2025-10-23 | perf annotate: Fix Clang build by adding block in switch case | James Clark | 1 | -1/+2
Clang and GCC disagree on what constitutes a "declaration after statement". GCC allows a declaration in a switch case without an extra block, as long as it comes immediately after the label. Clang does not. Unfortunately this is the case even in the latest versions of both compilers. The only option that makes them behave the same way is -Wpedantic, which can't be enabled in perf because of the number of warnings it generates. Add a block to fix the Clang build, which is the only thing we can do.

Fixes the build error:

  ui/browsers/annotate.c:999:4: error: expected expression
    struct annotation_line *al = NULL;
  ui/browsers/annotate.c:1008:4: error: use of undeclared identifier 'al'
    al = annotated_source__get_line(notes->src, offset);
  ui/browsers/annotate.c:1009:24: error: use of undeclared identifier 'al'
    browser->curr_hot = al ? &al->rb_node : NULL;
  ui/browsers/annotate.c:1009:30: error: use of undeclared identifier 'al'
    browser->curr_hot = al ? &al->rb_node : NULL;
  ui/browsers/annotate.c:1000:8: error: mixing declarations and code is incompatible with standards before C99 [-Werror,-Wdeclaration-after-statement]
    s64 offset = annotate_browser__curr_hot_offset(browser);

Fixes: ad83f3b7155d ("perf c2c annotate: Start from the contention line") Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
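The fix described above can be illustrated with a minimal sketch. The names below (demo_key, demo_handle_key) are hypothetical and are not taken from annotate.c; the point is only that opening a block right after the case label turns the declaration into the first item of a compound statement, which both compilers accept.

```c
/* Hypothetical key codes for illustration; not the ones perf uses. */
enum demo_key { DEMO_KEY_TAB = 1, DEMO_KEY_OTHER = 2 };

/*
 * Returns twice the offset for DEMO_KEY_TAB and -1 for any other key.
 * The braces after the case label open a block, so the declaration no
 * longer has to serve as the statement that the label is attached to.
 */
long demo_handle_key(enum demo_key key, long offset)
{
	long result = -1;

	switch (key) {
	case DEMO_KEY_TAB: {
		long doubled = offset * 2;	/* declared inside its own block */

		result = doubled;
		break;
	}
	default:
		break;
	}
	return result;
}
```

Without the braces, the declaration would directly follow the label, which is the form the commit message reports Clang rejecting.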
2025-10-21 | perf annotate: fix a crash when annotate the same symbol with 's' and 'T' | Tianyou Li | 1 | -4/+19
When running perf report with annotation for a symbol, press 's' and 'T', then exit the annotate browser. Annotating the same symbol again crashes the annotate browser.

browser.arch must be correctly updated when the data type feature is enabled by 'T'. It is usually initialized by the symbol__annotate2() function. If a symbol has already been annotated correctly the first time, symbol__annotate2() is not called again, so browser.arch does not get initialized. When the annotate browser is shown the second time, the data type needs to be displayed but browser.arch is empty.

Stack trace:

  Perf: Segmentation fault
  -------- backtrace --------
  #0 0x55d365 in ui__signal_backtrace setup.c:0
  #1 0x7f5ff1a3e930 in __restore_rt libc.so.6[3e930]
  #2 0x570f08 in arch__is perf[570f08]
  #3 0x562186 in annotate_get_insn_location perf[562186]
  #4 0x562626 in __hist_entry__get_data_type annotate.c:0
  #5 0x56476d in annotation_line__write perf[56476d]
  #6 0x54e2db in annotate_browser__write annotate.c:0
  #7 0x54d061 in ui_browser__list_head_refresh perf[54d061]
  #8 0x54dc9e in annotate_browser__refresh annotate.c:0
  #9 0x54c03d in __ui_browser__refresh browser.c:0
  #10 0x54ccf8 in ui_browser__run perf[54ccf8]
  #11 0x54eb92 in __hist_entry__tui_annotate perf[54eb92]
  #12 0x552293 in do_annotate hists.c:0
  #13 0x55941c in evsel__hists_browse hists.c:0
  #14 0x55b00f in evlist__tui_browse_hists perf[55b00f]
  #15 0x42ff02 in cmd_report perf[42ff02]
  #16 0x494008 in run_builtin perf.c:0
  #17 0x494305 in handle_internal_command perf.c:0
  #18 0x410547 in main perf[410547]
  #19 0x7f5ff1a295d0 in __libc_start_call_main libc.so.6[295d0]
  #20 0x7f5ff1a29680 in __libc_start_main@@GLIBC_2.34 libc.so.6[29680]
  #21 0x410b75 in _start perf[410b75]

Fixes: 1d4374afd000 ("perf annotate: Add 'T' hot key to toggle data type display") Reviewed-by: James Clark <james.clark@linaro.org> Tested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Tianyou Li <tianyou.li@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-19 | perf c2c annotate: Start from the contention line | Tianyou Li | 2 | -5/+45
Add support to highlight the contention line in the annotate browser, use 'TAB'/'UNTAB' to refocus to the contention line. Signed-off-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Jiebin Sun <jiebin.sun@intel.com> Reviewed-by: Pan Deng <pan.deng@intel.com> Reviewed-by: Zhiguo Zhou <zhiguo.zhou@intel.com> Reviewed-by: Wangyang Guo <wangyang.guo@intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-09-09 | perf annotate: Fix title line after return from call | Namhyung Kim | 1 | -3/+7
The second title line, which shows the symbol and DSO name, is broken after moving to another function at a 'callq' instruction. ui_browser__show_title() is used for the first line, which shows the global sample count and event name, so it doesn't change across functions. What is needed after processing a 'call' instruction is to update the second line only. Add a comment and call the appropriate function. You can verify the change by pressing ENTER on a 'call' instruction and then ESC. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-09 | perf annotate: Factor out annotate_browser__show_function_title() | Namhyung Kim | 1 | -11/+16
It'll be used in other places. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-09 | perf annotate: Fix signature of annotate_browser__show() | Namhyung Kim | 1 | -12/+13
According to convention, the first argument should be 'struct annotate_browser' instead of 'struct ui_browser'. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250908061050.27517-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-02 | perf annotate: Use a hashmap to save type data | Namhyung Kim | 1 | -1/+33
It can slow down the annotation browser if objdump is processing large DWARF data. Let's add a hashmap to save the data type info for each line. Note that this is needed for TUI only, because stdio processes each line only once, whereas TUI displays the same line again whenever it refreshes the screen. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250816031635.25318-13-namhyung@kernel.org [ Add lines around an if block and use zfree() in one case, acked by Namhyung ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
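The caching idea can be sketched as follows. This is not perf's hashmap API; it is a small fixed-size, linear-probing table with hypothetical names (demo_type_cache, demo_get_type), and the "expensive" lookup is a stand-in that only derives a fake type id. It shows why a screen refresh that asks for the same line again does not have to repeat the slow query.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_SLOTS 64	/* hypothetical size, must be a power of two */

struct demo_type_cache {
	uint64_t keys[DEMO_CACHE_SLOTS];
	int values[DEMO_CACHE_SLOTS];
	unsigned char used[DEMO_CACHE_SLOTS];
	unsigned int slow_lookups;	/* how often the expensive path ran */
};

/* Stand-in for an expensive DWARF query: derives a fake type id. */
static int demo_slow_lookup(struct demo_type_cache *c, uint64_t offset)
{
	c->slow_lookups++;
	return (int)(offset % 7);
}

/* Return the type id for a line offset, computing it at most once. */
int demo_get_type(struct demo_type_cache *c, uint64_t offset)
{
	size_t start = (size_t)(offset & (DEMO_CACHE_SLOTS - 1));
	size_t n;

	for (n = 0; n < DEMO_CACHE_SLOTS; n++) {
		size_t i = (start + n) & (DEMO_CACHE_SLOTS - 1);

		if (!c->used[i]) {
			/* Miss: compute once and remember the result. */
			c->keys[i] = offset;
			c->values[i] = demo_slow_lookup(c, offset);
			c->used[i] = 1;
			return c->values[i];
		}
		if (c->keys[i] == offset)
			return c->values[i];	/* hit: no slow lookup */
	}
	/* Table full: still answer, just without caching. */
	return demo_slow_lookup(c, offset);
}
```

Calling demo_get_type() twice with the same offset on a zero-initialized cache runs the slow path once, which mirrors the TUI redrawing the same line on every refresh.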
2025-08-28 | perf annotate: Add dso__debuginfo() helper | Namhyung Kim | 1 | -2/+2
It'd be great if it can get the correct debug information using DSO build-Id not just the path name. Instead of adding new callsites of debuginfo__new(), let's add dso__debuginfo() which can hide the access using the pathname and help the future conversion. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250816031635.25318-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-08-28 | perf annotate: Show warning when debuginfo is not available | Namhyung Kim | 1 | -0/+17
When user requests data-type annotation but no DWARF info is available, show a warning message about it. Warning: DWARF debuginfo not found. Data-type in this DSO will not be displayed. Please make sure to have debug information. Press any key... Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250816031635.25318-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-08-28 | perf annotate: Add 'T' hot key to toggle data type display | Namhyung Kim | 1 | -5/+12
Support data type display with a key press so that users can toggle the output dynamically on TUI. Also display "[Type]" in the title line if it's enabled. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250816031635.25318-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-08-28 | perf annotate: Add --code-with-type support for TUI | Namhyung Kim | 1 | -0/+6
Until now, the --code-with-type option has been available only on stdio. But that was an artificial limitation caused by an implementation issue. Implement the same logic in annotation_line__write() for stdio2/TUI, remove the limitation, and update the man page. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250816031635.25318-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-08-28 | perf annotate: Pass annotation_print_data to annotation_line__write() | Namhyung Kim | 1 | -2/+11
It will be used for data type display later. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250816031635.25318-5-namhyung@kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: LKML <linux-kernel@vger.kernel.org> Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-08-28 | perf annotate: Rename to __hist_entry__tui_annotate() | Namhyung Kim | 2 | -11/+31
There are three different but similar functions for annotation on TUI. Rename it to __hist_entry__tui_annotate() and make sure it passes 'he'. It's not used for now but it'll be needed for later use. Also remove map_symbol__tui_annotate() which was a simple wrapper. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250816031635.25318-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-07-25 | perf evlist: Change env variable to session | Ian Rogers | 2 | -4/+2
The session holds a perf_env pointer env. In UI code container_of is used to turn the env to a session, but this assumes the session header's env is in use. Rather than a dubious container_of, hold the session in the evlist and derive the env from the session with evsel__env, perf_session__env, etc. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
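A rough sketch of why the container_of approach is fragile, using hypothetical demo_* types rather than the real perf structures: recovering the session from an env pointer is only valid when that pointer refers to the env embedded in the session, while a stored session pointer works no matter which env is in use.

```c
#include <stddef.h>

struct demo_env { int nr_cpus; };

struct demo_session {
	int id;
	struct demo_env header_env;	/* env embedded in the session */
	struct demo_env *env;		/* env actually in use, may live elsewhere */
};

struct demo_evlist {
	struct demo_session *session;	/* hold the session, derive the env */
};

/*
 * Only valid if 'env' really is &session->header_env; for any other
 * env the pointer arithmetic yields a bogus session pointer.
 */
struct demo_session *demo_session_from_header_env(struct demo_env *env)
{
	return (struct demo_session *)((char *)env -
				       offsetof(struct demo_session, header_env));
}

struct demo_env *demo_session__env(struct demo_session *session)
{
	return session->env;
}

/* The safer direction: go from the evlist to the session to the env. */
struct demo_env *demo_evlist__env(struct demo_evlist *evlist)
{
	return evlist->session ? demo_session__env(evlist->session) : NULL;
}
```

With a session whose env pointer refers to a separately allocated demo_env, demo_evlist__env() still returns the right object, whereas demo_session_from_header_env() must not be called on it.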
2025-07-22 | perf ui scripts: Switch FILENAME_MAX to NAME_MAX | Ian Rogers | 1 | -1/+1
FILENAME_MAX is the same as PATH_MAX (4kb) in glibc rather than NAME_MAX's 255. Switch to using NAME_MAX and ensure the '\0' is accounted for in the path's buffer size. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250717150855.1032526-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
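As an illustration of accounting for the terminating '\0', the sketch below sizes a buffer as NAME_MAX + 1 and refuses names that do not fit. The helper name demo_copy_name is hypothetical, and the fallback definition of NAME_MAX is an assumption for toolchains where <limits.h> does not expose it.

```c
#include <limits.h>	/* NAME_MAX on POSIX systems */
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255	/* assumed fallback; matches the value quoted above */
#endif

/*
 * Copy a single file name (not a full path) into dst.
 * Returns 0 on success, -1 if the name is longer than NAME_MAX.
 */
int demo_copy_name(char dst[NAME_MAX + 1], const char *name)
{
	size_t len = strlen(name);

	if (len > NAME_MAX)
		return -1;
	memcpy(dst, name, len + 1);	/* len + 1 copies the '\0' too */
	return 0;
}
```

A caller would declare the destination as `char buf[NAME_MAX + 1];` so that a maximum-length name and its terminator both fit.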
2025-06-26 | perf annotate: Fix source code annotate with objdump | Namhyung Kim | 1 | -3/+83
Recently perf started using llvm and capstone to speed up annotation or disassembly of instructions. But they don't support source code view yet. Until that is fixed, we can force the use of objdump for source code annotation. To prevent performance loss, it's disabled by default and turned on when the user requests it in the TUI by pressing the 's' key. Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625230339.702610-1-namhyung@kernel.org Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-05-02 | perf mem: Add 'dtlb' output field | Namhyung Kim | 1 | -0/+3
This is a breakdown of perf_mem_data_src.mem_dtlb values. It assumes PMU drivers would set PERF_MEM_TLB_HIT bit with an appropriate level. And having PERF_MEM_TLB_MISS means that it failed to find one in any levels of TLB. For now, it doesn't use PERF_MEM_TLB_{WK,OS} bits. Also it seems Intel machines don't distinguish L1 or L2 precisely. So I added ANY_HIT (printed as "L?-Hit") to handle the case.

  $ perf mem report -F overhead,dtlb,dso --stdio
  ...
  #           --- D-TLB ----
  # Overhead  L?-Hit    Miss  Shared Object
  # ........  ..............  .................
  #
      67.03%   99.5%    0.5%  [unknown]
      31.23%   99.2%    0.8%  [kernel.kallsyms]
       1.08%   97.8%    2.2%  [i915]
       0.36%  100.0%    0.0%  [JIT] tid 6853
       0.12%  100.0%    0.0%  [drm]
       0.05%  100.0%    0.0%  [drm_kms_helper]
       0.05%  100.0%    0.0%  [ext4]
       0.02%  100.0%    0.0%  [aesni_intel]
       0.02%  100.0%    0.0%  [crc32c_intel]
       0.02%  100.0%    0.0%  [dm_crypt]
  ...

Committer testing:

  # perf report --header | grep cpudesc
  # cpudesc : AMD Ryzen 9 9950X3D 16-Core Processor
  # perf mem report -F overhead,dtlb,dso --stdio | head -20
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:P'
  # Total weight : 2637
  # Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
  #
  #           ---------- D-TLB -----------
  # Overhead  L1-Hit  L2-Hit    Miss   Other  Shared Object
  # ........  ............................  .................................
  #
      77.47%   18.4%    0.1%    0.6%   80.9%  [kernel.kallsyms]
       5.61%   36.5%    0.7%    1.4%   61.5%  libxul.so
       2.77%   39.7%    0.0%   12.3%   47.9%  libc.so.6
       2.01%   34.0%    1.9%    1.9%   62.3%  libglib-2.0.so.0.8400.1
       1.93%   31.4%    2.0%    2.0%   64.7%  [amdgpu]
       1.63%   48.8%    0.0%    0.0%   51.2%  [JIT] tid 60168
       1.14%    3.3%    0.0%    0.0%   96.7%  [vdso]
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 | perf mem: Add 'snoop' output field | Namhyung Kim | 1 | -0/+3
This is a breakdown of perf_mem_data_src.mem_snoop values. For now, it doesn't use mem_snoopx values like FWD and PEER.

  $ perf mem report -F overhead,snoop,comm --stdio
  ...
  #           ---------- Snoop -----------
  # Overhead     Hit    HitM    Miss   Other  Command
  # ........  ............................  ...............
  #
      34.24%    0.6%    0.0%    0.0%   99.4%  gnome-shell
      12.02%    1.0%    0.0%    0.0%   99.0%  chrome
       9.32%    1.0%    0.0%    0.3%   98.7%  Isolated Web Co
       6.85%    1.0%    0.3%    0.0%   98.6%  swapper
       6.30%    0.8%    0.8%    0.0%   98.5%  Xorg
       3.02%    2.4%    0.0%    0.0%   97.6%  VizCompositorTh
       2.35%    0.0%    0.0%    0.0%  100.0%  firefox-esr
       2.04%    0.0%    0.0%    0.0%  100.0%  JS Helper
       1.51%    3.2%    0.0%    0.0%   96.8%  threaded-ml
       1.44%    0.0%    0.0%    0.0%  100.0%  AudioIP~allback
  ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-11-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 | perf mem: Add 'cache' and 'memory' output fields | Namhyung Kim | 1 | -0/+6
This is a breakdown of perf_mem_data_src.mem_lvl_num. But it's also divided into two parts because the combination is bigger than 8. Since there are many entries for different cache levels, 'cache' field focuses on them. I generalized buffers like LFB, MAB and MHB to L1-buf and L2-buf. The rest goes to 'memory' field which can be RAM, CXL, PMEM, IO, etc.

  $ perf mem report -F cache,mem,dso --stdio
  ...
  #
  # -------------- Cache --------------  --- Memory ---
  #     L1      L2      L3  L1-buf   Other     RAM   Other  Shared Object
  # ...................................  ..............  ....................................
  #
     53.9%    3.6%   16.2%   21.6%    4.8%    4.8%   95.2%  [kernel.kallsyms]
     64.7%    1.7%    3.5%   17.4%   12.8%   12.8%   87.2%  chrome (deleted)
     78.3%    2.8%    0.0%    1.0%   17.9%   17.9%   82.1%  libc.so.6
     39.6%    1.5%    0.0%    5.7%   53.2%   53.2%   46.8%  libxul.so
     26.2%    0.0%    0.0%    0.0%   73.8%   73.8%   26.2%  [unknown]
     85.5%    0.0%    0.0%   14.5%    0.0%    0.0%  100.0%  libspa-audioconvert.so
     66.3%    4.4%    0.0%   29.4%    0.0%    0.0%  100.0%  libglib-2.0.so.0.8200.1 (deleted)
      1.9%    0.0%    0.0%    0.0%   98.1%   98.1%    1.9%  libmutter-cogl-15.so.0.0.0 (deleted)
     10.6%    0.0%    0.0%   89.4%    0.0%    0.0%  100.0%  libpulsecommon-16.1.so
      0.0%    0.0%    0.0%  100.0%    0.0%    0.0%  100.0%  libfreeblpriv3.so (deleted)
  ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 | perf mem: Add 'op' output field | Namhyung Kim | 1 | -0/+3
This is an actual example of the he_mem_stat based sample breakdown. It uses 'mem_op' field of union perf_mem_data_src which means memory operations. It'd have basically 'load' or 'store' which can be useful if PMU doesn't have separate events for them like IBS or SPE. In addition, there's an entry in case load and store happen at the same time. Also adds entries for prefetching and execution.

  $ perf mem report -F +op -s comm --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 4K of event 'ibs_op//'
  # Total weight : 9559
  # Sort order : comm
  #
  #                    --------------------- Mem Op ----------------------
  # Overhead  Samples    Load   Store   Ld+St  Pfetch    Exec   Other     N/A     N/A  Command
  # ........  .......  ...................................................  ...............
  #
      44.85%     4077   21.1%   30.7%    0.0%    0.0%    0.0%   48.3%    0.0%    0.0%  swapper
      26.82%       45   98.8%    0.3%    0.0%    0.0%    0.0%    0.9%    0.0%    0.0%  netsli-prober
       7.19%      442   51.7%   13.7%    0.0%    0.0%    0.0%   34.6%    0.0%    0.0%  perf
       5.81%       75   89.7%    2.2%    0.0%    0.0%    0.0%    8.1%    0.0%    0.0%  qemu-system-ppc
       4.77%        1  100.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%  notifications_c
       1.77%       10   95.9%    1.2%    0.0%    0.0%    0.0%    3.0%    0.0%    0.0%  MemoryReleaser
       0.77%       32   71.6%    4.1%    0.0%    0.0%    0.0%   24.3%    0.0%    0.0%  DefaultEventMan
       0.19%       10   66.7%   22.2%    0.0%    0.0%    0.0%   11.1%    0.0%    0.0%  gnome-shell

Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 | perf hist: Implement output fields for mem stats | Namhyung Kim | 1 | -0/+11
This is a preparation for later changes to support mem_stat output. The new fields will need two lines for the header - the first line will show type of mem stat and the second line will show the name of each item which is returned by mem_stat_name(). Each element in the mem_stat array will be printed in percentage for the hist_entry and their sum would be 100%. Add new output field dimension only for SORT_MODE__MEM using mem_stat. To handle possible name conflict with existing sort keys, move the order of checking output field dimensions after the sort dimensions when it looks for sort keys. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
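The percentage rule described above (each element shown as a share of the entry, with the shares summing to 100%) can be sketched like this. The array length of 8 and the demo_* names are assumptions for illustration, not perf's definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MEM_STAT_LEN 8	/* assumed length, for illustration only */

/*
 * Convert per-item counts for one histogram entry into percentages.
 * If the entry has no samples at all, every item is reported as 0%.
 */
void demo_mem_stat_percent(const uint64_t counts[DEMO_MEM_STAT_LEN],
			   double percent[DEMO_MEM_STAT_LEN])
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < DEMO_MEM_STAT_LEN; i++)
		total += counts[i];

	for (i = 0; i < DEMO_MEM_STAT_LEN; i++)
		percent[i] = total ? 100.0 * (double)counts[i] / (double)total : 0.0;
}
```

For counts of 1 and 3 in the first two slots and zero elsewhere, the first two percentages come out as 25% and 75%, and the row adds up to 100%.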
2025-05-02 | perf hist: Support multi-line header | Namhyung Kim | 1 | -8/+16
This is a preparation to support multi-line headers in 'perf mem report'. Normal sort keys and output fields that don't have contents for multi-line will print the header string at the last line only. As we don't use multi-line headers normally, it should not have any changes in the output. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser hists: Set actions->thread before calling do_zoom_thread() | Arnaldo Carvalho de Melo | 1 | -1/+1
In 7cecb7fe8388d5c3 ("perf hists: Move sort__has_comm into struct perf_hpp_list") it assumes that act->thread is set prior to calling do_zoom_thread(). This doesn't happen when we use ESC or the Left arrow key to zoom out of a specific thread, so the operation doesn't work and we get stuck in the thread zoom. In 6422184b087ff435 ("perf hists browser: Simplify zooming code using pstack_peek()") it says there is no need to set actions->thread, and at that point that was true, but in 7cecb7fe8388d5c3 an actions->thread == NULL check was added before the zoom out of thread could kick in. We can zoom out using the alternative 't' thread zoom toggle hotkey to finally set actions->thread before calling do_zoom_thread() and zoom out, but let's also fix the ESC/Zoom out of thread case. Fixes: 7cecb7fe8388d5c3 ("perf hists: Move sort__has_comm into struct perf_hpp_list") Reported-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser hists: Simplify the routines that add entries to the popup menu | Arnaldo Carvalho de Melo | 1 | -38/+23
Some don't need some args, ditch them, also struct popup_actions->evsel isn't needed as it is always obtainable from hists_to_evsel(browser->hists). This way we simplify debugging by reducing this needless complexity. Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_dkNDj9EPFwPqq1@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser annotate: Don't show the source code view status initially | Arnaldo Carvalho de Melo | 1 | -2/+9
To avoid initial clutter, and not to change the view for users who are not interested in toggling the source code view, just show it when the user does the first toggle keypress (pressing 's'). I know that there are users that really disable the source code view by using: # perf config annotate.hide_src_code=yes Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser annotate: Show in the title the source code view toggle | Arnaldo Carvalho de Melo | 1 | -3/+13
Ingo reported that having a visual cue if the source code view is enabled will help in noticing a bug when no source is presented. Change the title scnprintf routine for the annotation browser to do that. More work is needed to have the capabilities of the existing disassemblers listed somehow and start using the fastest one but switch to another that provides features only made available by some particular one, like the first one, the objdump output parsing one. Suggested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser map: Provide feedback on unhandled hotkeys | Arnaldo Carvalho de Melo | 1 | -1/+3
Don't just eat unknown keys without providing visual feedback and instructions on how to see which ones are assigned. Suggested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser hists: Provide feedback on unhandled hotkeys | Arnaldo Carvalho de Melo | 1 | -2/+8
Don't just eat unknown keys without providing visual feedback and instructions on how to see which ones are assigned. Suggested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser header: Provide feedback on unhandled hotkeys | Arnaldo Carvalho de Melo | 1 | -0/+1
Don't just eat unknown keys without providing visual feedback and instructions on how to see which ones are assigned. Suggested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser annotate: Provide feedback on unhandled hotkeys | Arnaldo Carvalho de Melo | 1 | -0/+1
Don't just eat unknown keys without providing visual feedback and instructions on how to see which ones are assigned. Suggested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 | perf ui browser annotate-data: Provide feedback on unhandled hotkeys | Arnaldo Carvalho de Melo | 1 | -0/+1
Don't just eat unknown keys without providing visual feedback and instructions on how to see which ones are assigned. Suggested-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-02-18 | perf report: Add latency output field | Dmitry Vyukov | 1 | -10/+17
Latency output field is similar to overhead, but represents overhead for latency rather than CPU consumption. It's re-scaled from overhead by dividing weight by the current parallelism level at the time of the sample. It effectively models profiling with 1 sample taken per unit of wall-clock time rather than unit of CPU time. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/b6269518758c2166e6ffdc2f0e24cfdecc8ef9c1.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
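The re-scaling described above amounts to one division per sample. The sketch below uses a hypothetical helper name and treats a parallelism level of 0 as 1, which is an assumption made here to avoid dividing by zero, not something the commit message states.

```c
#include <stdint.h>

/*
 * Re-scale a sample's weight for the latency view: a sample taken
 * while N threads were running in parallel contributes weight / N.
 * Treating parallelism 0 as 1 is an assumption of this sketch.
 */
double demo_latency_weight(uint64_t weight, unsigned int parallelism)
{
	if (parallelism == 0)
		parallelism = 1;
	return (double)weight / (double)parallelism;
}
```

For example, a sample of weight 1000 taken while 4 threads were running counts as 250 toward latency, while the same sample taken during a single-threaded phase counts as the full 1000. Serial phases therefore dominate the latency view, as they dominate wall-clock time.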
2025-01-18 | perf annotate: Prefer passing evsel to evsel->core.idx | Ian Rogers | 1 | -1/+1
An evsel idx may not be stable due to sorting, evlist removal, etc. Try to reduce it being part of APIs by explicitly passing the evsel in annotate code. Internally the code just reads evsel->core.idx so behavior is unchanged. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Chen Ni <nichen@iscas.ac.cn> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20250117181848.690474-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-18 | perf script: Move find_scripts to browser/scripts.c | Ian Rogers | 1 | -2/+175
The only use of find_scripts is in browser/scripts.c but the definition in builtin causes linking problems requiring a stub in python.c. Move the function to allow the stub to be removed. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-21  perf annotate-data: Show offset and size in hex  Namhyung Kim  1  -2/+2

It'd be better to have them in hex to check cacheline alignment.

    Percent  offset    size  field
     100.00       0   0x1c0  struct cfs_rq {
       0.00       0    0x10      struct load_weight load {
       0.00       0     0x8          long unsigned int weight;
       0.00     0x8     0x4          u32 inv_weight;
                                 };
       0.00    0x10     0x4      unsigned int nr_running;
      14.56    0x14     0x4      unsigned int h_nr_running;
       0.00    0x18     0x4      unsigned int idle_nr_running;
       0.00    0x1c     0x4      unsigned int idle_h_nr_running;
    ...

Committer notes:

Justification from Namhyung when asked about why it would be "better":

  Cache line sizes are power of 2 so it'd be natural to use hex and check whether an offset is in the same boundary. Also 'perf annotate' shows instruction offsets in hex.

  > Maybe this should be selectable?

  I can add an option and/or a config if you want.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819233603.54941-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
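The cache-line argument in the committer notes can be made concrete with a short sketch. The 64-byte line size and the helper names below are assumptions for illustration (the commit does not state a line size), and the check is only meaningful if the structure itself starts on a cache-line boundary.

```python
CACHE_LINE_SIZE = 64  # assumed line size (0x40); not stated in the commit


def cache_line_index(offset, line_size=CACHE_LINE_SIZE):
    """Index of the cache line that a byte offset falls into."""
    return offset // line_size


def same_cache_line(a, b, line_size=CACHE_LINE_SIZE):
    """True if both offsets fall into the same cache line."""
    return cache_line_index(a, line_size) == cache_line_index(b, line_size)


# Offsets taken from the cfs_rq layout in the commit message.
print(same_cache_line(0x10, 0x14))  # nr_running and h_nr_running
print(cache_line_index(0x1c0))      # number of whole lines covered by the struct size
```

With hex offsets the same check can be done by eye: for a 0x40-byte line, two offsets share a line when they agree in every hex digit above the low six bits.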
2024-08-14  perf annotate: Display the branch counter histogram  Kan Liang  2  -3/+18

Display the branch counter histogram in the annotation view.

Press 'B' to display the branch counter's abbreviation list as well.

    Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }', 4000 Hz, Event count (approx.):
    f3  /home/sdp/test/tchain_edit [Percent: local period]
    Percent         │ IPC Cycle Branch Counter (Average IPC: 1.39, IPC Coverage: 29.4%)
                    │                     0000000000401755 <f3>:
      0.00    0.00  │                           endbr64
                    │                           push   %rbp
                    │                           mov    %rsp,%rbp
                    │                           movl   $0x0,-0x4(%rbp)
      0.00    0.00  │1.33     3  |A   |-   |  ↓ jmp    25
     11.03   11.03  │                     11:   mov    -0x4(%rbp),%eax
                    │                           and    $0x1,%eax
                    │                           test   %eax,%eax
     17.13   17.13  │2.41     1  |A   |-   |  ↓ je     21
                    │                           addl   $0x1,-0x4(%rbp)
     21.84   21.84  │2.22     2  |AA  |-   |  ↓ jmp    25
     17.13   17.13  │                     21:   addl   $0x1,-0x4(%rbp)
     21.84   21.84  │                     25:   cmpl   $0x270f,-0x4(%rbp)
     11.03   11.03  │0.61     3  |A   |-   |  ↑ jle    11
                    │                           nop
                    │                           pop    %rbp
      0.00    0.00  │0.24    20  |AA  |B   |  ← ret

Originally-by: Tinghao Zhang <tinghao.zhang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-8-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14  perf report: Display the branch counter histogram  Kan Liang  1  -1/+16

Reuse the existing --total-cycles option to display the branch counters. Add a new PERF_HPP_REPORT__BLOCK_BRANCH_COUNTER to display the logged branch counter events. They are shown right after all the cycle-related annotations. Extend the 'struct block_info' to store and pass the branch counter related information.

The annotation_br_cntr_entry() prints the histogram of each branch counter event. If the number of logged events is less than 4, the abbreviation is printed exactly that many times. Otherwise, '+' stands for more than 3 events. Assume the number of logged events is less than 4.

The annotation_br_cntr_abbr_list() prints the branch counter's abbreviation list. Press 'B' to display the list in the TUI mode.

    $ perf record -e "{branch-instructions:ppp,branch-misses}:S" -j any,counter
    $ perf report --total-cycles --stdio

    # To display the perf.data header info, please use --header/--header-only options.
    #
    #
    # Total Lost Samples: 0
    #
    # Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }'
    # Event count (approx.): 1610046
    #
    # Branch counter abbr list:
    # branch-instructions:ppp = A
    # branch-misses = B
    # '-' No event occurs
    # '+' Event occurrences may be lost due to branch counter saturated
    #
    # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter  [Program Block Range]
    # ...............  ..............  ...........  ..........  ..............  ..................
    #
               57.55%            2.5M        0.00%           3     |A   |-   |  ...
               25.27%            1.1M        0.00%           2     |AA  |-   |  ...
               15.61%          667.2K        0.00%           1     |A   |-   |  ...
                0.16%            6.9K        0.81%         575     |A   |-   |  ...
                0.16%            6.8K        1.38%         977     |AA  |-   |  ...
                0.16%            6.8K        0.04%          28     |AA  |B   |  ...
                0.15%            6.6K        1.33%         946     |A   |-   |  ...
                0.11%            4.5K        0.06%          46     |AAA+|-   |  ...
                0.10%            4.4K        0.88%         624     |A   |-   |  ...
                0.09%            3.7K        0.74%         524     |AAA+|B   |  ...

With -v applied,

    # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter  [Program Block Range]
    # ...............  ..............  ...........  ..........  ..............  ..................
    #
               57.55%            2.5M        0.00%           3        A=1 ,B=-  ...
               25.27%            1.1M        0.00%           2        A=2 ,B=-  ...
               15.61%          667.2K        0.00%           1        A=1 ,B=-  ...
                0.16%            6.9K        0.81%         575        A=1 ,B=-  ...
                0.16%            6.8K        1.38%         977        A=2 ,B=-  ...
                0.16%            6.8K        0.04%          28        A=2 ,B=1  ...
                0.15%            6.6K        1.33%         946        A=1 ,B=-  ...
                0.11%            4.5K        0.06%          46        A=3+,B=-  ...
                0.10%            4.4K        0.88%         624        A=1 ,B=-  ...
                0.09%            3.7K        0.74%         524        A=3+,B=1  ...

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
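The non-verbose histogram cells in this commit message (for example `|AA  |-   |` and `|AAA+|B   |`) follow a simple rule that can be sketched as below. This is only an illustration of the rule as the message states it, not the code of annotation_br_cntr_entry(): the function names, the fixed cell width of 4, and the use of a dict are all assumptions, and saturation handling is left out.

```python
def format_counter(abbr, count, width=4):
    """Render one cell: '-' for no events, the abbreviation repeated for
    1 to 3 events, and three abbreviations plus '+' for more than 3."""
    if count <= 0:
        cell = "-"
    elif count < 4:
        cell = abbr * count
    else:
        cell = abbr * 3 + "+"
    return cell.ljust(width)  # pad so every cell has the same width


def format_histogram(counts):
    """Join the cells of all counters, e.g. {'A': 2, 'B': 0}."""
    cells = [format_counter(abbr, n) for abbr, n in counts.items()]
    return "|" + "|".join(cells) + "|"


print(format_histogram({"A": 1, "B": 0}))
print(format_histogram({"A": 5, "B": 1}))
```

The fixed width keeps the column aligned no matter how many events were logged, which is why counts above 3 collapse into a single '+'.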
2024-08-12  perf annotate-data: Show first-level children by default in TUI  Namhyung Kim  1  -2/+10

Now the default is to fold everything, but that shows only the name of the top-level data type, which is not very useful. Instead, expand the top-level entry so that it shows the layout at a higher level.

    Annotate type: 'struct task_struct' (4 samples)
         Percent  Offset    Size  Field
    - 100.00       0    9792  struct task_struct {                        ◆
    +   0.50       0      24      struct thread_info thread_info;         ▒
        0.00      24       4      unsigned int __state;                   ▒
        0.00      32       8      void* stack;                            ▒
    +   0.00      40       4      refcount_t usage;                       ▒
        0.00      44       4      unsigned int flags;                     ▒
        0.00      48       4      unsigned int ptrace;                    ▒
        0.00      52       4      int on_cpu;                             ▒
    +   0.00      56      16      struct __call_single_node wake_entry;   ▒
        0.00      72       4      unsigned int wakee_flips;               ▒
        0.00      80       8      long unsigned int wakee_flip_decay_ts;  ▒
        0.00      88       8      struct task_struct* last_wakee;         ▒
        0.00      96       4      int recent_used_cpu;                    ▒
        0.00     100       4      int wake_cpu;                           ▒
        0.00     104       4      int on_rq;                              ▒
        0.00     108       4      int prio;                               ▒
        0.00     112       4      int static_prio;                        ▒
        0.00     116       4      int normal_prio;                        ▒
        0.00     120       4      unsigned int rt_priority;               ▒
    +   0.00     128     256      struct sched_entity se;                 ▒
    +   0.00     384      48      struct sched_rt_entity rt;              ▒
    +   0.00     432     224      struct sched_dl_entity dl;              ▒
        0.00     656       8      struct sched_class* sched_class;        ▒
    ...

Committer testing:

    # perf mem record -a sleep 5s
    # perf annotate --group --data-type=pthread_mutex_t
    Annotate type: 'pthread_mutex_t' (13 samples)
         Percent  Offset    Size  Field
    - 100.00       0