aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util/header.h
AgeCommit message (Collapse)AuthorFilesLines
2023-05-23perf: Extract building cache level for a CPU into separate functionK Prateek Nayak1-0/+4
build_caches() builds the complete cache topology of the system by iterating over all CPU, building and comparing cache levels of each CPU, keeping only the unique ones at the end. Extract the unit that build the cache levels for a single CPU into a separate function. Expose this function, and the MAX_CACHE_LVL value to be used elsewhere in perf too. Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Gautham Shenoy <gautham.shenoy@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Wen Pu <puwen@hygon.cn> Link: https://lore.kernel.org/r/20230517172745.5833-2-kprateek.nayak@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-10perf header: Move perf_version_string declarationIan Rogers1-0/+2
Move to match the definition in header.c. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Chengdong Li <chengdongli@tencent.com> Cc: Denis Nikitin <denik@chromium.org> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Raul Silvera <rsilvera@google.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230410162511.3055900-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf build: Use libtraceevent from the systemIan Rogers1-0/+2
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command line variables. If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the build, don't compile in libtraceevent and libtracefs support. This also disables CONFIG_TRACE that controls "perf trace". CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles, HAVE_LIBTRACEEVENT is used in C code. Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the commands kmem, kwork, lock, sched and timechart are removed. The majority of commands continue to work including "perf test". Committer notes: Fixed up a tools/perf/util/Build reject and added: #include <traceevent/event-parse.h> to tools/perf/util/scripting-engines/trace-event-perl.c. Committer testing: $ rpm -qi libtraceevent-devel Name : libtraceevent-devel Version : 1.5.3 Release : 2.fc36 Architecture: x86_64 Install Date: Mon 25 Jul 2022 03:20:19 PM -03 Group : Unspecified Size : 27728 License : LGPLv2+ and GPLv2+ Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4 Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm Build Date : Fri 15 Apr 2022 10:57:01 AM -03 Build Host : buildvm-x86-05.iad2.fedoraproject.org Packager : Fedora Project Vendor : Fedora Project URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/ Bug URL : https://bugz.fedoraproject.org/libtraceevent Summary : Development headers of libtraceevent Description : Development headers of libtraceevent-libs $ Default build: $ ldd ~/bin/perf | grep tracee libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000) $ # perf trace -e sched:* --max-events 10 0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1) 0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1) 0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120) 1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120) 1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120) 0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2) 0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2) 0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120) 1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1) 1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120) # Had to tweak tools/perf/util/setup.py to make sure the python binding shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is present in CFLAGS. Building with NO_LIBTRACEEVENT=1 uncovered some more build failures: - Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y - perf-$(CONFIG_LIBTRACEEVENT) += scripts/ - bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y - The python binding needed some fixups and util/trace-event.c can't be built and linked with the python binding shared object, so remove it in tools/perf/util/setup.py and exclude it from the list of dependencies in the python/perf.so Makefile.perf target. Building without libtraceevent-devel installed uncovered more build failures: - The python binding tools/perf/util/python.c was assuming that traceevent/parse-events.h was always available, which was the case when we defaulted to using the in-kernel tools/lib/traceevent/ files, now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like the other parts of it that deal with tracepoints. - We have to ifdef the rules in the Build files with CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't detect libtraceevent-devel installed in the system. Simplification here to avoid these two ways of disabling builtin-trace.c and not having CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean way. From Athira: <quote> tools/perf/arch/powerpc/util/Build -perf-y += kvm-stat.o +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o </quote> Then, ditto for arm64 and s390, detected by container cross build tests. - s/390 uses test__checkevent_tracepoint() that is now only available if HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT. Also from Athira: <quote> With this change, I could successfully compile in these environment: - Without libtraceevent-devel installed - With libtraceevent-devel installed - With “make NO_LIBTRACEEVENT=1” </quote> Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for consistency with other libraries detected in tools/perf/. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-18Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo1-0/+2
To update the perf/core codebase. Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end of evsel__config(), after what was added in: 49c692b7dfc9b6c0 ("perf offcpu: Accept allowed sample types only") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26perf inject: Adjust output data offset for backward compatibilityRaul Silvera1-0/+2
When 'perf inject' creates a new file, it reuses the data offset from the input file. If there has been a change on the size of the header, as happened in v5.12 -> v5.13, the new offsets will be wrong, resulting in a corrupted output file. This change adds the function perf_session__data_offset to compute the data offset based on the current header size, and uses that instead of the offset from the original input file. Signed-off-by: Raul Silvera <rsilvera@google.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.king@intel.com> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220621152725.2668041-1-rsilvera@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-24perf header: Record non-CPU PMU capabilitiesRavi Bangoria1-1/+1
PMUs advertise their capabilities via sysfs attribute files but the perf tool currently parses only core(CPU) or hybrid core PMU capabilities. Add support of recording non-core PMU capabilities int perf.data header. Note that a newly proposed HEADER_PMU_CAPS is replacing existing HEADER_HYBRID_CPU_PMU_CAPS. Special care is taken for hybrid core PMUs by writing their capabilities first in the perf.data header to make sure new perf.data file being read by old perf tool does not break. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <rrichter@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: like.xu.linux@gmail.com Cc: x86@kernel.org Link: https://lore.kernel.org/r/20220604044519.594-6-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23perf inject: Keep some features sections from input fileAdrian Hunter1-0/+5
perf inject overwrites feature sections with information from the current machine. It makes more sense to keep original information that describes the machine or software when perf record was run. Example: perf.data from "Desktop" injected on "nuc11" Before: $ perf script --header-only -i perf.data-from-desktop | head -15 # ======== # captured on : Thu May 19 09:55:50 2022 # header version : 1 # data offset : 1208 # data size : 837480 # feat offset : 838688 # hostname : Desktop # os release : 5.13.0-41-generic # perf version : 5.18.rc5.gac837f7ca7ed # arch : x86_64 # nrcpus online : 28 # nrcpus avail : 28 # cpudesc : Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz # cpuid : GenuineIntel,6,85,4 # total memory : 65548656 kB $ perf inject -i perf.data-from-desktop -o injected-perf.data $ perf script --header-only -i injected-perf.data | head -15 # ======== # captured on : Fri May 20 15:06:55 2022 # header version : 1 # data offset : 1208 # data size : 837480 # feat offset : 838688 # hostname : nuc11 # os release : 5.17.5-local # perf version : 5.18.rc5.g0f828fdeb9af # arch : x86_64 # nrcpus online : 8 # nrcpus avail : 8 # cpudesc : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz # cpuid : GenuineIntel,6,140,1 # total memory : 16012124 kB After: $ perf inject -i perf.data-from-desktop -o injected-perf.data $ perf script --header-only -i injected-perf.data | head -15 # ======== # captured on : Fri May 20 15:08:54 2022 # header version : 1 # data offset : 1208 # data size : 837480 # feat offset : 838688 # hostname : Desktop # os release : 5.13.0-41-generic # perf version : 5.18.rc5.gac837f7ca7ed # arch : x86_64 # nrcpus online : 28 # nrcpus avail : 28 # cpudesc : Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz # cpuid : GenuineIntel,6,85,4 # total memory : 65548656 kB Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220520132404.25853-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23perf header: Add ability to keep feature sectionsAdrian Hunter1-0/+10
Many feature sections should not be re-written during perf inject. In preparation to support that, add callbacks that a tool can use to copy a feature section from elsewhere. perf inject will use this facility to copy features sections from the input file. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220520132404.25853-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-14perf bench: Fix numa testcase to check if CPU used to bind task is onlineAthira Rajeev1-0/+1
Perf numa bench test fails with error: Testcase: ./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk Failure snippet: <<>> Running 'numa/mem' benchmark: # Running main, "perf bench numa numa-mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk" perf: bench/numa.c:333: bind_to_cpumask: Assertion `!(ret)' failed. <<>> The Testcases uses CPU's 0 and 8. In function "parse_setup_cpu_list", There is check to see if cpu number is greater than max cpu's possible in the system ie via "if (bind_cpu_0 >= g->p.nr_cpus || bind_cpu_1 >= g->p.nr_cpus) {". But it could happen that system has say 48 CPU's, but only number of online CPU's is 0-7. Other CPU's are offlined. Since "g->p.nr_cpus" is 48, so function will go ahead and set bit for CPU 8 also in cpumask ( td->bind_cpumask). bind_to_cpumask function is called to set affinity using sched_setaffinity and the cpumask. Since the CPU8 is not present, set affinity will fail here with EINVAL. Fix this issue by adding a check to make sure that, CPU's provided in the input argument values are online before proceeding further and skip the test. For this, include new helper function "is_cpu_online" in "tools/perf/util/header.c". Since "BIT(x)" definition will get included from header.h, remove that from bench/numa.c Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220412164059.42654-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02perf tools: Pass a fd to perf_file_header__read_pipe()Namhyung Kim1-1/+1
Currently it unconditionally writes to stdout for repipe. But perf inject can direct its output to a regular file. Then it needs to write the header to the file as well. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210719223153.1618812-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17perf header: Support HYBRID_CPU_PMU_CAPS featureJin Yao1-0/+1
Perf has supported the CPU_PMU_CAPS feature to display a list of CPU PMU capabilities. But on a hybrid platform, it may have several CPU PMUs (such as "cpu_core" and "cpu_atom"). The CPU_PMU_CAPS feature is hard to extend to support multiple CPU PMUs well if it needs to be compatible for the case of old perf data file + new perf tool. So for better compatibility we now create a new feature HYBRID_CPU_PMU_CAPS in the header. For the perf.data generated on hybrid platform, root@otcpl-adl-s-2:~# perf report --header-only -I # cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid # cpu_atom pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CPU_PMU_CAPS CLOCK_DATA For the perf.data generated on non-hybrid platform root@kbl-ppc:~# perf report --header-only -I # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY HYBRID_CPU_PMU_CAPS Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210514122948.9472-3-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17perf header: Support HYBRID_TOPOLOGY featureJin Yao1-0/+1
It is useful to let the user know about the hybrid topology. Add the HYBRID_TOPOLOGY feature in header to indicate the core CPUs and the atom CPUs. With this patch a perf.data generated on a hybrid platform reports the hybrid CPU list: root@otcpl-adl-s-2:~# perf report --header-only -I ... # hybrid cpu system: # cpu_core cpu list : 0-15 # cpu_atom cpu list : 16-23 For a perf.data generated on a non-hybrid platform, reports a message that HYBRID_TOPOLOGY is missing: root@kbl-ppc:~# perf report --header-only -I ... # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210514122948.9472-2-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06perf header: Store clock references for -k/--clockid optionJiri Olsa1-0/+1
Add a new CLOCK_DATA feature that stores reference times when -k/--clockid option is specified. It contains the clock id and its reference time together with wall clock time taken at the 'same time', both values are in nanoseconds. The format of data is as below: struct { u32 version; /* version = 1 */ u32 clockid; u64 wall_clock_ns; u64 clockid_time_ns; }; This clock reference times will be used in following changes to display wall clock for perf events. It's available only for recording with clockid specified, because it's the only case where we can get reference time to wallclock time. It's can't do that with perf clock yet. Committer testing: $ perf record -h -k Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -k, --clockid <clockid> clockid to use for events, see clock_gettime() $ perf record -k monotonic sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (8 samples) ] $ perf report --header-only | grep clockid -A1 # event : name = cycles:u, , id = { 88815, 88816, 88817, 88818, 88819, 88820, 88821, 88822 }, size = 120, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, exclude_kernel = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, ksymbol = 1, bpf_event = 1, clockid = 1 # CPU_TOPOLOGY info available, use -I to display -- # clockid frequency: 1000 MHz # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake # clockid: monotonic (1) # reference time: 2020-08-06 09:40:21.619290 = 1596717621.619290 (TOD) = 21931.077673635 (monotonic) $ Original-patch-by: David Ahern <dsahern@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Geneviève Bastien <gbastien@versatic.net> Cc: Ian Rogers <irogers@google.com> Cc: Jeremie Galarneau <jgalar@efficios.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lore.kernel.org/lkml/20200805093444.314999-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-18perf header: Support CPU PMU capabilitiesKan Liang1-0/+1
To stitch LBR call stack, the max LBR information is required. So the CPU PMU capabilities information has to be stored in perf header. Add a new feature HEADER_CPU_PMU_CAPS for CPU PMU capabilities. Retrieve all CPU PMU capabilities, not just max LBR information. Add variable max_branches to facilitate future usage. Committer testing: # ls -la /sys/devices/cpu/caps/ total 0 drwxr-xr-x. 2 root root 0 Apr 17 10:53 . drwxr-xr-x. 6 root root 0 Apr 17 07:02 .. -r--r--r--. 1 root root 4096 Apr 17 10:53 max_precise # # cat /sys/devices/cpu/caps/max_precise 0 # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.033 MB perf.data (7 samples) ] # # perf report --header-only | egrep 'cpu(desc|.*capabilities)' # cpudesc : AMD Ryzen 5 3600X 6-Core Processor # cpu pmu capabilities: max_precise=0 # And then on an Intel machine: $ ls -la /sys/devices/cpu/caps/ total 0 drwxr-xr-x. 2 root root 0 Apr 17 10:51 . drwxr-xr-x. 6 root root 0 Apr 17 10:04 .. -r--r--r--. 1 root root 4096 Apr 17 11:37 branches -r--r--r--. 1 root root 4096 Apr 17 10:51 max_precise -r--r--r--. 1 root root 4096 Apr 17 11:37 pmu_name $ cat /sys/devices/cpu/caps/max_precise 3 $ cat /sys/devices/cpu/caps/branches 32 $ cat /sys/devices/cpu/caps/pmu_name skylake $ perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ] $ perf report --header-only | egrep 'cpu(desc|.*capabilities)' # cpudesc : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake $ Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pavel Gerasimov <pavel.gerasimov@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com> Link: http://lore.kernel.org/lkml/20200319202517.23423-3-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06perf data: Move perf_dir_version into data.hAdrian Hunter1-4/+0
perf_dir_version belongs to struct perf_data which is declared in data.h. To allow its use in inline perf_data functions, move perf_dir_version to data.h Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191004083121.12182-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-20perf tools: Move event synthesizing routines to separate .c fileArnaldo Carvalho de Melo1-0/+18
For better grouping, in time we may end up making most of these static, i.e. generalizing the 'perf record' synthesizing code so that based on the target it can do the right thing and call the needed synthesizers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-s9zxxhk40s95pjng9panet16@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-20perf event: Move perf_event__synthesize* to event.hArnaldo Carvalho de Melo1-39/+3
Where is the perf_event__handler_t typedef they need, which was the only reason for header.h to be including event.h, untangle that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-outjyzh1o29ndcv9lsqyzt87@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29perf evlist: Rename struct perf_evlist to struct evlistJiri Olsa1-8/+8
Rename struct perf_evlist to struct evlist, so we don't have a name clash when we add struct perf_evlist in libperf. Committer notes: Added fixes to build on arm64, from Jiri and from me (tools/perf/util/cs-etm.c) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29perf evsel: Rename struct perf_evsel to struct evselJiri Olsa1-4/+4
Rename struct perf_evsel to struct evsel, so we don't have a name clash when we add struct perf_evsel in libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-15perf record: Implement COMPRESSED event record and its attributesAlexey Budankov1-0/+1
Implemented PERF_RECORD_COMPRESSED event, related data types, header feature and functions to write, read and print feature attributes from the trace header section. comp_mmap_len preserves the size of mmaped kernel buffer that was used during collection. comp_mmap_len size is used on loading stage as the size of decomp buffer for decompression of COMPRESSED events content. Committer notes: Fixed up conflict with BPF_PROG_INFO and BTF_BTF header features. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ebbaf031-8dda-3864-ebc6-7922d43ee515@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf bpf: Save BTF information as headers to perf.dataSong Liu1-0/+1
This patch enables 'perf record' to save BTF information as headers to perf.data. A new header type HEADER_BPF_BTF is introduced for this data. Committer testing: As root, being on the kernel sources top level directory, run: # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg Just to compile and load a BPF program that attaches to the raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc). Make sure you have a recent enough clang, say version 9, to get the BTF ELF sections needed for this testing: # clang --version | head -1 clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500) # readelf -SW tools/perf/examples/bpf/augmented_raw_syscalls.o | grep BTF [22] .BTF PROGBITS 0000000000000000 000ede 000b0e 00 0 0 1 [23] .BTF.ext PROGBITS 0000000000000000 0019ec 0002a0 00 0 0 1 [24] .rel.BTF.ext REL 0000000000000000 002fa8 000270 10 30 23 8 Then do a systemwide perf record session for a few seconds: # perf record -a sleep 2s Then look at: # perf report --header-only | grep b[pt]f # event : name = cycles:ppp, , id = { 1116204, 1116205, 1116206, 1116207, 1116208, 1116209, 1116210, 1116211 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1 # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 51 # bpf_prog_info of id 52 # btf info of id 8 # We need to show more info about these BPF and BTF entries , but that can be done later. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-10-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf bpf: Save bpf_prog_info information as headers to perf.dataSong Liu1-0/+1
This patch enables perf-record to save bpf_prog_info information as headers to perf.data. A new header type HEADER_BPF_PROG_INFO is introduced for this data. Committer testing: As root, being on the kernel sources top level directory, run: # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg Just to compile and load a BPF program that attaches to the raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc). Then do a systemwide perf record session for a few seconds: # perf record -a sleep 2s Then look at: # perf report --header-only | grep -i bpf # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 208 # bpf_prog_info of id 209 # We need to show more info about these programs, like bpftool does for the ones running on the system, i.e. 'perf record/perf report' become a way of saving the BPF state in a machine to then analyse on another, together with all the other information that is already saved in the perf.data header: # perf report --header-only # ======== # captured on : Tue Mar 12 11:42:13 2019 # header version : 1 # data offset : 296 # data size : 16294184 # feat offset : 16294480 # hostname : quaco # os release : 5.0.0+ # perf version : 5.0.gd783c8 # arch : x86_64 # nrcpus online : 8 # nrcpus avail : 8 # cpudesc : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz # cpuid : GenuineIntel,6,142,10 # total memory : 24555720 kB # cmdline : /home/acme/bin/perf (deleted) record -a # event : name = cycles:ppp, , id = { 3190123, 3190124, 3190125, 3190126, 3190127, 3190128, 3190129, 3190130 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1 # CPU_TOPOLOGY info available, use -I to display # NUMA_TOPOLOGY info available, use -I to display # pmu mappings: intel_pt = 8, software = 1, power = 11, uprobe = 7, uncore_imc = 12, cpu = 4, cstate_core = 18, uncore_cbox_2 = 15, breakpoint = 5, uncore_cbox_0 = 13, tracepoint = 2, cstate_pkg = 19, uncore_arb = 17, kprobe = 6, i915 = 10, msr = 9, uncore_cbox_3 = 16, uncore_cbox_1 = 14 # CACHE info available, use -I to display # time of first sample : 116392.441701 # time of last sample : 116400.932584 # sample duration : 8490.883 ms # MEM_TOPOLOGY info available, use -I to display # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 208 # bpf_prog_info of id 209 # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT # ======== # Committer notes: We can't use the libbpf unconditionally, as the build may have been with NO_LIBBPF, when we end up with linking errors, so provide dummy {process,write}_bpf_prog_info() wrapped by HAVE_LIBBPF_SUPPORT for that case. Printing are not affected by this, so can continue as is. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-8-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf header: Add DIR_FORMAT feature to describe directory dataJiri Olsa1-0/+5
The data files layout is described by HEADER_DIR_FORMAT feature. Currently it holds only version number (1): uint64_t version; The current version holds only version value (1) means that data files: - Follow the 'data.*' name format. - Contain raw events data in standard perf format as read from kernel (and need to be sorted) Future versions are expected to describe different data files layout according to special needs. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-18perf record: Encode -k clockid frequency into Perf traceAlexey Budankov1-0/+1
Store -k clockid frequency into Perf trace to enable timestamps derived metrics conversion into wall clock time on reporting stage. Below is the example of perf report output: tools/perf/perf record -k raw -- ../../matrix/linux/matrix.gcc ... [ perf record: Captured and wrote 31.222 MB perf.data (818054 samples) ] tools/perf/perf report --header # ======== ... # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, clockid = 4 ... # clockid frequency: 1000 MHz ... # ======== Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/23a4a1dc-b160-85a0-347d-40a2ed6d007b@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-19perf tools: Remove perf_tool from event_op2Jiri Olsa1-9/+6
Now that we keep a perf_tool pointer inside perf_session, there's no need to have a perf_tool argument in the event_op2 callback. Remove it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180913125450.21342-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30perf tools: Switch 'session' argument to 'evlist' in ↵Jiri Olsa1-1/+1
perf_event__synthesize_attrs() To be able to pass in other than session's evlist. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-30perf tools: Fix the build on the alpine:edge distroArnaldo Carvalho de Melo1-0/+1
The UAPI file byteorder/little_endian.h uses the __always_inline define without including the header where it is defined, linux/stddef.h, this ends up working in all the other distros because that file gets included seemingly by luck from one of the files included from little_endian.h. But not on Alpine:edge, that fails for all files where perf_event.h is included but linux/stddef.h isn't include before that. Adding the missing linux/stddef.h file where it breaks on Alpine:edge to fix that, in all other distros, that is just a very small header anyway. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-9r1pifftxvuxms8l7ir73p5l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf tools: Add MEM_TOPOLOGY feature to perf data fileJiri Olsa1-0/+1
Adding MEM_TOPOLOGY feature to perf data file, that will carry physical memory map and its node assignments. The format of data in MEM_TOPOLOGY is as follows: 0 - version | for future changes 8 - block_size_bytes | /sys/devices/system/memory/block_size_bytes 16 - count | number of nodes For each node we store map of physical indexes for each node: 32 - node id | node index 40 - size | size of bitmap 48 - bitmap | bitmap of memory indexes that belongs to node | /sys/devices/system/node/node<NODE>/memory<INDEX> The MEM_TOPOLOGY could be displayed with following report command: $ perf report --header-only -I ... # memory nodes (nr 1, block size 0x8000000): # 0 [7G]: 0-23,32-69 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-8-jolsa@kernel.org [ Rename 'index' to 'idx', as this breaks the build in rhel5, 6 and other systems where this is used by glibc headers ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf cpuid: Introduce a platform specific cpuid compare functionThomas Richter1-0/+1
The function get_cpuid_str() is called by perf_pmu__getcpuid() and on s390 returns a complete description of the CPU and its capabilities, which is a comma separated list. To map the CPU type with the value defined in the pmu-events/arch/s390/mapfile.csv, introduce an architecture specific cpuid compare function named strcmp_cpuid_str() The currently used regex algorithm is defined as the weak default and will be used if no platform specific one is defined. This matches the current behavior. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180213151419.80737-3-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-08perf header: Add infrastructure to record first and last sample timeJin Yao1-0/+1
perf report/script/... have a --time option to limit the time range of output. That's very useful to slice large traces, e.g. when processing the output of perf script for some analysis. But right now --time only supports absolute time. Also there is no fast way to get the start/end times of a given trace except for looking at it. This makes it hard to e.g. only decode the first half of the trace, which is useful for parallelization of scripts Another problem is that perf records are variable size and there is no synchronization mechanism. So the only way to find the last sample reliably would be to walk all samples. But we want to avoid that in perf report/... because it is already quite expensive. That is why storing the first sample time and last sample time in perf record is better. This patch creates a new header feature type HEADER_SAMPLE_TIME and related ops. Save the first sample time and the last sample time to the feature section in perf file header. That will be done when, for instance, processing build-ids, where we already have to process all samples to create the build-id table, take advantage of that to further amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf report/script' faster when using --time. Committer testing: After this patch is applied the header is written with zeroes, we need the next patch, for "perf record" to actually write the timestamps: # perf report -D | grep PERF_RECORD_SAMPLE\( 22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21be8c5 period: 1 addr: 0 <SNIP> 22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21ffd50 period: 2828043 addr: 0 # perf report --header | grep "time of " # time of first sample : 0.000000 # time of last sample : 0.000000 # Changelog: v7: 1. Rebase to latest perf/core branch. 2. Add following clarification in patch description according to Arnaldo's suggestion. "That will be done when, for instance, processing build-ids, where we already have to process all samples to create the build-id table, take advantage of that to further amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf report/script' faster when using --time." v4: Use perf script time style for timestamp printing. Also add with the printing of sample duration. v3: Remove the definitions of first_sample_time/last_sample_time from perf_session. Just define them in perf_evlist Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf pmu: Pass pmu as a parameter to get_cpuid_str()Ganapatrao Kulkarni1-1/+2
The cpuid string will not be same on all CPUs on heterogeneous platforms like ARM's big.LITTLE, adding provision(using pmu->cpus) to find cpuid string from associated CPUs of PMU CORE device. Also optimise arguments to function pmu_add_cpu_aliases. Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: http://lkml.kernel.org/r/201