The ARM PMU (Performance Monitor Unit) provides useful information for debugging and performance profiling on ARM-based platforms, and its driver is already included in the Linux kernel. This section focuses on how to enable PMU support on the RZ/G2L platform.
In kernel-source/arch/arm64/boot/dts/renesas/r9a07g044l2.dtsi, add the following arm-pmu node to the RZ/G2L device tree.
    arm-pmu {
        compatible = "arm,armv8-pmuv3";
        interrupt-parent = <&gic>;
        interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>,
                     <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
        interrupt-affinity = <&a55_0>, <&a55_1>;
    };
PMU support requires the ARM PMU driver. Make sure the following two options are enabled in the kernel configuration.
    CONFIG_PERF_EVENTS
    CONFIG_ARM_PMU
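One way to confirm both options before building is to search the kernel's .config for them. The following is a minimal sketch, not part of the BSP: the function name check_pmu_config and the .config path in the usage comment are assumptions made for illustration.

```shell
#!/bin/sh
# check_pmu_config FILE
# Hypothetical helper: reports whether each option needed for PMU support
# is set to "y" in the given kernel .config file.
# Returns 0 only if both options are enabled.
check_pmu_config() {
    config_file="$1"
    missing=0
    for opt in CONFIG_PERF_EVENTS CONFIG_ARM_PMU; do
        if grep -q "^${opt}=y$" "$config_file"; then
            echo "${opt}: enabled"
        else
            echo "${opt}: MISSING"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (the path is an assumption; point it at your kernel build directory):
#   check_pmu_config /path/to/kernel-build/.config
```

A non-zero return value means at least one option is missing and should be enabled before rebuilding the kernel.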
In build/conf/local.conf, add the perf command to the image.
    IMAGE_INSTALL_append = " perf"
    root@smarc-rzg2l:~# perf

     usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

     The most commonly used perf commands are:
       annotate        Read perf.data (created by perf record) and display annotated code
       archive         Create archive with object files with build-ids found in perf.data file
       bench           General framework for benchmark suites
       buildid-cache   Manage build-id cache.
       buildid-list    List the buildids in a perf.data file
       c2c             Shared Data C2C/HITM Analyzer.
       config          Get and set variables in a configuration file.
       data            Data file related processing
       diff            Read perf.data files and display the differential profile
       evlist          List the event names in a perf.data file
       ftrace          simple wrapper for kernel's ftrace functionality
       inject          Filter to augment the events stream with additional information
       kallsyms        Searches running kernel for symbols
       kmem            Tool to trace/measure kernel memory properties
       kvm             Tool to trace/measure kvm guest os
       list            List all symbolic event types
       lock            Analyze lock events
       mem             Profile memory accesses
       record          Run a command and record its profile into perf.data
       report          Read perf.data (created by perf record) and display the profile
       sched           Tool to trace/measure scheduler properties (latencies)
       script          Read perf.data (created by perf record) and display trace output
       stat            Run a command and gather performance counter statistics
       test            Runs sanity tests.
       timechart       Tool to visualize total system behavior during a workload
       top             System profiling tool.
       probe           Define new dynamic tracepoints
       trace           strace inspired tool

     See 'perf help COMMAND' for more information on a specific command.
    root@smarc-rzg2l:~# perf list

    List of pre-defined events (to be used in -e):

      branch-instructions OR branches                  [Hardware event]
      branch-misses                                    [Hardware event]
      bus-cycles                                       [Hardware event]
      cache-misses                                     [Hardware event]
      cache-references                                 [Hardware event]
      cpu-cycles OR cycles                             [Hardware event]
      instructions                                     [Hardware event]
      stalled-cycles-backend OR idle-cycles-backend    [Hardware event]
      stalled-cycles-frontend OR idle-cycles-frontend  [Hardware event]

      alignment-faults                                 [Software event]
      bpf-output                                       [Software event]
      context-switches OR cs                           [Software event]
      cpu-clock                                        [Software event]
      cpu-migrations OR migrations                     [Software event]
      dummy                                            [Software event]
      emulation-faults                                 [Software event]
      major-faults                                     [Software event]
      minor-faults                                     [Software event]
      page-faults OR faults                            [Software event]
      task-clock                                       [Software event]

      L1-dcache-load-misses                            [Hardware cache event]
      L1-dcache-loads                                  [Hardware cache event]
      L1-dcache-store-misses                           [Hardware cache event]
      L1-dcache-stores                                 [Hardware cache event]
      L1-icache-load-misses                            [Hardware cache event]
      L1-icache-loads                                  [Hardware cache event]
      branch-load-misses                               [Hardware cache event]
      branch-loads                                     [Hardware cache event]
      dTLB-load-misses                                 [Hardware cache event]
      dTLB-loads                                       [Hardware cache event]
      iTLB-load-misses                                 [Hardware cache event]
      iTLB-loads                                       [Hardware cache event]

      armv8_pmuv3/br_immed_retired/                    [Kernel PMU event]
      armv8_pmuv3/br_mis_pred/                         [Kernel PMU event]
      armv8_pmuv3/br_mis_pred_retired/                 [Kernel PMU event]
      armv8_pmuv3/br_pred/                             [Kernel PMU event]
      armv8_pmuv3/br_retired/                          [Kernel PMU event]
      armv8_pmuv3/br_return_retired/                   [Kernel PMU event]
      armv8_pmuv3/bus_access/                          [Kernel PMU event]
      armv8_pmuv3/bus_cycles/                          [Kernel PMU event]
      armv8_pmuv3/cid_write_retired/                   [Kernel PMU event]
      armv8_pmuv3/cpu_cycles/                          [Kernel PMU event]
      armv8_pmuv3/exc_return/                          [Kernel PMU event]
      armv8_pmuv3/exc_taken/                           [Kernel PMU event]
      armv8_pmuv3/inst_retired/                        [Kernel PMU event]
      armv8_pmuv3/inst_spec/                           [Kernel PMU event]
      armv8_pmuv3/l1d_cache/                           [Kernel PMU event]
      armv8_pmuv3/l1d_cache_refill/                    [Kernel PMU event]
      armv8_pmuv3/l1d_cache_wb/                        [Kernel PMU event]
      armv8_pmuv3/l1d_tlb/                             [Kernel PMU event]
      armv8_pmuv3/l1d_tlb_refill/                      [Kernel PMU event]
      armv8_pmuv3/l1i_cache/                           [Kernel PMU event]
      armv8_pmuv3/l1i_cache_refill/                    [Kernel PMU event]
      armv8_pmuv3/l1i_tlb/                             [Kernel PMU event]
      armv8_pmuv3/l1i_tlb_refill/                      [Kernel PMU event]
      armv8_pmuv3/l2d_cache/                           [Kernel PMU event]
      armv8_pmuv3/l2d_cache_allocate/                  [Kernel PMU event]
      armv8_pmuv3/l2d_cache_refill/                    [Kernel PMU event]
      armv8_pmuv3/l2d_tlb/                             [Kernel PMU event]
      armv8_pmuv3/l2d_tlb_refill/                      [Kernel PMU event]
      armv8_pmuv3/ld_retired/                          [Kernel PMU event]
      armv8_pmuv3/mem_access/                          [Kernel PMU event]
      armv8_pmuv3/memory_error/                        [Kernel PMU event]
      armv8_pmuv3/pc_write_retired/                    [Kernel PMU event]
      armv8_pmuv3/st_retired/                          [Kernel PMU event]
      armv8_pmuv3/stall_backend/                       [Kernel PMU event]
      armv8_pmuv3/stall_frontend/                      [Kernel PMU event]
      armv8_pmuv3/sw_incr/                             [Kernel PMU event]
      armv8_pmuv3/ttbr_write_retired/                  [Kernel PMU event]
      armv8_pmuv3/unaligned_ldst_retired/              [Kernel PMU event]

      rNNN                                             [Raw hardware event descriptor]
      cpu/t1=v1[,t2=v2,t3 ...]/modifier                [Raw hardware event descriptor]
       (see 'man perf-list' on how to encode it)

      mem:<addr>[/len][:access]                        [Hardware breakpoint]
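Because the kernel PMU events above are listed under the name armv8_pmuv3, a quick check on the target is to see whether an event source of that name is registered in sysfs. The following is a minimal sketch, assuming the standard /sys/bus/event_source/devices location; the helper name pmu_present is hypothetical, and its optional directory argument exists only so the check can be tried against another directory.

```shell
#!/bin/sh
# pmu_present [DIR]
# Hypothetical helper: succeeds if an "armv8_pmuv3" event source exists.
# DIR defaults to the standard sysfs location for perf event sources.
pmu_present() {
    dir="${1:-/sys/bus/event_source/devices}"
    [ -d "$dir/armv8_pmuv3" ]
}

if pmu_present; then
    echo "armv8_pmuv3 is registered"
else
    echo "armv8_pmuv3 not found: check the arm-pmu node and the kernel config"
fi
```

If the event source is missing, the armv8_pmuv3 entries would not be expected to appear in the perf list output either.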
    root@smarc-rzg2l:~# perf stat ls

     Performance counter stats for 'ls':

                  3.94 msec task-clock                #    0.493 CPUs utilized
                     6      context-switches          #    0.002 M/sec
                     0      cpu-migrations            #    0.000 K/sec
                    53      page-faults               #    0.013 M/sec
               4556982      cycles                    #    1.156 GHz
                914867      instructions              #    0.20  insn per cycle
                106623      branches                  #   27.043 M/sec
                 20024      branch-misses             #   18.78% of all branches

           0.007990161 seconds time elapsed

           0.003118000 seconds user
           0.003188000 seconds sys
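The derived figures that perf stat prints after the # sign are plain ratios of the raw counters. As a worked check with the numbers from the run above, the sketch below recomputes two of them using awk; the ratio helper is a hypothetical name introduced here for illustration and is not a perf feature.

```shell
#!/bin/sh
# ratio NUM DEN [SCALE]
# Hypothetical helper: prints NUM / DEN * SCALE with two decimals.
# SCALE defaults to 1; pass 100 to get a percentage.
# Fails with a non-zero status if DEN is 0.
ratio() {
    awk -v n="$1" -v d="$2" -v s="${3:-1}" \
        'BEGIN { if (d == 0) exit 1; printf "%.2f\n", n / d * s }'
}

ratio 914867 4556982       # instructions / cycles       -> 0.20 (insn per cycle)
ratio 20024 106623 100     # branch-misses / branches    -> 18.78 (% of all branches)
```

Both results agree with the values perf reported for this run.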
    root@smarc-rzg2l:~# perf stat -e 'cache-references,cache-misses' ls

     Performance counter stats for 'ls':

                302505      cache-references
                 11986      cache-misses              #    3.962 % of all cache refs

           0.008759965 seconds time elapsed

           0.000000000 seconds user
           0.005878000 seconds sys
    root@smarc-rzg2l:~# perf stat -e 'armv8_pmuv3/mem_access/' ls

     Performance counter stats for 'ls':

                354269      armv8_pmuv3/mem_access/

           0.010624637 seconds time elapsed

           0.000000000 seconds user
           0.006220000 seconds sys