Introduction

The ARM PMU (Performance Monitor Unit) provides useful information for debugging and performance profiling on ARM-based platforms, and its driver is already included in the Linux kernel. This section focuses on how to enable PMU support on the RZ/G2L platform.

Device tree

In kernel-source/arch/arm64/boot/dts/renesas/r9a07g044l2.dtsi, add the arm-pmu node below to the RZ/G2L device tree.

	arm-pmu {
		compatible = "arm,armv8-pmuv3";
		interrupt-parent = <&gic>;
		interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>,
			     <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
		interrupt-affinity = <&a55_0>, <&a55_1>;
	};

Kernel configuration

The ARM PMU driver is required for PMU support. Make sure the following two options are enabled in the kernel configuration.

CONFIG_PERF_EVENTS
CONFIG_ARM_PMU
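
To confirm that both options are set, a small check like the one below may help. It is only a sketch and not part of the BSP: the helper name missing_options is hypothetical, and reading /proc/config.gz on the board assumes the kernel was also built with CONFIG_IKCONFIG_PROC, which this page does not cover. The same function can be fed the text of the .config file from the build tree instead.

```python
import gzip
import os

REQUIRED = ("CONFIG_PERF_EVENTS", "CONFIG_ARM_PMU")


def missing_options(config_text, required=REQUIRED):
    """Return the options from `required` that are not set to y in a kernel config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Enabled built-in options look like "CONFIG_FOO=y"; disabled ones
        # appear as comments of the form "# CONFIG_FOO is not set".
        if line.startswith("CONFIG_") and line.endswith("=y"):
            enabled.add(line.split("=", 1)[0])
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    path = "/proc/config.gz"  # assumption: requires CONFIG_IKCONFIG_PROC
    if os.path.exists(path):
        with gzip.open(path, "rt") as f:
            print(missing_options(f.read()) or "all required options enabled")
    else:
        print(path, "not found; run the check on the build tree .config instead")
```

An empty list means both options are enabled; any name in the list still has to be switched on before rebuilding the kernel.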

Install the "perf" command into the image

In build/conf/local.conf, add the perf command to the image:

IMAGE_INSTALL_append = " perf"

How to use the perf command with PMU events

perf command help on RZ/G2L

root@smarc-rzg2l:~# perf

 usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

 The most commonly used perf commands are:
   annotate        Read perf.data (created by perf record) and display annotated code
   archive         Create archive with object files with build-ids found in perf.data file
   bench           General framework for benchmark suites
   buildid-cache   Manage build-id cache.
   buildid-list    List the buildids in a perf.data file
   c2c             Shared Data C2C/HITM Analyzer.
   config          Get and set variables in a configuration file.
   data            Data file related processing
   diff            Read perf.data files and display the differential profile
   evlist          List the event names in a perf.data file
   ftrace          simple wrapper for kernel's ftrace functionality
   inject          Filter to augment the events stream with additional information
   kallsyms        Searches running kernel for symbols
   kmem            Tool to trace/measure kernel memory properties
   kvm             Tool to trace/measure kvm guest os
   list            List all symbolic event types
   lock            Analyze lock events
   mem             Profile memory accesses
   record          Run a command and record its profile into perf.data
   report          Read perf.data (created by perf record) and display the profile
   sched           Tool to trace/measure scheduler properties (latencies)
   script          Read perf.data (created by perf record) and display trace output
   stat            Run a command and gather performance counter statistics
   test            Runs sanity tests.
   timechart       Tool to visualize total system behavior during a workload
   top             System profiling tool.
   probe           Define new dynamic tracepoints
   trace           strace inspired tool

 See 'perf help COMMAND' for more information on a specific command.

List all events that can be profiled by the perf command

root@smarc-rzg2l:~# perf list

List of pre-defined events (to be used in -e):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]

  alignment-faults                                   [Software event]
  bpf-output                                         [Software event]
  context-switches OR cs                             [Software event]
  cpu-clock                                          [Software event]
  cpu-migrations OR migrations                       [Software event]
  dummy                                              [Software event]
  emulation-faults                                   [Software event]
  major-faults                                       [Software event]
  minor-faults                                       [Software event]
  page-faults OR faults                              [Software event]
  task-clock                                         [Software event]

  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-store-misses                             [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  L1-icache-loads                                    [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  iTLB-loads                                         [Hardware cache event]

  armv8_pmuv3/br_immed_retired/                      [Kernel PMU event]
  armv8_pmuv3/br_mis_pred/                           [Kernel PMU event]
  armv8_pmuv3/br_mis_pred_retired/                   [Kernel PMU event]
  armv8_pmuv3/br_pred/                               [Kernel PMU event]
  armv8_pmuv3/br_retired/                            [Kernel PMU event]
  armv8_pmuv3/br_return_retired/                     [Kernel PMU event]
  armv8_pmuv3/bus_access/                            [Kernel PMU event]
  armv8_pmuv3/bus_cycles/                            [Kernel PMU event]
  armv8_pmuv3/cid_write_retired/                     [Kernel PMU event]
  armv8_pmuv3/cpu_cycles/                            [Kernel PMU event]
  armv8_pmuv3/exc_return/                            [Kernel PMU event]
  armv8_pmuv3/exc_taken/                             [Kernel PMU event]
  armv8_pmuv3/inst_retired/                          [Kernel PMU event]
  armv8_pmuv3/inst_spec/                             [Kernel PMU event]
  armv8_pmuv3/l1d_cache/                             [Kernel PMU event]
  armv8_pmuv3/l1d_cache_refill/                      [Kernel PMU event]
  armv8_pmuv3/l1d_cache_wb/                          [Kernel PMU event]
  armv8_pmuv3/l1d_tlb/                               [Kernel PMU event]
  armv8_pmuv3/l1d_tlb_refill/                        [Kernel PMU event]
  armv8_pmuv3/l1i_cache/                             [Kernel PMU event]
  armv8_pmuv3/l1i_cache_refill/                      [Kernel PMU event]
  armv8_pmuv3/l1i_tlb/                               [Kernel PMU event]
  armv8_pmuv3/l1i_tlb_refill/                        [Kernel PMU event]
  armv8_pmuv3/l2d_cache/                             [Kernel PMU event]
  armv8_pmuv3/l2d_cache_allocate/                    [Kernel PMU event]
  armv8_pmuv3/l2d_cache_refill/                      [Kernel PMU event]
  armv8_pmuv3/l2d_tlb/                               [Kernel PMU event]
  armv8_pmuv3/l2d_tlb_refill/                        [Kernel PMU event]
  armv8_pmuv3/ld_retired/                            [Kernel PMU event]
  armv8_pmuv3/mem_access/                            [Kernel PMU event]
  armv8_pmuv3/memory_error/                          [Kernel PMU event]
  armv8_pmuv3/pc_write_retired/                      [Kernel PMU event]
  armv8_pmuv3/st_retired/                            [Kernel PMU event]
  armv8_pmuv3/stall_backend/                         [Kernel PMU event]
  armv8_pmuv3/stall_frontend/                        [Kernel PMU event]
  armv8_pmuv3/sw_incr/                               [Kernel PMU event]
  armv8_pmuv3/ttbr_write_retired/                    [Kernel PMU event]
  armv8_pmuv3/unaligned_ldst_retired/                [Kernel PMU event]

  rNNN                                               [Raw hardware event descriptor]
  cpu/t1=v1[,t2=v2,t3 ...]/modifier                  [Raw hardware event descriptor]
   (see 'man perf-list' on how to encode it)

  mem:<addr>[/len][:access]                          [Hardware breakpoint]
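
The rNNN form in the listing takes the event number in hexadecimal. As a sketch of how a named armv8_pmuv3 event relates to its raw descriptor: the kernel describes each event in a small sysfs file whose content has the form event=0x..., and the helper below (raw_descriptor is a hypothetical name) converts that text into the rNNN spelling. The sysfs path in the comment and the example value 0x13 for mem_access (the number the Arm architecture assigns to MEM_ACCESS) are not taken from this page, so verify both on the target.

```python
def raw_descriptor(sysfs_text):
    """Convert sysfs event text such as 'event=0x0013' into perf's rNNN form."""
    key, _, value = sysfs_text.strip().partition("=")
    if key != "event" or not value:
        raise ValueError("unexpected event description: %r" % sysfs_text)
    # perf expects the event number as bare hexadecimal digits after the 'r'.
    return "r%x" % int(value, 16)


# Assumed location of the description on the target:
#   /sys/bus/event_source/devices/armv8_pmuv3/events/mem_access
print(raw_descriptor("event=0x0013"))  # prints "r13"
```

Under these assumptions, perf stat -e r13 ls should count the same event as perf stat -e 'armv8_pmuv3/mem_access/' ls.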

Some example perf commands on RZ/G2L

  • Simple performance counter summary for the 'ls' command
root@smarc-rzg2l:~# perf stat ls

 Performance counter stats for 'ls':

              3.94 msec task-clock                #    0.493 CPUs utilized          
                 6      context-switches          #    0.002 M/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
                53      page-faults               #    0.013 M/sec                  
           4556982      cycles                    #    1.156 GHz                    
            914867      instructions              #    0.20  insn per cycle         
            106623      branches                  #   27.043 M/sec                  
             20024      branch-misses             #   18.78% of all branches        

       0.007990161 seconds time elapsed

       0.003118000 seconds user
       0.003188000 seconds sys
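
The figures after the "#" in this output are derived from the raw counters. The following sketch reproduces the arithmetic from the values printed above; the helper name ratio is hypothetical. Because task-clock is copied here in its rounded form (3.94 msec), the computed clock rate only approximates the 1.156 GHz that perf prints.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


# Counter values copied from the 'perf stat ls' output above.
cycles = 4556982
instructions = 914867
branches = 106623
branch_misses = 20024
task_clock_msec = 3.94

ipc = ratio(instructions, cycles)                # "insn per cycle"
miss_pct = 100 * ratio(branch_misses, branches)  # "% of all branches"
ghz = ratio(cycles, task_clock_msec * 1e6)       # cycles per nanosecond

print("%.2f insn per cycle" % ipc)
print("%.2f%% of all branches" % miss_pct)
print("%.2f GHz (approximate)" % ghz)
```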

  • Performance counting of the cache-references and cache-misses events for the 'ls' command
root@smarc-rzg2l:~# perf stat -e 'cache-references,cache-misses' ls

 Performance counter stats for 'ls':

            302505      cache-references                                            
             11986      cache-misses              #    3.962 % of all cache refs    

       0.008759965 seconds time elapsed

       0.000000000 seconds user
       0.005878000 seconds sys

  • Performance counting of the 'mem_access' kernel PMU event for the 'ls' command
root@smarc-rzg2l:~# perf stat -e 'armv8_pmuv3/mem_access/' ls

 Performance counter stats for 'ls':

            354269      armv8_pmuv3/mem_access/                                     

       0.010624637 seconds time elapsed

       0.000000000 seconds user
       0.006220000 seconds sys
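
When counts from runs like these need to be compared in a script, the human-readable output can be scraped. The sketch below is one way to do it, assuming counter lines keep the layout shown in this section; parse_perf_stat is a hypothetical helper, and the sample text is copied from the cache example above. perf stat also has a -x option for separator-delimited output, which is likely a more robust input for scripts.

```python
import re

# A counter line is: optional indent, a number, an optional "msec" unit,
# the event name, then an optional "# ..." comment.
_COUNTER = re.compile(r"^\s*(\d[\d.,]*)\s+(?:msec\s+)?(\S+)")


def parse_perf_stat(text):
    """Map event name -> counter value for human-readable 'perf stat' output."""
    counters = {}
    for line in text.splitlines():
        m = _COUNTER.match(line)
        if not m or "seconds" in line:
            continue  # skip headers, blank lines and the timing summary
        value = m.group(1).replace(",", "")
        counters[m.group(2)] = float(value) if "." in value else int(value)
    return counters


sample = """
 Performance counter stats for 'ls':

            302505      cache-references
             11986      cache-misses              #    3.962 % of all cache refs

       0.008759965 seconds time elapsed
"""
stats = parse_perf_stat(sample)
print(stats)
print("%.3f %% of all cache refs"
      % (100 * stats["cache-misses"] / stats["cache-references"]))
```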
