Embedded Design Handbook

ID 683689
Date 8/28/2023
Public
Document Table of Contents

4.4.3.5.3. Analyzing the GNU Profiler Report

The information in this section is applicable to the GNU profiler report that the command line or the Nios® II SBT for Eclipse generates.

The GNU profiler report contains information in the following formats:

  • The flat profile portion of the report identifies the child functions in the order in which they consume processing time.
  • The call graph portion of the report describes the call tree of the program sorted by the total amount of time spent in each function and its children. Each entry in this table consists of several lines. The line with the index number at the left hand margin lists the current function. The lines above it list the functions that called this function, and the lines below it list the functions this one called, with exceptions and conditions detailed further in the report itself and the GNU profiler documentation.
Note: For more information, refer to the Nios® II GNU profiler documentation, with the GCC documentation, available at the Nios® II Embedded Design Suite Support page.

The example below shows the GNU profiler report excerpts from the previous tutorial. In the example, the flat profile shows that the checksum_test_routine() function call consumed 79.19% of the processing time during the execution.

The granularity statement in the call graph report states that the report covers 2.55 seconds (2550 milliseconds). The Nios® II timer (sys_clk_timer) has a 10 millisecond timer. The GNU profiler calls the timer interrupt once at the beginning, before a full clock period elapsed, and once every 10 milliseconds thereafter. A precise report, therefore, would show that the GNU profiler calls the timer interrupt handler 255 times. Index[13] shows that the GNU profiler calls alt_avalon_timer_sc_irq() 256 times, which is in the precision range of this measurement method.

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative    self           self     total
time    seconds   seconds   calls  s/call   s/call   name
79.19       2.02     2.02       1    2.02    2.03    checksum_test_routine
18.01       2.48     0.46       1    0.46    0.46    alt_busy_sleep
	.
	.
	.
		Call graph (explanation follows)

granularity: each sample hit covers 32 byte(s) for 0.39% of 2.55 seconds

index 	% time    self  children called   name
		    0.00   0.00       273/273      alt_irq_entry [106]
[13]   0.0    0.00   0.00          273       alt_irq_handler [13]
		    0.00   0.00       256/256      alt_avalon_timer_sc_irq [14]
		    0.00   0.00        17/17       altera_avalon_jtag_uart_irq [17]
	.
	.
	.