libprof/README.md

190 lines
5.7 KiB
Markdown
Raw Normal View History

2019-07-17 20:34:25 +02:00
# libprof: A performance profiling library for gint
libprof is a small gint library that can be used to time and profile the
execution of an add-in. Using it, one can record the time spent in one or
several functions to identify performance bottlenecks in an application.
libprof's measurements are accurate down to the microsecond-level thanks to
precise hardware timers, so it can also be used to time even small portions of
code.
## Building
libprof is built and installed only once for both fx-9860G and fx-CG 50. The
The dependencies are:
2019-07-17 20:34:25 +02:00
* A GCC cross-compiler for a SuperH architecture
* The [gint kernel](/Lephenixnoir/gint)
The Makefile will build and install the library without further instructions.
2019-07-17 20:34:25 +02:00
```sh
% make
% make install
2019-07-17 20:34:25 +02:00
```
By default `sh-elf` is used to build; you can override this by setting the
2019-07-17 20:34:25 +02:00
`target` variable.
```sh
% make target=sh4eb-elf
% make install target=sh4eb-elf
```
If you have the older setup with two toolchains (`sh3eb-elf` and `sh4eb-elf`),
instead of the new one with two targets on the same toolchain (`sh-elf`), you
will need to make and install twice.
2019-07-17 20:34:25 +02:00
## Basic use
To access the library, include the `<libprof.h>` header file.
```c
#include <libprof.h>
```
For each function you want to time, libprof will create a counter. At the start
of the program, you need to specify how many functions (libprof calls them
*contexts*) you will be timing, so that libprof can allocate enough memory.
libprof also needs one of gint's timer to actually measure time; it selects one
of TMU0, TMU1 and TMU2 which are the only ones suited for this precise job. The
available timer with the smallest interrupt priority is selected.
2019-07-17 20:34:25 +02:00
This initialization happens in the `prof_init()` function.
2019-07-17 20:34:25 +02:00
```c
/* Initialize libprof for 13 contexts (automatically selects a timer) */
prof_init(13);
2019-07-17 20:34:25 +02:00
```
You can then measure the execution time of a function by calling `prof_enter()`
at the beginning and `prof_end()` at the end. You just need to "name" the
function by giving its context ID, which is any number between 0 and the number
of contexts passed to `prof_init()` (here 0 to 12).
```c
void function5(void)
{
prof_enter(5);
/* Do stuff... */
prof_leave(5);
}
```
This will add `function5()`'s execution time to the 5th counter, so if the
function is called several times the total execution time will be recorded.
This way, at the end of the program, you can look at the counters to see where
most of the time has been spent.
To retrieve the total execution time of a function, use `prof_time()` :
```c
uint32_t total_function5_us = prof_time(5);
```
This time is measured in microseconds, even though the timers are actually more
precise than this. Note that the overhead of `prof_enter()` and `prof_leave()`
is usually less than 1 microsecond, so the time is very close to the actual
time spent in the function even if the context is frequently entered and left.
At the end of the program, free the resources of the library by calling
`prof_quit()`.
```c
prof_quit();
```
## Managing context numbers
The number of contexts must be set for all execution and all context IDs must
be between 0 and this number (excluded). Managing the numbers by hand is error-
prone and can lead to memory errors.
A simple way of managing context numbers without risking an error is to use an
enumeration.
```c
enum {
/* Whatever function you need */
PROFCTX_FUNCTION1 = 0,
PROFCTX_FUNCTION2,
PROFCTX_FUNCTION3,
PROFCTX_COUNT,
};
```
Enumerations will assign a value to all the provided names, and increment by
one each time. So for example here `PROFCTX_FUNCTION2` is equal to `1` and
`PROFCTX_COUNT` is equal to `3`. As you can see this is conveniently equal to
the number of contexts, which makes it simple to initialize the library:
```c
prof_init(PROFCTX_COUNT);
2019-07-17 20:34:25 +02:00
```
Then you can use context names instead of numbers:
```c
prof_enter(PROFCTX_FUNCTION1);
/* Do stuff... */
prof_leave(PROFCTX_FUNCTION1);
```
If you want to use a new context, you just need to add a name in the
enumeration (anywhere but after `PROFCTX_COUNT`) and all IDs plus the
initialization call will be updated automatically.
## Timing a single execution
`prof_enter()` and `prof_leave()` will add the measured execution time to the
context counter. Sometimes you want to make individual measurements instead of
adding all calls together. To achieve this effect, clear the counter before
the measure using `prof_clear()`.
Here is an example of a function `exec_time_us()` that times the execution of a
function `f` passed as parameter.
```c
uint32_t exec_time_us(void (*f)(void))
{
int ctx = PROFCTX_EXEC_TIME_US;
prof_clear(ctx);
prof_enter(ctx);
f();
prof_leave(ctx);
return prof_time(ctx);
}
```
## Exploiting the measure's precision
The overhead of `prof_enter()` and `prof_leave()` is usually less than a
microsecond, but the starting time of your benchmark might count (loading data
from memory to initialize arrays, performing function calls...). In this case,
the best you can do is measure the time difference between two similar calls.
If you need something even more precise then you can access libprof's counter
array directly to get the timer-tick value itself:
```c
uint32_t elapsed_timer_tick = prof_elapsed[ctx];
```
The frequency of this tick is PΦ/4, where the value of PΦ can be obtained by
querying gint's clock module:
```c
#include <gint/clock.h>
uint32_t tick_freq = clock_freq()->Pphi_f / 4;
```
One noteworthy phenomenon is the startup cost. The first few measurements are
always less precise, probably due to cache effects. I frequently have a first
measurement with an additional 100 us of execution time and 3 us of overhead,
which subsequent tests remove. So it is expected for the first few points of
data to lie outside the range of the next.