If you code for an embedded system, optimization and debugging are essential to keeping your applications running efficiently. Here are two tips: adding a simple code profiler and taking control of printf()-style debug statements.

Add a code profiler to an embedded system
Code optimization can take up a significant amount of an embedded software developer’s time. Before you start to optimize your code, make sure that you know where the CPU is spending its time and that you’ve accurately identified the bottlenecks. This may seem obvious, but developers often guess incorrectly and spend precious hours fine-tuning code, only to discover that their efforts were wasted, resulting in little or no noticeable speed improvement.

A code profiler is a useful tool in the hunt for processing bottlenecks, but it’s often overlooked or unavailable as part of an embedded development environment. You can add a simple code profiler to any embedded system with very little effort if you have access to a fast and regular timer interrupt, some spare RAM, and a linker-generated map file.

A simple code profiler divides the execution address space into small, equal regions and maintains a counter for each one. A regular timer interrupt then periodically samples the CPU instruction pointer to determine the address of the executing code on each timer tick and increments the corresponding counter. Over time, this builds a statistical profile of where the CPU is spending its time.

Implementation details
Profile counter table: You’ll need a block of RAM to store the table of profiling counters. Each counter corresponds to a small region of the executable code in memory, say 32 bytes. Ideally, you should use 32-bit counters to avoid counter overflow issues. For example, a system with 64 KB of executable code divided into 2,048 regions of 32 bytes each would require 2,048 counters of 4 bytes each, which equals 8 KB of spare RAM.
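The sizing arithmetic above can be sketched in C. This is only an illustration under stated assumptions: CODE_BASE, CODE_SIZE, REGION_SHIFT, and the function name profile_region_index() are hypothetical, and the real base address and size should come from your linker map file.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout matching the example above: 64 KB of code split into
   32-byte regions. CODE_BASE is a placeholder; use your map file's value. */
#define CODE_BASE     0x00000000u
#define CODE_SIZE     0x00010000u                   /* 64 KB of executable code */
#define REGION_SHIFT  5u                            /* 2^5 = 32 bytes per region */
#define NUM_REGIONS   (CODE_SIZE >> REGION_SHIFT)   /* 2,048 regions */

/* One 32-bit counter per region: 2,048 * 4 bytes = 8 KB of RAM. */
static volatile uint32_t profile_counters[NUM_REGIONS];

/* Map a code address to its counter index. Returns NUM_REGIONS when the
   address lies outside the profiled range. */
static size_t profile_region_index(uint32_t addr)
{
    uint32_t offset = addr - CODE_BASE;   /* wraps to a huge value if addr < CODE_BASE */

    if (offset >= CODE_SIZE) {
        return NUM_REGIONS;
    }
    return (size_t)(offset >> REGION_SHIFT);
}
```

Choosing a power-of-two region size lets the index be computed with a shift rather than a division, which keeps the per-sample cost low.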

Timer interrupt: You need a timer interrupt to periodically sample the instruction pointer. On entry to the interrupt, the CPU stores the interrupted code’s instruction pointer on the stack or in a CPU register, depending on the processor architecture. The timer interrupt service routine can retrieve the pointer from there and use it to determine which region of code was executing when the interrupt struck, and therefore which profile counter to increment.
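A minimal sketch of the sampling step follows. How the service routine obtains the interrupted instruction pointer is architecture-specific and is not shown; the sketch assumes the routine has already fetched it and passes it in. The names profile_sample() and profile_out_of_range, and the CODE_BASE value, are hypothetical.

```c
#include <stdint.h>

/* Assumed layout; take the real base and size from your linker map file. */
#define CODE_BASE     0x08000000u
#define CODE_SIZE     0x00010000u                   /* 64 KB */
#define REGION_SHIFT  5u                            /* 32-byte regions */
#define NUM_REGIONS   (CODE_SIZE >> REGION_SHIFT)

static volatile uint32_t profile_counters[NUM_REGIONS];
static volatile uint32_t profile_out_of_range;      /* samples outside the profiled code */

/* Call this from the timer interrupt service routine with the instruction
   pointer of the interrupted code. */
void profile_sample(uint32_t interrupted_pc)
{
    uint32_t offset = interrupted_pc - CODE_BASE;   /* wraps if below CODE_BASE */

    if (offset < CODE_SIZE) {
        uint32_t idx = offset >> REGION_SHIFT;

        /* Saturate rather than wrap so a long session cannot hide a hot spot. */
        if (profile_counters[idx] != UINT32_MAX) {
            profile_counters[idx]++;
        }
    } else {
        profile_out_of_range++;
    }
}
```

Counting out-of-range samples separately is a design choice, not a requirement. It shows how much time is spent in code the table does not cover, such as routines copied to RAM.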

For typical systems, an interrupt interval of between 1 ms and 10 ms is perfectly adequate. A shorter interval collects samples more quickly and gives a finer-grained profile, at the cost of more interrupt overhead.

Interpreting the results: When a profiling session is complete, you can dump the contents of the profile counter table to a debug terminal or file for analysis. (See ProfileDump() in the code below.) The counters with the largest accumulated counts mark the regions of executable code where the CPU spends most of its time. By cross-referencing these addresses with the linker-generated map file, and optionally a disassembly, you can pinpoint the bottlenecks in the source code.
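As a rough illustration of the dump step, the sketch below prints every region that was hit, together with its address range and share of the samples. It is not the article's ProfileDump(). The names profile_report() and profile_hottest_region() are hypothetical, and the code assumes a C99 compiler and a stdio-style output stream, which a small target may not provide.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout; adjust to match your own map file. */
#define CODE_BASE     0x08000000u
#define REGION_SHIFT  5u                     /* 32-byte regions */
#define REGION_SIZE   (1u << REGION_SHIFT)
#define NUM_REGIONS   2048u

static uint32_t profile_counters[NUM_REGIONS];   /* filled in by the sampling interrupt */

/* Return the index of the region with the highest sample count. */
size_t profile_hottest_region(void)
{
    size_t best = 0;

    for (size_t i = 1; i < NUM_REGIONS; i++) {
        if (profile_counters[i] > profile_counters[best]) {
            best = i;
        }
    }
    return best;
}

/* Print each region that was hit: address range, count, percent of all samples. */
void profile_report(FILE *out)
{
    uint64_t total = 0;

    for (size_t i = 0; i < NUM_REGIONS; i++) {
        total += profile_counters[i];
    }
    for (size_t i = 0; i < NUM_REGIONS; i++) {
        uint32_t count = profile_counters[i];

        if (count == 0) {
            continue;                        /* skip regions that were never sampled */
        }
        uint32_t start = CODE_BASE + ((uint32_t)i << REGION_SHIFT);
        unsigned percent = (unsigned)(((uint64_t)count * 100u) / total);

        fprintf(out, "0x%08" PRIX32 "-0x%08" PRIX32 "  %10" PRIu32 "  %3u%%\n",
                start, start + REGION_SIZE - 1u, count, percent);
    }
}
```

Each printed address range can then be looked up in the map file to find the function that contains it.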

See Listing A for an example of an implementation.

Take control of printf()-style debug statements
The most commonly used form of debugging in an embedded system is the trusty old printf() function. It’s nothing fancy, but it can let you take a good look into the operation of the code without stopping the program execution.

The printf() debugging statements used for this purpose are generally unwanted in the final production code, especially when you have limited processing power, restricted memory, or both. For this reason, developers often treat printf() statements as throwaways. They add them at strategic points in the code to track down a problem and then later delete them or comment them out.

Of course, this approach requires manually (and often tediously) reinstating the debugging on subsequent visits to the code. What you really want is to be able to keep the debug statements in the source code and allow them to be easily enabled and disabled using a compile-time switch.

The wrong path
You might think the obvious solution would be to wrap the printf() function call in a macro. Let’s say you try something like Listing B. However, while this example may look like it will work, it won’t.

As Listing B shows, this approach works fine for a simple single-parameter string (1), but it doesn’t work when there’s more than one parameter (2). The compiler will complain that there are too many arguments to the macro (it only expects one).

Remember that the prototype for printf() is:
int printf(const char *fmt, ...);

The ellipsis (...) indicates that the function can take a variable number of arguments, which gives printf() its flexibility. Unfortunately, C as it stood before the C99 standard doesn’t give macros the same flexibility. On such a compiler you can’t pass a variable number of arguments to a macro, so you can’t define a macro in this way:
#define debug(x, ...)       printf(x, __VA_ARGS__)

The C99 language specification does allow you to define macros with variable-argument lists in just this way, as do some compilers using nonportable extensions, but you can’t count on this support in every embedded toolchain.
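If your compiler does support C99, a variadic debug macro might look like the sketch below. It folds the format string into the variable argument list rather than naming it separately, because in C99 a call such as debug("hello") expanded through printf(x, __VA_ARGS__) would leave a trailing comma and fail to compile. The function debug_demo() is a hypothetical caller.

```c
#include <stdio.h>

#define DEBUG   /* delete this line to compile the debug output away */

#ifdef DEBUG
/* C99 variadic macro: the format string is part of __VA_ARGS__, so calls
   with and without extra arguments both expand to valid code. */
#define debug(...)  printf(__VA_ARGS__)
#else
#define debug(...)  ((void)0)
#endif

/* Hypothetical caller showing both call shapes. */
int debug_demo(int value)
{
    debug("entering debug_demo\n");
    debug("value = %d\n", value);
    return value * 2;
}
```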

Double or nothing
The example in Listing C demonstrates the solution to the multiple-argument problem: using double parentheses.

Notice two small but important changes:

  • We redefined the debug() macro to exclude the parentheses after printf.
  • We added double parentheses around the parameters to the calls to the debug() macro.

Listing C works because the preprocessor treats the entire text within the inner parentheses (including the parentheses themselves) as the macro’s single argument when expanding it. Including the #define DEBUG line enables the printf debug code. Deleting the #define DEBUG line will completely remove all traces of the debug code from the runtime code.
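A minimal sketch of the double-parentheses technique follows. It illustrates the idea and is not a copy of Listing C; the function checked_divide() is a hypothetical caller, and expanding to ((void)0) when DEBUG is undefined is one possible choice for the disabled case.

```c
#include <stdio.h>

#define DEBUG   /* delete this line to compile the debug output away */

#ifdef DEBUG
/* The single macro argument is the whole parenthesized list, so
   debug(("x=%d\n", x)) expands to printf ("x=%d\n", x). */
#define debug(args)  printf args
#else
#define debug(args)  ((void)0)
#endif

/* Hypothetical caller with one-argument and multi-argument debug calls. */
int checked_divide(int a, int b)
{
    debug(("checked_divide called\n"));
    if (b == 0) {
        debug(("divide by zero: a=%d b=%d\n", a, b));
        return 0;
    }
    return a / b;
}
```

Expanding to ((void)0) rather than to nothing keeps each debug call a valid statement, so it still behaves correctly as the only statement in an if or else branch.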

A step further
You can extend this technique to allow differing levels of debug detail. There are several possible implementations. For example, if you want to perform the filtering entirely at compile time, you could use the code in Listing D.
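One way to filter entirely at compile time is sketched below, as an illustration rather than a reproduction of Listing D. The level numbering, the macro names debug_error(), debug_warn(), and debug_trace(), and the function clamp_reading() are all assumptions made for the example.

```c
#include <stdio.h>

/* Assumed convention: 0 = no debug output, 1 = errors,
   2 = errors and warnings, 3 = everything including trace. */
#define DEBUG_LEVEL 2

#if DEBUG_LEVEL >= 1
#define debug_error(args)  printf args
#else
#define debug_error(args)  ((void)0)
#endif

#if DEBUG_LEVEL >= 2
#define debug_warn(args)   printf args
#else
#define debug_warn(args)   ((void)0)
#endif

#if DEBUG_LEVEL >= 3
#define debug_trace(args)  printf args
#else
#define debug_trace(args)  ((void)0)
#endif

/* Hypothetical caller: clamps a reading to the range 0..100. */
int clamp_reading(int raw)
{
    debug_trace(("clamp_reading(%d)\n", raw));   /* compiled out at level 2 */
    if (raw < 0) {
        debug_error(("negative reading: %d\n", raw));
        return 0;
    }
    if (raw > 100) {
        debug_warn(("reading %d clamped to 100\n", raw));
        return 100;
    }
    return raw;
}
```

With this arrangement, messages above the chosen level cost nothing at run time because they never reach the compiler's code generator.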

Alternatively, if you want compile-time control over the debug code generation but you want runtime control over the filtering level, you could use the code in Listing E.
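A sketch of the runtime-filtered variant follows, again as an illustration and not a copy of Listing E. The variable debug_level, the macro debug_set_level(), and the function scale_sample() are hypothetical names.

```c
#include <stdio.h>

#define DEBUG   /* delete this line to strip all debug code at compile time */

#ifdef DEBUG
static int debug_level = 1;   /* runtime threshold; higher means more detail */

#define debug_set_level(n)  (debug_level = (n))

/* Print only when the message's level is at or below the current threshold. */
#define debug(level, args) \
    do { if ((level) <= debug_level) { printf args; } } while (0)
#else
#define debug_set_level(n)  ((void)0)
#define debug(level, args)  ((void)0)
#endif

/* Hypothetical caller with a low-detail and a high-detail message. */
int scale_sample(int raw)
{
    debug(1, ("scale_sample called\n"));
    debug(3, ("raw = %d\n", raw));   /* printed only when debug_level >= 3 */
    return raw * 2;
}
```

The trade-off is that every enabled debug call now costs a comparison at run time, and the format strings stay in the image even when the threshold suppresses them.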