LowLog

LowLog provides a suite of tools used to gather detailed runtime logs from C programs. The tools help automate different styles of logging including function call traces and runtime control flow decisions. This introduction describes how to apply LowLog to a project, introduces the format of the output logs, and demonstrates these ideas using a small example project.

Applying LowLog to a Project

Generate Preprocessed Source Code

LowLog takes as input the preprocessed source files from a C project. The preprocessed files are used by underlying tools implemented using CIL and requiring standard C without any preprocessor directives.

Preprocessed files typically use i as the suffix. Use the -E option to generate a preprocessed file using GCC. This is easy to integrate into a Makefile by adding a preprocessed rule along the lines of:

preprocessed: $(SRCS:.c=.i)

%.i : %.c
    $(CC) -E $(CFLAGS) $(INCDIRS) $< -o $@

Insert Logging Commands

LowLog currently automates four variations of logging:

Call trace logging using global function IDs

Each function is assigned a globally unique ID. A preamble is added to each function to log this ID at runtime when the function is called. This information can be used to trivially reconstruct a runtime call trace.

Call trace logging using locally scoped function IDs

Each functions called by a particular function, F, is assigned an ID unique within the scope of functions called by F. Function F is augmented to log the ID of each called function immediately before the function is called. When combined with a known starting point, this information can be used to reconstruct a runtime call trace.

Runtime control flow logging

Each branch of each control flow point in a function is given an ID unique relative to the other branches in the current control flow statement. When combined with a known starting point, this information can be used to reconstruct the precise path taken through a program at runtime and, as a side effect, the runtime call trace.

Reduced runtime control flow logging

Branches of control flow points impacting the set of possible future function calls from within a function are given an ID unique relative to the other branches in the current control flow statement. When combined with a known starting point, this information can be used to reconstruct the path taken to each function call at runtime and, as a side effect, the runtime call trace.

All of the above mechanisms record when a function returns to prevent ambiguity in reconstruction of the runtime behavior. Local ID logging and both control flow logging mechanisms use relative program location information to reconstruct runtime information. To prevent loss of this runtime information, the target of function pointer dispatches and interrupt handlers must be explicitly recorded at runtime.

Inserting Global ID Logging

Assume that the environment variable FILES is the list of preprocessed files used in the project.

  1. Generate a list of all functions

    list_functions $FILES > list.txt
  2. Assign each function a globally unique ID

    enumerate_functions.py list.txt | sort > enum_list.txt
  3. Insert preamble logging unique ID into each function

    for FILE in $FILES
    do
        preamble --function_ids enum_list.txt --out `basename $FILE .i`.preamble.c $FILE
    done
  4. Log all function returns

    for FILE in $FILES
    do
        returns --out `basename $FILE .i`.preamble.return.c `basename $FILE .i`.preamble.c
    done

Inserting Locally Scoped ID Logging

Assume that the environment variable FILES is the list of preprocessed files used in the project.

  1. Generate list of functions called by each function

    list_called_functions $FILES > called.txt
  2. Assign each function called an ID unique relative to the other functions called by the caller

    enumerate_called_functions.py called.txt > enum_called.txt
  3. Insert statement logging the ID of the callee before each function call

    for FILE in $FILES
    do
        local_ids --function_ids enum_called.txt --out `basename $FILE .i`.local.c $FILE
    done
  4. Generate list of potential jump points (interrupt handlers or function pointer targets) for each function.

    get_jmp_points.py -o jumps.txt $FILES
  5. Insert preamble to jump points logging the identity of the jump point.

    for FILE in $FILES
    do
        preamble --function_ids jumps.txt --out `basename $FILE .i`.local.jump.c `basename $FILE .i`.local.c
    done
  6. Log all function returns

    for FILE in $FILES
    do
        returns --out `basename $FILE .i`.local.jump.return.c `basename $FILE .i`.local.jump.c
    done

Inserting Runtime Control Flow Logging

Assume that the environment variable FILES is the list of preprocessed files used in the project.

  1. Generate the control flow graph for each file in the program

    for FILE in $FILES
    do
        extract_cfg $FILE > `basename $FILE .i`.dot
    done
  2. Perform a NULL reduction on the graph to generate an easy to parse representation of the full control flow within a file

    for FILE in $FILES
    do
        reduce_cfg.py -n `basename $FILE .i`.dot > `basename $FILE .i`.null.cfg
    done
  3. Insert path logging statements into the file

    for FILE in $FILES
    do
        cfg_branches --cfg_data `basename $FILE .i`.null.cfg --out `basename $FILE .i`.null.c $FILE
    done
  4. Generate list of potential jump points (interrupt handlers or function pointer targets) for each function.

    get_jmp_points.py -o jumps.txt $FILES
  5. Insert preamble to jump points logging the identity of the jump point.

    for FILE in $FILES
    do
        preamble --function_ids jumps.txt --out `basename $FILE .i`.null.jump.c `basename $FILE .i`.null.c
    done
  6. Log all function returns

    for FILE in $FILES
    do
        returns --out `basename $FILE .i`.null.jump.return.c `basename $FILE .i`.null.jump.c
    done

Inserting Reduced Runtime Control Flow Logging

This uses nearly the same procedure used to insert complete runtime control flow logging, but performs a real reduction in step 2. Assume that the environment variable FILES is the list of preprocessed files used in the project.

  1. Generate the control flow graph for each file in the program

    for FILE in $FILES
    do
        extract_cfg $FILE > `basename $FILE .i`.dot
    done
  2. Perform a reduction on the graph.

    for FILE in $FILES
    do
        reduce_cfg.py `basename $FILE .i`.dot > `basename $FILE .i`.reduced.cfg
    done
  3. Insert path logging statements into the file

    for FILE in $FILES
    do
        cfg_branches --cfg_data `basename $FILE .i`.reduced.cfg --out `basename $FILE .i`.reduced.c $FILE
    done
  4. Generate list of potential jump points (interrupt handlers or function pointer targets) for each function.

    get_jmp_points.py -o jumps.txt $FILES
  5. Insert preamble to jump points logging the identity of the jump point.

    for FILE in $FILES
    do
        preamble --function_ids jumps.txt --out `basename $FILE .i`.reduced.jump.c `basename $FILE .i`.reduced.c
    done
  6. Log all function returns

    for FILE in $FILES
    do
        returns --out `basename $FILE .i`.reduced.jump.return.c `basename $FILE .i`.reduced.jump.c
    done

Direct Logging Commands to Output or Storage Mechanism

The above procedure inserts calls to a dummy logging function called HOLDER_FUNC that takes two arguments: a 32-bit value to log and the number of significant bits in the value. This function should be modified to refer to a real logging, output, or storage function.

Assuming an externally available binary logging routine called bitbuffer_add_bit and defined in the file bitbuffer.h, the above HOLDER_FUNC can be replaced using a sed script similar to:

+

s/HOLDER_FUNC("\(.*\) \(.*\)");/bitbuffer_add_bit(\1, \2);/
s/extern void HOLDER_FUNC(char const   \*msg ) ;/#include "bitbuffer.h"/

Users of systems without 32-bit integers should also run all modified files through a sed script to force int32_t and unsigned int32_t to really be 32-bits:

+

s/typedef int int32_t;/typedef int int32_t __attribute__ ((__mode__ (__SI__)));/
s/typedef unsigned int uint32_t;/typedef unsigned int uint32_t __attribute__ ((__mode__ (__SI__)));/

When inserted into a file called rename.sed the above script can be executed using:

+

rename.sed < $IN_FILE > $OUT_FILE

Build the Revised Project

Build the project using the files generated by LowLog in place of the originals. For small projects these changes can be done by hand. For larger projects it may be more convient to revise the Makefile to use FILE.lowlog.c, where lowlog.c is an alternate subscript applied to files after renaming the logging commands to differintate them from the original c files.

Understanding LowLog Logs

TODO

Example LowLog Session

TODO