Back-Annotation of WCET Timing Data

The technical infrastructure of WCC described so far allows the effective WCET minimization by optimizations applied at ICD-LLIR level. At this level of abstraction, static WCET analysis takes place and WCET estimates are imported and made available within the compiler. However, WCET-aware optimizations taking place at the source code level based on ICD-C are not directly supported since WCET timing data is not available within ICD-C. But especially high-level optimizations focusing on function call and loop transformation may exhibit a large potential for WCET reduction. Using Back-Annotation within WCC, WCET timing data present at assembly code level is made available at the source code level.

Mapping of ICD-LLIR Structures to ICD-C Structures

In order to raise the abstraction level of WCET timing data from ICD-LLIR to ICD-C, a connection between objects of these two IRs needs to be established. For coarse-grained objects such as entire files or functions, this task is easy, since each assembly-level file corresponds to exactly one source-level file (functions analogously). A mapping between ICD-LLIR file objects to ICD-C file objects can be realized simply by using the file name as a key. Basically, the same holds for functions where the function name can be used as key. Care needs to be taken only for functions declared to be static (in the sense of ANSI-C) and thus having internal linkage. Since several static functions with the same name may exist, a pair of function name and file name is used as mapping key.

Mapping of fine-grained objects like basic blocks from ICD-LLIR to ICD-C is more complicated since a one-to-one mapping does not always exist. The following relations between assembly- and source-level basic blocks may occur:

1-to-1 Relation
A code snippet like, e.g.,

{
c += 2;
b += c;
res = a + b;
}

is obviously related to exactly one assembly-level basic block since no control flow modifications take place. For all ICD-LLIR basic blocks for which such a bijective 1-to-1 relation holds, a mapping to ICD-C basic blocks can be achieved using the block label as key.

n-to-1 Relation
Source codes modifying control flow explicitly or implicitly inside single ANSI-C expressions lead to an n-to-1 relation:

{
a = a / 2;
a = a * 100;
a = foo( a );
a = a - 22;
return a;
}

The above code is represented by one ICD-C basic block. However, it corresponds to two assembly-level basic blocks due to the call of function foo. Similar n-to-1 situations occur in the presence of the logical AND (&&) and OR (||) operators of ANSI-C, since they implicitly modify the control flow.

As a consequence, the mapping of ICD-LLIR basic blocks to ICD-C blocks becomes surjective. Using basic block labels, several assembly-level blocks are mapped to a single source-level block in the presence of an n-to-1 relation.

1-to-m Relation
For ANSI-C loops like the one shown below, several ICD-C blocks representing the loop body and the termination condition may correspond to a single ICD-LLIR basic block:

do {
a--;
sum += a;
} while ( a > 0 );

Such a loop is represented by only one assembly-level basic block since the computations of the loop body, the test of the termination condition and the conditional jump back to the loop header represent a sequence of code without any control flow modifications in between.

WCET timing data attached to an ICD-LLIR basic block for which a 1-to-m relation holds, must not be duplicated and attached to all m source-level blocks, since this would result in tampered global WCET estimates within ICD-C. Therefore, WCET data is attached to only one ICD-C block which can be freely chosen.

Since the code selector forms the gateway between source- and assembly-level IRs within the WCC compiler, it is straightforward that the determination of the relationships between ICD-C and ICD-LLIR basic blocks as well as the establishment of corresponding mappings is integrated in the code selector. All WCC optimizations taking place at the ICD-LLIR level are extended to automatically update all mappings when modifying assembly-level basic blocks.

Back-Annotation of WCET Data

After establishing the connection between assembly- and source-level basic blocks using the mappings presented above, WCET timing data attached to ICD-LLIR basic blocks can be back-annotated to ICD-C. Furthermore, additional data computed by the compiler back-end is passed to ICD-C, too. The following WCET data is back-annotated this way:

Context-independent WCET per basic block.
Information whether a basic block lies on the worst-case execution path (WCEP).
Execution context-specific worst-case execution frequencies per CFG edge.
Summarized worst-case execution frequencies per basic block. This value denotes the total number of executions of that block in the worst case for all execution contexts.
Number of I-cache misses per basic block as determined during WCET analysis.
Information whether a basic block is infeasible in all execution contexts.
Basic block size in bytes and number of spill instructions generated during register allocation.

Since the used mappings need not be bijective, it may happen that it is tried to attach some new WCET data to an ICD-C block for which already old WCET data exists. This kind of conflict is solved as follows:

The sum of the old and new WCET estimates is attached to the ICD-C block.
If the old WCEP attribute is false, the new WCEP attribute is attached to the ICD-C block.
The maximum of old and new worst-case execution counts is attached to the ICD-C block.
If the old infeasible attribute is true, the new infeasible attribute is attached to the ICD-C block.

This way, detailed and fine-grained WCET timing data is available within WCC for high-level optimization.

Links and Further Reading

A compiler framework for the reduction of worst-case execution times