Specification of Memory Hierarchies
Previous research has shown that the performance of today's systems is not constrained by the computing power of processors. Instead, the memory subsystem forms a bottleneck that slows down fast, modern processors. Particularly in the domain of hard real-time systems, such a slowdown is unacceptable. As a consequence, various optimizations within WCC aim to exploit memory hierarchies efficiently by moving portions of a program's code and data to fast memories. This page describes the infrastructure that WCC provides to support such memory hierarchy optimizations.
WCC makes it possible to model a freely definable memory hierarchy, to generate optimized code for it, and to statically analyze program objects distributed arbitrarily across it. The fundamental idea behind this infrastructure is to make information on the memory model available in the optimization stages at the ICD-LLIR level.
Memory Hierarchy Specification
To enable optimizations that move parts of a program across memories, information that is usually available only to the linker in a conventional compilation process must already be provided to the WCC compiler itself. This is motivated by the fact that the WCET analyzer integrated into WCC requires detailed information about a program's memory layout. The stand-alone implementation of aiT used a fully linked and relocated binary executable, whose binary image provides the entire memory layout. Now that aiT is an integral part of the WCC compilation framework, the compiler itself has to take linking and the memory layout of programs into account.
To specify memory hierarchies for WCC, a simple text file interface is provided. Such a memory layout description describes the different regions of a processor's physical memory hierarchy. For each physical memory region, the following attributes can be defined:
- The region's base address and absolute length
- Access attributes such as read, write, executable, allocatable
- Memory access times, specified in processor cycles
- Assembly-level sections that are allowed to be mapped to a memory region
The following snippet shows some parts of WCC's memory hierarchy specification for the Infineon TriCore TC1796 processor:
###############################################################################
#
# Memory layout of the TriCore1.3 (tc1796)
#
# Referring to tc1796_um_v2.0_2007_07.pdf
#
#
# Program Flash with cached access (PFLASH; PMU; L2)
#
[PFLASH-C]
origin = 0x80100000
length = 0x100000
attributes = RXAC # read/execute/allocate/cached
cycles = 6
memory_type = PFLASH
sections = .text_cached
#
# Program Flash with non-cached access (PFLASH; PMU; L2)
#
[PFLASH-NC]
origin = 0xa0001000
length = 0xFE000
attributes = RXA # read/execute/allocate
cycles = 6
memory_type = PFLASH
sections = .text .init .eh_frame .ctors .dtors .traptab .inttab
load = .text_spm .pcptext .pcpdata
#
# Data SRAM (SRAM; DMU; L2)
#
[DMU-SRAM]
origin = 0xc0000000
length = 0x10000 # 64K
attributes = RWA # read/write/allocate
cycles = 1
memory_type = SPRAM
sections = .data .bss .sbss .sdata .zbss .zdata
#
# Local Data RAM (LDRAM; DMI; L1)
#
[DMI-LDRAM]
origin = 0xd0004000
length = 0xa000 # 40K
attributes = RWA # read/write/allocatable
cycles = 1
memory_type = LDRAM
sections = .data_spm
#
# Program SPM (SPRAM; PMI; L1)
#
[PMI-SRAM]
origin = 0xd4000000
length = 0xbc00 # 47K; 1K is reserved for PMI-SYS
attributes = RWXA # read/write/execute/allocatable
cycles = 1
memory_type = SPRAM
sections = .text_spm
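The format shown above is simple enough to be read with a few lines of code. The following is a minimal sketch of such a reader, not WCC's actual parser: the names MemoryRegion, stripLine, splitWords and parseMemorySpec are hypothetical, and the handling of malformed lines (they are skipped) is an assumption.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory form of one [REGION] block (not WCC's own class).
struct MemoryRegion {
  std::string name;
  std::uint32_t origin = 0;
  std::uint32_t length = 0;
  std::string attributes;
  unsigned cycles = 0;
  std::string memoryType;
  std::vector<std::string> sections;  // sections mapped to this region
  std::vector<std::string> load;      // sections listed under "load"
};

// Removes a trailing '#' comment and surrounding whitespace.
static std::string stripLine(std::string s) {
  s = s.substr(0, s.find('#'));
  const auto first = s.find_first_not_of(" \t\r");
  if (first == std::string::npos) return "";
  const auto last = s.find_last_not_of(" \t\r");
  return s.substr(first, last - first + 1);
}

// Splits a whitespace-separated list such as ".data .bss .sbss".
static std::vector<std::string> splitWords(const std::string &s) {
  std::istringstream in(s);
  std::vector<std::string> words;
  for (std::string w; in >> w;) words.push_back(w);
  return words;
}

// Reads all regions from a specification; numeric values may be hex or decimal.
std::vector<MemoryRegion> parseMemorySpec(std::istream &in) {
  std::vector<MemoryRegion> regions;
  for (std::string raw; std::getline(in, raw);) {
    const std::string line = stripLine(raw);
    if (line.empty()) continue;
    if (line.front() == '[' && line.back() == ']') {
      regions.emplace_back();
      regions.back().name = line.substr(1, line.size() - 2);
      continue;
    }
    const auto eq = line.find('=');
    if (eq == std::string::npos || regions.empty()) continue;  // skip malformed
    const std::string key = stripLine(line.substr(0, eq));
    const std::string value = stripLine(line.substr(eq + 1));
    MemoryRegion &r = regions.back();
    if (key == "origin")
      r.origin = static_cast<std::uint32_t>(std::stoul(value, nullptr, 0));
    else if (key == "length")
      r.length = static_cast<std::uint32_t>(std::stoul(value, nullptr, 0));
    else if (key == "attributes") r.attributes = value;
    else if (key == "cycles")
      r.cycles = static_cast<unsigned>(std::stoul(value, nullptr, 0));
    else if (key == "memory_type") r.memoryType = value;
    else if (key == "sections") r.sections = splitWords(value);
    else if (key == "load") r.load = splitWords(value);
  }
  return regions;
}
```

Passing the [PMI-SRAM] block from the snippet above through parseMemorySpec would yield one region with origin 0xd4000000, length 0xbc00, one access cycle and the single section .text_spm.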
Memory Assignment
Now that the WCC compiler is aware of the memory hierarchy of the target processor, it needs to be able to move parts of a program to different memories of this hierarchy. This functionality is realized within the assembly-level ICD-LLIR IR. Within ICD-LLIR, a number of assembly-level sections are maintained. These sections adhere to the TriCore EABI specification. The sections directives of a memory hierarchy specification assign each assembly section to exactly one physical memory region to which it can be moved.
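The mapping from sections to memory regions can be illustrated with a small lookup table. This is a sketch under assumptions, not WCC code: RegionSections and buildSectionMap are hypothetical names, and rejecting a section that is listed under the sections directive of two regions is an assumed policy.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical pair of a region name and its "sections" directive.
struct RegionSections {
  std::string region;
  std::vector<std::string> sections;
};

// Builds the lookup "section name -> region name" and rejects any section
// that appears in the sections directive of more than one region.
std::map<std::string, std::string> buildSectionMap(
    const std::vector<RegionSections> &regions) {
  std::map<std::string, std::string> sectionToRegion;
  for (const auto &r : regions) {
    for (const auto &s : r.sections) {
      const auto result = sectionToRegion.emplace(s, r.region);
      if (!result.second) {
        throw std::invalid_argument("section " + s + " is mapped to both " +
                                    result.first->second + " and " + r.region);
      }
    }
  }
  return sectionToRegion;
}
```

With the TC1796 specification above, looking up .text_spm in such a table would return PMI-SRAM, and .data would return DMU-SRAM.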
Memory assignment within the WCC compiler is then performed by simply assigning parts of the ICD-LLIR IR to different sections. Currently, entire functions, basic blocks and data objects can be assigned to sections. ICD-LLIR provides a convenient API for such memory assignments of code and data blocks. In addition, ICD-LLIR provides symbol tables from which the physical memory address of each function, basic block or data object can be retrieved. This way, the memory layout of a program represented in ICD-LLIR is determined exactly. The information about this physical memory layout can then be passed to the static WCET analyzer in order to achieve a memory hierarchy-aware WCET analysis within WCC.
Using these structures provided by WCC and ICD-LLIR, moving, for example, a basic block bb from main memory to the program scratchpad memory can be realized quite easily as follows:
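// Look up the object section that is mapped to the program scratchpad memory.
LLIR_ObjectSectionLayout *layout = llir->getObjectSectionLayout();
LLIR_ObjectSection *so_spm = layout->findObjectSectionByName( mSectionMap[ SECTION_SPM ] );
// Move bb from its current section to the scratchpad section.
bb->detachFromSection();
bb->attachToSection( so_spm );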
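Once objects are attached to a section, their physical addresses follow from the base address of the memory region the section is mapped to. The sketch below illustrates this idea only; it is not how ICD-LLIR's symbol tables are implemented. PlacedObject and assignAddresses are hypothetical names, and both the strictly sequential placement and the 4-byte default alignment are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical description of a function, basic block or data object.
struct PlacedObject {
  std::string symbol;
  std::uint32_t size;  // size in bytes
};

// Places the objects one after another, starting at the region's origin,
// and returns a table "symbol -> physical address".
std::map<std::string, std::uint32_t> assignAddresses(
    std::uint32_t origin, std::uint32_t length,
    const std::vector<PlacedObject> &objects, std::uint32_t alignment = 4) {
  assert(alignment > 0);
  std::map<std::string, std::uint32_t> addressOf;
  std::uint64_t offset = 0;  // 64 bit, so the capacity check cannot wrap
  for (const auto &obj : objects) {
    // Round the offset up to the next multiple of the alignment.
    offset = (offset + alignment - 1) / alignment * alignment;
    if (offset + obj.size > length) {
      throw std::length_error("region too small for " + obj.symbol);
    }
    addressOf[obj.symbol] = origin + static_cast<std::uint32_t>(offset);
    offset += obj.size;
  }
  return addressOf;
}
```

For the program scratchpad region above (origin 0xd4000000, length 0xbc00), three objects of 10, 6 and 4 bytes would be placed at 0xd4000000, 0xd400000c and 0xd4000014 under these assumptions.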
As a last step, it must be guaranteed that the memory layout of an executable program generated by WCC exactly reflects the layout decisions WCC has taken during compilation. Since the binary executable is produced outside WCC by the linker, WCC must be able to guide the linker. For this purpose, WCC does not emit only assembly code; it also produces a GNU ld compatible linker script and invokes the linker with this script. This way, the memory layout of the binary executable matches exactly the layout determined by WCC's optimizations.
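The linker scripts that WCC generates are not shown on this page. As an illustration only, the following sketch turns region descriptions into a MEMORY command in GNU ld syntax. LinkRegion and emitMemoryBlock are hypothetical names. Two further assumptions are made: characters other than letters and digits in region names are replaced by underscores, and only the attribute letters R, W, X and A are passed on, so that the cached flag C of the specification is dropped.

```cpp
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical subset of a region description needed for the linker script.
struct LinkRegion {
  std::string name;
  std::uint32_t origin;
  std::uint32_t length;
  std::string attributes;  // e.g. "RXAC" as in the specification file
};

// Emits a MEMORY command with one line per region.
std::string emitMemoryBlock(const std::vector<LinkRegion> &regions) {
  std::ostringstream out;
  out << "MEMORY\n{\n";
  for (const auto &r : regions) {
    // Assumption: keep region names to letters, digits and underscores.
    std::string name = r.name;
    for (char &c : name) {
      if (!std::isalnum(static_cast<unsigned char>(c))) c = '_';
    }
    // Keep only the attribute letters this sketch passes on to ld.
    std::string attrs;
    for (char c : r.attributes) {
      const char lower =
          static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
      if (std::string("rwxa").find(lower) != std::string::npos) attrs += lower;
    }
    out << "  " << name << " (" << attrs << ") : ORIGIN = 0x" << std::hex
        << r.origin << ", LENGTH = 0x" << r.length << std::dec << "\n";
  }
  out << "}\n";
  return out.str();
}
```

A complete linker script would additionally need a SECTIONS command that places each assembly section into its region; that part is omitted here.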