Description
The chained block structure used by both the interpreter and the tier-1 compiler is linear: each block points only to its successor. Letting a block also reference its predecessor is valuable, especially for hot-spot profiling, and it paves the way for a graph-based intermediate representation (IR) in which edges represent use-define chains. Rather than working on a two-tiered control-flow graph (CFG) of basic blocks (tier 1) and instructions (tier 2), analyses and transformations would then read and modify this use-def information directly in a single-tiered graph.
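To make the predecessor link concrete, here is a minimal sketch in Python, chosen only for brevity. The `Block` class, its fields (`pc_start`, `next`, `prev`, `hits`), and the `chain` and `hot_path` helpers are all hypothetical names for illustration and do not come from the existing code base. It assumes each block keeps an execution counter and at most one predecessor.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Block:
    """Hypothetical translated block; all field names are illustrative."""
    pc_start: int                  # guest address where the block begins
    next: Optional[Block] = None   # existing forward link
    prev: Optional[Block] = None   # proposed backward link
    hits: int = 0                  # execution counter used for profiling


def chain(pred: Block, succ: Block) -> None:
    """Link two blocks in both directions."""
    pred.next = succ
    succ.prev = pred


def hot_path(block: Block, threshold: int) -> list[int]:
    """Walk backward from `block` through predecessors whose hit count
    is at least `threshold`, returning start addresses in execution order."""
    path = [block.pc_start]
    seen = {id(block)}             # guards against cycles (loops in guest code)
    cur = block.prev
    while cur is not None and id(cur) not in seen and cur.hits >= threshold:
        path.append(cur.pc_start)
        seen.add(id(cur))
        cur = cur.prev
    path.reverse()
    return path
```

With only a forward link, finding the blocks that lead into a hot block requires scanning every block. With `prev`, the walk costs time proportional to the length of the hot path. A real CFG block can have several predecessors, so a single `prev` pointer is a simplification that matches the linear chain described above.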
The sfuzz project employs a custom intermediate representation. The first step of code generation is to lift the entire function into this IR. During initialization, when the target is first loaded, the size of each function is determined by parsing the ELF metadata and building a hash map from function start addresses to their sizes.
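The address-to-size map can be sketched as follows. This is not sfuzz's actual code: the `Symbol` tuple and `build_function_sizes` are hypothetical, and the sketch assumes the ELF symbol table has already been decoded into `(name, value, size, info)` entries. The only ELF facts it relies on are that the symbol type is the low four bits of `st_info` and that `STT_FUNC` is 2.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple

STT_FUNC = 2  # ELF symbol type for functions


class Symbol(NamedTuple):
    """Hypothetical decoded symbol-table entry."""
    name: str
    value: int   # st_value: start address of the symbol
    size: int    # st_size: size in bytes
    info: int    # st_info: binding in the high nibble, type in the low nibble


def build_function_sizes(symbols: Iterable[Symbol]) -> dict[int, int]:
    """Map each function's start address to its size in bytes."""
    sizes: dict[int, int] = {}
    for sym in symbols:
        is_function = (sym.info & 0xF) == STT_FUNC
        # Skip zero-sized entries: the lifter needs a known extent.
        if is_function and sym.size > 0:
            sizes[sym.value] = sym.size
    return sizes


symbols = [
    Symbol("main", 0x10000, 64, 0x12),   # GLOBAL | FUNC
    Symbol("buf", 0x20000, 256, 0x11),   # GLOBAL | OBJECT, ignored
]
function_sizes = build_function_sizes(symbols)
```

Given such a map, the lifter can look up the start address of the function it is about to compile and know how many bytes of instructions to read.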
The IR-lifting process iterates over the original instructions and emits an IR instruction for each one using a large switch statement. The following example shows what the IR might look like for a minimal function whose first block branches on the result of a comparison.