The Future of GCC


The current architecture of GCC has a number of deficiencies that we'd like to address.


Correcting these things will require some substantial changes to GCC. The next diagram shows what I'd like the final structure to be.



We start with a bunch of parsers, just as we have now. (The question of whether or not certain parsers should be merged is not addressed here; it makes no difference to the overall design.)

The parsers call into an 'IL library', much as they now call into the middle-end of GCC. The interface will be similar to the current GENERIC, although hopefully a smaller, more efficient GENERIC than the one we have today.
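To make the idea of a parser-facing interface concrete, here is a minimal sketch of what calling into such an IL library could look like. Everything in it is hypothetical: the `ILBuilder` class and its `const`, `assign` and `use` methods are invented for illustration and are not part of GENERIC or any existing GCC interface. It only handles straight-line code (no control flow, so no phi nodes), and simply gives each assignment a fresh versioned name so that the result is already in SSA form.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# One IL instruction: (destination SSA name, operation, operand names/literals).
Instr = Tuple[str, str, Tuple[str, ...]]


@dataclass
class ILBuilder:
    """Hypothetical front-end-facing interface; not an existing GCC API."""
    instrs: List[Instr] = field(default_factory=list)
    versions: Dict[str, int] = field(default_factory=dict)  # var -> last version number
    current: Dict[str, str] = field(default_factory=dict)   # var -> current SSA name

    def _fresh(self, var: str) -> str:
        """Create the next SSA version of a source-level variable."""
        number = self.versions.get(var, 0) + 1
        self.versions[var] = number
        name = f"{var}_{number}"
        self.current[var] = name
        return name

    def use(self, var: str) -> str:
        """Return the SSA name currently holding the value of `var`."""
        return self.current[var]

    def const(self, var: str, value: int) -> str:
        """Emit `var = <constant>`."""
        dest = self._fresh(var)
        self.instrs.append((dest, "const", (str(value),)))
        return dest

    def assign(self, var: str, op: str, *operands: str) -> str:
        """Emit `var = op(operands...)`; operands are source-level variable names."""
        # Resolve operands first, so `x = x + x` reads the old version of x.
        args = tuple(self.use(v) for v in operands)
        dest = self._fresh(var)
        self.instrs.append((dest, op, args))
        return dest


# What a parser might emit for:  x = 1; x = x + x; y = x * x;
builder = ILBuilder()
builder.const("x", 1)
builder.assign("x", "add", "x", "x")
builder.assign("y", "mul", "x", "x")
for instr in builder.instrs:
    print(instr)
```

The point of the sketch is only that the parser deals in source-level variables while the library hands back a form in which every name is defined exactly once.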

The IL library creates an internal representation in SSA form, suitable for optimisation. This internal form can either be sent directly to a very fast code generator that would run at -O0, or sent through one or more simple optimisers, including at least dead code removal, and maybe CSE and other optimisations that simplify the code. The aim here is not to generate great code, but to reduce the amount of work that later passes must do.
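As an illustration of why SSA form makes the simple optimisers cheap, the following is a minimal sketch of dead code removal over a toy SSA instruction list. The `Instr` record and the `remove_dead_code` function are assumptions made up for this example, not GCC data structures; the algorithm is a plain mark-and-sweep that starts from instructions with side effects and keeps whatever they transitively use.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Toy SSA instruction (hypothetical, for illustration only)."""
    dest: Optional[str]          # SSA name defined here, or None
    op: str
    args: Tuple[str, ...] = ()   # SSA names or literal operands
    side_effect: bool = False    # e.g. return, store, call


def remove_dead_code(instrs: List[Instr]) -> List[Instr]:
    """Keep side-effecting instructions and everything they depend on."""
    # In SSA every name has exactly one definition, so this map is unambiguous.
    defining = {ins.dest: idx for idx, ins in enumerate(instrs) if ins.dest is not None}

    live = set()
    worklist = [idx for idx, ins in enumerate(instrs) if ins.side_effect]
    while worklist:
        idx = worklist.pop()
        if idx in live:
            continue
        live.add(idx)
        for name in instrs[idx].args:
            if name in defining:          # literals have no definition; skip them
                worklist.append(defining[name])

    # Preserve the original instruction order.
    return [ins for idx, ins in enumerate(instrs) if idx in live]


routine = [
    Instr("a_1", "const", ("4",)),
    Instr("b_1", "const", ("5",)),
    Instr("c_1", "add", ("a_1", "b_1")),
    Instr("d_1", "mul", ("a_1", "a_1")),               # result never used
    Instr(None, "return", ("c_1",), side_effect=True),
]

kept = remove_dead_code(routine)
print([ins.op for ins in kept])   # ['const', 'const', 'add', 'return']
```

Because each name has a single definition, finding the instructions a value depends on is a dictionary lookup, which is what keeps a pass like this fast enough to run before the IL is written out.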

Then, the IL is written out to an on-disk format, together with whatever analysis information makes the job of the remaining passes easier (trading off a little disk space against faster compilation). The on-disk format can be directly executed, possibly even JIT compiled if that turns out to be faster than the -O0 code generator.

Usually, though, the on-disk format will be passed to what we now consider to be a GCC frontend. The new frontend will be able to read multiple IL files and merge them. It may also be desirable, in the single-file case, to skip the writing and reading and just pass the data in memory. If it is on disk, the IL will not be read all at once, but only as needed for inlining or to output a particular routine.
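The combination of merging several IL files and reading routines only on demand can be sketched as follows. The file layout here is purely an assumption for the sake of the example (a 4-byte length, a JSON index mapping routine names to offsets, then the routine bodies); no actual on-disk format is specified above, and `write_il_file`, `ILFile` and `MergedProgram` are hypothetical names.

```python
import json
import os
import struct
import tempfile
from typing import Dict, List


def write_il_file(path: str, routines: Dict[str, bytes]) -> None:
    """Write a toy IL file: 4-byte header length, JSON index, then routine bodies."""
    index, offset = {}, 0
    for name, body in routines.items():
        index[name] = [offset, len(body)]
        offset += len(body)
    header = json.dumps(index).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(header)))
        f.write(header)
        for body in routines.values():
            f.write(body)


class ILFile:
    """Reads only the index up front; routine bodies are fetched on request."""

    def __init__(self, path: str) -> None:
        self.path = path
        with open(path, "rb") as f:
            (header_len,) = struct.unpack("<I", f.read(4))
            self.index = json.loads(f.read(header_len).decode("utf-8"))
        self.body_start = 4 + header_len

    def load(self, name: str) -> bytes:
        offset, length = self.index[name]
        with open(self.path, "rb") as f:
            f.seek(self.body_start + offset)
            return f.read(length)


class MergedProgram:
    """Merges the indexes of several IL files without reading any bodies."""

    def __init__(self, paths: List[str]) -> None:
        self.owner: Dict[str, ILFile] = {}
        self.cache: Dict[str, bytes] = {}
        for path in paths:
            il_file = ILFile(path)
            for name in il_file.index:
                if name in self.owner:
                    raise ValueError(f"duplicate definition of {name}")
                self.owner[name] = il_file

    def routine(self, name: str) -> bytes:
        """Load a routine body the first time it is needed, e.g. for inlining."""
        if name not in self.cache:
            self.cache[name] = self.owner[name].load(name)
        return self.cache[name]


with tempfile.TemporaryDirectory() as tmp:
    a_path = os.path.join(tmp, "a.il")
    b_path = os.path.join(tmp, "b.il")
    write_il_file(a_path, {"main": b"call helper", "unused": b"nop"})
    write_il_file(b_path, {"helper": b"ret 1"})

    program = MergedProgram([a_path, b_path])
    print(program.routine("helper"))   # b'ret 1'
    print(sorted(program.cache))       # ['helper']; 'main' and 'unused' were never read
```

In this sketch, merging costs only the size of the indexes, and a routine that is never inlined or output is never read from disk. A real format would also need to carry types, symbols and the analysis information mentioned earlier, which this toy layout ignores.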

Each routine, or possibly a group of routines that are being optimised as a unit, will then be passed to the later single-routine optimisers and through RTL-based code generation.