cgccoptimizationcompilationswitch-statement

Can anything be done to get GCC to compile a switch statement with thousands of cases faster?


I've been working on a decompiler (for Glulx) which produces C code. In one mode it outputs all of the code in one switch statement (switching on the program counter). A medium sized input file produces almost 11 thousand cases, and GCC takes a long time to compile it - almost 5 minutes on my computer, and almost 10 minutes on Github Actions. I graphed how long it takes GCC to compile files with switch statements of various lengths:

A graph of number of cases vs time

As you can see, it's non-linear, each thousand more cases taking longer than the last thousand.

Is there anything that can be done to speed up GCC's compilation of this switch statement? Are there any hints to the compiler I can insert into the code, or any settings to change (I've been testing and timing this in -O0)? A switch statement like this, with many sparse cases, would be compiled into a binary tree, so I did consider whether I could manually split it up into blocks of 1000-2000 cases, but that might not actually make the compilation any faster; after all I'd just be manually doing what the compiler would do itself. (I haven't tried this yet.) Any ideas?

If it's relevant, most cases have fall-throughs: of 10898 cases, only 637 have an unconditional break at the end.

Edit: Nate Eldredge very helpfully pointed out in the comments that this is a regression from GCC 8.4. I have now reported this at the GCC bug tracker.


Results from compiling with -ftime-report -ftime-report-details:

Time variable                                   usr           sys          wall               GGC
 phase setup                        :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    1240 kB (  1%)
 phase parsing                      :   1.26 (  0%)   0.87 ( 71%)   2.13 (  1%)   31503 kB ( 14%)
 phase opt and generate             : 277.02 (100%)   0.35 ( 29%) 277.48 ( 99%)  197271 kB ( 86%)
 callgraph construction             :   0.08 (  0%)   0.01 (  1%)   0.09 (  0%)   27116 kB ( 12%)
 `- tree eh                         :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)      76 kB (  0%)
 `- tree gimplify                   :   0.25 (  0%)   0.01 (  1%)   0.26 (  0%)   35760 kB ( 16%)
 callgraph optimization             :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)       0 kB (  0%)
 `- inline parameters               :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)    2047 kB (  1%)
 `- callgraph construction          :   0.03 (  0%)   0.00 (  0%)   0.02 (  0%)   14151 kB (  6%)
 `- ipa dead code removal           :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 callgraph ipa passes               :   0.33 (  0%)   0.10 (  8%)   0.43 (  0%)   27623 kB ( 12%)
 ipa dead code removal              :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 ipa inlining heuristics            :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 cfg construction                   :   0.59 (  0%)   0.01 (  1%)   0.61 (  0%)    6684 kB (  3%)
 `- rebuild jump labels             :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 cfg cleanup                        :   0.12 (  0%)   0.00 (  0%)   0.12 (  0%)     417 kB (  0%)
 trivially dead code                :   0.13 (  0%)   0.00 (  0%)   0.12 (  0%)       0 kB (  0%)
 df scan insns                      :   0.30 (  0%)   0.03 (  2%)   0.34 (  0%)       0 kB (  0%)
 df live regs                       :   0.10 (  0%)   0.01 (  1%)   0.11 (  0%)       0 kB (  0%)
 df reg dead/unused notes           :   0.11 (  0%)   0.00 (  0%)   0.11 (  0%)    5115 kB (  2%)
 register information               :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 alias analysis                     :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)    2048 kB (  1%)
 alias stmt walking                 :   0.01 (  0%)   0.01 (  1%)   0.07 (  0%)       0 kB (  0%)
 rebuild jump labels                :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)      27 kB (  0%)
 preprocessing                      :   0.23 (  0%)   0.27 ( 22%)   0.69 (  0%)    8646 kB (  4%)
 lexical analysis                   :   0.45 (  0%)   0.28 ( 23%)   0.76 (  0%)       0 kB (  0%)
 `- preprocessing                   :   0.23 (  0%)   0.27 ( 22%)   0.69 (  0%)    8646 kB (  4%)
 parser (global)                    :   0.03 (  0%)   0.00 (  0%)   0.00 (  0%)     949 kB (  0%)
 `- lexical analysis                :   0.00 (  0%)   0.01 (  1%)   0.00 (  0%)       0 kB (  0%)
 parser function body               :   0.55 (  0%)   0.32 ( 26%)   0.68 (  0%)   21899 kB ( 10%)
 `- lexical analysis                :   0.20 (  0%)   0.13 ( 11%)   0.45 (  0%)       0 kB (  0%)
 inline parameters                  :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)    2047 kB (  1%)
 tree gimplify                      :   0.25 (  0%)   0.01 (  1%)   0.26 (  0%)   35760 kB ( 16%)
 tree eh                            :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)      76 kB (  0%)
 tree CFG construction              :   0.03 (  0%)   0.01 (  1%)   0.03 (  0%)    8036 kB (  3%)
 `- tree CFG cleanup                :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 tree CFG cleanup                   :   0.10 (  0%)   0.00 (  0%)   0.11 (  0%)      14 kB (  0%)
 `- tree CFG cleanup                :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)       0 kB (  0%)
 tree PHI insertion                 :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)    3820 kB (  2%)
 tree SSA rewrite                   :   0.03 (  0%)   0.00 (  0%)   0.04 (  0%)    1948 kB (  1%)
 tree SSA other                     :   0.05 (  0%)   0.02 (  2%)   0.10 (  0%)       0 kB (  0%)
 `- tree SSA rewrite                :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)    1948 kB (  1%)
 `- tree PHI insertion              :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)    3820 kB (  2%)
 `- tree operand scan               :   0.06 (  0%)   0.06 (  5%)   0.11 (  0%)    5653 kB (  2%)
 `- dominance computation           :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 tree SSA incremental               :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)       0 kB (  0%)
 `- tree SSA rewrite                :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)       0 kB (  0%)
 `- dominance frontiers             :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 `- dominance computation           :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 tree operand scan                  :   0.06 (  0%)   0.06 (  5%)   0.12 (  0%)    5989 kB (  3%)
 tree switch lowering               : 271.78 ( 98%)   0.08 (  7%) 271.97 ( 97%)   10454 kB (  5%)
 `- tree CFG cleanup                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 `- tree operand scan               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     336 kB (  0%)
 dominance frontiers                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 dominance computation              :   0.06 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 out of ssa                         :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)     510 kB (  0%)
 expand vars                        :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)    7694 kB (  3%)
 expand                             :   0.29 (  0%)   0.02 (  2%)   0.31 (  0%)   46170 kB ( 20%)
 `- out of ssa                      :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)     510 kB (  0%)
 `- expand vars                     :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)    7694 kB (  3%)
 `- post expand cleanups            :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 post expand cleanups               :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)     246 kB (  0%)
 `- rebuild jump labels             :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)      27 kB (  0%)
 jump                               :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 `- cfg cleanup                     :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)       0 kB (  0%)
 `- trivially dead code             :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)       0 kB (  0%)
 loop init                          :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)       1 kB (  0%)
 `- dominance computation           :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)       0 kB (  0%)
 loop fini                          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 integrated RA                      :   1.04 (  0%)   0.05 (  4%)   1.09 (  0%)   24640 kB ( 11%)
 `- df live regs                    :   0.05 (  0%)   0.01 (  1%)   0.06 (  0%)       0 kB (  0%)
 `- register information            :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 `- df reg dead/unused notes        :   0.11 (  0%)   0.00 (  0%)   0.11 (  0%)    5115 kB (  2%)
 `- trivially dead code             :   0.07 (  0%)   0.00 (  0%)   0.06 (  0%)       0 kB (  0%)
 `- alias analysis                  :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)    2048 kB (  1%)
 LRA non-specific                   :   0.38 (  0%)   0.00 (  0%)   0.40 (  0%)      65 kB (  0%)
 `- LRA virtuals elimination        :   0.06 (  0%)   0.00 (  0%)   0.05 (  0%)     143 kB (  0%)
 `- LRA create live ranges          :   0.07 (  0%)   0.01 (  1%)   0.08 (  0%)      11 kB (  0%)
 `- LRA hard reg assignment         :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 LRA virtuals elimination           :   0.06 (  0%)   0.00 (  0%)   0.05 (  0%)     143 kB (  0%)
 LRA create live ranges             :   0.07 (  0%)   0.01 (  1%)   0.08 (  0%)      11 kB (  0%)
 LRA hard reg assignment            :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 reload                             :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 `- integrated RA                   :   0.23 (  0%)   0.00 (  0%)   0.23 (  0%)       0 kB (  0%)
 `- LRA non-specific                :   0.11 (  0%)   0.00 (  0%)   0.11 (  0%)       0 kB (  0%)
 thread pro- & epilogue             :   0.16 (  0%)   0.00 (  0%)   0.16 (  0%)       2 kB (  0%)
 `- cfg cleanup                     :   0.05 (  0%)   0.00 (  0%)   0.04 (  0%)     401 kB (  0%)
 `- df live regs                    :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)       0 kB (  0%)
 machine dep reorg                  :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 shorten branches                   :   0.14 (  0%)   0.00 (  0%)   0.14 (  0%)       0 kB (  0%)
 final                              :   0.34 (  0%)   0.01 (  1%)   0.34 (  0%)    8094 kB (  4%)
 uninit var analysis                :   0.04 (  0%)   0.01 (  1%)   0.01 (  0%)       0 kB (  0%)
 `- alias stmt walking              :   0.01 (  0%)   0.01 (  1%)   0.07 (  0%)       0 kB (  0%)
 `- dominance computation           :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 initialize rtl                     :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      12 kB (  0%)
 rest of compilation                :   0.20 (  0%)   0.00 (  0%)   0.17 (  0%)     116 kB (  0%)
 `- machine dep reorg               :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 `- final                           :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)    8016 kB (  3%)
 `- cfg construction                :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 `- integrated RA                   :   0.67 (  0%)   0.04 (  3%)   0.70 (  0%)   16448 kB (  7%)
 `- loop fini                       :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 `- shorten branches                :   0.14 (  0%)   0.00 (  0%)   0.14 (  0%)       0 kB (  0%)
 `- initialize rtl                  :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      12 kB (  0%)
 `- df scan insns                   :   0.30 (  0%)   0.03 (  2%)   0.34 (  0%)       0 kB (  0%)
 `- tree CFG cleanup                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 repair loop structures             :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)       0 kB (  0%)
 `- loop init                       :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)       0 kB (  0%)
 `- dominance computation           :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 TOTAL                              : 278.28          1.22        279.62         230025 kB

Solution

  • This was a bug in GCC. It has been fixed in GCC 11.