LCOV - code coverage report
Current view: top level - gcc - bb-reorder.c (source / functions)
Test:     gcc.info
Date:     2020-05-30 12:51:24
Coverage: Lines: 1021 / 1191 (85.7 %) | Functions: 37 / 41 (90.2 %) | Branches: 0 / 0 (-)
Legend:   Lines: hit / not hit | Branches: + taken, - not taken, # not executed

           Branch data     Line data    Source code
       1                 :            : /* Basic block reordering routines for the GNU compiler.
       2                 :            :    Copyright (C) 2000-2020 Free Software Foundation, Inc.
       3                 :            : 
       4                 :            :    This file is part of GCC.
       5                 :            : 
       6                 :            :    GCC is free software; you can redistribute it and/or modify it
       7                 :            :    under the terms of the GNU General Public License as published by
       8                 :            :    the Free Software Foundation; either version 3, or (at your option)
       9                 :            :    any later version.
      10                 :            : 
      11                 :            :    GCC is distributed in the hope that it will be useful, but WITHOUT
      12                 :            :    ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
      13                 :            :    or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
      14                 :            :    License for more details.
      15                 :            : 
      16                 :            :    You should have received a copy of the GNU General Public License
      17                 :            :    along with GCC; see the file COPYING3.  If not see
      18                 :            :    <http://www.gnu.org/licenses/>.  */
      19                 :            : 
      20                 :            : /* This file contains the "reorder blocks" pass, which changes the control
      21                 :            :    flow of a function to encounter fewer branches; the "partition blocks"
      22                 :            :    pass, which divides the basic blocks into "hot" and "cold" partitions,
      23                 :            :    which are kept separate; and the "duplicate computed gotos" pass, which
      24                 :            :    duplicates blocks ending in an indirect jump.
      25                 :            : 
      26                 :            :    There are two algorithms for "reorder blocks": the "simple" algorithm,
      27                 :            :    which just rearranges blocks, trying to minimize the number of executed
      28                 :            :    unconditional branches; and the "software trace cache" algorithm, which
      29                 :            :    also copies code, and in general tries a lot harder to have long linear
      30                 :            :    pieces of machine code executed.  This algorithm is described next.  */
      31                 :            : 
      32                 :            : /* This (greedy) algorithm constructs traces in several rounds.
      33                 :            :    The construction starts from "seeds".  The seed for the first round
       34                 :            :    is the entry point of the function.  When there is more than one seed,
      35                 :            :    the one with the lowest key in the heap is selected first (see bb_to_key).
      36                 :            :    Then the algorithm repeatedly adds the most probable successor to the end
      37                 :            :    of a trace.  Finally it connects the traces.
      38                 :            : 
      39                 :            :    There are two parameters: Branch Threshold and Exec Threshold.
      40                 :            :    If the probability of an edge to a successor of the current basic block is
      41                 :            :    lower than Branch Threshold or its count is lower than Exec Threshold,
      42                 :            :    then the successor will be the seed in one of the next rounds.
      43                 :            :    Each round has these parameters lower than the previous one.
      44                 :            :    The last round has to have these parameters set to zero so that the
      45                 :            :    remaining blocks are picked up.
      46                 :            : 
      47                 :            :    The algorithm selects the most probable successor from all unvisited
      48                 :            :    successors and successors that have been added to this trace.
       49                 :            :    The other successors (those that have not been "sent" to the next round)
       50                 :            :    will be other seeds for this round, and the secondary traces start from them.
      51                 :            :    If the successor has not been visited in this trace, it is added to the
      52                 :            :    trace (however, there is some heuristic for simple branches).
      53                 :            :    If the successor has been visited in this trace, a loop has been found.
      54                 :            :    If the loop has many iterations, the loop is rotated so that the source
      55                 :            :    block of the most probable edge going out of the loop is the last block
      56                 :            :    of the trace.
      57                 :            :    If the loop has few iterations and there is no edge from the last block of
      58                 :            :    the loop going out of the loop, the loop header is duplicated.
      59                 :            : 
      60                 :            :    When connecting traces, the algorithm first checks whether there is an edge
      61                 :            :    from the last block of a trace to the first block of another trace.
      62                 :            :    When there are still some unconnected traces it checks whether there exists
      63                 :            :    a basic block BB such that BB is a successor of the last block of a trace
      64                 :            :    and BB is a predecessor of the first block of another trace.  In this case,
      65                 :            :    BB is duplicated, added at the end of the first trace and the traces are
      66                 :            :    connected through it.
       67                 :            :    The remaining traces are simply connected in order, so there is a jump to
       68                 :            :    the beginning of each remaining trace.
      69                 :            : 
      70                 :            :    The above description is for the full algorithm, which is used when the
      71                 :            :    function is optimized for speed.  When the function is optimized for size,
      72                 :            :    in order to reduce long jumps and connect more fallthru edges, the
      73                 :            :    algorithm is modified as follows:
       74                 :            :    (1) Break long traces into short ones.  A trace is broken at a block that
       75                 :            :    has multiple predecessors/successors during trace discovery.  When connecting
      76                 :            :    traces, only connect Trace n with Trace n + 1.  This change reduces most
      77                 :            :    long jumps compared with the above algorithm.
      78                 :            :    (2) Ignore the edge probability and count for fallthru edges.
      79                 :            :    (3) Keep the original order of blocks when there is no chance to fall
      80                 :            :    through.  We rely on the results of cfg_cleanup.
      81                 :            : 
       82                 :            :    To implement the change for code size optimization, the block's index is
      83                 :            :    selected as the key and all traces are found in one round.
      84                 :            : 
      85                 :            :    References:
      86                 :            : 
      87                 :            :    "Software Trace Cache"
      88                 :            :    A. Ramirez, J. Larriba-Pey, C. Navarro, J. Torrellas and M. Valero; 1999
      89                 :            :    http://citeseer.nj.nec.com/15361.html
      90                 :            : 
      91                 :            : */
      92                 :            : 
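A minimal, self-contained C sketch of the greedy trace growth described above,
under invented assumptions (a five-block CFG, per-mille edge probabilities and
a hand-picked threshold; the names N_BLOCKS, prob, visited and grow_trace are
all hypothetical).  It only illustrates the "most probable successor above
threshold" rule; it is not the bb-reorder.c implementation (no heaps, counts,
loop rotation or trace connection).

#include <stdio.h>

#define N_BLOCKS 5

/* prob[i][j]: probability (in per mille) of the edge i -> j, 0 if absent.  */
static const int prob[N_BLOCKS][N_BLOCKS] = {
  /* 0 */ {0, 700, 300,    0,    0},
  /* 1 */ {0,   0,   0,  900,  100},
  /* 2 */ {0,   0,   0, 1000,    0},
  /* 3 */ {0,   0,   0,    0, 1000},
  /* 4 */ {0,   0,   0,    0,    0},
};

static int visited[N_BLOCKS];

/* Grow one trace from SEED: repeatedly append the most probable yet-unvisited
   successor whose edge probability is at least BRANCH_TH per mille.  */
static void
grow_trace (int seed, int branch_th)
{
  int bb = seed;
  printf ("trace:");
  while (bb >= 0 && !visited[bb])
    {
      visited[bb] = 1;
      printf (" %d", bb);
      int best = -1, best_prob = 0;
      for (int succ = 0; succ < N_BLOCKS; succ++)
        if (!visited[succ]
            && prob[bb][succ] >= branch_th
            && prob[bb][succ] > best_prob)
          {
            best = succ;
            best_prob = prob[bb][succ];
          }
      bb = best;                /* best == -1 terminates the trace.  */
    }
  printf ("\n");
}

int
main (void)
{
  /* First round with a high threshold; a later round would lower it so that
     the blocks skipped here become new seeds (cf. branch_threshold[] further
     down in this file).  */
  grow_trace (0, 400);                  /* prints "trace: 0 1 3 4" */
  for (int bb = 0; bb < N_BLOCKS; bb++)
    if (!visited[bb])
      grow_trace (bb, 0);               /* prints "trace: 2" */
  return 0;
}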
      93                 :            : #include "config.h"
      94                 :            : #include "system.h"
      95                 :            : #include "coretypes.h"
      96                 :            : #include "backend.h"
      97                 :            : #include "target.h"
      98                 :            : #include "rtl.h"
      99                 :            : #include "tree.h"
     100                 :            : #include "cfghooks.h"
     101                 :            : #include "df.h"
     102                 :            : #include "memmodel.h"
     103                 :            : #include "optabs.h"
     104                 :            : #include "regs.h"
     105                 :            : #include "emit-rtl.h"
     106                 :            : #include "output.h"
     107                 :            : #include "expr.h"
     108                 :            : #include "tree-pass.h"
     109                 :            : #include "cfgrtl.h"
     110                 :            : #include "cfganal.h"
     111                 :            : #include "cfgbuild.h"
     112                 :            : #include "cfgcleanup.h"
     113                 :            : #include "bb-reorder.h"
     114                 :            : #include "except.h"
     115                 :            : #include "alloc-pool.h"
     116                 :            : #include "fibonacci_heap.h"
     117                 :            : #include "stringpool.h"
     118                 :            : #include "attribs.h"
     119                 :            : #include "common/common-target.h"
     120                 :            : 
     121                 :            : /* The number of rounds.  In most cases there will only be 4 rounds, but
     122                 :            :    when partitioning hot and cold basic blocks into separate sections of
     123                 :            :    the object file there will be an extra round.  */
     124                 :            : #define N_ROUNDS 5
     125                 :            : 
     126                 :            : struct target_bb_reorder default_target_bb_reorder;
     127                 :            : #if SWITCHABLE_TARGET
     128                 :            : struct target_bb_reorder *this_target_bb_reorder = &default_target_bb_reorder;
     129                 :            : #endif
     130                 :            : 
     131                 :            : #define uncond_jump_length \
     132                 :            :   (this_target_bb_reorder->x_uncond_jump_length)
     133                 :            : 
     134                 :            : /* Branch thresholds in thousandths (per mille) of the REG_BR_PROB_BASE.  */
     135                 :            : static const int branch_threshold[N_ROUNDS] = {400, 200, 100, 0, 0};
     136                 :            : 
     137                 :            : /* Exec thresholds in thousandths (per mille) of the count of bb 0.  */
     138                 :            : static const int exec_threshold[N_ROUNDS] = {500, 200, 50, 0, 0};
     139                 :            : 
      140                 :            : /* If an edge's count is lower than DUPLICATION_THRESHOLD per mille of the entry
      141                 :            :    block's count, the edge destination is not duplicated while connecting traces.  */
     142                 :            : #define DUPLICATION_THRESHOLD 100
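As a worked example of how these per-mille values are applied further down in
find_traces and find_traces_1_round: in the first round, branch_threshold[0]
is 400, so the branch cutoff passed down is REG_BR_PROB_BASE * 400 / 1000,
i.e. 40% of REG_BR_PROB_BASE, and exec_threshold[0] is 500, so count_threshold
becomes max_entry_count.apply_scale (500, 1000), i.e. half of the entry
block's count.  A successor reached through an edge whose probability or count
falls below these cutoffs is left as a seed for a later round, and by the
fourth round both thresholds are 0, so every remaining block is accepted.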
     143                 :            : 
     144                 :            : typedef fibonacci_heap <long, basic_block_def> bb_heap_t;
     145                 :            : typedef fibonacci_node <long, basic_block_def> bb_heap_node_t;
     146                 :            : 
     147                 :            : /* Structure to hold needed information for each basic block.  */
     148                 :            : struct bbro_basic_block_data
     149                 :            : {
     150                 :            :   /* Which trace is the bb start of (-1 means it is not a start of any).  */
     151                 :            :   int start_of_trace;
     152                 :            : 
     153                 :            :   /* Which trace is the bb end of (-1 means it is not an end of any).  */
     154                 :            :   int end_of_trace;
     155                 :            : 
     156                 :            :   /* Which trace is the bb in?  */
     157                 :            :   int in_trace;
     158                 :            : 
     159                 :            :   /* Which trace was this bb visited in?  */
     160                 :            :   int visited;
     161                 :            : 
     162                 :            :   /* Cached maximum frequency of interesting incoming edges.
     163                 :            :      Minus one means not yet computed.  */
     164                 :            :   int priority;
     165                 :            : 
     166                 :            :   /* Which heap is BB in (if any)?  */
     167                 :            :   bb_heap_t *heap;
     168                 :            : 
     169                 :            :   /* Which heap node is BB in (if any)?  */
     170                 :            :   bb_heap_node_t *node;
     171                 :            : };
     172                 :            : 
     173                 :            : /* The current size of the following dynamic array.  */
     174                 :            : static int array_size;
     175                 :            : 
     176                 :            : /* The array which holds needed information for basic blocks.  */
     177                 :            : static bbro_basic_block_data *bbd;
     178                 :            : 
      179                 :            : /* To avoid frequent reallocation, the size of the arrays is greater than
      180                 :            :    needed; the number of elements is (not less than) 1.25 * size_wanted.  */
     181                 :            : #define GET_ARRAY_SIZE(X) ((((X) / 4) + 1) * 5)
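For example, GET_ARRAY_SIZE (100) evaluates to ((100 / 4) + 1) * 5 == 130,
which is at least 1.25 * 100; the "+ 1" keeps the result above the 1.25
factor even when X is not a multiple of 4 (e.g. GET_ARRAY_SIZE (10) == 15).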
     182                 :            : 
     183                 :            : /* Free the memory and set the pointer to NULL.  */
     184                 :            : #define FREE(P) (gcc_assert (P), free (P), P = 0)
     185                 :            : 
     186                 :            : /* Structure for holding information about a trace.  */
     187                 :            : struct trace
     188                 :            : {
     189                 :            :   /* First and last basic block of the trace.  */
     190                 :            :   basic_block first, last;
     191                 :            : 
      192                 :            :   /* The round of STC creation in which this trace was found.  */
     193                 :            :   int round;
     194                 :            : 
     195                 :            :   /* The length (i.e. the number of basic blocks) of the trace.  */
     196                 :            :   int length;
     197                 :            : };
     198                 :            : 
     199                 :            : /* Maximum count of one of the entry blocks.  */
     200                 :            : static profile_count max_entry_count;
     201                 :            : 
     202                 :            : /* Local function prototypes.  */
     203                 :            : static void find_traces_1_round (int, profile_count, struct trace *, int *,
     204                 :            :                                  int, bb_heap_t **, int);
     205                 :            : static basic_block copy_bb (basic_block, edge, basic_block, int);
     206                 :            : static long bb_to_key (basic_block);
     207                 :            : static bool better_edge_p (const_basic_block, const_edge, profile_probability,
     208                 :            :                            profile_count, profile_probability, profile_count,
     209                 :            :                            const_edge);
     210                 :            : static bool copy_bb_p (const_basic_block, int);
     211                 :            : 
     212                 :            : /* Return the trace number in which BB was visited.  */
     213                 :            : 
     214                 :            : static int
     215                 :   23820600 : bb_visited_trace (const_basic_block bb)
     216                 :            : {
     217                 :   23820600 :   gcc_assert (bb->index < array_size);
     218                 :   23820600 :   return bbd[bb->index].visited;
     219                 :            : }
     220                 :            : 
      221                 :            : /* Mark BB as visited in trace number TRACE.  */
     222                 :            : 
     223                 :            : static void
     224                 :    6363260 : mark_bb_visited (basic_block bb, int trace)
     225                 :            : {
     226                 :    6363260 :   bbd[bb->index].visited = trace;
     227                 :    6363260 :   if (bbd[bb->index].heap)
     228                 :            :     {
     229                 :     257439 :       bbd[bb->index].heap->delete_node (bbd[bb->index].node);
     230                 :     257439 :       bbd[bb->index].heap = NULL;
     231                 :     257439 :       bbd[bb->index].node = NULL;
     232                 :            :     }
     233                 :    6363260 : }
     234                 :            : 
      235                 :            : /* Check whether BB should be pushed into the next round of trace
      236                 :            :    collection.  The block is pushed forward when 1) the block is
      237                 :            :    cold, we are partitioning, and there will be another round (cold
      238                 :            :    partition blocks are not supposed to be collected into traces
      239                 :            :    until the very last round); or 2) there will be another round,
      240                 :            :    and the basic block is not "hot enough" for the current round of
      241                 :            :    trace collection.  */
     242                 :            : 
     243                 :            : static bool
     244                 :    5660520 : push_to_next_round_p (const_basic_block bb, int round, int number_of_rounds,
     245                 :            :                       profile_count count_th)
     246                 :            : {
     247                 :    5660520 :   bool there_exists_another_round;
     248                 :    5660520 :   bool block_not_hot_enough;
     249                 :            : 
     250                 :    5660520 :   there_exists_another_round = round < number_of_rounds - 1;
     251                 :            : 
     252                 :    5660520 :   block_not_hot_enough = (bb->count < count_th
     253                 :    5660520 :                           || probably_never_executed_bb_p (cfun, bb));
     254                 :            : 
     255                 :    5660520 :   if (there_exists_another_round
     256                 :    5660520 :       && block_not_hot_enough)
     257                 :            :     return true;
     258                 :            :   else
     259                 :    3310550 :     return false;
     260                 :            : }
     261                 :            : 
      262                 :            : /* Find the traces for Software Trace Cache.  Chain the blocks of each trace
      263                 :            :    through bb->aux.  Store the number of traces in *N_TRACES and the
      264                 :            :    description of the traces in TRACES.  */
     265                 :            : 
     266                 :            : static void
     267                 :     417352 : find_traces (int *n_traces, struct trace *traces)
     268                 :            : {
     269                 :     417352 :   int i;
     270                 :     417352 :   int number_of_rounds;
     271                 :     417352 :   edge e;
     272                 :     417352 :   edge_iterator ei;
     273                 :     417352 :   bb_heap_t *heap = new bb_heap_t (LONG_MIN);
     274                 :            : 
     275                 :            :   /* Add one extra round of trace collection when partitioning hot/cold
     276                 :            :      basic blocks into separate sections.  The last round is for all the
     277                 :            :      cold blocks (and ONLY the cold blocks).  */
     278                 :            : 
     279                 :     417352 :   number_of_rounds = N_ROUNDS - 1;
     280                 :            : 
     281                 :            :   /* Insert entry points of function into heap.  */
     282                 :     417352 :   max_entry_count = profile_count::zero ();
     283                 :     834704 :   FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
     284                 :            :     {
     285                 :     417352 :       bbd[e->dest->index].heap = heap;
     286                 :     417352 :       bbd[e->dest->index].node = heap->insert (bb_to_key (e->dest), e->dest);
     287                 :     417352 :       if (e->dest->count > max_entry_count)
     288                 :     416781 :         max_entry_count = e->dest->count;
     289                 :            :     }
     290                 :            : 
     291                 :            :   /* Find the traces.  */
     292                 :    2086760 :   for (i = 0; i < number_of_rounds; i++)
     293                 :            :     {
     294                 :    1669410 :       profile_count count_threshold;
     295                 :            : 
     296                 :    1669410 :       if (dump_file)
     297                 :         92 :         fprintf (dump_file, "STC - round %d\n", i + 1);
     298                 :            : 
     299                 :    1669410 :       count_threshold = max_entry_count.apply_scale (exec_threshold[i], 1000);
     300                 :            : 
     301                 :    1669410 :       find_traces_1_round (REG_BR_PROB_BASE * branch_threshold[i] / 1000,
     302                 :            :                            count_threshold, traces, n_traces, i, &heap,
     303                 :            :                            number_of_rounds);
     304                 :            :     }
     305                 :     417352 :   delete heap;
     306                 :            : 
     307                 :     417352 :   if (dump_file)
     308                 :            :     {
     309                 :        105 :       for (i = 0; i < *n_traces; i++)
     310                 :            :         {
     311                 :         82 :           basic_block bb;
     312                 :         82 :           fprintf (dump_file, "Trace %d (round %d):  ", i + 1,
     313                 :         82 :                    traces[i].round + 1);
     314                 :         82 :           for (bb = traces[i].first;
     315                 :        182 :                bb != traces[i].last;
     316                 :        100 :                bb = (basic_block) bb->aux)
     317                 :            :             {
     318                 :        100 :               fprintf (dump_file, "%d [", bb->index);
     319                 :        100 :               bb->count.dump (dump_file);
     320                 :        100 :               fprintf (dump_file, "] ");
     321                 :            :             }
     322                 :         82 :           fprintf (dump_file, "%d [", bb->index);
     323                 :         82 :           bb->count.dump (dump_file);
     324                 :         82 :           fprintf (dump_file, "]\n");
     325                 :            :         }
     326                 :         23 :       fflush (dump_file);
     327                 :            :     }
     328                 :     417352 : }
     329                 :            : 
      330                 :            : /* Rotate the loop whose back edge is BACK_EDGE, located at the tail of trace
      331                 :            :    TRACE (with sequential number TRACE_N).  */
     332                 :            : 
     333                 :            : static basic_block
     334                 :     116115 : rotate_loop (edge back_edge, struct trace *trace, int trace_n)
     335                 :            : {
     336                 :     116115 :   basic_block bb;
     337                 :            : 
     338                 :            :   /* Information about the best end (end after rotation) of the loop.  */
     339                 :     116115 :   basic_block best_bb = NULL;
     340                 :     116115 :   edge best_edge = NULL;
     341                 :     116115 :   profile_count best_count = profile_count::uninitialized ();
     342                 :            :   /* The best edge is preferred when its destination is not visited yet
     343                 :            :      or is a start block of some trace.  */
     344                 :     116115 :   bool is_preferred = false;
     345                 :            : 
      346                 :            :   /* Find the most frequent edge that goes out from the current trace.  */
     347                 :     116115 :   bb = back_edge->dest;
     348                 :     461049 :   do
     349                 :            :     {
     350                 :     461049 :       edge e;
     351                 :     461049 :       edge_iterator ei;
     352                 :            : 
     353                 :    1287380 :       FOR_EACH_EDGE (e, ei, bb->succs)
     354                 :     826333 :         if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
     355                 :     826333 :             && bb_visited_trace (e->dest) != trace_n
     356                 :            :             && (e->flags & EDGE_CAN_FALLTHRU)
     357                 :    1149370 :             && !(e->flags & EDGE_COMPLEX))
     358                 :            :         {
     359                 :     298564 :           if (is_preferred)
     360                 :            :             {
     361                 :            :               /* The best edge is preferred.  */
     362                 :     180757 :               if (!bb_visited_trace (e->dest)
     363                 :     180757 :                   || bbd[e->dest->index].start_of_trace >= 0)
     364                 :            :                 {
     365                 :            :                   /* The current edge E is also preferred.  */
     366                 :     176975 :                   if (e->count () > best_count)
     367                 :            :                     {
     368                 :      38120 :                       best_count = e->count ();
     369                 :      38120 :                       best_edge = e;
     370                 :      38120 :                       best_bb = bb;
     371                 :            :                     }
     372                 :            :                 }
     373                 :            :             }
     374                 :            :           else
     375                 :            :             {
     376                 :     117807 :               if (!bb_visited_trace (e->dest)
     377                 :     117807 :                   || bbd[e->dest->index].start_of_trace >= 0)
     378                 :            :                 {
     379                 :            :                   /* The current edge E is preferred.  */
     380                 :     114644 :                   is_preferred = true;
     381                 :     114644 :                   best_count = e->count ();
     382                 :     114644 :                   best_edge = e;
     383                 :     114644 :                   best_bb = bb;
     384                 :            :                 }
     385                 :            :               else
     386                 :            :                 {
     387                 :       3163 :                   if (!best_edge || e->count () > best_count)
     388                 :            :                     {
     389                 :       2555 :                       best_count = e->count ();
     390                 :       2555 :                       best_edge = e;
     391                 :       2555 :                       best_bb = bb;
     392                 :            :                     }
     393                 :            :                 }
     394                 :            :             }
     395                 :            :         }
     396                 :     461049 :       bb = (basic_block) bb->aux;
     397                 :            :     }
     398                 :     461049 :   while (bb != back_edge->dest);
     399                 :            : 
     400                 :     116115 :   if (best_bb)
     401                 :            :     {
     402                 :            :       /* Rotate the loop so that the BEST_EDGE goes out from the last block of
     403                 :            :          the trace.  */
     404                 :     116039 :       if (back_edge->dest == trace->first)
     405                 :            :         {
     406                 :       2869 :           trace->first = (basic_block) best_bb->aux;
     407                 :            :         }
     408                 :            :       else
     409                 :            :         {
     410                 :            :           basic_block prev_bb;
     411                 :            : 
     412                 :            :           for (prev_bb = trace->first;
     413                 :     374230 :                prev_bb->aux != back_edge->dest;
     414                 :            :                prev_bb = (basic_block) prev_bb->aux)
     415                 :            :             ;
     416                 :     113170 :           prev_bb->aux = best_bb->aux;
     417                 :            : 
     418                 :            :           /* Try to get rid of uncond jump to cond jump.  */
     419                 :     113170 :           if (single_succ_p (prev_bb))
     420                 :            :             {
     421                 :      92770 :               basic_block header = single_succ (prev_bb);
     422                 :            : 
      423                 :            :               /* Duplicate HEADER if it is a small block ending in a cond
      424                 :            :                  jump.  */
     425                 :     180483 :               if (any_condjump_p (BB_END (header)) && copy_bb_p (header, 0)
     426                 :      92770 :                   && !CROSSING_JUMP_P (BB_END (header)))
     427                 :          0 :                 copy_bb (header, single_succ_edge (prev_bb), prev_bb, trace_n);
     428                 :            :             }
     429                 :            :         }
     430                 :            :     }
     431                 :            :   else
     432                 :            :     {
      433                 :            :       /* We have not found a suitable loop tail, so do no rotation.  */
     434                 :         76 :       best_bb = back_edge->src;
     435                 :            :     }
     436                 :     116115 :   best_bb->aux = NULL;
     437                 :     116115 :   return best_bb;
     438                 :            : }
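A minimal sketch of the list manipulation rotate_loop performs, assuming an
invented struct blk chained through `next` (standing in for bb->aux; the names
blk and rotate are hypothetical): the caller has already closed the cycle with
the back edge, and rotation simply breaks the cycle right after the block
chosen as the best loop exit, so that this block ends the trace.  The
heuristics, profile counts and header duplication above are left out.

#include <stdio.h>
#include <stddef.h>

struct blk { int index; struct blk *next; };

/* Make BEST the last block of the rotated loop and return the new first
   block (the one that used to follow BEST).  */
static struct blk *
rotate (struct blk *best)
{
  struct blk *new_first = best->next;
  best->next = NULL;
  return new_first;
}

int
main (void)
{
  /* Loop header 2 followed by 3 and 4; the back edge 4 -> 2 has already
     closed the cycle, as find_traces_1_round does before calling
     rotate_loop.  */
  struct blk b4 = { 4, NULL }, b3 = { 3, &b4 }, b2 = { 2, &b3 };
  b4.next = &b2;

  /* Suppose the most frequent edge leaving the loop starts in block 3.  */
  for (struct blk *b = rotate (&b3); b != NULL; b = b->next)
    printf ("%d ", b->index);   /* prints: 4 2 3  */
  printf ("\n");
  return 0;
}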
     439                 :            : 
      440                 :            : /* One round of finding traces.  Do not include basic blocks whose edge
      441                 :            :    probability is lower than BRANCH_TH or whose count is lower than
      442                 :            :    COUNT_TH into traces.  Store the new traces into TRACES and update the
      443                 :            :    number of traces *N_TRACES.  Set the round (which the traces belong to)
      444                 :            :    to ROUND.  The function expects starting basic blocks to be in *HEAP
      445                 :            :    and will delete *HEAP and store starting points for the next round into
      446                 :            :    a new *HEAP.  */
     447                 :            : 
     448                 :            : static void
     449                 :    1669410 : find_traces_1_round (int branch_th, profile_count count_th,
     450                 :            :                      struct trace *traces, int *n_traces, int round,
     451                 :            :                      bb_heap_t **heap, int number_of_rounds)
     452                 :            : {
     453                 :            :   /* Heap for discarded basic blocks which are possible starting points for
     454                 :            :      the next round.  */
     455                 :    1669410 :   bb_heap_t *new_heap = new bb_heap_t (LONG_MIN);
     456                 :    1669410 :   bool for_size = optimize_function_for_size_p (cfun);
     457                 :            : 
     458                 :    5552380 :   while (!(*heap)->empty ())
     459                 :            :     {
     460                 :    3882970 :       basic_block bb;
     461                 :    3882970 :       struct trace *trace;
     462                 :    3882970 :       edge best_edge, e;
     463                 :    3882970 :       long key;
     464                 :    3882970 :       edge_iterator ei;
     465                 :            : 
     466                 :    3882970 :       bb = (*heap)->extract_min ();
     467                 :    3882970 :       bbd[bb->index].heap = NULL;
     468                 :    3882970 :       bbd[bb->index].node = NULL;
     469                 :            : 
     470                 :    3882970 :       if (dump_file)
     471                 :         94 :         fprintf (dump_file, "Getting bb %d\n", bb->index);
     472                 :            : 
     473                 :            :       /* If the BB's count is too low, send BB to the next round.  When
     474                 :            :          partitioning hot/cold blocks into separate sections, make sure all
     475                 :            :          the cold blocks (and ONLY the cold blocks) go into the (extra) final
     476                 :            :          round.  When optimizing for size, do not push to next round.  */
     477                 :            : 
     478                 :    3882970 :       if (!for_size
     479                 :    3882970 :           && push_to_next_round_p (bb, round, number_of_rounds,
     480                 :            :                                    count_th))
     481                 :            :         {
     482                 :    1094120 :           int key = bb_to_key (bb);
     483                 :    1094120 :           bbd[bb->index].heap = new_heap;
     484                 :    1094120 :           bbd[bb->index].node = new_heap->insert (key, bb);
     485                 :            : 
     486                 :    1094120 :           if (dump_file)
     487                 :         12 :             fprintf (dump_file,
     488                 :            :                      "  Possible start point of next round: %d (key: %d)\n",
     489                 :            :                      bb->index, key);
     490                 :    1094120 :           continue;
     491                 :            :         }
     492                 :            : 
     493                 :    2788850 :       trace = traces + *n_traces;
     494                 :    2788850 :       trace->first = bb;
     495                 :    2788850 :       trace->round = round;
     496                 :    2788850 :       trace->length = 0;
     497                 :    2788850 :       bbd[bb->index].in_trace = *n_traces;
     498                 :    2788850 :       (*n_traces)++;
     499                 :            : 
     500                 :    6225810 :       do
     501                 :            :         {
     502                 :    6225810 :           bool ends_in_call;
     503                 :            : 
     504                 :            :           /* The probability and count of the best edge.  */
     505                 :    6225810 :           profile_probability best_prob = profile_probability::uninitialized ();
     506                 :    6225810 :           profile_count best_count = profile_count::uninitialized ();
     507                 :            : 
     508                 :    6225810 :           best_edge = NULL;
     509                 :    6225810 :           mark_bb_visited (bb, *n_traces);
     510                 :    6225810 :           trace->length++;
     511                 :            : 
     512                 :    6225810 :           if (dump_file)
     513                 :        182 :             fprintf (dump_file, "Basic block %d was visited in trace %d\n",
     514                 :            :                      bb->index, *n_traces);
     515                 :            : 
     516                 :    6225810 :           ends_in_call = block_ends_with_call_p (bb);
     517                 :            : 
     518                 :            :           /* Select the successor that will be placed after BB.  */
     519                 :   15579300 :           FOR_EACH_EDGE (e, ei, bb->succs)
     520                 :            :             {
     521                 :    9353530 :               gcc_assert (!(e->flags & EDGE_FAKE));
     522                 :            : 
     523                 :    9353530 :               if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
     524                 :    5308950 :                 continue;
     525                 :            : 
     526                 :    8877500 :               if (bb_visited_trace (e->dest)
     527                 :    8877500 :                   && bb_visited_trace (e->dest) != *n_traces)
     528                 :    1567300 :                 continue;
     529                 :            : 
     530                 :            :               /* If partitioning hot/cold basic blocks, don't consider edges
     531                 :            :                  that cross section boundaries.  */
     532                 :    7310210 :               if (BB_PARTITION (e->dest) != BB_PARTITION (bb))
     533                 :     316809 :                 continue;
     534                 :            : 
     535                 :    6993400 :               profile_probability prob = e->probability;
     536                 :    6993400 :               profile_count count = e->dest->count;
     537                 :            : 
     538                 :            :               /* The only sensible preference for a call instruction is the
     539                 :            :                  fallthru edge.  Don't bother selecting anything else.  */
     540                 :    6993400 :               if (ends_in_call)
     541                 :            :                 {
     542                 :     753299 :                   if (e->flags & EDGE_CAN_FALLTHRU)
     543                 :            :                     {
     544                 :     391695 :                       best_edge = e;
     545                 :     391695 :                       best_prob = prob;
     546                 :     391695 :                       best_count = count;
     547                 :            :                     }
     548                 :     753299 :                   continue;
     549                 :            :                 }
     550                 :            : 
      551                 :            :               /* Skip an edge that cannot fall through, or an improbable or
      552                 :            :                  infrequent successor (i.e. an unsuitable successor).  When
      553                 :            :                  optimizing for size, ignore the probability and count.  */
     554                 :    8435620 :               if (!(e->flags & EDGE_CAN_FALLTHRU) || (e->flags & EDGE_COMPLEX)
     555                 :    6098040 :                   || !prob.initialized_p ()
     556                 :   12336900 :                   || ((prob.to_reg_br_prob_base () < branch_th
     557                 :    6096840 :                       || e->count () < count_th) && (!for_size)))
     558                 :    2195520 :                 continue;
     559                 :            : 
     560                 :    4044580 :               if (better_edge_p (bb, e, prob, count, best_prob, best_count,
     561                 :            :                                  best_edge))
     562                 :            :                 {
     563                 :    3719560 :                   best_edge = e;
     564                 :    3719560 :                   best_prob = prob;
     565                 :    3719560 :                   best_count = count;
     566                 :            :                 }
     567                 :            :             }
     568                 :            : 
     569                 :            :           /* If the best destination has multiple predecessors and can be
      570                 :            :              duplicated more cheaply than a jump, don't allow it to be added to
     571                 :            :              a trace; we'll duplicate it when connecting the traces later.
     572                 :            :              However, we need to check that this duplication wouldn't leave
     573                 :            :              the best destination with only crossing predecessors, because
     574                 :            :              this would change its effective partition from hot to cold.  */
     575                 :    6225810 :           if (best_edge
     576                 :    3847430 :               && EDGE_COUNT (best_edge->dest->preds) >= 2
     577                 :    7800400 :               && copy_bb_p (best_edge->dest, 0))
     578                 :            :             {
     579                 :      44414 :               bool only_crossing_preds = true;
     580                 :      44414 :               edge e;
     581                 :      44414 :               edge_iterator ei;
     582                 :      57478 :               FOR_EACH_EDGE (e, ei, best_edge->dest->preds)
     583                 :      57466 :                 if (e != best_edge && !(e->flags & EDGE_CROSSING))
     584                 :            :                   {
     585                 :            :                     only_crossing_preds = false;
     586                 :            :                     break;
     587                 :            :                   }
     588                 :      44414 :               if (!only_crossing_preds)
     589                 :      44402 :                 best_edge = NULL;
     590                 :            :             }
     591                 :            : 
     592                 :            :           /* If the best destination has multiple successors or predecessors,
     593                 :            :              don't allow it to be added when optimizing for size.  This makes
     594                 :            :              sure predecessors with smaller index are handled before the best
      595                 :            :              destination.  It breaks long traces and reduces long jumps.
     596                 :            : 
     597                 :            :              Take if-then-else as an example.
     598                 :            :                 A
     599                 :            :                / \
     600                 :            :               B   C
     601                 :            :                \ /
     602                 :            :                 D
     603                 :            :              If we do not remove the best edge B->D/C->D, the final order might
     604                 :            :              be A B D ... C.  C is at the end of the program.  If D's successors
      605                 :            :              and D are complicated, we might need long jumps for A->C and C->D.
     606                 :            :              Similar issue for order: A C D ... B.
     607                 :            : 
      608                 :            :              After removing the best edge, the final result will be ABCD or ACBD.
      609                 :            :              It does not add a jump compared with the previous order.  But it
     610                 :            :              reduces the possibility of long jumps.  */
     611                 :    6225810 :           if (best_edge && for_size
     612                 :    6225810 :               && (EDGE_COUNT (best_edge->dest->succs) > 1
     613                 :      90440 :                  || EDGE_COUNT (best_edge->dest->preds) > 1))
     614                 :            :             best_edge = NULL;
     615                 :            : 
     616                 :            :           /* Add all non-selected successors to the heaps.  */
     617                 :   15579300 :           FOR_EACH_EDGE (e, ei, bb->succs)
     618                 :            :             {
     619                 :   15079500 :               if (e == best_edge
     620                 :    5693230 :                   || e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
     621                 :   14570700 :                   || bb_visited_trace (e->dest))
     622                 :    5725970 :                 continue;
     623                 :            : 
     624                 :    3627560 :               key = bb_to_key (e->dest);
     625                 :            : 
     626                 :    3627560 :               if (bbd[e->dest->index].heap)
     627                 :            :                 {
     628                 :            :                   /* E->DEST is already in some heap.  */
     629                 :     998621 :                   if (key != bbd[e->dest->index].node->get_key ())
     630                 :            :                     {
     631                 :          0 :                       if (dump_file)
     632                 :            :                         {
     633                 :          0 :                           fprintf (dump_file,
     634                 :            :                                    "Changing key for bb %d from %ld to %ld.\n",
     635                 :            :                                    e->dest->index,
     636                 :            :                                    (long) bbd[e->dest->index].node->get_key (),
     637                 :            :                                    key);
     638                 :            :                         }
     639                 :          0 :                       bbd[e->dest->index].heap->replace_key
     640                 :    9353530 :                         (bbd[e->dest->index].node, key);
     641                 :            :                     }
     642                 :            :                 }
     643                 :            :               else
     644                 :            :                 {
     645                 :    2628940 :                   bb_heap_t *which_heap = *heap;
     646                 :            : 
     647                 :    2628940 :                   profile_probability prob = e->probability;
     648                 :            : 
     649                 :    2628940 :                   if (!(e->flags & EDGE_CAN_FALLTHRU)
     650                 :    2628940 :                       || (e->flags & EDGE_COMPLEX)
     651                 :    2401430 :                       || !prob.initialized_p ()
     652                 :    2400310 :                       || prob.to_reg_br_prob_base () < branch_th
     653                 :    3843570 :                       || e->count () < count_th)
     654                 :            :                     {
     655                 :            :                       /* When partitioning hot/cold basic blocks, make sure
     656                 :            :                          the cold blocks (and only the cold blocks) all get
     657                 :            :                          pushed to the last round of trace collection.  When
     658                 :            :                          optimizing for size, do not push to next round.  */
     659                 :            : 
     660                 :    2022630 :                       if (!for_size && push_to_next_round_p (e->dest, round,
     661                 :            :                                                              number_of_rounds,
     662                 :            :                                                              count_th))
     663                 :            :                         which_heap = new_heap;
     664                 :            :                     }
     665                 :            : 
     666                 :    2628940 :                   bbd[e->dest->index].heap = which_heap;
     667                 :    2628940 :                   bbd[e->dest->index].node = which_heap->insert (key, e->dest);
     668                 :            : 
     669                 :    2628940 :                   if (dump_file)
     670                 :            :                     {
     671                 :         79 :                       fprintf (dump_file,
     672                 :            :                                "  Possible start of %s round: %d (key: %ld)\n",
     673                 :            :                                (which_heap == new_heap) ? "next" : "this",
     674                 :         79 :                                e->dest->index, (long) key);
     675                 :            :                     }
     676                 :            : 
     677                 :            :                 }
     678                 :            :             }
     679                 :            : 
     680                 :    6225810 :           if (best_edge) /* Suitable successor was found.  */
     681                 :            :             {
     682                 :    3660300 :               if (bb_visited_trace (best_edge->dest) == *n_traces)
     683                 :            :                 {
      684                 :            :                   /* We do nothing with single-block loops.  */
     685                 :     223342 :                   if (best_edge->dest != bb)
     686                 :            :                     {
     687                 :     136437 :                       if (best_edge->count ()
     688                 :     272874 :                           > best_edge->dest->count.apply_scale (4, 5))
     689                 :            :                         {
     690                 :            :                           /* The loop has at least 4 iterations.  If the loop
     691                 :            :                              header is not the first block of the function
     692                 :            :                              we can rotate the loop.  */
     693                 :            : 
     694                 :     116176 :                           if (best_edge->dest
     695                 :     116176 :                               != ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb)
     696                 :            :                             {
     697                 :     116115 :                               if (dump_file)
     698                 :            :                                 {
     699                 :          0 :                                   fprintf (dump_file,
     700                 :            :                                            "Rotating loop %d - %d\n",
     701                 :            :                                            best_edge->dest->index, bb->index);
     702                 :            :                                 }
     703                 :     116115 :                               bb->aux = best_edge->dest;
     704                 :     116115 :                               bbd[best_edge->dest->index].in_trace =
     705                 :     116115 :                                                              (*n_traces) - 1;
     706                 :     116115 :                               bb = rotate_loop (best_edge, trace, *n_traces);
     707                 :            :                             }
     708                 :            :                         }
     709                 :            :                       else
     710                 :            :                         {
     711                 :            :                           /* The loop has less than 4 iterations.  */
     712                 :            : 
     713                 :      20261 :                           if (single_succ_p (bb)
     714                 :      27608 :                               && copy_bb_p (best_edge->dest,
     715                 :            :                                             optimize_edge_for_speed_p
     716                 :       7347 :                                             (best_edge)))
     717                 :            :                             {
     718                 :       3379 :                               bb = copy_bb (best_edge->dest, best_edge, bb,
     719                 :            :                                             *n_traces);
     720                 :       3379 :                               trace->length++;
     721                 :            :                             }
     722                 :            :                         }
     723                 :            :                     }
     724                 :            : 
     725                 :            :                   /* Terminate the trace.  */
     726                 :     223342 :                   break;
     727                 :            :                 }
     728                 :            :               else
     729                 :            :                 {
     730                 :            :                   /* Check for a situation
     731                 :            : 
     732                 :            :                     A
     733                 :            :                    /|
     734                 :            :                   B |
     735                 :            :                    \|
     736                 :            :                     C
     737                 :            : 
     738                 :            :                   where
     739                 :            :                   AB->count () + BC->count () >= AC->count ().
     740                 :            :                   (i.e. 2 * B->count >= AC->count )
     741                 :            :                   Best ordering is then A B C.
     742                 :            : 
     743                 :            :                   When optimizing for size, A B C is always the best order.
     744                 :            : 
     745                 :            :                   This situation is created for example by:
     746                 :            : 
     747                 :            :                   if (A) B;
     748                 :            :                   C;
     749                 :            : 
     750                 :            :                   */
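                  /* A rough cost sketch of the condition above (an
                     illustrative reading, not a formal derivation): with
                     layout A B C, A->B and B->C fall through and only A->C
                     needs a taken branch, paid AC->count () times; keeping C
                     right after A instead makes A->C the fall-through but
                     forces branches on A->B and B->C, paid
                     AB->count () + BC->count () times.  Since B has a single
                     predecessor and a single successor,
                     AB->count () == BC->count () == B->count, which gives the
                     2 * B->count >= AC->count form tested below.  */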
     751                 :            : 
     752                 :    9511210 :                   FOR_EACH_EDGE (e, ei, bb->succs)
     753                 :    6087700 :                     if (e != best_edge
     754                 :            :                         && (e->flags & EDGE_CAN_FALLTHRU)
     755                 :    2662140 :                         && !(e->flags & EDGE_COMPLEX)
     756                 :    2291290 :                         && !bb_visited_trace (e->dest)
     757                 :    2069400 :                         && single_pred_p (e->dest)
     758                 :    1047000 :                         && !(e->flags & EDGE_CROSSING)
     759                 :    1025300 :                         && single_succ_p (e->dest)
     760                 :     637632 :                         && (single_succ_edge (e->dest)->flags
     761                 :     637632 :                             & EDGE_CAN_FALLTHRU)
     762                 :     562870 :                         && !(single_succ_edge (e->dest)->flags & EDGE_COMPLEX)
     763                 :     562870 :                         && single_succ (e->dest) == best_edge->dest
     764                 :    6305040 :                         && (e->dest->count.apply_scale (2, 1)
     765                 :    6305040 :                             >= best_edge->count () || for_size))
     766                 :            :                       {
     767                 :      13445 :                         best_edge = e;
     768                 :      13445 :                         if (dump_file)
     769                 :          0 :                           fprintf (dump_file, "Selecting BB %d\n",
     770                 :          0 :                                    best_edge->dest->index);
     771                 :            :                         break;
     772                 :            :                       }
     773                 :            : 
     774                 :    3436960 :                   bb->aux = best_edge->dest;
     775                 :    3436960 :                   bbd[best_edge->dest->index].in_trace = (*n_traces) - 1;
     776                 :    3436960 :                   bb = best_edge->dest;
     777                 :            :                 }
     778                 :            :             }
     779                 :            :         }
     780                 :    6002470 :       while (best_edge);
     781                 :    2788850 :       trace->last = bb;
     782                 :    2788850 :       bbd[trace->first->index].start_of_trace = *n_traces - 1;
     783                 :    2788850 :       if (bbd[trace->last->index].end_of_trace != *n_traces - 1)
     784                 :            :         {
     785                 :    2788850 :           bbd[trace->last->index].end_of_trace = *n_traces - 1;
     786                 :            :           /* Update the cached maximum frequency for interesting predecessor
     787                 :            :              edges for successors of the new trace end.  */
     788                 :    6075450 :           FOR_EACH_EDGE (e, ei, trace->last->succs)
     789                 :    3286600 :             if (EDGE_FREQUENCY (e) > bbd[e->dest->index].priority)
     790                 :    2567020 :               bbd[e->dest->index].priority = EDGE_FREQUENCY (e);
     791                 :            :         }
     792                 :            : 
     793                 :            :       /* The trace is terminated, so we have to recount the keys in the heap
     794                 :            :          (a block can now have a lower key because one of its predecessors
     795                 :            :          has become the end of a trace).  */
     796                 :    6075450 :       FOR_EACH_EDGE (e, ei, bb->succs)
     797                 :            :         {
     798                 :    5353440 :           if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
     799                 :    3286600 :               || bb_visited_trace (e->dest))
     800                 :    2066850 :             continue;
     801                 :            : 
     802                 :    1219750 :           if (bbd[e->dest->index].heap)
     803                 :            :             {
     804                 :    1219750 :               key = bb_to_key (e->dest);
     805                 :    1219750 :               if (key != bbd[e->dest->index].node->get_key ())
     806                 :            :                 {
     807                 :     778550 :                   if (dump_file)
     808                 :            :                     {
     809                 :         37 :                       fprintf (dump_file,
     810                 :            :                                "Changing key for bb %d from %ld to %ld.\n",
     811                 :            :                                e->dest->index,
     812                 :            :                                (long) bbd[e->dest->index].node->get_key (), key);
     813                 :            :                     }
     814                 :     778550 :                   bbd[e->dest->index].heap->replace_key
     815                 :    4065150 :                     (bbd[e->dest->index].node, key);
     816                 :            :                 }
     817                 :            :             }
     818                 :            :         }
     819                 :            :     }
     820                 :            : 
     821                 :    1669410 :   delete (*heap);
     822                 :            : 
     823                 :            :   /* "Return" the new heap.  */
     824                 :    1669410 :   *heap = new_heap;
     825                 :    1669410 : }
     826                 :            : 
     827                 :            : /* Create a duplicate of the basic block OLD_BB and redirect edge E to it, add
     828                 :            :    it to the trace after BB, mark OLD_BB visited and update the pass's data
     829                 :            :    structures (TRACE is the number of the trace OLD_BB is duplicated to).  */
     830                 :            : 
     831                 :            : static basic_block
     832                 :     137449 : copy_bb (basic_block old_bb, edge e, basic_block bb, int trace)
     833                 :            : {
     834                 :     137449 :   basic_block new_bb;
     835                 :            : 
     836                 :     137449 :   new_bb = duplicate_block (old_bb, e, bb);
     837                 :     137449 :   BB_COPY_PARTITION (new_bb, old_bb);
     838                 :            : 
     839                 :     137449 :   gcc_assert (e->dest == new_bb);
     840                 :            : 
     841                 :     137449 :   if (dump_file)
     842                 :          2 :     fprintf (dump_file,
     843                 :            :              "Duplicated bb %d (created bb %d)\n",
     844                 :            :              old_bb->index, new_bb->index);
     845                 :            : 
     846                 :     137449 :   if (new_bb->index >= array_size
     847                 :     137387 :       || last_basic_block_for_fn (cfun) > array_size)
     848                 :            :     {
     849                 :         62 :       int i;
     850                 :         62 :       int new_size;
     851                 :            : 
     852                 :         62 :       new_size = MAX (last_basic_block_for_fn (cfun), new_bb->index + 1);
     853                 :         62 :       new_size = GET_ARRAY_SIZE (new_size);
     854                 :         62 :       bbd = XRESIZEVEC (bbro_basic_block_data, bbd, new_size);
     855                 :       1182 :       for (i = array_size; i < new_size; i++)
     856                 :            :         {
     857                 :       1120 :           bbd[i].start_of_trace = -1;
     858                 :       1120 :           bbd[i].end_of_trace = -1;
     859                 :       1120 :           bbd[i].in_trace = -1;
     860                 :       1120 :           bbd[i].visited = 0;
     861                 :       1120 :           bbd[i].priority = -1;
     862                 :       1120 :           bbd[i].heap = NULL;
     863                 :       1120 :           bbd[i].node = NULL;
     864                 :            :         }
     865                 :         62 :       array_size = new_size;
     866                 :            : 
     867                 :         62 :       if (dump_file)
     868                 :            :         {
     869                 :          0 :           fprintf (dump_file,
     870                 :            :                    "Growing the dynamic array to %d elements.\n",
     871                 :            :                    array_size);
     872                 :            :         }
     873                 :            :     }
     874                 :            : 
     875                 :     137449 :   gcc_assert (!bb_visited_trace (e->dest));
     876                 :     137449 :   mark_bb_visited (new_bb, trace);
     877                 :     137449 :   new_bb->aux = bb->aux;
     878                 :     137449 :   bb->aux = new_bb;
     879                 :            : 
     880                 :     137449 :   bbd[new_bb->index].in_trace = trace;
     881                 :            : 
     882                 :     137449 :   return new_bb;
     883                 :            : }
     884                 :            : 
     885                 :            : /* Compute and return the key (for the heap) of the basic block BB.  */
     886                 :            : 
     887                 :            : static long
     888                 :    6358780 : bb_to_key (basic_block bb)
     889                 :            : {
     890                 :    6358780 :   edge e;
     891                 :    6358780 :   edge_iterator ei;
     892                 :            : 
     893                 :            :   /* Use index as key to align with its original order.  */
     894                 :    6358780 :   if (optimize_function_for_size_p (cfun))
     895                 :     439708 :     return bb->index;
     896                 :            : 
     897                 :            :   /* Do not start in probably never executed blocks.  */
     898                 :            : 
     899                 :    5919080 :   if (BB_PARTITION (bb) == BB_COLD_PARTITION
     900                 :    5919080 :       || probably_never_executed_bb_p (cfun, bb))
     901                 :    1375930 :     return BB_FREQ_MAX;
     902                 :            : 
     903                 :            :   /* Prefer blocks whose predecessor is an end of some trace
     904                 :            :      or whose predecessor edge is EDGE_DFS_BACK.  */
     905                 :    4543150 :   int priority = bbd[bb->index].priority;
     906                 :    4543150 :   if (priority == -1)
     907                 :            :     {
     908                 :    2537220 :       priority = 0;
     909                 :    6136400 :       FOR_EACH_EDGE (e, ei, bb->preds)
     910                 :            :         {
     911                 :    3599190 :           if ((e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
     912                 :    3208290 :                && bbd[e->src->index].end_of_trace >= 0)
     913                 :    3599190 :               || (e->flags & EDGE_DFS_BACK))
     914                 :            :             {
     915                 :      20361 :               int edge_freq = EDGE_FREQUENCY (e);
     916                 :            : 
     917                 :      20361 :               if (edge_freq > priority)
     918                 :            :                 priority = edge_freq;
     919                 :            :             }
     920                 :            :         }
     921                 :    2537220 :       bbd[bb->index].priority = priority;
     922                 :            :     }
     923                 :            : 
     924                 :    4543150 :   if (priority)
     925                 :            :     /* The block with priority should have significantly lower key.  */
     926                 :     953428 :     return -(100 * BB_FREQ_MAX + 100 * priority + bb->count.to_frequency (cfun));
     927                 :            : 
     928                 :    3589720 :   return -bb->count.to_frequency (cfun);
     929                 :            : }
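
/* Illustrative model of the ordering produced by bb_to_key (a sketch only,
   not part of the pass): the heap hands out the lowest key first, so blocks
   with a predecessor that already ends some trace or with an incoming DFS
   back edge (priority > 0) are started first, ordinary blocks follow in
   decreasing hotness, and probably never executed blocks are started last.
   MODEL_BB_FREQ_MAX and model_bb_to_key are hypothetical names; the
   optimize-for-size case (key == bb->index) is omitted.  */

#define MODEL_BB_FREQ_MAX 10000 /* Assumed stand-in for BB_FREQ_MAX.  */

long
model_bb_to_key (long freq, long priority, int probably_never_executed)
{
  if (probably_never_executed)
    return MODEL_BB_FREQ_MAX;   /* Largest key: started last.  */
  if (priority > 0)
    /* Far below any plain -freq key, so these blocks are picked first.  */
    return -(100 * MODEL_BB_FREQ_MAX + 100 * priority + freq);
  return -freq;                 /* Hotter blocks get smaller keys.  */
}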
     930                 :            : 
     931                 :            : /* Return true when the edge E from basic block BB is better than the
     932                 :            :    temporary best edge CUR_BEST_EDGE (the detailed criteria are in the
     933                 :            :    function body).  The probability of edge E is PROB and the count of its
     934                 :            :    successor is COUNT.  The current best probability is BEST_PROB and the
     935                 :            :    best count is BEST_COUNT.  The edge is considered equivalent to the best
     936                 :            :    edge when PROB does not differ much from BEST_PROB; similarly for count.  */
     937                 :            : 
     938                 :            : static bool
     939                 :    4044580 : better_edge_p (const_basic_block bb, const_edge e, profile_probability prob,
     940                 :            :                profile_count count, profile_probability best_prob,
     941                 :            :                profile_count best_count, const_edge cur_best_edge)
     942                 :            : {
     943                 :    4044580 :   bool is_better_edge;
     944                 :            : 
     945                 :            :   /* The BEST_* values need not be the exact best values seen so far; they
     946                 :            :      can be slightly smaller than the true maxima.  */
     947                 :    4044580 :   profile_probability diff_prob = best_prob.apply_scale (1, 10);
     948                 :            : 
     949                 :            :   /* The smaller one is better to keep the original order.  */
     950                 :    4044580 :   if (optimize_function_for_size_p (cfun))
     951                 :     281629 :     return !cur_best_edge
     952                 :     313720 :            || cur_best_edge->dest->index > e->dest->index;
     953                 :            : 
     954                 :            :   /* Those edges are so expensive that continuing a trace is not useful
     955                 :            :      performance wise.  */
     956                 :    3762950 :   if (e->flags & (EDGE_ABNORMAL | EDGE_EH))
     957                 :            :     return false;
     958                 :            : 
     959                 :    3762950 :   if (prob > best_prob + diff_prob
     960                 :    3618770 :       || (!best_prob.initialized_p ()
     961                 :    6671180 :           && prob > profile_probability::guessed_never ()))
     962                 :            :     /* The edge has higher probability than the temporary best edge.  */
     963                 :            :     is_better_edge = true;
     964                 :     714688 :   else if (prob < best_prob - diff_prob)
     965                 :            :     /* The edge has lower probability than the temporary best edge.  */
     966                 :            :     is_better_edge = false;
     967                 :            :   else
     968                 :            :     {
     969                 :     176180 :       profile_count diff_count = best_count.apply_scale (1, 10);
     970                 :     223541 :       if (count < best_count - diff_count
     971                 :     176180 :           || (!best_count.initialized_p ()
     972                 :       8365 :               && count.nonzero_p ()))
     973                 :            :         /* The edge and the temporary best edge have almost equivalent
     974                 :            :            probabilities.  A higher count of the successor now means
     975                 :            :            that there is another edge going into that successor.
     976                 :            :            This successor has a lower count, so it is better.  */
     977                 :            :         is_better_edge = true;
     978                 :     128819 :       else if (count > best_count + diff_count)
     979                 :            :         /* This successor has a higher count, so it is worse.  */
     980                 :            :         is_better_edge = false;
     981                 :      65607 :       else if (e->dest->prev_bb == bb)
     982                 :            :         /* The edges have equivalent probabilities and the successors
     983                 :            :            have equivalent frequencies.  Select the previous successor.  */
     984                 :            :         is_better_edge = true;
     985                 :            :       else
     986                 :      48557 :         is_better_edge = false;
     987                 :            :     }
     988                 :            : 
     989                 :            :   return is_better_edge;
     990                 :            : }
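
/* A worked reading of the tolerances in better_edge_p (illustrative numbers
   only): with BEST_PROB at 50%, DIFF_PROB is 5%, so a PROB above 55% wins
   outright and a PROB below 45% loses outright; anything in between counts
   as equivalent and the decision falls through to the successor counts,
   again with a 10% tolerance, and finally to whether E's destination
   already follows BB in the current block order.  */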
     991                 :            : 
     992                 :            : /* Return true when the edge E is better than the temporary best edge
     993                 :            :    CUR_BEST_EDGE.  If SRC_INDEX_P is true, the function compares the src bb of
     994                 :            :    E and CUR_BEST_EDGE; otherwise it will compare the dest bb.
     995                 :            :    BEST_LEN is the trace length of src (or dest) bb in CUR_BEST_EDGE.
     996                 :            :    TRACES record the information about traces.
     997                 :            :    When optimizing for size, the edge with smaller index is better.
     998                 :            :    When optimizing for speed, the edge with bigger probability or longer trace
     999                 :            :    is better.  */
    1000                 :            : 
    1001                 :            : static bool
    1002                 :    1155000 : connect_better_edge_p (const_edge e, bool src_index_p, int best_len,
    1003                 :            :                        const_edge cur_best_edge, struct trace *traces)
    1004                 :            : {
    1005                 :    1155000 :   int e_index;
    1006                 :    1155000 :   int b_index;
    1007                 :    1155000 :   bool is_better_edge;
    1008                 :            : 
    1009                 :    1155000 :   if (!cur_best_edge)
    1010                 :            :     return true;
    1011                 :            : 
    1012                 :     304609 :   if (optimize_function_for_size_p (cfun))
    1013                 :            :     {
    1014                 :      38478 :       e_index = src_index_p ? e->src->index : e->dest->index;
    1015                 :      38478 :       b_index = src_index_p ? cur_best_edge->src->index
    1016                 :      37502 :                               : cur_best_edge->dest->index;
    1017                 :            :       /* The smaller one is better to keep the original order.  */
    1018                 :      38478 :       return b_index > e_index;
    1019                 :            :     }
    1020                 :            : 
    1021                 :     266131 :   if (src_index_p)
    1022                 :            :     {
    1023                 :      18511 :       e_index = e->src->index;
    1024                 :            : 
    1025                 :            :       /* We are looking for a predecessor, so probabilities are not that
    1026                 :            :          informative.  We do not want to connect A to B merely because A has
    1027                 :            :          only one successor (probability is 100%) while there is an edge
    1028                 :            :          A' to B whose probability is 90% but which is much more frequent.  */
    1029                 :      18511 :       if (e->count () > cur_best_edge->count ())
    1030                 :            :         /* The edge has a higher count than the temporary best edge.  */
    1031                 :            :         is_better_edge = true;
    1032                 :      15243 :       else if (e->count () < cur_best_edge->count ())
    1033                 :            :         /* The edge has a lower count than the temporary best edge.  */
    1034                 :            :         is_better_edge = false;
    1035                 :       7276 :       else if (e->probability > cur_best_edge->probability)
    1036                 :            :         /* The edge has higher probability than the temporary best edge.  */
    1037                 :            :         is_better_edge = true;
    1038                 :       6491 :       else if (e->probability < cur_best_edge->probability)
    1039                 :            :         /* The edge has lower probability than the temporary best edge.  */
    1040                 :            :         is_better_edge = false;
    1041                 :       5290 :       else if (traces[bbd[e_index].end_of_trace].length > best_len)
    1042                 :            :         /* The edge and the temporary best edge have equivalent probabilities.
    1043                 :            :            The edge with longer trace is better.  */
    1044                 :            :         is_better_edge = true;
    1045                 :            :       else
    1046                 :       5039 :         is_better_edge = false;
    1047                 :            :     }
    1048                 :            :   else
    1049                 :            :     {
    1050                 :     247620 :       e_index = e->dest->index;
    1051                 :            : 
    1052                 :     247620 :       if (e->probability > cur_best_edge->probability)
    1053                 :            :         /* The edge has higher probability than the temporary best edge.  */
    1054                 :            :         is_better_edge = true;
    1055                 :     156896 :       else if (e->probability < cur_best_edge->probability)
    1056                 :            :         /* The edge has lower probability than the temporary best edge.  */
    1057                 :            :         is_better_edge = false;
    1058                 :      63252 :       else if (traces[bbd[e_index].start_of_trace].length > best_len)
    1059                 :            :         /* The edge and the temporary best edge have equivalent probabilities.
    1060                 :            :            The edge with longer trace is better.  */
    1061                 :            :         is_better_edge = true;
    1062                 :            :       else
    1063                 :      46427 :         is_better_edge = false;
    1064                 :            :     }
    1065                 :            : 
    1066                 :            :   return is_better_edge;
    1067                 :            : }
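
/* A hypothetical instance of the predecessor rule in connect_better_edge_p:
   when looking for a trace to prepend, an edge A'->B with count 900 and
   probability 90% beats an edge A->B with count 5 and probability 100%,
   because for predecessors the counts are compared before the
   probabilities.  */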
    1068                 :            : 
    1069                 :            : /* Connect traces in array TRACES, N_TRACES is the count of traces.  */
    1070                 :            : 
    1071                 :            : static void
    1072                 :     417352 : connect_traces (int n_traces, struct trace *traces)
    1073                 :            : {
    1074                 :     417352 :   int i;
    1075                 :     417352 :   bool *connected;
    1076                 :     417352 :   bool two_passes;
    1077                 :     417352 :   int last_trace;
    1078                 :     417352 :   int current_pass;
    1079                 :     417352 :   int current_partition;
    1080                 :     417352 :   profile_count count_threshold;
    1081                 :     417352 :   bool for_size = optimize_function_for_size_p (cfun);
    1082                 :            : 
    1083                 :     417352 :   count_threshold = max_entry_count.apply_scale (DUPLICATION_THRESHOLD, 1000);
    1084                 :            : 
    1085                 :     417352 :   connected = XCNEWVEC (bool, n_traces);
    1086                 :     417352 :   last_trace = -1;
    1087                 :     417352 :   current_pass = 1;
    1088                 :     417352 :   current_partition = BB_PARTITION (traces[0].first);
    1089                 :     417352 :   two_passes = false;
    1090                 :            : 
    1091                 :     417352 :   if (crtl->has_bb_partition)
    1092                 :     336637 :     for (i = 0; i < n_traces && !two_passes; i++)
    1093                 :     289171 :       if (BB_PARTITION (traces[0].first)
    1094                 :     289171 :           != BB_PARTITION (traces[i].first))
    1095                 :      47465 :         two_passes = true;
    1096                 :            : 
    1097                 :    3577020 :   for (i = 0; i < n_traces || (two_passes && current_pass == 1) ; i++)
    1098                 :            :     {
    1099                 :    3159670 :       int t = i;
    1100                 :    3159670 :       int t2;
    1101                 :    3159670 :       edge e, best;
    1102                 :    3159670 :       int best_len;
    1103                 :            : 
    1104                 :    3159670 :       if (i >= n_traces)
    1105                 :            :         {
    1106                 :      47465 :           gcc_assert (two_passes && current_pass == 1);
    1107                 :      47465 :           i = 0;
    1108                 :      47465 :           t = i;
    1109                 :      47465 :           current_pass = 2;
    1110                 :      47465 :           if (current_partition == BB_HOT_PARTITION)
    1111                 :            :             current_partition = BB_COLD_PARTITION;
    1112                 :            :           else
    1113                 :          0 :             current_partition = BB_HOT_PARTITION;
    1114                 :            :         }
    1115                 :            : 
    1116                 :    3159670 :       if (connected[t])
    1117                 :    1244560 :         continue;
    1118                 :            : 
    1119                 :    2018960 :       if (two_passes
    1120                 :     377611 :           && BB_PARTITION (traces[t].first) != current_partition)
    1121                 :     103855 :         continue;
    1122                 :            : 
    1123                 :    1915100 :       connected[t] = true;
    1124                 :            : 
    1125                 :            :       /* Find the predecessor traces.  */
    1126                 :    1968500 :       for (t2 = t; t2 > 0;)
    1127                 :            :         {
    1128                 :    1551150 :           edge_iterator ei;
    1129                 :    1551150 :           best = NULL;
    1130                 :    1551150 :           best_len = 0;
    1131                 :    3984740 :           FOR_EACH_EDGE (e, ei, traces[t2].first->preds)
    1132                 :            :             {
    1133                 :    2433590 :               int si = e->src->index;
    1134                 :            : 
    1135                 :    2433590 :               if (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
    1136                 :            :                   && (e->flags & EDGE_CAN_FALLTHRU)
    1137                 :    2433590 :                   && !(e->flags & EDGE_COMPLEX)
    1138                 :    1890940 :                   && bbd[si].end_of_trace >= 0
    1139                 :     295876 :                   && !connected[bbd[si].end_of_trace]
    1140                 :      72963 :                   && (BB_PARTITION (e->src) == current_partition)
    1141                 :    2506480 :                   && connect_better_edge_p (e, true, best_len, best, traces))
    1142                 :            :                 {
    1143                 :      58001 :                   best = e;
    1144                 :      58001 :                   best_len = traces[bbd[si].end_of_trace].length;
    1145                 :            :                 }
    1146                 :            :             }
    1147                 :    1551150 :           if (best)
    1148                 :            :             {
    1149                 :      53395 :               best->src->aux = best->dest;
    1150                 :      53395 :               t2 = bbd[best->src->index].end_of_trace;
    1151                 :      53395 :               connected[t2] = true;
    1152                 :            : 
    1153                 :      53395 :               if (dump_file)
    1154                 :            :                 {
    1155                 :          8 :                   fprintf (dump_file, "Connection: %d %d\n",
    1156                 :            :                            best->src->index, best->dest->index);
    1157                 :            :                 }
    1158                 :            :             }
    1159                 :            :           else
    1160                 :            :             break;
    1161                 :            :         }
    1162                 :            : 
    1163                 :    1915100 :       if (last_trace >= 0)
    1164                 :    1497750 :         traces[last_trace].last->aux = traces[t2].first;
    1165                 :            :       last_trace = t;
    1166                 :            : 
    1167                 :            :       /* Find the successor traces.  */
    1168                 :    3555810 :       while (1)
    1169                 :            :         {
    1170                 :            :           /* Find the continuation of the chain.  */
    1171                 :    2735460 :           edge_iterator ei;
    1172                 :    2735460 :           best = NULL;
    1173                 :    2735460 :           best_len = 0;
    1174                 :    5946420 :           FOR_EACH_EDGE (e, ei, traces[t].last->succs)
    1175                 :            :             {
    1176                 :    3210960 :               int di = e->dest->index;
    1177                 :            : 
    1178                 :    3210960 :               if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
    1179                 :            :                   && (e->flags & EDGE_CAN_FALLTHRU)
    1180                 :    2734930 :                   && !(e->flags & EDGE_COMPLEX)
    1181                 :    2544460 :                   && bbd[di].start_of_trace >= 0
    1182                 :    1377720 :                   && !connected[bbd[di].start_of_trace]
    1183                 :    1087970 :                   && (BB_PARTITION (e->dest) == current_partition)
    1184                 :    4293070 :                   && connect_better_edge_p (e, false, best_len, best, traces))
    1185                 :            :                 {
    1186                 :     932852 :                   best = e;
    1187                 :     932852 :                   best_len = traces[bbd[di].start_of_trace].length;
    1188                 :            :                 }
    1189                 :            :             }
    1190                 :            : 
    1191                 :    2735460 :           if (for_size)
    1192                 :            :             {
    1193                 :     192704 :               if (!best)
    1194                 :            :                 /* Stop finding the successor traces.  */
    1195                 :            :                 break;
    1196                 :            : 
    1197                 :            :               /* It is OK to connect block n with block n + 1 or a block
    1198                 :            :                  before n.  For others, only connect to the loop header.  */
    1199                 :     140082 :               if (best->dest->index > (traces[t].last->index + 1))
    1200                 :            :                 {
    1201                 :      12070 :                   int count = EDGE_COUNT (best->dest->preds);
    1202                 :            : 
    1203                 :      71288 :                   FOR_EACH_EDGE (e, ei, best->dest->preds)
    1204                 :      59218 :                     if (e->flags & EDGE_DFS_BACK)
    1205                 :       2718 :                       count--;
    1206                 :            : 
    1207                 :            :                   /* If dest has multiple predecessors, skip it.  We expect
    1208                 :            :                      that one predecessor with smaller index connects with it
    1209                 :            :                      later.  */
    1210                 :      12070 :                   if (count != 1) 
    1211                 :            :                     break;
    1212                 :            :                 }
    1213                 :            : 
    1214                 :            :               /* Only connect Trace n with Trace n + 1.  It is conservative
    1215                 :            :                  to keep the order as close as possible to the original order.
    1216                 :            :                  It also helps to reduce long jumps.  */
    1217                 :     131746 :               if (last_trace != bbd[best->dest->index].start_of_trace - 1)
    1218                 :            :                 break;
    1219                 :            : 
    1220                 :     130351 :               if (dump_file)
    1221                 :          5 :                 fprintf (dump_file, "Connection: %d %d\n",
    1222                 :          5 :                          best->src->index, best->dest->index);
    1223                 :            : 
    1224                 :     130351 :               t = bbd[best->dest->index].start_of_trace;
    1225                 :     130351 :               traces[last_trace].last->aux = traces[t].first;
    1226                 :     130351 :               connected[t] = true;
    1227                 :     130351 :               last_trace = t;
    1228                 :            :             }
    1229                 :    2542750 :           else if (best)
    1230                 :            :             {
    1231                 :     656913 :               if (dump_file)
    1232                 :            :                 {
    1233                 :         31 :                   fprintf (dump_file, "Connection: %d %d\n",
    1234                 :         31 :                            best->src->index, best->dest->index);
    1235                 :            :                 }
    1236                 :     656913 :               t = bbd[best->dest->index].start_of_trace;
    1237                 :     656913 :               traces[last_trace].last->aux = traces[t].first;
    1238                 :     656913 :               connected[t] = true;
    1239                 :     656913 :               last_trace = t;
    1240                 :            :             }
    1241                 :            :           else
    1242                 :            :             {
    1243                 :            :               /* Try to connect the traces by duplication of 1 block.  */
    1244                 :    1885840 :               edge e2;
    1245                 :    1885840 :               basic_block next_bb = NULL;
    1246                 :    1885840 :               bool try_copy = false;
    1247                 :            : 
    1248                 :    3610570 :               FOR_EACH_EDGE (e, ei, traces[t].last->succs)
    1249                 :    1724730 :                 if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
    1250                 :            :                     && (e->flags & EDGE_CAN_FALLTHRU)
    1251                 :    1275570 :                     && !(e->flags & EDGE_COMPLEX)
    1252                 :    4089370 :                     && (!best || e->probability > best->probability))
    1253                 :            :                   {
    1254                 :    1080100 :                     edge_iterator ei;
    1255                 :    1080100 :                     edge best2 = NULL;
    1256                 :    1080100 :                     int best2_len = 0;
    1257                 :            : 
    1258                 :            :                     /* If the destination is a start of a trace which is only
    1259                 :            :                        one block long, then no need to search the successor
    1260                 :            :                        blocks of the trace.  Accept it.  */
    1261                 :    1080100 :                     if (bbd[e->dest->index].start_of_trace >= 0
    1262                 :     196512 :                         && traces[bbd[e->dest->index].start_of_trace].length
    1263                 :            :                            == 1)
    1264                 :            :                       {
    1265                 :     132839 :                         best = e;
    1266                 :     132839 :                         try_copy = true;
    1267                 :     132839 :                         continue;
    1268                 :            :                       }
    1269                 :            : 
    1270                 :    2445220 :                     FOR_EACH_EDGE (e2, ei, e->dest->succs)
    1271                 :            :                       {
    1272                 :    1497960 :                         int di = e2->dest->index;
    1273                 :            : 
    1274                 :    1497960 :                         if (e2->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
    1275                 :    1497960 :                             || ((e2->flags & EDGE_CAN_FALLTHRU)
    1276                 :    1303640 :                                 && !(e2->flags & EDGE_COMPLEX)
    1277                 :    1211280 :                                 && bbd[di].start_of_trace >= 0
    1278                 :     448603 :                                 && !connected[bbd[di].start_of_trace]
    1279                 :     161470 :                                 && BB_PARTITION (e2->dest) == current_partition
    1280                 :    1303830 :                                 && e2->count () >= count_threshold
    1281                 :      79223 :                                 && (!best2
    1282                 :        501 :                                     || e2->probability > best2->probability
    1283                 :        221 :                                     || (e2->probability == best2->probability
    1284                 :         99 :                                         && traces[bbd[di].start_of_trace].length
    1285                 :            :                                            > best2_len))))
    1286                 :            :                           {
    1287                 :     273345 :                             best = e;
    1288                 :     273345 :                             best2 = e2;
    1289                 :     273345 :                             if (e2->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
    1290                 :      79029 :                               best2_len = traces[bbd[di].start_of_trace].length;
    1291                 :            :                             else
    1292                 :            :                               best2_len = INT_MAX;
    1293                 :            :                             next_bb = e2->dest;
    1294                 :            :                             try_copy = true;
    1295                 :            :                           }
    1296                 :            :                       }
    1297                 :            :                   }
    1298                 :            : 
    1299                 :            :               /* Copy tiny blocks always; copy larger blocks only when the
    1300                 :            :                  edge is traversed frequently enough.  */
    1301                 :    1885840 :               if (try_copy
    1302                 :     402352 :                   && BB_PARTITION (best->src) == BB_PARTITION (best->dest)
    1303                 :    2286870 :                   && copy_bb_p (best->dest,
    1304                 :     401031 :                                 optimize_edge_for_speed_p (best)
    1305                 :     401031 :                                 && (!best->count ().initialized_p ()
    1306                 :    2025260 :                                     || best->count () >= count_threshold)))
    1307                 :            :                 {
    1308                 :     134070 :                   basic_block new_bb;
    1309                 :            : 
    1310                 :     134070 :                   if (dump_file)
    1311                 :            :                     {
    1312                 :          2 :                       fprintf (dump_file, "Connection: %d %d ",
    1313                 :          2 :                                traces[t].last->index, best->dest->index);
    1314                 :          2 :                       if (!next_bb)
    1315                 :          0 :                         fputc ('\n', dump_file);
    1316                 :          2 :                       else if (next_bb == EXIT_BLOCK_PTR_FOR_FN (cfun))
    1317                 :          2 :                         fprintf (dump_file, "exit\n");
    1318                 :            :                       else
    1319                 :          0 :                         fprintf (dump_file, "%d\n", next_bb->index);
    1320                 :            :                     }
    1321                 :            : 
    1322                 :     134070 :                   new_bb = copy_bb (best->dest, best, traces[t].last, t);
    1323                 :     134070 :                   traces[t].last = new_bb;
    1324                 :     134070 :                   if (next_bb && next_bb != EXIT_BLOCK_PTR_FOR_FN (cfun))
    1325                 :            :                     {
    1326                 :      33091 :                       t = bbd[next_bb->index].start_of_trace;
    1327                 :      33091 :                       traces[last_trace].last->aux = traces[t].first;
    1328                 :      33091 :                       connected[t] = true;
    1329                 :      33091 :                       last_trace = t;
    1330                 :            :                     }
    1331                 :            :                   else
    1332                 :            :                     break;      /* Stop finding the successor traces.  */
    1333                 :            :                 }
    1334                 :            :               else
    1335                 :            :                 break;  /* Stop finding the successor traces.  */
    1336                 :            :             }
    1337                 :     820355 :         }
    1338                 :            :     }
    1339                 :            : 
    1340                 :     417352 :   if (dump_file)
    1341                 :            :     {
    1342                 :         23 :       basic_block bb;
    1343                 :            : 
    1344                 :         23 :       fprintf (dump_file, "Final order:\n");
    1345                 :        207 :       for (bb = traces[0].first; bb; bb = (basic_block) bb->aux)
    1346                 :        184 :         fprintf (dump_file, "%d ", bb->index);
    1347                 :         23 :       fprintf (dump_file, "\n");
    1348                 :         23 :       fflush (dump_file);
    1349                 :            :     }
    1350                 :            : 
    1351                 :     417352 :   FREE (connected);
    1352                 :     417352 : }
    1353                 :            : 
    1354                 :            : /* Return true when BB can and should be copied. CODE_MAY_GROW is true
    1355                 :            :    when code size is allowed to grow by duplication.  */
    1356                 :            : 
    1357                 :            : static bool
    1358                 :    2070680 : copy_bb_p (const_basic_block bb, int code_may_grow)
    1359                 :            : {
    1360                 :    2070680 :   unsigned int size = 0;
    1361                 :    2070680 :   unsigned int max_size = uncond_jump_length;
    1362                 :    2070680 :   rtx_insn *insn;
    1363                 :            : 
    1364                 :    2070680 :   if (EDGE_COUNT (bb->preds) < 2)
    1365                 :            :     return false;
    1366                 :    2069800 :   if (!can_duplicate_block_p (bb))
    1367                 :            :     return false;
    1368                 :            : 
    1369                 :            :   /* Avoid duplicating blocks which have many successors (PR/13430).  */
    1370                 :    2069790 :   if (EDGE_COUNT (bb->succs) > 8)
    1371                 :            :     return false;
    1372                 :            : 
    1373                 :    2069780 :   if (code_may_grow && optimize_bb_for_speed_p (bb))
    1374                 :     205096 :     max_size *= param_max_grow_copy_bb_insns;
    1375                 :            : 
    1376                 :   28160300 :   FOR_BB_INSNS (bb, insn)
    1377                 :            :     {
    1378                 :   14933200 :       if (INSN_P (insn))
    1379                 :            :         {
    1380                 :   10188000 :           size += get_attr_min_length (insn);
    1381                 :   10188000 :           if (size > max_size)
    1382                 :            :             break;
    1383                 :            :         }
    1384                 :            :     }
    1385                 :            : 
    1386                 :    2069780 :   if (size <= max_size)
    1387                 :            :     return true;
    1388                 :            : 
    1389                 :    1887920 :   if (dump_file)
    1390                 :            :     {
    1391                 :         59 :       fprintf (dump_file,
    1392                 :            :                "Block %d can't be copied because its size = %u.\n",
    1393                 :         59 :                bb->index, size);
    1394                 :            :     }
    1395                 :            : 
    1396                 :            :   return false;
    1397                 :            : }
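
/* A worked example of the size limit in copy_bb_p (illustrative numbers):
   if get_uncond_jump_length () reports 5 and param_max_grow_copy_bb_insns
   is 8, a block optimized for speed may be copied (the other checks
   permitting) while the sum of its insns' minimum lengths stays within 40;
   when CODE_MAY_GROW is false or the block is optimized for size, the limit
   stays at the bare jump length of 5.  */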
    1398                 :            : 
    1399                 :            : /* Return the length of unconditional jump instruction.  */
    1400                 :            : 
    1401                 :            : int
    1402                 :     508975 : get_uncond_jump_length (void)
    1403                 :            : {
    1404                 :     508975 :   unsigned int length;
    1405                 :            : 
    1406                 :     508975 :   start_sequence ();
    1407                 :     508975 :   rtx_code_label *label = emit_label (gen_label_rtx ());
    1408                 :     508975 :   rtx_insn *jump = emit_jump_insn (targetm.gen_jump (label));
    1409                 :     508975 :   length = get_attr_min_length (jump);
    1410                 :     508975 :   end_sequence ();
    1411                 :            : 
    1412                 :     508975 :   gcc_assert (length < INT_MAX);
    1413                 :     508975 :   return length;
    1414                 :            : }
    1415                 :            : 
    1416                 :            : /* Create a forwarder block to OLD_BB starting with NEW_LABEL and in the
    1417                 :            :    other partition wrt OLD_BB.  */
    1418                 :            : 
    1419                 :            : static basic_block
    1420                 :        665 : create_eh_forwarder_block (rtx_code_label *new_label, basic_block old_bb)
    1421                 :            : {
    1422                 :            :   /* Split OLD_BB, so that EH pads always have only incoming EH edges;
    1423                 :            :      bb_has_eh_pred bbs are treated specially by the DF infrastructure.  */
    1424                 :        665 :   old_bb = split_block_after_labels (old_bb)->dest;
    1425                 :            : 
    1426                 :            :   /* Put the new label and a jump in the new basic block.  */
    1427                 :        665 :   rtx_insn *label = emit_label (new_label);
    1428                 :        665 :   rtx_code_label *old_label = block_label (old_bb);
    1429                 :        665 :   rtx_insn *jump = emit_jump_insn (targetm.gen_jump (old_label));
    1430                 :        665 :   JUMP_LABEL (jump) = old_label;
    1431                 :            : 
    1432                 :            :   /* Create the new basic block and put it in last position.  */
    1433                 :        665 :   basic_block last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb;
    1434                 :        665 :   basic_block new_bb = create_basic_block (label, jump, last_bb);
    1435                 :        665 :   new_bb->aux = last_bb->aux;
    1436                 :        665 :   new_bb->count = old_bb->count;
    1437                 :        665 :   last_bb->aux = new_bb;
    1438                 :            : 
    1439                 :        665 :   emit_barrier_after_bb (new_bb);
    1440                 :            : 
    1441                 :        665 :   make_single_succ_edge (new_bb, old_bb, 0);
    1442                 :            : 
    1443                 :            :   /* Make sure the new basic block is in the other partition.  */
    1444                 :        665 :   unsigned new_partition = BB_PARTITION (old_bb);
    1445                 :        665 :   new_partition ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
    1446                 :        665 :   BB_SET_PARTITION (new_bb, new_partition);
    1447                 :            : 
    1448                 :        665 :   return new_bb;
    1449                 :            : }
    1450                 :            : 
    1451                 :            : /* The common landing pad in block OLD_BB has edges from both partitions.
    1452                 :            :    Add a new landing pad that will just jump to the old one and split the
    1453                 :            :    edges so that no EH edge crosses partitions.  */
    1454                 :            : 
    1455                 :            : static void
    1456                 :          0 : sjlj_fix_up_crossing_landing_pad (basic_block old_bb)
    1457                 :            : {
    1458                 :          0 :   const unsigned lp_len = cfun->eh->lp_array->length ();
    1459                 :          0 :   edge_iterator ei;
    1460                 :          0 :   edge e;
    1461                 :            : 
    1462                 :            :   /* Generate the new common landing-pad label.  */
    1463                 :          0 :   rtx_code_label *new_label = gen_label_rtx ();
    1464                 :          0 :   LABEL_PRESERVE_P (new_label) = 1;
    1465                 :            : 
    1466                 :            :   /* Create the forwarder block.  */
    1467                 :          0 :   basic_block new_bb = create_eh_forwarder_block (new_label, old_bb);
    1468                 :            : 
    1469                 :            :   /* Create the map from old to new lp index and initialize it.  */
    1470                 :          0 :   unsigned *index_map = (unsigned *) alloca (lp_len * sizeof (unsigned));
    1471                 :          0 :   memset (index_map, 0, lp_len * sizeof (unsigned));
    1472                 :            : 
    1473                 :            :   /* Fix up the edges.  */
    1474                 :          0 :   for (ei = ei_start (old_bb->preds); (e = ei_safe_edge (ei)) != NULL; )
    1475                 :          0 :     if (e->src != new_bb && BB_PARTITION (e->src) == BB_PARTITION (new_bb))
    1476                 :            :       {
    1477                 :          0 :         rtx_insn *insn = BB_END (e->src);
    1478                 :          0 :         rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
    1479                 :            : 
    1480                 :          0 :         gcc_assert (note != NULL);
    1481                 :          0 :         const unsigned old_index = INTVAL (XEXP (note, 0));
    1482                 :            : 
    1483                 :            :         /* Generate the new landing-pad structure.  */
    1484                 :          0 :         if (index_map[old_index] == 0)
    1485                 :            :           {
    1486                 :          0 :             eh_landing_pad old_lp = (*cfun->eh->lp_array)[old_index];
    1487                 :          0 :             eh_landing_pad new_lp = gen_eh_landing_pad (old_lp->region);
    1488                 :          0 :             new_lp->post_landing_pad = old_lp->post_landing_pad;
    1489                 :          0 :             new_lp->landing_pad = new_label;
    1490                 :          0 :             index_map[old_index] = new_lp->index;
    1491                 :            :           }
    1492                 :          0 :         XEXP (note, 0) = GEN_INT (index_map[old_index]);
    1493                 :            : 
    1494                 :            :         /* Adjust the edge to the new destination.  */
    1495                 :          0 :         redirect_edge_succ (e, new_bb);
    1496                 :            :       }
    1497                 :            :     else
    1498                 :          0 :       ei_next (&ei);
    1499                 :          0 : }
    1500                 :            : 
    1501                 :            : /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
    1502                 :            :    Add a new landing pad that will just jump to the old one and split the
    1503                 :            :    edges so that no EH edge crosses partitions.  */
    1504                 :            : 
    1505                 :            : static void
    1506                 :        665 : dw2_fix_up_crossing_landing_pad (eh_landing_pad old_lp, basic_block old_bb)
    1507                 :            : {
    1508                 :        665 :   eh_landing_pad new_lp;
    1509                 :        665 :   edge_iterator ei;
    1510                 :        665 :   edge e;
    1511                 :            : 
    1512                 :            :   /* Generate the new landing-pad structure.  */
    1513                 :        665 :   new_lp = gen_eh_landing_pad (old_lp->region);
    1514                 :        665 :   new_lp->post_landing_pad = old_lp->post_landing_pad;
    1515                 :        665 :   new_lp->landing_pad = gen_label_rtx ();
    1516                 :        665 :   LABEL_PRESERVE_P (new_lp->landing_pad) = 1;
    1517                 :            : 
    1518                 :            :   /* Create the forwarder block.  */
    1519                 :        665 :   basic_block new_bb = create_eh_forwarder_block (new_lp->landing_pad, old_bb);
    1520                 :            : 
    1521                 :            :   /* Fix up the edges.  */
    1522                 :       5040 :   for (ei = ei_start (old_bb->preds); (e = ei_safe_edge (ei)) != NULL; )
    1523                 :       4375 :     if (e->src != new_bb && BB_PARTITION (e->src) == BB_PARTITION (new_bb))
    1524                 :            :       {
    1525                 :       3247 :         rtx_insn *insn = BB_END (e->src);
    1526                 :       3247 :         rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
    1527                 :            : 
    1528                 :       3247 :         gcc_assert (note != NULL);
    1529                 :       3247 :         gcc_checking_assert (INTVAL (XEXP (note, 0)) == old_lp->index);
    1530                 :       3247 :         XEXP (note, 0) = GEN_INT (new_lp->index);
    1531                 :            : 
    1532                 :            :         /* Adjust the edge to the new destination.  */
    1533                 :       3247 :         redirect_edge_succ (e, new_bb);
    1534                 :            :       }
    1535                 :            :     else
    1536                 :       1128 :       ei_next (&ei);
    1537                 :        665 : }
    1538                 :            : 
    1539                 :            : 
    1540                 :            : /* Ensure that all hot bbs are included in a hot path through the
    1541                 :            :    procedure. This is done by calling this function twice, once
    1542                 :            :    with WALK_UP true (to look for paths from the entry to hot bbs) and
    1543                 :            :    once with WALK_UP false (to look for paths from hot bbs to the exit).
    1544                 :            :    Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs
    1545                 :            :    to BBS_IN_HOT_PARTITION.  */
    1546                 :            : 
    1547                 :            : static unsigned int
    1548                 :      95316 : sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
    1549                 :            :                     vec<basic_block> *bbs_in_hot_partition)
    1550                 :            : {
    1551                 :            :   /* Callers check this.  */
    1552                 :      95316 :   gcc_checking_assert (cold_bb_count);
    1553                 :            : 
    1554                 :            :   /* Keep examining hot bbs while we still have some left to check
    1555                 :            :      and there are remaining cold bbs.  */
    1556                 :     190604 :   vec<basic_block> hot_bbs_to_check = bbs_in_hot_partition->copy ();
    1557                 :    1768100 :   while (! hot_bbs_to_check.is_empty ()
    1558                 :    3441060 :          && cold_bb_count)
    1559                 :            :     {
    1560                 :    1672790 :       basic_block bb = hot_bbs_to_check.pop ();
    1561                 :    1672790 :       vec<edge, va_gc> *edges = walk_up ? bb->preds : bb->succs;
    1562                 :    1672790 :       edge e;
    1563                 :    1672790 :       edge_iterator ei;
    1564                 :    1672790 :       profile_probability highest_probability
    1565                 :    1672790 :                                  = profile_probability::uninitialized ();
    1566                 :    1672790 :       profile_count highest_count = profile_count::uninitialized ();
    1567                 :    1672790 :       bool found = false;
    1568                 :            : 
    1569                 :            :       /* Walk the preds/succs and check if there is at least one already
    1570                 :            :          marked hot. Keep track of the most frequent pred/succ so that we
    1571                 :            :          can mark it hot if we don't find one.  */
    1572                 :    2067820 :       FOR_EACH_EDGE (e, ei, edges)
    1573                 :            :         {
    1574                 :    2034380 :           basic_block reach_bb = walk_up ? e->src : e->dest;
    1575                 :            : 
    1576                 :    2034380 :           if (e->flags & EDGE_DFS_BACK)
    1577                 :      37868 :             continue;
    1578                 :            : 
    1579                 :            :           /* Do not expect profile insanities when the profile was not adjusted.  */
    1580                 :    2345390 :           if (e->probability == profile_probability::never ()
    1581                 :    1655250 :               || e->count () == profile_count::zero ())
    1582                 :     343422 :             continue;
    1583                 :            : 
    1584                 :    1653090 :           if (BB_PARTITION (reach_bb) != BB_COLD_PARTITION)
    1585                 :            :           {
    1586                 :            :             found = true;
    1587                 :            :             break;
    1588                 :            :           }
    1589                 :            :           /* The following loop will look for the hottest edge using
    1590                 :            :              the edge count if it is non-zero, falling back to the
    1591                 :            :              edge probability otherwise.  */
    1592                 :      13744 :           if (!(e->count () > highest_count))
    1593                 :      13739 :             highest_count = e->count ();
    1594                 :      13744 :           if (!highest_probability.initialized_p ()
    1595                 :     408778 :               || e->probability > highest_probability)
    1596                 :      13712 :             highest_probability = e->probability;
    1597                 :            :         }
    1598                 :            : 
    1599                 :            :       /* If bb is reached by (or reaches, in the case of !WALK_UP) another hot
    1600                 :            :          block (or unpartitioned, e.g. the entry block) then it is ok. If not,
    1601                 :            :          then the most frequent pred (or succ) needs to be adjusted.  In the
    1602                 :            :          case where multiple preds/succs have the same frequency (e.g. a
    1603                 :            :          50-50 branch), then both will be adjusted.  */
    1604                 :    1672790 :       if (found)
    1605                 :    1639350 :         continue;
    1606                 :            : 
    1607                 :      64087 :       FOR_EACH_EDGE (e, ei, edges)
    1608                 :            :         {
    1609                 :      30649 :           if (e->flags & EDGE_DFS_BACK)
    1610                 :      27296 :             continue;
    1611                 :            :           /* Do not expect profile insanities when the profile was not adjusted.  */
    1612                 :      36698 :           if (e->probability == profile_probability::never ()
    1613                 :       3547 :               || e->count () == profile_count::zero ())
    1614                 :      16709 :             continue;
    1615                 :            :           /* Select the hottest edge using the edge count if it is non-zero,
    1616                 :            :              falling back to the edge probability otherwise.  */
    1617                 :       3353 :           if (highest_count.initialized_p ())
    1618                 :            :             {
    1619                 :       3353 :               if (!(e->count () >= highest_count))
    1620                 :          0 :                 continue;
    1621                 :            :             }
    1622                 :          0 :           else if (!(e->probability >= highest_probability))
    1623                 :          0 :             continue;
    1624                 :            : 
    1625                 :       3353 :           basic_block reach_bb = walk_up ? e->src : e->dest;
    1626                 :            : 
    1627                 :            :           /* We have a hot bb with an immediate dominator that is cold.
    1628                 :            :              The dominator needs to be re-marked hot.  */
    1629                 :       3353 :           BB_SET_PARTITION (reach_bb, BB_HOT_PARTITION);
    1630                 :       3353 :           if (dump_file)
    1631                 :          0 :             fprintf (dump_file, "Promoting bb %i to hot partition to sanitize "
    1632                 :            :                      "profile of bb %i in %s walk\n", reach_bb->index,
    1633                 :            :                      bb->index, walk_up ? "backward" : "forward");
    1634                 :       3353 :           cold_bb_count--;
    1635                 :            : 
    1636                 :            :           /* Now we need to examine newly-hot reach_bb to see if it is also
    1637                 :            :              dominated by a cold bb.  */
    1638                 :       3353 :           bbs_in_hot_partition->safe_push (reach_bb);
    1639                 :       3353 :           hot_bbs_to_check.safe_push (reach_bb);
    1640                 :            :         }
    1641                 :            :     }
    1642                 :      95316 :   hot_bbs_to_check.release ();
    1643                 :            : 
    1644                 :      95316 :   return cold_bb_count;
    1645                 :            : }
    1646                 :            : 
    1647                 :            : 
    1648                 :            : /* Find the basic blocks that are rarely executed and need to be moved to
    1649                 :            :    a separate section of the .o file (to cut down on paging and improve
    1650                 :            :    cache locality).  Return a vector of all edges that cross.  */
    1651                 :            : 
    1652                 :            : static vec<edge>
    1653                 :     189691 : find_rarely_executed_basic_blocks_and_crossing_edges (void)
    1654                 :            : {
    1655                 :     189691 :   vec<edge> crossing_edges = vNULL;
    1656                 :     189691 :   basic_block bb;
    1657                 :     189691 :   edge e;
    1658                 :     189691 :   edge_iterator ei;
    1659                 :     189691 :   unsigned int cold_bb_count = 0;
    1660                 :     189691 :   auto_vec<basic_block> bbs_in_hot_partition;
    1661                 :            : 
    1662                 :     189691 :   propagate_unlikely_bbs_forward ();
    1663                 :            : 
    1664                 :            :   /* Mark which partition (hot/cold) each basic block belongs in.  */
    1665                 :    2924140 :   FOR_EACH_BB_FN (bb, cfun)
    1666                 :            :     {
    1667                 :    2734450 :       bool cold_bb = false;
    1668                 :            : 
    1669                 :    2734450 :       if (probably_never_executed_bb_p (cfun, bb))
    1670                 :            :         {
    1671                 :     176467 :           cold_bb = true;
    1672                 :            : 
    1673                 :            :           /* Handle profile insanities created by upstream optimizations
    1674                 :            :              by also checking the incoming edge weights. If there is a non-cold
    1675                 :            :              incoming edge, conservatively prevent this block from being split
    1676                 :            :              into the cold section.  */
    1677                 :     176467 :           if (!bb->count.precise_p ())
    1678                 :        133 :             FOR_EACH_EDGE (e, ei, bb->preds)
    1679                 :         74 :               if (!probably_never_executed_edge_p (cfun, e))
    1680                 :            :                 {
    1681                 :            :                   cold_bb = false;
    1682                 :            :                   break;
    1683                 :            :                 }
    1684                 :            :         }
    1685                 :         59 :       if (cold_bb)
    1686                 :            :         {
    1687                 :     176467 :           BB_SET_PARTITION (bb, BB_COLD_PARTITION);
    1688                 :     176467 :           cold_bb_count++;
    1689                 :            :         }
    1690                 :            :       else
    1691                 :            :         {
    1692                 :    2557980 :           BB_SET_PARTITION (bb, BB_HOT_PARTITION);
    1693                 :    2557980 :           bbs_in_hot_partition.safe_push (bb);
    1694                 :            :         }
    1695                 :            :     }
    1696                 :            : 
    1697                 :            :   /* Ensure that hot bbs are included along a hot path from the entry to exit.
    1698                 :            :      Several different situations can leave only cold bbs along all paths
    1699                 :            :      to/from a hot bb.  One is that there are edge weight insanities
    1700                 :            :      due to optimization phases that do not properly update basic block profile
    1701                 :            :      counts. The second is that the entry of the function may not be hot, because
    1702                 :            :      it is entered fewer times than the number of profile training runs, but there
    1703                 :            :      is a loop inside the function that causes blocks within the function to be
    1704                 :            :      above the threshold for hotness. This is fixed by walking up from hot bbs
    1705                 :            :      to the entry block, and then down from hot bbs to the exit, performing
    1706                 :            :      partitioning fixups as necessary.  */
    1707                 :     189691 :   if (cold_bb_count)
    1708                 :            :     {
    1709                 :      47658 :       mark_dfs_back_edges ();
    1710                 :      47658 :       cold_bb_count = sanitize_hot_paths (true, cold_bb_count,
    1711                 :            :                                           &bbs_in_hot_partition);
    1712                 :      47658 :       if (cold_bb_count)
    1713                 :      47658 :         sanitize_hot_paths (false, cold_bb_count, &bbs_in_hot_partition);
    1714                 :            : 
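                         :            :       /* Finally, force into the cold partition any block that is still
                         :            :          not reachable along a hot path after the fixups above.  */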
    1715                 :      95316 :       hash_set <basic_block> set;
    1716                 :      47658 :       find_bbs_reachable_by_hot_paths (&set);
    1717                 :    1061640 :       FOR_EACH_BB_FN (bb, cfun)
    1718                 :    1013980 :         if (!set.contains (bb))
    1719                 :     173114 :           BB_SET_PARTITION (bb, BB_COLD_PARTITION);
    1720                 :            :     }
    1721                 :            : 
    1722                 :            :   /* The format of .gcc_except_table does not allow landing pads to
    1723                 :            :      be in a different partition from the throw.  Fix this by either
    1724                 :            :      moving the landing pads or inserting forwarder landing pads.  */
    1725                 :     189691 :   if (cfun->eh->lp_array)
    1726                 :            :     {
    1727                 :     189691 :       const bool sjlj
    1728                 :     189691 :         = (targetm_common.except_unwind_info (&global_options) == UI_SJLJ);
    1729                 :     189691 :       unsigned i;
    1730                 :     189691 :       eh_landing_pad lp;
    1731                 :            : 
    1732                 :     584770 :       FOR_EACH_VEC_ELT (*cfun->eh->lp_array, i, lp)
    1733                 :            :         {
    1734                 :     395079 :           bool all_same, all_diff;
    1735                 :            : 
    1736                 :     395079 :           if (lp == NULL
    1737                 :      50209 :               || lp->landing_pad == NULL_RTX
    1738                 :      50209 :               || !LABEL_P (lp->landing_pad))
    1739                 :     344891 :             continue;
    1740                 :            : 
    1741                 :      50188 :           all_same = all_diff = true;
    1742                 :      50188 :           bb = BLOCK_FOR_INSN (lp->landing_pad);
    1743                 :     162246 :           FOR_EACH_EDGE (e, ei, bb->preds)
    1744                 :            :             {
    1745                 :     112058 :               gcc_assert (e->flags & EDGE_EH);
    1746                 :     112058 :               if (BB_PARTITION (bb) == BB_PARTITION (e->src))
    1747                 :            :                 all_diff = false;
    1748                 :            :               else
    1749                 :      97063 :                 all_same = false;
    1750                 :            :             }
    1751                 :            : 
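                         :            :           /* If every EH predecessor is already in the same partition as
                         :            :              the landing pad, there is nothing to do.  If every one is in
                         :            :              the other partition, it suffices to flip the landing pad's
                         :            :              partition.  Otherwise the edges come from both partitions and
                         :            :              a forwarder landing pad has to be inserted.  */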
    1752                 :      50188 :           if (all_same)
    1753                 :            :             ;
    1754                 :      40561 :           else if (all_diff)
    1755                 :            :             {
    1756                 :      39896 :               int which = BB_PARTITION (bb);
    1757                 :      39896 :               which ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
    1758                 :      39896 :               BB_SET_PARTITION (bb, which);
    1759                 :            :             }
    1760                 :        665 :           else if (sjlj)
    1761                 :          0 :             sjlj_fix_up_crossing_landing_pad (bb);
    1762                 :            :           else
    1763                 :        665 :             dw2_fix_up_crossing_landing_pad (lp, bb);
    1764                 :            : 
    1765                 :            :           /* There is a single, common landing pad in SJLJ mode.  */
    1766                 :      50188 :           if (sjlj)
    1767                 :            :             break;
    1768                 :            :         }
    1769                 :            :     }
    1770                 :            : 
    1771                 :            :   /* Mark every edge that crosses between sections.  */
    1772                 :    2925470 :   FOR_EACH_BB_FN (bb, cfun)
    1773                 :    6883320 :     FOR_EACH_EDGE (e, ei, bb->succs)
    1774                 :            :       {
    1775                 :    4147550 :         unsigned int flags = e->flags;
    1776                 :            : 
    1777                 :            :         /* We should never have EDGE_CROSSING set yet.  */
    1778                 :    4147550 :         gcc_checking_assert ((flags & EDGE_CROSSING) == 0);
    1779                 :            : 
    1780                 :    4147550 :         if (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
    1781                 :    4147550 :             && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
    1782                 :    3930290 :             && BB_PARTITION (e->src) != BB_PARTITION (e->dest))
    1783                 :            :           {
    1784                 :     329832 :             crossing_edges.safe_push (e);
    1785                 :     329832 :             flags |= EDGE_CROSSING;
    1786                 :            :           }
    1787                 :            : 
    1788                 :            :         /* Now that we've split eh edges as appropriate, allow landing pads
    1789                 :            :            to be merged with the post-landing pads.  */
    1790                 :    4147550 :         flags &= ~EDGE_PRESERVE;
    1791                 :            : 
    1792                 :    4147550 :         e->flags = flags;
    1793                 :            :       }
    1794                 :            : 
    1795                 :     189691 :   return crossing_edges;
    1796                 :            : }
    1797                 :            : 
    1798                 :            : /* Set the flag EDGE_CAN_FALLTHRU for edges that can be fallthru.  */
    1799                 :            : 
    1800                 :            : static void
    1801                 :     436137 : set_edge_can_fallthru_flag (void)
    1802                 :            : {
    1803                 :     436137 :   basic_block bb;
    1804                 :            : 
    1805                 :    6979640 :   FOR_EACH_BB_FN (bb, cfun)
    1806                 :            :     {
    1807                 :    6543500 :       edge e;
    1808                 :    6543500 :       edge_iterator ei;
    1809                 :            : 
    1810                 :   16326200 :       FOR_EACH_EDGE (e, ei, bb->succs)
    1811                 :            :         {
    1812                 :    9782730 :           e->flags &= ~EDGE_CAN_FALLTHRU;
    1813                 :            : 
    1814                 :            :           /* The FALLTHRU edge is also a CAN_FALLTHRU edge.  */
    1815                 :    9782730 :           if (e->flags & EDGE_FALLTHRU)
    1816                 :    5607160 :             e->flags |= EDGE_CAN_FALLTHRU;
    1817                 :            :         }
    1818                 :            : 
    1819                 :            :       /* If the BB ends with an invertible condjump, all (2) edges are
    1820                 :            :          CAN_FALLTHRU edges.  */
    1821                 :    6543500 :       if (EDGE_COUNT (bb->succs) != 2)
    1822                 :    3392760 :         continue;
    1823                 :    3564040 :       if (!any_condjump_p (BB_END (bb)))
    1824                 :     413303 :         continue;
    1825                 :            : 
    1826                 :    3150740 :       rtx_jump_insn *bb_end_jump = as_a <rtx_jump_insn *> (BB_END (bb));
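                         :            :       /* Invert the jump once to verify that it really is invertible,
                         :            :          then invert it back to restore the original condition.  */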
    1827                 :    3150740 :       if (!invert_jump (bb_end_jump, JUMP_LABEL (bb_end_jump), 0))
    1828                 :          0 :         continue;
    1829                 :    3150740 :       invert_jump (bb_end_jump, JUMP_LABEL (bb_end_jump), 0);
    1830                 :    3150740 :       EDGE_SUCC (bb, 0)->flags |= EDGE_CAN_FALLTHRU;
    1831                 :    3150740 :       EDGE_SUCC (bb, 1)->flags |= EDGE_CAN_FALLTHRU;
    1832                 :            :     }
    1833                 :     436137 : }
    1834                 :            : 
    1835                 :            : /* If any destination of a crossing edge does not have a label, add one;
    1836                 :            :    convert any easy fall-through crossing edges to unconditional jumps.  */
    1837                 :            : 
    1838                 :            : static void
    1839                 :      47475 : add_labels_and_missing_jumps (vec<edge> crossing_edges)
    1840                 :            : {
    1841                 :      47475 :   size_t i;
    1842                 :      47475 :   edge e;
    1843                 :            : 
    1844                 :     377307 :   FOR_EACH_VEC_ELT (crossing_edges, i, e)
    1845                 :            :     {
    1846                 :     329832 :       basic_block src = e->src;
    1847                 :     329832 :       basic_block dest = e->dest;
    1848                 :     329832 :       rtx_jump_insn *new_jump;
    1849                 :            : 
    1850                 :     329832 :       if (dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
    1851                 :          0 :         continue;
    1852                 :            : 
    1853                 :            :       /* Make sure dest has a label.  */
    1854                 :     329832 :       rtx_code_label *label = block_label (dest);
    1855                 :            : 
    1856                 :            :       /* Nothing to do for non-fallthru edges.  */
    1857                 :     329832 :       if (src == ENTRY_BLOCK_PTR_FOR_FN (cfun))
    1858                 :          0 :         continue;
    1859                 :     329832 :       if ((e->flags & EDGE_FALLTHRU) == 0)
    1860                 :     243711 :         continue;
    1861                 :            : 
    1862                 :            :       /* If the block does not end with a control flow insn, then we
    1863                 :            :          can trivially add a jump to the end to fixup the crossing.
    1864                 :            :          Otherwise the jump will have to go in a new bb, which will
    1865                 :            :          be handled by fix_up_fall_thru_edges function.  */
    1866                 :      86121 :       if (control_flow_insn_p (BB_END (src)))
    1867                 :      43101 :         continue;
    1868                 :            : 
    1869                 :            :       /* Make sure there's only one successor.  */
    1870                 :      43020 :       gcc_assert (single_succ_p (src));
    1871                 :            : 
    1872                 :      43020 :       new_jump = emit_jump_insn_after (targetm.gen_jump (label), BB_END (src));
    1873                 :      43020 :       BB_END (src) = new_jump;
    1874                 :      43020 :       JUMP_LABEL (new_jump) = label;
    1875                 :      43020 :       LABEL_NUSES (label) += 1;
    1876                 :            : 
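                         :            :       /* The block now ends in an unconditional jump, so it must be
                         :            :          followed by a barrier.  */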
    1877                 :      43020 :       emit_barrier_after_bb (src);
    1878                 :            : 
    1879                 :            :       /* Mark edge as non-fallthru.  */
    1880                 :      43020 :       e->flags &= ~EDGE_FALLTHRU;
    1881                 :            :     }
    1882                 :      47475 : }
    1883                 :            : 
    1884                 :            : /* Find any bb's where the fall-through edge is a crossing edge (note that
    1885                 :            :    these bb's must also contain a conditional jump or end with a call
    1886                 :            :    instruction; we've already dealt with fall-through edges for blocks
    1887                 :            :    that didn't have a conditional jump or didn't end with a call instruction
    1888                 :            :    in the call to add_labels_and_missing_jumps).  Convert the fall-through
    1889                 :            :    edge to non-crossing edge by inserting a new bb to fall-through into.
    1890                 :            :    The new bb will contain an unconditional jump (crossing edge) to the
    1891                 :            :    original fall through destination.  */
    1892                 :            : 
    1893                 :            : static void
    1894                 :      47475 : fix_up_fall_thru_edges (void)
    1895                 :            : {
    1896                 :      47475 :   basic_block cur_bb;
    1897                 :            : 
    1898                 :    1057820 :   FOR_EACH_BB_FN (cur_bb, cfun)
    1899                 :            :     {
    1900                 :    1010350 :       edge succ1;
    1901                 :    1010350 :       edge succ2;
    1902                 :    1010350 :       edge fall_thru = NULL;
    1903                 :    1010350 :       edge cond_jump = NULL;
    1904                 :            : 
    1905                 :    1010350 :       fall_thru = NULL;
    1906                 :    1010350 :       if (EDGE_COUNT (cur_bb->succs) > 0)
    1907                 :     939622 :         succ1 = EDGE_SUCC (cur_bb, 0);
    1908                 :            :       else
    1909                 :            :         succ1 = NULL;
    1910                 :            : 
    1911                 :    1010350 :       if (EDGE_COUNT (cur_bb->succs) > 1)
    1912                 :     621999 :         succ2 = EDGE_SUCC (cur_bb, 1);
    1913                 :            :       else
    1914                 :            :         succ2 = NULL;
    1915                 :            : 
    1916                 :            :       /* Find the fall-through edge.  */
    1917                 :            : 
    1918                 :    1010350 :       if (succ1
    1919                 :     939622 :           && (succ1->flags & EDGE_FALLTHRU))
    1920                 :            :         {
    1921                 :            :           fall_thru = succ1;
    1922                 :            :           cond_jump = succ2;
    1923                 :            :         }
    1924                 :     549407 :       else if (succ2
    1925                 :     422066 :                && (succ2->flags & EDGE_FALLTHRU))
    1926                 :            :         {
    1927                 :            :           fall_thru = succ2;
    1928                 :            :           cond_jump = succ1;
    1929                 :            :         }
    1930                 :    1011220 :       else if (succ2 && EDGE_COUNT (cur_bb->succs) > 2)
    1931                 :        872 :         fall_thru = find_fallthru_edge (cur_bb->succs);
    1932                 :            : 
    1933                 :     882983 :       if (fall_thru && (fall_thru->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)))
    1934                 :            :         {
    1935                 :            :           /* Check to see if the fall-thru edge is a crossing edge.  */
    1936                 :            : 
    1937                 :     838767 :           if (fall_thru->flags & EDGE_CROSSING)
    1938                 :            :             {
    1939                 :            :               /* The fall_thru edge crosses; now check the cond jump edge, if
    1940                 :            :                  it exists.  */
    1941                 :            : 
    1942                 :      43101 :               bool cond_jump_crosses = true;
    1943                 :      43101 :               int invert_worked = 0;
    1944                 :      43101 :               rtx_insn *old_jump = BB_END (cur_bb);
    1945                 :            : 
    1946                 :            :               /* Find the jump instruction, if there is one.  */
    1947                 :            : 
    1948                 :      43101 :               if (cond_jump)
    1949                 :            :                 {
    1950                 :      42956 :                   if (!(cond_jump->flags & EDGE_CROSSING))
    1951                 :      42758 :                     cond_jump_crosses = false;
    1952                 :            : 
    1953                 :            :                   /* We know the fall-thru edge crosses; if the cond
    1954                 :            :                      jump edge does NOT cross, and its destination is the
    1955                 :            :                      next block in the bb order, invert the jump
    1956                 :            :                      (i.e. fix it so the fall through does not cross and
    1957                 :            :                      the cond jump does).  */
    1958                 :            : 
    1959                 :      42758 :                   if (!cond_jump_crosses)
    1960                 :            :                     {
    1961                 :            :                       /* Find label in fall_thru block. We've already added
    1962                 :            :                          any missing labels, so there must be one.  */
    1963                 :            : 
    1964                 :      42758 :                       rtx_code_label *fall_thru_label
    1965                 :      42758 :                         = block_label (fall_thru->dest);
    1966                 :            : 
    1967                 :      42758 :                       if (old_jump && fall_thru_label)
    1968                 :            :                         {
    1969                 :      42758 :                           rtx_jump_insn *old_jump_insn
    1970                 :      84374 :                             = dyn_cast <rtx_jump_insn *> (old_jump);
    1971                 :      41273 :                           if (old_jump_insn)
    1972                 :      41273 :                             invert_worked = invert_jump (old_jump_insn,
    1973                 :            :                                                          fall_thru_label, 0);
    1974                 :            :                         }
    1975                 :            : 
    1976                 :      41273 :                       if (invert_worked)
    1977                 :            :                         {
    1978                 :      41267 :                           fall_thru->flags &= ~EDGE_FALLTHRU;
    1979                 :      41267 :                           cond_jump->flags |= EDGE_FALLTHRU;
    1980                 :      41267 :                           update_br_prob_note (cur_bb);
    1981                 :      41267 :                           std::swap (fall_thru, cond_jump);
    1982                 :      41267 :                           cond_jump->flags |= EDGE_CROSSING;
    1983                 :      41267 :                           fall_thru->flags &= ~EDGE_CROSSING;
    1984                 :            :                         }
    1985                 :            :                     }
    1986                 :            :                 }
    1987                 :            : 
    1988                 :      43101 :               if (cond_jump_crosses || !invert_worked)
    1989                 :            :                 {
    1990                 :            :                   /* This is the case where both edges out of the basic
    1991                 :            :                      block are crossing edges. Here we will fix up the
    1992                 :            :                      fall through edge. The jump edge will be taken care
    1993                 :            :                      of later.  The EDGE_CROSSING flag of the fall_thru
    1994                 :            :                      edge is cleared before the call to force_nonfallthru
    1995                 :            :                      because, if a new basic block is created, this edge
    1996                 :            :                      stays within the current section while the edge
    1997                 :            :                      between new_bb and fall_thru->dest becomes the
    1998                 :            :                      EDGE_CROSSING one.  */
    1999                 :            : 
    2000                 :       1834 :                   fall_thru->flags &= ~EDGE_CROSSING;
    2001                 :       1834 :                   basic_block new_bb = force_nonfallthru (fall_thru);
    2002                 :            : 
    2003                 :       1834 :                   if (new_bb)
    2004                 :            :                     {
    2005                 :       1834 :                       new_bb->aux = cur_bb->aux;
    2006                 :       1834 :                       cur_bb->aux = new_bb;
    2007                 :            : 
    2008                 :            :                       /* This is done by force_nonfallthru_and_redirect.  */
    2009                 :       1834 :                       gcc_assert (BB_PARTITION (new_bb)
    2010                 :            :                                   == BB_PARTITION (cur_bb));
    2011                 :            : 
    2012                 :       1834 :                       single_succ_edge (new_bb)->flags |= EDGE_CROSSING;
    2013                 :            :                     }
    2014                 :            :                   else
    2015                 :            :                     {
    2016                 :            :                       /* If a new basic-block was not created; restore
    2017                 :            :                          the EDGE_CROSSING flag.  */
    2018                 :          0 :                       fall_thru->flags |= EDGE_CROSSING;
    2019                 :            :                     }
    2020                 :            : 
    2021                 :            :                   /* Add a barrier after the new jump.  */
    2022                 :       1834 :                   emit_barrier_after_bb (new_bb ? new_bb : cur_bb);
    2023                 :            :                 }
    2024                 :            :             }
    2025                 :            :         }
    2026                 :            :     }
    2027                 :      47475 : }
    2028                 :            : 
    2029                 :            : /* This function checks the destination block of a "crossing jump" to
    2030                 :            :    see if it has any crossing predecessors that begin with a code label
    2031                 :            :    and end with an unconditional jump.  If so, it returns that predecessor
    2032                 :            :    block.  (This is to avoid creating lots of new basic blocks that all
    2033                 :            :    contain unconditional jumps to the same destination).  */
    2034                 :            : 
    2035                 :            : static basic_block
    2036                 :          0 : find_jump_block (basic_block jump_dest)
    2037                 :            : {
    2038                 :          0 :   basic_block source_bb = NULL;
    2039                 :          0 :   edge e;
    2040                 :          0 :   rtx_insn *insn;
    2041                 :          0 :   edge_iterator ei;
    2042                 :            : 
    2043                 :          0 :   FOR_EACH_EDGE (e, ei, jump_dest->preds)
    2044                 :          0 :     if (e->flags & EDGE_CROSSING)
    2045                 :            :       {
    2046                 :          0 :         basic_block src = e->src;
    2047                 :            : 
    2048                 :            :         /* Check each predecessor to see if it has a label, and contains
    2049                 :            :            only one executable instruction, which is an unconditional jump.
    2050                 :            :            If so, we can use it.  */
    2051                 :            : 
    2052                 :          0 :         if (LABEL_P (BB_HEAD (src)))
    2053                 :          0 :           for (insn = BB_HEAD (src);
    2054                 :          0 :                !INSN_P (insn) && insn != NEXT_INSN (BB_END (src));
    2055                 :          0 :                insn = NEXT_INSN (insn))
    2056                 :            :             {
    2057                 :          0 :               if (INSN_P (insn)
    2058                 :            :                   && insn == BB_END (src)
    2059                 :            :                   && JUMP_P (insn)
    2060                 :            :                   && !any_condjump_p (insn))
    2061                 :            :                 {
    2062                 :            :                   source_bb = src;
    2063                 :            :                   break;
    2064                 :            :                 }
    2065                 :            :             }
    2066                 :            : 
    2067                 :            :         if (source_bb)
    2068                 :            :           break;
    2069                 :            :       }
    2070                 :            : 
    2071                 :          0 :   return source_bb;
    2072                 :            : }
    2073                 :            : 
    2074                 :            : /* Find all BB's with conditional jumps that are crossing edges;
    2075                 :            :    insert a new bb and make the conditional jump branch to the new
    2076                 :            :    bb instead (make the new bb the same color so the conditional branch won't
    2077                 :            :    be a 'crossing' edge).  Insert an unconditional jump from the
    2078                 :            :    new bb to the original destination of the conditional jump.  */
    2079                 :            : 
    2080                 :            : static void
    2081                 :          0 : fix_crossing_conditional_branches (void)
    2082                 :            : {
    2083                 :          0 :   basic_block cur_bb;
    2084                 :          0 :   basic_block new_bb;
    2085                 :          0 :   basic_block dest;
    2086                 :          0 :   edge succ1;
    2087                 :          0 :   edge succ2;
    2088                 :          0 :   edge crossing_edge;
    2089                 :          0 :   edge new_edge;
    2090                 :          0 :   rtx set_src;
    2091                 :          0 :   rtx old_label = NULL_RTX;
    2092                 :          0 :   rtx_code_label *new_label;
    2093                 :            : 
    2094                 :          0 :   FOR_EACH_BB_FN (cur_bb, cfun)
    2095                 :            :     {
    2096                 :          0 :       crossing_edge = NULL;
    2097                 :          0 :       if (EDGE_COUNT (cur_bb->succs) > 0)
    2098                 :          0 :         succ1 = EDGE_SUCC (cur_bb, 0);
    2099                 :            :       else
    2100                 :            :         succ1 = NULL;
    2101                 :            : 
    2102                 :          0 :       if (EDGE_COUNT (cur_bb->succs) > 1)
    2103                 :          0 :         succ2 = EDGE_SUCC (cur_bb, 1);
    2104                 :            :       else
    2105                 :            :         succ2 = NULL;
    2106                 :            : 
    2107                 :            :       /* We already took care of fall-through edges, so only one successor
    2108                 :            :          can be a crossing edge.  */
    2109                 :            : 
    2110                 :          0 :       if (succ1 && (succ1->flags & EDGE_CROSSING))
    2111                 :            :         crossing_edge = succ1;
    2112                 :          0 :       else if (succ2 && (succ2->flags & EDGE_CROSSING))
    2113                 :            :         crossing_edge = succ2;
    2114                 :            : 
    2115                 :          0 :       if (crossing_edge)
    2116                 :            :         {
    2117                 :          0 :           rtx_insn *old_jump = BB_END (cur_bb);
    2118                 :            : 
    2119                 :            :           /* Check to make sure the jump instruction is a
    2120                 :            :              conditional jump.  */
    2121                 :            : 
    2122                 :          0 :           set_src = NULL_RTX;
    2123                 :            : 
    2124                 :          0 :           if (any_condjump_p (old_jump))
    2125                 :            :             {
    2126                 :          0 :               if (GET_CODE (PATTERN (old_jump)) == SET)
    2127                 :          0 :                 set_src = SET_SRC (PATTERN (old_jump));
    2128                 :          0 :               else if (GET_CODE (PATTERN (old_jump)) == PARALLEL)
    2129                 :            :                 {
    2130                 :          0 :                   set_src = XVECEXP (PATTERN (old_jump), 0,0);
    2131                 :          0 :                   if (GET_CODE (set_src) == SET)
    2132                 :          0 :                     set_src = SET_SRC (set_src);
    2133                 :            :                   else
    2134                 :            :                     set_src = NULL_RTX;
    2135                 :            :                 }
    2136                 :            :             }
    2137                 :            : 
    2138                 :          0 :           if (set_src && (GET_CODE (set_src) == IF_THEN_ELSE))
    2139                 :            :             {
    2140                 :          0 :               rtx_jump_insn *old_jump_insn =
    2141                 :          0 :                         as_a <rtx_jump_insn *> (old_jump);
    2142                 :            : 
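                         :            :               /* The branch target is whichever arm of the IF_THEN_ELSE
                         :            :                  is not PC; PC marks the fall-through arm.  */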
    2143                 :          0 :               if (GET_CODE (XEXP (set_src, 1)) == PC)
    2144                 :          0 :                 old_label = XEXP (set_src, 2);
    2145                 :          0 :               else if (GET_CODE (XEXP (set_src, 2)) == PC)
    2146                 :          0 :                 old_label = XEXP (set_src, 1);
    2147                 :            : 
    2148                 :            :               /* Check to see if new bb for jumping to that dest has
    2149                 :            :                  already been created; if so, use it; if not, create
    2150                 :            :                  a new one.  */
    2151                 :            : 
    2152                 :          0 :               new_bb = find_jump_block (crossing_edge->dest);
    2153                 :            : 
    2154                 :          0 :               if (new_bb)
    2155                 :          0 :                 new_label = block_label (new_bb);
    2156                 :            :               else
    2157                 :            :                 {
    2158                 :          0 :                   basic_block last_bb;
    2159                 :          0 :                   rtx_code_label *old_jump_target;
    2160                 :          0 :                   rtx_jump_insn *new_jump;
    2161                 :            : 
    2162                 :            :                   /* Create new basic block to be dest for
    2163                 :            :                      conditional jump.  */
    2164                 :            : 
    2165                 :            :                   /* Put appropriate instructions in new bb.  */
    2166                 :            : 
    2167                 :          0 :                   new_label = gen_label_rtx ();
    2168                 :          0 :                   emit_label (new_label);
    2169                 :            : 
    2170                 :          0 :                   gcc_assert (GET_CODE (old_label) == LABEL_REF);
    2171                 :          0 :                   old_jump_target = old_jump_insn->jump_target ();
    2172                 :          0 :                   new_jump = as_a <rtx_jump_insn *>
    2173                 :          0 :                     (emit_jump_insn (targetm.gen_jump (old_jump_target)));
    2174                 :          0 :                   new_jump->set_jump_target (old_jump_target);
    2175                 :            : 
    2176                 :          0 :                   last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb;
    2177                 :          0 :                   new_bb = create_basic_block (new_label, new_jump, last_bb);
    2178                 :          0 :                   new_bb->aux = last_bb->aux;
    2179                 :          0 :                   last_bb->aux = new_bb;
    2180                 :            : 
    2181                 :          0 :                   emit_barrier_after_bb (new_bb);
    2182                 :            : 
    2183                 :            :                   /* Make sure new bb is in same partition as source
    2184                 :            :                      of conditional branch.  */
    2185                 :          0 :                   BB_COPY_PARTITION (new_bb, cur_bb);
    2186                 :            :                 }
    2187                 :            : 
    2188                 :            :               /* Make old jump branch to new bb.  */
    2189                 :            : 
    2190                 :          0 :               redirect_jump (old_jump_insn, new_label, 0);
    2191                 :            : 
    2192                 :            :               /* Remove crossing_edge as predecessor of 'dest'.  */
    2193                 :            : 
    2194                 :          0 :               dest = crossing_edge->dest;
    2195                 :            : 
    2196                 :          0 :               redirect_edge_succ (crossing_edge, new_bb);
    2197                 :            : 
    2198                 :            :               /* Make a new edge from new_bb to old dest; new edge
    2199                 :            :                  will be a successor for new_bb and a predecessor
    2200                 :            :                  for 'dest'.  */
    2201                 :            : 
    2202                 :          0 :               if (EDGE_COUNT (new_bb->succs) == 0)
    2203                 :          0 :                 new_edge = make_single_succ_edge (new_bb, dest, 0);
    2204                 :            :               else
    2205                 :          0 :                 new_edge = EDGE_SUCC (new_bb, 0);
    2206                 :            : 
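                         :            :               /* The conditional branch now stays within its partition,
                         :            :                  while the unconditional jump out of the new bb becomes
                         :            :                  the crossing edge.  */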
    2207                 :          0 :               crossing_edge->flags &= ~EDGE_CROSSING;
    2208                 :          0 :               new_edge->flags |= EDGE_CROSSING;
    2209                 :            :             }
    2210                 :            :         }
    2211                 :            :     }
    2212                 :          0 : }
    2213                 :            : 
    2214                 :            : /* Find any unconditional branches that cross between hot and cold
    2215                 :            :    sections.  Convert them into indirect jumps instead.  */
    2216                 :            : 
    2217                 :            : static void
    2218                 :          0 : fix_crossing_unconditional_branches (void)
    2219                 :            : {
    2220                 :          0 :   basic_block cur_bb;
    2221                 :          0 :   rtx_insn *last_insn;
    2222                 :          0 :   rtx label;
    2223                 :          0 :   rtx label_addr;
    2224                 :          0 :   rtx_insn *indirect_jump_sequence;
    2225                 :          0 :   rtx_insn *jump_insn = NULL;
    2226                 :          0 :   rtx new_reg;
    2227                 :          0 :   rtx_insn *cur_insn;
    2228                 :          0 :   edge succ;
    2229                 :            : 
    2230                 :          0 :   FOR_EACH_BB_FN (cur_bb, cfun)
    2231                 :            :     {
    2232                 :          0 :       last_insn = BB_END (cur_bb);
    2233                 :            : 
    2234                 :          0 :       if (EDGE_COUNT (cur_bb->succs) < 1)
    2235                 :          0 :         continue;
    2236                 :            : 
    2237                 :          0 :       succ = EDGE_SUCC (cur_bb, 0);
    2238                 :            : 
    2239                 :            :       /* Check to see if bb ends in a crossing (unconditional) jump.  At
    2240                 :            :          this point, no crossing jumps should be conditional.  */
    2241                 :            : 
    2242                 :          0 :       if (JUMP_P (last_insn)
    2243                 :          0 :           && (succ->flags & EDGE_CROSSING))
    2244                 :            :         {
    2245                 :          0 :           gcc_assert (!any_condjump_p (last_insn));
    2246                 :            : 
    2247                 :            :           /* Make sure the jump is not already an indirect or table jump.  */
    2248                 :            : 
    2249                 :          0 :           if (!computed_jump_p (last_insn)
    2250                 :          0 :               && !tablejump_p (last_insn, NULL, NULL))
    2251                 :            :             {
    2252                 :            :               /* We have found a "crossing" unconditional branch.  Now
    2253                 :            :                  we must convert it to an indirect jump.  First create
    2254                 :            :                  a reference to the label, as the target for the jump.  */
    2255                 :            : 
    2256                 :          0 :               label = JUMP_LABEL (last_insn);
    2257                 :          0 :               label_addr = gen_rtx_LABEL_REF (Pmode, label);
    2258                 :          0 :               LABEL_NUSES (label) += 1;
    2259                 :            : 
    2260                 :            :               /* Get a register to use for the indirect jump.  */
    2261                 :            : 
    2262                 :          0 :               new_reg = gen_reg_rtx (Pmode);
    2263                 :            : 
    2264                 :            :               /* Generate the indirect jump sequence.  */
    2265                 :            : 
    2266                 :          0 :               start_sequence ();
    2267                 :          0 :               emit_move_insn (new_reg, label_addr);
    2268                 :          0 :               emit_indirect_jump (new_reg);
    2269                 :          0 :               indirect_jump_sequence = get_insns ();
    2270                 :          0 :               end_sequence ();
    2271                 :            : 
    2272                 :            :               /* Make sure every instruction in the new jump sequence has
    2273                 :            :                  its basic block set to be cur_bb.  */
    2274                 :            : 
    2275                 :          0 :               for (cur_insn = indirect_jump_sequence; cur_insn;
    2276                 :          0 :                    cur_insn = NEXT_INSN (cur_insn))
    2277                 :            :                 {
    2278                 :          0 :                   if (!BARRIER_P (cur_insn))
    2279                 :          0 :                     BLOCK_FOR_INSN (cur_insn) = cur_bb;
    2280                 :          0 :                   if (JUMP_P (cur_insn))
    2281                 :          0 :                     jump_insn = cur_insn;
    2282                 :            :                 }
    2283                 :            : 
    2284                 :            :               /* Insert the new (indirect) jump sequence immediately before
    2285                 :            :                  the unconditional jump, then delete the unconditional jump.  */
    2286                 :            : 
    2287                 :          0 :               emit_insn_before (indirect_jump_sequence, last_insn);
    2288                 :          0 :               delete_insn (last_insn);
    2289                 :            : 
    2290                 :          0 :               JUMP_LABEL (jump_insn) = label;
    2291                 :          0 :               LABEL_NUSES (label)++;
    2292                 :            : 
    2293                 :            :               /* Make BB_END for cur_bb be the jump instruction (NOT the
    2294                 :            :                  barrier instruction at the end of the sequence...).  */
    2295                 :            : 
    2296                 :          0 :               BB_END (cur_bb) = jump_insn;
    2297                 :            :             }
    2298                 :            :         }
    2299                 :            :     }
    2300                 :          0 : }
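
   The conversion above can be pictured at the source level with GNU C's
   labels-as-values extension, which is roughly what the pass emits in RTL:
   the address of the target label is loaded into a fresh register
   (emit_move_insn of the LABEL_REF) and the branch then jumps through that
   register (emit_indirect_jump).  The sketch below is an illustration only,
   not GCC internals, and needs GCC or Clang for the computed-goto extension:

       #include <stdio.h>

       int
       main (void)
       {
         /* The direct form would be:  goto cold_path;  -- a plain
            unconditional branch, which may be unable to reach a
            far-away cold section on some targets.  */
         void *target = &&cold_path;   /* like moving the LABEL_REF into
                                          a fresh register */
         goto *target;                 /* like emit_indirect_jump */

       cold_path:
         printf ("reached the (conceptually) cold code\n");
         return 0;
       }
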
    2301                 :            : 
    2302                 :            : /* Update CROSSING_JUMP_P flags on all jump insns.  */
    2303                 :            : 
    2304                 :            : static void
    2305                 :      47475 : update_crossing_jump_flags (void)
    2306                 :            : {
    2307                 :      47475 :   basic_block bb;
    2308                 :      47475 :   edge e;
    2309                 :      47475 :   edge_iterator ei;
    2310                 :            : 
    2311                 :    1057820 :   FOR_EACH_BB_FN (bb, cfun)
    2312                 :    1968740 :     FOR_EACH_EDGE (e, ei, bb->succs)
    2313                 :    1288010 :       if (e->flags & EDGE_CROSSING)
    2314                 :            :         {
    2315                 :     329610 :           if (JUMP_P (BB_END (bb)))
    2316                 :     329454 :             CROSSING_JUMP_P (BB_END (bb)) = 1;
    2317                 :            :           break;
    2318                 :            :         }
    2319                 :      47475 : }
    2320                 :            : 
    2321                 :            : /* Reorder basic blocks using the software trace cache (STC) algorithm.  */
    2322                 :            : 
    2323                 :            : static void
    2324                 :     417352 : reorder_basic_blocks_software_trace_cache (void)
    2325                 :            : {
    2326                 :     417352 :   if (dump_file)
    2327                 :         23 :     fprintf (dump_file, "\nReordering with the STC algorithm.\n\n");
    2328                 :            : 
    2329                 :     417352 :   int n_traces;
    2330                 :     417352 :   int i;
    2331                 :     417352 :   struct trace *traces;
    2332                 :            : 
    2333                 :            :   /* We estimate the length of the uncond jump insn only once, since the code
    2334                 :            :      for getting the insn length always returns the minimal length now.  */
    2335                 :     417352 :   if (uncond_jump_length == 0)
    2336                 :       7115 :     uncond_jump_length = get_uncond_jump_length ();
    2337                 :            : 
    2338                 :            :   /* We need to know some information for each basic block.  */
    2339                 :     417352 :   array_size = GET_ARRAY_SIZE (last_basic_block_for_fn (cfun));
    2340                 :     417352 :   bbd = XNEWVEC (bbro_basic_block_data, array_size);
    2341                 :   11074700 :   for (i = 0; i < array_size; i++)
    2342                 :            :     {
    2343                 :   10657300 :       bbd[i].start_of_trace = -1;
    2344                 :   10657300 :       bbd[i].end_of_trace = -1;
    2345                 :   10657300 :       bbd[i].in_trace = -1;
    2346                 :   10657300 :       bbd[i].visited = 0;
    2347                 :   10657300 :       bbd[i].priority = -1;
    2348                 :   10657300 :       bbd[i].heap = NULL;
    2349                 :   10657300 :       bbd[i].node = NULL;
    2350                 :            :     }
    2351                 :            : 
    2352                 :     417352 :   traces = XNEWVEC (struct trace, n_basic_blocks_for_fn (cfun));
    2353                 :     417352 :   n_traces = 0;
    2354                 :     417352 :   find_traces (&n_traces, traces);
    2355                 :     417352 :   connect_traces (n_traces, traces);
    2356                 :     417352 :   FREE (traces);
    2357                 :     417352 :   FREE (bbd);
    2358                 :     417352 : }
    2359                 :            : 
    2360                 :            : /* Order edges by execution frequency, higher first.  */
    2361                 :            : 
    2362                 :            : static int
    2363                 :   11729900 : edge_order (const void *ve1, const void *ve2)
    2364                 :            : {
    2365                 :   11729900 :   edge e1 = *(const edge *) ve1;
    2366                 :   11729900 :   edge e2 = *(const edge *) ve2;
    2367                 :   11729900 :   profile_count c1 = e1->count ();
    2368                 :   11729900 :   profile_count c2 = e2->count ();
    2369                 :            :   /* Since profile_count::operator< does not establish a strict weak order
    2370                 :            :      in presence of uninitialized counts, use 'max': this makes them appear
    2371                 :            :      as if having execution frequency less than any initialized count.  */
    2372                 :   11729900 :   profile_count m = c1.max (c2);
    2373                 :   23459900 :   return (m == c2) - (m == c1);
    2374                 :            : }
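
   The (m == c2) - (m == c1) idiom above produces a descending order: the
   comparator returns -1 when the first count is strictly the larger one and
   0 on ties, and the caller uses a stable sort so that ties keep their
   original order.  A minimal, self-contained sketch of the same idiom with
   plain ints (not GCC's profile_count class):

       #include <stdio.h>
       #include <stdlib.h>

       /* Descending comparator: return (m == b) - (m == a), m = max (a, b).  */
       static int
       desc_order (const void *va, const void *vb)
       {
         int a = *(const int *) va;
         int b = *(const int *) vb;
         int m = a > b ? a : b;
         return (m == b) - (m == a);   /* -1 if a > b, 1 if b > a, 0 on ties */
       }

       int
       main (void)
       {
         int counts[] = { 3, 10, 7, 10, 1 };
         qsort (counts, 5, sizeof counts[0], desc_order);
         for (int i = 0; i < 5; i++)
           printf ("%d ", counts[i]);  /* prints: 10 10 7 3 1 */
         putchar ('\n');
         return 0;
       }
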
    2375                 :            : 
    2376                 :            : /* Reorder basic blocks using the "simple" algorithm.  This tries to
    2377                 :            :    maximize the dynamic number of branches that are fallthrough, without
    2378                 :            :    copying instructions.  The algorithm is greedy, looking at the most
    2379                 :            :    frequently executed branch first.  */
    2380                 :            : 
    2381                 :            : static void
    2382                 :      18785 : reorder_basic_blocks_simple (void)
    2383                 :            : {
    2384                 :      18785 :   if (dump_file)
    2385                 :          8 :     fprintf (dump_file, "\nReordering with the \"simple\" algorithm.\n\n");
    2386                 :            : 
    2387                 :      18785 :   edge *edges = new edge[2 * n_basic_blocks_for_fn (cfun)];
    2388                 :            : 
    2389                 :            :   /* First, collect all edges that can be optimized by reordering blocks:
    2390                 :            :      simple jumps and conditional jumps, as well as the function entry edge.  */
    2391                 :            : 
    2392                 :      18785 :   int n = 0;
    2393                 :      18785 :   edges[n++] = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0);
    2394                 :            : 
    2395                 :      18785 :   basic_block bb;
    2396                 :     336482 :   FOR_EACH_BB_FN (bb, cfun)
    2397                 :            :     {
    2398                 :     317697 :       rtx_insn *end = BB_END (bb);
    2399                 :            : 
    2400                 :     317697 :       if (computed_jump_p (end) || tablejump_p (end, NULL, NULL))
    2401                 :        203 :         continue;
    2402                 :            : 
    2403                 :            :       /* We cannot optimize asm goto.  */
    2404                 :     317494 :       if (JUMP_P (end) && extract_asm_operands (end))
    2405                 :          0 :         continue;
    2406                 :            : 
    2407                 :     317494 :       if (single_succ_p (bb))
    2408                 :      86122 :         edges[n++] = EDGE_SUCC (bb, 0);
    2409                 :     231372 :       else if (any_condjump_p (end))
    2410                 :            :         {
    2411                 :     167978 :           edge e0 = EDGE_SUCC (bb, 0);
    2412                 :     167978 :           edge e1 = EDGE_SUCC (bb, 1);
    2413                 :            :           /* When optimizing for size it is best to keep the original
    2414                 :            :              fallthrough edges.  */
    2415                 :     167978 :           if (e1->flags & EDGE_FALLTHRU)
    2416                 :      96355 :             std::swap (e0, e1);
    2417                 :     167978 :           edges[n++] = e0;
    2418                 :     167978 :           edges[n++] = e1;
    2419                 :            :         }
    2420                 :            :     }
    2421                 :            : 
    2422                 :            :   /* Sort the edges, the most desirable first.  When optimizing for size
    2423                 :            :      all edges are equally desirable.  */
    2424                 :            : 
    2425                 :      18785 :   if (optimize_function_for_speed_p (cfun))
    2426                 :      18639 :     gcc_stablesort (edges, n, sizeof *edges, edge_order);
    2427                 :            : 
    2428                 :            :   /* Now decide which of those edges to make fallthrough edges.  We set
    2429                 :            :      BB_VISITED if a block already has a fallthrough successor assigned
    2430                 :            :      to it.  We make ->AUX of an endpoint point to the opposite endpoint
    2431                 :            :      of a sequence of blocks that fall through, and ->AUX will be NULL
    2432                 :            :      for a block that is in such a sequence but not an endpoint anymore.
    2433                 :            : 
    2434                 :            :      To start with, everything points to itself, nothing is assigned yet.  */
    2435                 :            : 
    2436                 :     374052 :   FOR_ALL_BB_FN (bb, cfun)
    2437                 :            :     {
    2438                 :     355267 :       bb->aux = bb;
    2439                 :     355267 :       bb->flags &= ~BB_VISITED;
    2440                 :            :     }
    2441                 :            : 
    2442                 :      18785 :   EXIT_BLOCK_PTR_FOR_FN (cfun)->aux = 0;
    2443                 :            : 
    2444                 :            :   /* Now for all edges, the most desirable first, see if that edge can
    2445                 :            :      connect two sequences.  If it can, update AUX and BB_VISITED; if it
    2446                 :            :      cannot, zero out the edge in the table.  */
    2447                 :            : 
    2448                 :     459648 :   for (int j = 0; j < n; j++)
    2449                 :            :     {
    2450                 :     440863 :       edge e = edges[j];
    2451                 :            : 
    2452                 :     440863 :       basic_block tail_a = e->src;
    2453                 :     440863 :       basic_block head_b = e->dest;
    2454                 :     440863 :       basic_block head_a = (basic_block) tail_a->aux;
    2455                 :     440863 :       basic_block tail_b = (basic_block) head_b->aux;
    2456                 :            : 
    2457                 :            :       /* An edge cannot connect two sequences if:
    2458                 :            :          - it crosses partitions;
    2459                 :            :          - its src is not a current endpoint;
    2460                 :            :          - its dest is not a current endpoint;
    2461                 :            :          - or, it would create a loop.  */
    2462                 :            : 
    2463                 :     440863 :       if (e->flags & EDGE_CROSSING
    2464                 :     440848 :           || tail_a->flags & BB_VISITED
    2465                 :     292732 :           || !tail_b
    2466                 :     255690 :           || (!(head_b->flags & BB_VISITED) && head_b != tail_b)
    2467                 :     237461 :           || tail_a == tail_b)
    2468                 :            :         {
    2469                 :     223680 :           edges[j] = 0;
    2470                 :     223680 :           continue;
    2471                 :            :         }
    2472                 :            : 
    2473                 :     217183 :       tail_a->aux = 0;
    2474                 :     217183 :       head_b->aux = 0;
    2475                 :     217183 :       head_a->aux = tail_b;
    2476                 :     217183 :       tail_b->aux = head_a;
    2477                 :     217183 :       tail_a->flags |= BB_VISITED;
    2478                 :            :     }
    2479                 :            : 
    2480                 :            :   /* Put the pieces together, in the same order that the start blocks of
    2481                 :            :      the sequences already had.  The hot/cold partitioning gives a little
    2482                 :            :      complication: as a first pass only do this for blocks in the same
    2483                 :            :      partition as the start block, and (if there is anything left to do)
    2484                 :            :      in a second pass handle the other partition.  */
    2485                 :            : 
    2486                 :      18785 :   basic_block last_tail = (basic_block) ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux;
    2487                 :            : 
    2488                 :      18785 :   int current_partition
    2489                 :      18785 :     = BB_PARTITION (last_tail == ENTRY_BLOCK_PTR_FOR_FN (cfun)
    2490                 :            :                     ? EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0)->dest
    2491                 :            :                     : last_tail);
    2492                 :      18785 :   bool need_another_pass = true;
    2493                 :            : 
    2494                 :      37579 :   for (int pass = 0; pass < 2 && need_another_pass; pass++)
    2495                 :            :     {
    2496                 :      18794 :       need_another_pass = false;
    2497                 :            : 
    2498                 :     336550 :       FOR_EACH_BB_FN (bb, cfun)
    2499                 :     317756 :         if ((bb->flags & BB_VISITED && bb->aux) || bb->aux == bb)
    2500                 :            :           {
    2501                 :     100534 :             if (BB_PARTITION (bb) != current_partition)
    2502                 :            :               {
    2503                 :         20 :                 need_another_pass = true;
    2504                 :         20 :                 continue;
    2505                 :            :               }
    2506                 :            : 
    2507                 :     100514 :             last_tail->aux = bb;
    2508                 :     100514 :             last_tail = (basic_block) bb->aux;
    2509                 :            :           }
    2510                 :            : 
    2511                 :      18794 :       current_partition ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
    2512                 :            :     }
    2513                 :            : 
    2514                 :      18785 :   last_tail->aux = 0;
    2515                 :            : 
    2516                 :            :   /* Finally, link all the chosen fallthrough edges.  */
    2517                 :            : 
    2518                 :     459648 :   for (int j = 0; j < n; j++)
    2519                 :     440863 :     if (edges[j])
    2520                 :     217183 :       edges[j]->src->aux = edges[j]->dest;
    2521                 :            : 
    2522                 :      18785 :   delete[] edges;
    2523                 :            : 
    2524                 :            :   /* If the entry edge no longer falls through we have to make a new
    2525                 :            :      block so it can do so again.  */
    2526                 :            : 
    2527                 :      18785 :   edge e = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0);
    2528                 :      18785 :   if (e->dest != ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux)
    2529                 :            :     {
    2530                 :         24 :       force_nonfallthru (e);
    2531                 :         24 :       e->src->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux;
    2532                 :            :     }
    2533                 :      18785 : }
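
   The endpoint bookkeeping used above (AUX pointing to the opposite end of a
   fallthrough chain, BB_VISITED marking blocks whose fallthrough successor is
   already fixed) can be sketched in isolation.  The struct and function names
   below are hypothetical, not GCC API, and the EDGE_CROSSING test is omitted:

       #include <assert.h>
       #include <stdbool.h>
       #include <stddef.h>

       struct blk
       {
         struct blk *aux;   /* opposite endpoint of this block's chain;
                               NULL for interior blocks; initially itself */
         bool visited;      /* fallthrough successor already assigned?  */
       };

       /* Try to let HEAD_B fall through after TAIL_A; mirror the endpoint
          checks of reorder_basic_blocks_simple.  */
       static bool
       connect_chains (struct blk *tail_a, struct blk *head_b)
       {
         struct blk *head_a = tail_a->aux;
         struct blk *tail_b = head_b->aux;

         if (tail_a->visited                           /* src already has a succ */
             || !tail_b                                /* dest is interior */
             || (!head_b->visited && head_b != tail_b) /* dest is a chain tail,
                                                          so it has a pred */
             || tail_a == tail_b)                      /* would close a cycle */
           return false;

         tail_a->aux = NULL;       /* the joined blocks become interior... */
         head_b->aux = NULL;
         head_a->aux = tail_b;     /* ...and the merged chain's outer endpoints
                                      point at each other */
         tail_b->aux = head_a;
         tail_a->visited = true;
         return true;
       }

       int
       main (void)
       {
         struct blk a = { &a, false }, b = { &b, false };
         assert (connect_chains (&a, &b));     /* chain is now a -> b */
         assert (a.aux == &b && b.aux == &a);  /* endpoints linked */
         assert (!connect_chains (&b, &a));    /* refused: would close a cycle */
         return 0;
       }
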
    2534                 :            : 
    2535                 :            : /* Reorder basic blocks.  The main entry point to this file.  */
    2536                 :            : 
    2537                 :            : static void
    2538                 :     687985 : reorder_basic_blocks (void)
    2539                 :            : {
    2540                 :     687985 :   gcc_assert (current_ir_type () == IR_RTL_CFGLAYOUT);
    2541                 :            : 
    2542                 :     687985 :   if (n_basic_blocks_for_fn (cfun) <= NUM_FIXED_BLOCKS + 1)
    2543                 :            :     return;
    2544                 :            : 
    2545                 :     436137 :   set_edge_can_fallthru_flag ();
    2546                 :     436137 :   mark_dfs_back_edges ();
    2547                 :            : 
    2548                 :     436137 :   switch (flag_reorder_blocks_algorithm)
    2549                 :            :     {
    2550                 :      18785 :     case REORDER_BLOCKS_ALGORITHM_SIMPLE:
    2551                 :      18785 :       reorder_basic_blocks_simple ();
    2552                 :      18785 :       break;
    2553                 :            : 
    2554                 :     417352 :     case REORDER_BLOCKS_ALGORITHM_STC:
    2555                 :     417352 :       reorder_basic_blocks_software_trace_cache ();
    2556                 :     417352 :       break;
    2557                 :            : 
    2558                 :          0 :     default:
    2559                 :          0 :       gcc_unreachable ();
    2560                 :            :     }
    2561                 :            : 
    2562                 :     436137 :   relink_block_chain (/*stay_in_cfglayout_mode=*/true);
    2563                 :            : 
    2564                 :     436137 :   if (dump_file)
    2565                 :            :     {
    2566                 :         31 :       if (dump_flags & TDF_DETAILS)
    2567                 :          0 :         dump_reg_info (dump_file);
    2568                 :         31 :       dump_flow_info (dump_file, dump_flags);
    2569                 :            :     }
    2570                 :            : 
    2571                 :            :   /* Signal that rtl_verify_flow_info_1 can now verify that there
    2572                 :            :      is at most one switch between hot/cold sections.  */
    2573                 :     436137 :   crtl->bb_reorder_complete = true;
    2574                 :            : }
    2575                 :            : 
    2576                 :            : /* Determine which partition the first basic block in the function
    2577                 :            :    belongs to, then find the first basic block in the current function
    2578                 :            :    that belongs to a different section, and insert a
    2579                 :            :    NOTE_INSN_SWITCH_TEXT_SECTIONS note immediately before it in the
    2580                 :            :    instruction stream.  When writing out the assembly code,
    2581                 :            :    encountering this note will make the compiler switch between the
    2582                 :            :    hot and cold text sections.  */
    2583                 :            : 
    2584                 :            : void
    2585                 :      47475 : insert_section_boundary_note (void)
    2586                 :            : {
    2587                 :      47475 :   basic_block bb;
    2588                 :      47475 :   bool switched_sections = false;
    2589                 :      47475 :   int current_partition = 0;
    2590                 :            : 
    2591                 :      47475 :   if (!crtl->has_bb_partition)
    2592                 :            :     return;
    2593                 :            : 
    2594                 :    1098810 :   FOR_EACH_BB_FN (bb, cfun)
    2595                 :            :     {
    2596                 :    1051340 :       if (!current_partition)
    2597                 :      47475 :         current_partition = BB_PARTITION (bb);
    2598                 :    1051340 :       if (BB_PARTITION (bb) != current_partition)
    2599                 :            :         {
    2600                 :      47474 :           gcc_assert (!switched_sections);
    2601                 :      47474 :           switched_sections = true;
    2602                 :      47474 :           emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
    2603                 :      47474 :           current_partition = BB_PARTITION (bb);
    2604                 :            :         }
    2605                 :            :     }
    2606                 :            : 
    2607                 :            :   /* Make sure crtl->has_bb_partition matches reality even if bbpart finds
    2608                 :            :      some hot and some cold basic blocks, but later one of those kinds is
    2609                 :            :      optimized away.  */
    2610                 :      47475 :   crtl->has_bb_partition = switched_sections;
    2611                 :            : }
    2612                 :            : 
    2613                 :            : namespace {
    2614                 :            : 
    2615                 :            : const pass_data pass_data_reorder_blocks =
    2616                 :            : {
    2617                 :            :   RTL_PASS, /* type */
    2618                 :            :   "bbro", /* name */
    2619                 :            :   OPTGROUP_NONE, /* optinfo_flags */
    2620                 :            :   TV_REORDER_BLOCKS, /* tv_id */
    2621                 :            :   0, /* properties_required */
    2622                 :            :   0, /* properties_provided */
    2623                 :            :   0, /* properties_destroyed */
    2624                 :            :   0, /* todo_flags_start */
    2625                 :            :   0, /* todo_flags_finish */
    2626                 :            : };
    2627                 :            : 
    2628                 :            : class pass_reorder_blocks : public rtl_opt_pass
    2629                 :            : {
    2630                 :            : public:
    2631                 :     201440 :   pass_reorder_blocks (gcc::context *ctxt)
    2632                 :     402880 :     : rtl_opt_pass (pass_data_reorder_blocks, ctxt)
    2633                 :            :   {}
    2634                 :            : 
    2635                 :            :   /* opt_pass methods: */
    2636                 :     960236 :   virtual bool gate (function *)
    2637                 :            :     {
    2638                 :     960236 :       if (targetm.cannot_modify_jumps_p ())
    2639                 :            :         return false;
    2640                 :     960236 :       return (optimize > 0
    2641                 :     960236 :               && (flag_reorder_blocks || flag_reorder_blocks_and_partition));
    2642                 :            :     }
    2643                 :            : 
    2644                 :            :   virtual unsigned int execute (function *);
    2645                 :            : 
    2646                 :            : }; // class pass_reorder_blocks
    2647                 :            : 
    2648                 :            : unsigned int
    2649                 :     687985 : pass_reorder_blocks::execute (function *fun)
    2650                 :            : {
    2651                 :     687985 :   basic_block bb;
    2652                 :            : 
    2653                 :            :   /* Last attempt to optimize CFG, as scheduling, peepholing and insn
    2654                 :            :      splitting possibly introduced more crossjumping opportunities.  */
    2655                 :     687985 :   cfg_layout_initialize (CLEANUP_EXPENSIVE);
    2656                 :            : 
    2657                 :     687985 :   reorder_basic_blocks ();
    2658                 :     687985 :   cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_NO_PARTITIONING);
    2659                 :            : 
    2660                 :    7455630 :   FOR_EACH_BB_FN (bb, fun)
    2661                 :    6767640 :     if (bb->next_bb != EXIT_BLOCK_PTR_FOR_FN (fun))
    2662                 :    6079660 :       bb->aux = bb->next_bb;
    2663                 :     687985 :   cfg_layout_finalize ();
    2664                 :            : 
    2665                 :    7562050 :   FOR_EACH_BB_FN (bb, fun)
    2666                 :    6874070 :     df_recompute_luids (bb);
    2667                 :     687985 :   return 0;
    2668                 :            : }
    2669                 :            : 
    2670                 :            : } // anon namespace
    2671                 :            : 
    2672                 :            : rtl_opt_pass *
    2673                 :     201440 : make_pass_reorder_blocks (gcc::context *ctxt)
    2674                 :            : {
    2675                 :     201440 :   return new pass_reorder_blocks (ctxt);
    2676                 :            : }
    2677                 :            : 
    2678                 :            : /* Duplicate a block (that we already know ends in a computed jump) into its
    2679                 :            :    predecessors, where possible.  Return whether anything is changed.  */
    2680                 :            : static bool
    2681                 :        709 : maybe_duplicate_computed_goto (basic_block bb, int max_size)
    2682                 :            : {
    2683                 :        709 :   if (single_pred_p (bb))
    2684                 :            :     return false;
    2685                 :            : 
    2686                 :            :   /* Make sure that the block is small enough.  */
    2687                 :        155 :   rtx_insn *insn;
    2688                 :       1931 :   FOR_BB_INSNS (bb, insn)
    2689                 :        950 :     if (INSN_P (insn))
    2690                 :            :       {
    2691                 :        564 :         max_size -= get_attr_min_length (insn);
    2692                 :        564 :         if (max_size < 0)
    2693                 :            :            return false;
    2694                 :            :       }
    2695                 :            : 
    2696                 :         93 :   bool changed = false;
    2697                 :         93 :   edge e;
    2698                 :         93 :   edge_iterator ei;
    2699                 :        343 :   for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
    2700                 :            :     {
    2701                 :        250 :       basic_block pred = e->src;
    2702                 :            : 
    2703                 :            :       /* Do not duplicate BB into PRED if that is the last predecessor, or if
    2704                 :            :          we cannot merge a copy of BB with PRED.  */
    2705                 :        250 :       if (single_pred_p (bb)
    2706                 :        194 :           || !single_succ_p (pred)
    2707                 :         95 :           || e->flags & EDGE_COMPLEX
    2708                 :         95 :           || pred->index < NUM_FIXED_BLOCKS
    2709                 :         95 :           || (JUMP_P (BB_END (pred)) && !simplejump_p (BB_END (pred)))
    2710                 :        345 :           || (JUMP_P (BB_END (pred)) && CROSSING_JUMP_P (BB_END (pred))))
    2711                 :            :         {
    2712                 :        157 :           ei_next (&ei);
    2713                 :        157 :           continue;
    2714                 :            :         }
    2715                 :            : 
    2716                 :         93 :       if (dump_file)
    2717                 :          0 :         fprintf (dump_file, "Duplicating computed goto bb %d into bb %d\n",
    2718                 :          0 :                  bb->index, e->src->index);
    2719                 :            : 
    2720                 :            :       /* Remember if PRED can be duplicated; if so, the copy of BB merged
    2721                 :            :          with PRED can be duplicated as well.  */
    2722                 :         93 :       bool can_dup_more = can_duplicate_block_p (pred);
    2723                 :            : 
    2724                 :            :       /* Make a copy of BB, merge it into PRED.  */
    2725                 :         93 :       basic_block copy = duplicate_block (bb, e, NULL);
    2726                 :         93 :       emit_barrier_after_bb (copy);
    2727                 :         93 :       reorder_insns_nobb (BB_HEAD (copy), BB_END (copy), BB_END (pred));
    2728                 :         93 :       merge_blocks (pred, copy);
    2729                 :            : 
    2730                 :         93 :       changed = true;
    2731                 :            : 
    2732                 :            :       /* Try to merge the resulting merged PRED into further predecessors.  */
    2733                 :         93 :       if (can_dup_more)
    2734                 :         93 :         maybe_duplicate_computed_goto (pred, max_size);
    2735                 :            :     }
    2736                 :            : 
    2737                 :            :   return changed;
    2738                 :            : }
    2739                 :            : 
    2740                 :            : /* Duplicate the blocks containing computed gotos.  This basically unfactors
    2741                 :            :    computed gotos that were factored early on in the compilation process to
    2742                 :            :    speed up edge-based data flow.  We used to not unfactor them again, which
    2743                 :            :    can seriously pessimize code with many computed jumps in the source code,
    2744                 :            :    such as interpreters.  See e.g. PR15242.  */
    2745                 :            : static void
    2746                 :     602266 : duplicate_computed_gotos (function *fun)
    2747                 :            : {
    2748                 :            :   /* We estimate the length of the uncond jump insn only once,
    2749                 :            :      since the code for getting the insn length always returns
    2750                 :            :      the minimal length now.  */
    2751                 :     602266 :   if (uncond_jump_length == 0)
    2752                 :      80328 :     uncond_jump_length = get_uncond_jump_length ();
    2753                 :            : 
    2754                 :            :   /* Never copy a block larger than this.  */
    2755                 :     602266 :   int max_size
    2756                 :     602266 :     = uncond_jump_length * param_max_goto_duplication_insns;
    2757                 :            : 
    2758                 :     602266 :   bool changed = false;
    2759                 :            : 
    2760                 :            :   /* Try to duplicate all blocks that end in a computed jump and that
    2761                 :            :      can be duplicated at all.  */
    2762                 :     602266 :   basic_block bb;
    2763                 :    7214370 :   FOR_EACH_BB_FN (bb, fun)
    2764                 :    6612100 :     if (computed_jump_p (BB_END (bb)) && can_duplicate_block_p (bb))
    2765                 :        616 :       changed |= maybe_duplicate_computed_goto (bb, max_size);
    2766                 :            : 
    2767                 :            :   /* Duplicating blocks will redirect edges and may cause hot blocks
    2768                 :            :     previously reached by both hot and cold blocks to become dominated
    2769                 :            :     only by cold blocks.  */
    2770                 :     602266 :   if (changed)
    2771                 :         70 :     fixup_partitions ();
    2772                 :     602266 : }
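
   The interpreter case mentioned above (see PR15242) can be illustrated with
   GNU C computed gotos.  In the "factored" form every handler branches to one
   shared indirect jump; after duplication each handler ends in its own copy of
   the dispatch jump, giving the branch predictor per-opcode history.  The
   opcodes below are hypothetical and the code is an illustration only, not
   GCC internals (GCC/Clang labels-as-values extension):

       #include <stdio.h>

       static int
       run (const unsigned char *prog)
       {
         /* Dispatch table of label addresses (GNU C extension).  */
         static void *const dispatch[] = { &&op_inc, &&op_dec, &&op_halt };
         int acc = 0;

         goto *dispatch[*prog++];      /* initial dispatch */

       op_inc:
         acc++;
         goto *dispatch[*prog++];      /* duplicated ("unfactored") dispatch */

       op_dec:
         acc--;
         goto *dispatch[*prog++];      /* duplicated ("unfactored") dispatch */

       op_halt:
         return acc;
       }

       int
       main (void)
       {
         const unsigned char prog[] = { 0, 0, 1, 2 };  /* inc, inc, dec, halt */
         printf ("%d\n", run (prog));                  /* prints 1 */
         return 0;
       }
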
    2773                 :            : 
    2774                 :            : namespace {
    2775                 :            : 
    2776                 :            : const pass_data pass_data_duplicate_computed_gotos =
    2777                 :            : {
    2778                 :            :   RTL_PASS, /* type */
    2779                 :            :   "compgotos", /* name */
    2780                 :            :   OPTGROUP_NONE, /* optinfo_flags */
    2781                 :            :   TV_REORDER_BLOCKS, /* tv_id */
    2782                 :            :   0, /* properties_required */
    2783                 :            :   0, /* properties_provided */
    2784                 :            :   0, /* properties_destroyed */
    2785                 :            :   0, /* todo_flags_start */
    2786                 :            :   0, /* todo_flags_finish */
    2787                 :            : };
    2788                 :            : 
    2789                 :            : class pass_duplicate_computed_gotos : public rtl_opt_pass
    2790                 :            : {
    2791                 :            : public:
    2792                 :     201440 :   pass_duplicate_computed_gotos (gcc::context *ctxt)
    2793                 :     402880 :     : rtl_opt_pass (pass_data_duplicate_computed_gotos, ctxt)
    2794                 :            :   {}
    2795                 :            : 
    2796                 :            :   /* opt_pass methods: */
    2797                 :            :   virtual bool gate (function *);
    2798                 :            :   virtual unsigned int execute (function *);
    2799                 :            : 
    2800                 :            : }; // class pass_duplicate_computed_gotos
    2801                 :            : 
    2802                 :            : bool
    2803                 :     960236 : pass_duplicate_computed_gotos::gate (function *fun)
    2804                 :            : {
    2805                 :     960236 :   if (targetm.cannot_modify_jumps_p ())
    2806                 :            :     return false;
    2807                 :     960236 :   return (optimize > 0
    2808                 :     687987 :           && flag_expensive_optimizations
    2809                 :    1605420 :           && ! optimize_function_for_size_p (fun));
    2810                 :            : }
    2811                 :            : 
    2812                 :            : unsigned int
    2813                 :     602266 : pass_duplicate_computed_gotos::execute (function *fun)
    2814                 :            : {
    2815                 :     602266 :   duplicate_computed_gotos (fun);
    2816                 :            : 
    2817                 :     602266 :   return 0;
    2818                 :            : }
    2819                 :            : 
    2820                 :            : } // anon namespace
    2821                 :            : 
    2822                 :            : rtl_opt_pass *
    2823                 :     201440 : make_pass_duplicate_computed_gotos (gcc::context *ctxt)
    2824                 :            : {
    2825                 :     201440 :   return new pass_duplicate_computed_gotos (ctxt);
    2826                 :            : }
    2827                 :            : 
    2828                 :            : /* This function is the main 'entrance' for the optimization that
    2829                 :            :    partitions hot and cold basic blocks into separate sections of the
    2830                 :            :    .o file (to improve performance and cache locality).  Ideally it
    2831                 :            :    would be called after all optimizations that rearrange the CFG have
    2832                 :            :    been called.  However part of this optimization may introduce new
    2833                 :            :    register usage, so it must be called before register allocation has
    2834                 :            :    occurred.  This means that this optimization is actually called
    2835                 :            :    well before the optimization that reorders basic blocks (see
    2836                 :            :    function above).
    2837                 :            : 
    2838                 :            :    This optimization checks the feedback information to determine
    2839                 :            :    which basic blocks are hot/cold and updates flags on the basic blocks
    2840                 :            :    to indicate which section they belong in.  This information is
    2841                 :            :    later used for writing out sections in the .o file.  Because hot
    2842                 :            :    and cold sections can be arbitrarily large (within the bounds of
    2843                 :            :    memory), far beyond the size of a single function, it is necessary
    2844                 :            :    to fix up all edges that cross section boundaries, to make sure the
    2845                 :            :    instructions used can actually span the required distance.  The
    2846                 :            :    fixes are described below.
    2847                 :            : 
    2848                 :            :    Fall-through edges must be changed into jumps; it is not safe or
    2849                 :            :    legal to fall through across a section boundary.  Whenever a
    2850                 :            :    fall-through edge crossing a section boundary is encountered, a new
    2851                 :            :    basic block is inserted (in the same section as the fall-through
    2852                 :            :    source), and the fall through edge is redirected to the new basic
    2853                 :            :    block.  The new basic block contains an unconditional jump to the
    2854                 :            :    original fall-through target.  (If the unconditional jump is
    2855                 :            :    insufficient to cross section boundaries, that is dealt with a
    2856                 :            :    little later, see below).
    2857                 :            : 
    2858                 :            :    In order to deal with architectures that have short conditional
    2859                 :            :    branches (which cannot span all of memory) we take any conditional
    2860                 :            :    jump that attempts to cross a section boundary and add a level of
    2861                 :            :    indirection: it becomes a conditional jump to a new basic block, in
    2862                 :            :    the same section.  The new basic block contains an unconditional
    2863                 :            :    jump to the original target, in the other section.
    2864                 :            : 
    2865                 :            :    For those architectures whose unconditional branch is also
    2866                 :            :    incapable of reaching all of memory, those unconditional jumps are
    2867                 :            :    converted into indirect jumps, through a register.
    2868                 :            : 
    2869                 :            :    IMPORTANT NOTE: This optimization causes some messy interactions
    2870                 :            :    with the cfg cleanup optimizations; those optimizations want to
    2871                 :            :    merge blocks wherever possible, and to collapse indirect jump
    2872                 :            :    sequences (change "A jumps to B jumps to C" directly into "A jumps
    2873                 :            :    to C").  Those optimizations can undo the jump fixes that
    2874                 :            :    partitioning is required to make (see above), in order to ensure
    2875                 :            :    that jumps attempting to cross section boundaries are really able
    2876                 :            :    to cover whatever distance the jump requires (on many architectures
    2877                 :            :    conditional or unconditional jumps are not able to reach all of
    2878                 :            :    memory).  Therefore tests have to be inserted into each such
    2879                 :            :    optimization to make sure that it does not undo stuff necessary to
    2880                 :            :    cross partition boundaries.  This would be much less of a problem
    2881                 :            :    if we could perform this optimization later in the compilation, but
    2882                 :            :    unfortunately the fact that we may need to create indirect jumps
    2883                 :            :    (through registers) requires that this optimization be performed
    2884                 :            :    before register allocation.
    2885                 :            : 
    2886                 :            :    Hot and cold basic blocks are partitioned and put in separate
    2887                 :            :    sections of the .o file, to reduce paging and improve cache
    2888                 :            :    performance (hopefully).  This can result in bits of code from the
    2889                 :            :    same function being widely separated in the .o file.  However this
    2890                 :            :    is not apparent from the current bb structure.  Therefore we must take
    2891                 :            :    care to ensure that: 1). There are no fall_thru edges that cross
    2892                 :            :    between sections; 2). For those architectures which have "short"
    2893                 :            :    conditional branches, all conditional branches that attempt to
    2894                 :            :    cross between sections are converted to unconditional branches;
    2895                 :            :    and, 3). For those architectures which have "short" unconditional
    2896                 :            :    branches, all unconditional branches that attempt to cross between
    2897                 :            :    sections are converted to indirect jumps.
    2898                 :            : 
    2899                 :            :    The code for fixing up fall_thru edges that cross between hot and
    2900                 :            :    cold basic blocks does so by creating new basic blocks containing
    2901                 :            :    unconditional branches to the appropriate label in the "other"
    2902                 :            :    section.  The new basic block is then put in the same (hot or cold)
    2903                 :            :    section as the original conditional branch, and the fall_thru edge
    2904                 :            :    is modified to fall into the new basic block instead.  By adding
    2905                 :            :    this level of indirection we end up with only unconditional branches
    2906                 :            :    crossing between hot and cold sections.
    2907                 :            : 
    2908                 :            :    Conditional branches are dealt with by adding a level of indirection.
    2909                 :            :    A new basic block is added in the same (hot/cold) section as the
    2910                 :            :    conditional branch, and the conditional branch is retargeted to the
    2911                 :            :    new basic block.  The new basic block contains an unconditional branch
    2912                 :            :    to the original target of the conditional branch (in the other section).
    2913                 :            : 
    2914                 :            :    Unconditional branches are dealt with by converting them into
    2915                 :            :    indirect jumps.  */
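
   A hand-written source-level analogy of the conditional-branch fix described
   above (hypothetical; the pass performs it on RTL in
   fix_crossing_conditional_branches): the condition keeps a short branch to a
   trampoline in the same section, and only the trampoline's unconditional jump
   has to span the section boundary (and may in turn become an indirect jump,
   as described for unconditional branches):

       #include <stdio.h>

       /* Before: the conditional branch itself would have to reach the
          far-away cold code.  */
       static void
       before (int err)
       {
         if (err)
           goto cold_handler;       /* conditional branch, possibly too short */
         printf ("hot path\n");
         return;
       cold_handler:
         printf ("cold path\n");
       }

       /* After: the conditional branch targets a nearby trampoline; only the
          trampoline's unconditional jump must cross the section boundary.  */
       static void
       after (int err)
       {
         if (err)
           goto trampoline;         /* short branch within the hot section */
         printf ("hot path\n");
         return;
       trampoline:
         goto cold_handler;         /* long/indirect jump to the cold section */
       cold_handler:
         printf ("cold path\n");
       }

       int
       main (void)
       {
         before (0);
         after (1);
         return 0;
       }
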
    2916                 :            : 
    2917                 :            : namespace {
    2918                 :            : 
    2919                 :            : const pass_data pass_data_partition_blocks =
    2920                 :            : {
    2921                 :            :   RTL_PASS, /* type */
    2922                 :            :   "bbpart", /* name */
    2923                 :            :   OPTGROUP_NONE, /* optinfo_flags */
    2924                 :            :   TV_REORDER_BLOCKS, /* tv_id */
    2925                 :            :   PROP_cfglayout, /* properties_required */
    2926                 :            :   0, /* properties_provided */
    2927                 :            :   0, /* properties_destroyed */
    2928                 :            :   0, /* todo_flags_start */
    2929                 :            :   0, /* todo_flags_finish */
    2930                 :            : };
    2931                 :            : 
    2932                 :            : class pass_partition_blocks : public rtl_opt_pass
    2933                 :            : {
    2934                 :            : public:
    2935                 :     201440 :   pass_partition_blocks (gcc::context *ctxt)
    2936                 :     402880 :     : rtl_opt_pass (pass_data_partition_blocks, ctxt)
    2937                 :            :   {}
    2938                 :            : 
    2939                 :            :   /* opt_pass methods: */
    2940                 :            :   virtual bool gate (function *);
    2941                 :            :   virtual unsigned int execute (function *);
    2942                 :            : 
    2943                 :            : }; // class pass_partition_blocks
    2944                 :            : 
    2945                 :            : bool
    2946                 :     960236 : pass_partition_blocks::gate (function *fun)
    2947                 :            : {
    2948                 :            :   /* The optimization to partition hot/cold basic blocks into separate
    2949                 :            :      sections of the .o file does not work well with linkonce or with
    2950                 :            :      user-defined section attributes or with the naked attribute.  Don't
    2951                 :            :      call it if any of these cases arises.  */
    2952                 :     960236 :   return (flag_reorder_blocks_and_partition
    2953                 :     478371 :           && optimize
    2954                 :            :           /* See pass_reorder_blocks::gate.  We should not partition if
    2955                 :            :              we are going to omit the reordering.  */
    2956                 :     478357 :           && optimize_function_for_speed_p (fun)
    2957                 :     448573 :           && !DECL_COMDAT_GROUP (current_function_decl)
    2958                 :     376741 :           && !lookup_attribute ("section", DECL_ATTRIBUTES (fun->decl))
    2959                 :     376735 :           && !lookup_attribute ("naked", DECL_ATTRIBUTES (fun->decl))
    2960                 :            :           /* Workaround a bug in GDB where read_partial_die doesn't cope
    2961                 :            :              with DIEs with DW_AT_ranges, see PR81115.  */
    2962                 :    1336920 :           && !(in_lto_p && MAIN_NAME_P (DECL_NAME (fun->decl))));
    2963                 :            : }
    2964                 :            : 
    2965                 :            : unsigned
    2966                 :     369160 : pass_partition_blocks::execute (function *fun)
    2967                 :            : {
    2968                 :     369160 :   vec<edge> crossing_edges;
    2969                 :            : 
    2970                 :     369160 :   if (n_basic_blocks_for_fn (fun) <= NUM_FIXED_BLOCKS + 1)
    2971                 :            :     return 0;
    2972                 :            : 
    2973                 :     189691 :   df_set_flags (DF_DEFER_INSN_RESCAN);
    2974                 :            : 
    2975                 :     189691 :   crossing_edges = find_rarely_executed_basic_blocks_and_crossing_edges ();
    2976                 :     189691 :   if (!crossing_edges.exists ())
    2977                 :            :     /* Make sure to process deferred rescans and clear changeable df flags.  */
    2978                 :            :     return TODO_df_finish;
    2979                 :            : 
    2980                 :      47475 :   crtl->has_bb_partition = true;
    2981                 :            : 
    2982                 :            :   /* Make sure the source of any crossing edge ends in a jump and the
    2983                 :            :      destination of any crossing edge has a label.  */
    2984                 :      47475 :   add_labels_and_missing_jumps (crossing_edges);
    2985                 :            : 
    2986                 :            :   /* Convert each crossing fall_thru edge into a non-crossing fall
    2987                 :            :      thru to a new block ending in an unconditional jump (that jumps
    2988                 :            :      to the original fall-through dest).  */
    2989                 :      47475 :   fix_up_fall_thru_edges ();
    2990                 :            : 
    2991                 :            :   /* If the architecture does not have conditional branches that can
    2992                 :            :      span all of memory, convert crossing conditional branches into
    2993                 :            :      crossing unconditional branches.  */
    2994                 :      47475 :   if (!HAS_LONG_COND_BRANCH)
    2995                 :            :     fix_crossing_conditional_branches ();
    2996                 :            : 
    2997                 :            :   /* If the architecture does not have unconditional branches that
    2998                 :            :      can span all of memory, convert crossing unconditional branches
    2999                 :            :      into indirect jumps.  Since adding an indirect jump also adds
    3000                 :            :      a new register usage, update the register usage information as
    3001                 :            :      well.  */
    3002                 :      47475 :   if (!HAS_LONG_UNCOND_BRANCH)
    3003                 :            :     fix_crossing_unconditional_branches ();
    3004                 :            : 
    3005                 :      47475 :   update_crossing_jump_flags ();
    3006                 :            : 
    3007                 :            :   /* Clear bb->aux fields that the above routines were using.  */
    3008                 :      47475 :   clear_aux_for_blocks ();
    3009                 :            : 
    3010                 :      47475 :   crossing_edges.release ();
    3011                 :            : 
    3012                 :            :   /* ??? FIXME: DF generates the bb info for a block immediately.
    3013                 :            :      And by immediately, I mean *during* creation of the block.
    3014                 :            : 
    3015                 :            :         #0  df_bb_refs_collect
    3016                 :            :         #1  in df_bb_refs_record
    3017                 :            :         #2  in create_basic_block_structure
    3018                 :            : 
    3019                 :            :      Which means that the bb_has_eh_pred test in df_bb_refs_collect
    3020                 :            :      will *always* fail, because no edges can have been added to the
    3021                 :            :      block yet.  Which of course means we don't add the right 
    3022                 :            :      artificial refs, which means we fail df_verify (much) later.
    3023                 :            : 
    3024                 :            :      Cleanest solution would seem to make DF_DEFER_INSN_RESCAN imply
    3025                 :            :      that we also shouldn't grab data from the new blocks those new
    3026                 :            :      insns are in either.  In this way one can create the block, link
    3027                 :            :      it up properly, and have everything Just Work later, when deferred
    3028                 :            :      insns are processed.
    3029                 :            : 
    3030                 :            :      In the meantime, we have no other option but to throw away all
    3031                 :            :      of the DF data and recompute it all.  */
    3032                 :      47475 :   if (fun->eh->lp_array)
    3033                 :            :     {
    3034                 :      47475 :       df_finish_pass (true);
    3035                 :      47475 :       df_scan_alloc (NULL);
    3036                 :      47475 :       df_scan_blocks ();
    3037                 :            :       /* Not all post-landing pads use all of the EH_RETURN_DATA_REGNO
    3038                 :            :          data.  We blindly generated all of them when creating the new
    3039                 :            :          landing pad.  Delete those assignments we don't use.  */
    3040                 :      47475 :       df_set_flags (DF_LR_RUN_DCE);
    3041                 :      47475 :       df_analyze ();
    3042                 :            :     }
    3043                 :            : 
    3044                 :            :   /* Make sure to process deferred rescans and clear changeable df flags.  */
    3045                 :            :   return TODO_df_finish;
    3046                 :            : }
    3047                 :            : 
    3048                 :            : } // anon namespace
    3049                 :            : 
    3050                 :            : rtl_opt_pass *
    3051                 :     201440 : make_pass_partition_blocks (gcc::context *ctxt)
    3052                 :            : {
    3053                 :     201440 :   return new pass_partition_blocks (ctxt);
    3054                 :            : }

Generated by: LCOV version 1.0

The LCOV profile is generated on an x86_64 machine using the following configure options: configure --disable-bootstrap --enable-coverage=opt --enable-languages=c,c++,fortran,go,jit,lto --enable-host-shared. The GCC test suite is run with the built compiler.