gcc/0143-Backport-SME-mode-switching-Add-a-backprop-hook.patch

From cb4189b45a3a411958ab6aa85108f6dc7516acf5 Mon Sep 17 00:00:00 2001
From: Richard Sandiford <richard.sandiford@arm.com>
Date: Sat, 11 Nov 2023 17:29:00 +0000
Subject: [PATCH 044/157] [Backport][SME] mode-switching: Add a backprop hook

Reference: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=fc8458e20a524d053f576d64a606e21f8bd03b84

This patch adds a way for targets to ask that selected mode changes
be brought forward, through a combination of:

(1) requiring a mode in blocks where the entity was previously
    transparent

(2) pushing the transition at the head of a block onto incomging edges

SME has two uses for this:

- A "one-shot" entity that, for any given path of execution,
  either stays off or makes exactly one transition from off to on.
  This relies only on (1) above; see the hook description for more info.

  The main purpose of using mode-switching for this entity is to
  shrink-wrap the code that requires it.

- A second entity for which all transitions must be from known
  modes, which is enforced using a combination of (1) and (2).
  More specifically, (1) looks for edges B1->B2 for which:

  - B2 requires a specific mode and
  - B1 does not guarantee a specific starting mode

  In this system, such an edge is only possible if the entity is
  transparent in B1.  (1) then forces B1 to require some safe common
  mode.  Applying this inductively means that all incoming edges are
  from known modes.  If different edges give different starting modes,
  (2) pushes the transitions onto the edges themselves; this only
  happens if the entity is not transparent in some predecessor block.

The patch also uses the back-propagation as an excuse to do a simple
on-the-fly optimisation.

Hopefully the comments in the patch explain things a bit better.

gcc/
	* target.def (mode_switching.backprop): New hook.
	* doc/tm.texi.in (TARGET_MODE_BACKPROP): New @hook.
	* doc/tm.texi: Regenerate.
	* mode-switching.cc (struct bb_info): Add single_succ.
	(confluence_info): Add transp field.
	(single_succ_confluence_n, single_succ_transfer): New functions.
	(backprop_confluence_n, backprop_transfer): Likewise.
	(optimize_mode_switching): Use them.  Push mode transitions onto
	a block's incoming edges, if the backprop hook requires it.
---
 gcc/doc/tm.texi       |  28 +++++
 gcc/doc/tm.texi.in    |   2 +
 gcc/mode-switching.cc | 275 ++++++++++++++++++++++++++++++++++++++++++
 gcc/target.def        |  29 +++++
 4 files changed, 334 insertions(+)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index d7053ec9e..5f0972356 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10322,6 +10322,34 @@ The hook should return the number of modes if no suitable mode exists
 for the given arguments.
 @end deftypefn

+@deftypefn {Target Hook} int TARGET_MODE_BACKPROP (int @var{entity}, int @var{mode1}, int @var{mode2})
+If defined, the mode-switching pass uses this hook to back-propagate mode
+requirements through blocks that have no mode requirements of their own.
+Specifically, @var{mode1} is the mode that @var{entity} has on exit
+from a block B1 (say) and @var{mode2} is the mode that the next block
+requires @var{entity} to have.  B1 does not have any mode requirements
+of its own.
+
+The hook should return the mode that it prefers or requires @var{entity}
+to have in B1, or the number of modes if there is no such requirement.
+If the hook returns a required mode for more than one of B1's outgoing
+edges, those modes are combined as for @code{TARGET_MODE_CONFLUENCE}.
+
+For example, suppose there is a ``one-shot'' entity that,
+for a given execution of a function, either stays off or makes exactly
+one transition from off to on.  It is safe to make the transition at any
+time, but it is better not to do so unnecessarily.  This hook allows the
+function to manage such an entity without having to track its state at
+runtime.  Specifically. the entity would have two modes, 0 for off and
+1 for on, with 2 representing ``don't know''.  The system is forbidden from
+transitioning from 2 to 1, since 2 represents the possibility that the
+entity is already on (and the aim is to avoid having to emit code to
+check for that case).  This hook would therefore return 1 when @var{mode1}
+is 2 and @var{mode2} is 1, which would force the entity to be on in the
+source block.  Applying this inductively would remove all transitions
+in which the previous state is unknown.
+@end deftypefn
+
 @deftypefn {Target Hook} int TARGET_MODE_ENTRY (int @var{entity})
 If this hook is defined, it is evaluated for every @var{entity} that
 needs mode switching.  It should return the mode that @var{entity} is
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index d420e62fd..fcab21744 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -6924,6 +6924,8 @@ mode or ``no mode'', depending on context.

 @hook TARGET_MODE_CONFLUENCE

+@hook TARGET_MODE_BACKPROP
+
 @hook TARGET_MODE_ENTRY

 @hook TARGET_MODE_EXIT
diff --git a/gcc/mode-switching.cc b/gcc/mode-switching.cc
index 065767902..c2a0f0294 100644
--- a/gcc/mode-switching.cc
+++ b/gcc/mode-switching.cc
@@ -81,6 +81,7 @@ struct bb_info
   int computing;
   int mode_out;
   int mode_in;
+  int single_succ;
 };

 /* Clear ode I from entity J in bitmap B.  */
@@ -508,6 +509,9 @@ struct
   /* Information about each basic block, indexed by block id.  */
   struct bb_info *bb_info;

+  /* A bitmap of blocks for which the current entity is transparent.  */
+  sbitmap transp;
+
   /* The entity that we're processing.  */
   int entity;

@@ -579,6 +583,210 @@ forward_transfer (int bb_index)
   return true;
 }

+/* A backwards confluence function.  Update the the bb_info single_succ
+   field for E's source block, based on changes to E's destination block.
+   At the end of the dataflow problem, single_succ is the single mode
+   that all successors require (directly or indirectly), or no_mode
+   if there are conflicting requirements.
+
+   Initially, a value of no_mode + 1 means "don't know".  */
+
+static bool
+single_succ_confluence_n (edge e)
+{
+  /* The entry block has no associated mode information.  */
+  if (e->src->index == ENTRY_BLOCK)
+    return false;
+
+  /* We don't control mode changes across abnormal edges.  */
+  if (e->flags & EDGE_ABNORMAL)
+    return false;
+
+  /* Do nothing if we've already found a conflict.  */
+  struct bb_info *bb_info = confluence_info.bb_info;
+  int no_mode = confluence_info.no_mode;
+  int src_mode = bb_info[e->src->index].single_succ;
+  if (src_mode == no_mode)
+    return false;
+
+  /* Work out what mode the destination block (or its successors) require.  */
+  int dest_mode;
+  if (e->dest->index == EXIT_BLOCK)
+    dest_mode = no_mode;
+  else if (bitmap_bit_p (confluence_info.transp, e->dest->index))
+    dest_mode = bb_info[e->dest->index].single_succ;
+  else
+    dest_mode = bb_info[e->dest->index].seginfo->mode;
+
+  /* Do nothing if the destination block has no new information.  */
+  if (dest_mode == no_mode + 1 || dest_mode == src_mode)
+    return false;
+
+  /* Detect conflicting modes.  */
+  if (src_mode != no_mode + 1)
+    dest_mode = no_mode;
+
+  bb_info[e->src->index].single_succ = dest_mode;
+  return true;
+}
+
+/* A backward transfer function for computing the bb_info single_succ
+   fields, as described above single_succ_confluence.  */
+
+static bool
+single_succ_transfer (int bb_index)
+{
+  /* We don't have any field to transfer to.  Assume that, after the
+     first iteration, we are only called if single_succ has changed.
+     We should then process incoming edges if the entity is transparent.  */
+  return bitmap_bit_p (confluence_info.transp, bb_index);
+}
+
+/* Check whether the target wants to back-propagate a mode change across
+   edge E, and update the source block's computed mode if so.  Return true
+   if something changed.  */
+
+static bool
+backprop_confluence_n (edge e)
+{
+  /* The entry and exit blocks have no useful mode information.  */
+  if (e->src->index == ENTRY_BLOCK || e->dest->index == EXIT_BLOCK)
+    return false;
+
+  /* We don't control mode changes across abnormal edges.  */
+  if (e->flags & EDGE_ABNORMAL)
+    return false;
+
+  /* We can only require a new mode in the source block if the entity
+     was originally transparent there.  */
+  if (!bitmap_bit_p (confluence_info.transp, e->src->index))
+    return false;
+
+  /* Exit now if there is no required mode, or if all paths into the
+     source block leave the entity in the required mode.  */
+  struct bb_info *bb_info = confluence_info.bb_info;
+  int no_mode = confluence_info.no_mode;
+  int src_mode = bb_info[e->src->index].mode_out;
+  int dest_mode = bb_info[e->dest->index].mode_in;
+  if (dest_mode == no_mode || src_mode == dest_mode)
+    return false;
+
+  /* See what the target thinks about this transition.  */
+  int entity = confluence_info.entity;
+  int new_mode = targetm.mode_switching.backprop (entity, src_mode,
+						  dest_mode);
+  if (new_mode == no_mode)
+    return false;
+
+  /* The target doesn't like the current transition, but would be happy
+     with a transition from NEW_MODE.
+
+     If we force the source block to use NEW_MODE, we might introduce a
+     double transition on at least one path through the function (one to
+     NEW_MODE and then one to DEST_MODE).  Therefore, if all destination
+     blocks require the same mode, it is usually better to bring that
+     mode requirement forward.
+
+     If that isn't possible, merge the preference for this edge with
+     the preferences for other edges.  no_mode + 1 indicates that there
+     was no previous preference.  */
+  int old_mode = bb_info[e->src->index].computing;
+  if (bb_info[e->src->index].single_succ != no_mode)
+    new_mode = bb_info[e->src->index].single_succ;
+  else if (old_mode != no_mode + 1)
+    new_mode = mode_confluence (entity, old_mode, new_mode, no_mode);
+
+  if (old_mode == new_mode)
+    return false;
+
+  bb_info[e->src->index].computing = new_mode;
+  return true;
+}
+
+/* If the current entity was originally transparent in block BB_INDEX,
+   update the incoming mode to match the outgoing mode.  Register a mode
+   change if the entity is no longer transparent.
+
+   Also, as an on-the-fly optimization, check whether the entity was
+   originally transparent in BB_INDEX and if all successor blocks require
+   the same mode.  If so, anticipate the mode change in BB_INDEX if
+   doing it on the incoming edges would require no more mode changes than
+   doing it on the outgoing edges.  The aim is to reduce the total number
+   of mode changes emitted for the function (and thus reduce code size and
+   cfg complexity) without increasing the number of mode changes on any
+   given path through the function.  A typical case where it helps is:
+
+	  T
+	 / \
+	T   M
+	 \ /
+	  M
+
+   where the entity is transparent in the T blocks and is required to have
+   mode M in the M blocks.  If there are no redundancies leading up to this,
+   there will be two mutually-exclusive changes to mode M, one on each of
+   the T->M edges.  The optimization instead converts it to:
+
+	  T            T            M
+	 / \          / \          / \
+	T   M   ->   M   M   ->   M   M
+	 \ /          \ /          \ /
+	  M            M            M
+
+   which creates a single transition to M for both paths through the diamond.
+
+   Return true if something changed.  */
+
+static bool
+backprop_transfer (int bb_index)
+{
+  /* The entry and exit blocks have no useful mode information.  */
+  if (bb_index == ENTRY_BLOCK || bb_index == EXIT_BLOCK)
+    return false;
+
+  /* We can only require a new mode if the entity was previously
+     transparent.  */
+  if (!bitmap_bit_p (confluence_info.transp, bb_index))
+    return false;
+
+  struct bb_info *bb_info = confluence_info.bb_info;
+  basic_block bb = BASIC_BLOCK_FOR_FN (cfun, bb_index);
+  int no_mode = confluence_info.no_mode;
+  int mode_in = bb_info[bb_index].mode_in;
+  int mode_out = bb_info[bb_index].computing;
+  if (mode_out == no_mode + 1)
+    {
+      /* The entity is still transparent for this block.  See whether
+	 all successor blocks need the same mode, either directly or
+	 indirectly.  */
+      mode_out = bb_info[bb_index].single_succ;
+      if (mode_out == no_mode)
+	return false;
+
+      /* Get a minimum bound on the number of transitions that would be
+	 removed if BB itself required MODE_OUT.  */
+      unsigned int moved = 0;
+      for (edge e : bb->succs)
+	if (e->dest->index != EXIT_BLOCK
+	    && mode_out == bb_info[e->dest->index].seginfo->mode)
+	  moved += 1;
+
+      /* See whether making the mode change on all incoming edges would
+	 be no worse than making it on MOVED outgoing edges.  */
+      if (moved < EDGE_COUNT (bb->preds))
+	return false;
+
+      bb_info[bb_index].mode_out = mode_out;
+      bb_info[bb_index].computing = mode_out;
+    }
+  else if (mode_out == mode_in)
+    return false;
+
+  bb_info[bb_index].mode_in = mode_out;
+  bb_info[bb_index].seginfo->mode = mode_out;
+  return true;
+}
+
 /* Find all insns that need a particular mode setting, and insert the
    necessary mode switches.  Return true if we did work.  */

@@ -684,6 +892,7 @@ optimize_mode_switching (void)
 	}

       confluence_info.bb_info = info;
+      confluence_info.transp = nullptr;
       confluence_info.entity = entity;
       confluence_info.no_mode = no_mode;

@@ -695,6 +904,9 @@ optimize_mode_switching (void)

     };

+  if (targetm.mode_switching.backprop)
+    clear_aux_for_edges ();
+
   for (j = n_entities - 1; j >= 0; j--)
     {
       int e = entity_map[j];
@@ -817,6 +1029,53 @@ optimize_mode_switching (void)
 	    }
 	}

+      /* If the target requests it, back-propagate selected mode requirements
+	 through transparent blocks.  */
+      if (targetm.mode_switching.backprop)
+	{
+	  /* First work out the mode on entry to and exit from each block.  */
+	  forwprop_mode_info (info, e, no_mode);
+
+	  /* Compute the single_succ fields, as described above
+	     single_succ_confluence.  */
+	  FOR_EACH_BB_FN (bb, cfun)
+	    info[bb->index].single_succ = no_mode + 1;
+
+	  confluence_info.transp = transp_all;
+	  bitmap_set_range (blocks, 0, last_basic_block_for_fn (cfun));
+	  df_simple_dataflow (DF_BACKWARD, NULL, NULL,
+			      single_succ_confluence_n,
+			      single_succ_transfer, blocks,
+			      df_get_postorder (DF_BACKWARD),
+			      df_get_n_blocks (DF_BACKWARD));
+
+	  FOR_EACH_BB_FN (bb, cfun)
+	    {
+	      /* Repurpose mode_in as the first mode required by the block,
+		 or the output mode if none.  */
+	      if (info[bb->index].seginfo->mode != no_mode)
+		info[bb->index].mode_in = info[bb->index].seginfo->mode;
+
+	      /* In transparent blocks, use computing == no_mode + 1
+		 to indicate that no propagation has taken place.  */
+	      if (info[bb->index].computing == no_mode)
+		info[bb->index].computing = no_mode + 1;
+	    }
+
+	  bitmap_set_range (blocks, 0, last_basic_block_for_fn (cfun));
+	  df_simple_dataflow (DF_BACKWARD, NULL, NULL, backprop_confluence_n,
+			      backprop_transfer, blocks,
+			      df_get_postorder (DF_BACKWARD),
+			      df_get_n_blocks (DF_BACKWARD));
+
+	  /* Any block that now computes a mode is no longer transparent.  */
+	  FOR_EACH_BB_FN (bb, cfun)
+	    if (info[bb->index].computing == no_mode + 1)
+	      info[bb->index].computing = no_mode;
+	    else if (info[bb->index].computing != no_mode)
+	      bitmap_clear_bit (transp_all, bb->index);
+	}
+
       /* Set the anticipatable and computing arrays.  */
       for (i = 0; i < no_mode; i++)
 	{
@@ -900,6 +1159,22 @@ optimize_mode_switching (void)
 	  for (i = 0; i < no_mode; i++)
 	    if (mode_bit_p (del[bb->index], j, i))
 	      info[bb->index].seginfo->mode = no_mode;
+
+	  /* See whether the target can perform the first transition.
+	     If not, push it onto the incoming edges.  The earlier backprop
+	     pass should ensure that the resulting transitions are valid.  */
+	  if (targetm.mode_switching.backprop)
+	    {
+	      int from_mode = info[bb->index].mode_in;
+	      int to_mode = info[bb->index].seginfo->mode;
+	      if (targetm.mode_switching.backprop (entity_map[j], from_mode,
+						   to_mode) != no_mode)
+		{
+		  for (edge e : bb->preds)
+		    e->aux = (void *) (intptr_t) (to_mode + 1);
+		  info[bb->index].mode_in = to_mode;
+		}
+	    }
 	}

       /* Now output the remaining mode sets in all the segments.  */
diff --git a/gcc/target.def b/gcc/target.def
index 1e2091ed3..4d77c1523 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -7042,6 +7042,35 @@ The hook should return the number of modes if no suitable mode exists\n\
 for the given arguments.",
  int, (int entity, int mode1, int mode2), NULL)

+DEFHOOK
+(backprop,
+ "If defined, the mode-switching pass uses this hook to back-propagate mode\n\
+requirements through blocks that have no mode requirements of their own.\n\
+Specifically, @var{mode1} is the mode that @var{entity} has on exit\n\
+from a block B1 (say) and @var{mode2} is the mode that the next block\n\
+requires @var{entity} to have.  B1 does not have any mode requirements\n\
+of its own.\n\
+\n\
+The hook should return the mode that it prefers or requires @var{entity}\n\
+to have in B1, or the number of modes if there is no such requirement.\n\
+If the hook returns a required mode for more than one of B1's outgoing\n\
+edges, those modes are combined as for @code{TARGET_MODE_CONFLUENCE}.\n\
+\n\
+For example, suppose there is a ``one-shot'' entity that,\n\
+for a given execution of a function, either stays off or makes exactly\n\
+one transition from off to on.  It is safe to make the transition at any\n\
+time, but it is better not to do so unnecessarily.  This hook allows the\n\
+function to manage such an entity without having to track its state at\n\
+runtime.  Specifically. the entity would have two modes, 0 for off and\n\
+1 for on, with 2 representing ``don't know''.  The system is forbidden from\n\
+transitioning from 2 to 1, since 2 represents the possibility that the\n\
+entity is already on (and the aim is to avoid having to emit code to\n\
+check for that case).  This hook would therefore return 1 when @var{mode1}\n\
+is 2 and @var{mode2} is 1, which would force the entity to be on in the\n\
+source block.  Applying this inductively would remove all transitions\n\
+in which the previous state is unknown.",
+ int, (int entity, int mode1, int mode2), NULL)
+
 DEFHOOK
 (entry,
  "If this hook is defined, it is evaluated for every @var{entity} that\n\
--
2.33.0