[llvm] Support save/restore point splitting in shrink-wrap #119359

Open
wants to merge 5 commits into
base: main
from

Conversation

enoskova-sc
Contributor

@enoskova-sc enoskova-sc commented Dec 10, 2024

This patch introduces the "-enable-shrink-wrap-into-multiple-points"
option, which enables splitting Save and Restore points during the ShrinkWrap pass, i.e.
inserting register saves and restores as close as possible to their uses.

The current algorithm disables Save / Restore point splitting for
functions that contain instructions with FrameIndex operands,
EHPads, or any stack accesses, because it is difficult to prove that splitting is safe in those cases.

This patch also adds support for multiple Save / Restore points, for RISCV only.

Now ShrinkWrap produces:

  • list of SavePoint + Registers
  • list of RestorePoint + Registers
  • Prolog (NCD, the nearest common dominator, of the Save points)
  • Epilog (NCPD, the nearest common post-dominator, of the Restore points)
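
To illustrate how a single Prolog block can be derived from several Save points, here is a minimal sketch of a nearest-common-dominator (NCD) query over a list of blocks, built by folding the pairwise query. The diff declares `findNearestCommonDominator(ArrayRef<MachineBasicBlock *>)` on `MachineDominatorTree`, but its implementation is not visible in the truncated patch, so `ToyDomTree`, `ncd`, and `ncdOfAll` below are hypothetical names on a toy tree and not the patch's code. The Epilog case would be symmetric on a post-dominator tree.

```cpp
#include <cassert>
#include <vector>

// Toy dominator tree (hypothetical, for illustration only).
// IDom[B] is the immediate dominator of block B; the entry block
// (index 0) is recorded as its own immediate dominator.
struct ToyDomTree {
  std::vector<int> IDom;

  // Depth of a block in the dominator tree (the entry has depth 0).
  int depth(int B) const {
    int D = 0;
    while (IDom[B] != B) {
      B = IDom[B];
      ++D;
    }
    return D;
  }

  // Nearest common dominator of two blocks: lift the deeper block
  // to the same depth, then walk both up until they meet.
  int ncd(int A, int B) const {
    int DA = depth(A), DB = depth(B);
    while (DA > DB) { A = IDom[A]; --DA; }
    while (DB > DA) { B = IDom[B]; --DB; }
    while (A != B) { A = IDom[A]; B = IDom[B]; }
    return A;
  }

  // Nearest common dominator of a non-empty list of blocks,
  // computed by folding the pairwise query over the list.
  int ncdOfAll(const std::vector<int> &Blocks) const {
    assert(!Blocks.empty() && "need at least one block");
    int Result = Blocks.front();
    for (int B : Blocks)
      Result = ncd(Result, B);
    return Result;
  }
};
```

With `IDom = {0, 0, 0, 1, 1, 2}`, blocks 3 and 4 share block 1 as their nearest common dominator, while adding block 5 pushes the result up to the entry block 0.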

Shrink-Wrap points split Part 5.
RFC: https://discourse.llvm.org/t/shrink-wrap-save-restore-points-splitting/83581

Part 1: #117862
Part 2: #119355
Part 3: #119357
Part 4: #119358

Without this patch, the ScalableVector frame index property is used before it is assigned.
More precisely, consider RISCVFrameLowering::assignCalleeSavedSpillSlots.
In this function we split the callee-saved registers into scalar and vector ones,
based on the ScalableVector property of their frame indexes:
```
  ...
  const auto &UnmanagedCSI = getUnmanagedCSI(*MF, CSI);
  const auto &RVVCSI = getRVVCalleeSavedInfo(*MF, CSI);
  ...
```
However, the ScalableVector property is only assigned several lines below:
```
  ...
  auto storeRegToStackSlot = [&](decltype(UnmanagedCSI) CSInfo) {
    for (auto &CS : CSInfo) {
      // Insert the spill to the stack frame.
      Register Reg = CS.getReg();
      const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
      TII.storeRegToStackSlot(MBB, MI, Reg, !MBB.isLiveIn(Reg),
                              CS.getFrameIdx(), RC, TRI, Register());
    }
  };
  storeRegToStackSlot(UnmanagedCSI);
  ...
```
Because of this, the list of RVV callee-saved registers is always empty at that point.
The problem does not currently manifest, but if the code is changed slightly,
for example by placing some instructions between the scalar and vector spills,
the resulting code will be ill-formed.
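
The ordering problem can be reproduced on a toy model. The sketch below is not the RISCV frame-lowering code: `ToyCSI`, `vectorRegs`, and `assignSlotKinds` are hypothetical stand-ins that only show why filtering on a property before it has been assigned yields an empty list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a callee-saved entry and its stack slot.
struct ToyCSI {
  int Reg;             // register number
  bool IsVectorReg;    // what kind of register this really is
  bool SlotIsScalable; // slot property; false until it is assigned
};

// Collect the registers whose slot is already marked as scalable.
std::vector<int> vectorRegs(const std::vector<ToyCSI> &CSI) {
  std::vector<int> Out;
  for (const ToyCSI &CS : CSI)
    if (CS.SlotIsScalable)
      Out.push_back(CS.Reg);
  return Out;
}

// Assign the slot property from the register kind.
void assignSlotKinds(std::vector<ToyCSI> &CSI) {
  for (ToyCSI &CS : CSI)
    CS.SlotIsScalable = CS.IsVectorReg;
}
```

Calling `vectorRegs` before `assignSlotKinds` returns an empty list even when vector registers are present; calling it afterwards returns them, which is the ordering this part of the series is meant to guarantee.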

Currently, MIR supports specifying only one save point and one restore point:

```
  savePoint:       '%bb.1'
  restorePoint:    '%bb.2'
```

This patch makes it possible to specify multiple save and multiple restore points in MIR:

```
  savePoints:
    - point:           '%bb.1'
  restorePoints:
    - point:           '%bb.2'
```
@llvmbot
Member

llvmbot commented Dec 10, 2024

@llvm/pr-subscribers-backend-webassembly
@llvm/pr-subscribers-backend-hexagon
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-backend-arm

Author: Elizaveta Noskova (enoskova-sc)

Patch is 393.94 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119359.diff

358 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/MIRYamlMapping.h (+36-7)
  • (modified) llvm/include/llvm/CodeGen/MachineDominators.h (+5)
  • (modified) llvm/include/llvm/CodeGen/MachineFrameInfo.h (+131-9)
  • (modified) llvm/include/llvm/CodeGen/TargetFrameLowering.h (+4)
  • (modified) llvm/lib/CodeGen/MIRParser/MIRParser.cpp (+41-14)
  • (modified) llvm/lib/CodeGen/MIRPrinter.cpp (+34-11)
  • (modified) llvm/lib/CodeGen/MachineDominators.cpp (+16)
  • (modified) llvm/lib/CodeGen/MachineFrameInfo.cpp (+17)
  • (modified) llvm/lib/CodeGen/PrologEpilogInserter.cpp (+206-64)
  • (modified) llvm/lib/CodeGen/ShrinkWrap.cpp (+392-108)
  • (modified) llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp (+24-16)
  • (modified) llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (+2-3)
  • (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+83-17)
  • (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.h (+6)
  • (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp (+47)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/store-merging-debug.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-ldst-no-premature-sp-pop.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-mov-debug-locs.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/aarch64st1.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/cfi-fixup-multi-block-prologue.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/cfi-fixup.mir (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/dont-shrink-wrap-stack-mayloadorstore.mir (+13-7)
  • (modified) llvm/test/CodeGen/AArch64/early-ifcvt-regclass-mismatch.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/emit_fneg_with_non_register_operand.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/irg-nomem.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/jump-table-duplicate.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/ldst-nopreidx-sp-redzone.mir (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/live-debugvalues-sve.mir (+12-2)
  • (modified) llvm/test/CodeGen/AArch64/loop-sink-limit.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/loop-sink.mir (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/machine-latecleanup-inlineasm.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/nested-iv-regalloc.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/regalloc-last-chance-recolor-with-split.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/shrinkwrap-split-restore-point.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sink-and-fold-illegal-shift.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sink-and-fold-preserve-debugloc.mir (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/split-deadloop.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/stack-probing-last-in-block.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/tail-dup-redundant-phi.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/taildup-addrtaken.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame-predecrement.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame-scavenge.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame1.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame2.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame3.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame4.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame5.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame6.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame7.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-frame8.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-save-lrpair1.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-save-lrpair2.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh-save-lrpair3.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh2.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh3.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh4.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh5.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh6.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh7.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh8.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh9.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wineh_shrinkwrap.mir (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-multiple-mem-operands-nontemporal-1.mir (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-multiple-mem-operands-nontemporal-2.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/codesize-ifcvt.mir (+6-6)
  • (modified) llvm/test/CodeGen/ARM/constant-island-movwt.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/constant-islands-cfg.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/constant-islands-split-IT.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/execute-only-save-cpsr.mir (+8-8)
  • (modified) llvm/test/CodeGen/ARM/fp16-litpool2-arm.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/fp16-litpool3-arm.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/inlineasmbr-if-cvt.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/invalidated-save-point.ll (+2-2)
  • (modified) llvm/test/CodeGen/ARM/jump-table-dbg-value.mir (+2-2)
  • (modified) llvm/test/CodeGen/ARM/stack_frame_offset.mir (+6-6)
  • (modified) llvm/test/CodeGen/Hexagon/cext-opt-block-addr.mir (+4-4)
  • (modified) llvm/test/CodeGen/Hexagon/early-if-predicator.mir (+2-2)
  • (modified) llvm/test/CodeGen/Hexagon/machine-sink-float-usr.mir (+4-4)
  • (modified) llvm/test/CodeGen/Hexagon/pipeliner/swp-phi-start.mir (+2-2)
  • (modified) llvm/test/CodeGen/MIR/ARM/thumb2-sub-sp-t3.mir (+2-2)
  • (modified) llvm/test/CodeGen/MIR/Generic/frame-info.mir (+2-2)
  • (modified) llvm/test/CodeGen/MIR/Hexagon/addrmode-opt-nonreaching.mir (+2-2)
  • (modified) llvm/test/CodeGen/MIR/RISCV/machine-function-info.mir (+2-2)
  • (modified) llvm/test/CodeGen/MIR/X86/branch-folder-with-label.mir (+6-6)
  • (modified) llvm/test/CodeGen/MIR/X86/diexpr-win32.mir (+4-4)
  • (modified) llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir (+2-2)
  • (modified) llvm/test/CodeGen/MIR/X86/frame-info-save-restore-points.mir (+12-4)
  • (modified) llvm/test/CodeGen/MIR/X86/inline-asm-rm-exhaustion.mir (+6-6)
  • (modified) llvm/test/CodeGen/Mips/delay-slot-filler-bundled-insts-def-use.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/delay-slot-filler-bundled-insts.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/indirect-jump-hazard/guards-verify-call.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/indirect-jump-hazard/guards-verify-tailcall.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dext-pos.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dext-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dextm-pos-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dextm-pos.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dextm-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dextu-pos-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dextu-pos.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dextu-size-valid.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dextu-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dins-pos-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dins-pos.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dins-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dinsm-pos-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dinsm-pos.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dinsm-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dinsu-pos-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dinsu-pos.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/dinsu-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/ext-pos-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/ext-pos.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/ext-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/ins-pos-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/ins-pos.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/instverify/ins-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-fp-micromips.mir (+4-4)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-fp-micromipsr6.mir (+4-4)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-fp-mips.mir (+4-4)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-fp-mipsr6.mir (+4-4)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-int-microMIPS.mir (+16-16)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-int-micromipsr6.mir (+24-24)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-int-mips64.mir (+12-12)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-int-mips64r6.mir (+24-24)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-int-mipsr6.mir (+24-24)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-int.mir (+12-12)
  • (modified) llvm/test/CodeGen/Mips/longbranch/branch-limits-msa.mir (+20-20)
  • (modified) llvm/test/CodeGen/Mips/micromips-eva.mir (+4-4)
  • (modified) llvm/test/CodeGen/Mips/micromips-short-delay-slot.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/micromips-sizereduction/micromips-lwp-swp.mir (+8-8)
  • (modified) llvm/test/CodeGen/Mips/micromips-sizereduction/micromips-no-lwp-swp.mir (+8-8)
  • (modified) llvm/test/CodeGen/Mips/mirparser/target-flags-pic-mxgot-tls.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/mirparser/target-flags-pic-o32.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/mirparser/target-flags-pic.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/mirparser/target-flags-static-tls.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/msa/emergency-spill.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/sll-micromips-r6-encoding.mir (+2-2)
  • (modified) llvm/test/CodeGen/Mips/unaligned-memops-mapping.mir (+4-4)
  • (modified) llvm/test/CodeGen/NVPTX/proxy-reg-erasure.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/DisableHoistingDueToBlockHotnessNoProfileData.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/DisableHoistingDueToBlockHotnessProfileData.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/NoCRFieldRedefWhenSpillingCRBIT.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/alignlongjumptest.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/block-placement-1.mir (+4-4)
  • (modified) llvm/test/CodeGen/PowerPC/block-placement.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/collapse-rotates.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/common-chain-aix32.ll (+3-3)
  • (modified) llvm/test/CodeGen/PowerPC/common-chain.ll (+4-4)
  • (modified) llvm/test/CodeGen/PowerPC/convert-rr-to-ri-instrs-R0-special-handling.mir (+14-14)
  • (modified) llvm/test/CodeGen/PowerPC/convert-rr-to-ri-instrs-out-of-range.mir (+40-40)
  • (modified) llvm/test/CodeGen/PowerPC/convert-rr-to-ri-instrs.mir (+178-178)
  • (modified) llvm/test/CodeGen/PowerPC/ctrloop-do-not-duplicate-mi.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/livevars-crash2.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/loop-instr-form-prepare.ll (+14-14)
  • (modified) llvm/test/CodeGen/PowerPC/lsr-profitable-chain.ll (+16-16)
  • (modified) llvm/test/CodeGen/PowerPC/more-dq-form-prepare.ll (+64-64)
  • (modified) llvm/test/CodeGen/PowerPC/peephole-phi-acc.mir (+8-8)
  • (modified) llvm/test/CodeGen/PowerPC/peephole-replaceInstr-after-eliminate-extsw.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/phi-eliminate.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/pr43527.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/remove-copy-crunsetcrbit.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/remove-implicit-use.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/remove-redundant-li-skip-imp-kill.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/remove-self-copies.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/rlwinm_rldicl_to_andi.mir (+12-12)
  • (modified) llvm/test/CodeGen/PowerPC/schedule-addi-load.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/setcr_bc.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/setcr_bc2.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/setcr_bc3.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/shrink-wrap.ll (+102-102)
  • (modified) llvm/test/CodeGen/PowerPC/tls_get_addr_fence1.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/tls_get_addr_fence2.mir (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/two-address-crash.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/live-sp.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/pr53662.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/addi-rvv-stack-object.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/emergency-slot.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/large-rvv-stack-size.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rvv-stack-align.mir (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/undef-earlyclobber-chain.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/wrong-stack-offset-for-rvv-object.mir (+2-2)
  • (added) llvm/test/CodeGen/RISCV/shrinkwrap-split.mir (+282)
  • (modified) llvm/test/CodeGen/RISCV/stack-slot-coloring.mir (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/zcmp-prolog-epilog-crash.mir (+6-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/add_reduce.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/begin-vpt-without-inst.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-default.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize-strd-lr.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-mov.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/dont-ignore-vctp.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/dont-remove-loop-update.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/emptyblock.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/end-positive-offset.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/extract-element.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/incorrect-sub-16.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/incorrect-sub-32.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/incorrect-sub-8.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpnot-1.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpnot-2.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpnot-3.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpsel-1.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpsel-2.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-chain-store.mir (+4-4)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-chain.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-itercount.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-random.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-chain.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-prev-iteration.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-liveout.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/lstp-insertion-position.mir (+4-4)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/massive.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix-debug.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-after-dls.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-after-dlstp.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-lr-terminator.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/move-def-before-start.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/move-start-after-def.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/multiblock-massive.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/multiple-do-loops.mir (+6-6)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-reduct-livein-arg.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/no-dec-cbnz.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/no-dec-reorder.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/no-dec.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/no-vpsel-liveout.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/non-masked-load.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/non-masked-store.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/out-of-range-cbz.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/remove-elem-moves.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/revert-after-call.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/revert-after-read.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/revert-after-write.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/revert-non-header.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/revert-non-loop.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/revert-while.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/safe-def-no-mov.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/safe-retaining.mir (+1-1)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/size-limit.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-debug.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-vpt-debug.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/switch.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/unrolled-and-vector.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/unsafe-cpsr-loop-def.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/unsafe-cpsr-loop-use.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/unsafe-use-after.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vaddv.mir (+38-38)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-add-operand-liveout.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-in-vpt-2.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-in-vpt.mir (+8-8)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-subi3.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-subri.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-subri12.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp16-reduce.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vmaxmin_vpred_r.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vmldava_in_vpt.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vpt-block-debug.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/vpt-blocks.mir (+14-14)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/while-negative-offset.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/while.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir (+6-6)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/wrong-liveout-lsr-shift.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/wrong-vctp-opcode-liveout.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/wrong-vctp-operand-liveout.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/bti-pac-replace-1.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/mve-vpt-2-blocks-1-pred.mir (+2-2)
diff --git a/llvm/include/llvm/CodeGen/MIRYamlMapping.h b/llvm/include/llvm/CodeGen/MIRYamlMapping.h
index 09a6ca936fe1f4..d4dc53fc0ed32c 100644
--- a/llvm/include/llvm/CodeGen/MIRYamlMapping.h
+++ b/llvm/include/llvm/CodeGen/MIRYamlMapping.h
@@ -610,6 +610,24 @@ LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::yaml::MachineJumpTable::Entry)
 namespace llvm {
 namespace yaml {
 
+struct SRPEntry {
+  StringValue Point;
+  std::vector<StringValue> Registers;
+
+  bool operator==(const SRPEntry &Other) const {
+    return Point == Other.Point && Registers == Other.Registers;
+  }
+};
+
+using SaveRestorePoints = std::vector<SRPEntry>;
+
+template <> struct MappingTraits<SRPEntry> {
+  static void mapping(IO &YamlIO, SRPEntry &Entry) {
+    YamlIO.mapRequired("point", Entry.Point);
+    YamlIO.mapRequired("registers", Entry.Registers);
+  }
+};
+
 template <> struct MappingTraits<MachineJumpTable> {
   static void mapping(IO &YamlIO, MachineJumpTable &JT) {
     YamlIO.mapRequired("kind", JT.Kind);
@@ -618,6 +636,14 @@ template <> struct MappingTraits<MachineJumpTable> {
   }
 };
 
+} // namespace yaml
+} // namespace llvm
+
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::yaml::SRPEntry)
+
+namespace llvm {
+namespace yaml {
+
 /// Serializable representation of MachineFrameInfo.
 ///
 /// Doesn't serialize attributes like 'StackAlignment', 'IsStackRealignable' and
@@ -645,8 +671,8 @@ struct MachineFrameInfo {
   bool HasTailCall = false;
   bool IsCalleeSavedInfoValid = false;
   unsigned LocalFrameSize = 0;
-  StringValue SavePoint;
-  StringValue RestorePoint;
+  SaveRestorePoints SavePoints;
+  SaveRestorePoints RestorePoints;
 
   bool operator==(const MachineFrameInfo &Other) const {
     return IsFrameAddressTaken == Other.IsFrameAddressTaken &&
@@ -667,7 +693,8 @@ struct MachineFrameInfo {
            HasMustTailInVarArgFunc == Other.HasMustTailInVarArgFunc &&
            HasTailCall == Other.HasTailCall &&
            LocalFrameSize == Other.LocalFrameSize &&
-           SavePoint == Other.SavePoint && RestorePoint == Other.RestorePoint &&
+           SavePoints == Other.SavePoints &&
+           RestorePoints == Other.RestorePoints &&
            IsCalleeSavedInfoValid == Other.IsCalleeSavedInfoValid;
   }
 };
@@ -699,10 +726,12 @@ template <> struct MappingTraits<MachineFrameInfo> {
     YamlIO.mapOptional("isCalleeSavedInfoValid", MFI.IsCalleeSavedInfoValid,
                        false);
     YamlIO.mapOptional("localFrameSize", MFI.LocalFrameSize, (unsigned)0);
-    YamlIO.mapOptional("savePoint", MFI.SavePoint,
-                       StringValue()); // Don't print it out when it's empty.
-    YamlIO.mapOptional("restorePoint", MFI.RestorePoint,
-                       StringValue()); // Don't print it out when it's empty.
+    YamlIO.mapOptional(
+        "savePoints", MFI.SavePoints,
+        SaveRestorePoints()); // Don't print it out when it's empty.
+    YamlIO.mapOptional(
+        "restorePoints", MFI.RestorePoints,
+        SaveRestorePoints()); // Don't print it out when it's empty.
   }
 };
 
diff --git a/llvm/include/llvm/CodeGen/MachineDominators.h b/llvm/include/llvm/CodeGen/MachineDominators.h
index 74cf94398736dd..88800d91ef51a9 100644
--- a/llvm/include/llvm/CodeGen/MachineDominators.h
+++ b/llvm/include/llvm/CodeGen/MachineDominators.h
@@ -185,6 +185,11 @@ class MachineDominatorTree : public DomTreeBase<MachineBasicBlock> {
     return Base::findNearestCommonDominator(A, B);
   }
 
+  /// Returns the nearest common dominator of the given blocks.
+  /// If that tree node is a virtual root, a nullptr will be returned.
+  MachineBasicBlock *
+  findNearestCommonDominator(ArrayRef<MachineBasicBlock *> Blocks) const;
+
   MachineDomTreeNode *operator[](MachineBasicBlock *BB) const {
     applySplitCriticalEdges();
     return Base::getNode(BB);
diff --git a/llvm/include/llvm/CodeGen/MachineFrameInfo.h b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
index 213b7ec6b3fbfb..d746466d41c3e2 100644
--- a/llvm/include/llvm/CodeGen/MachineFrameInfo.h
+++ b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
@@ -27,6 +27,21 @@ class MachineBasicBlock;
 class BitVector;
 class AllocaInst;
 
+using SaveRestorePoints = DenseMap<MachineBasicBlock *, std::vector<Register>>;
+
+class CalleeSavedInfoPerBB {
+  DenseMap<MachineBasicBlock *, std::vector<CalleeSavedInfo>> Map;
+
+public:
+  std::vector<CalleeSavedInfo> get(MachineBasicBlock *MBB) const {
+    return Map.lookup(MBB);
+  }
+
+  void set(DenseMap<MachineBasicBlock *, std::vector<CalleeSavedInfo>> CSI) {
+    Map = std::move(CSI);
+  }
+};
+
 /// The CalleeSavedInfo class tracks the information need to locate where a
 /// callee saved register is in the current frame.
 /// Callee saved reg can also be saved to a different register rather than
@@ -37,6 +52,8 @@ class CalleeSavedInfo {
     int FrameIdx;
     unsigned DstReg;
   };
+  std::vector<MachineBasicBlock *> SpilledIn;
+  std::vector<MachineBasicBlock *> RestoredIn;
   /// Flag indicating whether the register is actually restored in the epilog.
   /// In most cases, if a register is saved, it is also restored. There are
   /// some situations, though, when this is not the case. For example, the
@@ -58,9 +75,9 @@ class CalleeSavedInfo {
   explicit CalleeSavedInfo(unsigned R, int FI = 0) : Reg(R), FrameIdx(FI) {}
 
   // Accessors.
-  Register getReg()                        const { return Reg; }
-  int getFrameIdx()                        const { return FrameIdx; }
-  unsigned getDstReg()                     const { return DstReg; }
+  Register getReg() const { return Reg; }
+  int getFrameIdx() const { return FrameIdx; }
+  unsigned getDstReg() const { return DstReg; }
   void setFrameIdx(int FI) {
     FrameIdx = FI;
     SpilledToReg = false;
@@ -72,6 +89,16 @@ class CalleeSavedInfo {
   bool isRestored()                        const { return Restored; }
   void setRestored(bool R)                       { Restored = R; }
   bool isSpilledToReg()                    const { return SpilledToReg; }
+  ArrayRef<MachineBasicBlock *> spilledIn() const { return SpilledIn; }
+  ArrayRef<MachineBasicBlock *> restoredIn() const { return RestoredIn; }
+  void addSpilledIn(MachineBasicBlock *MBB) { SpilledIn.push_back(MBB); }
+  void addRestoredIn(MachineBasicBlock *MBB) { RestoredIn.push_back(MBB); }
+  void setSpilledIn(std::vector<MachineBasicBlock *> BBV) {
+    SpilledIn = std::move(BBV);
+  }
+  void setRestoredIn(std::vector<MachineBasicBlock *> BBV) {
+    RestoredIn = std::move(BBV);
+  }
 };
 
 /// The MachineFrameInfo class represents an abstract stack frame until
@@ -295,6 +322,10 @@ class MachineFrameInfo {
   /// Has CSInfo been set yet?
   bool CSIValid = false;
 
+  CalleeSavedInfoPerBB CSInfoPerSave;
+
+  CalleeSavedInfoPerBB CSInfoPerRestore;
+
   /// References to frame indices which are mapped
   /// into the local frame allocation block. <FrameIdx, LocalOffset>
   SmallVector<std::pair<int, int64_t>, 32> LocalFrameObjects;
@@ -331,9 +362,16 @@ class MachineFrameInfo {
   bool HasTailCall = false;
 
   /// Not null, if shrink-wrapping found a better place for the prologue.
-  MachineBasicBlock *Save = nullptr;
+  MachineBasicBlock *Prolog = nullptr;
   /// Not null, if shrink-wrapping found a better place for the epilogue.
-  MachineBasicBlock *Restore = nullptr;
+  MachineBasicBlock *Epilog = nullptr;
+
+  /// Not empty, if shrink-wrapping found a better place for saving callee
+  /// saves.
+  SaveRestorePoints SavePoints;
+  /// Not empty, if shrink-wrapping found a better place for restoring callee
+  /// saves.
+  SaveRestorePoints RestorePoints;
 
   /// Size of the UnsafeStack Frame
   uint64_t UnsafeStackSize = 0;
@@ -809,21 +847,105 @@ class MachineFrameInfo {
   /// \copydoc getCalleeSavedInfo()
   std::vector<CalleeSavedInfo> &getCalleeSavedInfo() { return CSInfo; }
 
+  /// Returns callee saved info vector for provided save point in
+  /// the current function.
+  std::vector<CalleeSavedInfo> getCSInfoPerSave(MachineBasicBlock *MBB) const {
+    return CSInfoPerSave.get(MBB);
+  }
+
+  /// Returns callee saved info vector for provided restore point
+  /// in the current function.
+  std::vector<CalleeSavedInfo>
+  getCSInfoPerRestore(MachineBasicBlock *MBB) const {
+    return CSInfoPerRestore.get(MBB);
+  }
+
   /// Used by prolog/epilog inserter to set the function's callee saved
   /// information.
   void setCalleeSavedInfo(std::vector<CalleeSavedInfo> CSI) {
     CSInfo = std::move(CSI);
   }
 
+  /// Used by prolog/epilog inserter to set the function's callee saved
+  /// information for particular save point.
+  void setCSInfoPerSave(
+      DenseMap<MachineBasicBlock *, std::vector<CalleeSavedInfo>> CSI) {
+    CSInfoPerSave.set(CSI);
+  }
+
+  /// Used by prolog/epilog inserter to set the function's callee saved
+  /// information for particular restore point.
+  void setCSInfoPerRestore(
+      DenseMap<MachineBasicBlock *, std::vector<CalleeSavedInfo>> CSI) {
+    CSInfoPerRestore.set(CSI);
+  }
+
   /// Has the callee saved info been calculated yet?
   bool isCalleeSavedInfoValid() const { return CSIValid; }
 
   void setCalleeSavedInfoValid(bool v) { CSIValid = v; }
 
-  MachineBasicBlock *getSavePoint() const { return Save; }
-  void setSavePoint(MachineBasicBlock *NewSave) { Save = NewSave; }
-  MachineBasicBlock *getRestorePoint() const { return Restore; }
-  void setRestorePoint(MachineBasicBlock *NewRestore) { Restore = NewRestore; }
+  const SaveRestorePoints &getRestorePoints() const { return RestorePoints; }
+
+  const SaveRestorePoints &getSavePoints() const { return SavePoints; }
+
+  std::pair<MachineBasicBlock *, std::vector<Register>>
+  getRestorePoint(MachineBasicBlock *MBB) const {
+    if (auto It = RestorePoints.find(MBB); It != RestorePoints.end())
+      return *It;
+
+    std::vector<Register> Regs = {};
+    return std::make_pair(nullptr, Regs);
+  }
+
+  std::pair<MachineBasicBlock *, std::vector<Register>>
+  getSavePoint(MachineBasicBlock *MBB) const {
+    if (auto It = SavePoints.find(MBB); It != SavePoints.end())
+      return *It;
+
+    std::vector<Register> Regs = {};
+    return std::make_pair(nullptr, Regs);
+  }
+
+  void setSavePoints(SaveRestorePoints NewSavePoints) {
+    SavePoints = std::move(NewSavePoints);
+  }
+
+  void setRestorePoints(SaveRestorePoints NewRestorePoints) {
+    RestorePoints = std::move(NewRestorePoints);
+  }
+
+  void setSavePoint(MachineBasicBlock *MBB, std::vector<Register> &Regs) {
+    if (SavePoints.contains(MBB))
+      SavePoints[MBB] = Regs;
+    else
+      SavePoints.insert(std::make_pair(MBB, Regs));
+  }
+
+  static const SaveRestorePoints constructSaveRestorePoints(
+      const SaveRestorePoints &SRP,
+      const DenseMap<MachineBasicBlock *, MachineBasicBlock *> &BBMap) {
+    SaveRestorePoints Pts{};
+    for (auto &Src : SRP) {
+      Pts.insert(std::make_pair(BBMap.find(Src.first)->second, Src.second));
+    }
+    return Pts;
+  }
+
+  void setRestorePoint(MachineBasicBlock *MBB, std::vector<Register> &Regs) {
+    if (RestorePoints.contains(MBB))
+      RestorePoints[MBB] = Regs;
+    else
+      RestorePoints.insert(std::make_pair(MBB, Regs));
+  }
+
+  MachineBasicBlock *getProlog() const { return Prolog; }
+  void setProlog(MachineBasicBlock *BB) { Prolog = BB; }
+  MachineBasicBlock *getEpilog() const { return Epilog; }
+  void setEpilog(MachineBasicBlock *BB) { Epilog = BB; }
+
+  void clearSavePoints() { SavePoints.clear(); }
+  void clearRestorePoints() { RestorePoints.clear(); }
 
   uint64_t getUnsafeStackSize() const { return UnsafeStackSize; }
   void setUnsafeStackSize(uint64_t Size) { UnsafeStackSize = Size; }
diff --git a/llvm/include/llvm/CodeGen/TargetFrameLowering.h b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
index 97de0197da9b40..373455a630a993 100644
--- a/llvm/include/llvm/CodeGen/TargetFrameLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
@@ -199,6 +199,10 @@ class TargetFrameLowering {
     return false;
   }
 
+  /// enableCSRSaveRestorePointsSplit - Returns true if the target support
+  /// multiple save/restore points in shrink wrapping.
+  virtual bool enableCSRSaveRestorePointsSplit() const { return false; }
+
   /// Returns true if the stack slot holes in the fixed and callee-save stack
   /// area should be used when allocating other stack locations to reduce stack
   /// size.
diff --git a/llvm/lib/CodeGen/MIRParser/MIRParser.cpp b/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
index e2543f883f91ce..835eeb2362ceb1 100644
--- a/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
@@ -124,6 +124,10 @@ class MIRParserImpl {
   bool initializeFrameInfo(PerFunctionMIParsingState &PFS,
                            const yaml::MachineFunction &YamlMF);
 
+  bool initializeSaveRestorePoints(PerFunctionMIParsingState &PFS,
+                                   const yaml::SaveRestorePoints &YamlSRP,
+                                   bool IsSavePoints);
+
   bool initializeCallSiteInfo(PerFunctionMIParsingState &PFS,
                               const yaml::MachineFunction &YamlMF);
 
@@ -832,18 +836,9 @@ bool MIRParserImpl::initializeFrameInfo(PerFunctionMIParsingState &PFS,
   MFI.setHasTailCall(YamlMFI.HasTailCall);
   MFI.setCalleeSavedInfoValid(YamlMFI.IsCalleeSavedInfoValid);
   MFI.setLocalFrameSize(YamlMFI.LocalFrameSize);
-  if (!YamlMFI.SavePoint.Value.empty()) {
-    MachineBasicBlock *MBB = nullptr;
-    if (parseMBBReference(PFS, MBB, YamlMFI.SavePoint))
-      return true;
-    MFI.setSavePoint(MBB);
-  }
-  if (!YamlMFI.RestorePoint.Value.empty()) {
-    MachineBasicBlock *MBB = nullptr;
-    if (parseMBBReference(PFS, MBB, YamlMFI.RestorePoint))
-      return true;
-    MFI.setRestorePoint(MBB);
-  }
+  initializeSaveRestorePoints(PFS, YamlMFI.SavePoints, true /*IsSavePoints*/);
+  initializeSaveRestorePoints(PFS, YamlMFI.RestorePoints,
+                              false /*IsSavePoints*/);
 
   std::vector<CalleeSavedInfo> CSIInfo;
   // Initialize the fixed frame objects.
@@ -1058,8 +1053,40 @@ bool MIRParserImpl::initializeConstantPool(PerFunctionMIParsingState &PFS,
   return false;
 }
 
-bool MIRParserImpl::initializeJumpTableInfo(PerFunctionMIParsingState &PFS,
-    const yaml::MachineJumpTable &YamlJTI) {
+bool MIRParserImpl::initializeSaveRestorePoints(
+    PerFunctionMIParsingState &PFS, const yaml::SaveRestorePoints &YamlSRP,
+    bool IsSavePoints) {
+  SMDiagnostic Error;
+  MachineFunction &MF = PFS.MF;
+  MachineFrameInfo &MFI = MF.getFrameInfo();
+  llvm::SaveRestorePoints SRPoints;
+
+  for (const auto &Entry : YamlSRP) {
+    const auto &MBBSource = Entry.Point;
+    MachineBasicBlock *MBB = nullptr;
+    if (parseMBBReference(PFS, MBB, MBBSource.Value))
+      return true;
+
+    std::vector<Register> Registers{};
+    for (auto &RegStr : Entry.Registers) {
+      Register Reg;
+      if (parseNamedRegisterReference(PFS, Reg, RegStr.Value, Error))
+        return error(Error, RegStr.SourceRange);
+
+      Registers.push_back(Reg);
+    }
+    SRPoints.insert(std::make_pair(MBB, Registers));
+  }
+
+  if (IsSavePoints)
+    MFI.setSavePoints(SRPoints);
+  else
+    MFI.setRestorePoints(SRPoints);
+  return false;
+}
+
+bool MIRParserImpl::initializeJumpTableInfo(
+    PerFunctionMIParsingState &PFS, const yaml::MachineJumpTable &YamlJTI) {
   MachineJumpTableInfo *JTI = PFS.MF.getOrCreateJumpTableInfo(YamlJTI.Kind);
   for (const auto &Entry : YamlJTI.Entries) {
     std::vector<MachineBasicBlock *> Blocks;
diff --git a/llvm/lib/CodeGen/MIRPrinter.cpp b/llvm/lib/CodeGen/MIRPrinter.cpp
index c8f6341c1224d2..ea7c504d355c19 100644
--- a/llvm/lib/CodeGen/MIRPrinter.cpp
+++ b/llvm/lib/CodeGen/MIRPrinter.cpp
@@ -117,7 +117,10 @@ class MIRPrinter {
                const MachineRegisterInfo &RegInfo,
                const TargetRegisterInfo *TRI);
   void convert(ModuleSlotTracker &MST, yaml::MachineFrameInfo &YamlMFI,
-               const MachineFrameInfo &MFI);
+               const MachineFrameInfo &MFI, const TargetRegisterInfo *TRI);
+  void convert(ModuleSlotTracker &MST, yaml::SaveRestorePoints &YamlSRP,
+               const DenseMap<MachineBasicBlock *, std::vector<Register>> &SRP,
+               const TargetRegisterInfo *TRI);
   void convert(yaml::MachineFunction &MF,
                const MachineConstantPool &ConstantPool);
   void convert(ModuleSlotTracker &MST, yaml::MachineJumpTable &YamlJTI,
@@ -235,7 +238,8 @@ void MIRPrinter::print(const MachineFunction &MF) {
   convert(YamlMF, MF, MF.getRegInfo(), MF.getSubtarget().getRegisterInfo());
   MachineModuleSlotTracker MST(MMI, &MF);
   MST.incorporateFunction(MF.getFunction());
-  convert(MST, YamlMF.FrameInfo, MF.getFrameInfo());
+  convert(MST, YamlMF.FrameInfo, MF.getFrameInfo(),
+          MF.getSubtarget().getRegisterInfo());
   convertStackObjects(YamlMF, MF, MST);
   convertEntryValueObjects(YamlMF, MF, MST);
   convertCallSiteObjects(YamlMF, MF, MST);
@@ -372,7 +376,8 @@ void MIRPrinter::convert(yaml::MachineFunction &YamlMF,
 
 void MIRPrinter::convert(ModuleSlotTracker &MST,
                          yaml::MachineFrameInfo &YamlMFI,
-                         const MachineFrameInfo &MFI) {
+                         const MachineFrameInfo &MFI,
+                         const TargetRegisterInfo *TRI) {
   YamlMFI.IsFrameAddressTaken = MFI.isFrameAddressTaken();
   YamlMFI.IsReturnAddressTaken = MFI.isReturnAddressTaken();
   YamlMFI.HasStackMap = MFI.hasStackMap();
@@ -392,14 +397,10 @@ void MIRPrinter::convert(ModuleSlotTracker &MST,
   YamlMFI.HasTailCall = MFI.hasTailCall();
   YamlMFI.IsCalleeSavedInfoValid = MFI.isCalleeSavedInfoValid();
   YamlMFI.LocalFrameSize = MFI.getLocalFrameSize();
-  if (MFI.getSavePoint()) {
-    raw_string_ostream StrOS(YamlMFI.SavePoint.Value);
-    StrOS << printMBBReference(*MFI.getSavePoint());
-  }
-  if (MFI.getRestorePoint()) {
-    raw_string_ostream StrOS(YamlMFI.RestorePoint.Value);
-    StrOS << printMBBReference(*MFI.getRestorePoint());
-  }
+  if (!MFI.getSavePoints().empty())
+    convert(MST, YamlMFI.SavePoints, MFI.getSavePoints(), TRI);
+  if (!MFI.getRestorePoints().empty())
+    convert(MST, YamlMFI.RestorePoints, MFI.getRestorePoints(), TRI);
 }
 
 void MIRPrinter::convertEntryValueObjects(yaml::MachineFunction &YMF,
@@ -618,6 +619,28 @@ void MIRPrinter::convert(yaml::MachineFunction &MF,
   }
 }
 
+void MIRPrinter::convert(ModuleSlotTracker &MST,
+                         yaml::SaveRestorePoints &YamlSRP,
+                         const SaveRestorePoints &SRP,
+                         const TargetRegisterInfo *TRI) {
+  for (const auto &MBBEntry : SRP) {
+    std::string Str;
+    yaml::SRPEntry Entry;
+    raw_string_ostream StrOS(Str);
+    StrOS << printMBBReference(*MBBEntry.first);
+    Entry.Point = StrOS.str();
+    Str.clear();
+    for (auto &Reg : MBBEntry.second) {
+      if (Reg != MCRegister::NoRegister) {
+        StrOS << printReg(Reg, TRI);
+        Entry.Registers.push_back(StrOS.str());
+        Str.clear();
+      }
+    }
+    YamlSRP.push_back(Entry);
+  }
+}
+
 void MIRPrinter::convert(ModuleSlotTracker &MST,
                          yaml::MachineJumpTable &YamlJTI,
                          const MachineJumpTableInfo &JTI) {
diff --git a/llvm/lib/CodeGen/MachineDominators.cpp b/llvm/lib/CodeGen/MachineDominators.cpp
index a2cc8fdfa7c9f9..384f90c6da66c0 100644
--- a/llvm/lib/CodeGen/MachineDominators.cpp
+++ b/llvm/lib/CodeGen/MachineDominators.cpp
@@ -189,3 +189,19 @@ void MachineDominatorTree::applySplitCriticalEdges() const {
   NewBBs.clear();
   CriticalEdgesToSplit.clear();
 }
+
+MachineBasicBlock *MachineDominatorTree::findNearestCommonDominator(
+    ArrayRef<MachineBasicBlock *> Blocks) const {
+  assert(!Blocks.empty());
+
+  MachineBasicBlock *NCD = Blocks.front();
+  for (MachineBasicBlock *BB : Blocks.drop_front()) {
+    NCD = Base::findNearestCommonDominator(NCD, BB);
+
+    // Stop when the root is reached.
+    if (Base::isVirtualRoot(Base::getNode(NCD)))
+      return nullptr;
+  }
+
+  return NCD;
+}
diff --git a/llvm/lib/CodeGen/MachineFrameInfo.cpp b/llvm/lib/CodeGen/MachineFrameInfo.cpp
index e4b993850f73dc..c6658d2e9eba88 100644
--- a/llvm/lib/CodeGen/MachineFrameInfo.cpp
+++ b/llvm/lib/CodeGen/MachineFrameInfo.cpp
@@ -244,6 +244,23 @@ void MachineFrameInfo::print(const MachineFunction &MF, raw_ostream &OS) const{
     }
     OS << "\n";
   }
+
+  OS << "save/restore points:\n";
+
+  if (!SavePoints.empty()...
[truncated]

github-actions bot commented Dec 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

This patch makes it possible to store multiple Save and Restore points in MachineFrameInfo.
As a logical consequence, the notions "Save point" / "Restore point"
are no longer synonyms for "Prolog" / "Epilog". Currently, "Prolog" / "Epilog"
is the place for stack allocation / deallocation, and
"Save point" / "Restore point" is the place for register spills and restores.
So MachineFrameInfo now needs to store not only the Save and Restore blocks,
but also the Prolog and Epilog.

Because there can be multiple Save and Restore points, we need to know the list of registers
saved / restored at each point. Therefore a SavePoint becomes a pair <MachineBasicBlock, std::vector<Register>>.
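
A minimal sketch of the shape described above, using stand-in types rather than LLVM's own. `Block`, `Reg`, `SaveRestorePointsSketch`, `FrameInfoSketch`, `addSavePoint` and `savedIn` are hypothetical names chosen for illustration; the patch itself uses `MachineBasicBlock`, `Register` and `MachineFrameInfo`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for MachineBasicBlock and Register (hypothetical, not LLVM types).
struct Block {
  std::string Name;
};
using Reg = unsigned;

// One entry per save (or restore) block: the registers handled in that block.
using SaveRestorePointsSketch = std::map<const Block *, std::vector<Reg>>;

struct FrameInfoSketch {
  // Stack allocation / deallocation sites, tracked separately from the
  // blocks where individual registers are spilled or reloaded.
  const Block *Prolog = nullptr;
  const Block *Epilog = nullptr;
  SaveRestorePointsSketch SavePoints;
  SaveRestorePointsSketch RestorePoints;

  // Records (or overwrites) the registers saved in BB.
  void addSavePoint(const Block *BB, std::vector<Reg> Regs) {
    assert(BB && "save point must name a block");
    SavePoints[BB] = std::move(Regs);
  }

  // Returns the registers saved in BB, or an empty list if BB is not a
  // save point.
  std::vector<Reg> savedIn(const Block *BB) const {
    auto It = SavePoints.find(BB);
    return It == SavePoints.end() ? std::vector<Reg>{} : It->second;
  }
};
```

In this sketch a function whose stack is allocated in the entry block can still spill two registers in a later block: `Prolog` points at the entry, while `SavePoints` holds one entry for the later block with both registers.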

Full support for multiple Save / Restore points is implemented only in the RISCV backend.
This patch introduces the "-enable-shrink-wrap-into-multiple-points"
option, which enables splitting Save and Restore points during the ShrinkWrap pass, i.e.
inserting register saves and restores as close as possible to their usage.

The current algorithm disables Save / Restore point splitting for
functions that contain instructions with FrameIndex operands,
EHPads, or any stack accesses, because it is difficult to prove that splitting is safe in those cases.

This patch also adds support for multiple Save / Restore points, but only for RISCV.

ShrinkWrap now produces:
- a list of SavePoint + Registers
- a list of RestorePoint + Registers
- Prolog (the nearest common dominator, NCD, of the Save points)
- Epilog (the nearest common post-dominator, NCPD, of the Restore points)
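
The "NCD of Save points" step can be sketched on a toy dominator tree. This is an illustration under stated assumptions, not the patch's implementation: `DomTreeSketch`, `depth`, `ncd` and `ncdOfList` are hypothetical names, blocks are plain indices, and the tree has a single real root, so the virtual-root check done by the patch's `findNearestCommonDominator` overload is not modelled. The Epilog would be computed the same way on a post-dominator tree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy dominator tree: IDom[i] is the immediate dominator of block i.
// The entry block (index 0) is its own immediate dominator.
struct DomTreeSketch {
  std::vector<int> IDom;

  // Number of tree edges between BB and the root.
  int depth(int BB) const {
    int D = 0;
    while (BB != IDom[BB]) {
      BB = IDom[BB];
      ++D;
    }
    return D;
  }

  // Nearest common dominator of two blocks: lift the deeper block to the
  // depth of the other one, then walk both up until they meet.
  int ncd(int A, int B) const {
    int DA = depth(A), DB = depth(B);
    while (DA > DB) {
      A = IDom[A];
      --DA;
    }
    while (DB > DA) {
      B = IDom[B];
      --DB;
    }
    while (A != B) {
      A = IDom[A];
      B = IDom[B];
    }
    return A;
  }

  // Folds the pairwise query over a non-empty list of blocks.
  int ncdOfList(const std::vector<int> &Blocks) const {
    assert(!Blocks.empty());
    int Result = Blocks.front();
    for (std::size_t I = 1; I < Blocks.size(); ++I)
      Result = ncd(Result, Blocks[I]);
    return Result;
  }
};
```

With `IDom = {0, 0, 1, 1, 0}`, save points in blocks 2 and 3 share block 1 as their nearest common dominator, so the prolog could stay in block 1; adding a save point in block 4 pushes it back to the entry block.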
@enoskova-sc enoskova-sc force-pushed the users/enoskova-sc/multiple-save-restore-points-shrink-wrap branch from 9f67919 to ea5e86f Compare December 10, 2024 11:03
@enoskova-sc
Contributor Author

@qcolombet, could you take a look, please?

static int64_t calculateCSRSpillOffsets(MachineFrameInfo &MFI,
const TargetFrameLowering *TFI,
int MinCSFI, int FrameIdx) {
int LocalAreaOffset = -TFI->getOffsetOfLocalArea();
Contributor

Why is this negated instead of using the raw value? Is this assuming the stack grows down?

Contributor Author

Yes, here I assume that the stack grows down on RISC-V.

// does the same, not via new instructions but via save/restore libcalls.
if (!STI.hasStdExtZcmp() && !STI.enableSaveRestore())
return true;
}
Contributor

Does not return in all paths.
