[LangRef] No target-specific size limit for atomics #136864

Meinersbur · 2025-04-23T13:36:42Z

According to the current LangRef, atomics of sizes larger than a target-dependent limit are non-conformant IR. Presumably, that size limit is TargetLoweringBase::getMaxAtomicSizeInBitsSupported(). As a consequence, one would not even know whether IR is valid without instantiating the Target backend.

To get around this, Clang's CGAtomic uses a constant "16 bytes" for the maximally supported atomic. The verifier only checks the power-of-two requirement.

In a discussion with @jyknight, the intention is rather that the AtomicExpandPass will just lower everything larger than the target-supported atomic sizes to libcall (such as __atomic_load). According to this interpretation, the size limit needs only be known by the lowering and does not affect the IR specification.

The original "target-specific size limit" had been added in 59b6688. The LangRef change is needed for #134455 because otherwise frontends need to pass a TargetLowering object to the helper functions just to know what the target-specific limit is.

This also changes the LangRef for atomicrmw. Are there libatomic fallbacks for these? If not, LLVM-IR validity still depends on instantiating the actual backend. There are also some intrinsics such as llvm.memcpy.element.unordered.atomic that have this constraint but do not change in this PR.

llvmbot · 2025-04-23T13:38:06Z

@llvm/pr-subscribers-llvm-ir

Author: Michael Kruse (Meinersbur)

Changes

According to the current LangRef, atomics of sizes larger than a target-dependent limit are non-conformant IR. Presumably, that size limit is TargetLoweringBase::getMaxAtomicSizeInBitsSupported(). As a consequence, one would not even know whether IR is valid without instantiating the Target backend.

To get around this, Clang's CGAtomic uses a constant "16 bytes" for the maximally supported atomic. The verifier only checks the power-of-two requirement.

In a discussion with @jyknight, the intention is rather that the AtomicExpandPass will just lower everything larger than the target-supported atomic sizes to libcall (such as __atomic_load). According to this interpretation, the size limit needs only be known by the lowering and does not affect the IR specification.

The original "target-specific size limit" had been added in 59b6688. The LangRef change is needed for #134455 because otherwise frontends need to pass a TargetLowering object to the helper functions just to know what the target-specific limit is.

This also changes the LangRef for atomicrmw. Are there libatomic fallbacks for these? If not, LLVM-IR validity still depends on instantiating the actual backend. There are also some intrinsics such as llvm.memcpy.element.unordered.atomic that have this constraint but do not change in this PR.

Full diff: https://github.com/llvm/llvm-project/pull/136864.diff

1 Files Affected:

(modified) llvm/docs/LangRef.rst (+6-6)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 9e78f317bd9ea..7a0b7f3254f41 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -11206,7 +11206,7 @@ If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
 Atomic loads produce :ref:`defined <memmodel>` results when they may see
 multiple atomic stores. The type of the pointee must be an integer, pointer, or
 floating-point type whose bit width is a power of two greater than or equal to
-eight and less than or equal to a target-specific size limit.  ``align`` must be
+eight. ``align`` must be
 explicitly specified on atomic loads. Note: if the alignment is not greater or
 equal to the size of the `<value>` type, the atomic operation is likely to
 require a lock and have poor performance. ``!nontemporal`` does not have any
@@ -11347,7 +11347,7 @@ If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
 Atomic loads produce :ref:`defined <memmodel>` results when they may see
 multiple atomic stores. The type of the pointee must be an integer, pointer, or
 floating-point type whose bit width is a power of two greater than or equal to
-eight and less than or equal to a target-specific size limit.  ``align`` must be
+eight. ``align`` must be
 explicitly specified on atomic stores. Note: if the alignment is not greater or
 equal to the size of the `<value>` type, the atomic operation is likely to
 require a lock and have poor performance. ``!nontemporal`` does not have any
@@ -11492,8 +11492,8 @@ There are three arguments to the '``cmpxchg``' instruction: an address
 to operate on, a value to compare to the value currently be at that
 address, and a new value to place at that address if the compared values
 are equal. The type of '<cmp>' must be an integer or pointer type whose
-bit width is a power of two greater than or equal to eight and less
-than or equal to a target-specific size limit. '<cmp>' and '<new>' must
+bit width is a power of two greater than or equal to eight.
+'<cmp>' and '<new>' must
 have the same type, and the type of '<pointer>' must be a pointer to
 that type. If the ``cmpxchg`` is marked as ``volatile``, then the
 optimizer is not allowed to modify the number or order of execution of
@@ -11604,8 +11604,8 @@ operation. The operation must be one of the following keywords:
 -  usub_sat
 
 For most of these operations, the type of '<value>' must be an integer
-type whose bit width is a power of two greater than or equal to eight
-and less than or equal to a target-specific size limit. For xchg, this
+type whose bit width is a power of two greater than or equal to eight.
+For xchg, this
 may also be a floating point or a pointer type with the same size constraints
 as integers.  For fadd/fsub/fmax/fmin, this must be a floating-point
 or fixed vector of floating-point type.  The type of the '``<pointer>``'

efriedma-quic · 2025-04-23T16:50:10Z

When this text was originally written, the backend didn't have support for expanding atomics to libcalls, I think. Not sure if all targets support such lowering today... does __atomic_load exist on GPU targets?

Meinersbur · 2025-04-24T12:59:33Z

AFAIK no GPU provides __atomic_load with arbitrary size directly. Possible could be implemented for CUDA after the Pascal-generation using cuda::atomic. Currently, if you would try to use the __atomic_load with an unsupported size, you may either get a link error or the compiler would crash.

For the purpose of #134455 it doesn't matter where the crash occurs (AtomicExpandPass or emitAtomicLoad).

efriedma-quic · 2025-04-24T19:02:25Z

So it seems like there is still a limit, at least on some targets?

Maybe it's sufficient to just note that most targets support arbitrary widths using a lock-based fallback.

Meinersbur · 2025-04-28T15:29:43Z

The question is, what a Frontend should do. Let's say it needs a atomic of size 16:

Unconditionally emit an LLVM-IR cmpxchg instruction. With my undertanding of the current LangRef, this is illegal IR even if AtomicExpandPass would convert the illegal IR to legal IR.
Instantiate the TargetMachine and ask it for the supported atomic size. For Clang the discussion was that this would require LLVM_TARGETS_TO_BUILD to emit LLVM-IR for a specific architecture which was an unwelcome change.
Implement its own target-triple dependent lookup of for each target¹.
If it is just "most targets" then the frontend still needs to know what those targets are. If it has to know itself, this is not that much different from the previous. If it is just a (non-normative) note then it would also still be illegal IR.
A middle ground would be "If target can support atomics of arbitrary size then atomic load/store/cmpxchg can be of any size. Otherwise, the size must be at most TargetLoweringBase::getMaxAtomicSizeInBitsSupported()". If it is not supported, usually frontends cannot just simply not emit an atomic. We would prefer it it emitted an error instead of crashing or emitting illegal IR, so we still would need a list of supported target triples in the frontend...

The frontend still needs to implement libcall fallbacks for cases such as non-constant sizes (this is what #134455 is doing). If due to compiler optimizations the argument becomes constant (e.g. because it is wrapped in a utility function that should work for any size), there currently is no way back to an atomic load/store/cmpxchg instruction. So ideally AtomicExpandPass would handle non-constant sizes/ordering flags as well.

Clang has its own TargetInfo that contains a lot where IMHO the backend should be the source-source-of-truth. For Flang this means we may need to duplicate that info once more or also instantiate TargetInfo in addition to its own TargetCharacteristics. Maybe the RFC will help. ↩

efriedma-quic · 2025-04-28T18:06:55Z

We have a mechanism for backend diagnostics, so they can report errors on IR they can't support.

If a frontend wants to generate different code sequences depending on whether the target has atomics of a given width, and we have some targets that don't support arbitrary atomics, then yes, we're stuck with some kind of target-specific check.

nikic · 2025-04-28T19:13:19Z

I think as far as IR and the backend is concerned, it's fine to say that atomics of arbitrary width are supported. If this results in a libatomic call and libatomic is not linked, that's a linker error and no longer LLVM's problem. The frontend will not be able to improve on this situation by producing some kind of better lowering than LLVM.

Worth pointing out that amdgpu and nvptx disable all libcalls apart from the atomic libcalls. They'll still emit the usual atomic_* libcalls and not crash / produce a backend error.

Meinersbur · 2025-04-29T10:02:31Z

@efriedma-quic @nikic So what wording do you suggest for the LangRef?

nikic · 2025-04-29T12:18:21Z

I think your current proposed wording is fine.

Meinersbur · 2025-04-29T13:01:57Z

I think your current proposed wording is fine.

If by prosposed wording you mean the state of the current PR which just removes the part (in contrast to item 5 in #136864 (comment)), can you approve it?

nikic

LGTM, but please wait for a second approval.

[LangRef] No target-specific size limit

4fb8cc5

Meinersbur requested review from jyknight and efriedma-quic April 23, 2025 13:36

Meinersbur marked this pull request as ready for review April 23, 2025 13:37

llvmbot added the llvm:ir label Apr 23, 2025

Meinersbur requested a review from nikic April 29, 2025 12:56

nikic approved these changes Apr 30, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LangRef] No target-specific size limit for atomics #136864

[LangRef] No target-specific size limit for atomics #136864

Meinersbur commented Apr 23, 2025

llvmbot commented Apr 23, 2025

efriedma-quic commented Apr 23, 2025

Meinersbur commented Apr 24, 2025

efriedma-quic commented Apr 24, 2025

Meinersbur commented Apr 28, 2025

efriedma-quic commented Apr 28, 2025

nikic commented Apr 28, 2025

Meinersbur commented Apr 29, 2025

nikic commented Apr 29, 2025

Meinersbur commented Apr 29, 2025

nikic left a comment

[LangRef] No target-specific size limit for atomics #136864

Are you sure you want to change the base?

[LangRef] No target-specific size limit for atomics #136864

Conversation

Meinersbur commented Apr 23, 2025

llvmbot commented Apr 23, 2025

efriedma-quic commented Apr 23, 2025

Meinersbur commented Apr 24, 2025

efriedma-quic commented Apr 24, 2025

Meinersbur commented Apr 28, 2025

Footnotes

efriedma-quic commented Apr 28, 2025

nikic commented Apr 28, 2025

Meinersbur commented Apr 29, 2025

nikic commented Apr 29, 2025

Meinersbur commented Apr 29, 2025

nikic left a comment

Choose a reason for hiding this comment