Fixed vec_pack_to_short_fp32 in Clang altivec.h #129923

johnplatts · 2025-03-05T19:55:35Z

Fixes bug in vec_pack_to_short_fp32 in clang/lib/Headers/altivec.h.

Resolves issue #60822

github-actions · 2025-03-05T19:55:55Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-03-05T19:56:34Z

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: John Platts (johnplatts)

Changes

Fixes bug in vec_pack_to_short_fp32 in clang/lib/Headers/altivec.h.

Resolves issue #60822

Reviewer: @lei137

Full diff: https://github.com/llvm/llvm-project/pull/129923.diff

1 Files Affected:

(modified) clang/lib/Headers/altivec.h (+3-7)

diff --git a/clang/lib/Headers/altivec.h b/clang/lib/Headers/altivec.h
index 8da65055012f1..45d557238e3d9 100644
--- a/clang/lib/Headers/altivec.h
+++ b/clang/lib/Headers/altivec.h
@@ -7529,13 +7529,9 @@ vec_pack(vector double __a, vector double __b) {
 #ifdef __POWER9_VECTOR__
 static __inline__ vector unsigned short __ATTRS_o_ai
 vec_pack_to_short_fp32(vector float __a, vector float __b) {
-  vector float __resa = __builtin_vsx_xvcvsphp(__a);
-  vector float __resb = __builtin_vsx_xvcvsphp(__b);
-#ifdef __LITTLE_ENDIAN__
-  return (vector unsigned short)vec_mergee(__resa, __resb);
-#else
-  return (vector unsigned short)vec_mergeo(__resa, __resb);
-#endif
+  vector unsigned int __resa = (vector unsigned int)__builtin_vsx_xvcvsphp(__a);
+  vector unsigned int __resb = (vector unsigned int)__builtin_vsx_xvcvsphp(__b);
+  return vec_pack(__resa, __resb);
 }
 
 #endif

llvmbot · 2025-03-05T19:56:34Z

@llvm/pr-subscribers-backend-x86

Author: John Platts (johnplatts)

Changes

Fixes bug in vec_pack_to_short_fp32 in clang/lib/Headers/altivec.h.

Resolves issue #60822

Reviewer: @lei137

Full diff: https://github.com/llvm/llvm-project/pull/129923.diff

1 Files Affected:

(modified) clang/lib/Headers/altivec.h (+3-7)

diff --git a/clang/lib/Headers/altivec.h b/clang/lib/Headers/altivec.h
index 8da65055012f1..45d557238e3d9 100644
--- a/clang/lib/Headers/altivec.h
+++ b/clang/lib/Headers/altivec.h
@@ -7529,13 +7529,9 @@ vec_pack(vector double __a, vector double __b) {
 #ifdef __POWER9_VECTOR__
 static __inline__ vector unsigned short __ATTRS_o_ai
 vec_pack_to_short_fp32(vector float __a, vector float __b) {
-  vector float __resa = __builtin_vsx_xvcvsphp(__a);
-  vector float __resb = __builtin_vsx_xvcvsphp(__b);
-#ifdef __LITTLE_ENDIAN__
-  return (vector unsigned short)vec_mergee(__resa, __resb);
-#else
-  return (vector unsigned short)vec_mergeo(__resa, __resb);
-#endif
+  vector unsigned int __resa = (vector unsigned int)__builtin_vsx_xvcvsphp(__a);
+  vector unsigned int __resb = (vector unsigned int)__builtin_vsx_xvcvsphp(__b);
+  return vec_pack(__resa, __resb);
 }
 
 #endif

johnplatts · 2025-03-06T14:30:54Z

I have discovered that there is a bug in the VSX vec_pack_to_short_fp32(a, b) intrinsic with GCC on big-endian POWER9 (but not little-endian POWER9), where vec_pack_to_short_fp32(a, b) returns {b[0], b[1], b[2], b[3], a[0], a[1], a[2], a[3]} instead of the expected {a[0], a[1], a[2], a[3], b[0], b[1], b[2], b[3]}.

PVIPR expects the vec_pack_to_short_fp32(a, b) intrinsic to return the same results on both big-endian POWER9 and little-endian POWER9.

lei137

Thank-you for the fix John!

It would be good if your patch can show the changes that is being introduced via this patch. In prep for that I have put up a PR to add the existing checks for this test. Can you add the changes that will be introduced as part of this PR? Here is the diff once my patch lands.

diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-p9vector.c b/clang/test/CodeGen/PowerPC/builtins-ppc-p9vector.c
index 68d32ee14c8f..824267b98564 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-p9vector.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-p9vector.c
@@ -854,15 +854,13 @@ vector unsigned short test74(void) {
 // CHECK-BE: @llvm.ppc.vsx.xvcvsphp(<4 x float>
 // CHECK-BE: @llvm.ppc.vsx.xvcvsphp(<4 x float>
 // CHECK-BE: [[REG0:%[0-9]+]] = call <4 x i32> @llvm.ppc.altivec.vperm
-// CHECK-BE-NEXT: [[REG1:%[0-9]+]] = bitcast <4 x i32> [[REG0]] to <4 x float>
-// CHECK-BE-NEXT: [[REG2:%[0-9]+]] = bitcast <4 x float> [[REG1]] to <8 x i16>
-// CHECK-BE-NEXT: ret <8 x i16> [[REG2]]
+// CHECK-BE-NEXT: [[REG1:%[0-9]+]] = bitcast <4 x i32> [[REG0]] to <8 x i16>
+// CHECK-BE-NEXT: ret <8 x i16> [[REG1]]
 // CHECK: @llvm.ppc.vsx.xvcvsphp(<4 x float>
 // CHECK: @llvm.ppc.vsx.xvcvsphp(<4 x float>
 // CHECK: [[REG0:%[0-9]+]] = call <4 x i32> @llvm.ppc.altivec.vperm
-// CHECK-NEXT: [[REG1:%[0-9]+]] = bitcast <4 x i32> [[REG0]] to <4 x float>
-// CHECK-NEXT: [[REG2:%[0-9]+]] = bitcast <4 x float> [[REG1]] to <8 x i16>
-// CHECK-NEXT: ret <8 x i16> [[REG2]]
+// CHECK-NEXT: [[REG1:%[0-9]+]] = bitcast <4 x i32> [[REG0]] to <8 x i16>
+// CHECK-NEXT: ret <8 x i16> [[REG1]]
   return vec_pack_to_short_fp32(vfa, vfb);
 }
 vector unsigned int test75(void) {

Thank-you!

…altivec_f32_to_f16_fix_030525

Update test in prep for IR changes that will be introduced in #129923

…fp32 (#130324) Update test in prep for IR changes that will be introduced in llvm/llvm-project#129923

lei137 · 2025-03-17T15:12:03Z

@johnplatts Please resolve the conflicts for this PR.

Fixed vec_pack_to_short_fp32 in Clang altivec.h

0bfa20f

llvmbot added clang Clang issues not falling into any other category backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics labels Mar 5, 2025

johnplatts mentioned this pull request Mar 6, 2025

Clang vec_pack_to_short_fp32 implementation generates incorrect results #60822

Closed

RKSimon removed the backend:X86 label Mar 6, 2025

add additional checks to test for vec_pack_to_short_fp32

51c0579

lei137 mentioned this pull request Mar 7, 2025

[NFC] Add additional checks to test for vec_pack_to_short_fp32 #130324

Merged

lei137 reviewed Mar 7, 2025

View reviewed changes

lei137 linked an issue Mar 7, 2025 that may be closed by this pull request

Clang vec_pack_to_short_fp32 implementation generates incorrect results #60822

Closed

Merge remote-tracking branch 'lei137_llvm/lei/addNFCtest' into clang_…

ee4c4e5

…altivec_f32_to_f16_fix_030525

llvmbot added backend:PowerPC backend:X86 labels Mar 7, 2025

Updated test to reflect changes made to vec_pack_to_short_fp32

9aed4ea

lei137 added a commit that referenced this pull request Mar 7, 2025

[NFC] Add additional checks to test for vec_pack_to_short_fp32 (#130324)

23a44b9

Update test in prep for IR changes that will be introduced in #129923

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 7, 2025

Automerge: [NFC] Add additional checks to test for vec_pack_to_short_…

7239f64

…fp32 (#130324) Update test in prep for IR changes that will be introduced in llvm/llvm-project#129923

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixed vec_pack_to_short_fp32 in Clang altivec.h #129923

Fixed vec_pack_to_short_fp32 in Clang altivec.h #129923

johnplatts commented Mar 5, 2025

github-actions bot commented Mar 5, 2025

llvmbot commented Mar 5, 2025 •

edited

Loading

llvmbot commented Mar 5, 2025

johnplatts commented Mar 6, 2025

lei137 left a comment

lei137 commented Mar 17, 2025

Fixed vec_pack_to_short_fp32 in Clang altivec.h #129923

Are you sure you want to change the base?

Fixed vec_pack_to_short_fp32 in Clang altivec.h #129923

Conversation

johnplatts commented Mar 5, 2025

github-actions bot commented Mar 5, 2025

llvmbot commented Mar 5, 2025 • edited Loading

llvmbot commented Mar 5, 2025

johnplatts commented Mar 6, 2025

lei137 left a comment

Choose a reason for hiding this comment

lei137 commented Mar 17, 2025

llvmbot commented Mar 5, 2025 •

edited

Loading