Skip to content

Fixed vec_pack_to_short_fp32 in Clang altivec.h #129923

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

johnplatts
Copy link

Fixes bug in vec_pack_to_short_fp32 in clang/lib/Headers/altivec.h.

Resolves issue #60822

Reviewer: @lei137

Copy link

github-actions bot commented Mar 5, 2025

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics labels Mar 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 5, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: John Platts (johnplatts)

Changes

Fixes bug in vec_pack_to_short_fp32 in clang/lib/Headers/altivec.h.

Resolves issue #60822

Reviewer: @lei137


Full diff: https://github.com/llvm/llvm-project/pull/129923.diff

1 Files Affected:

  • (modified) clang/lib/Headers/altivec.h (+3-7)
diff --git a/clang/lib/Headers/altivec.h b/clang/lib/Headers/altivec.h
index 8da65055012f1..45d557238e3d9 100644
--- a/clang/lib/Headers/altivec.h
+++ b/clang/lib/Headers/altivec.h
@@ -7529,13 +7529,9 @@ vec_pack(vector double __a, vector double __b) {
 #ifdef __POWER9_VECTOR__
 static __inline__ vector unsigned short __ATTRS_o_ai
 vec_pack_to_short_fp32(vector float __a, vector float __b) {
-  vector float __resa = __builtin_vsx_xvcvsphp(__a);
-  vector float __resb = __builtin_vsx_xvcvsphp(__b);
-#ifdef __LITTLE_ENDIAN__
-  return (vector unsigned short)vec_mergee(__resa, __resb);
-#else
-  return (vector unsigned short)vec_mergeo(__resa, __resb);
-#endif
+  vector unsigned int __resa = (vector unsigned int)__builtin_vsx_xvcvsphp(__a);
+  vector unsigned int __resb = (vector unsigned int)__builtin_vsx_xvcvsphp(__b);
+  return vec_pack(__resa, __resb);
 }
 
 #endif

@llvmbot
Copy link
Member

llvmbot commented Mar 5, 2025

@llvm/pr-subscribers-backend-x86

Author: John Platts (johnplatts)

Changes

Fixes bug in vec_pack_to_short_fp32 in clang/lib/Headers/altivec.h.

Resolves issue #60822

Reviewer: @lei137


Full diff: https://github.com/llvm/llvm-project/pull/129923.diff

1 Files Affected:

  • (modified) clang/lib/Headers/altivec.h (+3-7)
diff --git a/clang/lib/Headers/altivec.h b/clang/lib/Headers/altivec.h
index 8da65055012f1..45d557238e3d9 100644
--- a/clang/lib/Headers/altivec.h
+++ b/clang/lib/Headers/altivec.h
@@ -7529,13 +7529,9 @@ vec_pack(vector double __a, vector double __b) {
 #ifdef __POWER9_VECTOR__
 static __inline__ vector unsigned short __ATTRS_o_ai
 vec_pack_to_short_fp32(vector float __a, vector float __b) {
-  vector float __resa = __builtin_vsx_xvcvsphp(__a);
-  vector float __resb = __builtin_vsx_xvcvsphp(__b);
-#ifdef __LITTLE_ENDIAN__
-  return (vector unsigned short)vec_mergee(__resa, __resb);
-#else
-  return (vector unsigned short)vec_mergeo(__resa, __resb);
-#endif
+  vector unsigned int __resa = (vector unsigned int)__builtin_vsx_xvcvsphp(__a);
+  vector unsigned int __resb = (vector unsigned int)__builtin_vsx_xvcvsphp(__b);
+  return vec_pack(__resa, __resb);
 }
 
 #endif

@johnplatts
Copy link
Author

I have discovered that there is a bug in the VSX vec_pack_to_short_fp32(a, b) intrinsic with GCC on big-endian POWER9 (but not little-endian POWER9), where vec_pack_to_short_fp32(a, b) returns {b[0], b[1], b[2], b[3], a[0], a[1], a[2], a[3]} instead of the expected {a[0], a[1], a[2], a[3], b[0], b[1], b[2], b[3]}.

PVIPR expects the vec_pack_to_short_fp32(a, b) intrinsic to return the same results on both big-endian POWER9 and little-endian POWER9.

Copy link
Contributor

@lei137 lei137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank-you for the fix John!

It would be good if your patch can show the changes that is being introduced via this patch. In prep for that I have put up a PR to add the existing checks for this test. Can you add the changes that will be introduced as part of this PR? Here is the diff once my patch lands.

diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-p9vector.c b/clang/test/CodeGen/PowerPC/builtins-ppc-p9vector.c
index 68d32ee14c8f..824267b98564 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-p9vector.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-p9vector.c
@@ -854,15 +854,13 @@ vector unsigned short test74(void) {
 // CHECK-BE: @llvm.ppc.vsx.xvcvsphp(<4 x float>
 // CHECK-BE: @llvm.ppc.vsx.xvcvsphp(<4 x float>
 // CHECK-BE: [[REG0:%[0-9]+]] = call <4 x i32> @llvm.ppc.altivec.vperm
-// CHECK-BE-NEXT: [[REG1:%[0-9]+]] = bitcast <4 x i32> [[REG0]] to <4 x float>
-// CHECK-BE-NEXT: [[REG2:%[0-9]+]] = bitcast <4 x float> [[REG1]] to <8 x i16>
-// CHECK-BE-NEXT: ret <8 x i16> [[REG2]]
+// CHECK-BE-NEXT: [[REG1:%[0-9]+]] = bitcast <4 x i32> [[REG0]] to <8 x i16>
+// CHECK-BE-NEXT: ret <8 x i16> [[REG1]]
 // CHECK: @llvm.ppc.vsx.xvcvsphp(<4 x float>
 // CHECK: @llvm.ppc.vsx.xvcvsphp(<4 x float>
 // CHECK: [[REG0:%[0-9]+]] = call <4 x i32> @llvm.ppc.altivec.vperm
-// CHECK-NEXT: [[REG1:%[0-9]+]] = bitcast <4 x i32> [[REG0]] to <4 x float>
-// CHECK-NEXT: [[REG2:%[0-9]+]] = bitcast <4 x float> [[REG1]] to <8 x i16>
-// CHECK-NEXT: ret <8 x i16> [[REG2]]
+// CHECK-NEXT: [[REG1:%[0-9]+]] = bitcast <4 x i32> [[REG0]] to <8 x i16>
+// CHECK-NEXT: ret <8 x i16> [[REG1]]
   return vec_pack_to_short_fp32(vfa, vfb);
 }
 vector unsigned int test75(void) {

Thank-you!

@lei137 lei137 linked an issue Mar 7, 2025 that may be closed by this pull request
lei137 added a commit that referenced this pull request Mar 7, 2025
Update test in prep for IR changes that will be introduced in
#129923
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 7, 2025
…fp32 (#130324)

Update test in prep for IR changes that will be introduced in
llvm/llvm-project#129923
@lei137
Copy link
Contributor

lei137 commented Mar 17, 2025

@johnplatts Please resolve the conflicts for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:PowerPC backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clang vec_pack_to_short_fp32 implementation generates incorrect results
4 participants