add missing ceil avx, sse functions #1207

Merged
merged 1 commit into from
Jun 1, 2022

Conversation

isuruf
Collaborator

@isuruf isuruf commented Jan 16, 2022

This was noticed in #1120, but I decided to implement it as a separate PR.

cc @d-parks

__mth_i_ceil_sse(float x)
{
__asm__(
"roundss $0x2,%0,%0"
Collaborator

Is it correct to hardcode the rounding mode here? What if the program has called ieee_set_rounding_mode?

From Intel's documentation:

Round to nearest (even), 00B: Rounded result is the closest to the infinitely precise result. If two values are equally close, the result is the even value (i.e., the integer value with the least-significant bit of zero).

Round down (toward −∞), 01B: Rounded result is closest to but no greater than the infinitely precise result.

Round up (toward +∞), 10B: Rounded result is closest to but no less than the infinitely precise result.

Round toward zero (truncate), 11B: Rounded result is closest to but no greater in absolute value than the infinitely precise result.

Whatever rounding mode the user has currently specified should not affect computing ceiling or floor.
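
To make the point above concrete, here is a minimal sketch, not the code from this PR. It assumes an x86-64 target with SSE4.1 and a GCC- or Clang-compatible compiler. The function names `ceil_roundss`, `round_current_mode`, and `ceil_ignores_mxcsr` are hypothetical. In the `roundss` immediate, bits 1:0 carry the rounding mode from the table above, and bit 2 selects whether that field (0) or the mode in MXCSR (1) is used, so `$0x2` always rounds up while `$0x4` follows the mode the program has set.

```c
#include <stddef.h>
#include <xmmintrin.h> /* _MM_SET_ROUNDING_MODE, _MM_GET_ROUNDING_MODE, _MM_ROUND_* */

/* Ceiling via ROUNDSS with immediate 0x2: bit 2 is clear, so the rounding
 * mode comes from bits 1:0 (10B, round up) and MXCSR is not consulted. */
static float ceil_roundss(float x)
{
    __asm__ volatile("roundss $0x2, %0, %0" : "+x"(x));
    return x;
}

/* For contrast: immediate 0x4 sets bit 2, so the result follows whatever
 * rounding mode is currently stored in MXCSR. */
static float round_current_mode(float x)
{
    __asm__ volatile("roundss $0x4, %0, %0" : "+x"(x));
    return x;
}

/* Returns 1 if ceil_roundss(x) equals `expected` under all four MXCSR
 * rounding modes, 0 otherwise. The caller's mode is restored on exit. */
static int ceil_ignores_mxcsr(float x, float expected)
{
    static const unsigned int modes[] = {
        _MM_ROUND_NEAREST, _MM_ROUND_DOWN, _MM_ROUND_UP, _MM_ROUND_TOWARD_ZERO
    };
    unsigned int saved = _MM_GET_ROUNDING_MODE();
    int ok = 1;

    for (size_t i = 0; i < sizeof modes / sizeof modes[0]; i++) {
        _MM_SET_ROUNDING_MODE(modes[i]);
        if (ceil_roundss(x) != expected)
            ok = 0;
    }
    _MM_SET_ROUNDING_MODE(saved);
    return ok;
}
```

With MXCSR set to round down, `round_current_mode(1.25f)` yields `1.0f`, while `ceil_roundss(1.25f)` still yields `2.0f`. That is the behavior the hardcoded immediate is meant to guarantee.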

Collaborator

Thanks for the clarification.

Collaborator

@bryanpkc bryanpkc left a comment

LGTM.

@kiranchandramohan kiranchandramohan merged commit 9137a34 into flang-compiler:master Jun 1, 2022
@isuruf isuruf deleted the ceil branch June 1, 2022 14:20