
Allow optarch values to be partial maps including vector extensions #3797


Open
wants to merge 9 commits into
base: develop
from

Conversation

Flamefire
Contributor

Instead of choosing the optimal compiler arguments based only on the architecture and CPU family/vendor, also use the supported vector extensions as a criterion for selecting flags.
To simplify specifying generic flags, allow partial matches, e.g. a fallback setting for any x86 arch that doesn't have more specific values.
Example:

    COMPILER_OPTIMAL_ARCHITECTURE_OPTION = {
        (systemtools.X86_64, ): 'xHost',
        (systemtools.X86_64, systemtools.AMD, systemtools.AVX2): 'mavx2',
    }
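
To illustrate how such partial keys could be resolved, here is a minimal sketch. It is not the code from this PR: `pick_optarch`, the constant values, and the "longest matching key wins" rule are assumptions made for the example, with plain strings standing in for the `systemtools` constants.

```python
# Hypothetical stand-ins for the constants in easybuild.tools.systemtools.
X86_64, AARCH64 = 'x86_64', 'AArch64'
AMD, INTEL, ARM = 'AMD', 'Intel', 'ARM'
AVX2 = 'avx2'


def pick_optarch(option_map, arch, vendor, extensions):
    """Return the value of the most specific key matching the host, or None.

    Assumed rule: a key matches when every element of it describes the host;
    among matching keys the longest one wins (first one found on a tie).
    """
    host = {arch, vendor} | set(extensions)
    best_key, best_value = None, None
    for key, value in option_map.items():
        if not set(key) <= host:
            continue  # the key names something this host does not have
        if best_key is None or len(key) > len(best_key):
            best_key, best_value = key, value
    return best_value


OPTIONS = {
    (X86_64,): 'xHost',
    (X86_64, AMD, AVX2): 'mavx2',
}

print(pick_optarch(OPTIONS, X86_64, AMD, [AVX2]))    # specific AMD + AVX2 entry
print(pick_optarch(OPTIONS, X86_64, INTEL, [AVX2]))  # falls back to the x86_64 entry
print(pick_optarch(OPTIONS, AARCH64, ARM, []))       # no entry matches
```

With this rule, the one-element key acts as the fallback for any x86_64 host, while the three-element key only applies to AMD CPUs that report AVX2.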

Closes #3793

@Flamefire
Contributor Author

@boegel Updated this PR

@boegel boegel removed this from the 4.6.1 milestone Sep 9, 2022
The Intel compilers will use SSE2 when passed -xHost on AMD systems.
Be more specific here by passing e.g. -mavx2 when AVX2 is available.
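
As a rough illustration of the idea in this commit message, the sketch below reads the feature flags from `/proc/cpuinfo`-style text and picks a flag accordingly. The helper names and the selection policy are hypothetical and are not taken from the PR; only the `xHost` and `mavx2` values come from the example above.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags listed on 'flags' lines of /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(':')
        if sep and name.strip() == 'flags':
            flags.update(value.split())
    return flags


def intel_optarch_on_amd(flags):
    """Hypothetical policy: use mavx2 when AVX2 is reported, otherwise keep xHost."""
    return 'mavx2' if 'avx2' in flags else 'xHost'


# Shortened sample input; a real /proc/cpuinfo lists many more flags.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2\n"

print(intel_optarch_on_amd(cpu_flags(SAMPLE)))
```

On a Linux host the same function could be fed the contents of `/proc/cpuinfo`; other platforms expose CPU features differently, so this sketch only covers that one case.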
Avoid spurious failures if the lock gets removed before the code checks
for it.
@jfgrimm jfgrimm self-assigned this Apr 24, 2024
@jfgrimm jfgrimm added the EasyBuild-5.0 EasyBuild 5.0 label Apr 24, 2024
@jfgrimm jfgrimm modified the milestones: release after 4.9.1, 5.0 Apr 24, 2024
@Flamefire
Contributor Author

Ping on this, as it came up at the EUM with respect to the Intel compilers.

@joeydumont
Contributor

To add to this, it seems that Spack has already done some work on mapping CPU feature flags to compiler versions; see microarchitectures.json. There is some logic in Portage as well, where the CPU_FLAGS variables map to USE flags, but this probably maps less closely to the problem addressed in this PR than what archspec does.

@Flamefire
Contributor Author

We just got hit by this not being implemented:

tf_gen.F90(172): catastrophic error: Function return parameter requires SSE register while SSE is disabled.

This happened when compiling HDF5 with the Intel compiler and -xhost on an AMD EPYC 7702.

@boegel boegel modified the milestones: 5.0.0, 5.x Mar 17, 2025
Successfully merging this pull request may close these issues.

Add optimal optimization flags for Intel compilers on AMD CPUs