Illegal Instruction in amd64 .sif on macOS/aarch64 rosetta #3195

Open
@n-io

Description

I encountered Illegal instruction (core dumped) when running apps from an SDK that's provided in a .sif file. The .sif is built for amd64, and my system is an aarch64 (M2) machine with macOS 15.3. This is what I'm running:

$ limactl start template://apptainer --vm-type=vz --rosetta --mount-writable --mount-type=virtiofs --name apptainer
$ limactl shell apptainer
me@lima-apptainer$ singularity shell /path/to/sdk.sif
singularity> vi /viewing/files/works/fine.txt
singularity> sdk_debug_shell
Illegal instruction (core dumped)

Using these commands, I can set up my lima vm with apptainer, get shell access to the sif, and can browse around to view files. However, running any of the sdk apps (python apps) will error as shown above. I'm trying to use rosetta/vz for performance reasons.

I get the same error when running with qemu in system-mode, where I can also get shell access to the sif and browse around, but running bigger apps will crash in the exact same way as shown above (replace the first line with limactl start template://apptainer --vm-type=qemu --arch=x86_64 --mount-writable --name apptainer).

However, the error disappears when adding --set '.cpuType.x86_64="max"'.

Please could you advise if there is a similar work-around for rosetta/vz?

Activity

afbjorklund (Member) commented on Feb 5, 2025

Without the app, or preferably a small reproducer, it is hard to know why it is crashing in emulation.

For instance, it was discovered that AVX-512 returns SIGILL when running on macOS - see #3022 (comment)

But most apps should not try to run that (v4) without explicitly being asked to?

https://www.phoronix.com/news/Linus-Torvalds-On-AVX-512

Even the AVX (v3) doesn't seem to make much of a difference, performance-wise...

https://www.phoronix.com/news/RedHat-RHEL10-x86-64-v3-Explore

"Even if we cannot show performance improvements for software included in RHEL,
it may still make sense to go ahead with the switch"

afbjorklund (Member) commented on Feb 5, 2025

Running a simple program compiled for AVX-512 (x86-64-v4) is enough to reproduce:

afb@lima-apptainer:~$ sudo apt install -y g++-x86-64-linux-gnu
afb@lima-apptainer:~$ x86_64-linux-gnu-g++ -O3 -march=x86-64-v4 -static test.cpp 
afb@lima-apptainer:~$ ./a.out 
Illegal instruction (core dumped)

While your problem might be different, it is the suspected reason.

It behaves the same way, when running on older real hardware.


Related: https://stackoverflow.com/questions/56621809/getting-illegal-instruction-while-running-a-basic-avx512-code

Real code is supposed to be able to target the current architecture at runtime, but the emulation complicates things.

afbjorklund (Member) commented on Feb 5, 2025

Example CPU capabilities (cpuid):

Name: VirtualApple @ 2.50GHz
Vendor String: GenuineIntel
Vendor ID: Intel
PhysicalCores: 1
Threads Per Core: 1
Logical Cores: 1
CPU Family 6 Model: 44 Stepping: 0
Features: AESNI,CLMUL,CMOV,CMPXCHG8,CX16,FXSR,FXSROPT,LAHF,MMX,NX,OSXSAVE,POPCNT,RDTSCP,SSE,SSE2,SSE3,SSE4,SSE42,SSSE3,SYSCALL,SYSEE,X87,XSAVE
Microarchitecture level: 2
Cacheline bytes: 64
L1 Instruction Cache: 131072 bytes
L1 Data Cache: 131072 bytes
L2 Cache: 8388608 bytes
L3 Cache: 0 bytes
Frequency: 2500000000 Hz

https://github.com/klauspost/cpuid
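
The "Microarchitecture level" line can be re-derived from the "Features" line of such a report. Below is a minimal sketch, assuming the feature names printed by the cpuid tool above; the AVX512* names do not appear in this thread and are an assumption based on the tool's naming pattern. The grouping follows the published x86-64 psABI levels (v1 to v4).

```python
"""Estimate the x86-64 microarchitecture level from a cpuid-style report."""
import sys

# Feature groups per level. Names follow the "Features:" lines in this thread;
# the AVX512* names are an assumption, since no report here contains them.
LEVELS = [
    (1, {"CMOV", "CMPXCHG8", "X87", "FXSR", "MMX", "SYSCALL", "SSE", "SSE2"}),
    (2, {"CX16", "LAHF", "POPCNT", "SSE3", "SSE4", "SSE42", "SSSE3"}),
    (3, {"AVX", "AVX2", "BMI1", "BMI2", "F16C", "FMA3", "LZCNT", "MOVBE", "OSXSAVE"}),
    (4, {"AVX512F", "AVX512BW", "AVX512CD", "AVX512DQ", "AVX512VL"}),
]


def parse_features(report):
    """Return the set of names on the first 'Features:' line of the report."""
    for line in report.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Features":
            return {name.strip() for name in value.split(",") if name.strip()}
    return set()


def microarch_level(features):
    """Return the highest level for which this and every lower group is complete."""
    level = 0
    for number, required in LEVELS:
        if not required <= features:
            break
        level = number
    return level


if __name__ == "__main__":
    # Usage: pipe the cpuid report into this script on stdin.
    print("x86-64-v%d" % microarch_level(parse_features(sys.stdin.read())))
```

A binary built with -march=x86-64-v3 or -march=x86-64-v4 can be expected to hit SIGILL whenever the level reported for the emulated CPU is lower than the level it was built for.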

afbjorklund (Member) commented on Feb 5, 2025

Apparently macOS 15 adds support for AVX2 (v3) but not for AVX512 (v4)

https://en.wikipedia.org/wiki/Rosetta_(software) - "macOS Sequoia"

Here are some code examples that use AVX or AVX2, for testing:

https://github.com/kshitijl/avx2-examples

afbjorklund (Member) commented on Feb 5, 2025

As a workaround for the older Rosetta, you can disable it and use qemu instead:

echo -1 | sudo tee /proc/sys/fs/binfmt_misc/rosetta

That will give you more CPU features, but AVX-512 is not yet supported by QEMU:

Name: QEMU TCG CPU version 2.5+
Vendor String: AuthenticAMD
Vendor ID: AMD
PhysicalCores: 0
Threads Per Core: 1
Logical Cores: 0
CPU Family 6 Model: 6 Stepping: 3
Features: ADX,AESNI,AMD3DNOW,AMD3DNOWEXT,AVX,AVX2,AVXSLOW,BMI1,BMI2,CLMUL,CMOV,CMPSB_SCADBS_SHORT,CMPXCHG8,CX16,ERMS,F16C,FMA3,FSRM,FXSR,FXSROPT,HYPERVISOR,IA32_ARCH_CAP,IBPB,IBRS,LAHF,LZCNT,MMX,MMXEXT,MOVBE,MOVSB_ZL,MPX,NRIPS,NX,OSXSAVE,POPCNT,PSFD,RDRAND,RDSEED,RDTSCP,SHA,SPEC_CTRL_SSBD,SSE,SSE2,SSE3,SSE4,SSE42,SSE4A,SSSE3,STIBP,STIBP_ALWAYSON,STOSB_SHORT,SVM,SVMNP,SYSCALL,SYSEE,VAES,WBNOINVD,X87,XGETBV1,XSAVE,XSAVEOPT
Microarchitecture level: 3
Cacheline bytes: 64
L1 Instruction Cache: 65536 bytes
L1 Data Cache: 65536 bytes
L2 Cache: 524288 bytes
L3 Cache: -1 bytes

i.e. the AVX/AVX2 programs will now run, but the AVX-512 will continue to crash:

qemu: uncaught target signal 4 (Illegal instruction) - core dumped
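
To check which binfmt_misc handler is registered before and after the workaround, the entries under /proc/sys/fs/binfmt_misc can be read directly. This is a rough sketch: the entry name "rosetta" comes from the command above, while "qemu-x86_64" is an assumed name for the QEMU user-mode handler and may differ in your guest.

```python
"""Report the state of binfmt_misc handlers inside the guest."""
from pathlib import Path

BINFMT_DIR = Path("/proc/sys/fs/binfmt_misc")


def handler_status(name, binfmt_dir=BINFMT_DIR):
    """Return (state, interpreter) for a binfmt_misc entry.

    state is 'absent' when the entry does not exist (for example after
    writing -1 to it), otherwise the first line of the entry file, which
    the kernel sets to 'enabled' or 'disabled'.
    """
    entry = Path(binfmt_dir) / name
    if not entry.is_file():
        return "absent", None
    lines = entry.read_text().splitlines()
    state = lines[0].strip() if lines else "disabled"
    interpreter = None
    for line in lines[1:]:
        if line.startswith("interpreter "):
            interpreter = line.split(" ", 1)[1]
    return state, interpreter


if __name__ == "__main__":
    # "qemu-x86_64" is an assumed entry name; list the directory to see yours.
    for name in ("rosetta", "qemu-x86_64"):
        print(name, *handler_status(name))
```

If "rosetta" shows as absent and a QEMU entry shows as enabled, amd64 binaries in the guest should be going through QEMU user-mode emulation.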

n-io (Author) commented on Feb 19, 2025

I managed to dig a bit deeper, and the error occurs during a python import:

$ limactl shell apptainer
me@lima-apptainer$ singularity shell /path/to/sdk.sif
Singularity> PYTHONFAULTHANDLER="1" python3
Python 3.8.16 (default, Mar 18 2024, 18:27:40) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from vendor.sdk.debug.lib.instruction_trace import sdkinstrtracepybind
Fatal Python error: Illegal instruction

Current thread 0x00007ffffdf1f740 (most recent call first):
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 1132 in create_module
  File "<frozen importlib._bootstrap>", line 556 in module_from_spec
  File "<frozen importlib._bootstrap>", line 657 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
  File "<stdin>", line 1 in <module>
Illegal instruction (core dumped)

Singularity> python3 --version
Python 3.8.16
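
To find out which imports crash without losing the interactive session each time, every import can be tried in a child interpreter. This is only a sketch: `probe` is a hypothetical helper name, and the modules to test are whatever you pass on the command line.

```python
"""Try imports in a child interpreter so a SIGILL does not kill the caller."""
import signal
import subprocess
import sys


def probe(code, python=sys.executable):
    """Run `code` in a child interpreter and describe how it ended."""
    result = subprocess.run([python, "-c", code], capture_output=True, text=True)
    if result.returncode < 0:
        # subprocess reports death by signal N as return code -N.
        return "killed by " + signal.Signals(-result.returncode).name
    return "ok" if result.returncode == 0 else "exit %d" % result.returncode


if __name__ == "__main__":
    # One probe per module name given on the command line.
    for module in sys.argv[1:]:
        print(module, "->", probe("import " + module))
```

Run inside the container, a module whose shared object executes an unsupported instruction at import time should show up as "killed by SIGILL", while pure-Python modules report "ok".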

I found the underlying .so file, but readelf -A /sdk/debug/lib/instruction_trace/sdkinstrtracepybind.cpython-38-x86_64-linux-gnu.so returned no architecture specifics.
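
Since readelf -A shows nothing useful here, one heuristic is to scan the disassembly of the .so for vector register names. The sketch below assumes an objdump that understands x86-64 is available (for example the one inside the amd64 container). It only catches AVX and AVX-512 register use, not other extensions such as BMI or FMA, and a hit only means such code exists in the file, not that it is the instruction that faulted.

```python
"""Count AVX and AVX-512 register mentions in objdump output (heuristic)."""
import re
import subprocess
import sys

# AVX-512 code uses zmm registers or k0-k7 mask registers; AVX/AVX2 uses ymm.
ZMM_OR_MASK = re.compile(r"%zmm\d+|%k[0-7]\b")
YMM = re.compile(r"%ymm\d+")


def classify(disassembly):
    """Count disassembly lines that mention AVX-512 or AVX/AVX2 registers."""
    counts = {"avx512": 0, "avx": 0}
    for line in disassembly.splitlines():
        if ZMM_OR_MASK.search(line):
            counts["avx512"] += 1
        elif YMM.search(line):
            counts["avx"] += 1
    return counts


if __name__ == "__main__":
    # objdump -d prints AT&T syntax by default, where registers start with '%'.
    text = subprocess.run(
        ["objdump", "-d", sys.argv[1]], capture_output=True, text=True, check=True
    ).stdout
    print(classify(text))
```

A non-zero "avx" count together with a cpuid report that lacks AVX would be consistent with the crash; a zero count for both suggests looking at other instruction set extensions.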

The avx2-examples you mentioned above don't work with either the -march=x86-64-v3 or -march=x86-64-v4 flag.

n-io (Author) commented on Feb 19, 2025

I noticed in man arch that rosetta takes both -x86_64 and -x86_64h options, is there a way to control this from lima?

jandubois (Member) commented on Feb 19, 2025

"I noticed in man arch that rosetta takes both -x86_64 and -x86_64h options, is there a way to control this from lima?"

I don't think this means that Rosetta supports it.

The Mach-O file format supports "universal binaries" that can contain multiple versions of the same program, compiled for different architectures. The arch command allows you to launch the variant for a specific architecture, assuming the host CPU supports it. This has nothing to do with Rosetta.

afbjorklund (Member) commented on Feb 20, 2025

This seems related to https://sdk.cerebras.net/installation-guide#apple-silicon-mac-installation, maybe ask the vendor?

"Running the Cerebras SDK on an Apple Silicon Mac or other ARM machine requires x86 emulation.
Performance will be sluggish, and emulation bugs are possible."

It might be as simple as adding vmType: qemu to the template?

The VZ/Rosetta support is new, so maybe it didn't exist when the template was created.

n-io (Author) commented on Feb 20, 2025

It might be related to https://bugs.launchpad.net/lxml/+bug/2059910

I'm interested in getting vz to work for performance reasons. It all works fine with qemu as mentioned above, but is, well, sluggish.

afbjorklund (Member) commented on Feb 20, 2025

But fixing vz is something for Apple, no? Like the additions in macOS 15

n-io (Author) commented on Feb 20, 2025

Ultimately, yes. It didn't work out of the box with qemu, but there was a workaround. I suppose my question was if a similar workaround might exist for vz, in case there is anything that can be configured in lima, but I understand this might be an upstream issue instead.

Thanks for taking the time btw!

          Illegal Instruction in amd64 `.sif` on macOS/aarch64 rosetta · Issue #3195 · lima-vm/lima