Description
I encountered Illegal instruction (core dumped) when running apps from an SDK that's provided in a .sif file. The sif is built for amd64, and my system is an aarch64 (M2) machine with macOS 15.3. This is what I'm running:
$ limactl start template://apptainer --vm-type=vz --rosetta --mount-writable --mount-type=virtiofs --name apptainer
$ limactl shell apptainer
me@lima-apptainer$ singularity shell /path/to/sdk.sif
singularity> vi /viewing/files/works/fine.txt
singularity> sdk_debug_shell
Illegal instruction (core dumped)
Using these commands, I can set up my lima vm with apptainer, get shell access to the sif, and can browse around to view files. However, running any of the sdk apps (python apps) will error as shown above. I'm trying to use rosetta/vz for performance reasons.
I get the same error when running with qemu in system-mode, where I can also get shell access to the sif and browse around, but running bigger apps will crash in the exact same way as shown above (replace the first line with limactl start template://apptainer --vm-type=qemu --arch=x86_64 --mount-writable --name apptainer).
However, the error disappears when adding --set '.cpuType.x86_64="max"'.
Please could you advise if there is a similar work-around for rosetta/vz?
Activity
afbjorklund commented on Feb 5, 2025
Without having the app or preferably a small reproducer, it is hard to know why it is crashing in emulation.
For instance, it was discovered that AVX-512 returns SIGILL when running on macOS - see #3022 (comment)
But most apps should not try to run that (v4) without explicitly being asked to?
https://www.phoronix.com/news/Linus-Torvalds-On-AVX-512
Even the AVX (v3) doesn't seem to make much of a difference, performance-wise...
https://www.phoronix.com/news/RedHat-RHEL10-x86-64-v3-Explore
afbjorklund commented on Feb 5, 2025
Running a simple program using avx is enough to reproduce...
While your problem might be different, it is the suspected reason.
It behaves the same way, when running on older real hardware.
Related: https://stackoverflow.com/questions/56621809/getting-illegal-instruction-while-running-a-basic-avx512-code
Real code is supposed to be able to target the current architecture at runtime, but the emulation complicates things.
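To see what the guest actually advertises, a minimal sketch like the following could map the flags in /proc/cpuinfo to the x86-64 microarchitecture levels (v2/v3/v4) mentioned above. The function names are made up for illustration. It only reflects what the guest kernel reports, so it should be meaningful in the QEMU x86_64 VM; in an aarch64 guest using Rosetta, /proc/cpuinfo describes the ARM CPU (as far as I know), and a CPUID-based tool such as the one linked in this thread would be needed instead.

```python
from pathlib import Path

# Flag names as spelled in /proc/cpuinfo ("pni" is SSE3, "abm" covers LZCNT).
LEVEL_FLAGS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # e.g. on an aarch64 guest, which has 'Features' instead


def x86_64_level(flags):
    """Return the highest level for which this and all lower levels are complete."""
    level = "x86-64-v1"
    for name, required in LEVEL_FLAGS:
        if not required <= flags:
            break
        level = name
    return level


def missing_flags(flags, target):
    """List the flags still missing to reach the target level, e.g. 'x86-64-v4'."""
    missing = set()
    for name, required in LEVEL_FLAGS:
        missing |= required - flags
        if name == target:
            break
    return sorted(missing)


if __name__ == "__main__":
    flags = cpu_flags(Path("/proc/cpuinfo").read_text())
    print("level:", x86_64_level(flags))
    print("missing for v4:", missing_flags(flags, "x86-64-v4"))
```

If the reported level is below what a binary was compiled for, and the binary does not check at runtime, SIGILL is the expected result.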
afbjorklund commented on Feb 5, 2025
Example CPU capabilities (cpuid): https://github.com/klauspost/cpuid
afbjorklund commented on Feb 5, 2025
Apparently macOS 15 adds support for AVX2 (v3) but not for AVX512 (v4)
https://en.wikipedia.org/wiki/Rosetta_(software) - "macOS Sequoia"
Here are some code examples that use AVX or AVX2, for testing with:
https://github.com/kshitijl/avx2-examples
afbjorklund commented on Feb 5, 2025
As a workaround for the older Rosetta, you can disable it and use qemu instead:
echo -1 | sudo tee /proc/sys/fs/binfmt_misc/rosetta
That will give you more CPU features, but AVX-512 is not yet supported by QEMU,
i.e. the AVX/AVX2 programs will now run, but the AVX-512 ones will continue to crash:
qemu: uncaught target signal 4 (Illegal instruction) - core dumped
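To confirm which handler is actually picking up x86_64 binaries after toggling this, the entries under /proc/sys/fs/binfmt_misc can be read back. The sketch below is only an illustration with made-up function names. It assumes the usual entry layout (a first line of "enabled" or "disabled", followed by "interpreter", "flags:", "offset", "magic" and "mask" lines). Apart from "rosetta", which appears in the command above, the handler names depend on what is installed in the guest.

```python
from pathlib import Path

BINFMT_DIR = Path("/proc/sys/fs/binfmt_misc")


def parse_binfmt_entry(text):
    """Parse the text of one binfmt_misc entry file into a dict."""
    lines = text.splitlines()
    entry = {"enabled": bool(lines) and lines[0].strip() == "enabled"}
    for line in lines[1:]:
        if not line.strip():
            continue
        # Lines look like "interpreter /path" or "flags: OCF".
        key, _, value = line.partition(" ")
        entry[key.rstrip(":")] = value.strip()
    return entry


def registered_handlers(binfmt_dir=BINFMT_DIR):
    """Map each registered handler name to its parsed entry."""
    handlers = {}
    if not binfmt_dir.is_dir():
        return handlers
    for path in sorted(binfmt_dir.iterdir()):
        # "register" and "status" are control files, not handlers.
        if path.name in ("register", "status"):
            continue
        handlers[path.name] = parse_binfmt_entry(path.read_text())
    return handlers


if __name__ == "__main__":
    for name, entry in registered_handlers().items():
        state = "enabled" if entry["enabled"] else "disabled"
        print(f"{name}: {state}, interpreter={entry.get('interpreter')}")
```

If no rosetta entry is listed after the echo -1 command, x86_64 binaries should fall through to whatever other handler is registered for them, if any.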
n-io commented on Feb 19, 2025
I managed to dig a bit deeper, and the error occurs during a python import:
I found the underlying .so file, but readelf -A /sdk/debug/lib/instruction_trace/sdkinstrtracepybind.cpython-38-x86_64-linux-gnu.so returned no architecture specifics.
The avx2-examples you mentioned above don't work with either the -march=x86-64-v3 or -march=x86-64-v4 flag.
n-io commented on Feb 19, 2025
I noticed in man arch that rosetta takes both -x86_64 and -x86_64h options. Is there a way to control this from lima?
jandubois commented on Feb 19, 2025
I don't think this means that Rosetta supports it.
The MachO file format supports "universal binaries" that can contain multiple versions of the same program, compiled for different architectures. The arch command allows you to launch the variant for a specific architecture, assuming the host CPU supports it. This has nothing to do with Rosetta.
afbjorklund commented on Feb 20, 2025
This seems related to https://sdk.cerebras.net/installation-guide#apple-silicon-mac-installation, maybe ask the vendor?
"Running the Cerebras SDK on an Apple Silicon Mac or other ARM machine requires x86 emulation.
Performance will be sluggish, and emulation bugs are possible."
It might be as simple as adding vmType: qemu to the template?
The VZ/Rosetta support is new, so maybe it didn't exist when the template was created.
n-io commented on Feb 20, 2025
It might seem related to https://bugs.launchpad.net/lxml/+bug/2059910
I'm interested in getting vz to work for performance reasons. It all works fine with qemu as mentioned above, but is, well, sluggish.
afbjorklund commented on Feb 20, 2025
But fixing vz is something for Apple, no? Like the additions in macOS 15.
n-io commented on Feb 20, 2025
Ultimately, yes. It didn't work out of the box with qemu, but there was a workaround. I suppose my question was if a similar workaround might exist for vz, in case there is anything that can be configured in lima, but I understand this might be an upstream issue instead.
Thanks for taking the time btw!