http://bugs.winehq.org/show_bug.cgi?id=58420
Bug ID: 58420
Summary: Wine crashes when running msys2 iff CPU's PKU/PKRU feature is enabled
Product: Wine
Version: unspecified
Hardware: x86-64
OS: Linux
Status: UNCONFIRMED
Severity: minor
Priority: P2
Component: -unknown
Assignee: wine-bugs@winehq.org
Reporter: pipcet@protonmail.com
Distribution: ---
This is a heads-up about a bug in cygwin/msys2 which may be misreported as a wine bug.
The Cygwin code assumes that the XSAVE area size which CPUID leaf 0x0d reports in EBX is a multiple of 64 bytes. That may or may not be true on current Windows systems, but on current Linux systems running Wine on recent CPUs, a CPU feature called "PKU" or "PKRU" (memory protection keys for userspace), which is enabled by default, adds an 8-byte XSAVE area, so the reported size is no longer a multiple of 64.
This makes the Cygwin code attempt to xsave to a non-64-byte aligned area, which causes a segfault and abnormal program termination.
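To make the arithmetic concrete, here is a minimal sketch in C. It assumes the buffer is carved out by subtracting the reported size from a 64-byte-aligned address, which is my reading of the failure and not a quote of the Cygwin source. The size 2696 (2688 + 8) is only an illustrative value, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* XSAVE requires its memory operand to be 64-byte aligned. */
#define XSAVE_ALIGNMENT 64u

/* True if a reported XSAVE area size is a multiple of 64 bytes. */
bool xsave_size_is_aligned(size_t size)
{
    return size % XSAVE_ALIGNMENT == 0;
}

/* Round size up to the next multiple of a power-of-two alignment. */
size_t align_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Assumed buggy pattern: trust the size and subtract it from an aligned top. */
uintptr_t naive_xsave_buffer(uintptr_t aligned_top, size_t size)
{
    return aligned_top - size;
}

/* Defensive variant: align the resulting address down, whatever the size is. */
uintptr_t aligned_xsave_buffer(uintptr_t aligned_top, size_t size)
{
    return (aligned_top - size) & ~(uintptr_t)(XSAVE_ALIGNMENT - 1);
}
```

With a size of 2696 the naive variant yields an address that is 8 bytes short of a 64-byte boundary, which is exactly the kind of operand XSAVE rejects; the defensive variant stays aligned regardless of what CPUID reports.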
While this is clearly a bug, it's a bug that apparently cannot be triggered by current Windows systems. This means Wine fails to be bug-for-bug compatible in this case. I don't think there's an easy way around that: we'd have to trap XGETBV and CPUID instructions to pretend that a feature that is enabled by Linux actually isn't available.
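As a rough illustration of what "pretending" would mean for XGETBV, here is a sketch in C of the mask a trap handler would have to apply to the XCR0 value. It only shows the bit manipulation (PKRU state is bit 9 of XCR0); the function names are hypothetical, and nothing here actually traps the instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XCR0 bit 9 enables the PKRU state component. */
#define XCR0_X87  (UINT64_C(1) << 0)
#define XCR0_PKRU (UINT64_C(1) << 9)

/* True if the given XCR0 value has PKRU state enabled. */
bool xcr0_has_pkru(uint64_t xcr0)
{
    return (xcr0 & XCR0_PKRU) != 0;
}

/* Value a hypothetical XGETBV trap would report to hide PKRU. */
uint64_t xcr0_hide_pkru(uint64_t xcr0)
{
    /* Bit 0 (x87) is architecturally always set in a valid XCR0. */
    assert((xcr0 & XCR0_X87) != 0);
    return xcr0 & ~XCR0_PKRU;
}
```

For example, an XCR0 of 0x2E7 (x87, SSE, AVX, the three AVX-512 components and PKRU) would be reported as 0xE7. The CPUID results would need to be adjusted consistently, which is the hard part.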
It would have been nice if the Linux kernel had provided a way to disable the pku feature on a per-process basis, but AFAICT, it didn't. The nopku kernel command line argument makes things work, but requires a reboot and disables the very useful PKU feature for the entire system.
So I don't think there's a good workaround here, but maybe we should suggest to the Linux kernel people to allow per-process activation of any future XCR0 features?
Ken Sharp imwellcushtymelike@gmail.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
Keywords|        |download, source
Bernhard Übelacker bernhardu@mailbox.org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |bernhardu@mailbox.org
--- Comment #1 from Bernhard Übelacker bernhardu@mailbox.org ---
Thank you very much for the detailed information.
Just putting a link to your cygwin upstream mailing list discussion: https://cygwin.com/pipermail/cygwin/2025-June/258368.html
--- Comment #2 from pipcet@protonmail.com ---
For what it's worth, I feel quite strongly that Linux should allow "hiding" XCR0 features on a per-process basis. That would allow Wine to work around this bug on the Linux/Wine side of things, and run unfixed Cygwin software on new CPUs without a reboot (and a reduced feature set in the rebooted system).
I suggested that, and the suggestion was rejected.
This means that Wine will now move further away from bug-for-bug compatibility with Windows whenever a new CPU feature is introduced (Linux would enable it for everyone, and even if there is an option to disable it, that would necessarily require a reboot of the Linux system).
I may be overestimating how important such compatibility is to the Wine developers, but if I'm not, our best remaining option would be to make Wine use virtualization instead of hoping to be able to run a Windows program as a Linux process directly; that wouldn't make Wine an emulator, it would merely mean that we trap certain instructions to hide CPU features which Windows doesn't enable or which work differently on Linux and Windows.
--- Comment #3 from Bernhard Übelacker bernhardu@mailbox.org ---
If I understand this right, this issue makes Wine get an exception in an xsave instruction? There might be a chance to work around it by adding something to the function emulate_instruction for this case?
--- Comment #4 from pipcet@protonmail.com ---
(In reply to Bernhard Übelacker from comment #3)
> If I understand this right, this issue makes Wine get an exception in an
> xsave instruction?
Yes, but the xsave instruction itself behaves correctly. It sees an unaligned pointer and throws a general protection fault, as it is documented to do.
> There might be a chance to work around it by adding something to the
> function emulate_instruction for this case?
That would mean making xsave work with an unaligned buffer, IIUC.
It'd be very tricky: we would need to copy the correct xsave state from the context (we can't just execute xsave to a temp buffer because our xstate might be different from the state of the interrupted process), and it's possible that the layout would be different so we'd need to parse and rebuild it. That's a lot of effort for a workaround.
It seems quite unlikely any program relies on the fact that xsave to an unaligned area does not work. Still, it's possible that the workaround would break more than it fixes.
However, these are new CPUs, so arch_prctl(ARCH_SET_CPUID) might be available, which would let us trap (just) the CPUID instruction and align the reported maximum size of the XSAVE area. I don't know whether that's an option, particularly since the relevant flag is reset by execve().
There's no really good option here: if we have to live with the latest and greatest features being enabled by Linux, in XCR0 and the other control registers, then those features will become visible to Wine-run programs years before Windows implements them, and some of those programs will break.
I imagine that'll be a nightmare for APX (which interacts with xsave in very strange ways, using a different order for the storage areas depending on whether xsavec or xsave is used).
--- Comment #5 from pipcet@protonmail.com ---
Created attachment 78898 --> http://bugs.winehq.org/attachment.cgi?id=78898
LD_PRELOAD workaround for Intel CPUs; does not work on AMD. PoC.
--- Comment #6 from pipcet@protonmail.com ---
(In reply to pipcet from comment #4)
> However, these are new CPUs, so arch_prctl(ARCH_SET_CPUID) might be
> available, which would let us trap (just) the CPUID instruction and align
> the reported maximum size of the XSAVE area. I don't know whether that's an
> option, particularly since the relevant flag is reset by execve().
That's available on one of the systems I'm testing on (the Intel one), but not the other one (AMD). On the Intel system, the LD_PRELOAD library I've just attached works, without modifying Wine or MSYS2 and without rebooting. However, even though it works in tests, there's a race condition: to execute the real CPUID instruction, we momentarily turn off the arch_prctl CPUID trapping (so we don't get caught in an infinite loop). If another thread chooses that moment to execute the problematic CPUID instruction, it'll get the unfixed result...
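For readers who don't want to open the attachment, the fix-up applied to a trapped CPUID result could look roughly like the sketch below. This is not the attached code: the struct and function names are hypothetical, and rounding ECX in addition to EBX is an extra assumption on my part (the affected code is only known to consume one size from leaf 0x0d).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register values as returned by one CPUID invocation (hypothetical type). */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

#define CPUID_LEAF_XSTATE 0x0Du

/* Round a size up to the next multiple of 64 bytes. */
static uint32_t round_up_64(uint32_t value)
{
    return (value + 63u) & ~63u;
}

/*
 * Adjust the result of CPUID(leaf, subleaf) so that the XSAVE area sizes
 * reported by leaf 0x0d, sub-leaf 0 are multiples of 64. Over-reporting the
 * size is harmless: callers merely allocate a few unused bytes.
 */
void fixup_cpuid_result(uint32_t leaf, uint32_t subleaf, struct cpuid_regs *regs)
{
    assert(regs != NULL);
    if (leaf != CPUID_LEAF_XSTATE || subleaf != 0)
        return;                           /* every other leaf passes through */
    regs->ebx = round_up_64(regs->ebx);   /* size for currently enabled features */
    regs->ecx = round_up_64(regs->ecx);   /* size for all supported features (assumption) */
}
```

The race described above is unrelated to this arithmetic; it comes from having to turn trapping off while the real CPUID instruction runs.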
--- Comment #7 from pipcet@protonmail.com ---
Created attachment 78899 --> http://bugs.winehq.org/attachment.cgi?id=78899
LD_PRELOAD workaround for all CPUs; this assumes all unaligned XSAVEs are due to this specific Cygwin code sequence
This works for AMD CPUs as well as Intel, and it doesn't have the race condition. It does, however, assume that all unaligned xsave64 instructions are due to the specific Cygwin code sequence, and messes around with the Cygwin stack (and stack pointer!) in a specific way which would break all other applications.
Note that this code does not yet inspect siginfo before it dereferences %rip. If %rip is invalid or not readable, we'll get a recursive segfault and a stack overflow.
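A stricter handler would at least confirm that the faulting instruction really is an xsave64 before touching anything. Below is a minimal sketch of such a check in C, not taken from the attachment. It assumes the caller has already established that `len` bytes at the faulting address are readable (the missing check mentioned above), it only recognizes the plain REX.W + 0F AE /4 encoding with a memory operand, and it ignores legacy prefixes such as segment overrides. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if the bytes at insn look like "xsave64 <mem>":
 * a REX prefix with W set, opcode 0F AE, and a ModRM byte whose
 * reg field is 4 and whose mod field selects a memory operand.
 */
bool is_xsave64_mem(const uint8_t *insn, size_t len)
{
    assert(insn != NULL);
    if (len < 4)
        return false;                       /* prefix + opcode + ModRM */

    uint8_t rex = insn[0];
    if ((rex & 0xF0) != 0x40 || (rex & 0x08) == 0)
        return false;                       /* not REX, or REX.W clear */

    if (insn[1] != 0x0F || insn[2] != 0xAE)
        return false;                       /* not the 0F AE opcode group */

    uint8_t modrm = insn[3];
    uint8_t mod = modrm >> 6;
    uint8_t reg = (modrm >> 3) & 7;
    return reg == 4 && mod != 3;            /* /4 with a memory operand */
}
```

For instance, the byte sequence 48 0F AE 24 24 (xsave64 (%rsp)) is accepted, while 48 0F AE 2C 24 (xrstor64 (%rsp)) is rejected. Matching the opcode alone still would not prove that the fault comes from the Cygwin sequence.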
Joel Holdsworth joel@airwebreathe.org.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |joel@airwebreathe.org.uk