Do you have performance data that shows an improvement in applications?
I don't.
I don't know if this makes the code simpler or better in other ways—great if so—but if not, and there's no instance where avoiding that server call is known to be a bottleneck, I wouldn't be inclined to do it personally.
It removes all need for message queue handling in ntdll, which then doesn't have to know about message queues at all. It also removes the extra server requests, including when waiting only on the queue. On the server side, it gets rid of the awkward dual server + inproc queue sync.
It also brings improvements in win32u, with better handling of driver events, and eventually even internal messages. Right now we blindly poll for events, and that could be done only when necessary.
So I think it makes it better.
In any case I hope this isn't going to be considered necessary for ntsync to be upstreamed.
Well, I don't know that it is. The other MR has been stalled for a while without much feedback, and usually that indicates that something isn't quite right and needs to be tried differently. I can only guess, and the queue handling in ntdll looks like a possible reason.