The announcement of KDE Neon dev/unstable switching to Wayland by default raised quite a few worried comments, as NVIDIA’s proprietary driver is not supported. One thing should be clear: we won’t break any setups. We will make sure that X11 is selected by default if the given hardware setup does not support Wayland. Nevertheless, I think the number of questions shows that I should discuss this in more detail.
NVIDIA does support Wayland – kind of. The solution they came up with is not compatible with any existing Wayland compositor and requires patches to make it work. For the reference implementation Weston there are patches provided by NVIDIA, but those have not been integrated yet. For KWin such patches do not exist, and we have no plans to develop such an adaptation as long as the patches are not merged into Weston. Even if there were patches, we would not merge them as long as they are not merged into Weston.
The solution NVIDIA came up with requires different code paths. This is unfortunate, as it would require driver specific adjustments and driver specific code paths. This is bad for everybody involved: for us developers, for the driver developers and, most importantly, for our users. It means that we developers have to spend time on implementing and maintaining a solution for one driver – time which could be spent on fixing bugs instead. We could make such an effort for one driver, but once every driver requires its own adjustments it becomes unmanageable.
But even adjustments for a single driver are problematic. The latest NVIDIA driver caused a regression in KWin. On Quadro hardware (other hardware does not seem to be affected) our shader self test fails, which results in compositing being disabled. If one removes the shader self test, everything works fine, though. I assume that there is a bug in KWin’s rendering of the self test which is triggered only by this driver. But as I don’t have such hardware, I cannot verify this. Yes, I did pass multiple patches for investigating and trying to fix it to a colleague with such hardware. No, please don’t donate hardware to me.
In the end, after spending more than half a day on it, we had to go for the worst option: adding a driver and hardware specific check to disable the self test and shipping it with the 5.7.5 release. Adding such checks is super problematic for code maintainability. We are hiding a bug and we cannot investigate it. We are now stuck with an implementation where we will never be able to say “we can remove that again”. Driver specific workarounds tend to stick around. E.g. we have a check like this:
    // Broken on Intel chips with Mesa 9.1 - BUG 313613
    if (gl->driver() == Driver_Intel
            && gl->mesaVersion() >= kVersionNumber(9, 1)
            && gl->mesaVersion() < kVersionNumber(9, 2))
        return;
Nowadays it’s absolutely pointless to have such code around, as nobody is using such a Mesa version anymore. But the code is still there, makes everything more complex and has a maintenance cost. This is why driver specific implementations are bad and nothing we want in our code base.
People asked us to be pragmatic, because NVIDIA is so important. I am absolutely pragmatic here: we don't have the resources to develop and maintain an NVIDIA specific implementation on Wayland.
Some people also complained that this is unfair, because we do have an implementation for (proprietary) Android drivers. I need to point out that the two do not compare at all.
First of all, our Android implementation is not specific to a proprietary driver. It is written for the open source hwcomposer interface exposed through libhybris. All of that is open source. The fact that the actual driver might be proprietary is nothing we like, but it is also not relevant for our implementation.
In addition, the implementation is encapsulated in a platform plugin and significantly reduced in functionality (only one screen, no cursor, etc.). This is something we would not be able to do for NVIDIA (you would want multi-screen, right?).
For NVIDIA we would have to add a deviation to the DRM platform plugin to create the OpenGL context in a different way. This is something our architecture does not support and was not designed for. The general idea is that if creating the GBM based context fails, KWin will terminate. Adding support for a different way to get an OpenGL context up and running would result in lots of added complexity in a very important code path. We have to ensure that KWin terminates if OpenGL fails. At the same time we have to make sure that llvmpipe is not picked if NVIDIA hardware is used. This would be a horrible mess to maintain - especially as developers are not able to test it without huge effort. The sketch below illustrates the problem.
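To make that concrete, here is a hypothetical sketch – invented helper names, not actual KWin code – of what backend selection would turn into once an EGLStreams fallback is bolted onto the GBM path:

    // Hypothetical sketch (invented names, not actual KWin code) of
    // OpenGL backend selection with an EGLStreams fallback bolted on.
    #include <memory>

    class DrmBackend;          // the DRM platform plugin
    class AbstractEglBackend;  // base class of the OpenGL backends

    // assumed helpers, for illustration only
    std::unique_ptr<AbstractEglBackend> createGbmBackend(DrmBackend *drm);
    std::unique_ptr<AbstractEglBackend> createEglStreamsBackend(DrmBackend *drm);
    bool isNvidiaGpu(const DrmBackend *drm);

    std::unique_ptr<AbstractEglBackend> createOpenGLBackend(DrmBackend *drm)
    {
        // Today there is exactly one path: either the GBM based context
        // works, or compositing cannot work at all and KWin terminates.
        if (auto backend = createGbmBackend(drm)) {
            return backend;
        }
        if (isNvidiaGpu(drm)) {
            // Driver specific path which cannot be tested with Mesa.
            if (auto backend = createEglStreamsBackend(drm)) {
                return backend;
            }
            // Must not silently fall back to llvmpipe on NVIDIA
            // hardware, so this branch also has to end in termination.
        }
        return nullptr; // caller terminates KWin
    }

Every additional branch in this path is a branch that has to fail safely on hardware most of us cannot test on.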
From what I understand of the patch set, it would also require significant changes to how a frame is presented on an output, and thereby make our lower level code more complex. This code is currently able to serve both our OpenGL and our QPainter based compositors, but it would not be able to support NVIDIA's implementation. Adding changes there would hinder us in future development of the platform plugin. This is an important area we are working on, and KWin 5.8 contains a new additional implementation making use of atomic mode setting. We want to have atomic mode setting used everywhere in the stack to have every frame perfect. NVIDIA's implementation would make that difficult.
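For readers unfamiliar with atomic mode setting: the point is that all changes belonging to one frame are committed to the kernel together, or not at all. A minimal, heavily simplified sketch with libdrm (the function name and parameters are mine; property ID lookup and error handling are omitted):

    #include <xf86drm.h>
    #include <xf86drmMode.h>

    // Queue a new framebuffer on a plane as one atomic commit.
    bool presentFrame(int drmFd, uint32_t planeId,
                      uint32_t fbPropId, uint32_t fbId, void *userData)
    {
        drmModeAtomicReq *req = drmModeAtomicAlloc();
        if (!req) {
            return false;
        }
        // All properties added to the request are applied together,
        // which is what makes "every frame perfect" possible.
        drmModeAtomicAddProperty(req, planeId, fbPropId, fbId);
        const int ret = drmModeAtomicCommit(drmFd, req,
                                            DRM_MODE_ATOMIC_NONBLOCK |
                                            DRM_MODE_PAGE_FLIP_EVENT,
                                            userData);
        drmModeAtomicFree(req);
        return ret == 0;
    }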
EGLStreams bring another disadvantage, as the code to bind a buffer (what a window renders) to a texture (what the compositor needs to render) would need changes. Binding the buffer is currently performed by KWin core and is not part of the plugin infrastructure. Given that, new additional code would also be needed there. We don't need that for any other platform we currently support. E.g. for hwcomposer on Android, libhybris takes care of allowing us to use EGL the same way as on any other platform. I absolutely do not understand why changes would be needed there. Existing code shows that it can be done differently. And here we see again why I think the situation with EGLStreams does not compare at all to supporting hwcomposer.
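For context, here is a sketch of the single buffer-to-texture path that works across drivers today, via EGL_WL_bind_wayland_display and GL_OES_EGL_image (error handling omitted, extension prototypes assumed to be available). As far as I understand the patch set, EGLStreams would instead require attaching each surface to a stream and acquiring frames through EGL_KHR_stream_consumer_gltexture – a second, driver specific path next to this one:

    #define EGL_EGLEXT_PROTOTYPES
    #define GL_GLEXT_PROTOTYPES
    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>

    struct wl_resource;

    // One code path for every driver implementing the extensions:
    // wrap the client's buffer in an EGLImage and bind it to a texture.
    GLuint textureForWaylandBuffer(EGLDisplay display, wl_resource *buffer)
    {
        EGLImageKHR image = eglCreateImageKHR(display, EGL_NO_CONTEXT,
                                              EGL_WAYLAND_BUFFER_WL,
                                              (EGLClientBuffer) buffer,
                                              nullptr);
        GLuint texture = 0;
        glGenTextures(1, &texture);
        glBindTexture(GL_TEXTURE_2D, texture);
        glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, (GLeglImageOES) image);
        return texture;
    }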
Overall, we are not thrilled by the prospect of two competing implementations. We do hope that at XDC the discussions will come to a positive end and that there will be only one implementation. I don't care which one, and I don't care whether one is better than the other. What I care about is requiring only one code path, the possibility to test with free drivers (Mesa), and support for atomic mode setting. Ideally, I would also prefer not to have to adjust existing code.