[UPDATE] Warning: NVIDIA driver 384.69 seems to be broken with QtQuick

Just a short warning to KDE Plasma users with NVIDIA drivers. Lately we have seen many crash reports from NVIDIA users who updated to version 384.xx. Affected is at least KWin and KScreenLocker, which means one cannot unlock the session any more. The crash happens in the NVIDIA driver triggered from somewhere in QtQuick, so completely outside of our code. A typical backtrace looks like that:

#0  0x00007fffdedffc02 in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#1  0x00007fffdf0ca9ba in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#2  0x00007fffdef953da in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#3  0x00007fffdefab2d4 in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#4  0x00007fffdef99bce in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#5  0x00007ffff5f0b681 in QSGBatchRenderer::Renderer::renderMergedBatch(QSGBatchRenderer::Batch const*) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#6  0x00007ffff5f0bf8d in QSGBatchRenderer::Renderer::renderBatches() () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#7  0x00007ffff5f11909 in QSGBatchRenderer::Renderer::render() () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#8  0x00007ffff5f0209f in QSGRenderer::renderScene(QSGBindable const&) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#9  0x00007ffff5f0257b in QSGRenderer::renderScene(unsigned int) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#10 0x00007ffff5f3dd2e in QSGDefaultRenderContext::renderNextFrame(QSGRenderer*, unsigned int) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#11 0x00007ffff5f97b04 in QQuickWindowPrivate::renderSceneGraph(QSize const&) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#12 0x00007ffff5f4715e in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#13 0x00007ffff5f4b8dc in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#14 0x00007ffff4e28989 in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#15 0x00007ffff39966ba in start_thread (arg=0x7fffd4d5a700) at pthread_create.c:333
#16 0x00007ffff47353dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

I would recommend to not update to that driver and if you are affected to downgrade again.

Update

Together with NVIDIA we figured out the root problem causing the crash. The sandbox of the lockscreen introduced in Plasma 5.10 is too restrictive for the NVIDIA driver. We have now worked around the issue by removing parts of the sandbox when running on NVIDIA and released a Plasma 5.10.5.1 release. This should be shipped to distributions shortly. We will look into a way to make the sandbox work again also for NVIDIA users with Plasma 5.11.

Distribution management – how Upstream ensures Downstream keeps the Quality

I read Emmanuele Bassi’s very interesting blog post about software distribution this week and thought a lot about it. Emmanuele kind of answers to a presentation by Richard Brown (from OpenSUSE fame). While I haven’t seen that presentation, I saw a similar one last year at the openSUSE conference and also talked with Richard about the topic. So I dare to think that I understand Richard’s arguments and even agree with most of them.

Nevertheless I want to share some of my thoughts on the topic from the perspective of KWin maintainership. KWin is more part of the OS stack, so distributing through means like Flatpack are not a valid option IMHO. As KWin is close to the OS, we have a very well integration with distributions. Plasma (which KWin is part of) has dedicated packager groups in most distributions, we have a direct communication channel to the distros, we know our packagers in large parts in person, etc. etc. So from the open source distribution model we are in the best category. We are better positioned than let’s say a new game which needs to be distributed.

So this is the background of this blog post. What I’m now going to do is to describe some of the issues we hit just over the last year with distributing our software through distros. What I will show are many issues which would not have been possible if we distributed it ourselves without a distro in between (for the record: this includes KDE Neon!). To make it quite clear: I do not think that it would be a good idea for KWin to be distributed outside distributions. While thinking about this topic I came up with a phrase to what we are doing: “Distribution management”.

We do lot of work to ensure that distributions don’t destroy our software. We actually manage our distributions.

Latest example: QML runtime dependency

Aleix created a new function in extra-cmake-modules to specify the QML runtime dependencies. Once it was available I immediately went through KWin sources, extracted all QML modules we use and added it to CMake, so that it’s shown when running CMake. Reference: D7273

This is a huge issue in fact. Plasma 5.10 introduced support for virtual keyboard in the lock screen. We had to add special code for handling that this runtime dependency is missing. I sent a mail to our distributions mailing list to remind distros to package and ship this dependency.

Obviously: if we distributed the software by ourselves this would not have been an issue at all. We would just bundle Qt Virtual Keyboard and be done. But as we are distributed through distributions we had to write special handling code for the case it’s missing and send out mails to remind distros to package it.

Incorrect dependencies – too many

Although we specify all required dependencies though CMake our distributions might add random other dependencies. For example KWin on Debian depended on QtWayland (both client and server), although KWin does not need the one or the other. This got fixed after I reported it.

This is of course a huge problem. A common saying about KDE is that it has too many dependencies and that you cannot install software on e.g. GNOME, because it pulls in everything and the kitchen sink. That’s of course true if distros add incorrect additional package dependencies.

Incorrect dependencies – too few

Of course the other side is also possible. One can have too few dependencies. Again the example is Debian, which did not make KWin depend on Breeze resulting in incorrect window decoration being set. That we depend on it at compile time is also following the “distribution management” as distros asked us to add all dependencies through CMake. So we did and made KWin optionally depend on Breeze. Of course such distribution management does not help if distributions don’t check the CMake output.

Also here if we distributed ourselves we would have compiled with all dependencies.

Outdated dependencies

Another very common problem is that distributions do not have the dependencies which we need and want to use. In case of KWin this especially became a problem during the Wayland port. For a very long time we had to keep Wayland compile time optional so that distributions like Slackware [1] could continue to compile KWin. When we turned it into a hard dependency we had to inform distros about it and check with them whether it’s working. Of course just because you inform distros does not mean that it will work out later on.

That we have distros with outdated dependencies is always a sore pain. Our users are never running the software which we are, never getting the same testing. We have to keep workarounds for outdated dependencies (e.g. KWin has a workaround for an XWayland issue fixed in 1.19) and that of course results in many, many bug reports we have to handle where we have to tell the users that their software is just too old.

Also we face opposition when trying to increase dependencies, resulting in distros consider dropping the software or to revert the changes.

Handling upgrades

Another reoccurring issue is handling updates. Our users are constantly running outdated software and not getting the bug fixes. E.g. Ubuntu 16.04 (LTS) ships with KWin 5.5. We currently support KWin 5.8 and 5.10. We constantly get bug reports for such old software and that’s just because it’s outdated in the distribution. Such bug reports only cause work for us and are a frustrating experience for the users. Because the only feedback they get is: “no longer maintained, please update to a current version”. Which is of course not true, because the distro does not even allow the user to upgrade.

Even when upgrading it’s a problem. We have to inform distros about how to handle the upgrade, of course that does not help, it fails nevertheless. We also never know what are allowed upgrade paths. For us it’s simple: 5.8 upgraded to 5.9, upgraded to 5.10, etc. If we would ship upgrades ourselves we could ensure that. But distros might go from 5.8 to 5.10 without upgrade to 5.9 ever. So we need to handle such situations. This was extremely challenging for me with a lock screen change in Plasma 5.10 to ensure the upgrade works correctly.

Random distro failure

An older example from Debian: during the gcc-5 transition KWin got broken and started to crash on startup without any chance to recover. This issue would of course not happen if we would have distributed KWin ourselves with all dependencies.

The experience was rather bad as users (rightfully) complained about the brokeness of KDE. I had friends asking me in person how it could be that we ship such broken software. Of course we were innocent and this was only in the distro. But still we (KDE) get the full blame for the breakage caused by the distro.

Useless bug reports

Thanks to distributions all crash reports we get are useless. The debug packages are missing and that’s pointless. Even if debug packages are installed mostly the crash point is missing. This is especially a problem with Arch as they don’t provide debug packages. In 2017 KWin got 71 bug reports which are marked as NEEDSINFO/BACKTRACE as the reported bug is pointless.

Misinformed maintainers

I don’t know how often I got asked this: should we change Qt to be compiled against OpenGL ES? Every time I was close to a heart attack as this would break all Qt applications for all NVIDIA users (at least a few years ago there was no GLES support in the NVIDIA driver). Luckily in most cases the maintainers asked (why me and not Qt?), but I remember one case where they did not ask and just rolled it out. With the expected result.

Thoughts

I could go on with examples for quite some time. But I think it should be obvious what I want to show: even for well integrated software such as KWin the distributions are not able to package and deliver correctly. As an upstream developer one has to spend quite some time on managing the distributions, so that the software is delivered in an acceptable state to the users.

Distros always claim that they provide a level of quality. Looking at the examples above I have to wonder where it is (granted there are positive exceptions like openSUSE). Distros claim they do a licensing check. That’s useful and very much needed, but is it required that openSUSE, Debian and Arch do the same check? Furthermore I personally doubt that distros can do more than a superficial check, they would never find a case where one copies code which is license incompatible.

Given the experience I have made with distros over the last decade working on KWin I am not surprised that projects are looking for better ways to distribute software. Ways where they can control the software distribution and ensure it works. Ways where their software is not degraded due to distros doing mistakes.

All that said, I do agree with Richard in most points, just don’t think it works. If everybody would use openSUSE than Richard would be right with almost all points. But the reality is that we have a bazillion of distributions doing the same work and repeating the same mistakes. Due to that I think that for software distribution Flatpack & Co. are a very valid approach to improve the overall user experience.

[1] Slackware for a long time did not have Wayland as that would need Weston and Weston depends on PAM which is only the optional fallback for no logind support, which Slackware of course does not have