Server side decorations and Wayland

I heard that GNOME is currently trying to lobby for all applications implementing CSD. One of the arguments seems to be that CSD is a must on Wayland. That’s of course not the case. Nothing in Wayland enforces CSD. Wayland itself is as ignorant about this as X11.

The situation is that GNOME Shell and Weston require CSD, but KDE Plasma and Sway do not. In fact we created a protocol (supported by GTK) that allows to negotiate with the Wayland compositor whether to use CSD or SSD.

I’m the one who drafted that protocol and I get requests about upstreaming it regularly. If I think about it, I got contacted by pretty much every toolkit about it in private – except EFL and GTK. And every time the toolkit developers tell me “this is what we want!” My answer to this is that I’m keeping out of the discussion. I’m burnt from it and are not interested in it any more.

But of course: if toolkit developers think that this is needed, don’t get fooled by GNOME asking you to implement CSD. In the same way you can ask GNOME to support server side decorations. They could do it, because they need it for XWayland anyway.

And I can totally understand you toolkit developers to not want to implement CSD. It’s a lot of work and it’s difficult to not look like an alien in the various desktop environments.

16. January 2018

KWin/X11 is feature frozen

Yesterday the KDE Community released the Beta for Plasma 5.12 LTS. With that release the feature freeze for 5.12 is in place and also an eternal feature freeze for KWin/X11. To quote the release announcement: “5.12 is the last release which sees feature development in KWin on X11. With 5.13 onwards only new features relevant to Wayland are going to be added.” This raised quite some questions, concerns and misunderstandings in the social networks. With this blog post I try to address those question and explain why this change in policy is done.

Is KWin/X11 still maintained?

Yes! We are just in the process of releasing an LTS. Of course KWin is fully maintained in the LTS life time. While in 5.8 only X11 was maintained, now we are able to offer maintenance for both X11 and Wayland. For the maintenance we do not differentiate between windowing systems.

Will X11 bugs still be fixed?

As X11 is under maintenance, I expect bugs to still get fixed.

Does this mean that in 5.13 X11 will be unmaintained?

We are going to forward port bug fixes from the 5.12 branch to master. Thus any release after 5.12 will get all bug fixes from 5.12. Given that I would say 5.13 will also be maintained on X11.

Does this mean that in the next LTS X11 will be unmaintained?

We will decide when the time comes. Currently I do not expect that we would drop maintenance.

Does this mean Plasma 5.13 will default to Wayland?

This is about feature freeze for X11. Whether Wayland will be the default or not is completely unrelated to this.

Will X11 users not get any new features in KWin?

Of course there will be new features! Most functionality in KWin is independent of the windowing system. Any improvement to those areas benefit Wayland and X11 users. Currently we have a few improvements in the pipeline, for example tooltips on decoration buttons, improved blur effect, a rework of slide desktop effect, improvements to the cube effect and a few more. All of them will be available to both X11 and Wayland users.

How do you decide whether it’s an X11 only feature?

In the case of KWin this is very simple. There are areas in our code structure which are only get pulled in if run on X11, other areas are marked in an if (x11) section. If the new feature touches this code, it’s probably not going to be added.

Does this feature freeze also apply to other KDE software?

No, but personally I would recommend any maintainer to apply a similar feature freeze. I personally will not help developing any X11 specific feature any more and resigned as maintainer of various X11 specific frameworks this week.

What are you going to do if someone present a feature for X11?

It won’t be merged.

But why?

This requires a little bit more explanation. I had a look at the most prominent issues we had over the last years. Where are our stability problems, where are our quality problems, what costs most development time? I observed that it was always in new features specific to X11. Very often even for features we added to Wayland without causing problems.

So I started to look into why that’s so. The obvious answer that we don’t get bugs for Wayland because nobody uses is, is not the case. We get many bug reports for Wayland and many users are nowadays running Wayland. So the reason must be somewhere else.

On Wayland we are able to test the code. Let’s take an example: we get input events through libinput. For this we have unit tests through mocking. Thus we can automate the testing of the lowest layer. From libinput events are sent into KWin core through an internal API and that we can also use in the integration tests. So we can simulate everything from the point where libinput events would hit KWin core. We can test this properly, we know that we get all events (because KWin is the display manager) and we can test every aspect. We can lock the screen and verify how it works, we can make applications grab the pointer or keyboard and test this. We can invoke global shortcuts, etc. etc. It’s all just as if we get the events through libinput. The quality of these new areas in KWin feels really good.

A few weeks ago some commits I did hit the Linux media about KWin/Wayland without X11 starting too fast and due to that causing bugs. These issues were found through our test suite, before any user would ever be exposed to them.

What we did in the past was taking these new features and bring them to X11. But there we cannot test. There is no way on X11 to e.g. fake a touch screen. On X11 we cannot test how this behaves if we lock the screen or used Alt+Tab. We can write the code and manually test it. Hey it works, awesome! But that was mostly not the case. There were corner cases which caused trouble. And to this comes that the main devs run Wayland instead of X11. If features break they are not exposed to the bugs.

Could you give some examples of things that broke?

Sure! The first feature we backported to X11 was “modifier only shortcut”. We spent months fixing the fallout because of X11 weirdness. Another feature backported was panels on shared screen edges. It worked great for Wayland, but on X11 some corner cases were overseen and caused issues, which affected our users and required time to fix. We backported the touch screen swipe gesture on X11 which was a mess. On X11 touch screens are also mouse events, that made it difficult and created many corner cases.

The problem is not adding the feature. The problem is fixing the bugs created by the new features.

But if a new foo requires adjustments?

KWin won’t be adjusted to any new requirements in the XServer, Mesa, input stack, proprietary drivers, etc. If something breaks it’s the fault of those components which broke it.

22. December 2017

Ten years of KDE

At the end of the year 2007 I sent my first patch to KWin. At that time 4.0 was about to be released, so that patch didn’t end up in the repo in 2007, but only in beginning of 2008 and was released with 4.1.

Ten years is a long time and we have achieved much in KWin during that time. My personal highlight is of course that KWin is nowadays a Wayland compositor and no longer an X11 window manager. According to git shortlog I added about 5000 commits to KWin and another 500 commits to KWayland.

So up to the next ten years! Merry Christmas and a happy new year 2018.

30. October 201731. October 2017

Plasma/Wayland and NVIDIA – 2017 edition

More than a year ago I elaborated whether KWin should or should not add support for NVIDIA’s proprietary Wayland solution. I think it is time to look at the situation of Plasma/Wayland and NVIDIA again. In case you haven’t read my previous blog post on that topic I recommend to read it as I use it as the base for this blog post.

Compared to a year ago not much has changed: NVIDIA still does not support the standard Linux solution gbm, which is supported by all vendors and nowadays even going to enter the mobile space. E.g. the purism phone is going to have a standard graphics stack with gbm. So no additional code required. But NVIDIA doesn’t support gbm. Instead it has a proprietary implementation called EGLStreams, which no other vendor implements. Due to that Plasma/Wayland cannot support OpenGL for NVIDIA users.

But current KWin master branch (what will become 5.12) supports automatical fallback to Qpainter compositing in case OpenGL fails. With that it should be possible to run Plasma/Wayland on NVIDIA hardware. Granted it won’t be accelerated, but as explained above that is only NVIDIA’S fault. Also I have not tested this yet. It might be that NVIDIA doesn’t support dumb buffers on DRM.

In my blog post from last year I elaborated whether KWin should add support for NVIDIA’S proprietary solution. I put out a requirement that the patches for Weston need to be merged first:

For KWin such patches do not exist and we have no plans to develop such an adaption as long as the patches are not merged into Weston. Even if there would be patches, we would not merge them as long as they are not merged into Weston.

A year passed and nothing happened. I don’t expect anything further will happen: the patches won’t be merged into Weston. Given that it doesn’t make any sense to make this a requirement for KWin. If we hold up the requirement it would be like saying we would never accept it.

Last year I hoped that XDC would find a solution for the problem:

We do hope that at XDC the discussions will have a positive end and that there will be only one implementation.

And it looked positive: a new universal buffer allocation library got announced. I hoped that this would solve the situation pretty quickly. The new library would get implemented, Mesa and NVIDIA add support for it, we slightly adjust our code and drop gbm. Everybody happy.

But now another XDC passed and from what I understand not much process was made. The library is not ready and no driver uses it. We waited one year for it, I doubt waiting another year will change the situation. Realistically I think it will take another two to three years for the library to get implemented if at all. Then it will take about a year for Mesa gaining support and shipping a release. Yet another year for distributions shipping with that version so that we can depend on it without people killing us. So maybe in five years there will be a replacement. I don’t think we can wait that long.

So does this mean KWin will gain support for EGLStreams? Certainly not! I do not think that the KDE community should spend any time to support NVIDIA’s proprietary solution! We are a free software community and we should not implement code which only benefits proprietary non-free solutions. There are way more free things to do to improve Wayland without having to write code for proprietary solutions.

Also there is another aspect which I did not consider in my blog post last year: XWayland. XWayland also uses gbm and does not have support for NVIDIA’s proprietary solution. Due to that one does not have OpenGL for any legacy X application. This probably includes most games, but currently also browsers such as Firefox and Chromium. I don’t think that this would give a satisfying experience to NVIDIA users. And it is also hardly better than rendering everything through the QPainter compositor. This is another reason for the KDE Community to not spend any time on implementing support for EGLStreams.

But I think there is lots NVIDIA could do. Today I would accept a patch for EGLStreams in KWin if NVIDIA provides it. I would not be happy about it, but I would not veto it. If it is well implemented and doesn’t introduce problems for the gbm implementation I would not really have an argument against it. But I expect NVIDIA to do it. I don’t want a contribution from a non-NVIDIA developer. This mess was created by NVIDIA, NVIDIA needs to fix it.

Similar I think that NVIDIA should adjust XWayland. I understand that NVIDIA is not happy with the design of XWayland, but nevertheless they should make it work. Their users pay quite some money for the hardware. I think they have a right to demand from NVIDIA to fix this situation. Ideal would of course be NVIDIA adopting gbm. But as that seems unlikely, I think it is the duty of NVIDIA to provide patches for their users.

29. September 2017

KWin/Wayland goes real time

Today I landed a change in KWin master branch to enable real time scheduling policy for KWin/Wayland. The idea behind this change is to keep the graphical system responsive at all times, no matter what other processes are doing.

So far KWin was in the same scheduling group as most other processes. While Linux’s scheduler is completely fair and should provide a fair amount of time to KWin, it could render the system hard to use. All processes are equal, but some processes are more equal than others. KWin – as the windowing system – is to a certain degree more equal. All input and all rendering events need to go through KWin.

We can now imagine situations where a system can become unusable due to KWin not getting sufficient time slots. E.g. if the user moves the mouse we expect a timely cursor motion. But if KWin is not scheduled the system is quickly starting to lagging. Basically what we expect is that when the mouse moves with the next page flip the cursor must be at the updated position. A page flip happens normally every 16 msec (60 Hz), so if we miss one or two we are in the area where a user visually notices this. For a gamer with a high precision mouse this could be quite problematic already.

Another situation is a few processes running amok and one wants to switch to a virtual terminal to kill the processes. The key combination (e.g. Ctrl+Alt+F1) needs to go through KWin. Again the user wants to have a responsive system in order to be able to switch the session.

While thinking about these general problems I came to the conclusion that we should try making KWin a real time process. This would ensure that KWin gets the CPU whenever it wants to have the CPU. This is actually quite important: KWin is only taking CPU if there is a reason for it, e.g. an input event or a window requesting a repaint, etc. So by adding a real time scheduling policy to KWin we can ensure that all input events and all rendering events are handled in a timely manner. My hope is that this in general results in slightly smoother experience and especially in a smoother experience if the system is under heavy load.

Of course I considered the disadvantages. If KWin is a real time process, it always wins. It’s on the scheduler’s fast lane. This could of course be a problem for other processes actually demanding a higher scheduling policy or which are real time itself. E.g. a game or a video player might already be real time. But here we also see that there is no point in having the game be real time for fast input if the windowing system itself is not real time. So this is actually not a disadvantage, but rather another reason to make KWin a real time process.

There could be other real time processes which are more important. Of course those should win against KWin. To support this KWin requests only the minimum priority, so any other real time process in the system is considered more important than KWin.

The only real “problem” I see would be KWin running amok, because then KWin would get too much CPU time. But in this case the system would be broken anyway. If e.g. KWin would be stuck in an endless loop one would not be able to switch to another VT as KWin is stuck in the endless loop. This is obviously only a theoretical example.

So how is it done? KWin gained a new optional dependency on libcap and on installation sets the capability CAP_SYS_NICE on the kwin_wayland binary. @Distributions: please update your build deps. On startup kwin_wayland adjusts the scheduler policy to request SCHED_FIFO and drops the capability again before doing anything else.

As this change is in master it won’t be part of the Plasma 5.11 release which also means we have now the maximum time span till it ends up in a release. I would be happy about users testing it and especially report any negative influences of the change.

24. September 2017

Announcing the XFree KWin project

Over the last weeks I concentrated my work on KWin on what I call the XFree KWin project. The idea is to be able to start KWin/Wayland without XWayland support. While most of the changes required for it are already in Plasma 5.11, not everything got ready in time, but now everything is under review on phabricator, so it’s a good point in time to talk about this project.

Why an XFree KWin?

You might wonder why we spend time on getting KWin to run without X11 support. After all we need to provide support for XWayland anyway to be able to support legacy applications which are not yet on Wayland. So what’s the advantage if in a normal session one needs XWayland anyway?

One aspect is that it shows that our porting efforts to Wayland are finished. Over the last years I tried to start KWin without XWayland support for a few times just to find areas which are not yet ported. By being able to run KWin without X11 support we know that everything is ported or at least does not hard depend on X11 any more.

Another aspect is obviously Plasma Mobile which does not really require XWayland and also XWayland not making much sense on e.g. the libhybris enabled systems as Xwayland doesn’t have OpenGL there. By not requiring XWayland we can reduce our runtime and memory costs.

Speaking of runtime costs: not requiring X11 means that we don’t have to wait for XWayland during KWin startup. Instead XWayland can be started in parallel. This means KWin and the complete Plasma session starts a little bit faster.

And most important this is an important prerequisite to be able to handle a crashing XWayland. So far when XWayland crashed KWin terminated gracefully as KWin depends on X11. The hope is that when XWayland crashes we can just restart it and keep the session running.

How it was done

The general idea behind getting KWin X11 free is “code that isn’t loaded, cannot interfere”. KWin uses platform plugins (not Qt QPA plugins) for the various platforms KWin can run on. There is also a platform plugin for KWin/X11, so code which is only required in the KWin/X11 case can be moved into the platform plugin. As KWin/Wayland does not load this plugin we are certain that the code will not be loaded and thus cannot interfere.

But how to find code which is only required on KWin/X11? After all KWin’s code base is about 150 kSloc (according to cloc) and that makes it rather difficult. A good help here was our CI system which performs code coverage. KWin’s tests mostly are mostly based on KWin/Wayland so an area which does not have any test coverage is a good candidate for being X11 specific. By looking at these areas it was possible to find patterns which also helped to find more usages. A good help is KWin’s internal X11 API such as displayWidth, displayHeight, rootWindow and connection. The usage of these functions is partially so few that one could just evaluate each usage. As a result of this work the functions displayWidth and displayHeight are not used at all any more.

Plugin based compositors

Another idea was to get our compositors into plugins. Especially the XRender based compositor is not of much use in a Wayland world and thus should not be loaded into the binary. Unfortunately various parts of KWin called directly into the concrete compositor implementations, so to solve this we had to extend the internal API. In Plasma 5.11 the XRender and QPainter compositor are now loaded as plugins, so on Wayland the not-compatible XRender compositor is no longer loaded into memory and on X11 the not-compatible QPainter compositor is no longer loaded into memory. But also on Wayland the QPainter compositor is only loaded into memory if it is going to be used.

The OpenGL compositor is still always loaded in Plasma 5.11, but the change to have it as a plugin is already prepared and will be merged into master soonish. This will bring great advantages to the stability of the system: currently we are not able to define which platform supports which compositor as the initialization code just didn’t support being more flexible. But now with the plugin based approach I’m very confident that we can make this work in a better way.

Outlook

Being able to start and run KWin/Wayland without X11 support is of course only the start. More changes will be required. For example to delay loading XWayland until an application tries to connect to it (c.f. Weston). This would not make much sense in the start of Plasma yet as we still have applications in our startup which require X11 (e.g. ksmserver).

Another area is to try to get KWin compile without X11 support and to move everything required for Xwayland into a plugin. This will be a larger project as it will require to move much code around and to add further abstractions in some areas of KWin. Hint: this could be a nice Google Summer of Code project. As a fast step for especially Plasma Mobile and the Purism Librem phone an ifdef around every X11 code area could be a solution.

But my personal main goal is to be able to handle a crashing XWayland. This will also be an interesting task as the X11 code in KWin also depends on other libraries (e.g. KWindowSystem) which do not expect to outlive the X server. So that will be a challenging task to find all areas where KWin holds X11 data structures and to clean them up properly without crashing due to some cleanup code calling into xcb.

27. August 201731. August 2017

[UPDATE] Warning: NVIDIA driver 384.69 seems to be broken with QtQuick

Just a short warning to KDE Plasma users with NVIDIA drivers. Lately we have seen many crash reports from NVIDIA users who updated to version 384.xx. Affected is at least KWin and KScreenLocker, which means one cannot unlock the session any more. The crash happens in the NVIDIA driver triggered from somewhere in QtQuick, so completely outside of our code. A typical backtrace looks like that:

#0  0x00007fffdedffc02 in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#1  0x00007fffdf0ca9ba in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#2  0x00007fffdef953da in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#3  0x00007fffdefab2d4 in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#4  0x00007fffdef99bce in ?? () from /usr/lib/nvidia-384/libnvidia-glcore.so.384.69
#5  0x00007ffff5f0b681 in QSGBatchRenderer::Renderer::renderMergedBatch(QSGBatchRenderer::Batch const*) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#6  0x00007ffff5f0bf8d in QSGBatchRenderer::Renderer::renderBatches() () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#7  0x00007ffff5f11909 in QSGBatchRenderer::Renderer::render() () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#8  0x00007ffff5f0209f in QSGRenderer::renderScene(QSGBindable const&) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#9  0x00007ffff5f0257b in QSGRenderer::renderScene(unsigned int) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#10 0x00007ffff5f3dd2e in QSGDefaultRenderContext::renderNextFrame(QSGRenderer*, unsigned int) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#11 0x00007ffff5f97b04 in QQuickWindowPrivate::renderSceneGraph(QSize const&) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#12 0x00007ffff5f4715e in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#13 0x00007ffff5f4b8dc in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#14 0x00007ffff4e28989 in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#15 0x00007ffff39966ba in start_thread (arg=0x7fffd4d5a700) at pthread_create.c:333
#16 0x00007ffff47353dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

I would recommend to not update to that driver and if you are affected to downgrade again.

Update

Together with NVIDIA we figured out the root problem causing the crash. The sandbox of the lockscreen introduced in Plasma 5.10 is too restrictive for the NVIDIA driver. We have now worked around the issue by removing parts of the sandbox when running on NVIDIA and released a Plasma 5.10.5.1 release. This should be shipped to distributions shortly. We will look into a way to make the sandbox work again also for NVIDIA users with Plasma 5.11.

13. August 2017

Distribution management – how Upstream ensures Downstream keeps the Quality

I read Emmanuele Bassi’s very interesting blog post about software distribution this week and thought a lot about it. Emmanuele kind of answers to a presentation by Richard Brown (from OpenSUSE fame). While I haven’t seen that presentation, I saw a similar one last year at the openSUSE conference and also talked with Richard about the topic. So I dare to think that I understand Richard’s arguments and even agree with most of them.

Nevertheless I want to share some of my thoughts on the topic from the perspective of KWin maintainership. KWin is more part of the OS stack, so distributing through means like Flatpack are not a valid option IMHO. As KWin is close to the OS, we have a very well integration with distributions. Plasma (which KWin is part of) has dedicated packager groups in most distributions, we have a direct communication channel to the distros, we know our packagers in large parts in person, etc. etc. So from the open source distribution model we are in the best category. We are better positioned than let’s say a new game which needs to be distributed.

So this is the background of this blog post. What I’m now going to do is to describe some of the issues we hit just over the last year with distributing our software through distros. What I will show are many issues which would not have been possible if we distributed it ourselves without a distro in between (for the record: this includes KDE Neon!). To make it quite clear: I do not think that it would be a good idea for KWin to be distributed outside distributions. While thinking about this topic I came up with a phrase to what we are doing: “Distribution management”.

We do lot of work to ensure that distributions don’t destroy our software. We actually manage our distributions.

Latest example: QML runtime dependency

Aleix created a new function in extra-cmake-modules to specify the QML runtime dependencies. Once it was available I immediately went through KWin sources, extracted all QML modules we use and added it to CMake, so that it’s shown when running CMake. Reference: D7273

This is a huge issue in fact. Plasma 5.10 introduced support for virtual keyboard in the lock screen. We had to add special code for handling that this runtime dependency is missing. I sent a mail to our distributions mailing list to remind distros to package and ship this dependency.

Obviously: if we distributed the software by ourselves this would not have been an issue at all. We would just bundle Qt Virtual Keyboard and be done. But as we are distributed through distributions we had to write special handling code for the case it’s missing and send out mails to remind distros to package it.

Incorrect dependencies – too many

Although we specify all required dependencies though CMake our distributions might add random other dependencies. For example KWin on Debian depended on QtWayland (both client and server), although KWin does not need the one or the other. This got fixed after I reported it.

This is of course a huge problem. A common saying about KDE is that it has too many dependencies and that you cannot install software on e.g. GNOME, because it pulls in everything and the kitchen sink. That’s of course true if distros add incorrect additional package dependencies.

Incorrect dependencies – too few

Of course the other side is also possible. One can have too few dependencies. Again the example is Debian, which did not make KWin depend on Breeze resulting in incorrect window decoration being set. That we depend on it at compile time is also following the “distribution management” as distros asked us to add all dependencies through CMake. So we did and made KWin optionally depend on Breeze. Of course such distribution management does not help if distributions don’t check the CMake output.

Also here if we distributed ourselves we would have compiled with all dependencies.

Outdated dependencies

Another very common problem is that distributions do not have the dependencies which we need and want to use. In case of KWin this especially became a problem during the Wayland port. For a very long time we had to keep Wayland compile time optional so that distributions like Slackware [1] could continue to compile KWin. When we turned it into a hard dependency we had to inform distros about it and check with them whether it’s working. Of course just because you inform distros does not mean that it will work out later on.

That we have distros with outdated dependencies is always a sore pain. Our users are never running the software which we are, never getting the same testing. We have to keep workarounds for outdated dependencies (e.g. KWin has a workaround for an XWayland issue fixed in 1.19) and that of course results in many, many bug reports we have to handle where we have to tell the users that their software is just too old.

Also we face opposition when trying to increase dependencies, resulting in distros consider dropping the software or to revert the changes.

Handling upgrades

Another reoccurring issue is handling updates. Our users are constantly running outdated software and not getting the bug fixes. E.g. Ubuntu 16.04 (LTS) ships with KWin 5.5. We currently support KWin 5.8 and 5.10. We constantly get bug reports for such old software and that’s just because it’s outdated in the distribution. Such bug reports only cause work for us and are a frustrating experience for the users. Because the only feedback they get is: “no longer maintained, please update to a current version”. Which is of course not true, because the distro does not even allow the user to upgrade.

Even when upgrading it’s a problem. We have to inform distros about how to handle the upgrade, of course that does not help, it fails nevertheless. We also never know what are allowed upgrade paths. For us it’s simple: 5.8 upgraded to 5.9, upgraded to 5.10, etc. If we would ship upgrades ourselves we could ensure that. But distros might go from 5.8 to 5.10 without upgrade to 5.9 ever. So we need to handle such situations. This was extremely challenging for me with a lock screen change in Plasma 5.10 to ensure the upgrade works correctly.

Random distro failure

An older example from Debian: during the gcc-5 transition KWin got broken and started to crash on startup without any chance to recover. This issue would of course not happen if we would have distributed KWin ourselves with all dependencies.

The experience was rather bad as users (rightfully) complained about the brokeness of KDE. I had friends asking me in person how it could be that we ship such broken software. Of course we were innocent and this was only in the distro. But still we (KDE) get the full blame for the breakage caused by the distro.

Useless bug reports

Thanks to distributions all crash reports we get are useless. The debug packages are missing and that’s pointless. Even if debug packages are installed mostly the crash point is missing. This is especially a problem with Arch as they don’t provide debug packages. In 2017 KWin got 71 bug reports which are marked as NEEDSINFO/BACKTRACE as the reported bug is pointless.

Misinformed maintainers

I don’t know how often I got asked this: should we change Qt to be compiled against OpenGL ES? Every time I was close to a heart attack as this would break all Qt applications for all NVIDIA users (at least a few years ago there was no GLES support in the NVIDIA driver). Luckily in most cases the maintainers asked (why me and not Qt?), but I remember one case where they did not ask and just rolled it out. With the expected result.

Thoughts

I could go on with examples for quite some time. But I think it should be obvious what I want to show: even for well integrated software such as KWin the distributions are not able to package and deliver correctly. As an upstream developer one has to spend quite some time on managing the distributions, so that the software is delivered in an acceptable state to the users.

Distros always claim that they provide a level of quality. Looking at the examples above I have to wonder where it is (granted there are positive exceptions like openSUSE). Distros claim they do a licensing check. That’s useful and very much needed, but is it required that openSUSE, Debian and Arch do the same check? Furthermore I personally doubt that distros can do more than a superficial check, they would never find a case where one copies code which is license incompatible.

Given the experience I have made with distros over the last decade working on KWin I am not surprised that projects are looking for better ways to distribute software. Ways where they can control the software distribution and ensure it works. Ways where their software is not degraded due to distros doing mistakes.

All that said, I do agree with Richard in most points, just don’t think it works. If everybody would use openSUSE than Richard would be right with almost all points. But the reality is that we have a bazillion of distributions doing the same work and repeating the same mistakes. Due to that I think that for software distribution Flatpack & Co. are a very valid approach to improve the overall user experience.

[1] Slackware for a long time did not have Wayland as that would need Weston and Weston depends on PAM which is only the optional fallback for no logind support, which Slackware of course does not have

17. July 2017

KWin requires C++14

This is a short public service announcement: KWin master as of today requires a compiler which supports C++14. This means at least gcc 5 or clang 3.4. All major distributions support at least one of the two.

15. July 2017

Plasma Wayland and Qt 5.9 and beyond

As you might know Qt 5.8 created challenging problems for our Wayland session and threw our efforts back quite a bit. In this post I want to discuss the actual problems it created, how we are addressing them and looking into the future.

How our integration used to work

Our integration uses additional Wayland protocols. We have a protocol for server side window decorations which we use in our Plasma integration plugin to inform KWin whether the window should have a decoration or not. We have a protocol for client provided shadows which is e.g. used by our widget style Breeze to add shadows to the context menus. We have a protocol for the desktop shell, so that it can mark windows as desktop, panel, auto hiding panel, position the window, etc. Also we have a few protocols for interacting with our effect system, e.g. sliding popups, blur behind.

To use these protocols we need to interact with Qt in a low level way. We use the native interface in the Qt Platform Abstraction to get a wl_surface pointer for the QWindow. In order to not have to keep this simple for our applications our KWayland::Client API provides an API point for it: Surface::fromWindow(QWindow*) -> Surface*.

But when exactly to inject our own integration? We found a very handy way which worked much better than what we had used in the past for X11 (and based on that also transitioned X11 code to use it). Qt emits an event once it has created the native platform surface (in case of Wayland the wl_surface*) for a QWindow. Verbatim quote of the documentation:

The QPlatformSurfaceEvent class is used to notify about native platform surface events.

Platform window events are synchronously sent to windows and offscreen surfaces when their underlying native surfaces are created or are about to be destroyed.

Applications can respond to these events to know when the underlying platform surface exists.

Awesome! We get an event when the surface is created and when it gets destroyed. This made it very simple to create the integration and what is really important for us is that we get this event before the window is shown. So we can prepare everything so that KWin gets a good state.

And the KWin side was to a large part implemented on assumptions on how the sequence will work. We first get the surface, then the (xdg) shell surface, then the integration bits. Sure it would be nice if KWin handled also other sequences, but as the only implementation of this is Qt it doesn’t make sense to really care about it. We know that it was not perfect, we even had the test cases for it, which expect failed.

What broke with Qt 5.8?

In Qt 5.8 our complete integration broke. When the platform surface created event was emitted the wl_surface was not created, see QTBUG 58423. This is in my humble opinion a clear violation of the documented behavior and thus a breakage of the stability guarantees Qt provides, but others might disagree. After some discussion, trying out patches by those who had a Qt 5.8 build we had a patch for Qt which made things mostly work together with a patch to KWin. But at the time we had the patch ready Qt 5.8 was already declared end of life with no prospect of a Qt 5.8.1 bug fix release. For our Wayland session it was just impossible to get Qt 5.8 compatible again. All we could do was to advise distributions to not combine Qt 5.8 with the Wayland session. From our side it was fine for distros to ship Qt 5.8, but if they do they should make it impossible to install the Wayland session. The state was just too broken.

Qt 5.9

With Qt 5.9 the situation looks better. The required patch is merged, it’s an LTS release and we already had the first bug fix release. Qt wants to create more bug fix releases for it and this allows us to use it as a new target for integration. But still the situation is not as good as it used to be. If you currently use our Wayland session with Qt 5.9 you will still see quite some rough corners compared to where we were with Qt 5.7.

The main problem for us is that the platform surface regression was not the only change affecting us. Pre Qt 5.7 a wl_surface lived as long as a QWindow. Now the wl_surface gets destroyed whenever the window gets hidden and a new one created on every show. Unfortunately without a platform surface created event. This means our integration breaks as soon as a window gets hidden. E.g. after closing KRunner the integration for KRunner is broken.

We tried to address this problem in various places, but it is challenging. On the show event we don’t have the wl_surface yet (too early), on the expose event the window is already mapped (too late). This creates problems for KWin which is not prepared for the integration bits to hit when the window is mapped. Our protocols were designed with the platform surface created event semantics. For example in KRunner we face an issue with the integration. KRunner is a panel, which accepts focus and allows windows to go below, also it positions manually. Now when it gets re-shown KWin doesn’t know that this window is supposed to position itself and positions it. Now we get the request to put it as panel, KWin adjusts and moves maximized windows around. And then we get the request that the panel allows windows to go below. KWin shuffles the windows again as the maximized area changed again.

This is a rather tricky situation as we cannot really do something about it. If the window is mapped it’s too late. Even if we improve our API to handle the situation better it will be too late.

There are two possibilities to handle this: Qt stops to destroy the surface or sends a platform surface created event when it recreates the surface. The latter would be my personally preferred solution as this would match the documentation again and allow us to just use the one event handler.

Other regression

The situation around the changed behavior in Qt 5.8 caused a few steps back. Our code needed to be adjusted and that sometimes caused issues. We had a few regressions which also affected the compositor, so the stability of the whole system suffered. These issues are luckily investigated and fixed. But there are still bugs lurking in the system.

For me personally the most annoying bug is a crash in Qt which affects the auto completion of kdevelop. This makes hacking in a Wayland session rather difficult. I’m running currently a patched KWin which disables the virtual keyboard integration to not hit this issue.

A huge problem is that context menus are not marked as transient windows. This means that the Wayland compositor does not know that it is a menu and positions the menu anywhere on the screen. It gives the system a very unfinished touch.

If KRunner is closed through the escape key, the key starts to repeat on the window constantly and due to that it is not possible to open KRunner again. Similar if you start an application in Kickoff through the enter key, when opening Kickoff again it automatically launches the currently selected item. This again makes it very difficult to use the session and gives the whole system an unfinished look. We are working on a workaround for this issue in the server.

Towards the future

Qt 5.9 is here to stay and that’s what we have to use as integration target. Given that Qt 5.9 and Qt 5.7 behave very differently it will become difficult for us to maintain support for both. My suggestion is that we drop support for Qt 5.7 and require Qt 5.9 for the Wayland session. In addition there is hope that we can improve the integration. Marco and David have been working on adding support for XDG Shell unstable v6 which is already supported by Qt and makes it easier to integrate with. Once this landed in KWin Qt will be switched from wl_shell to xdg shell. This will improve the situation for us quite a bit as we then have one code path for both Qt and GTK applications.

How to prevent such regressions in the future

The change of behavior in Qt 5.8 threw our Wayland efforts back a few months. This is something we communicated to Qt quite early and it’s something which worries me a lot. We cannot spend time on changing our integration every time Qt releases a new version. Given that we need to look into how to prevent that such a situation happens again.

I hope that we can improve the integration on the testing front between Qt and KDE. We have a huge test suite which can find regressions in Qt. If Qt would run KDE’s tests during the integration phase Qt would notice regressions before they hit the code base. Given that all our tests are free software it should be possible for Qt to integrate them.

But also the other side would be interesting: if we could get the latest Qt into our CI system we could also discover breakages early. We have now a new docker based CI system which allows running multiple builds of the same change (e.g. Plasma gets build on openSUSE tumbleweed and on FreeBSD) – an image with a daily or weekly Qt snapshot could help us and Qt a lot to detect breakages early.

I also hope in openQA which allows to test the full operating system. This would spot regressions like the misplaced context menus even if KWin’s own auto tests would not spot them (KWin doesn’t care about Qt there, only about the Wayland and X11 protocols). There we might need to invest some work to make sure that KWin/Wayland can be properly run in the openQA tests.

I hope that our Plasma devs can discuss this in more detail with Qt devs during Akademy in person. Unfortunately I cannot be at Akademy this year 🙁 so I cannot discuss in person.

Last but not least it is important that developers test. It would help a lot if the developers working on QtWayland test their changes in a running Plasma Wayland session. We are now overall in a state that the session is suited for hacking on. I do all my Wayland hacking in a Wayland session, experiencing all the glitches like kdevelop crashing.

Of course you might wonder what about us? Shouldn’t we KDE devs also test against the latest Qt? For me personally that is not the case. I’m working on the server side and not on the client side. I’m also not testing the latest GTK for example. Nevertheless I tried to use Qt 5.9 before it got released. Used the installer, spent a day to compile everything on top of it it, just to notice that it doesn’t have QtWayland and won’t get it. I didn’t give up that easily. So I tried to compile QtWayland myself. But when I tried to use it, it turned out to not have any keyboard support, because qtbase was compiled without support for xbkcommon. At that point I gave up. Not having QtWayland is one thing, but not being able to use keyboard is another, it’s rather pointless. The only other option is to compile Qt, but that is hardly an option as it’s really difficult to compile an actually working Qt with all components. The last times I tried, I failed, wasting days compiling. If there were usable weekly images for Qt I would be happy to try it, but of course only with a properly compiled and included QtWayland.