Last week we got quite some criticism about the quality of KDE Plasma 5 on the Internet. This came rather surprising for us and is at least in my opinion highly undeserved. So far what we saw is that Plasma has a high quality – probably better than previous iterations of what was known as the KDE Desktop Environment – and got lots of praise for the state it is in. So how come that there is such a discrepancy between what we see and what our users see?
The chain breaks at the weakest link
Plasma is just one piece of the user experience our users get. It’s the most visible one and it’s also the one which gets all the blame if things break. But in reality there is much more to it. Plasma depends on other libraries – most importantly Qt – and on drivers (OpenGL/Mesa) and on hardware. All of that is put together by distributions. We don’t ship the software to our users, we ship it to the distributions which then integrate it with the rest of the stack. All we can do is give recommendations to our distributions about common issues and how to prevent that they happen. Now I don’t want to move the blame to the distributions, because that would also be undeserved. I just want to explain how complex the process is.
Some numbers
So let’s start with a look at some numbers. I’m looking at a combination of the applications plasmashell, kwin, krunner and systemsettings. These are the most visible to the users. The numbers will be slightly distorted, because there were also bugs going in for the old 4.11 series and only plasmashell is a 5 only product.
Over the last 365 days 1313 crash reports were reported for these products. That’s quite a number, but means it’s only about 3.5 per day. Given our large user bases that’s not that bad. If we had a huge quality problem we would get hundreds of bug reports per day. And yes we do, I had been in such situations in KWin that we got two digits numbers of duplicates per day.
Of those 1313 crash reports 127 are still unconfirmed (we need to improve here!) and 23 confirmed. The number looks high, but now we need to look at how they distribute. There are a few bugs with a high number of duplicates. Given my limited search skills I found one with 91 duplicates which got a fix released six months ago, but we still see new duplicates for it. I come back to how that happens later. A few bugs reported several times and we get close to the numbers. I just summed up the number of the ten most often reported crashers and that’s already about half of the reported crashes.
So why are there crashes reported so often and are not immediately fixed? This is exactly the weakest link I have been talking about. It’s things we don’t have an influence on.
Graphics drivers
One of the most severe issues we hit with Plasma 5 is an issue with the Intel graphics drivers. I think that this problem is the reason why people complain about instability of Plasma. The only reason. Crashes in drivers (especially Intel) are nothing new, it’s a problem which has haunted us for years. For example the most often reported crash against KWin is a crash in the Intel driver – we worked around that issue by disabling a feature for all Intel users and will never ever enable it again.
Now I know that this sounds like passing the blame to the graphics drivers and I can imagine that you say that we need to QA test. That is true, we need to QA test, also against the drivers. Just that’s not possible. I blogged about why that’s not possible back in 2011. Nothing has changed except that the number of possible hardware and driver combinations further increased.
The biggest problem for us is that the drivers a distro will ship doesn’t exist yet when we develop and test our software. Even if we would use development drivers it would be too old compared to what distros ship.
One could say it’s the responsibility of the distribution to do the QA. Yes, they as a software integrator should make sure that the software works together. And they do, but lots of it runs in VMs which don’t provide the “real” hardware. But expecting distributions to do all the combinations we cannot test, is also wrong. If there is one which can do the quality assurance of e.g. the Intel driver it’s Intel. They have the hardware, they have the developers (AFAIK there are more fulltime devs working on the Intel drivers than we have devs on Plasma), they have the QA team. I do not know how they test, but I am sure they could spot such severe driver regressions.
What to do when we hit such a severe driver issue? Well that’s difficult. If possible we can workaround like the KWin problem I mentioned above. But that still takes time till it reaches the users. In this particular case we informed distributions about a possible workaround on Xserver level and I hope ll distributions applied it. If not please complain to your distribution.
Multi Screen woes
Another source of reported instability is multi screen handling on X11. Again the problem does not lie on our side but in this case Qt. First a word of clarification: auto-testing multi-screen on X11 is extremely difficult. Virtual X servers like Xvfb do not support the randr extension which makes it impossible to mock a correct behavior in X. Also testing with real X doesn’t help as that can only test with the screens one has and physically plugging out a screen during an auto-test isn’t really a solution.
So this sounds like blaming Qt, which of course is not what I want to do. You can rightly question why we as KDE do not help Qt with it. Well we did. Especially Dan Vratil did an incredible job of improving the experience directly in Qt. There were things which one wouldn’t believe are possible. For example there is an xrandr call to read the configuration and that blocks the Xserver in the Intel driver. Now each Qt application did that and caused a freeze. When I figured out the root source for this problem I created a remote denial of service proof of concept against X server and informed the security list about this problem. I did this in January and haven’t got a reply till now and have not talked in public about this. So yeah I just did a full-disclosure. Anyway Dan worked around this problem and fixed many more.
So why is this all still an issue? This is slightly related to how Qt and Plasma releases are not synced. For example at the moment Qt 5.5 will not receive any bug fix releases any more, but all we have in distributions is Qt 5.4. This means any fixes we do will not reach users till distributions roll out Qt 5.6. It’s rather depressive to be honest. You know you fixed an issue, but it doesn’t reach your users. I think this needs works from both Qt and the distributions. We need more bug fix releases and distributions must update Qt more often. I hope that the long term release Qt 5.6 will improve the situation as that gives us also a chance to provide bug fixes and hopefully reach our users.
As it stands there seem to still be crash cases in Qt 5.6. Since recently I have a nice tool which should allow to mock multi screen setup and I want to try to dedicate some time this week to create test cases. If that works I hope we can move this forward.
Issues fixed months ago
Another problem we noticed is that fixes we created doesn’t reach our users. This is mostly to how some distributions work with the exception of rolling release distributions. They create a “stable” and “feature frozen” product. If a major version number increases it’s not going to be updated. I just described the problem with outdated Qt, that’s part of the story. The update didn’t get into the distribution and thus fixes don’t reach our users.
Even more with frameworks we don’t provide bug fix releases and that creates “problems” for distributions. They don’t roll our the new frameworks, although they fix important issues. This is slowly improving, distributions need to get used to this process and also accept that their policy doesn’t apply.
Furthermore one needs to point out that some distributions do have additional repositories to get newer software. If you use e.g. Kubuntu 15.04 I highly recommend to use that. Every non LTS release from Ubuntu is basically a testing distribution. If you use that you decided to go with faster software updates. Please use the ways to get those updates. If you don’t want that please stay with LTS.
This is something which applies to pretty much all distributions. Stay with the long term releases if you don’t want to update to newer software.
Removed features
What we also heard a lot lately are complains about that we removed features. Yes we did that, we streamlined some implementations, we decided to focus on the core, we decided to not port all X11 specific code, but allow 3rd party applications to fill that niche (e.g. SDDM or LightDM instead of KDM). Nevertheless let’s look at some of the complains.
Legacy systray
The biggest problem for our users seems to be the removal of support of the legacy system tray (xembed). This was not a move against our users, but a change we did not expect to cause so many problems. We did evaluate the situation prior to the release and saw that it was possible to do this step without loss of functionality. Nevertheless it created problems. How was that possible?
A key feature to the switch was getting the distributions in. Our distributions had to patch software in the same way as Ubuntu did. So we collected the required changes and informed distributions about it. The patches to Qt 4, which flags to enable in which repositories and so on.
With some distributions that worked awesomely, with some we had problems. In one distributions the GNOME maintainers refused to enable the required flags (boo), one distribution decided to not enable the Qt 4 patch because that’s not what they do, but they already had patches for arm64 (WTF? Support for a theoretical architecture is more important than users on existing hardware? Wtf, wtf!) If your distribution did not do the transition in a sufficient way, please complain to them and not to us.
Now there were things which got overlooked. Skype needed proper multi-arch and the package installation did not always work. Distros were informed about this problem once we noticed. Some proprietary applications used static linking destroying the “magic”. That’s unfortunate, but also not our fault. And also in the area of proprietary applications: wine applications have no fallback. Sorry about that, but we just were not aware of that and we didn’t get a feedback quickly enough.
We noticed that this is a problem and David started to work on a solution for Plasma 5.5. I’m not happy that we had to do that as it just delays solving the actual problem: we need to get rid of the old systray. I’m already seeing the bug reports about unusable systray icons on HighDPI or not working on Wayland. There is also one solution: work together to get this transition done. Bug your distributions to include all patches (if not done yet), bug your favorite projects to port. It’s about time!
Applets per virtual desktop and dashboard
Some of the feedback we got is that users are unhappy that they cannot have different applets and wallpaper per virtual desktop and also on dashboard. Yes we understand that this is seen as a regression, so let’s look at why we changed.
First of all: change is sometimes required also removing features is sometimes required. We are sorry for the users affected by such changes, but we normally don’t do that without good reason. We are extremely careful as you can see in the systray case where we coordinated with distributions months prior to the release to make the transition smooth.
For the features here in question we need to look back at the last iteration of Plasma. Some of the things we learned was that these features just didn’t really work, they were buggy and there were many conditions were it just didn’t really work. An example is KWin which just never really knew when Plasma is in dashboard mode, we failed to properly set up the state and it just resulted in lots of quirks. So when we looked at it for Plasma 5 we realized that it’s extremely close to another mode KWin could handle well: Show Desktop. So we merged those two and improved the Show Desktop experience at the same time.
Multiple virtual desktops in Plasma had always been a quirk. I remember one discussion shortly after starting to contribute to Plasma: we were not able to properly support this feature in Desktop Grid or Desktop Cube. Technically that was just impossible how Plasma worked. We were not able to solve this problem in all of Plasma 4 and we would not be able to solve it in Plasma 5. In addition it also caused problems with various other features, so overall it looked like a better idea to disable this feature in core Plasma. This goes along with the fact that we consider Virtual Desktops as a high productivity and experienced user feature. By default they are disabled.
Now we understand that this is a feature important to some users. But we need to ask you to understand that we cannot maintain all possible features for all possible users and that sometimes the quality of the complete product is more important. Furthermore Plasma is a flexible product and it should be possible to provide this as a 3rd party feature.
Not ported featured
There are a few more features which just didn’t make it to the new release. In many cases that’s because of the features do not have a maintainer. Examples for this are KHotkeys and the application menu. In both cases the port was not trivial due to changes in the underlying infrastructure, so we couldn’t manage it with our limited resources.
But that’s of course a chance for you, dear reader. If you are a developer and loved some of the features we were not able to provide, you can step up and maintain them. We are always looking for more help!
What was important to us is to not overload ourselves with more code than we can maintain. We want to provide a high quality product which we can ensure that it is high quality. This required to cut down in some areas. I think that was the correct decision and I think we reached our goal in providing a foundation for a high quality product.
Improved Workflows
Overall with our new product we put a focus on quality and adjusted our workflows to ensure that we have a high quality. As already mentioned we decided to focus on the core and by that focus only on what we can ensure to maintain. That’s just one aspect to it. Another aspect is the increased usage of automatic testing thanks to continuous integration done directly by KDE and also by our distributions. A big thanks to openSUSE for openQA and Kubuntu for the Kubuntu CI! Those two instances have caught many issues which slipped through our CI.
Additionally we release in shorter cycles (frameworks each month, Plasma every three months). This is only possible by ensuring quality through higher code review, requirements for auto tests, etc. A good overview can be found in sebas’s blog post on the topic.