Comparing DPMS on X11 and Wayland

In the Plasma workspaces, Display Power Management Signaling (DPMS) is handled by the power management daemon (powerdevil). After a configurable idle time it signals the X-Server through the X11 DPMS extension. The X-Server handles the timeout, the switch to a different DPMS level and the restore to the enabled state after an input event all by itself.

The X11 extension is a one-way channel: one can tell the X-Server to enter a DPMS state, but it notifies us neither when it has done so nor when it has restored the previous state. This means powerdevil doesn’t know when the screens are off. Granted, there is a call to query the current state, but actively polling it to figure out whether the state changed is not an option – that would be rather bad from a power management perspective.

So overall it’s kind of a black box. Our power management doesn’t know when the screens turn off, and neither does any other application, including KWin. For KWin it would be rather important to know: it’s the compositor, and there is no point in running the compositor to render something which will never be visible. The same is obviously true for any other application: it will happily continue to render updated states although they won’t be visible. For a power management feature this is rather disappointing.

Now let’s switch to Wayland – in this case kwin_wayland. Last week I started to implement support for DPMS in KWin. Given that KWin is responsible for the DRM device, support for DPMS needs to live in KWin as well. Of course we want to leverage the existing code in powerdevil, so KWin doesn’t gain the actual power management functionality, but rather an easy-to-use interface to tell it to enter a DPMS state on a given output.

Of course we tried not to repeat the mistake from X11, and thus the interface is bi-directional: it notifies when a given state is entered. In a Wayland world that’s not even so important any more: KWin enables DPMS, knows that DPMS is enabled and can make internal use of this. So whenever the outputs are put into standby the compositor loop is stopped and doesn’t try to repaint any more. If the compositor is stopped, windows are not repainted and are not signaled that the last frame got painted. So all applications properly supporting frame callbacks automatically stop rendering, too. This means that on Wayland we are able to properly put the rendering of all applications into an off state when the connected outputs are in standby mode – a huge improvement for power management.
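The chain from DPMS through the compositor loop to the clients’ frame callbacks can be illustrated with a toy model (plain Python, nothing KWin-specific; all names are made up for this sketch):

```python
class Client:
    """A well-behaved client: it only renders again after the
    compositor's frame callback signals that the last frame was done."""
    def __init__(self):
        self.frames_rendered = 0
        self.waiting_for_callback = False

    def render(self, compositor):
        if self.waiting_for_callback:
            return  # frame callback hasn't fired yet -> don't render
        self.frames_rendered += 1
        self.waiting_for_callback = True
        compositor.schedule_repaint()


class Compositor:
    def __init__(self, clients):
        self.clients = clients
        self.dpms_on = True
        self.repaint_scheduled = False

    def schedule_repaint(self):
        self.repaint_scheduled = True

    def tick(self):
        """One iteration of the compositor loop."""
        if not self.dpms_on:
            return  # outputs in standby: loop stopped, no callbacks fire
        if self.repaint_scheduled:
            self.repaint_scheduled = False
            for c in self.clients:
                c.waiting_for_callback = False  # fire the frame callback


clients = [Client()]
comp = Compositor(clients)
for _ in range(3):            # screens on: clients keep rendering
    for c in clients:
        c.render(comp)
    comp.tick()

comp.dpms_on = False          # outputs enter standby
for _ in range(3):            # callbacks stop, so rendering stops too
    for c in clients:
        c.render(comp)
    comp.tick()

print(clients[0].frames_rendered)  # 4: three frames on, one in flight
```

Once DPMS is entered, the client gets to submit at most one more frame (the one already in flight) and then blocks on the missing callback – no polling, no timers, rendering simply stops everywhere.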

Layered compositing

On X11 the (OpenGL) compositor renders into a single buffer through the overlay window. This is needed to get features like translucency, shadows, wobbly windows or a desktop cube, as Xorg itself doesn’t have any support for such features. The disadvantage of this approach is that we basically always have to perform a “copy” of what needs to be rendered. Consider VLC playing a fullscreen video: the compositor needs to take VLC’s video pixmap and render it onto the overlay window. The compositor needs to run, evaluate the scene and then render that one window.

Of course compositors optimize for this case by only repainting what is needed and by painting the window stack from top to bottom, so that only the one fully opaque window gets rendered. In addition they have features like unredirecting fullscreen windows or (in the case of KWin) blocking compositing while specific windows are active. Unredirecting is a nice idea, but it doesn’t really cover the use case: if one has just a small window (a tooltip, a menu, the on-screen volume display) on top of the VLC video window, we need to start compositing again – although we would still like to just forward the one VLC window.

On Wayland the situation looks better. Instead of rendering to an overlay window the compositor can interact directly with the hardware, e.g. through DRM. This allows us, for example, to take the buffer provided by VLC and pass it directly to DRM, bypassing compositing altogether. But modern hardware allows us to do more than Xorg ever did, through “layers”. Be it the Raspberry Pi, Android’s hwcomposer or Linux’s KMS planes, the idea is always similar: let the hardware compose the final image.

If we think of the Plasma phone we realize that in most cases we just need three layers: the top panel, the bottom panel and a maximized window in between. There is no need for complex composition of the scene at all, as all that is needed is putting the three visible (and opaque) windows into the three layers.
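The assignment for this simple case can be sketched in a few lines. This is a deliberately simplified toy rule, not KWin code – real planes have further constraints (formats, scaling, z-order), and the function and field names here are invented:

```python
def assign_layers(windows, num_planes):
    """Try to give each visible window its own hardware plane.
    Returns a window-name -> plane-index mapping, or None when we
    must fall back to full (GPU) composition of the scene."""
    if len(windows) > num_planes:
        return None  # more windows than planes: compose
    if not all(w["opaque"] for w in windows):
        return None  # translucency needs real blending: compose
    return {w["name"]: plane for plane, w in enumerate(windows)}


# The common phone scene: three opaque windows, three planes.
phone_scene = [
    {"name": "top-panel", "opaque": True},
    {"name": "app", "opaque": True},
    {"name": "bottom-panel", "opaque": True},
]
print(assign_layers(phone_scene, 3))
# -> {'top-panel': 0, 'app': 1, 'bottom-panel': 2}
```

As soon as a fourth window or a translucent one appears, the function returns None and the compositor paints the scene itself – the fallback always stays available.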

On a desktop system it’s a little more complicated. Our windows are not all maximized, they have translucent window decorations and shadows, and there is a panel with a blurred background. The latter in particular requires composition through OpenGL. Nevertheless there should be situations where we can make use of layers; especially in the case of a fullscreen or maximized window this should help make the compositor faster.

Using layers will mean a large change in how KWin’s compositor works. The infrastructure for unredirecting fullscreen windows cannot be reused in a sensible way, as it is evaluated only for fullscreen windows and before the compositor runs – that is, before the effects are executed. The decision whether to put a window into a layer needs to come at a later point: once we have evaluated whether we need to compose the complete screen (you might be wobbling a window) or can take a shortcut through the layers.
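The timing of that decision is the important part, and it can be illustrated with another toy sketch (invented names, not the actual KWin API): the per-frame choice has to be made after the effects have run, because an active effect invalidates any shortcut.

```python
def composition_strategy(stack, effects_active, have_free_plane):
    """Per-frame decision: full composition, or direct scan-out of one
    window buffer through a hardware layer. This must run AFTER the
    effects for the frame have been evaluated: an active effect (e.g.
    wobbly windows) transforms window geometry, and then only full
    composition produces the correct image."""
    if effects_active:
        return "compose"
    if have_free_plane and len(stack) == 1:
        w = stack[0]
        if w["fullscreen"] and w["opaque"]:
            return "scan-out"  # shortcut: buffer goes straight to a layer
    return "compose"


video = {"fullscreen": True, "opaque": True}
print(composition_strategy([video], effects_active=False, have_free_plane=True))
# -> scan-out
print(composition_strategy([video], effects_active=True, have_free_plane=True))
# -> compose (the window might be wobbling right now)
```

Contrast this with X11’s unredirection, which was decided once, up front, from the window’s fullscreen state alone – here the same window can be scanned out in one frame and composed in the next.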

What I have in mind is to give this architecture a try with the VC4 hardware of the Raspberry Pi, which supports layers in a decent way, and to first adjust the QPainter-based compositor (as it’s the simpler one) to make use of this architecture. The idea is to experiment with it to arrive at a useful generic KWin-internal backend API, so that we can also use it for the OpenGL compositor and, even more, for our DRM and hwcomposer backends.

For the DRM backend I want to use the new atomic mode-setting feature which will be introduced in Linux 4.2. Given that we cannot depend on it anytime soon, it will still take some time until this is implemented. Overall, none of this is going to happen tomorrow – there are still more important things to work on. But if you, dear reader, are interested in working on this, please contact me. I’m happy to walk you through the code and share my ideas.

As this blog post is mostly about optimizing the compositor by making use of new hardware features which we didn’t have on X11, I can already point out that I will do a follow-up blog post on Vulkan.

Now to something else!

KDE is currently running a fundraiser for the Randa Meetings with the topic “Bring Touch to KDE!”. I personally will not participate in the Randa Meetings, but many other KDE developers will go there and we need your support for it. Please help and make the Randa Meetings possible.

Fundraiser

Turning the world upside down

This blog post is a rather important one for me. Not only is it the first blog post that I’m writing in a nearby café, sitting in the sun and enjoying spring; it is also the first blog post written in a Plasma session running inside kwin_wayland.

This marks an important step in the process of getting Plasma and KWin ready for Wayland, as we have reached a state where I can dogfood kwin_wayland on one of my systems. It means that I’m confident enough to write a blog post without fearing that the session crashes every few seconds (granted, I still press save a lot).

So what do I mean when saying that I’m running a kwin_wayland session? Does it mean that everything is already using Wayland? No, unfortunately not – rather the opposite: all running applications are still using X11, but through a rootless Xwayland server. The only Wayland clients in this setup are Xwayland and KWin itself.

Nevertheless it’s a noteworthy achievement, as we have now turned the architecture upside down. No longer does KWin connect to either an X-Server or a Wayland server which does the rendering to the screen; it now does it all by itself. Over the last week I worked on the DRM (Direct Rendering Manager) backend for kwin_wayland. The DRM backend will become the primary backend for KWin: doing mode setting, creating buffers for rendering, performing page flips of those buffers on the connected outputs and rendering the mouse cursor on the outputs. In its current state it is able to find all connected outputs and perform mode setting on them. In addition it can power both compositors, OpenGL and QPainter. The QPainter one is rather trivial: all we need are two buffers which we map into a QImage to render on. With each rendered frame the buffers are swapped and the last rendering buffer gets presented.
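The two-buffer scheme is simple enough to model in a few lines. This is only a toy illustration of the swap logic (the real backend maps DRM buffers into QImages; the class and its tiny four-byte “framebuffers” are invented for the sketch):

```python
class DoubleBuffer:
    """Render into the back buffer, then swap and present the freshly
    rendered one, so a frame is never shown while it is being drawn."""
    def __init__(self):
        self.buffers = [bytearray(4), bytearray(4)]  # two tiny framebuffers
        self.back = 0  # index of the buffer we currently render into

    def render(self, pixels):
        self.buffers[self.back][:] = pixels

    def swap_and_present(self):
        presented = self.back
        self.back = 1 - self.back  # next frame renders into the other buffer
        return self.buffers[presented]


db = DoubleBuffer()
db.render(b"\x01\x01\x01\x01")
frame1 = db.swap_and_present()   # frame 1 goes on screen
db.render(b"\x02\x02\x02\x02")   # frame 2 is drawn into the other buffer
frame2 = db.swap_and_present()
print(bytes(frame1), bytes(frame2))
```

Note that drawing frame 2 never touches the buffer that is still being presented – that is the whole point of keeping two of them.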

The OpenGL compositor is slightly more difficult. Here we need to introduce another component: GBM (Generic Buffer Management). GBM allows us to create an EGLDisplay for our DRM device, create an EGLSurface for each of the outputs and get the buffer from the surface to pass to DRM for presenting. Unfortunately GBM is currently only available in Mesa, and given NVIDIA’s presentation at last year’s XDC it looks like NVIDIA will come up with a different solution. My hope was that this would have settled by the time we start implementing, but unfortunately that’s not the case yet. So for the foreseeable future it looks like we will first have a Mesa-specific backend, can later on add an NVIDIA-specific backend, and hope that at some future point in time Mesa also supports NVIDIA’s solution (after all it was presented as a generic solution) so that we can drop the GBM backend. It’s not a situation I like, but at the moment we can only ignore the proprietary drivers.

Another noteworthy detail is how KWin opens the DRM device. KWin runs as a normal non-privileged user and is thus not allowed to open the device. Just like the libinput integration, the DRM backend uses logind to open the device file. Given that we already have support for that in KWin, it was a straightforward, elegant and secure way to implement it. I know that not everybody will agree with this solution, but of course it’s just a DBus interface and anybody can implement it.
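The logind interface in question is org.freedesktop.login1.Session.TakeDevice, which takes the device node’s major/minor numbers and hands back an already-opened file descriptor. A small sketch of the unprivileged side (the helper name is made up; the actual DBus call is only indicated in a comment, since it needs a running logind session):

```python
import os

def drm_device_numbers(path="/dev/dri/card0"):
    """Major/minor numbers of a device node. Passing these to logind's
    org.freedesktop.login1.Session.TakeDevice(major, minor) yields an
    opened file descriptor for exactly this device, so the compositor
    itself never needs the privileges to open() it."""
    st = os.stat(path)
    return os.major(st.st_rdev), os.minor(st.st_rdev)

# The actual call, roughly (requires a logind session on the bus):
#   fd = session_interface.TakeDevice(major, minor)
```

On a real system, `drm_device_numbers("/dev/dri/card0")` gives the pair to hand to TakeDevice; logind also revokes the fd again on session switch, which is what makes the scheme secure.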

There is still quite some work needed before this backend, which got merged into master today, is fully functional. For example, all outputs are currently placed at (0, 0), that is, they render overlapping sections. Here we will in one way or another integrate with kscreen to get sane defaults from the start. I’m so looking forward to being responsible for screen layouts. If I look at all the terrible things one has to do to stay properly updated on XRandR… Now we won’t render with a wrong viewport because different parts of the stack are stuck in different states, and no, we won’t freeze just because applications start up and make an XRandR call. There are so many things which “just work” and which have been painful on X11. From my first tests on hardware I’m confident that we will finally have working multi-screen handling which is adequate for 2015.

Of course this is a wonderful time to start working on KWin. There are lots of small tasks to work on. I am constantly adding new tasks to our todo list. And if you ask, I can always provide some more small tasks.