Plasma 5.7 will ship with a new task manager library. One of the reasons for implementing a new library is the port to Wayland. Our old code base relied heavily on X11 and didn’t support the concept of multiple windowing systems. You can read more on that in Eike’s blog post about the new task manager. In this blog post I want to focus a little bit on the Wayland side of a task manager.
Architectural difference to X11
On X11 every application is able to read every aspect of every other window. This makes it possible to implement a task manager. The task manager can get notified when a window is added or removed, and it can install event filters for changes to the window properties. Those are the basics for a task manager.
On Wayland the world is different. An application is not able to see the windows of other applications. This means that by default Plasma is not able to see which windows exist, so the task manager would be empty. The only application which knows about the existing windows is the Wayland compositor.
Merging Desktop Shell and Wayland compositor?
A possible solution to this problem could be to merge the desktop shell and the compositor into one process. This is the architecture used in e.g. Unity and GNOME Shell on X11.
But such an architecture also has disadvantages. It violates the concept of separating concerns and of “do one thing and do it right”. We have an excellent task manager, an excellent window manager and an excellent desktop shell.
It would also make code sharing between platforms more difficult. A task manager for Wayland would be very different from a task manager on X11, although all the business logic and presentation should be the same no matter which windowing system is used.
By merging the code bases together we would also make the code less reusable. One couldn’t use KWin/Wayland in another desktop environment (e.g. LXQt), nor could one use Plasma/Wayland with another compositor (e.g. sway). While we of course recommend using KWin as the Wayland (and X11) compositor, we don’t enforce it and also don’t want to enforce it.
Protocols, protocols
So a different solution is needed, and we provide it through a custom Wayland protocol called org_kde_plasma_windowmanagement. We have a few interfaces prefixed with Plasma. This does not mean that they can only be used by Plasma. It just means that the interface was first designed and developed for Plasma’s needs. We are happy to share the interface with other desktop environments.
The protocol is provided by KWin and it announces whenever a new window gets created. Once a new window gets created, a bound client can bind an org_kde_plasma_window interface and get notified about all the states of the window.
The protocol exposes pretty much all the information KWin has about the window and it gets updated automatically whenever the state in KWin changes. In addition the protocol has the requests the task manager needs, like “close the window”, “minimize it”, “maximize it”, etc. KWin listens to these requests and honors them.
Although the protocol is added to Wayland, it is windowing system agnostic. The created Plasma Window does not expose the actual windowing system resource (after all, on Wayland another client cannot get access to it). KWin exposes a Plasma Window for both X11 managed windows and Wayland windows. This way the task manager is able to manage tasks from multiple windowing systems without knowing that it does so.
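To make the flow above more concrete, here is a minimal sketch in Python. It is only a toy model of the idea, not the actual protocol or the KWayland API: the class names, event names and methods (Compositor, TaskManager, request_minimize and so on) are made up for illustration. The compositor owns all windows, announces them under an opaque id, pushes state changes, and honors requests from the task manager, which never learns whether a window comes from X11 or Wayland.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Window:
    """Compositor-side state of one window."""
    title: str
    backend: str            # "x11" or "wayland"; never sent to clients
    minimized: bool = False


Listener = Callable[[str, int, dict], None]


class Compositor:
    """Hypothetical stand-in for the compositor: the only party that sees all windows."""

    def __init__(self) -> None:
        self._windows: Dict[int, Window] = {}
        self._listeners: List[Listener] = []
        self._next_id = 1

    def bind(self, listener: Listener) -> None:
        # A client binds the (toy) window management interface.
        self._listeners.append(listener)

    def _emit(self, event: str, wid: int) -> None:
        window = self._windows.get(wid)
        # Only windowing-system-agnostic state is exposed.
        state = {} if window is None else {
            "title": window.title,
            "minimized": window.minimized,
        }
        for listener in self._listeners:
            listener(event, wid, state)

    def create_window(self, title: str, backend: str) -> int:
        wid = self._next_id
        self._next_id += 1
        self._windows[wid] = Window(title, backend)
        self._emit("created", wid)
        return wid

    def request_minimize(self, wid: int) -> None:
        # The compositor honors the request and announces the new state.
        self._windows[wid].minimized = True
        self._emit("changed", wid)

    def request_close(self, wid: int) -> None:
        del self._windows[wid]
        self._emit("removed", wid)


class TaskManager:
    """Client side: mirrors whatever the compositor announces."""

    def __init__(self, compositor: Compositor) -> None:
        self.tasks: Dict[int, dict] = {}
        self._compositor = compositor
        compositor.bind(self._on_event)

    def _on_event(self, event: str, wid: int, state: dict) -> None:
        if event == "removed":
            self.tasks.pop(wid, None)
        else:
            self.tasks[wid] = state

    def minimize(self, wid: int) -> None:
        self._compositor.request_minimize(wid)

    def close(self, wid: int) -> None:
        self._compositor.request_close(wid)
```

In this sketch a task manager created with TaskManager(compositor) sees an X11 window and a Wayland window in exactly the same way, because the backend field never leaves the compositor.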
Evolution of the protocol
We added the Plasma window protocol initially for the work on the Plasma phone about a year ago. On the Plasma phone we also had the use case of switching windows, and using the X11 based task manager was simply not an option on a Wayland only device 😉
While the protocol worked fine for our initial needs, it was not yet sufficient for use in the desktop task manager. So over the last weeks we introduced further support and improved the API to make it work more reliably. Quite a few changes also went into KWin to ensure that only “interesting” windows are exposed. E.g. a context menu should not be shown in the task manager.
Our KWayland library of course has support for this protocol through the classes PlasmaWindowManagement and PlasmaWindow on the client side. In addition there is a model exposed as PlasmaWindowModel.
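As a rough illustration of what “only interesting windows” means, the following Python sketch filters a window list before it is exposed. The window kinds and the rule itself are assumptions made up for this example; the actual checks in KWin are not described here and are likely more involved.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical window kinds that should never show up as tasks.
TRANSIENT_KINDS = {"context_menu", "tooltip", "popup"}


@dataclass(frozen=True)
class WindowInfo:
    title: str
    kind: str  # e.g. "normal", "dialog", "context_menu" (illustrative values)


def is_interesting(window: WindowInfo) -> bool:
    """Return True if the window should be exposed to the task manager."""
    return window.kind not in TRANSIENT_KINDS


def exposed_windows(windows: Iterable[WindowInfo]) -> List[WindowInfo]:
    """Keep only the windows a task manager should see, in their original order."""
    return [w for w in windows if is_interesting(w)]
```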
Future work
The work on the interface is not done yet. There are still changes to the API needed and the upcoming KDE Frameworks 5.23 release will include new API calls in PlasmaWindow. We also already have the first change scheduled for KDE Frameworks 5.24. And we know that we need to expose a few more data elements in the protocol to get the task manager to the same level as on X11.
There is also an interesting idea floating around to use the – windowing system agnostic – Wayland implementation on both X11 and Wayland. For this KWin (on X11) would need to create a dummy Wayland server for the task manager to connect to. It’s an idea which we might or might not implement.
Security considerations
Currently every Wayland client is able to bind this interface, which means that some of the security improvements of Wayland are not available in Plasma/Wayland. We are fully aware of that and were also fully aware of the consequence when we added the interface. I do have ideas on how to address this and this will be implemented before we recommend the Plasma/Wayland session for daily usage. The design allows adding security checks in these areas. Unfortunately my priority list did not allow me to implement this for Plasma 5.7; the next chance is 5.8.
Nice stuff going on 🙂
Regarding changes X11 -> Wayland:
How would one implement a screenshot utility and a color picker? Would KWin also be involved there?
Yes, that needs support from the compositor. Applications cannot access the composed image, so they can neither take a screenshot nor pick a color. Both need protocols, with the color picker being the easier one. All it needs is a way to ask the compositor “please give me a color” and the compositor returns it. All the picking then needs to happen in the compositor.
Well, there should also be some kind of security there, as an application could ask for the color of each pixel and effectively take a screenshot.
Making a compositor both secure and feature-rich is quite a challenging task, but I trust that you Plasma devs can make one!
Not really. The compositor would be responsible for ensuring the user gets to choose which pixel gets picked, so the application couldn’t “ask for the color of each pixel” in any meaningful way.
Hell, the compositor could expose some kind of “This application doesn’t seem to be giving you time to do anything else. Do you want to block it from requesting more color pickers?” dialog similar to the interrupt dialogs browsers offer for things like unresponsive JavaScript and floods of alert() dialogs.
If the protocol does not allow saying “give me the color at x/y”, but only “give me a color”, and the user has to actively click, it becomes impossible for the application to get a screenshot that way.
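A small Python sketch of that design, with invented names (PickerCompositor, request_color, user_clicked) since no such protocol exists yet: the request carries no coordinates, and a color is only delivered when the user actually clicks, so each request costs one deliberate user action.

```python
from typing import Callable, Dict, List, Tuple

Color = Tuple[int, int, int]
Position = Tuple[int, int]


class PickerCompositor:
    """Hypothetical compositor side of a color picker protocol."""

    def __init__(self, pixels: Dict[Position, Color]) -> None:
        self._pixels = pixels  # composed image, private to the compositor
        self._pending: List[Callable[[Color], None]] = []

    def request_color(self, on_picked: Callable[[Color], None]) -> None:
        """Client request: deliberately takes no x/y argument."""
        self._pending.append(on_picked)

    def user_clicked(self, x: int, y: int) -> None:
        """Driven by real user input inside the compositor, never by clients."""
        if not self._pending:
            return  # nobody asked for a color, nothing leaves the compositor
        on_picked = self._pending.pop(0)
        on_picked(self._pixels[(x, y)])
```

Because the position comes from the user's click and not from the client, an application cannot walk over the screen pixel by pixel.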
How are window thumbnails handled by this, or are they?
Window thumbnails are one of the features that are still missing.
Shouldn’t this task manager protocol be part of the Wayland core (an out-of-the-box standard) rather than a KWin development? I know the protocol is generic and reusable and someone needs to take the first step (chicken-and-egg problem), but as this is something common to every compositor, I think it would be better maintained as part of Wayland. On the other hand, if the protocol is meant to become the foundation of something widely adopted in the future, wouldn’t it be better to have a name more Plasma/KDE agnostic than “org_kde_plasma_windowmanagement”?
I don’t think that this is common to all compositors. As I explained, most have an architecture which makes such a protocol unnecessary. If other compositors approach us about upstreaming it, we are happy to do so. As long as we are the only users, it doesn’t make sense.
> We are happy to share the interface with other desktop environments.
Thank you thank you thank you thank you thank you! I’m looking forward to hearing more from you on what KDE will need / want and how we can build cross-DE desktop interfaces and, of course, the security that goes with it.
May I suggest you take another look at libwsm (https://github.com/mupuf/libwsm/), when you get to working on permission handling? Does it do the job in terms of what you need to express for the task manager and colour picker scenarios? Does it relinquish control to users in ways that you think are appropriate for KDE users? The question stands both conceptually (the bit that matters) and technically (though we could throw away any form of implementation in favour of something better or easier to use).
Sure we will have a look at libwsm again. Currently I’m toying with an idea based on Giulio’s suggested interface.
Oh, I had completely missed that. Interesting work, and great for pre-caching authorisation! However I can see cases where you might want to give one-time / short-lived authorisations rather than continued ones.
For instance if Skype is asking to record my desktop whilst I’m on a video call and press the “Share Desktop” button, that shouldn’t allow Skype to secretly record it too while it’s in daemon mode, because Skype might be CIA spyware for all we know. So in that situation a hard allow at client initialisation fails to capture a potentially good policy: authorise on a case-by-case basis or based on contextual cues (e.g. a Skype window being the active window).
Likewise if I’m the CIA agent / Microsoft developer currently developing and testing the spyware inside Skype, I might want a way to force-authorise the feature to help me with debugging, in which case the authoriser interface would work (but so would asking for permission every time and automatically receiving it, in so far as the compositor has a way to remember this app is always authorised).
I can see that the discussion on resources that should have only one user at a time (e.g. virtual keyboard) prompts us to change how we present authorisation to clients in the libwsm model. Maybe authorisation APIs should both present the security decision and the actual ability of the client to grab the resource (a bit like X11 might tell you a keyboard global is already grabbed; you know you can always try again later). Maybe Wayland compositors could also notify clients when a resource becomes available again once clients have asked for it?
Do you think these are valid points? If so then I expect it would be good to have this conversation on wayland-devel so I get to hear others’ opinions and ideas.
Yes, those are valid points. I hadn’t thought about something like Skype yet and rather thought that allowing once would be sufficient.
Can you give some details on how you would make this secure? Thanks!
My current idea is to introduce an authorization protocol. The client would have to ask through the authorization protocol whether it’s allowed to use the privileged interface. The compositor can then decide by itself or delegate to polkit. If an interface is used without authorization, that raises a protocol error.
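To illustrate the idea, here is a minimal Python sketch. Nothing in it is an existing protocol: the class and method names are invented, the policy callback merely stands in for “decide by itself or delegate to polkit”, and the client name “plasmashell” in the usage note is just an example string.

```python
from typing import Callable, Set, Tuple


class ProtocolError(Exception):
    """Models the protocol error raised for unauthorized use."""


class AuthCompositor:
    """Hypothetical compositor with an authorization step in front of privileged interfaces."""

    def __init__(self, policy: Callable[[str, str], bool]) -> None:
        # policy(client, interface) -> bool; could be a built-in rule
        # or a wrapper that asks an external service (assumption).
        self._policy = policy
        self._granted: Set[Tuple[str, str]] = set()

    def request_authorization(self, client: str, interface: str) -> bool:
        """The client asks whether it may use the privileged interface."""
        allowed = self._policy(client, interface)
        if allowed:
            self._granted.add((client, interface))
        return allowed

    def bind_privileged(self, client: str, interface: str) -> str:
        """Binding without a prior grant is a protocol error."""
        if (client, interface) not in self._granted:
            raise ProtocolError(f"{client} used {interface} without authorization")
        return f"{interface} bound for {client}"
```

With a policy such as lambda client, interface: client == "plasmashell", the shell can bind the interface after asking, while any other client that tries gets a ProtocolError. This sketch remembers a grant forever; the one-time or context-dependent grants discussed above would need an expiry or a per-use check on top of it.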
Hello, Martin. Do you have any plans to add support for EGL Streams to KWin, which the proprietary Nvidia driver requires in order to work in a Wayland session?
No, I don’t have any plans for that.
There is no problem with merging the shell and compositor processes.
They will still be doing their one thing and doing it right.
The concerns of their code-bases will remain separated.
The only disadvantage is that when one dies, the other dies too.