In the last blog post I explained how input devices are opened and handled in KWin. In this blog post I’ll take a closer look at keyboard devices and events.
Keyboards are not keyboards
Keyboards on Linux are weird. You don’t have one keyboard but many of them. Many devices also announce themselves as keyboards but support just one key. A good example of this is the power button, or an external headset which provides mute and volume up/down keys. From an input perspective such devices are also keyboards.
For us in KWin it is important to figure out what the keyboard really supports. If there is no “real” keyboard attached (or enabled), our virtual keyboard should get activated automatically. E.g. if you detach the keyboard from a convertible it should turn into tablet mode with a virtual keyboard. When the keyboard is attached again, the virtual keyboard should be disabled as the primary text input device. libinput provides a function to test which keys are supported. We use that to differentiate the classes of keyboards.
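To make the classification idea concrete, here is a minimal sketch in Python. It is not KWin’s code: the probe set, the class names and the helper functions are assumptions made up for illustration, and each device is modelled simply as the set of evdev key codes it reports. In libinput the per-key query is `libinput_device_keyboard_has_key`.

```python
from enum import Enum

# Linux evdev key codes (linux/input-event-codes.h).
KEY_1, KEY_Q, KEY_A, KEY_Z, KEY_SPACE = 2, 16, 30, 44, 57
KEY_MUTE, KEY_VOLUMEDOWN, KEY_VOLUMEUP, KEY_POWER = 113, 114, 115, 116


class KeyboardClass(Enum):
    ALPHANUMERIC = "alphanumeric"        # a "real" keyboard
    SPECIAL_KEYS = "special keys only"   # power button, headset, ...


# Hypothetical probe set: a device supporting all of these is treated
# as a real keyboard. The exact keys KWin checks may differ.
PROBE_KEYS = (KEY_1, KEY_Q, KEY_A, KEY_Z, KEY_SPACE)


def classify(supported_keys):
    """Classify a device by the set of key codes it supports."""
    if all(key in supported_keys for key in PROBE_KEYS):
        return KeyboardClass.ALPHANUMERIC
    return KeyboardClass.SPECIAL_KEYS


def virtual_keyboard_needed(devices):
    """True if no attached device is a real keyboard."""
    return not any(
        classify(keys) is KeyboardClass.ALPHANUMERIC
        for keys in devices.values()
    )
```

With only a power button and a headset attached, `virtual_keyboard_needed` returns `True`; as soon as a device supporting the probe keys shows up, it returns `False`.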
Keyboards are the simplest input devices out there. Libinput emits only one event type for them, LIBINPUT_EVENT_KEYBOARD_KEY, and that only contains the key which was either pressed or released. KWin reads events from libinput in a dedicated thread, so each event only gets queued and our main thread is notified about the new event. Once the main thread processes the event, the event gets translated and passed into our input redirection classes.
All input events go through the input redirection, no matter which source delivers them. KWin supports not only events from libinput, but also nested setups (KWin running on top of X11 or on top of another Wayland server) and fake events used in our integration tests. This means that once the event reaches the input redirection we in general lose the information about which device created it. Recently, though, we extended the internal API to optionally include the device in the event handling. This is used by the Debug Console to show on which device an event was generated. But more on that later.
Now the key press/release event has reached our central dispatching method KeyboardInputRedirection::processKey. The first (and most important) task is to update the keyboard state in xkbcommon. Xkbcommon is used to translate a hardware key, together with a layout, into the actual key symbol depending on the state of the keyboard (e.g. active modifiers). To explain: if I press the “y” key (key code 21) while holding the “Shift” key, it will create a “Z” with the German keyboard layout, but a “Y” with the English layout. Simplified, that’s the job of xkbcommon.
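The “y”/“Z” example can be written down as a tiny lookup. This is only a toy model of what xkbcommon does, assuming a hand-written two-entry layout table; the names `LAYOUTS` and `key_to_symbol` are invented for this sketch and real keymaps are far richer.

```python
KEY_Y = 21  # evdev code of the key labelled "Y" on a US keyboard

# Hypothetical miniature keymaps: key code -> (plain symbol, shifted symbol).
LAYOUTS = {
    "us": {KEY_Y: ("y", "Y")},
    "de": {KEY_Y: ("z", "Z")},  # German layout has Z in this position
}


def key_to_symbol(layout, key_code, shift_active):
    """Translate a hardware key code into a symbol for the given layout."""
    plain, shifted = LAYOUTS[layout][key_code]
    return shifted if shift_active else plain
```

The same hardware key code gives a different symbol depending on both the layout and the modifier state, which is why the keyboard state has to be updated before anything else looks at the event.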
In KWin we have wrapped all functionality for xkbcommon in a dedicated class called Xkb. This class tracks for us the active layout and performs the layout switching (including showing the OSD when the layout changes). It knows the last composed key symbols, the currently active modifiers and the modifiers relevant for shortcut activation.
When updating the state of xkb we also check what changed. Did the user activate the num lock? If yes we need to announce that the LEDs changed, so that our libinput code can update the LEDs on the physical keyboard. Did a modifier change? If yes we need to inform our Wayland windows about the new modifier set. In Wayland this is tracked on the server, although the actual translation from key to symbol happens on the client. So why does KWin also do the translation? KWin also needs the keysym in various places, e.g. the filter in Present Windows or in general for triggering global shortcuts.
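As a rough illustration of “update the state and check what changed”, here is a small Python sketch. It is an assumption-laden toy, not the Xkb class: it only knows one modifier and one lock key, and the class and method names are made up.

```python
from dataclasses import dataclass, field

KEY_LEFTSHIFT, KEY_NUMLOCK = 42, 69  # evdev key codes


@dataclass
class KeyboardState:
    modifiers: set = field(default_factory=set)
    leds: set = field(default_factory=set)

    def update_key(self, key_code, pressed):
        """Apply one key event; return (modifiers_changed, leds_changed)."""
        old_modifiers, old_leds = set(self.modifiers), set(self.leds)
        if key_code == KEY_LEFTSHIFT:
            if pressed:
                self.modifiers.add("Shift")
            else:
                self.modifiers.discard("Shift")
        elif key_code == KEY_NUMLOCK and pressed:
            # A lock key toggles on each press; the release is ignored.
            self.leds ^= {"NumLock"}
        return self.modifiers != old_modifiers, self.leds != old_leds
```

A caller would use the two flags to decide whether to tell the Wayland clients about a new modifier set and whether to update the LEDs on the physical keyboard.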
Our Xkb state updating functionality is also responsible for handling modifier only shortcuts. Actually it’s the wrong place for it, but our input filtering code does not guarantee that a filter sees all input events. For the modifier only shortcuts it’s essential to see all events, though, and the only place is directly in Xkb. Not the most elegant solution, but it works. This functionality is also used by X11 as I explained in an older blog post.
Filtering through KWin
Now KWin has enough information to process the key event. For that it creates a customized QKeyEvent and sends it through a chain of input filters. Each filter can perform an operation based on the event and decide whether the event should be processed further or whether event processing should end.
For example pretty early in the chain we have the lock screen filter. If the screen is locked this filter intercepts the event processing and ensures that the event is only sent to the screen locker and not to any window. Or there is a filter ensuring that ctrl+alt+f1 works even if the screen is locked. Another filter is responsible for handling global shortcuts, one for passing events to our effects system (such as Present Windows).
The last filter in the chain is our forwarding filter. The task of this filter is to forward the events to a window. It passes the event to KWayland::Server from where it is sent to the currently focused Wayland surface.
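The filter chain can be sketched in a few lines of Python. All class and function names here are hypothetical and the filters are heavily simplified (for instance, the lock screen filter in this sketch just swallows the event instead of passing it to the screen locker); it is meant to show the control flow, not KWin’s actual classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    key: str
    pressed: bool


class InputFilter:
    """Return True to end event processing, False to pass the event on."""
    name = "base"

    def key_event(self, event):
        return False


class LockScreenFilter(InputFilter):
    name = "lock screen"

    def __init__(self, locked):
        self.locked = locked

    def key_event(self, event):
        # While locked, nothing further down the chain sees the event.
        return self.locked


class GlobalShortcutFilter(InputFilter):
    name = "global shortcuts"

    def __init__(self, shortcuts):
        self.shortcuts = shortcuts

    def key_event(self, event):
        return event.pressed and event.key in self.shortcuts


class ForwardFilter(InputFilter):
    name = "forward"

    def __init__(self):
        self.forwarded = []

    def key_event(self, event):
        # Last filter: hand the event over to the focused window.
        self.forwarded.append(event)
        return True


def process_key(filters, event):
    """Run the event through the chain; return the name of the consumer."""
    for input_filter in filters:
        if input_filter.key_event(event):
            return input_filter.name
    return None
```

The order of the list is the policy: a filter placed earlier can prevent every later filter from seeing the event, which is why the lock screen filter sits near the front and the forwarding filter at the very end.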
Focused Keyboard surface
The Wayland server needs to know the focused keyboard surface for that. For keyboard focus this is relatively trivial in KWin, as KWin has a concept of an “active” window. Before forwarding the event KWin checks which window has keyboard focus. If there is an active window, the surface of that window is marked as the focused keyboard surface in KWayland::Server.
Our KWayland::Server library takes care of sending keyboard leave and keyboard enter events to the respective windows, so that KWin doesn’t have to care about this. This is one of the advantages of having an abstraction with KWayland::Server – everything that is not of relevance to the compositor is handled directly in the library.
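A minimal sketch of that division of labour, assuming a made-up `Seat` class that stands in for the library side: the compositor only says which surface is focused, and the leave/enter bookkeeping happens inside. The method names and the event tuples are inventions for this example, not the KWayland::Server API.

```python
class Seat:
    """Toy stand-in for the library: tracks focus and records sent events."""

    def __init__(self):
        self.focused = None
        self.sent = []

    def set_focused_keyboard_surface(self, surface):
        if surface == self.focused:
            return  # nothing changed, no events needed
        if self.focused is not None:
            self.sent.append(("leave", self.focused))
        self.focused = surface
        if surface is not None:
            self.sent.append(("enter", surface))

    def key(self, key_code, pressed):
        # Key events only ever go to the surface with keyboard focus.
        if self.focused is not None:
            self.sent.append(("key", self.focused, key_code, pressed))
```

The compositor side then boils down to calling `set_focused_keyboard_surface` with the active window’s surface before each `key` call; repeated calls with the same surface are free.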
Key event processing in Wayland
The forwarding input filter has updated the keyboard surface and now sends the key event to the Wayland client. For that, all the processing into key symbols is not needed: only the key code is sent to the client.
The client gets the key event through a callback and now also sends it through xkbcommon. In Wayland the keymap is sent from the server to the client, so that both server and client have the same keymap. The client can now do a translation from key code to key symbol, just like KWin did before.
The further event processing is handled inside the client. E.g. in Qt this will generate a QKeyEvent which is then sent to the focused widget.
Keyboard input also has a special mode: repeating keys. When a key is held down, some keys should generate repeated key events. KWin uses the configuration from the keyboard module to decide when and how often a key should repeat. A repeating key is not forwarded to the Wayland clients. Instead KWin announces the key repeat settings through the Wayland keyboard protocol, and the repeat is then handled directly in the client.
Unfortunately in Qt this is broken and a hardcoded value is used. So currently in a Plasma Wayland session key repeat is rather broken, as it’s handled differently depending on the application used. KWin is correct, X11 applications are correct, GTK applications are correct, Qt applications are incorrect if run on Wayland.
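To illustrate what “handled in the client” means, here is a small sketch of how a client might turn the announced settings into repeat events. In the Wayland keyboard protocol the settings arrive as a rate (characters per second) and a delay (milliseconds before repeating starts), with a rate of zero disabling repeat. The function name and the idea of returning a list of timestamps are assumptions made for this example; a real client would use a timer.

```python
def repeat_times(rate, delay_ms, held_ms):
    """Times (ms after the press) at which a client would synthesize repeats.

    rate     -- repeats per second announced by the compositor (0 disables)
    delay_ms -- time the key must be held before the first repeat
    held_ms  -- how long the key was actually held down
    """
    if rate <= 0:
        return []
    times = []
    n = 0
    while True:
        t = delay_ms + n * 1000 / rate
        if t > held_ms:
            break
        times.append(round(t))
        n += 1
    return times
```

A client that ignores the announced values and uses a hardcoded rate and delay would simply call this with the wrong arguments, which is how two applications in the same session can end up repeating at different speeds.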
If you want to support our work, consider donating to our Make the World a Better Place! – KDE End of Year 2016 Fundraising campaign.
42 Replies to “How input works – Keyboard input”
Which point in this grand picture do you expect autotype functionality of password managers (for example, KeePassX) to attach to? Losing autotype with migration to Wayland will be unfortunate, but IIUC it’ll be required that the password manager is “trusted” in some way, won’t it?
I have no idea what “autotype” is.
Basically simulating the user typing in the password stored in the password manager. Sure, you can copy the password to the clipboard, but autotype works in some cases clipboard does not (for example, if the target app is in a VM which has a separate clipboard not integrated with the host one).
There is no protocol to fake key events. Faking key events is dangerous and should not be allowed by default; it’s very easy to abuse. On X11 it’s a testing extension, not a standard feature. For reasons unknown to me it’s enabled by default.
A particular problem is that KeePassX cannot know the keyboard layout of the application it’s trying to write to. What it has stored is a string. Let’s consider an example: my name – Gräßlin. How to send this to an application using key codes? With an English layout you cannot even write it. That’s a problem also on X11. It’s pure chance if that feature works.
The only useful way can be to use copy and paste.
So your password manager sending key events is more dangerous than storing your passwords as plain text in your clipboard?
Yes, as Wayland has the clipboard implemented as peer-to-peer. Clipboard content is only available to the application having keyboard focus.
Following up on the clipboard security: isn’t it still unsafe even if the content is shared only with the application which is in focus?
What if I copy the password and alt-tab through a bunch of unsafe windows before finding the trusted browser?
By the way, thank you for the great work 🙂
And how should an unsecured app know that this is a password?
Well, that’s often pretty obvious if the password is a generated one – people rarely use other 8+ character mixed-case-with-number strings except probably for chemical formulae…
I’ve tried it out of curiosity and it worked.
I’ve added a bunch of Unicode characters which are not on my Italian keyboard and performed the autotype on Kate. All the characters were exactly as they were supposed to be.
This is the code which performs the autotype
and apparently this is the one which does the translation
My c++ is not as good as I’d like, but it seems to me that essentially it creates a virtual keyboard to have the right mapping between keycodes and characters.
It uses XTest. There is no guarantee that this can work.
To add more on it: we have keyboard input from KDE Connect and I haven’t implemented it on Wayland, as I think it’s not possible to implement this reliably.
Would it be possible, both for autotype and for KDE connect, to hook into this not as a simulated physical keyboard, but as a special type of virtual keyboard / input method editor? IIUC, the input methods have to work with strings anyway… but they are not mentioned in the original article at all. Are they simply not supported?
I am interested in the input methods on Wayland as well.
From what I understand, the text-input and input-method Wayland protocols are still unstable. While a new version has been suggested this year (some comments on it can be found here https://lists.freedesktop.org/archives/wayland-devel/2016-July/029887.html), it has not yet been added to the official unstable protocols. QtWayland does have an implementation of a preliminary version of the protocol but I am not sure where it is being used.
ibus on Gnome Wayland (the only supported input method there, I think) is not using a Wayland protocol at all but is communicating with the compositor through Dbus (IIRC).
Input methods are not mentioned in the article as they are not keyboards and this article was about keyboards.
But yes, sending keysyms is what the text input protocol does. So that can be a solution. But I don’t think the protocol is suited for that.
In fact we are looking at it from the wrong perspective. What is needed is a protocol for an app to talk to KeePassX to get the password. Going through the windowing system is a hack and the wrong layer. Wayland gives us the chance to fix these protocol abuses.
How would you suggest integrating such a protocol into apps? Should the toolkits automatically provide the “query secret store” action for any password entry field?
And by the way, you do mention the virtual keyboard. Does it work as the “proper” hardware keyboard simulation, or just as a text input method with the text input protocol?
It’s just text input
Would having KeePassX create a virtual keyboard for this feature be a more proper solution then?
I reckon many would like to see a similar feature implemented, perhaps just not enabled by default.
As I just commented: I think the proper way is a dedicated communication channel between the application and the keyboard application. No windowing system involved at all.
There should be some protocol, that will allow one application to bypass security. Like LoginD allows one application to read input* files.
Well, you probably have one suitable low-level protocol… you can give your application permissions to create an uinput device and simulate the keyboard. A bit like what fuse does for filesystems. Not that it would be easy…
Plasma on Wayland is starting to feel quite stable. Thanks for all your work. However, keyboard handling is unfortunately one of the areas that needs more work to be usable. I have lost track of the software layers: what is Plasma/KWin responsible for and what should be fixed in Qt? Could you point to some bug trackers where to search for issues to see if something is being done about, for example, the non-working dead keys? (Which, by the way, have been acting even more strangely lately: now they open KRunner…)
missing dead-key support is https://bugreports.qt.io/browse/QTBUG-54792 – in general all input related bugs seem to be in QtWayland.
Ok, thanks. One more question: the bug report talks about the compose key. Does this also cover the “classic dead key” (as defined by the Wikipedia page on Compose key)? Or maybe a new bug report should be filed requesting this to be implemented? The compose key is not a substitute for the classic dead key handling which people have been accustomed to since mechanical typewriters.
Inside xkbcommon compose and dead-key support are closely related. So if you support one, you support both. See https://xkbcommon.org/doc/current/group__compose.html
Probably a fringe use case but the focused window reminded me: can there only be one focus? There has been mpx (multi pointer x) support in xinput for several years where you can have several independent mouse pointers and a keyboard associated to their respective window focus, and there was a proof of concept window manager that supported it. As far as I know all of the mainstream window managers were always very confused when using this.
Any plans to have this work on Wayland?
As you say it’s a fringe use case. So no we don’t have any plans to support this.
I love your blog. Stuff like this is too often badly documented and explained, and users and integrators scratch their heads over design and implementation decisions.
BIG KUDOS to you for explaining that!
I love this blog, too. Thank You very much for not only writing all of this but for spending additional time to show us how it works.
I was just reading about Emoji input on GNOME (https://fedoramagazine.org/using-favorite-emoji-fedora-25/), which relies on ibus-typing-booster. It also provides auto-completion features for many non-European languages: http://mike-fabian.github.io/ibus-typing-booster/index.html
Which made me wonder: how about Plasma/KWin? On X11, it seems to work regardless of the environment you are using. Would that still be the case in the Wayland world? More generally, what about inputting non-European characters using KWin-wayland?
I was told that one of the frameworks for non-European characters just works on KWin/Wayland. It’s not an area of my expertise so I need to pass that on to others.
Martin, on Kubuntu 16.10, when I right click near the right edge of my right display, the menu still opens at the right edge of the first display. Where should I report this issue? Plasma 5.8.4
That’s totally off-topic for my blog post. This is not the Kubuntu support section. Please don’t report bugs in comments on my blog posts.
“Actually it’s the wrong place for it, but our input filtering code does not guarantee that a filter sees all input events.”
Could you elaborate on why there’s no guarantee here?
“Before forwarding the event KWin verifies which is the focused keyboard window. If there is an active window the surface of that window is marked as the focused keyboard surface in KWayland::Server.”
Doesn’t this create a race condition? Is it possible that, after the focused keyboard surface was marked and before the event gets processed by KWayland::Server, the focused window changes and another surface gets marked as keyboard focused?
Btw, I’m not sure if “focused keyboard surface” is the right term. If this is meant to mean “[Wayland’s] surface which should receive all keyboard events” then maybe “surface with keyboard focus” or “keyboard focused surface” would be better? Could some native English speaker help us here, please?
There is no guarantee as the implementation doesn’t provide such a guarantee.
No, there is no race condition there. It’s all handled inside KWin. KWayland::Server is a library KWin uses, so the state is always in sync. Clients cannot focus themselves so it cannot go out of sync.
“There is no guarantee as the implementation doesn’t provide such a guarantee.”
Well, while technically probably correct, this is a rather terse response 🙂 What is so “special” that happens between the time xkbcommon is called and the time event filters are called, so that the former gets all events but the latter do not?
It’s the event filters which don’t have that guarantee. That’s how they are designed.
I guess you mean that some events could be filtered out by filters earlier in the chain, not that there are events which do not get directed to the filter chain at all. I guess that adding InputEventSpy (InputEventObserver?) is meant to provide this guarantee.