Why screen lockers on X11 cannot be secure

Today we released Plasma 5.2 and this new release comes with two fixes for security vulnerabilities in our screen locker implementation. As I found, exploited, reported and fixed these vulnerabilities I decided to put them a little bit into context.

The first vulnerability concerns our QtQuick user interface for the lock screen. Through the Look and Feel package it was possible to send the login information to a remote location. That’s pretty bad but luckily also only a theoretical problem: we have not yet implemented a way to install new Look and Feel packages from the Internet. So we found the issue before any harm was done.

The second vulnerability is more interesting as it is heavily related to the usage of X11 by the screen locker. To put this vulnerability into context I want to discuss screen lockers on X11 in general. In a previous post I explained that a screen locker has two primary tasks:

  1. Blocking input devices, so that an attacker cannot interact with the running session
  2. Blanking the screen to prevent private information being leaked

From the first requirement we can also derive a requirement that no application should get the input events except the lock screen and that the screen gets locked after a defined idle time. And from the second requirement we can derive that no application should have access to any screen content while the screen is being locked.

With these extended requirements we are already getting into areas where we cannot have a secure screen locker on X11. X11 is too old and too insecure to make it possible to fulfill the requirements. Why is that the case?

X11 on a protocol level doesn’t know anything of screen lockers. This means there is no privileged process which acts as the one and only screen locker. No, a screen locker is just an X11 client like any other (remote or local) X11 client connected to the same X server. This means the screen locker can only use the core functionality available to “emulate” screen locking. Also the X server doesn’t know that the screen is locked as it doesn’t understand the concept. If the screen locker can only use core functionality to emulate screen locking then any other client can do the same and prevent the screen locker from locking the screen, can’t it? And yes that is the case: opening a context menu on any window prevents the screen locker from activating.

That’s quite a bummer: any process connected to the X server can block the screen locker. Even more it could fake your screen locker. How hard would that be? Well I asked that question myself and needed about half an hour to implement an application which looks and behaves like the screen locker provided by Plasma 5. This is so trivial that I don’t see a point in not sharing the code:

#include <QGuiApplication>
#include <QQuickView>
#include <QQmlContext>
#include <QScreen>
#include <QStandardPaths>
#include <QtQml>

class Sessions : public QObject
{
    Q_OBJECT
    Q_PROPERTY(bool startNewSessionSupported READ trueVal CONSTANT)
    Q_PROPERTY(bool switchUserSupported READ trueVal CONSTANT)
public:
    explicit Sessions(QObject *parent = 0) : QObject(parent) {}
    bool trueVal() const { return true; }
};

int main(int argc, char **argv)
{
    QGuiApplication app(argc, argv);

    const QString file = QStandardPaths::locate(QStandardPaths::GenericDataLocation,
        QStringLiteral("plasma/look-and-feel/org.kde.breeze.desktop/contents/lockscreen/LockScreen.qml"));

    qmlRegisterType<Sessions>("org.kde.kscreenlocker", 1, 0, "Sessions");
    QQuickView view;
    QQmlContext *c = view.engine()->rootContext();
    c->setContextProperty(QStringLiteral("kscreenlocker_userName"),
                          QStringLiteral("Martin Graesslin"));
    c->setContextProperty(QStringLiteral("kscreenlocker_userImage"), QImage());
    view.setFlags(Qt::BypassWindowManagerHint);
    view.setResizeMode(QQuickView::SizeRootObjectToView);
    view.setSource(QUrl::fromLocalFile(file));
    view.show();
    view.setGeometry(view.screen()->geometry());
    view.setKeyboardGrabEnabled(true);
    view.setMouseGrabEnabled(true);

    return app.exec();
}

#include "main.moc"

This looks like and behaves like the real screen locker, but it isn’t. A user has no chance to recognize that this is not the real locker. Now if it’s that simple to replace the screen locker why should anyone go a complicated way to attack the lock screen? At least I wouldn’t.

And is there nothing which could be done to protect the real locker? Well obviously a good idea is to mark the one and only screen locker as authentic. But how to do that in a secure way on X11? We cannot e.g. show a specific user selected image. This would conflict with another problem with screen lockers on X11: it’s not possible to prevent that other windows grab screen content. So whatever the screen locker displays is also available to all other X11 clients. Also the window manager cannot help like preventing fullscreen windows to open fullscreen as can be seen in the code fragment above: it’s possible to bypass the window manager. Built in feature by X11.

Many of these issues could be considered as non-problematic using the old pragma of “if it runs, it’s trusted”. While I personally disagree, it just doesn’t hold for X11. If only clients of one user were connected to the X server one could say so. But X11 allows clients from other users and even remote clients. And this opens a complete new problem scope. Whenever you use ssh -X you open up your local system to remote attack vectors. If you don’t control the remote side it could mean that the client you start is modified in a way to prevent your screen from locking or to install a fake locker. I know that network transparency is a feature many users love, but it’s just a security night mare. Don’t use it!

Overall we see that attacking a screen locker or preventing that it opens up is really trivial on X11. That’s an inherent problem on the architecture and no implementation can solve them, no matter what the authors tell how secure it is. Compared to these basic attack vectors the vulnerability I found is rather obscure and it takes a considerable amount of understanding how X11 works.

Nevertheless we fixed the issue. And interestingly I chose to use the technology which will solve all those problems: Wayland. While we don’t use Wayland as the windowing system we use a custom domain-specific Wayland-based protocol for the communication between the parts our screen locker architecture. This uses the new libraries developed for later usage in kwin_wayland.

As we are speaking of Wayland: how will Wayland improve the situation? In the case of Plasma the screen locker daemon will be moved from ksmserver to kwin, so that the compositor has more control over it. Screen locking is a dedicated mode supported by the compositor. Whether a context menu is open or not doesn’t matter any more, the screen can be locked. Also the compositor controls input events. If it has the knowledge that the screen is locked, it can ensure that no input goes to the wrong client. Last but not least the compositor controls taking screenshots and thus can prevent that clients can grab the output of the lock screen.

KWin on speed

With the 5.2 release basically done, I decided to do some performance investigation and optimizations on KWin last week. From time to time I’m running KWin through valgrind’s callgrind tool to see whether we have some expensive code paths. So far I hadn’t done that for the 5.x series. Now after the switch to kdecoration2 I was really interested in the results as in the past rendering the decoration used to be a bottle neck during our compositing rendering loop.

Unfortunately callgrind doesn’t give us a good look on the performance of KWin as it neither includes GPU times nor roundtrips to the X server. Nevertheless it gives us a good look on our own CPU usage. I was rather surprised by the result as I didn’t find anything which looked bad. Nevertheless I was able to slightly optimize one method which is called whenever the X11 stacking order is changed by improving an internal algorithm which didn’t scope well with the larger than expected number of child windows of the root window.

But callgrind output wasn’t the only performance relevant thing I looked into. I investigated a really interesting bug report about the screen freezing for a short time when a new window opened. While I wasn’t able to reproduce the issue as is, I was able to reproduce a small freeze whenever a Qt 5 application opened. Interestingly only with a Qt 5 application. So I ran the same application in a Qt 4 and Qt 5 variant and only in the latter I got a freeze. Investigation showed pretty quick that KWin is not to blame, for one I got the freeze before KWin started to manage a window for it and I was able to reproduce with different window managers. With the help of xtrace I finally found the culprit and we found the appropriate bug report on Qt side. Also our KDE domain experts started to look into the issue on Qt side.

But still others were able get a small freeze whenever KWin started to manage a window. And in deed further investigation showed that the method handling the managing of a new window can take some time and can cause the compositor to drop frames. Ideally this would be solved by moving the compositor into a dedicated rendering thread but that’s quite a lot of work and might not help in that case as KWin’s main thread grabs the XServer while managing a window. So the better solution was to investigate why the method takes so long. To not drop any frames the method may not take longer than 16 msec, the shorter the better.

While managing a window KWin needs to read quite a lot of properties. Most of them are nicely read in a non-blocking way through the KWindowSystem framework, but some properties are also KWin internal and read in a blocking way. Most expensive was reading the icons which was triggering several round trips especially if the window did not specify the icons in a NETWM compliant way. This could easily cause a delay of 50 to 100 msec during managing a window. Overall the method could trigger up to 14 round trips to the X server which were not needed at all in the case of KWin. Our KWindowSystem framework got an adjustment to prevent the roundtrips if the user of the KWindowSystem framework has all required information already fetched. The result is that reading the icons is now significantly below one msec. For other roundtrip causing methods I introduced two new methods: one to perform the request, one to later read the result. This allowed to remove another set of roundtrips. My measurements showed that each roundtrip takes about half a millisecond on my system. Half an msec here, half an msec there easily adds. Unfortunately there are still some XLib calls (one to read motif hints and one to read WM_SIZE_HINTS) which ideally would get ported and as long as they are not ported delay the managing of a new Client.

Nevertheless this shows quite some nice improvements for our development version which will become 5.3 in a few months. Of course all of that would not have been possible without the switch from XLib to XCB.

Locking the screen before system suspends

Our Plasma workspace has offered the feature to lock the screen when resuming from suspend for a long time. Ideally the screen gets locked right before the system goes to suspend to ensure that the screen is properly locked when the system wakes up.

The process was controlled by powerdevil: when powerdevil decided that it should suspend the system, it invoked the lock screen to get the screen locked. But this has drawbacks. For example the screen locker doesn’t know that the system is going to suspend. Neither does powerdevil know when the screen is fully locked (the lock process includes multiple stages and the actual lock is hold before the screen gets blanked). Also it only works with powerdevil, if one suspends the session in a different way (e.g. through systemctl suspend) the screen doesn’t get locked. The worst drawback was that sometimes it was still possible that the system woke up and expose the session for a split second as the screen locker has no way to indicate to the system to wait with the suspend till everything is settled.

With the upcoming Plasma 5.2 release this will significantly improve by leveraging functionality provided by logind. The lock screen on resume functionality is moved from powerdevil into the screen locker. This already means that the screen locker has a more complete picture of what is going on and allowed us to streamline the settings: whether the screen locks on resume is now found in the same configuration module as all other screen locker related settings.

The most important change is that we are now able to ensure that the screen is locked before going to suspend. For this the screen locker holds an inhibitor lock. When the screen is going to suspend, the screen locker is notified by logind, can start the lock process and remove the inhibitor lock once the screen is properly locked. This means under normal condition the screen will be locked before going to suspend (of course a timeout in logind could be hit if everything takes too long or if the screen cannot be locked).

Obviously this will only work if the system is suspended through logind, which powerdevil does if logind is available. So our users who use logind get an improved experience. For our users who do not yet use logind nothing changes: powerdevil notifies the screen locker that it’s going to suspend and the screen locker starts the lock process. This obviously has the same limitations as described above.