Porting KWin to XCB: making C usable through RAII

One of the most important work areas in KWin is currently the porting from XLib to XCB. The main reason for this is that Qt 5 threw out the existing XLib based code and replaced it by a new XCB-based implementation using QPA. This is overall a good thing. But it means that applications which used native features like the X11-event filter need to be adjusted. There are a few areas which need porting, which I summarized for KWin (hopefully completely) in this wiki page. If you use some native interaction it might make sense to check the list.

Technically we would not be required to port everything to XCB, but it gives us quite some advantages to do so. XCB is based directly on the X11 protocol which means we see better in the code what is going on. Also XCB is a complete asynchronous API which means we wait less for the X Server. This brings a noticeable improvement.

I want to present an example from KWin to show the difference between the synchronous XLib approach and the asynchronous XCB approch.

XWindowAttributes attr;
Window dummy;
Window* windows = NULL;
unsigned int count = 0;
QRect screen = QRect(0, 0, displayWidth(), displayHeight());
QVector<Window> proxies;
XQueryTree(display(), rootWindow(), &dummy, &dummy, &windows, &count);
for (unsigned int i = 0; i < count; ++i) {
    if (m_screenEdgeWindows.contains(windows[i]))
        continue;
    if (XGetWindowAttributes(display(), windows[i], &attr)) {
        if (attr.map_state == IsUnmapped)
            continue;
        const QRect geo(attr.x, attr.y, attr.width, attr.height);
        if (geo.width() < 1 || geo.height() < 1)
            continue;
        // some other checks
        proxies << windows[i];
    }
}

So what in this code is synchronous? In line 7 we ask the X Server for all windows – of course synchronous. Then we go through the list and get for each window we are interested in the window attributes (line 11). In the X protocol that are two requests to the server – one for the attributes and one for the geometry. Overall this means we have a synchronous call for each window we have to check. I recently looked at a delay when ending Present Windows or Desktop Grid and noticed that this for loop can take up to 500 msec.

Now let’s look at the same thing done with XCB (note: I did not compile it):

xcb_connection_t *c = connection();
xcb_query_tree_cookie_t cookie = xcb_query_tree_unchecked(c, rootWindow());
QVector<xcb_window_t> proxies;
QVector<xcb_get_window_attributes_cookie_t> attribsCookies;
QVector<xcb_get_window_geometry_cookie_t> geometryCookies;
QRect screen = QRect(0, 0, displayWidth(), displayHeight());

xcb_query_tree_reply_t *tree = xcb_query_tree_reply(c, cookie, NULL);
xcb_window_t *windows = xcb_query_tree_children(tree);
for (unsigned int i=0; i<tree->children_len; i++) {
    attribsCookies << xcb_get_window_attributes_unchecked(c, windows[i]);
    geometryCookies << xcb_get_geometry_unchecked(c, windows[i]);
}

for (unsigned int i=0; i<attribsCookies.size(); i++) {
    if (m_screenEdgeWindows.contains(windows[i])) {
        xcb_discard_reply(c, attribCookies.at(i));
        xcb_discard_reply(c, geometryCookies.at(i));
        continue;
    }
    xcb_get_window_attributes_reply_t *attrib = xcb_get_window_attributes_reply(c, attribsCookies.at(i), NULL);
    if (attrib->map_state == XCB_MAP_STATE_UNMAPPED) {
        xcb_discard_reply(c, geometryCookies.at(i));
        free(attrib);
        continue;
    }
    xcb_get_window_geometry_reply_t *geometry = xcb_get_window_geometry_reply(c, geometryCookies.at(i));
    const QRect geo(geometry->x, geometry->y, geometry->width, geometry->height);
    if (geo.width() < 1 || geo.height() < 1) {
        free(attrib);
        free(geometry);
        continue;
    }
    // some other checks;
    proxies << windows[i];
    free(attrib);
    free(geometry);
}
free(tree);

This example shows nicely the advantage of having an asynchronous API. We first have one loop to send all requests to the X-Server and then fetch the reply in a second loop. But we also see quite some disadvantages. First of all the_naming_is_just_awfull() and becomes worse when working with extensions. It’s nothing special for XCB, it’s a common problem to all C APIs, I always have to cringe when seeing such code. Also the method arguments are rather unhandy. We need to pass an xcb_connection_t* parameter to each method call. Fair enough the same had been the case with XLib, but still we have just one connection to the X-Server.

If we do not want to check for errors, we have to use the _unchecked variant of the method. It’s again a general C API problem by not having the possibility to overload methods.

And a real problem which is nicely visible in the code fragment is that the reply calls return a pointer and pass the responsibility for the pointer to the client of the API. For someone being used to a memory managed programming language like C++ with Qt that’s really painful. I expect to get back const references and not pointers I have to free.

Also if we do not need the reply for a request, we need to discard it, otherwise we would leak a reply. This makes the whole code segment rather unhandy.

We did not really like it and worked on improving this in general over multiple iterations. The whole thing can easily become memory managed by using a QScopedPointer with a QScopedPointerPodDeleter. That way we don’t have to call free on the replies any more. When the QScopedPointer goes out of scope the memory will be freed implicitly.

But that’s not all we did to improve it and I show now the same code segment how it looks like after applying our Wrapper classes:

QVector<Xcb::WindowId> ownWindows = windows();
Xcb::Tree tree(rootWindow());
QVector<Xcb::WindowAttributes> attributes(tree->children_len);
QVector<Xcb::WindowGeometry> geometries(tree->children_len);

Xcb::WindowId *windows = tree.children();
QRect screen = QRect(0, 0, displayWidth(), displayHeight());
QVector<Xcb::WindowId> proxies;

int count = 0;
for (unsigned int i = 0; i < tree->children_len; ++i) {
    if (ownWindows.contains(windows[i])) {
        // one of our screen edges
        continue;
    }
    attributes[count] = Xcb::WindowAttributes(windows[i]);
    geometries[count] = Xcb::WindowGeometry(windows[i]);
    count++;
}

for (int i=0; i<count; ++i) {
    Xcb::WindowAttributes attr(attributes.at(i));
    if (attr.isNull() || attr->map_state == XCB_MAP_STATE_UNMAPPED) {
        continue;
    }
    Xcb::WindowGeometry geometry(geometries.at(i));
    if (geometry.isNull()) {
        continue;
    }
    const QRect geo(geometry.rect());
    if (geo.width() < 1 || geo.height() < 1) {
        continue;
    }
    // further checks
    proxies << attr.window();
}

It’s a RAII approach ensuring like the QScopedPointer that the memory is freed once the object goes out of scope. In addition we got completely rid off the connection we have to pass to the method calls and we do no longer have to call discard: our Wrapper class is taking care of it. What you can also see is that the fact that the API is asynchronous is completely hidden. It still is async, the request is done in the constructor and the reply is fetched once you try to access the data for the first time.

Overall it makes the code much cleaner and easier to use and ensures that we do not leak memory even if we use loops with various code paths jumping out of them. Also we know that the code works (unit tested) and ensures that we do not repeat ourselves.

In the background we have a templated class which allows us to easily wrap any XCB request/reply group. So far we only support method calls operating on one xcb_window_t argument, but once we enabled C++11 in KWin we can easily wrap any command through variadic templates.

If you find this interesting: we have lots of work for XCB porting. Just come to #kwin on freenode and ping me (mgraesslin) or send a mail to kwin@kde.org.

This week in KWin (2013, week 5)

Major event last week has of course been the tagging of 4.10 and some last minute fixes for it. TabBox is now getting proper translucent backgrounds again, other parts of KWin still have unfortunately still problems with it. Most of the work has not yet made it into master and is still on review. There’s some new stuff for screen edges in the pipeline and some further areas got ported to XCB.

Summary

Crash Fixes

Critical Bug Fixes

    Bug Fixes

    • 313909: XReconfigureWMWindow fails to stack (lower) a window using an Above CWSibling combination
      This change will be available in version 4.11
      Git Commit
    • 192807: Plastik window decoration doesn’t paint for very small windows
    • 310945: New maximize effect leads to visual glitch
      This change will be available in version 4.10
      Git Commit

    New Features

      Tasks

        This week in KWin (2013, week 4)

        This week quite some bug fixes have entered both 4.10 and master branch. One of the most interesting one is the improved detection for whether a window is on the local system or on a remote one. If you want to get more information about that read my dedicated blog post. But there are also some nice general improvements. The first KConfigXT patch has entered and work has started to make use of it in our Options class. The XCB porting continous with an improved Xcb::Wrapper class – for this I plan to write a dedicated post as it’s an interesting topic from the C++ perspective.

        Summary

        Crash Fixes

        • 313655: unloading of “Mouse Click Animation” result in kwin crash
          This change will be available in version 4.10
          Git Commit

        Critical Bug Fixes

          Bug Fixes

          • 313950: Typo in compact/contents/ui/main.qml
            This change will be available in version 4.10
            Git Commit
          • 308438: When compositing is disabled, opening a window which blocks compositing *enables* it
            This change will be available in version 4.10
            Git Commit
          • 302248: Show Desktop has inconsistent behavior when launching KRunner
            This change will be available in version 4.10
            Git Commit
          • 312956: Desktop Effects – Zoom: Mouse Tracking “Push” laggy, inprecise and/or not working
            This change will be available in version 4.11
            Git Commit
          • 270586: build fails – getdomainname
            This change will be available in version 4.11
            Git Commit
          • 308391: KWin uses getdomainname() to obtain the hostname, should use getaddrinfo()
            This change will be available in version 4.11
            Git Commit
          • 304435: Track mouse effect generates graphics artifacts
            This change will be available in version 4.11
            Git Commit

          New Features

            Tasks

              About network access, fuzzy specifications and non-POSIX calls with window managers

              KWin supports a feature to recognize windows from a remote host. If KWin recognizes such a window it adds the host name (as provided by the property WM_LOCALE_NAME) to the caption. This is a very handy feature in case you work with remote system and use X11 network transparency. But it is also a feature hardly known or needed by most users.

              Unfortunately this feature does not work properly for LibreOffice, because LibreOffice uses the FQDN instead of the hostname and KWin checked for the hostname. Some time ago the problem got fixed by using getdomainname to work around the LibreOffice situation. But this does not work in all cases and introduced issues with non-Linux systems as it is a non-POSIX call[1].

              Of course one might ask why we don’t fix LibreOffice if only their usage of FQDN is causing a problem instead of fixing KWin. In this case it’s quite simple: LibreOffice is doing it right, everyone else is doing it wrong. Let me quote a section from the NETWM specification:

              If _NET_WM_PID is set, the ICCCM-specified property WM_CLIENT_MACHINE MUST also be set. While the ICCCM only requests that WM_CLIENT_MACHINE is set “ to a string that forms the name of the machine running the client as seen from the machine running the server” conformance to this specification requires that WM_CLIENT_MACHINE be set to the fully-qualified domain name of the client’s host.

              Of course the specification does not say anything about the case that the client’s host is the localhost. I would assume that the specification only considers the remote host case, but this is my personal interpretation based on the overall fuzziness of the specification. Given that it doesn’t say anything of the local system case, the interpretation of LibreOffice is absolutely correct by providing the FQDN. Also I can understand that one doesn’t want to maintain two code paths.

              Now the fun part is that it looks like everyone else is not NETWM compliant in that point. In preparation for this blog post I forwarded an GTK+ based and a Qt (4) based application to another system and looked at the properties. _NET_WM_PID is set and WM_CLIENT_MACHINE contains the hostname, not the FQDN.

              The fact that LibreOffice was always considered as a remote system had been unnoticed for quite some time. Personally I’m surprised by that and can only assume that users think that it was supposed to be called like that. I myself try to not use office applications and if I have to I use applications of the Calligra suite. But apparently it got noticed in the Trinity fork and a patch had been prepared there. After my last rant, the developer who wrote the patch, proposed it for inclusion on ReviewBoard. The patch used getaddrinfo to resolve the domain name for the provided client’s hostname. We decided against the patch because it is a blocking call to the network and in KWin we don’t do blocking calls. A blocking call in a window manager is pretty bad as it means that you can no longer interact with your windowing system in any way that would require a window manager. A blocking call in the compositor is deadly as the screen does not get updated any more. The system appears as being frozen. That’s why we have a clear no blocking call policy. This problem has been communicated to the Trinity developers and I suggested them to revert the patch [2].

              I had put quite some thought into the problem and realized that it’s not going to be an easy fix and requires some internal rework to ensure that KWin can properly resolve whether a window is on the local machine even if it does not provide the hostname. Recently I sat down and turned my thoughts into working code. The general idea is to split out the complete hostname resolving into an own class to have this encapsulated. This class can provide whether the hostname it is encapsulating is on the local machine, so that we don’t have to query again and again – a problem I noticed when looking at the code: whenever the information was needed, it was queried again, which makes the interaction quite difficult if we are not allowed to do sync calls.

              The resolving works as it used to be. So first the hostname comparison is used. If this does not work and the hostname looks like a FQDN then we do the resolving with getaddrinfo, but through a helper class and in a background thread. When the information is finally available a signal is emitted that it’s a local system allowing the external world to react on it to e.g. update the window’s caption. Interestingly when I worked on the code and started with the existing patch I noticed that it did not work correctly. Now a remote window with a FQDN was considered local if the name is resolvable. So overall we have now four approaches to get it right (initial code, first fix, Trinity fix and my fix) which shows that this is quite a non-trivial task and I wanted to be sure that it does not break ever again and if it does that we understand it. Therefore I wrote a unit test to cover the cases. I’m rather happy about that test as it is the first unit test I added which actually talks with X. So far I only wrote test code for non-X11 code.

              Unfortunately there are now still cases where the information will be provide too late and a local system will be considered local. For example also the rule system and session management may need this information. Here not much is possible to be done. For window rules it does not really matter as by default we do not match the host at all and even if it’s more likely to try to match remote than local system. Also an adjustment of the rule to match a FQDN would work.

              Overall this has been one of the most interesting bug fixes I had worked on recently which motivated this blog post. As this is a rather large change it is not going to be seen in 4.10, but only available in master (4.11). A decision made due to the long time the issue had not been noticed, which implies that it’s not a real issue for our users.

              [1] This is a classic example for why I think it doesn’t make sense to state that we support non-Linux Unix systems. We break the code without noticing, because nobody is even trying to compile the code on non-Linux systems (also no CI system). The non-Linux system always have to catch up and fix our stuff and then they have a hard time to get the changes back upstream. This issue got reported to us with a patch attached. But I did not accept the patch as I spotted a possible issue and unfortunately there had not been any further trying on improving the patch. I couldn’t do it as I lacked the operating system to test it. Probably it was easier to carry the patch downstream than trying to get it into shape for upstream inclusion.

              [2] That a commit entered Trinity which did not pass code review in KWin due to being dangerous is not surprising. This is no Trinity bashing, but it’s something to be expected when working on a foreign code base in an area that you don’t know. How should a Trinity developer know that you are not allowed to do blocking calls in a window manager? And even if you consider that as being obvious: how should a developer get the feeling to see a piece of code and ask himself instantly “is this blocking?”. This requires experience for working with a window manager and is something I explained to the Trinity developers in my very first mail to them where I suggested to drop their fork of KWin, because of exactly such reasons:

              Working on a window manager and compositor comes with great responsibility. It is one of the most complex parts of the desktop environment and introduced bugs affect all users and can be really harmful and very difficult to debug. Developing a window manager is not trivial and you have to understand how the window manager works

              And this is not a problem just for Trinity, it’s a problem for all such forks, be it Mate’s Metacity fork or Cinnamon’s Mutter fork “Muffin” or now Consort’s fork of whatever window manager they use at base (probably a GTK3 version of Metacity). I can only suggest to work together with the upstreams and unfork what can be unforked.

              4.10 Feature Presentation: Application Menu in Window Decoration

              My favorite new feature in the KDE Plasma workspaces 4.10 (to be released on February, 6th) is the ability to hide an application menu inside the window decoration. I’m someone who is hardly using the menu, I get too lost in it. I’m preferring to use tool bars with the nice icons which help me to recognize what I’m looking for (a scissor is easier to recognize than “Cut” or “Ausschneiden” – depending on the language the app thinks to have to use).

              By removing the window menu the complete window frame gets less cluttered. A huge area containing lots of text is just gone. Just look at this screen shot of Calligra Words, which I use to write this blog post:

              Aus KWin

              While KDE Plasma workspaces does not only allow to have the menu in the window decoration it also provides the possibility to put it into a global menu bar. Personally I do not like this (have used Mac OS, know what I’m talking about) as it disconnects the menu from the window. With the window decoration this problem does not exist and also removes the problem of long mouse distance travels in case of non-maximized windows.

              In 4.10 we provide this new feature only in the Oxygen (default) window decoration. But our 4.11 development branch has seen some activity in this area: all Aurorae themes support it and also the Laptop window decoration.

              This pretty awesome new feature can be enabled in System Settings. Just go to “Application Appearance”, then to “Style”, select the second tab called “Fine Tuning”. If your system provides the required support libraries (if not complain to your distribution) you will find there a section “Menubar” with a dropdown to select the way you want to have it. If you select “Title bar button” the button will be added automatically. As screen shots say more than words:

              Aus KWin

              And last but not least a screenshot with the menu:

              Aus KWin

              This week in KWin (2013, week 3)

              This week has seen lots of last minute polish for the 4.10 release. Most issues should be fixed in RC3, but there is still some work going on to deal with regressions outside of KWin which nobody had noticed for too long. From the features/improvement department: KWin’s support information includes information about screens to finally end the discussions about whether a user is using multi-head, xrandr or who knows what. Our window decoration library gained support for two new window border sizes: no borders on the side and no borders at all. This is highly inspired by the already existing feature in Oxygen and got now also available in Aurorae for QML based themes.

              Last but not least I wanted to remind that the summary below is generated from the bug reports set to resolved fixed in Bugzilla over the last week. This does not mean that the bug got fixed, but that the report had been set to fixed. Sometimes this is done through a commit, sometimes it’s just some house cleaning. The last weeks have seen lots of cleaning. Also the title of a bug report is mostly set by the user and especially in the case of a crash report has very often not much to do with what has really been the case. So if you want to get excited about a bug in that list, you should at least click the link and read through it and see what it is about and when it got fixed. Even if the bug report is more than a decade old it does not say anything without the context.

              Summary

              Crash Fixes

              • 302094: KWin crashes with flash
              • 264897: crash during alt+tab
              • 312712: Assert on “kwin –restart &”
                This change will be available in version 4.10 RC 3
                Git Commit

              Critical Bug Fixes

                Bug Fixes

                • 301909: “No titlebar and frame = Force Yes” does not work
                • 311896: callDBus always fails for methods with signature containing array of strings
                  This change will be available in version 4.10
                  Git Commit
                • 293734: kwin does not honor disableMultihead=true and causes window focus problems
                  This change will be available in version 4.11
                  Git Commit
                • 308557: blur left on screen
                • 312851: Windows wider/taller than screen are cut off/partially mirrored when doing a screenshot of them using ksnapshot
                  This change will be available in version 4.11
                  Git Commit
                • 273104: Dual screen with different dimension screens and one rotated, desktop size is getting strange values

                New Features

                  Tasks

                    This week in KWin (2013, week 2)

                    A rather busy week is behind us. Not only have there been many bugs fixed, but also some feature/refactoring branches were merged which are not listed in the summary. For example a refactoring of virtual desktops landed in master. The management of virtual desktops has been split out of the Workspace class into an own module. As new features we have more support for the application menu button in window decorations. The Laptop decoration got such a button and generic support has been added to the Aurorae theme engine. Now almost all decorations shipped with KWin can provide this button.

                    Summary

                    Crash Fixes

                    Critical Bug Fixes

                      Bug Fixes

                      • 275235: Logout effect interferes with Plasma extenders effect
                      • 299398: shadow on volume OSD is wrong the first time
                      • 312835: The keyboard shorcut Ctrl+Alt+Right for “Switch to next desktop” doesn’t work anymore since commit a2a335064e206f0689e315d58c30bedce90decff
                        Git Commit
                      • 311319: Color correction breaks EffectFrames
                        Git Commit
                      • 306169: Thumbnail Aside effect not updating/disappearing
                        This change will be available in version 4.10 RC 3
                        Git Commit
                      • 295055: AbilityUsesAlphaChannel broken when used with shadow pixmap hints
                      • 312168: Plasma popup shadows not drawn during slide animation
                        This change will be available in version 4.10 RC 3
                        Git Commit
                      • 306404: The “walk through desktop” shortcuts don’t show up in the desktop kcm
                        This change will be available in version 4.11
                        Git Commit
                      • 281186: maximizing a full-size window causes title bar corruption
                      • 309853: KWin kcm wrongly informs that all effects failed to load
                        This change will be available in version 4.10
                        Git Commit
                      • 312784: Can’t drag and drop files or folders on edgeless windows
                        This change will be available in version 4.10 RC 3
                        Git Commit
                      • 313091: Kded-appmenu – Enabling makes “Window menu” and “on all desktops” buttons vanish
                        Git Commit

                      New Features

                      • 312900: wish: kwin option to save vertical space: merge window buttons into menu-bar
                        This change will be available in version 4.10

                      Tasks

                        [Help KWin] Create a KConfigXT file for KWin’s configuration

                        Just the other day a user in IRC complained about a default in KWin. I thought that the default he expected, is the one which is set in KWin sources. So I opened the respective source file and saw my assumption confirmed. But still the user claimed that there is a different default and I believed him. Further investigation showed that the source code of the configuration module had a different default set. It’s probably like that for years but it shows a problem: the config values are written and read at different places and the hard coded default values might diverge.

                        This reminded me of the great project we had last year to migrate the configuration of the KWin effects to KConfigXT and of the project to transform the configuration modules in KWin to have ui files. The combination of both calls for a new project: let’s migrate KWin core to KConfigXT. Now this isn’t a project which we will be able to do in one go. So I will split it into three parts:

                        1. Create the kcfg file
                        2. Migrate KWin core
                        3. Migrate the configuration modules

                        Let’s start with the first one: create the kcfg file. Interestingly we already have such a file in the kwin directory – last change: Nov 20th 2007. It might be that some of the options are still encoded correctly, but I rather doubt it and default values are missing anyway. So I suggest to start clean. I would like to split the task into creating the XML part for the different config groups in KWin, so that the tasks are small. Later on we can then put it together to have one complete file. The project is outlined in this wiki page. Just add yourself to a section if you want to work on it :-)

                        This week in KWin (2012, week 52 and 2013, week 1)

                        With Christmas break over there is again quite some work happening for KWin. Of course given that 4.10 is close by a few bugs got fixed, but with master open for 4.11 we also have the first feature commits. Most of it is in the area of porting KWin to XCB. Those changes are not listed, but they are quite nice as each of them brings a small improvement due to the asynchronous nature of XCB. For the actual features I try to create bug reports again, so that they can be listed in the summary.

                        Summary

                        Crash Fixes

                        • 308040: KWin crashes after restarting it
                          This change will be available in version 4.10
                          Git Commit
                        • 310142: KWin crash due to wobbly windows effect when closing window
                          This change will be available in version 4.10
                          Git Commit

                        Critical Bug Fixes

                          Bug Fixes

                          • 312346: PySolFC (And possibly other Tkinter and Tk programs), after moving window, put the menu at the original menu position.
                            This change will be available in version 4.10
                            Git Commit
                          • 293385: glsl should be disabled for the 945G because it’s slow and broken
                            This change will be available in version 4.10
                            Git Commit
                          • 308919: Window Switcher fails to repaint background if Fade Effect enabled
                          • 311553: No minimum size on the general or effects tab of kwincompositing kcm
                            This change will be available in version 4.10
                            Git Commit

                          New Features

                          • 308992: Use Resize Area in Aurorae
                            This change will be available in version 4.11
                            Git Commit

                          Tasks

                            [Help KWin] Save the Explosion Effect

                            One of our KWin Effects hasn’t seen much love over the last years and is in fact more broken than working. It’s a pure eye-candy effect which means that it is not at all in the development focus of the KWin team. The truth is, that we are tempted to just delete the effect because we won’t fix it. But of course there are users who like it and would be sad if it gets deleted.

                            Here you can help: if the issues gets fixed and the effect becomes maintained there is no need to remove it. So if you want to get your hands dirty with a small OpenGL based effect have a look at the Explosion effect and improve it. Have a look at Bug 312176 and all the linked reports to find the issues.