Full credits go to Alex Fiestas for complaining about the performance in a way that I wanted to do something about it, instead of being just annoyed and ignoring the comment!
Alex’s complaints got me wondering why we are not able to render 60 frames/second. Each frame should only take 16.67 msec and knowing that our repaint loop is fine, I could not imagine how a frame could take longer to render than the 16.67 msec. Knowing the KWin’s source code pretty well I suspected two parts of the repaint loop to be slow: the method which actually renders each window and our effect chain. The effect chain calls each effect in turn to transform windows. I had an idea to improve the effect chain for quite some time by calling only the currently active effects. That is currently each effect checks first whether it is active and just does nothing. So you basically call all effects again and again and nothing is actually performed except waisting cycles on checking whether it should do nothing (some effects are heavy there).
Of course I do not just optimize without checking if the code needs to be optimized, so I did an analysis of callgrind output and I was surprised. The effect chain was way more heavy than expected. In fact it’s so heavy that the paint method doesn’t matter in comparison. A deeper analysis of the code showed that there is a small bug, which we will fix in 4.7.2 (too late for 4.7.1), so that we can give the benefits as fast as possible to our users. But the real optimization by only calling the active effects will hit only 4.8. After that change the effect chain is no longer visible in the hot pathes of KWin. Also the change immediatelly helped to identify some expensive checks in some effects which are now ensured that they are not called in each frame (unless the effect is active).
After the optimization of the effect chain we are still not yet there and do not reach 60 frames during the animation, but you can really feel the improvements 🙂 Now the paint method shows up as one of the most expensive in callgrind (as expected previously) and I will now spend some time and thinking in how to optimize this one further by moving heavy operations out of the repaint loop.
In the long term I hope to be able to move the compositor into an own thread (also something I have in my mind for quite some time), so that heavy operations in window management and the decoration plugin do not slow down the repainting of the compositing. This might be tricky as we can run into dangerous deadlocks (imagine compositor and window manager both blocking the X Server), so we will probably first concentrate on moving e.g. texture loading into threads – something one of our new deelopers in KWin experimented with.
Overall very nice improvements, I’m happy with, but not yet satisfied as we do not yet reach the constant 60 frames/second.
Powered by Blogilo