The relevance of game rendering performance benchmarks to compositors

In my essay about the methodology of game rendering performance benchmarks, and why you should not draw any conclusions from incomplete data, I did not cover one very important aspect of game rendering performance: the relevance of such benchmarks for OpenGL-based compositors.

Like my post from Saturday, this post represents my personal opinion and applies to all game rendering benchmarks, though the publication of one specific benchmark triggered the writing.

As you might guess from my way of writing and the title, the conclusion I will provide at the end of this essay is that such benchmarks are completely irrelevant. So I expect you to run straight to my comment section and tell me that games are extremely relevant, that the lack of games for Linux is the true reason for the death of Linux on the desktop, and Steam and Valve and Steam! Before doing so: I do know all these reasons and I considered them in this evaluation.

The first thing to note is that OpenGL-based compositing on X11 introduces a small but noticeable overhead. This is to be expected and is inherent in the technology. There is nothing surprising in that; everybody working in this area knows it. When implementing a compositor you have to consider the consequences, and KWin’s answer to that particular problem is: “Don’t use OpenGL compositing when running games!” For this we have multiple solutions, such as:

  • Alt+Shift+F12
  • KWin Scripts to block compositing
  • KWin Rules to block compositing
  • Support for a Window Property to block compositing

In the best case the game uses the provided property to just say “I need all resources, turn off your stupid compositing”. It looks like we will soon have a standardized way for that, thanks to the work done by GNOME developers.

Of course we could also optimize our rendering stack to be better for games. In the brainstorm section the question was already raised why unredirection of fullscreen windows is not enabled by default, given the results published in the known benchmark. (This is one of the reasons why I don’t like these benchmarks: so far, every time one was published we got bug reports or feature requests based on an incorrect interpretation of the provided data.)

Optimizing the defaults means adjusting the settings to the needs of a specific group of users. Unredirection of fullscreen windows has disadvantages (e.g. flickering, or crashes with certain distribution/driver combinations when unlocking the screen), and one has to weigh that carefully. Gaming is just one of many possible tasks you can do with a computer. I hardly play games; I use my computer for browsing the Internet, writing lengthy blog posts, hanging out in social networks, sometimes watching TV or a video on YouTube, and hacking code. None of these activities would benefit from optimizing for games, and the introduced flickering would clearly harm my TV watching. So if you think games are important, please step back and think about how many activities you do and how much of that is gaming.

The next point I want to consider in the discussion is the hardware, especially the screen. As such benchmarks are published for general consumption, we can assume standard hardware, and a standard screen refreshes at 60 Hz. This is an extremely important value to remember during the discussion.

Now I know that many people think it’s important to render as many frames as possible, but that is not the case. If you do not reach 60 frames per second, it’s true: the more the better. If you reach 61 frames per second, you render one frame which will never end up on the screen. The point is not that these are more frames than your eye could see; the point is that the screen is physically not able to display more than 60 frames per second. Rendering more than 60 frames per second is a waste of your resources: of your CPU, of your GPU, of energy, of your money.
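
To make the arithmetic concrete, here is a minimal sketch. It assumes a deliberately simplified model in which the screen shows at most one frame per refresh and every frame beyond that is never displayed; the function name `frame_budget` is made up for this illustration.

```python
def frame_budget(rendered_fps: float, refresh_hz: float = 60.0) -> dict:
    """Split the frames rendered in one second into shown and wasted frames.

    Simplified model (an assumption): the screen shows at most one frame
    per refresh, and every frame beyond that is never displayed.
    """
    shown = min(rendered_fps, refresh_hz)
    wasted = max(rendered_fps - refresh_hz, 0.0)
    return {
        "shown": shown,
        "wasted": wasted,
        # Share of the rendering work that never reaches the screen.
        "wasted_share": wasted / rendered_fps if rendered_fps else 0.0,
    }


for fps in (40, 60, 61, 120):
    print(fps, frame_budget(fps))
```

Under this model, a game running at 120 frames per second on a 60 Hz screen spends half of its rendering work on frames nobody will ever see.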

Given that, we can divide the benchmark results into two categories: those which are below 60 fps and those which are above 60 fps.

A compositor should not have any problems rendering at 60 frames per second. That’s what it’s written for, and there is nothing easier than rendering a full-screen game. The compositor goes through all opaque windows from top to bottom and stops after rendering the first one (the game), because nothing below it would end up on the screen anyway. Perfect. It’s really the simplest task there is (assuming the game uses a sane RGB visual). If a compositor is not able to render that at 60 frames per second, something is fundamentally broken.
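
The “stop at the first opaque full-screen window” idea can be sketched roughly as follows. This is only an illustration of the principle, not KWin’s actual paint code; the `Window` class and the `windows_to_paint` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Window:
    name: str
    opaque: bool       # True for a plain RGB visual without alpha channel
    fullscreen: bool   # True if the window covers the whole screen


def windows_to_paint(stack_top_to_bottom: list[Window]) -> list[str]:
    """Walk the stacking order from the top and stop at the first window
    that is both opaque and fullscreen: nothing below it can be visible."""
    painted = []
    for window in stack_top_to_bottom:
        painted.append(window.name)
        if window.opaque and window.fullscreen:
            break
    return painted


stack = [
    Window("game", opaque=True, fullscreen=True),
    Window("browser", opaque=True, fullscreen=False),
    Window("panel", opaque=False, fullscreen=False),
    Window("desktop", opaque=True, fullscreen=True),
]
print(windows_to_paint(stack))  # ['game']
```

With a full-screen game on top, only a single window has to be painted. If the game used a visual with an alpha channel, the walk could not stop there, which is why the sane RGB visual matters.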

Let’s start by looking at the first category. I already pointed out on Saturday the one benchmark run where the result was 10 frames per second. This is a result we can discard. It does not represent the real world. Nobody is going to play a game at 10 frames per second. That just hurts your eyes; you won’t do it. The overhead introduced by the compositor does not matter in the category “too slow”. We can consider anything below 60 frames per second as “the hardware is not capable of running that game”; the user should change the settings or upgrade. This will move the game to more than 60 frames per second.

Before I discuss the second category I want to point out another hardware limitation: the GPU. This is a resource shared between the compositor and the game. Both want to use the rendering capabilities provided by the GPU, and both want to upload their textures into the GPU’s memory. Another shared resource is the CPU, but given modern multi-core architectures that luckily hardly matters.

Looking at the data provided, we see at least one example where the game renders more than 120 frames per second. That is twice the number of frames the screen is capable of displaying. The reason for this is probably that the game is run in a kind of benchmark mode to render as many frames as possible. Now that is a nice lab situation, but not a real-world situation. In the real world the game would hopefully cap at 60 frames per second; if not, I would at least consider it a bug.

But what’s the result of trying to provide as many frames as possible? Well, it produces overhead. In that case the compositor gets approximately twice as many damage events from the game as there need to be. Instead of signaling once per displayed frame, the game signals twice. That means the event needs to be processed twice, which doubles the complete computational overhead of scheduling a new frame. It does not only keep the compositor busy, but also the X server. So part of a shared resource (the CPU) is used in a completely useless way. This of course includes additional context switches and additional CPU usage which would otherwise be available to the game.
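
A small simulation shows where the doubling comes from. It is a sketch under stated assumptions, not a description of KWin’s real scheduler: the game is assumed to emit exactly one damage event per rendered frame at evenly spaced times, and the compositor is assumed to repaint at most once per refresh interval. The function name `damage_per_refresh` is hypothetical.

```python
def damage_per_refresh(game_fps: int, refresh_hz: int = 60) -> dict:
    """Simulate one second of damage events and bucket them by refresh interval.

    Assumptions: one damage event per game frame, frame i arrives at time
    i / game_fps, and the compositor repaints at most once per interval.
    """
    events_in_interval = {}
    for i in range(game_fps):
        # Index of the refresh interval this event falls into
        # (integer arithmetic avoids floating point rounding).
        slot = i * refresh_hz // game_fps
        events_in_interval[slot] = events_in_interval.get(slot, 0) + 1
    return {
        "damage_events": game_fps,
        "repaints": len(events_in_interval),
        "max_events_per_repaint": max(events_in_interval.values(), default=0),
    }


print(damage_per_refresh(60))   # one event per repaint
print(damage_per_refresh(120))  # two events per repaint
```

At 120 frames per second the compositor still repaints 60 times, but it has to handle 120 events to do so; the second event in each interval only costs processing time.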

Of course, given that the game runs at “give me everything which is possible”, the resources available to the compositor are lower. This can result in dropped frames in the compositor and can also negatively affect the running game.

But it is not a real-world situation. The problem that the compositor does not get enough resources, or takes resources away from the game, is introduced by the game running at too high a frame rate. It is, so to say, an academic example. Such benchmarks matter to game developers like Valve who really need to know how fast the game can go, but I’m quite confident that they also know about the side effects of having a compositor (and other applications) running, and that they run the game on a blank X server, which I suggested as the control in my post from Saturday.

Given that, I can only conclude that such benchmarks show data which is not relevant to the real world. It’s a lab setup and, as so often, a lab setup doesn’t match reality.

And this shows another problem of the benchmark. It shows nice numbers, but it does not answer the only valid question: is the game playable? In the end, does it matter to the user whether a game renders two frames more or less depending on which compositor is used, as long as it doesn’t affect the gameplay? I would say it doesn’t matter.

It would be possible to set up a benchmark which could highlight regressions in a compositor. But the result would in most cases be several bars all at 60 fps, and if not, that is a reason to report a bug and not to write a news posting about it. As an example of a benchmark done right, I want to point to the benchmark provided by Owen Taylor of GNOME Shell fame. Nowadays I know that the data provided in this benchmark nicely shows the performance problem which we fixed in 4.8. Back when the benchmark was published I thought it was an issue with the benchmark; I tried it with all available rendering backends of KWin and always got the same problem (and also studied the code). So yes, properly done benchmarks which consider real-world situations can be helpful, but even then it takes someone with expertise in the field to interpret the provided data. That’s also an important aspect missing in most benchmarks. A “here we see that KWin is two frames faster than Compiz” is not an interpretation.

25 Replies to “The relevance of game rendering performance benchmarks to compositors”

  1. I don’t play many games, but I am a game developer doing work with OpenGL. Having compositing enabled means a large drop in FPS in incomplete games, and needs to be turned off for more complete games or I’ll miss effects and have jerky motion.
    Thankfully, the kwin rule to disable compositing automatically does exist, so it’s not too great of a concern.

    1. If you’re a game developer, do yourself and your users a favor and implement the window property to block compositing. That’s why we added it: to be used by games 🙂

      1. To be honest I didn’t know there was an actual standard being worked on until I read this post. It’ll be my pleasure to implement it, then! What’s the name of this property?

  2. Another important point is that fps is not everything. I occasionally play games, so I do care about game performance and was quite annoyed noticing some tiny micro-stuttering in one game (Trackmania, played with wine) on my new laptop. The frame rate was at stable 40fps, all fine, but sometimes the game would still freeze for a fraction of a second, barely noticeable enough to be annoying. Turning on unredirection for full-screen windows (I was using KDE 4.8.5) fixed this, but obviously such issues – whatever it is – are not caught by simple fps benchmarks.

    Thanks for making KWin a great, well-performing window manager and caring about optimizations without hitting the fps trap!

  3. Unfortunately, the stupid (stupid!) FPS metric has become _the_ one benchmark used to determine system performance in the consumer market. But that’s hardly new – it was stupid in the 90s and it’s stupid now.
    But people have been trained to accept it, even pay hundreds of $currency to get the additional 30 FPS, because, you know, numbers don’t lie and I really see the difference….
    I guess what I’m trying to say is something like this: You won’t change these people, they will continue making benchmarks and publishing them.

    IMO providing a window property for this kind of stuff is the perfect 100% solution to the problem, so you could have just stopped after the first paragraph.
    Anyway, isn’t that what Windows is doing? Turning off desktop effects when games are running? 😉

    1. Anyway, isn’t that what Windows is doing? Turning off desktop effects when games are running?

      Yes, Windows offers an API call which games can use. And one of our ideas when we added the property was that Wine could implement it, I even mentioned it in a bug report against Wine

  4. That benchmark is obviously flawed, but compositing can get in the way of games. I remember playing something and feeling “why is this a bit jerky?”, then remembering about compositing. The problem is neither at 120fps nor at 10fps; the real problem comes at 40-60 fps, where small drops are annoying.

    1. I guess most of these problems are caused by games tying their simulation / timing to the rendering (i.e. they update the simulation once per frame and derive timing from the time between these frames). I don’t know exactly how a compositor influences the timing of the frames when in the “usable” range (say 40-60), but I wouldn’t be surprised if it can badly influence the game’s timekeeping… This can easily cause stuttering in the simulation: the game has no problem producing 40 frames per second, but if the timing in the simulation is off, the simulation itself is jerky / stuttering, not the rendering part 😉

  5. Okay we get it : If you want to play a game smoothly on Linux, disable the compositor. Right.

    However on Windows and OS X, it seems the compositor/display server are performing much, much better than Linux when it comes to Compositing in general.
    For example I have dual screen and I play games always in windowed mode, and on Windows, enabling Aero doesn’t drop any FPS in games or other apps at all whereas on Linux, even a simple web browser with enabled smooth scrolling starts to slow down a lot when Compositing is enabled…
    So where is the faulty bit in Linux that makes Composite+OpenGL so poorly performing ? X11 ? The compositor ? Drivers ?…
    Will Wayland (which permanently enables Compositing am I right?) solve things on this matter ?

    1. So where is the faulty bit in Linux that makes Composite+OpenGL so poorly performing ? X11 ? The compositor ? Drivers ?…
      Will Wayland (which permanently enables Compositing am I right?) solve things on this matter ?

      I think I mentioned in my blog post that the technology introduces a known overhead. Whether Wayland will solve that issue we will see when we have full Wayland compositors and Wayland games. Given the architecture I assume that it will remove the overhead, or put differently: there is no alternative to a fully composited system.

  6. I agree that the benchmarks in question are completely useless. I’d much rather see benchmarks that try and identify stutter, as this is actually a problem. I’d be interested to see if compositing causes/worsens this.

    The idea that performance in the compositor while running OpenGL intensive applications doesn’t matter is dubious though. You may not care about gaming; you could be running Google Earth; or running Blender; how about Firefox with hardware acceleration? These are all cases when you’d be running an OpenGL intensive program in a window. You’re unlikely to want to disable compositing as it’d severely decrease the user experience. Surely performance matters here?

    There also seems to be the assumption that optimising for gaming won’t benefit general performance too. Is this the case? I would have thought increasing the performance while sharing the GPU may have benefits with less performant hardware.

    Of course, all of this only actually matters if the compositor CAN be made more efficient. As you said, there is an unavoidable overhead.

    1. You may not care about gaming; you could be running Google Earth; or running Blender; how about Firefox with hardware acceleration? These are all cases when you’d be running an OpenGL intensive program in a window. You’re unlikely to want to disable compositing as it’d severely decrease the user experience.

      It depends. If I do a CUDA based calculation I would turn off compositing, also when watching Full-HD videos I personally turn off compositing.

      For everything else I must say that there are OpenGL based applications and OpenGL based applications. Just using OpenGL (e.g. browser) does not mean that the performance matters, but using a WebGL based game in the browser, yeah there it might matter and you better turn off compositing.

      There also seems to be the assumption that optimising for gaming won’t benefit general performance too. Is this the case? I would have thought increasing the performance while sharing the GPU may have benefits with less performant hardware.

      Let’s turn it around: optimizing the compositor will benefit the games. That way it makes sense and that’s what we do (compare my blog post from I think Friday).

  7. Don’t get me wrong, but isn’t window manager irrelevant for most people? If system software(kernel, WM, etc) does not get in users way(they are able to run their favorite applications without problems produced by system stuff), I’m sure most won’t care at all. That’s very often problem for many open source projects, the developers don’t care much about their users global experience. FOSS being FOSS, the only way for community driven project to achieve this, is to have easy introduction path for new developers, if my memory serves well – the author of this blog post has something about it and thus big thumbs up for him as one of the best FOSS leads.

    Anyhow, if one wants to see how graphics handling is done properly – take a look at DirectX history, and answer some very simple questions, “What was DirectX in its beginning?”, “What is DirectX today?”, and most of all “Why one operating system graphics stack(including buttons, text boxes and such) is build entirely on top of DirectX?”

    P.S. Qt5 could solve and force for resolution a lot of graphics related performance problems in GNU/Linux world.

  8. Depending on the architecture of the game, I don’t believe your statements about FPS > 60 being a waste, are necessarily true.

    If the main loop of a game is structured something like…

    while (true)
    {
        handleInput();
        simulatePhysics();
        render();
    }

    …thus having the rendering block the next iteration of the main loop, every frame counts. Input handling and physics simulation can both benefit from the higher FPS, leading to a more responsive experience (twitch gaming?).

      1. Not implying it is. But there are quite a few games out there based on some variant of the Quake1-3 engines, which does exactly this.

        Anyway, just trying to add a bit of detail – keep up the fantastic work on KWin. Keeps dragging me back to KDE whenever I test the waters elsewhere.

        1. well those games are old, they should not have a problem with a compositor running (compositor runs on different core, so all is fine).

  9. There is only one point where I don’t totally agree with you (this means that I partially agree, not that I disagree, and I totally agree with everything else): “We can consider anything underneath 60 frames per seconds as “the hardware is not capable of running that game”, the user should change the settings or upgrade.”. Sorry, but not really. 10 FPS is what you say, but between 30 and 60 your hardware is more than able to render the game and give the user a good experience. I’m not a hardcore gamer, but I like to game from time to time. I have an Intel HD 4000 card, given AMD should just burn in hell and Nvidia…. works, but I don’t see the point of the additional expense and the problems with their binary thing. So for my main task (running Linux) Intel is just perfect, but still I can play Skyrim (sadly on Windows) with medium settings at 40 FPS…… quite good I have to say, I have no need at all to render at 60 FPS.

    That said, your point is still valid. If kwin renders 38 instead of 40 this doesn’t matter much, but it is still more than nothing. When you are near the limit of the hardware some FPS is always welcome.

    Thank you for all your hard work optimizing kwin.
    Cheers

    1. I have to agree – while most everything in this blog post is spot on, there are lots of games (WoW, Total War Series, etc) that are perfectly playable under 60fps. I would say the range of acceptable FPS is between 20-60. Obviously, for a hardcore first person shooter, one really needs 60fps, but for other types of games such as strategy, RPGs, casual, etc., less than 60fps can definitely be considered ‘playable’ and enjoyable at that.

      1. yes I had decided to not go into the details of the different style of games. So this post is basically games == 1st person shooter given that the benchmarks are also only 1st person shooters (at least those for which I know the game).

  10. I have read the whole post and commentaries with some worry.
    You clearly don’t play games very often, at least not the games where people worry about FPS (first-person shooters).
    FPS is not the perfect benchmark for games, but it is not useless.
    My point is that for the user to have the best experience with the game, all frames should be rendered in less than 16 ms (60 FPS).
    All of them!!!
    If a frame takes longer to render, the player sees the difference (I don’t know how, but they do). The game becomes laggy. You cannot frag anyone anymore. You are fragged all the time.
    Well, the FPS used by benchmarks is an average (of course). Some frames are rendered in 1000 ms, but the average can still be 80 or 100 FPS.
    Only saying that the FPS is 65 doesn’t mean that the user has the best possible experience.
    Saying that 60 (average) FPS and 200 (average) FPS are the same to the user is totally wrong, because at 60 (average) FPS some frames will be rendered in 1000 ms. But at 200 (average) FPS the chances that a frame will be rendered in 1000 ms are much lower.
    Conclusion: the higher the (average) FPS, the better the user experience.
    The window manager should not interfere with the game. That doesn’t exclude the other benefits of the window manager, but for the hardcore gamer it matters.
    And if we want games to come to Linux, we should attract the hardcore gamers by making Linux the best place to play games.
    Valve has made the first step toward that.
    http://blogs.valvesoftware.com/linux/faster-zombies/
