Announcing the Duplicate Finder

DrKonqi is really good at finding possible duplicates when a new crash report is created. This makes the work of triagers much easier because they get the information directly in the report. But my work with Bugzilla shows that it’s not good enough. In many cases one gets a list of about five bug reports which are all marked as duplicates of another bug. So which one is it? One has to click through all possible duplicates and trace them down. Sometimes one finds untriaged bugs, sometimes they all link to the same bug.

What one would want is to be presented with the most likely duplicate right from the start. It would also be awesome to see all related crash reports which are not yet triaged. I spent some time thinking about it and concluded that there must be an automated solution to this problem. We have the information on which reports are the possible duplicates, we know that the possible duplicates have further possible duplicates, and we have the reports already marked as duplicates directly. So overall there is a network which leads from the report we start with to the report which should be used as the duplicate. If we add up all the links we get a most likely duplicate, just need to compare the backtrace, and are done.
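The idea of adding up the links can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the tool itself: the `REPORTS` dictionary, its field names (`dupe_of`, `possible`) and the simple vote counting are all made up for the example, and the real tool may weigh the links differently.

```python
from collections import Counter, deque

# Hypothetical sample data: bug id -> the bug it is marked a duplicate of
# (or None) and the list of possible duplicates attached to the report.
REPORTS = {
    100: {"dupe_of": None, "possible": [101, 102, 103]},
    101: {"dupe_of": 200, "possible": [102]},
    102: {"dupe_of": 200, "possible": []},
    103: {"dupe_of": 101, "possible": [104]},
    104: {"dupe_of": None, "possible": []},
    200: {"dupe_of": None, "possible": []},
}


def resolve(bug_id, reports):
    """Follow 'duplicate of' links until a report that is not a duplicate."""
    seen = set()
    while reports[bug_id]["dupe_of"] is not None and bug_id not in seen:
        seen.add(bug_id)  # guards against accidental duplicate cycles
        bug_id = reports[bug_id]["dupe_of"]
    return bug_id


def most_likely_duplicate(start, reports):
    """Walk the network breadth-first and count where the links converge.

    Returns (best candidate, other reachable reports not marked as duplicates).
    """
    votes = Counter()
    visited = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        entry = reports[current]
        neighbours = list(entry["possible"])
        if entry["dupe_of"] is not None:
            neighbours.append(entry["dupe_of"])
        for neighbour in neighbours:
            if neighbour not in reports:
                continue  # report was not loaded, nothing to follow
            root = resolve(neighbour, reports)
            if root != start:
                votes[root] += 1  # every link counts as one vote for its root
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    if not votes:
        return None, []
    best = votes.most_common(1)[0][0]
    untriaged = sorted(
        bug for bug in visited
        if bug not in (start, best) and reports[bug]["dupe_of"] is None
    )
    return best, untriaged


best, untriaged = most_likely_duplicate(100, REPORTS)
print(best, untriaged)
```

In the sample data almost every link ends at report 200, so it becomes the candidate, while report 104 shows up as a related report that nobody has triaged yet. The backtrace still has to be compared by hand before marking anything.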

So I started to work on this using Bugzilla’s WebService API to query a bug, find its duplicates and connect the various reports in a graph. The result is:

duplicate-finder

It’s a small application which takes a bug report, loads it and builds up the graph of all related crash reports. It finds the most likely duplicate and offers to mark the selected bug as a duplicate. The visualization helps to see whether there are further reports in the network and one can use the tool to also mark those as duplicates.
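To give an idea of how such a network can be handed to Graphviz, here is a minimal sketch that turns a list of links into DOT text. The function name `to_dot`, the edge tuples and the styling (solid for confirmed duplicates, dashed for possible ones) are assumptions for the example and not taken from the tool.

```python
def to_dot(edges, highlight=None):
    """Build Graphviz DOT text from (source, target, kind) tuples.

    kind is "duplicate" for a confirmed link; anything else is drawn dashed.
    """
    lines = ["digraph duplicates {"]
    if highlight is not None:
        # Make the most likely duplicate stand out in the rendered graph.
        lines.append(f'    "{highlight}" [style=filled, fillcolor=lightblue];')
    for source, target, kind in edges:
        style = "solid" if kind == "duplicate" else "dashed"
        lines.append(f'    "{source}" -> "{target}" [style={style}];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical links between reports.
EDGES = [
    (100, 101, "possible"),
    (100, 102, "possible"),
    (101, 200, "duplicate"),
    (102, 200, "duplicate"),
]

dot_text = to_dot(EDGES, highlight=200)
print(dot_text)
```

The resulting text can be saved to a file and rendered with, for example, `dot -Tpng graph.dot -o graph.png`.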

I hope that this is a useful tool for our bug squad team. It can be downloaded from my scratch repository. To build it you need QJson, Qt4, kdelibs4 and kdepimlibs4. At runtime it needs dot (Graphviz), and you should have a mail transport configured in KMail, which is used to mark the duplicate reports.

To start the tool use:

./duplicatefinder --from=your-bugzilla-address@email.de bugid

6 Replies to “Announcing the Duplicate Finder”

  1. Hi Martin!

    I know it is not related to the subject, but I could not find your mail address, so please excuse me for asking you here!

    Some time ago you wrote about the QML improvements for the grid and present windows effects. I have been using KDE 4.11 on openSUSE for quite a while now and there is one thing which really annoys me:

    When I set the grid layout mode to “Automatic”, the pager widget in my panel only adjusts the number of desktops, but not the number of rows. Therefore the design is inconsistent. Is this a hard thing to change? (I mean when the grid effect is allowed to change the number of desktops, why not also the number of rows for the pager?)
    I suppose this has nothing to do with the grid effect but with the pager widget. Should I report a bug?

    Thank you very much!

  2. I know, I was referring to the new button elements, as mentioned in this blog post:
    http://blog.martin-graesslin.com/blog/2013/03/an-update-on-kwin-on-5/
    I know I should not have mentioned it, as it was not related to the subject.
    Anyway, what do you say to
    “When I set the grid layout mode to “Automatic”, the pager widget in my panel only adjusts the number of desktops, but not the number of rows. Therefore the design is inconsistent. Is this a hard thing to change? (I mean when the grid effect is allowed to change the number of desktops, why not also the number of rows for the pager?)
    I suppose this has nothing to do with the grid effect but with the pager widget. Should I report a bug?”

    Thank you very much for your response

    1. I’m sorry, I’m not going to answer this as it’s completely off-topic here.
