DrKonqi is really awesome at finding possible duplicates when a new crash report is created. This makes the work of triagers much easier because they get the information directly in the report. But my work with Bugzilla shows that it’s not good enough. In many cases one gets a list of about five bug reports which are all marked as duplicates of another bug. So which one is it? One has to click through all the possible duplicates and trace them down. Sometimes one finds untriaged bugs, sometimes they all link to the same bug.
What one wants is to be shown the most likely duplicate right away. It would also be awesome to see all related crash reports which are not yet triaged. I spent some time thinking about it and concluded that there must be an automated solution to this problem. We have the information on which reports are possible duplicates, we know that the possible duplicates have further possible duplicates, and we have the reports already marked as duplicates. So overall there is a network which leads from the report we start with to the report which should be used as the duplicate. If we follow all the links we get a most likely duplicate, just need to compare the backtraces, and are happy.
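To make the idea a bit more concrete, here is a minimal Python sketch of how such a network could be walked. It is not the code of the tool itself. The bug numbers, the two dictionaries and the function names are all made up for illustration, and the "count how many paths end at the same report" heuristic is just one plausible way to pick a winner.

```python
from collections import Counter

# Hypothetical sample data, not real bug numbers.
# Bug id -> bug it is already marked as a duplicate of (None if not marked).
DUPE_OF = {100: None, 101: 100, 102: 100, 103: 102, 104: None}
# Bug id -> possible duplicates listed in that crash report.
POSSIBLE = {200: [101, 102, 103, 104], 104: [101]}


def resolve(bug, dupe_of):
    """Follow 'duplicate of' links until a report that is not a duplicate."""
    seen = set()
    while dupe_of.get(bug) is not None and bug not in seen:
        seen.add(bug)  # guard against accidental duplicate cycles
        bug = dupe_of[bug]
    return bug


def most_likely_duplicate(start, possible, dupe_of):
    """Walk the network of possible duplicates and vote for the end points.

    Returns (best, others): the report most paths lead to, and the other
    end points, which are candidates that still need triaging.
    """
    votes = Counter()
    visited = {start}
    queue = list(possible.get(start, []))
    while queue:
        bug = queue.pop(0)
        if bug in visited:
            continue
        visited.add(bug)
        root = resolve(bug, dupe_of)
        if root != start:
            votes[root] += 1
        # Possible duplicates have further possible duplicates.
        queue.extend(possible.get(bug, []))
    if not votes:
        return None, []
    best = votes.most_common(1)[0][0]
    others = sorted(bug for bug in votes if bug != best)
    return best, others


best, others = most_likely_duplicate(200, POSSIBLE, DUPE_OF)
print(best, others)
```

In this made-up example three of the four possible duplicates end up at report 100, so it is suggested as the duplicate, while 104 is left over as a report that nobody has looked at yet. The backtraces still have to be compared by a human.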
So I started to work on this, using Bugzilla’s WebService API to query a bug, find the duplicates and connect the various reports in a graph. The result is:
It’s a small application which takes a bug report, loads it and builds up the graph of all related crash reports. It finds the most likely duplicate and offers to mark the selected bug as a duplicate. The visualization helps to see whether there are further reports in the network and one can use the tool to also mark those as duplicates.
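For the curious, drawing such a network with Graphviz needs very little. The following Python sketch shows how a graph like this could be written out in the DOT language. Again this is only an illustration under my own assumptions: the data, the function name `to_dot` and the choice of dashed edges for possible duplicates are invented here and are not taken from the tool.

```python
# Hypothetical sample data, not real bug numbers.
DUPE_OF = {101: 100, 102: 100}    # confirmed "duplicate of" links
POSSIBLE = {200: [101, 102]}      # possible duplicates listed in a report


def to_dot(possible, dupe_of):
    """Return a DOT description of the duplicate network as a string."""
    lines = ["digraph duplicates {"]
    # Possible duplicates are only guesses, so draw them dashed.
    for bug, candidates in sorted(possible.items()):
        for candidate in candidates:
            lines.append(f'    "{bug}" -> "{candidate}" [style=dashed];')
    # Confirmed duplicates get a solid edge.
    for bug, target in sorted(dupe_of.items()):
        if target is not None:
            lines.append(f'    "{bug}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


dot_source = to_dot(POSSIBLE, DUPE_OF)
print(dot_source)
```

Saved to a file, the output can be rendered with something like `dot -Tpng graph.dot -o graph.png`.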
I hope that this is a useful tool for our bug squad team. It can be downloaded from my scratch repository. As build dependencies it needs QJson, Qt4, kdelibs4 and kdepimlibs4. At runtime it needs dot (Graphviz), and you should have a mail transport configured in KMail. The mail transport is needed to mark reports as duplicates.
To start the tool use:
./duplicatefinder --email@example.com bugid