Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Diederik van Liere
I just posted a blog titled "Improving Bugzilla’s Bug Overview List by
Predicting Which Bug Will Get Fixed".

It is a proposal to make a small change to the bug overview list. I
suggest adding a new column to Bugzilla’s bug list overview called
‘Probability of Fix’. Adding this column to the bug list overview can
help open source communities focus on the most promising bug
reports. The current bug list overview treats every bug as if it has
the same probability of being fixed. I demonstrate, using an extensive
dataset from Firefox, that bug reporter experience, bug reporter past
success rate, the presence of a stack trace and whether the bug
reporter is a Mozilla affiliate are strong and positive predictors
of whether a bug will be fixed.

I hope you will find this interesting and you can read the blog at:
http://network-labs.org/2009/06/improving-bugzilla’s-bug-overview-list-by-predicting-which-bug-will-get-fixed/

Looking forward to your reactions and feedback,
best,
Diederik
_______________________________________________
support-bugzilla mailing list
[hidden email]
https://lists.mozilla.org/listinfo/support-bugzilla
PLEASE put [hidden email] in the To: field when you reply.

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Gervase Markham
On 02/07/09 17:43, Diederik wrote:
> I just posted a blog titled "Improving Bugzilla’s Bug Overview List by
> Predicting Which Bug Will Get Fixed".

Diederik,

That's brilliant :-) Here are some comments:

- The operating system a bug is labelled with is normally set to the
operating system the reporter is using, rather than being an indicator
that the bug is only found on a certain OS. Are you aware of that? Was
your use of OS an attempt to analyse whether Linux users filed better or
worse bugs than Windows users, or something like that?

- Table 1 confuses me. It seems that almost all rows are Significant,
even though they have widely differing values for Coefficient. Could you
explain how that works in a bit more detail?

- Did you start with a database dump and then augment it with web
crawling? Or did you just get a really recent dump (June 28th)?

- The "steps" and "user agent" variables are going to be biased towards
novice bug reporters, because both of those bits of data are required by
the Guided bug reporting form, which new bug filers must use, and not by
the standard one.

- You said that experience alone, not considering success rate, makes it
less likely a bug report will be fixed. This is an amazing result. Does
it mean that there's a small group of prolific filers of bad bugs? If
so, can you tell us who they are so we can close their accounts?

- Even given the above, is it still true that (experience x success
rate) is a better predictor than success rate alone?

I definitely think we need to find a way to help triagers use this data
to work more effectively. Let me be clear on what you are suggesting.
You are suggesting that when a bug is filed, Bugzilla should calculate
the reporter's current experience and success rate, whether there's a
stack trace and whether they are a Mozilla person, and use it to
calculate a field which is then unchangeable but which can be displayed
in bug lists?

Gerv

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Diederik van Liere
On Jul 3, 4:35 am, Gervase Markham <[hidden email]> wrote:
> Diederik,
>
> That's brilliant :-) Here are some comments:

** Thanks!

> - The operating system a bug is labelled with is normally set to the
> operating system the reporter is using, rather than being an indicator
> that the bug is only found on a certain OS. Are you aware of that? Was
> your use of OS an attempt to analyse whether Linux users filed better or
> worse bugs than Windows users, or something like that?

** I am aware that the Operating System variable usually describes the
platform that the bug reporter uses and that it does not necessarily
mean that a bug only applies to that OS. The reason why I include this
variable is to rule out the Operating System as an alternative
explanation. The Windows, Linux and (to a lesser extent) OSX platforms
can draw from a larger pool of potential developers than, let's say,
OpenVMS or Mac System 8. As the pool of developers is smaller for
the latter platforms, it is likely that it will take more time for a
bug to get fixed. So, this variable is not meant to say that users of a
particular platform are better or worse at writing bug reports, but
that some platforms have more potential developers than others.

> - Table 1 confuses me. It seems that almost all rows are Significant,
> even though they have widely differing values for Coefficient. Could you
> explain how that works in a bit more detail?

** Sure. The significance of a coefficient is determined by its
standard error. Basically, dividing the coefficient by the standard
error gives you the t-value of that coefficient. As a rule of thumb,
t-values larger than 1.96 in absolute value are significant at the 5%
level. A standard error indicates how close the coefficient from
the sample is to the true but unknown coefficient of the population.
As the data set is really large, the standard errors become small
and hence almost all coefficients are highly significant. When you
compare different coefficients, you are looking at different
effect sizes: some coefficients are more important than others
because they have a larger effect size.
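
The rule of thumb above can be sketched in a few lines of Python. This
is only an illustration: the numbers are made up and do not come from
Table 1, and the standard error of a sample mean stands in for the
standard error of a regression coefficient, purely to show how it
shrinks as the sample grows.

```python
import math


def t_value(coefficient, standard_error):
    """Return the t-value: the coefficient divided by its standard error."""
    return coefficient / standard_error


def is_significant(coefficient, standard_error, critical=1.96):
    """Two-sided check at roughly the 5% level (large-sample rule of thumb)."""
    return abs(t_value(coefficient, standard_error)) > critical


def standard_error_of_mean(sample_sd, n):
    """Standard error of a sample mean; it shrinks with the square root of n."""
    return sample_sd / math.sqrt(n)


# Hypothetical effect size, chosen only for illustration.
small_effect = 0.02
for n in (100, 10_000, 1_000_000):
    se = standard_error_of_mean(sample_sd=1.0, n=n)
    print(n, round(t_value(small_effect, se), 2), is_significant(small_effect, se))
```

The same small effect is not significant with 100 observations but is
with a million, which is why significance and effect size have to be
read separately.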

> - Did you start with a database dump and then augment it with web
> crawling? Or did you just get a really recent dump (June 28th)?

** I downloaded all bug reports from Bugzilla, from bugid 1 to
bugid 480000, if:
        - the bug applies to Firefox / SeaMonkey
        - the initial bug report contains a comment.
Downloading started sometime around the 20th of June and took about a
week because I didn't want to hammer Bugzilla too much :)

> - The "steps" and "user agent" variables are going to be biased towards
> novice bug reporters, because both of those bits of data are required by
> the Guided bug reporting form, which new bug filers must use, and not by
> the standard one.

** Yes, you are right. So far, I have run different statistical models
and I rarely find any effect for these variables.
It seems that they don't really matter, which seems strange to me.


> - You said that experience alone, not considering success rate, makes it
> less likely a bug report will be fixed. This is an amazing result. Does
> it mean that there's a small group of prolific filers of bad bugs? If
> so, can you tell us who they are so we can close their accounts?


** That is indeed what it means :) So people who file a lot and have a
bad past success rate clog the system. However, if you don't take
past success rate into account, then the sign of the experience
coefficient flips and it becomes positive (that model is not part of
the blog post). I will try to identify those people, but some caution
and human judgment is advisable before closing accounts.

> - Even given the above, is it still true that (experience x success
> rate) is a better predictor than success rate alone?

** The interaction effect can only be interpreted together with the
main effects. The strongest predictor is past success rate; if you
include the interaction effect, then the coefficient of the experience
variable becomes negative.
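
To illustrate why the main effect and the interaction have to be read
together, here is a minimal sketch. It assumes a model that is linear
in the log-odds with one interaction term, and the coefficients are
invented for the example (they are not the values from the blog post).

```python
def log_odds(experience, success_rate, b0, b_exp, b_succ, b_inter):
    """Linear predictor with an (experience x success rate) interaction term."""
    return (b0
            + b_exp * experience
            + b_succ * success_rate
            + b_inter * experience * success_rate)


def experience_effect(success_rate, b_exp, b_inter):
    """Change in log-odds per extra unit of experience at a given success rate.

    With an interaction in the model this is b_exp + b_inter * success_rate,
    so the sign of b_exp alone does not tell the whole story.
    """
    return b_exp + b_inter * success_rate


# Hypothetical coefficients: negative main effect, positive interaction.
B_EXP, B_INTER = -0.3, 0.8

for rate in (0.1, 0.5, 0.9):
    print(rate, round(experience_effect(rate, B_EXP, B_INTER), 2))
```

With these made-up numbers, more experience lowers the log-odds of a
fix for a reporter with a poor success rate and raises it for a
reporter with a good one, even though the experience coefficient on its
own is negative.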

> I definitely think we need to find a way to help triagers use this data
> to work more effectively. Let me be clear on what you are suggesting.
> You are suggesting that when a bug is filed, Bugzilla should calculate
> the reporter's current experience and success rate, whether there's a
> stack trace and whether they are a Mozilla person, and use it to
> calculate a field which is then unchangeable but which can be displayed
> in bug lists?

** Exactly. I also emailed with Max Kanat-Alexander about this and he
suggested that this might have a negative impact on 'morale', which I
can see. So maybe this field should only be visible to bug triagers.
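
A rough sketch of how such a field could be computed once, at filing
time, is shown below. It assumes a logistic model, which this thread
does not confirm, and every name and coefficient in it
(ReporterSnapshot, COEFFICIENTS, probability_of_fix) is hypothetical;
only the four inputs come from the discussion above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReporterSnapshot:
    """Reporter data frozen at the moment the bug is filed (hypothetical type)."""
    experience: int             # number of bugs filed before this one
    success_rate: float         # share of those bugs marked FIXED, 0.0 to 1.0
    has_stack_trace: bool       # whether the new report contains a stack trace
    is_mozilla_affiliate: bool  # whether the reporter is a Mozilla person


# Invented coefficients; all four predictors are positive, as described above.
COEFFICIENTS = {
    "intercept": -2.0,
    "experience": 0.01,
    "success_rate": 2.5,
    "stack_trace": 0.7,
    "mozilla_affiliate": 1.0,
}


def probability_of_fix(snapshot, coef=COEFFICIENTS):
    """Map the snapshot to a probability with the logistic function."""
    z = (coef["intercept"]
         + coef["experience"] * snapshot.experience
         + coef["success_rate"] * snapshot.success_rate
         + coef["stack_trace"] * snapshot.has_stack_trace
         + coef["mozilla_affiliate"] * snapshot.is_mozilla_affiliate)
    return 1.0 / (1.0 + math.exp(-z))


novice = ReporterSnapshot(0, 0.0, False, False)
veteran = ReporterSnapshot(50, 0.6, True, True)
print(round(probability_of_fix(novice), 3), round(probability_of_fix(veteran), 3))
```

Because the snapshot is immutable, the stored value would not change
when the reporter's statistics change later, which matches the
"unchangeable field" idea in the question.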

> Gerv

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Gervase Markham
On 03/07/09 21:31, Diederik wrote:
> ** Exactly. I also emailed with Max Kanat-Alexander about this and he
> suggested that this might have a negative impact on 'morale', which I
> can see. So maybe this field should only be visible to bug triagers.

One option would be to hold this data externally, and have a
Greasemonkey script or Firefox extension which inserted it into the HTML
of every bug you opened.

Another option would be to have it available to those who have
"editbugs" privileges only.

By the way, you don't have to scrape Bugzilla to get data. We now
automatically generate MySQL database dumps with secure bugs removed.
I'm sure we could get you access to one.

Gerv

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Diederik van Liere
I like the idea of holding the data externally. However, my
JavaScript coding skills are poor, so I am happy to help someone but I
won't be able to code it myself.
It would be awesome to have access to the database dumps!

D

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Diederik van Liere
I wrote a Jetpack add-on that calculates the probability that a bug
report will result in a bug fix. This add-on might help to focus
the resources of the Mozilla community on the most promising bug
reports. Using an extensive data set from the Firefox project, I
showed that bug reporter experience, bug reporter past success rate,
the presence of a stack trace and whether the bug reporter is a
Mozilla affiliate are strong and positive predictors of whether a bug
will be fixed. The full blog post is available here:
http://network-labs.org/2009/06/improving-bugzilla%E2%80%99s-bug-overview-list-by-predicting-which-bug-will-get-fixed/

And the new post introducing the jetpack addon can be found here:
http://network-labs.org/2009/07/jetpack-add-on-to-predict-likelihood-of-bug-fix-in-bugzilla/

best,
Diederik

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Jean-Marc Desperrier
In reply to this post by Gervase Markham
Gervase Markham wrote:
> Does it mean that there's a small group of prolific filers of bad bugs?

Given the number of reporters, and given that there *is* a small group
of extremely experienced reporters who are also prolific reporters, I
don't think this group is small.

I can imagine a number of slightly different explanations, though.

Maybe more experienced reporters enter more *sophisticated* bugs
that have a smaller probability of being solved.

Another point is that if the bug is closed as a dupe, it should be
considered a *failure* of the reporter and not a success.
It might be that novice reporters have their success rate very much
bumped up by the fact that they mostly report dupes.

Given that (experience x success rate) is a positive predictor, this
probably isn't the whole story, but it could still be a
significant factor.

I feel I must add another perspective here: some of the
volunteers who do the first wave of bug filtering are a bit
overzealous and too easily close as dupes some bugs that are not dupes
when you look at the details closely (not exactly the same reproduction
conditions, which means the main bug can get closed *without* solving
that bug). So gaining experience can in some cases mean fighting them
more successfully to keep the bug from being marked as a dupe, which
lowers the success rate if closed-as-dupe is counted as a success.

Maybe the best success rate indicator would be getting one's
bug marked as NEW and nothing else, combined with the failure rate
of bugs directly opened as NEW that finally get closed as DUPE or
INVALID. WONTFIX is not an easy one to handle, but if you watch this
strictly from the Mozilla seat, you're interested in reporters that
report bugs that you *want* to fix, so WONTFIX is indeed a failure.


Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Gervase Markham
On 09/07/09 10:17, Jean-Marc Desperrier wrote:
> I can imagine a number of slightly different explanations, though.
>
> Maybe more experienced reporters enter more *sophisticated* bugs
> that have a smaller probability of being solved.
>
> Another point is that if the bug is closed as a dupe, it should be
> considered a *failure* of the reporter and not a success.
> It might be that novice reporters have their success rate very much
> bumped up by the fact that they mostly report dupes.

We certainly need to get the criteria for success rate
right. Diederik: you define it in your blog post as:

"Success Rate: the percentage of past bug reports by the bug reporter
that has been fixed."

Does that mean (number of bugs marked as FIXED/total bugs filed)?
I think we need to nuance that. I would define success rate as:

(number of bugs marked as FIXED + number of bugs marked as DUPLICATE of
bugs with higher bug numbers) / (number of bugs resolved, not including
EXPIRED or WONTFIX)

DUPLICATE of lower-numbered bug is generally bad, as is INVALID.
WORKSFORME could be either, I guess.

I'd be interested to hear if you get significantly different results
using that metric.
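
The two candidate definitions can be compared with a small sketch like
the one below. The Bug record and both function names are hypothetical,
and the sketch reads "DUPLICATE of bugs with higher bug numbers" as the
reporter's bug having the lower bug number of the pair.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Bug:
    """Minimal, hypothetical view of one bug filed by a reporter."""
    bug_id: int
    resolution: Optional[str]      # None while the bug is still unresolved
    dupe_of: Optional[int] = None  # target bug id when resolution is DUPLICATE


def simple_success_rate(bugs: Iterable[Bug]) -> Optional[float]:
    """FIXED bugs divided by all bugs filed; None if nothing was filed."""
    bugs = list(bugs)
    if not bugs:
        return None
    return sum(b.resolution == "FIXED" for b in bugs) / len(bugs)


EXCLUDED = {"EXPIRED", "WONTFIX"}


def nuanced_success_rate(bugs: Iterable[Bug]) -> Optional[float]:
    """(FIXED + DUPLICATE of a higher-numbered bug) / resolved bugs,
    leaving EXPIRED and WONTFIX out of the denominator."""
    resolved = [b for b in bugs
                if b.resolution is not None and b.resolution not in EXCLUDED]
    if not resolved:
        return None
    good = sum(
        b.resolution == "FIXED"
        or (b.resolution == "DUPLICATE"
            and b.dupe_of is not None
            and b.dupe_of > b.bug_id)
        for b in resolved
    )
    return good / len(resolved)


history = [
    Bug(10, "FIXED"),
    Bug(11, "DUPLICATE", dupe_of=5),   # duplicate of an older bug: bad
    Bug(12, "DUPLICATE", dupe_of=40),  # filed first, later bug won: good
    Bug(13, "WONTFIX"),                # excluded from the denominator
    Bug(14, "INVALID"),
    Bug(15, None),                     # still open, not resolved
]
print(simple_success_rate(history), nuanced_success_rate(history))
```

WORKSFORME is counted as a failure here only because it is neither
FIXED nor excluded; since it "could be either", that choice is an
assumption of the sketch.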

Gerv

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Diederik van Liere
Hi Gerv, Jean-Marc,

A bug fix is defined as a bug that receives the “FIXED” status in
Bugzilla. WORKSFORME, INVALID, DUPLICATE, INCOMPLETE and WONTFIX are not
counted as a bug fix. Alternative definitions of success will
definitely lead to different results; I will try to post an alternative
set of results this weekend.

About the experience effect, it is caused by the interaction effect.
Removing the interaction effect will make the experience predictor
positive. In the next set of results, I will drop the interaction
effect to ease interpretation; the effect size of the
interaction seems a bit marginal as well.

best,
Diederik

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Max Kanat-Alexander
Diederik wrote:
> About the experience effect, it is caused by the interaction effect.

        What's the "interaction effect"? (That is, what does that mean?)

        -Max
--
http://www.everythingsolved.com/
Competent, Friendly Bugzilla and Perl Services. Everything Else, too.

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

alexsibeth
In reply to this post by Diederik van Liere
Hello,
Following up on this report, I am curious whether any of the results were implemented in the Bugzilla system.
I am doing a master's thesis on something very similar.

Thanks!

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Andre Klapper
On Mon, 2017-03-06 at 08:55 -0800, [hidden email] wrote:

> Following up on this report, I am curious if any of the results were
> implemented into the Bugzilla system? 
> I am doing a master thesis on something very similar and curious. 

As the URL that you posted does not work anymore, it's hard to say...
Can you post the questions / change proposals that you refer to?

Thanks,
andre
--
Andre Klapper  |  [hidden email]
http://blogs.gnome.org/aklapper/

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

alexsibeth
In reply to this post by Diederik van Liere
Sorry for the delay.

I am referring to this article:
https://web.archive.org/web/20100124063818/http://network-labs.org/2009/06/improving-bugzilla%E2%80%99s-bug-overview-list-by-predicting-which-bug-will-get-fixed/

It talks about implementing a system that will rank bugs by the probability that they will be resolved.

Re: Blog titled "Improving Bugzilla’s Bug Overview List by Predicting Which Bug Will Get Fixed"

Andre Klapper
On Tue, 2017-03-14 at 02:29 -0700, [hidden email] wrote:
> I am referring to this article:
> https://web.archive.org/web/20100124063818/http://network-labs.org/2009/06/improving-bugzilla%E2%80%99s-bug-overview-list-by-predicting-which-bug-will-get-fixed/
>
> Which talks about implementing a system that will rank bugs depending
> on the probability they will be resolved.

On Mon, 2017-03-06 at 08:55 -0800, [hidden email] wrote:
> Following up on this report, I am curious if any of the results were
> implemented into the Bugzilla system? 
> I am doing a master thesis on something very similar and curious. 

I am not aware of anybody who proposed actual code to implement it.

andre
--
Andre Klapper  |  [hidden email]
http://blogs.gnome.org/aklapper/