Regarding Duplicate Bug Report Detection in Bugzilla

amar.budhiraja1
Hi,
I am working on a research project on the automatic detection of duplicate bug reports.
I am using the last 10 years of Mozilla bug reports (~750K) for this task, with the aim of making triage easier by ranking the duplicate bug report within the top 10 candidates.
The results look promising quantitatively and we want to publish them at a tier-1 conference.
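
As a rough illustration of the top-10 criterion, here is a minimal sketch of how such a measure is commonly computed (often called recall rate at k). The function name, the data layout, and the bug IDs are hypothetical and are not taken from the project itself.

```python
from typing import Dict, List


def recall_at_k(ranked: Dict[str, List[str]], truth: Dict[str, str], k: int = 10) -> float:
    """Fraction of duplicate reports whose known original appears in the top k candidates."""
    if not truth:
        raise ValueError("truth must contain at least one labelled duplicate")
    hits = 0
    for report_id, original_id in truth.items():
        # Reports with no ranking at all count as misses.
        candidates = ranked.get(report_id, [])[:k]
        if original_id in candidates:
            hits += 1
    return hits / len(truth)


# Hypothetical bug IDs, for illustration only.
ranked = {
    "bug-A": ["bug-7", "bug-3", "bug-9"],
    "bug-B": ["bug-2", "bug-8"],
}
truth = {"bug-A": "bug-3", "bug-B": "bug-5"}
print(recall_at_k(ranked, truth, k=10))  # 0.5
```

Here one of the two duplicates has its original among the ranked candidates, so the score is 0.5.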

For this, we need Mozilla's help. We have about 600 sets of 10 words each, and we request Mozilla's help with the quantitative evaluation of those sets. Basically, for each 10-word set, someone will have to say whether the words belong to the same topic.

We would appreciate it if Mozilla could ask its community to help with the labeling.

Hoping to hear back.

Thanks,
Amar Budhiraja
Data Science and Analytics Centre
IIIT-Hyderabad
_______________________________________________
dev-apps-bugzilla mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-apps-bugzilla
-
To view or change your list settings, click here:
<https://lists.bugzilla.org/cgi-bin/mj_wwwusr?user=lists+s6506n84121h51@...>

Re: Regarding Duplicate Bug Report Detection in Bugzilla

amar.budhiraja1
Hi Emma,

Thank you for such a prompt reply.
Please find my replies in-line.



On Saturday, October 15, 2016 at 6:45:34 AM UTC+5:30, Emma Humphries wrote:
> Hi Amar,
>
> Thanks for your inquiry. I'm CC'ing the Firefox community manager Mike Hoye
> on this in case he has volunteer resources available to help.
>
> To clarify your request, do you want someone to verify if a group of words
> describes the same topic?

Yes. To add to that, I'd like to know whether most of the words fall under the same topic, not necessarily whether they describe a topic. The rating would be binary (0 or 1).
 
> What level of experience/expertise will be needed to do this verification?
> Does the verify-er need to understand the product/component organization of
> bugzilla, understand Firefox source code, have programming experience?


Some familiarity with the product/component organization of Firefox is necessary. For example, one set I encountered contained words like "email" and "reply" alongside "thunderbird". Programming experience is necessary to understand coding jargon. No knowledge of the source code is required.


> Do you have an estimate of how long this task would take, and do you have a
> sample of the data you need reviewed and verified?



In the worst-case scenario, all the words in a 10-word set could be unknown to the verifier, who would then ideally look them up online. In such a case, it could take up to 2-3 minutes per 10-word set.

A sample of 10 such 10-word sets is here:

https://drive.google.com/file/d/0BwFxxPd1ZJkrbks3TUtoZkI4OTg/view?usp=sharing



Thanks,
Amar




> Thanks,
>
> Emma Humphries
> Bugmaster
>

Re: Regarding Duplicate Bug Report Detection in Bugzilla

Dylan Hardison
In reply to this post by amar.budhiraja1

This is very interesting. How will the labeling work? Some online questionnaire or Google Form?
Let me know exactly what's expected and I'll see what I can do. If we can find 60 volunteers, that's only ten sets of ten words each, but I might be misunderstanding how you'd need to collect the answers (and how you control for humans reporting incorrectly).

Kind regards,

Dylan Hardison.

Re: Regarding Duplicate Bug Report Detection in Bugzilla

amar.budhiraja1
In reply to this post by amar.budhiraja1
So, I think I will develop a web app or Google Form that shows groups of words, and volunteers will label whether each group belongs to the same topic or not.

Also, I was hoping to have each set labelled by, say, 3 people (preferably 5) in order to reduce the effect of variation in human judgement.

If each volunteer still labels only 10 sets of 10 words, I will need a minimum of 180 volunteers and ideally 300. I was hoping the size of Mozilla's community would help there.
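
The arithmetic behind those volunteer numbers can be sketched as follows, together with one possible way to combine several 0/1 ratings for the same set. Majority voting is an assumption made for illustration, not a method that has been decided for this study, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


def volunteers_needed(num_sets: int, labels_per_set: int, sets_per_volunteer: int) -> int:
    """Volunteers required so that every set receives the requested number of labels."""
    total_labels = num_sets * labels_per_set
    return math.ceil(total_labels / sets_per_volunteer)


def majority_label(labels: Iterable[int]) -> int:
    """Combine 0/1 ratings for one word set by majority vote (assumes an odd number of raters)."""
    counts = Counter(labels)
    return 1 if counts[1] > counts[0] else 0


print(volunteers_needed(600, 3, 10))  # 180
print(volunteers_needed(600, 5, 10))  # 300
print(majority_label([1, 0, 1]))      # 1
```

Using an odd number of raters per set (3 or 5) avoids ties in the vote, which is one practical reason to prefer those counts.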

Thanks,
Amar



Re: Regarding Duplicate Bug Report Detection in Bugzilla

amar.budhiraja1
In reply to this post by amar.budhiraja1
Hello,
I was wondering if there has been any update on this?

Thanks,
Amar




Re: Regarding Duplicate Bug Report Detection in Bugzilla

Gervase Markham
On 18/10/16 13:16, [hidden email] wrote:
>> We would appreciate if Mozilla could help by asking its community to help with the labeling.

If you build the necessary web app for Mozillians to rate groups of
words, then perhaps Emma or someone else could put out a call for
volunteers for you. But you need to have something for them to do,
before we start trying to find people.

Gerv
