E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT


Damon Sicore-2
All,

There will be a meeting Friday, November 4, at 10:00AM PDT to discuss optimizing our efforts to improve browser responsiveness.  Specifically, we will discuss remaining tasks for e10s and if and how we will resource them, hangs and jank caused by plugins, and other efforts to improve responsiveness.  Below is the agenda for the meeting.   If you are in the BCC list (using BCC list to avoid dev-planning auto-bouncing this as spam), you should plan to attend.  

Meeting Details:

# When: Friday, Nov 4, 10am PDT.  Blocking out two hours for this discussion.
# Mozilla Mountain View: Warp Core, 3rd floor
# 650-903-0800 or 650-215-1282 x92 Conf# 95312 (US/INTL)
# 1-800-707-2533 (pin 369) Conf# 95312 (US)
# Vidyo Room: Warp Core
# Vidyo Guest URL: https://v.mozilla.com/flex.html?roomdirect.html&key=UK1zyrd7Vhym (please mute)
# irc.mozilla.org #planning for backchannel

# Agenda

1) Confirm responsiveness is a top goal (no matter the method) for engineering.

2) Discuss potential jank problems to be solved outside E10S and the measurements we are using to track them

+ Andreas & Joel/Ted's tools
+ Taras' I/O tracking
+ Places
+ Incremental GC
+ Cycle Collector

3) Role of E10S in solving jank issues

4) Out of process plugin hangs, jank, memory, and lifecycle issues - Identify efforts and staff.

5) E10s Future prioritization, staffing and drivers
_______________________________________________
dev-planning mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-planning

Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Honza Bambas-5
Are there any meeting notes to look at?  I wasn't online on Friday and
missed the meeting.

Thanks.
-hb-


Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Damon Sicore-2
In reply to this post by Damon Sicore-2
All,

Per the message below, a meeting was held to discuss E10s and responsiveness resourcing and planning.  We listed and discussed the major projects that have an E10s aspect:

1)  Front-end inversion of control flow efforts, currently staffed by Gavin, Felipe, and Drew.
2)  Add-on static analysis to determine the path towards add-on compatibility, dherman.
3)  Jetpack
4)  Graphics E10s (E10s layers, D3D9, 10, and OS X OpenGL Layers)
5)  Dev Tools (dcamp and crew)
6)  A11y (dbolter and crew)
7)  IndexedDB and printing support (Kyle and Smaug)
8)  Eng tools special powers.

Today, our product issues around responsiveness are not well defined; however, it's clear that they exist: users consistently report hangs and jank (scrolling that pauses or a UI that stops responding), as well as Flash hangs or jitter during video playback.  These are problems we cannot ignore.  There are specific efforts that require focus in order to address responsiveness:

A)  Out of process plugin optimizations:  Memory leaks, restarts, hangs, and kill timer problems.
B)  Event loop tuning, Gecko Reflow tuning:  We need to instrument and optimize.
C)  Incremental GC:  JS team is on track to deliver incremental GC this quarter.
D)  Cycle collector optimizations.
E)  Places optimizations:  Taras' team is working on optimizing; however, we need to make hard decisions about when and where to use an SQL database.  We also need to consider alternatives to SQLite.
F)  Front-end jank.
G)  Content stimulated jank.

We discussed the merits of efforts 1-8 above and decided that the front-end IOC work (1) should be suspended in order to focus on Places optimizations, item (E).  The dev tools efforts (5) will also be suspended.

In addition, we discussed the formation of a program and/or a team strictly responsible for improving responsiveness.  Items A-G above embody a significant amount of work, and to address these issues, we'll need to apply significant resources.  David Mandelin suggested that someone should be living and breathing responsiveness.  As a result, it was decided that JP, Johnath, Bob Moss, and I would form a program, identifying specific resources to address responsiveness issues as a whole.  This type of effort has been extremely effective in the past (CritSmash resulted in shipping with zero reproducible sg:crits, CrashKill dramatically improved our stability and ability to track product stability, etc.)  This program will be similar.  

Summary:  We'll suspend the efforts noted above to refocus on responsiveness issues with a special program to be formed by Johnath, JP, Bob Moss, and myself.  As in previous special programs, we'll need everyone's attention and support to identify and fix responsiveness issues.  As Firefox is a portal to the web, users demand that it be as responsive as possible.  This is a basic fact.  This is not an effort any of us can ignore.  Just like the critical security bug program (CritSmash), we'll need developers to immediately respond and act on identified responsiveness issues.  

All my best,

Damon





Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Boris Zbarsky
In reply to this post by Damon Sicore-2
On 11/8/11 4:12 PM, Damon Sicore wrote:
> G)  Content stimulated jank.

This is the big one I run into that seems to be very difficult outside
something like e10s.  Do we have any ideas on attacking it?

-Boris

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Robert O'Callahan-3
On Wed, Nov 9, 2011 at 11:06 AM, Boris Zbarsky <[hidden email]> wrote:

> On 11/8/11 4:12 PM, Damon Sicore wrote:
>
>> G)  Content stimulated jank.
>>
>
> This is the big one I run into that seems to be very difficult outside
> something like e10s.  Do we have any ideas on attacking it?
>

Yeah, I have no idea how to solve this outside e10s and I haven't heard
anyone else suggest anything either. At least for problems that boil down
to "page runs long-running JS script without yielding".

Under D, "Cycle collector optimizations", while I think we can make some
incremental improvements to make pause times a bit shorter, I have no idea
how we can eliminate nasty CC pauses for large heaps without e10s (or
something similar in risk).

Rob
--
"If we claim to be without sin, we deceive ourselves and the truth is not
in us. If we confess our sins, he is faithful and just and will forgive us
our sins and purify us from all unrighteousness. If we claim we have not
sinned, we make him out to be a liar and his word is not in us." [1 John
1:8-10]

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Robert O'Callahan-3
To be clear --- I think it very likely makes sense to temporarily delay
e10s to focus on short-term wins, especially stuff like Places that
wouldn't be helped by e10s at all. But I also think we have to have e10s to
win the war on jank.

Rob

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

SmauG-2
In reply to this post by Damon Sicore-2
On 11/08/2011 11:12 PM, Damon Sicore wrote:

> All,
>
> Per the message below, a meeting was held to discuss E10s and
> responsiveness resourcing and planning.  We listed and discussed the
> major projects that have an E10s aspect:
>
> 1)  Front-end inversion of control flow efforts, currently staffed by
>     Gavin, Felipe, and Drew.
> 2)  Add-on static analysis to determine the path towards add-on
>     compatibility, dherman.
> 3)  Jetpack
> 4)  Graphics E10s (E10s layers, D3D9, 10, and OS X OpenGL Layers)
> 5)  Dev Tools (dcamp and crew)
> 6)  A11y (dbolter and crew)
> 7)  IndexDB and printing support (Kyle and Smaug)
> 8)  Eng tools special powers.
>
> Today, we have product issues around responsiveness that are not well
> defined; however, it's clear that responsiveness issues exist as our
> users are providing consistent feedback indicating hangs and jank
> (problems where scrolling is paused or the UI is unresponsive) or
> Flash hangs or jitters during video playback.  These are problems we
> cannot ignore.  There are specific efforts that require focus in
> order to address responsiveness:
>
> A)  Out of process plugin optimizations:  Memory leaks, restarts,
>     hangs, and kill timer problems.
> B)  Event loop tuning, Gecko Reflow tuning:  We need to instrument
>     and optimize.
> C)  Incremental GC:  JS team is on track to deliver incremental GC
>     this quarter.
> D)  Cycle collector optimizations.
> E)  Places optimizations:  Taras' team is working on optimizing;
>     however, we need to make hard decisions about when and where to
>     use an SQL database.  We also need to consider alternatives to
>     SQLite.
> F)  Front-end jank.
> G)  Content stimulated jank.


We should, IMO, also *actively* blocklist badly behaving add-ons.
https://bugzilla.mozilla.org/show_bug.cgi?id=694683
is an example. F-Secure was making Firefox almost unusable on
a fast, new laptop.







Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Andrew  McCreight
In reply to this post by Robert O'Callahan-3
----- Original Message -----
> Under D, "Cycle collector optimizations", while I think we can make
> some incremental improvements to make pause times a bit shorter, I
> have no idea how we can eliminate nasty CC pauses for large heaps
> without e10s (or something similar in risk).

Generally, I think long CC pause times are a symptom of leaks.  But it would be nice for the cycle collector to be more bulletproof.  As you say, there are various incremental ways to reduce CC times (interruptible CC, cycle collect pure DOM cycles separately from ones involving JS, aging out objects, etc.), but I don't know how much these will help in these cases we've been seeing recently where there may be some kind of leak resulting in multi-second pauses.

There is work on concurrent cycle collection, but this would probably require changing how we ref count cycle collected objects in the browser, which would not be a light undertaking, to say the least...



Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Robert O'Callahan-3
On Wed, Nov 9, 2011 at 1:24 PM, Andrew McCreight <[hidden email]> wrote:

> Generally, I think long CC pause times are a symptom of leaks.  But it
> would be nice for the cycle collector to be more bulletproof.  As you say,
> there are various incremental ways to reduce CC times (interruptible CC,
> cycle collect pure DOM cycles separately from ones involving JS, aging out
> objects, etc.), but I don't know how much these will help in these cases
> we've been seeing recently where there may be some kind of leak resulting
> in multi-second pauses.
>

The goal is to get a steady 60fps, which means CC pause times of about 10ms
max (maybe that's even a bit generous). I suspect even non-leaky heaps can
be large enough to exceed that.
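
As a rough check on those numbers, here is a minimal Python sketch of the frame-budget arithmetic. The function names are hypothetical and only illustrate the calculation; the 60fps target and the 10ms pause are the figures mentioned above.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at the given frame rate."""
    return 1000.0 / fps


def remaining_after_pause_ms(fps: float, pause_ms: float) -> float:
    """Frame time left for script, layout, and painting after a collector pause."""
    return frame_budget_ms(fps) - pause_ms


budget = frame_budget_ms(60)
left = remaining_after_pause_ms(60, 10)
print(f"frame budget at 60fps: {budget:.2f} ms")
print(f"left after a 10 ms CC pause: {left:.2f} ms")
```

At 60fps a frame lasts about 16.7 ms, so a 10 ms pause already consumes well over half of it, which is why 10 ms may be "a bit generous".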

Rob

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Shawn Wilsher-2
In reply to this post by Damon Sicore-2
On 11/8/2011 1:12 PM, Damon Sicore wrote:
> E)  Places optimizations:  Taras' team is working on optimizing; however, we need to make hard decisions about when and where to use an SQL database.  We also need to consider alternatives to SQLite.
Places is one of the few places where we need to use a SQL database
unless we plan on dropping a bunch of features.  Is this remark about
SQLite usage in general?

Cheers,

Shawn
_______________________________________________
dev-planning mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-planning
Reply | Threaded
Open this post in threaded view
|

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Doug Turner-4

On Nov 8, 2011, at 7:41 PM, Shawn Wilsher wrote:

> Places is one of the few places where we need to use a SQL database unless we plan on dropping a bunch of features.  Is this remark about using SQLite usage in general?


OOC, what features would we have to drop if we moved away from SQL?  I spent a few weeks in the code and didn't see anything directly tied to SQL that couldn't be replaced by a different data store, but clearly I didn't mess with the entire Places schema…


Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Dietrich Ayala
For all ~10 SQLite databases used, we should evaluate whether SQL is
the right tool, Places included. SQLite is certainly not required for
all Places features, and we absolutely should look at alternatives
including hybrid storage solutions, or switching away from SQLite
altogether if its specific performance challenges are insurmountable.

But storage is only one part of the story - switching persistent
storage solutions is not a panacea for broader architectural problems.

There are various scenarios in which chrome code can block the UI for
unreasonable periods of time. For instance, I recently found a
Facebook page that results in the session-restore code blocking the UI
while serializing session history for subframes. Combinations of
user-data and web content could be causing chrome side-effects like
this in the wild on a regular basis, resulting in worse problems than
any of our known storage-related problems.

Telemetry data is starting to help our visibility here, and we should
continue to push on it - implementing broader instrumentation that
tells us exactly what's causing long event-loop lag, etc.
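
One generic way to instrument event-loop lag is to schedule a short timer and record how late it fires. The Python sketch below illustrates that idea with asyncio; it is not how Firefox Telemetry is implemented, and `measure_event_loop_lag` and `blocking_task` are hypothetical names used only for this example.

```python
import asyncio
import time
from typing import List


async def measure_event_loop_lag(interval_s: float, samples: int) -> List[float]:
    """Sleep for interval_s repeatedly and record how late each wake-up was, in ms."""
    loop = asyncio.get_running_loop()
    lags_ms = []
    for _ in range(samples):
        expected = loop.time() + interval_s
        await asyncio.sleep(interval_s)
        lags_ms.append(max(0.0, (loop.time() - expected) * 1000.0))
    return lags_ms


async def blocking_task(duration_s: float) -> None:
    """Stand-in for code that hogs the main thread (hypothetical)."""
    time.sleep(duration_s)  # deliberately blocks the event loop


async def main() -> List[float]:
    probe = asyncio.create_task(measure_event_loop_lag(0.01, 5))
    await asyncio.sleep(0.015)  # let the probe start sampling first
    await blocking_task(0.1)    # then block the loop for about 100 ms
    return await probe


lags = asyncio.run(main())
print("max lag: %.1f ms" % max(lags))
```

The probe only reports that the loop was late and by how much; attributing the lag to a specific caller needs additional instrumentation around the blocking code itself.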

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Boris Zbarsky
In reply to this post by Boris Zbarsky
On 11/8/11 6:56 PM, Robert O'Callahan wrote:
> At least for problems that boil down
> to "page runs long-running JS script without yielding".

This one is perhaps sorta-solvable.  In particular, we could do exactly
what we do for our slow script dialog right now but slightly better: from
an operation callback, block all event delivery to the page, spin up a
nested event loop, process events, then return control to the page,
unless the tab gets closed, in which case we don't return.  We already
support stopping a script from the operation callback.

The hard part above is "block all event delivery to the page".
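
A toy Python model of that control flow, as a sketch only: the `Browser` class, its method names, and the "chrome"/"page" event targets are all hypothetical, and the simulation ignores the reentrancy problems that make the real thing hard.

```python
from collections import deque


class Browser:
    """Single-threaded toy: one event queue shared by chrome and page."""

    def __init__(self):
        self.queue = deque()     # pending events as (target, name)
        self.deferred = deque()  # page events held back while page script runs
        self.log = []            # order in which events were actually delivered

    def post(self, target, name):
        self.queue.append((target, name))

    def dispatch(self, target, name):
        self.log.append(f"{target}:{name}")

    def operation_callback(self):
        """Nested event loop: deliver chrome events, hold back page events."""
        while self.queue:
            target, name = self.queue.popleft()
            if target == "page":
                self.deferred.append((target, name))
            else:
                self.dispatch(target, name)

    def run_page_script(self, steps, callback_every):
        """Simulate a long-running script that never yields on its own."""
        for step in range(1, steps + 1):
            # ... one unit of page work would happen here ...
            if step % callback_every == 0:
                self.operation_callback()
        # Script finished: deliver the held-back page events in order.
        while self.deferred:
            self.dispatch(*self.deferred.popleft())


b = Browser()
b.post("chrome", "scroll")
b.post("page", "click")
b.post("chrome", "tab-switch")
b.run_page_script(steps=10, callback_every=5)
print(b.log)
```

In this model the chrome events are handled while the page script is still running, and the page's own click is delivered only after the script returns. Deferring into a separate queue is the toy equivalent of blocking event delivery to the page.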

A possibly somewhat harder problem is what to do when page JS asks for a
sync layout on a large page, or some other C++ operation that can't
quite handle being interrupted partway (even interruptible reflow can't
handle being interrupted at arbitrarily fine resolution).

-Boris
Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Doug Turner-4
In reply to this post by Dietrich Ayala
SQLite, or more likely our usage of it, is a major problem for the Mozilla platform, and it is a source of huge problems for mobile.  See https://bugzilla.mozilla.org/show_bug.cgi?id=696141#c7

How do we start evaluating alternatives for all of these databases?  Are you driving that?

Doug

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Robert O'Callahan-3
In reply to this post by Boris Zbarsky
On Wed, Nov 9, 2011 at 5:14 PM, Boris Zbarsky <[hidden email]> wrote:

> On 11/8/11 6:56 PM, Robert O'Callahan wrote:
>
>> At least for problems that boil down
>> to "page runs long-running JS script without yielding".
>>
>
> This one is perhaps sorta-solvable.  In particular, we could do exactly
> what we do for our slow script dialog right now but slightly better: off an
> operation callback block all event delivery to the page, spin up a nested
> event loop, process events, then return control to the page. Unless the tab
> gets closed, in which case we don't return.  We already support stopping a
> script from the operation callback.
>

> The hard part above is "block all event delivery to the page".
>

Effectively we'd be trying to emulate multiple threads/processes by using a
single thread with some cooperative context switching and a lot of magic to
avoid reentrancy. That doesn't sound like a workable short-term solution to
me.

Rob
Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Dietrich Ayala
In reply to this post by Doug Turner-4
On Tue, Nov 8, 2011 at 8:22 PM, Doug Turner <[hidden email]> wrote:
> Using SQLite, or more likely its usage, is a major problem for the mozilla platform, and it is a source of huge problems for mobile.

Yes, our non-performant usage of SQLite is most often the culprit of
our storage-related problems. However, Brendan had thoughts on why
SQLite's mutex dependencies were a poor design in another thread, so
maybe he'll say more here on why using it is inherently bad.

> How do we start evaluating alternatives for all of these databases?  Are you driving that?

I'm not driving that, nor do I think it's something one person should
drive. Usage of SQLite spans from network code all the way up to the
glass in content preferences.

Each SQLite consumer should be evaluating whether it is too big a
hammer for their needs, and looking for main-thread-IO blocking going
on due to their usage of it. See bug 699820.

LevelDB support in the Moz platform is happening in bug 679852, so
maybe that's a solution for current SQLite consumers that can be
implemented on a key/value store alone. But I haven't seen much data
on how it would address specific problems that we're having.

Like we talked about in IRC, I think it would be worth having someone
(perf team maybe?) build a matrix of storage options in our platform,
their pros/cons, recommended best-practices, etc. That would provide
something for feature-owners to look at when evaluating the best
storage approach for their needs. Maybe we should do reviews like we
do with Security team :)

>
> Doug
>
> On Nov 8, 2011, at 8:12 PM, Dietrich Ayala wrote:
>
>> For all ~10 SQLite databases used, we should evaluate whether SQL is
>> the right tool, Places included. SQLite is certainly not required for
>> all Places features, and we absolutely should look at alternatives
>> including hybrid storage solutions, or switching away from SQLite
>> altogether if its specific performance challenges are insurmountable.
>>
>> But storage is only one part of the story - switching persistent
>> storage solutions is not a panacea for broader architectural problems.
>>
>> There are various scenarios in which chrome code can block the UI for
>> unreasonable periods of time. For instance, I recently found a
>> Facebook page that results in the session-restore code blocking the UI
>> while serializing session history for subframes. Combinations of
>> user-data and web content could be causing chrome side-effects like
>> this in the wild on a regular basis, resulting in worse problems than
>> any of our known storage-related problems.
>>
>> Telemetry data is starting to help our visibility here, and we should
>> continue to push on it - implementing broader instrumentation that
>> tells us exactly what's causing long event-loop lag, etc.
>>
>> On Tue, Nov 8, 2011 at 7:56 PM, Doug Turner <[hidden email]> wrote:
>>>
>>> On Nov 8, 2011, at 7:41 PM, Shawn Wilsher wrote:
>>>
>>>> Places is one of the few places where we need to use a SQL database unless we plan on dropping a bunch of features.  Is this remark about SQLite usage in general?
>>>
>>>
>>> OOC, what features would we have to drop if we moved away from SQL?  I spent a few weeks in the code and didn't see anything directly tied to SQL that couldn't be replaced by a different data store, but clearly I didn't mess with the entire Places schema…
>>>
>>>
>
>

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Boris Zbarsky
In reply to this post by Boris Zbarsky
On 11/8/11 11:40 PM, Robert O'Callahan wrote:
> Effectively we'd be trying to emulate multiple threads/processes by using a
> single thread with some cooperative context switching and a lot of magic to
> avoid reentrancy.

Yep.

Note that we need to have most of said magic to properly do sync XHR, by
the way.....

> That doesn't sound like a workable short-term solution to me.

I guess that depends on our definitions of term durations.   I can see
this maybe being doable in 6-9 months if we try.  Maybe.

-Boris


Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Dave Townsend
In reply to this post by Doug Turner-4
On 11/8/2011 8:40 PM, Dietrich Ayala wrote:

> On Tue, Nov 8, 2011 at 8:22 PM, Doug Turner<[hidden email]>  wrote:
>> Using SQLite, or more likely its usage, is a major problem for the mozilla platform, and it is a source of huge problems for mobile.
>
> Yes, our non-performant usage of SQLite is most often the culprit of
> our storage-related problems. However, Brendan had thoughts on why
> SQLite's mutex dependencies were a poor design in another thread, so
> maybe he'll say more here on why using it is inherently bad.
>
>> How do we start evaluating alternatives for all of these databases?  Are you driving that?
>
> I'm not driving that, nor do I think it's something one person should
> drive. Usage of SQLite spans from network code all the way up to the
> glass in content preferences.
>
> Each SQLite consumer should be evaluating whether it is too big a
> hammer for their needs, and looking for main-thread-IO blocking going
> on due to their usage of it. See bug 699820.

I agree that no one person should be evaluating whether each SQLite user
is using it for the right reasons. However, I think it would be
fantastically useful for one or two people to put together a list of
good storage mechanisms, along with their performance and memory
characteristics, to help those people at least narrow down which options
they should be looking into as alternatives.

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Marco Bonardo-2
In reply to this post by Dietrich Ayala
On 09/11/2011 05:22, Doug Turner wrote:
> Using SQLite, or more likely its usage, is a major problem for the mozilla platform, and it is a source of huge problems for mobile.  See https://bugzilla.mozilla.org/show_bug.cgi?id=696141#c7

The fact that we were unable to use SQLite correctly doesn't by itself
make it a bad choice for everything. The problems are:
- We decided to use SQLite where we should not have, that is, in cases
where there is no need to run complicated queries. See for example the
searchService, where a JSON file would have been more than enough, or
DOMStorage, where a simple hash database like LevelDB would have been
much better. There are more of these.
- Where we used SQLite, we mostly did it wrong. Whoever built Places
initially had no idea what a database is, and we are still fighting those
bad decisions. There are other examples in the codebase where we may use
it better, though. So surely we have issues that are not directly due to
the chosen datastore.
- SQLite has some issues with slop memory; we identified most of them in
bug 699708, and the SQLite team is evaluating solutions.
- Some default settings we use are particularly bad; for example, see bug
692487, which reduces the cache size.

> How do we start evaluating alternatives for all of these databases?  Are you driving that?

Bug 699820 links to some of the services using Storage on the main thread;
these may be a good starting point for evaluating alternatives. For sure, a
lot of consumers don't need a database: any consumer that doesn't need
to query more than one field at a time, or that just has to read all
entries, is a bad database consumer.
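
As a rough illustration of that last point, here is a minimal sketch in Python (not Mozilla platform code) of what a consumer that only looks up one key at a time, or reads everything at once, could use instead of a database. The class name JsonKeyValueStore and the file layout are hypothetical, and it assumes the data set is small enough to hold in memory.

```python
import json
import os
import tempfile


class JsonKeyValueStore:
    """Hypothetical whole-file key/value store for small data sets."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        # Read everything once; there is no query engine to go through.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.data = json.load(f)

    def get(self, key, default=None):
        return self.data.get(key, default)

    def set(self, key, value):
        self.data[key] = value

    def flush(self):
        # Write to a temporary file in the same directory, then rename it
        # over the target, so a crash never leaves a half-written file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        os.replace(tmp_path, self.path)
```

The trade-off is that every flush rewrites the whole file, so this only makes sense for small data that changes rarely. The flush itself is still blocking I/O and would need to happen off the main thread.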

-m

Re: Summary: Re: E10s Planning, Plugins, and Responsiveness Meeting - Friday, Nov 4, 10AM PDT

Marco Bonardo-2
In reply to this post by Shawn Wilsher-2
On 09/11/2011 04:56, Doug Turner wrote:
> OOC, what features would we have to drop if we moved away from SQL?  I spent a few weeks in the code and didn't see anything directly tied to SQL that couldn't be replaced by a different data store, but clearly I didn't mess with the entire Places schema…

I don't think this discussion of "unimplementable features" brings
anything useful. I can implement all the features you want with a text
file, but clearly they would have performance and functionality limits.
In my opinion Places can't move away from a database, unless you want to
fight worse performance issues or provide a really feature-limited
solution with good performance. And we already plan to drop some
features and data to arrive at a smaller and more efficient datastore.
Surely we should be open to alternatives, but so far nobody has provided
a decent one.
Personally, I think we should start converting all those Storage users
that don't need a SQLite db, and make the ones that do need it as slick
as possible and, obviously, async. We have 12 databases in the profile
folder; at first look I see only 2 or 3 that deserve one (that said, I
don't have deep knowledge of the needs of each single module).
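
To make "async" concrete, here is a minimal sketch in Python (not the mozStorage API) of keeping SQLite work off the calling thread. The names query_async and _run_query are hypothetical. It assumes a single dedicated storage thread, with each task opening its own connection.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# One dedicated worker so storage requests run in order, off the caller's thread.
_storage_thread = ThreadPoolExecutor(max_workers=1)


def _run_query(db_path, sql, params):
    # Runs on the storage thread; the connection never crosses threads.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(sql, params).fetchall()
        conn.commit()
        return rows
    finally:
        conn.close()


def query_async(db_path, sql, params=()):
    """Queue a statement and return a Future immediately, without blocking."""
    return _storage_thread.submit(_run_query, db_path, sql, params)
```

The caller attaches a callback with add_done_callback() on the returned Future, or collects the result later, instead of waiting on disk I/O in the thread that drives the UI.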
-m