
Snappy Feb 9: Blame Canada!

The meeting was short this time because everyone participating from the Toronto office conspired to be busy or on vacation today.

Our UX team helped us decide to turn on tabs-on-demand + do tab restore by default, bug 711193. This change will make the browser more responsive after startup, help MemShrink, and trigger less captive wifi portal badness.
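To make the tabs-on-demand idea concrete, here is a minimal sketch of the behaviour described above. It is not Firefox's session restore code; `Tab`, `Session` and their methods are hypothetical names made up for illustration, and "loading" is reduced to flipping a flag.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tab:
    url: str
    loaded: bool = False  # True once the page has actually been fetched


@dataclass
class Session:
    tabs: list[Tab] = field(default_factory=list)
    selected: int = 0

    def restore(self, urls: list[str], selected: int = 0) -> None:
        """Recreate every tab, but load only the one the user is looking at."""
        self.tabs = [Tab(url) for url in urls]
        self.selected = selected
        if self.tabs:
            self._load(selected)

    def select(self, index: int) -> None:
        """Switching to a tab is what triggers its load (on demand)."""
        self.selected = index
        self._load(index)

    def _load(self, index: int) -> None:
        tab = self.tabs[index]
        if not tab.loaded:
            # Stand-in for the real network fetch and page layout.
            tab.loaded = True

    def loaded_count(self) -> int:
        return sum(tab.loaded for tab in self.tabs)
```

Restoring a large session this way costs one page load at startup instead of one per tab, which is where the responsiveness and memory wins would come from.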

Frontend people are busy adding telemetry to everything that matters, bug 671038. Some of this has already paid off: we caught a tab animation regression in bug 724349. We plan to switch our awesomebar searching from SQL to an FTS. If you are a text-search/tokenizer expert, perhaps you can help us with bug 725821. There is also a lot of activity on making various sync IO things async; see the meeting notes for complete details.

Brian Bondy posted an update documenting his two-week rampage through Firefox startup inefficiencies on Windows. Brian’s blog post contains tips on xperf, Firefox profiler, about:startup – read it.

The networking team is busy nuking the big cache lock, see bug 717761.

Olli has landed most of the cycle collector fixes. Telemetry shows a dramatic reduction in cycle collection times for Firefox 13. He and Andrew are investigating the remaining causes of long CC times.

I’ll end this post with a pretty picture demonstrating recent cycle collection improvements.


* y: frequency, x: milliseconds

 

We cancelled last week’s snappy meeting due to Perf/Snappy workweek + FOSDEM. See Jared’s post for a summary of the workweek; I’ll mention the rest below.

We figured out a strategy for avoiding blocking DOM Storage IO: use a scriptblocker to asynchronously preload the relevant DOM storage, then do asynchronous writeback to commit changes. We also have a plan for cancellable SQL queries, bug 722243.
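Here is a rough sketch of the preload + writeback pattern described above, not the actual DOM Storage implementation. `AsyncStorageCache` and its methods are hypothetical names, and a plain dict stands in for the on-disk database. Reads wait only if the preload has not finished yet, and writes return immediately while a single IO thread commits them.

```python
from __future__ import annotations

import threading
from concurrent.futures import ThreadPoolExecutor


class AsyncStorageCache:
    """In-memory front for a slow key/value store (a dict stands in for disk)."""

    def __init__(self, disk: dict[str, str]) -> None:
        self._disk = disk
        self._io = ThreadPoolExecutor(max_workers=1)  # single IO thread
        self._cache: dict[str, str] = {}
        self._dirty: dict[str, str] = {}
        self._lock = threading.Lock()
        self._preload = None
        self._loaded = False

    def preload(self) -> None:
        """Start reading from disk early, before any script asks for data."""
        if self._preload is None:
            self._preload = self._io.submit(lambda: dict(self._disk))

    def _ensure_loaded(self) -> None:
        if self._loaded:
            return
        if self._preload is None:
            self.preload()  # fallback: nobody preloaded, so we pay the wait here
        self._cache = self._preload.result()
        self._loaded = True

    def get(self, key: str) -> str | None:
        self._ensure_loaded()
        return self._cache.get(key)

    def set(self, key: str, value: str) -> None:
        self._ensure_loaded()
        self._cache[key] = value
        with self._lock:
            self._dirty[key] = value
        self._io.submit(self._flush)  # commit later, off the calling thread

    def _flush(self) -> None:
        with self._lock:
            pending, self._dirty = self._dirty, {}
        self._disk.update(pending)

    def close(self) -> None:
        self._io.shutdown(wait=True)  # waits for queued writebacks
```

The single worker thread keeps reads and writebacks ordered, so the sketch needs only one small lock around the dirty set.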

SetTimeouts/30s telemetry landed in bug 715953, and I attached the results to bug 715376. Persistent telemetry was backed out while Nathan investigates problems, bug 707320.

Brian Bondy has been fixing our usage of Windows APIs:

  • bug 722225 - Firefox startup opt on Windows by optimizing D3D10CreateDevice1 (pending review)
  • bug 722315 - Firefox startup opt on Windows by lazy loading CLSID_DragDropHelper (landed)
  • bug 692255 - Find a way to get rid of prefetch files on Windows for faster startup (pending review)

We spent the weekend at FOSDEM. I re-presented my Plumbers talk on why Linux sucks for starting big apps. I also did a Telemetry talk. The audience was great.

Help Wanted

To my great regret, I forgot to mention in my talks that Mozilla is hiring. In particular, I’m looking for more performance hackers. If you enjoy spending quality time with stack traces, writing profilers or analyzing performance logs, leave a comment or send me an email. Compiler toolchain and/or kernel hacking experience would be a great bonus.

At BetaGroup in Brussels

As the outside temperature reached -10C, the DIY heating system at hackerspace.be proved insufficient. On Wednesday the Perf/Snappy workweek was relocated to BetaGroup Coworking Brussels. We’ll be here until FOSDEM.

BetaGroup Coworking Brussels is an industrial-strength coworking space with lots of desk space, internet, kitchen, ping-pong and a bunch of heavy metal…

fast DXR

DXR is now uber-fast. See bottom of search pages for timing info. Give it a try. We have a few more bugs to fix before we jump into fixing the UI.

"Lubricants"

We just started our Perf/Snappy workweek in Brussels, Belgium. hackerspace.be let us use their space. If you are also a performance hacker in the area, why not drop in for some beers?

See Dietrich’s post for more details.

Snappy, January 26

Slow Sessions – Tabs-on-Demand

Armed to the teeth with about:jank, I was testing session restore scenarios that people reported. While at it I came up with a testcase for bug 711193. At first we were going to use telemetry to debate the merits of tabs on demand by default, but I feel my example illustrates responsiveness problems with session-restore well enough. Gavin is looking into this so we can make a decision this week.

Laggy Sessions

On my machine about:jank indicated that most lag was caused by our direct2d accelerated drawing code, bug 721273. Turning off graphics acceleration (Options/Advanced/use hardware acceleration) made things a lot less slow. It would be nice if people experiencing lots of lag in their sessions (on YouTube, blogs with high quality backgrounds, etc) could try about:jank. This requires running a very recent nightly.

Install the extension, go to about:jank, browse around, then refresh about:jank. In the case of gfx lag, DrawThebesLayers shows up on top.

Imminent Cycle Collector + GC Improvements

Olli is landing huge cycle collector improvements (half of the patches landed so far), bug 705582, bug 717500. If that doesn’t solve all CC problems by Tuesday, Andrew is standing by with bug 710496 to limit how often CC can run. If we are lucky, incremental JS GC will land before Tuesday too (bug 641025). Landing by Tuesday means that these improvements have a good chance of showing up in Firefox 12. CC + GC are the most well-known causes of pauses in Firefox, so this is very exciting.

Other stuff

Profiling tools are moving along at a good clip. Benoit’s profiler works well on Mac now; hopefully Windows support will happen next week. Non-destructive chromehang has almost landed.

Telemetry histograms should now survive restarts (so we can do shutdown telemetry, etc), bug 707320.

Peptest didn’t manage to survive deployment on try due to bugs 719618 and 719511.

We are now transitioning from identifying issues to fixing identified issues. It’s exciting to move from speculation as to what sucks to actual results. For more details see meeting notes.

Meeting notes.

Network Cache Horrors

Last week we discovered that our cache uses main thread locks to successfully block on off-main-thread IO. See Bug 695399 and Bug 717761. QA did an experiment which confirmed that our disk cache is performing poorly.

Flash Lag

We are looking into reports of flash lag, tracking Bug 720000. Initial QA data shows a significant slowdown when a page is first loaded and smaller slowdowns later. There are also long browser pauses when the flash container progress freezes.

Profiling

Vlad continued work on non-destructive chromehang, Bug 712109. Client-side is ready to land and he is wrapping up symbolification for the server-side.

The interactivity profiler is now able to collect stacks on 64-bit MacOS. Benoit is looking for contributors to add Windows and Linux support (Bug 719536). I highly encourage adventurous contributors to help out with that as it involves modifying some concise, straightforward, yet highly ironic JavaScript. We are also looking for help with the profiler UI. If you are a skilled addon/frontend person, see Bug 719530.

Jeff posted an early preview of the about:jank addon. He is also working on measuring painting speed via telemetry. Note this addon is buggy and requires a very recent nightly.

Last week I asked for some laggy session restore profiles. I’m behind on reproducing those (will be done today or next week). I’ve been in email contact with several of the commenters. I really appreciate the data gathered so far.

Snappy UX

Jared landed smooth scrolling, Bug 198964. He is now working on hooking it up to scrolling via scrollbar, Bug 710373. Up next: fixing fallout from turning on smooth scrolling, hooking it up to the refresh driver and tweaking scrolling physics.

Marco landed inline autocomplete, Bug 566489 and is now fixing fallout from that too.

Snappy, Jan 12

The most user-facing fix has been the discovery and removal of some sneaky cache IO on the main thread.

Saptashi did some analysis on the impact of running sqlite in async mode on mobile. Turns out it’s only a win for DELETEs. Expect a blog post from him soon.

Dave discovered that we sometimes wait on locks on the main thread.

Jeff and Bas are looking into diagnosing when d2d causes a slowdown.

There was discussion of a 4x reduction in cycle collection times landing soon, focusing on having the cycle collector run less, etc. Lots of work (chromehang, profiler, …) is continuing from last week.

I have been working under the assumption that the browser gets less snappy as more tabs are opened. More tabs increase the chances of having an ill-behaved website in the background. An ill-behaved tab (or a couple of them) can in theory ruin scrolling, typing, clicking, etc in active tabs. However, I do not have anything beyond anecdotal evidence on this. There are bugs on specific websites in bugzilla, but it would be nice to get them mixed into a realistic set of tabs.

Would someone be willing to contribute a list of webpages they use often that cause Firefox to lag (maybe a session restore file?)? I am a low-tab person myself, so I can’t easily reproduce this. Please make sure that Firefox is slow with your list of tabs even when all addons are disabled, and include a description of the slowness encountered.

Snappy, Jan 5

I expected a slow week, but there was a surprising amount of progress. I take this as further evidence that having managers go on vacation does wonders for engineer productivity :)

Interactivity with lots of tabs

We spent a lot of time pondering how to approach browser sluggishness in light of having tons of tabs open. On one hand, people should understand that one can’t expect the browser to perform the same whether 1 tab is active or infinity. On the other hand, we should do more to a) make the browser punish background tab hogs and b) communicate hogs to the user.

For now we will look at throttling background setTimeouts, XMLHttpRequest loops, etc more aggressively (bugs 715376, 715378, 715380). We also plan to make more use of interactive state so Firefox can suspend non-critical tasks (bug 712478).
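As a rough illustration of what throttling background timers means, here is a sketch that clamps a requested timer delay to a higher floor when the tab is in the background. The two floor values and both function names are assumptions picked for the example, not what the bugs above will implement.

```python
FOREGROUND_MIN_MS = 4     # assumed floor for timers in the active tab
BACKGROUND_MIN_MS = 1000  # assumed floor for timers in background tabs


def effective_delay(requested_ms: float, in_background: bool) -> float:
    """Return the delay a timer actually gets after clamping."""
    floor = BACKGROUND_MIN_MS if in_background else FOREGROUND_MIN_MS
    return max(requested_ms, floor)


def wakeups_per_window(requested_ms: float, in_background: bool,
                       window_ms: int = 30_000) -> int:
    """How often a repeating timer can fire within one window."""
    return int(window_ms // effective_delay(requested_ms, in_background))
```

With these assumed floors, a page that re-arms a 10 ms timer forever drops from thousands of wakeups per 30 seconds to a few dozen once it is in the background.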

Occasionally the cycle collector misbehaves. Andrew will look into running cycle collection less frequently when it is slow: bug 710496. Olli has been fixing many of the cycle-collection extremes. I don’t have bug #s for that, but apparently the improvements are dramatic.
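One way to think about "run CC less often when it is slow" is to cap the share of wall-clock time the collector may use. The sketch below is only that idea written down; `next_cc_delay_ms` and its default values are hypothetical and are not taken from bug 710496.

```python
def next_cc_delay_ms(last_cc_duration_ms: float,
                     base_delay_ms: float = 5_000,
                     max_share: float = 0.1) -> float:
    """Wait long enough that CC uses at most `max_share` of wall-clock time."""
    # A collection of D ms followed by a wait of W ms uses D / (D + W) of the time.
    # Solving D / (D + W) <= max_share for W gives W >= D * (1 - max_share) / max_share.
    needed = last_cc_duration_ms * (1 - max_share) / max_share
    # Fast collections keep the normal schedule; slow ones push the next run out.
    return max(base_delay_ms, needed)
```

A fast collection leaves the schedule alone, while a one-second collection pushes the next run several seconds further out.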

Super-Slow Startups

Thanks to telemetry we now know that some users experience tragic startup speeds ranging from 30 seconds to 34 hours (bug 701872). Our network cache is to blame for some of these (bug 707436). Another theory is that an unfortunate turn of events causes us to start loading webpages before the UI is shown (bug 715402).

Vlad will post some of his analysis and interested people can help us with telemetry forensics.

Profiling

Being able to profile interactivity bugs is an important key to making the browser snappier. Large parts of Benoit’s interactivity profiler have landed (bug 713227). Using this extension on nightly win/mac should give you an idea of what it will look like when completed.

We make heavy use of compiler optimizations. Unfortunately one of them is to omit the frame pointer, which makes it hard for a profiler to walk the stack. Ehsan has set up a developer-friendly profiling branch.

Vlad is making progress on non-destructive chromehang (bug 712109). Traditionally we could not do this, but with a combination of telemetry + cycling Ehsan’s shiny new profiling branch on the nightly channel… we’ll be in developer heaven.

Responsiveness testing

Peptest should be landing on try soon, Aki is wrapping stuff up. This should enable us to catch responsiveness regressions on our infrastructure.

Smooth Scrolling

Jared is almost done fixing tests to land smooth scrolling to gather feedback and move on to fancy physics (bug 710372).

Other ongoing projects with nothing specific to link to: Vlad’s slow-sql telemetry, Rafael’s quest to close sql connections so we can exit(0), QA browser-cache-effectiveness comparisons.
