Page MenuHomePhabricator

Safari crashes on some click handling events for banners that move the centralNotice element outside of the the main body
Open, MediumPublicBUG REPORT

Description

Users have reported "crashes" on mobile Safari when trying to close the banners of the German "Thank you" fundraising campaign (See T242589: Browser crashes when closing banner).

"Crash" means that mobile Safari becomes unresponsive when scrolling up: sometimes scrolling does not work at all, sometimes it "stutters" during scroll. When it has "crashed" in such a way, mobile Safari does not render new elements that were outside the rendered viewport. Using a cable connection and connecting the Safari web inspector on Desktop to an iPhone shows that Mobile Safari uses 100% CPU (and the web inspector becomes unresponsive).

I've investigated this issue and it boils down to three factors:

  • This issue occurs only with the MinervaNeue skin, both on the mobile version of Wikipedia (unsing the m. subdomain) and with the Desktop version (using the &useskin=minerva parameter). Showing the same banner with the Vector skin does not lead to a crash when clicking the button.
  • The Fundraising Ops Team requires us to position the banner at the top of the page (above the header element that contains the search/menu bar). The easiest way to accomplish this is to move the centralNotice element to be the first child of the body element. When we leave the centralNotice element in place (or at least inside the main element), Mobile Safari does not crash when clicking the close button.
  • The issue only occurs on the "close" event handler (on the click event), not on the "expand" handler (another click event) handler or the form validation event handlers (also click events). Why the other handlers are unaffected is not clear to me. I've tried changing the class name of the element but that had no effect. Maybe the issue is related to an element's position in the DOM.

Since I'm not familar with the MobileFrontend/MinervaNeue code, I'm hoping someone can chime in and shed some light on this issue:

  • Do you have ideas what could trigger this crash?
  • Is there anything we can do differently (except not moving around the DOM node)?
  • Was there any code deployed recently that could have created this behavior? We have not gotten complaints during the fundraising campaign (Nov 11 - Dec 25), so the error could be a recent phenomenon.

You can try for yourself with these test banners:

The source code for the two test banners is tracked at https://github.com/wmde/fundraising-banners/tree/thankyou-fix

A workaround for the "banners must be positioned at the top" requirement would be to measure the height of the top bar and subtract it from our absolute/relative positioning of the banner, but I'd rather not do that, it feels hacky and might get into trouble with the width of the when it's no longer determined by the viewport.

Event Timeline

Restricted Application added subscribers: Masumrezarock100, Aklapper. · View Herald TranscriptJan 15 2020, 5:53 PM
Jseddon added a subscriber: Jseddon.

I'm sorry I did not express myself more clearly: With this ticket I intended to ask someone from the Readers Web Team for support on this issue, since the banner is affected by the JavaScript in the MinervaNeue skin (or the MobileFrontend extension). I don't know enough about those code bases to investigate further and was hoping that someone who is more knowledgeable have an idea what might cause this. @Jdlrobson could you or someone from your team have a look?

Jdlrobson added a subscriber: ovasileva.

I've only taken a quick glance at https://github.com/wmde/fundraising-banners/blob/thankyou-fix/thank_you/src/show_banner.js but the lack of throttling here seems a little suspect:

window.addEventListener( 'resize', resizeHandler );

Can you see if you can replicate the same issue on the Vector skin on the same mobile device ? e.g. https://de.m.wikipedia.org/wiki/Wikipedia:Hauptseite?banner=B19_WMDE_Mobile_Thankyou_Working&uselang=en&force=1&useskin=vector -

@gabriel-wmde can you link us to the exact source code of those banners on wiki? It's probably useful to see that in addition to source code.

I'm ccing @ovasileva for prioritisation for someone in my team to look at more closely (And determine the underlying issue). An iOS device will be needed to replicate this issue. Performance team would also be good to consult around the crashes. They may have encountered this before.

Jdlrobson moved this task from Incoming to Needs Prioritization on the Readers-Web-Backlog board.

I've only taken a quick glance at https://github.com/wmde/fundraising-banners/blob/thankyou-fix/thank_you/src/show_banner.js but the lack of throttling here seems a little suspect:

window.addEventListener( 'resize', resizeHandler );

That's why I've left out the show_banner function in the examples and I'm not doing anything with resize handlers. So I can confirm that the cause is not my resize handler. But thanks for the tip, I'll incorporate throttling when I reinstate show_banner.

Can you see if you can replicate the same issue on the Vector skin on the same mobile device ? e.g. https://de.m.wikipedia.org/wiki/Wikipedia:Hauptseite?banner=B19_WMDE_Mobile_Thankyou_Working&uselang=en&force=1&useskin=vector -

No, on Vector the close button works fine, that's why I initially tagged this issue with "MinervaNeue".

@gabriel-wmde can you link us to the exact source code of those banners on wiki? It's probably useful to see that in addition to source code.

The code for the broken thank-you banner is at https://github.com/wmde/fundraising-banners/blob/thankyou-fix/thank_you/banner_ctrl.js
The code for the working thank-you banner is at https://github.com/wmde/fundraising-banners/blob/thankyou-fix/thank_you/banner_var.js
The only difference between them code-wise is the positioning of the centralNotice in the DOM tree.
The close event handler that gets triggered in both banners (identical code) is at https://github.com/wmde/fundraising-banners/blob/thankyou-fix/thank_you/banners/Banner.jsx#L23-L31
The code for an old fundraising banner , that uses more vanilla JS but has the same issue is at https://github.com/wmde/fundraising-banners/blob/thankyou-fix/mobile/banner_var.js#L241-L253

Gilles moved this task from Inbox to Radar on the Performance-Team board.Jan 21 2020, 9:27 PM
Gilles edited projects, added Performance-Team (Radar); removed Performance-Team.
Krinkle removed a subscriber: Krinkle.Feb 4 2020, 6:04 PM
ovasileva triaged this task as Medium priority.Feb 17 2020, 1:59 PM
Jdlrobson removed ovasileva as the assignee of this task.EditedFeb 19 2020, 11:38 PM

Assuming this is waiting analysis from a developer in our team with access to an iOS device so unassigning Olga.

The fact that the Safari developer tools become unresponsive during this error makes it very difficult to debug.
I've stepped through the whole code execution path (twice, here are the videos to prove it :P ) and all I could learn from that is that the devtools start being unresponsive during a pretty benign call to $('#centralNotice").hide() from mw.centralNotice.customHideBanner( 'close', 1814400 ); in Banner.jsx. Envoking customHideBanner from the console produces no such error though, so I honestly don't know what could be the cause of this issue here. Also I don't know of any structural changes or refactors in MobileFrontend/Minerva within the past couple of months that could have affected this 🤷‍♂️.

Wait.. is this banner code pulling down the entire Preact library?

Wait.. is this banner code pulling down the entire Preact library?

To add context if you are pulling Preact in this banner then that's very significant and for me at least greatly increases the scope for debugging. If you can replicate the banner issue without Preact or at least with Preact shipped inside ResourceLoader that would help me.

Is there any reason not to use Vue.js here given it's now in core?

Marking as tracking until we have some more context regarding the nature of these banners.

As mentioned in the task description, the behavior can also be observed in banners that don't use Preact. I've created a minimal banner that does not use Preact and is not compiled with webpack and you can still observe the behavior: https://de.m.wikipedia.org/wiki/Wikipedia:Hauptseite?banner=gbirke_T242895&uselang=en&force=1 You can edit it on CentralNotice

I can now track back the crash behavior to moving the centralNotice element from the main content area to the "top" of the body area:

$( 'body' ).prepend( $( '#centralNotice' ) );

If you comment out this line, the close button of the banner will no longer crash mobile Safari.

Fortunately, the line does not seem to be necessary anymore. I guess the line was necessary at some point, because the header-container element and/or its contents had the CSS formatting position: absolute or position:fixed. Do you know if/when they became static in the last 3-5 years?

So - as long as we can rely on the header area being not positioned absolute/fixed, we can drop that DOM element move and avoid the crash. I'm still curious why the move causes the crash. Probably some events creating an endless loop, but which events?


Regarding Preact usage: When we developed the banners, Vue in core was not on the horizon. We specifically chose Preact because it had the smallest footprint (3kB gzipped) of all the JS frameworks we evaluated (about 6 of them, some small ones were not suitable for us) while still giving us a many benefits in terms of architecture, separation of concerns and reusability of banner code across all the 16+ banners we develop and test during the fundraising campaign. According to the webpack build analysis Preact is 4.6% (or 9kb unzipped) of the overall banner code, that's smaller than the Handlebars templating library we were using for non-preact banners. We just finished re-engineering all our banner code from JavaScript spaghetti to nice Preact components, which was a major undertaking. Porting the components to Vue would be a slightly smaller effort since the component hierarchies would stay the same, but the potential banner code size reduction does not justify the development effort that would still be necessary: Vue and Preact have different philosophies regarding event and state handling and would need to be adapted, and we'd have to figure out how to integrate Vue as an external dependency into our webpack workflow. In the long run, we'll use Vue, since we like it better and it might bring banner size reductions. But probably not this year.

As mentioned in the task description, the behavior can also be observed in banners that don't use Preact. I've created a minimal banner that does not use Preact and is not compiled with webpack and you can still observe the behavior: https://de.m.wikipedia.org/wiki/Wikipedia:Hauptseite?banner=gbirke_T242895&uselang=en&force=1 You can edit it on CentralNotice

I don't see that mentioned explicitly in the current task description and I'm not too familiar with how CentralNotice works to understand the banner internals. Thanks for the test URI and confirming that Preact is not at fault here.

I can now track back the crash behavior to moving the centralNotice element from the main content area to the "top" of the body area:

$( 'body' ).prepend( $( '#centralNotice' ) );
``
If you comment out this line, the close button of the banner will no longer crash mobile Safari.

The banners in Minerva are meant to be underneath the header so the requirement from "The Fundraising Ops Team requires us to position the banner at the top of the page" does come with a big trade off as your fighting a system that was trying to prevent that from happening. Fundraising at WMF may have some input here on best practices around this kind of banner.

So - as long as we can rely on the header area being not positioned absolute/fixed, we can drop that DOM element move and avoid the crash. I'm still curious why the move causes the crash. Probably some events creating an endless loop, but which events?

I'd recommend profiling that repositioning of the DOM node in a browser where it doesn't crash. My guess (which cannot confirm due to the crash) is that repaint and reflow cause issues on iOS. This is sadly not my expertise and I think the Performance-Team would have more expertise here to talk to about this rather than reading web.

Without diving too much into this I'd suggest exploring wrapping that call in a setTimeout to make it asynchronous, moving the element before rendering it, or avoiding the move altogether (possibly you can do this given you are using position: fixed?). Outside those suggestions I'd suggest talking to the performance team.

Fortunately, the line does not seem to be necessary anymore. I guess the line was necessary at some point, because the header-container element and/or its contents had the CSS formatting position: absolute or position:fixed. Do you know if/when they became static in the last 3-5 years?

Not really sure. The banner container did move below the header to avoid banners appearing above the header (for performance and UX reasons). We also made some changes to the main menu to allow it to work without JS. I suspect it was more relevant then?

Regarding Preact usage: When we developed the banners, Vue in core was not on the horizon. We specifically chose Preact because it had the smallest footprint (3kB gzipped) of all the JS frameworks we evaluated (about 6 of them, some small ones were not suitable for us) while still giving us a many benefits in terms of architecture, separation of concerns and reusability of banner code across all the 16+ banners we develop and test during the fundraising campaign.

Not judging in any way. I am sure you had your reasons to use it, however it's not been tested in the mobile stack by us so just wanted to call out that using it carries considerable risk and (as you are clearly aware) a small performance penalty. From a personal POV I'd be curious to whether the additional payload has any impact on interactions with the banner or causes any bugs during the campaign to inform any future ones.