Page MenuHomePhabricator

Add startup script to automatically wrap asynchronous functions in try..catch
Closed, DeclinedPublic

Description

To implement Javascript error logging, we need to wrap every asynchronous function in a try-catch block. (The alternative would be using window.onerror, which has lots of reliability problems.) To do that, an error logging module needs to be loaded before all normal modules, and it will need to decorate low-level asynchronous functions (setTimeout/setInterval, AJAX, event handling) so they wrap their callbacks in try...catch.

See the JS error logging RfC for more context, and T1345 for the current plans for JS error logging.

Related Objects

StatusSubtypeAssignedTask
OpenNone
DeclinedNone
DeclinedTgr
Resolved jlinehan
DeclinedNone
DeclinedNone
ResolvedTgr
Resolved Gilles
ResolvedTgr
ResolvedTgr
Resolved csteipp
ResolvedTgr
DeclinedTgr
DeclinedTgr
DeclinedTgr
ResolvedTgr
DeclinedTgr
ResolvedTgr
ResolvedTgr
ResolvedTgr
ResolvedTgr
Resolvedjcrespo
ResolvedAklapper
ResolvedTgr
ResolvedTgr

Event Timeline

Tgr claimed this task.
Tgr raised the priority of this task from to High.
Tgr updated the task description. (Show Details)
Tgr set Security to None.

Some historical context: https://gerrit.wikimedia.org/r/#/c/127581/ and https://gerrit.wikimedia.org/r/#/c/145680/

Is there a reason this is being implemented as an extension instead of a core feature?

Is there a reason this is being implemented as an extension instead of a core feature?

There are lots of Javascript error logging solutions (see the bottom of the RfC for a partial list) and the extension will integrate one of them. If you do not limit yourself to open-source software, Sentry might not be the best choice for you, especially if you already integrate with one of the other solutions for different reasons (e.g. NewRelic offers JS error logging and many sites already use it to measure server performance).

And anyway, even if error logging was done in core, it should still hook into ResourceLoader via some extension mechanism, to keep module loading to be the single responsibility of that class.

What was the reason ResourceLoaderGetStartupModules was killed? T65240 is not very verbose on the subject; I agree that adding startup modules almost never makes sense, but an error logger seems to be one of the valid exceptions. The hook documentation says This hook is useful if you want to make code available to module loader scripts. and that's exactly what's needed here.

In T85262#943407, @Tgr wrote:

What was the reason ResourceLoaderGetStartupModules was killed? T65240 is not very verbose on the subject; I agree that adding startup modules almost never makes sense, but an error logger seems to be one of the valid exceptions. The hook documentation says This hook is useful if you want to make code available to module loader scripts. and that's exactly what's needed here.

On rereading, I might have misunderstood or misphrased. What I wanted to say is, IMO there is one valid use case for such a hook, when you want to interact with how other modules are loaded (ie. not the loading of your modules but module loading in general), which is rarely needed but for error logging it is. (That would be T85263 and the library for that would have to be loaded somehow.)

It doesn't have to be loaded before the rest per se. We can, but we don't have to. That's an implementation detail that has nothing to do with the overall project here. For example, another way would be to store data and process it later. (Leaving the data in a known place to access later, similar to how Google Analytics works, the array is converted to an interface once the module loads, async.)

In T85262#949595, @Tgr wrote:
In T85262#943407, @Tgr wrote:

What was the reason ResourceLoaderGetStartupModules was killed? T65240 is not very verbose on the subject; I agree that adding startup modules almost never makes sense, but an error logger seems to be one of the valid exceptions. The hook documentation says This hook is useful if you want to make code available to module loader scripts. and that's exactly what's needed here.

On rereading, I might have misunderstood or misphrased. What I wanted to say is, IMO there is one valid use case for such a hook, when you want to interact with how other modules are loaded (ie. not the loading of your modules but module loading in general), which is rarely needed but for error logging it is. (That would be T85263 and the library for that would have to be loaded somehow.)

Your first message was more accurate. No modules, including startup modules, are to interact with how modules are loaded. That makes for an unpredictable environment. Whatever influences module loading is either shit or belongs in core and should be reviewed carefully by ResourceLoader maintainers. I don't plan on opening that up to being extended by extensions. It's nice to be able to iterate over things in isolation, but some things just belong in core.

That doesn't mean the entire error-logging stack has to be in core. Just the footprint to expose this data. Not opening in a generic way, but in a way related to collecting errors.

This part of the error logging RFC doesn't have as a goal to wrap execution in a try-catch; the goal is to get stack traces. Point being, we're not violating or changing our agreed RFC by doing this differently.

As for the try-catch approach, that doesn't work. JavaScript can detach itself from the root execution stack in many ways. This isn't an edge case. It's the 80% case. Injecting yourself in the module loader will not get you any significant coverage of exceptions.

Also, ResourceLoader already wraps module execution in a try-catch. Right now it logs that to the console. We can easily add an mw.track there for consumption by error-logging software.

The majority of the errors (and code execution in general) has its call stacks originating in user events (e.g. click), browser events (dom-ready), callbacks from http requests (ajax), callbacks from timers (setTimeout), etc.

As for the try-catch approach, that doesn't work. JavaScript can detach itself from the root execution stack in many ways. This isn't an edge case. It's the 80% case. Injecting yourself in the module loader will not get you any significant coverage of exceptions.

The way I would put it is, not injecting in the module loader will open a significant hole in the coverage of exceptions. We don't want to miss errors which happen at module evaluation time, even if it is just a small part of all errors. I'm fine with leaving this to a later iteration so that no time gets wasted if Sentry turns out to be unsuitable for us for other reasons, but it should be covered eventually.

You are absolutely right that most errors will have a timeout or event callback at the top of their call stack, but that's a disjunct issue that has to be handled with different methods. There are two sources of errors:

  1. initial evaluation of module code
  2. event handlers and other top-level callbacks

and there are four methods of catching errors:

  1. wrapping module code in try-catch
  2. wrapping event handlers / other callbacks in try catch, the most practical solution for which is to decorate the functions which set up these callbacks
  3. catching uncaught exceptions via window.onerror.
  4. the module itself doing the error handling.

Module wrapping covers initial errors (in non-debug mode, but I don't think there is anything that could be done about debug mode); event handling decoration covers event handler errors. window.onerror covers everything, but has lots of browser compatibility issues and I would rather not rely on it. In-module error handling can cover anything and has UX advantages so should be the preferred solution, but it's up to the module author so again we can't rely on it. So module wrapping and event handler decoration both need to be done to get full coverage.

Raven.js decorates setTimeout, $.on, $.ready and $.ajax. It does not decorate native event handling like direct calls to addEventListener, but I think that can be ignored; so that takes care of logging errors in event etc. callbacks. (Assuming we are okay with modifying those functions; the related code is here and here.)

It doesn't have to be loaded before the rest per se. We can, but we don't have to. That's an implementation detail that has nothing to do with the overall project here. For example, another way would be to store data and process it later. (Leaving the data in a known place to access later, similar to how Google Analytics works, the array is converted to an interface once the module loads, async.)

That's what I originally planned (and how it was proposed in the RFC and the tasks based on that like T499): add an error logging module to core, that module gets the stack trace and does something configurable with it, one option being to lazy-load a handler module and pass the trace to it. But after looking at the error logging code in raven.js, it's a bit more involved than I have thought, and it does some things that cannot be in a lazy-loaded way; specifically, it rethrows exceptions and re-catches them via window.onerror to get better IE compatibility (the code is in TraceKit.report).

Maybe that's only relevant for IE versions that we don't support anyway; I will look into that. Otherwise, if we can find some way to load the error logging module early without compromising the ResourceLoader architecture, I think it would be worth doing that to get stack traces for a grade A browser.

Another advantage would be the more reliable reporting of errors that happen right before the page is unloaded (e.g. when a click handler on a link throws an error and fails to prevent the default action). Some modern browsers provide navigator.sendBeacon() to reliably log data while the page is being onloaded, but there is no way to make them wait for a module loading first.

The drawback is increased script size, of course, but with raven.js being 15K when minified, I think that's not terrible.

Also, ResourceLoader already wraps module execution in a try-catch. Right now it logs that to the console. We can easily add an mw.track there for consumption by error-logging software.

That would be great. That's more or less what I had in mind with "add some sort of JS hook to the module loading code in mw.loader" in T85263.

[raven.js] does some things that cannot be in a lazy-loaded way; specifically, it rethrows exceptions and re-catches them via window.onerror to get better IE compatibility (the code is in TraceKit.report)

On second thought, it's not necessarily true that lazy-loading would break that. I'll test that once I figure out which browser is affected.

In T85262#960660, @Tgr wrote:

But after looking at the error logging code in raven.js, it's a bit more involved than I have thought, and it does some things that cannot be in a lazy-loaded way; specifically, it rethrows exceptions and re-catches them via window.onerror to get better IE compatibility

I didn't think that one through. Exceptions at module loading time cannot be rethrown, that would break the module loading logic. And for every other kind of error, even if rethrowing (and thus loading raven.js early) is really needed, simply top-loading it is always early enough, I think. So this task can probably be closed as invalid.

Added mw.track calls and reimplemented the relevant part from raven.js in https://gerrit.wikimedia.org/r/188326 and children. Sentry or any other error logger can subscribe via mw.trackSubscribe now.

Tgr renamed this task from Create ResourceLoader hook to add new startup modules to Add startup script to automatically wrap asynchronous functions in try..catch.Mar 4 2015, 6:44 PM
Tgr updated the task description. (Show Details)

Rewrote what this task is about based on the discussion in the comments.

Change 188328 had a related patch set uploaded (by Gergő Tisza):
Wrap asynchronous callbacks with error logging

https://gerrit.wikimedia.org/r/188328

RobLa-WMF subscribed.

The people that have the expertise to review this are in Editing, not in MediaWiki Core, so I've removed "Blocked-on-MediaWiki-Core" from this.

Tgr changed the task status from Open to Stalled.Mar 25 2015, 11:39 PM

Abandoned for now, see T92247#1147500.

Change 188328 abandoned by Gergő Tisza:
Wrap asynchronous callbacks with error logging

Reason:
Abandoning, see T92247#1147500. Might be revisited later.

https://gerrit.wikimedia.org/r/188328

Krinkle lowered the priority of this task from High to Low.Apr 28 2015, 5:37 PM
Krinkle moved this task from Inbox to Backlog on the MediaWiki-ResourceLoader board.
Krinkle removed a subscriber: gerritbot.