Page MenuHomePhabricator

Deal with minified scripts in JS error logging
Open, NormalPublic

Description

For the overwhelming majority of users, scripts are minified, rendering error locations mostly useless. Investigate the use of source maps to retrieve the original locations.

  • make the minifier emit source maps (or consider replacing it with another minifier, which might be the easier way of doing this - several minifiers, e.g. UglifyJS, support source map generation)
  • add source map support to ResourceLoader (bug 45514) - note that RL wraps scripts with some custom JS, so source maps need to be adjusted

This might well turn out to be outside the expertise or resource limitations of the Multimedia team, in which case we should try to find an owner for this task. If that doesn't work, consider a fallback solution:

  • use the wrapper described in T513 do identify the file which the error originates from
  • use line/column information to extract a short javascript snippet from the location of the error

Event Timeline

Tgr created this task.Oct 1 2014, 9:51 AM
Tgr claimed this task.
Tgr raised the priority of this task from to Normal.
Tgr updated the task description. (Show Details)
Tgr changed Security from none to None.
Tgr added a subscriber: Tgr.
Tgr updated the task description. (Show Details)Oct 1 2014, 9:59 AM
Tgr removed a subscriber: Unknown Object (MLST).
Tgr renamed this task from Deal with minified scripts to Deal with minified scripts in JS error logging.Oct 1 2014, 11:40 AM
Krinkle added a subscriber: Krinkle.Jan 9 2015, 2:54 AM

Based on developing and debugging MediaWiki since the introduction of ResourceLoader in 2010 (and with it, minified scripts) and web development in general: I never got any value out of having file names and line numbers from javascript errors. Especially when dealing with e.g. jquery.js (which is build from many separate files), and yet it's never been an issue.

File names and line numbers are present in debug mode, but aside from someone reporting an error using debug mode to me, I never use debug mode.

The stack trace is all that matters. We don't mangle identifiers and as such the error message and method name are more then enough to identify which project it comes from, and find the relevant source code in no-time.

E.g. from an error by mw.Foo#bar, it's fast enough to open "foo.js" (quick file openers in modern text editor index files recursively, no need to know the directory structure), jump to method #bar, and see the surrounding code in front of me. Compared to copying and pasting a file path from the exception, and scrolling or jumping to the reported line number.

When we get rid of debug mode as we know it (T85805), Source Maps is a priority, but it might not make the first iteration. As counter-intuitive as it may sound from a historical standpoint, I wouldn't recommend error-gathering projects to prioritise line numbers and source file paths.

Tgr added a comment.Jan 9 2015, 4:15 AM

Personally I find it a lot easier to reproduce errors in debug mode, and use the developer toolbar to follow the control flow through the source files. I got in the habit of using debug mode all the time because of that (and also because it's a bit faster). Admittedly, if I used my text editor instead of the browser's developer toolbar as the primary debugging tool, I would probably fine file names less useful.

YMMV aside, I think there are a few use cases of filenames / line numbers in an error tracker:

  • with certain coding styles, stack traces with only function names are not as helpful. E.g. $element.on('click', function() {...}) will give a stack trace with <anonymous> plus a bunch of jQuery internals.
  • Sentry uses the stack trace to dedupe bugs. Function names are not always enough to do that, given that Javascript uses anonymous functions and non-unique function names. With column numbers, minified code is just as fine for deduping as unminified code, but not all browsers support column numbers.
  • Sentry can show code context to speed up debugging. If you go to http://sentry-beta.wmflabs.org/jserrors/jserrors/group/15/ and click on one of the code lines, it will expand to show a small excerpt from the source file. That is very helpful to get a quick overview of the issue; even more so if one is not that familiar with the code.
  • file names can be used to identify the module which caused the error, which can be used to categorize it or alert the right group of people.
  • file names name + line number can be used to open your IDE at the right location with a single click (like e.g. xdebug.file_link_format does). Not a huge deal but a nice convenience.
  • without source maps, all bugs will be duplicated since people will trigger it in both normal and debug mode and Sentry has no way of connecting those.

All that said, I agree this is not a blocker, and stack traces will be quite useful even without source maps. There are certainly a lot of other areas of the error logging roadmap with comparable impact. And, as mentioned in the bug description, for some of the use cases, the lack of source maps could be worked around.

Tgr removed Tgr as the assignee of this task.Jan 30 2015, 2:38 AM

Large web apps in JavaScript don't allow for an interconnected IDE. True debugging in an editor therefore doesn't exist. One can tie an editor to V8's remote debugging and tie that to an editor for displaying code. But it wouldn't map to the file system.

After capturing a bug in the remote environment it was reported in, I quickly move to the local environment to reproduce the bug again. There, everything maps to the file system. At which point debugging and fixing the issue happen in editor.

Once we have Source Maps, Developer Tools' ability to imitate a file system in the Source tab will come in handy. Displaying paths and files that don't even necessarily have to exist (e.g. like https://gerrit.wikimedia.org/r/#/c/156544/).

Modern javascript engines go far beyond the function name for stack traces. In fact, pure function names wouldn't be that useful. As there is no concept of classes in JavaScript, one would essentially have a method name without its associated class or object (because the function can come from anywhere, detached). Method names alone can be quite ambiguous (unless named in an overly verbose verbose manner). Engines now use assignments to derive names. And lexical scoping to form a breadcrumbs path to a function.

In T520#965197, @Tgr wrote:
  • with certain coding styles, stack traces with only function names are not as helpful. E.g. $element.on('click', function() {...}) will give a stack trace with <anonymous> plus a bunch of jQuery internals.

The only unnamed branch is the bottom level, e.g. (anonymous function) (well, almost bottom level, on top of jQuery). However that branch is also the branch currently shown in the dev tools with all its surrounding (prettified) code. That's better than a function name. I don't need a filename/line number. Whatever code I'd find from disk in an editor, is right there in the browser when looking at the exception. Even better if you prefer reading code in the browser.

Also, while a lot of our code violates this massively, event handlers should contain very little (if any) logic. They should handle events and nothing else. A good indicator of where this goes wrong is if one finds the need to trigger events in unit tests. (Maintainable JavaScript - Event handlers (YouTube)).

Moving code out of event handlers also moves the stack into meaningful territory. The only case where an event handler is be able to throw is if the event object is missing expected properties, or if the method being delegated to, throws. Both result in good stacks.

  • Sentry uses the stack trace to dedupe bugs. Function names are not always enough to do that, given that Javascript uses anonymous functions and non-unique function names. With column numbers, minified code is just as fine for deduping as unminified code, but not all browsers support column numbers.

While individual function names aren't always unique (or have names at all). Modern strack traces are quite good. Arguably the breadcrumbs path is better than a mere method/class label or a verbose function name. It reflects the context and location regardless of a pass-by-value or other detachment. More descriptive than one could write out ahead of time.

And in my experience, exception messages themselves are also quite unique. And that's without silly prefixes like class, method or module names inside the exception message.

For a later iteration, source maps would do the rest.

  • Sentry can show code context to speed up debugging. If you go to http://sentry-beta.wmflabs.org/jserrors/jserrors/group/15/ and click on one of the code lines, it will expand to show a small excerpt from the source file. That is very helpful to get a quick overview of the issue; even more so if one is not that familiar with the code.

I'm curious how that works. That url shows raw file paths and unminified code. Did those come from the browser? Without debug mode? Does it fetch the context from urls (client-side or server-side)? Or maps to files on disk? Looks quite neat.

  • without source maps, all bugs will be duplicated since people will trigger it in both normal and debug mode and Sentry has no way of connecting those.

I don't think we should log in debug mode.

He7d3r added a subscriber: He7d3r.Apr 17 2015, 7:27 PM
Jdforrester-WMF edited projects, added Sentry; removed Multimedia.May 7 2015, 3:56 PM