
"Script error." Scripts loaded from other domains with empty file_uri and no stack trace should not be included
Closed, ResolvedPublic

Description

The rather unhelpful "Script Error" accounts for a large chunk of the errors recorded as client-side errors. These come from scripts running on a different domain, so they are possibly sourced via global gadgets or loaded across wikis. In the last 12 hrs, excluding 2 known errors which are being resolved, 90% of errors recorded on mediawiki.org and Catalan Wikipedia were reported with this unhelpful, unactionable message.

https://logstash.wikimedia.org/goto/da15ba157025c9c1d8ab0041baff45cf

Event Timeline

Restricted Application added a subscriber: Aklapper.Jul 31 2020, 4:10 PM
Restricted Application added a project: Performance-Team.Jul 31 2020, 4:10 PM
Jdlrobson renamed this task from Gadgets loaded from other domains should have more actionable error messages to ScriptError: Gadgets loaded from other domains should have more actionable error messages.Jul 31 2020, 4:11 PM
Krinkle renamed this task from ScriptError: Gadgets loaded from other domains should have more actionable error messages to "Script error." Gadgets loaded from other domains should have more actionable error messages.Jul 31 2020, 5:24 PM

[…] this could be helped with crossorigin="anonymous". I'm not sure if that is feasible given gadgets likely need credentials, […]

What kind of credentials and how would those be used?

The rather unhelpful "Script Error" accounts for a large chunk of the errors recorded as client side errors. These come from scripts running on a different domain so possibly sourced via global gadgets.

Could you confirm this in any (artificial) scenario? I'm unable to reproduce this.

  1. See https://www.mediawiki.org/w/index.php?title=User:Krinkle/foo.js&action=raw&ctype=text/javascript.
  2. Open https://meta.wikimedia.org/
  3. Run setTimeout( ()=>mw.libs.bar());
  4. Run mw.trackSubscribe('global.error', console.log)

Observe (Firefox stable, macOS):

{
  "errorMessage": "Error: Boo hoo from foo",
  "url": "https://www.mediawiki.org/w/index.php?title=User:Krinkle/foo.js&action=raw&ctype=text/javascript",
  "lineNumber": 2,
  "columnNumber": 8,
  "stackTrace": "at foo https://www.mediawiki.org/w/index.php?title=User:Krinkle/foo.js&action=raw&ctype=text/javascript:2:8\nat bar https://www.mediawiki.org/w/index.php?title=User:Krinkle/foo.js&action=raw&ctype=text/javascript:6:2\nat debugger eval code:1:25\n",
  "errorObject": {}
}

Here are some replication steps:

  1. Run the following code on https://ca.wikipedia.org/w/index.php?title=Diccionari_descriptiu_de_la_llengua_catalana&action=edit
window.onerror = function ( e ) { console.log( e ); };
mw.loader.getScript( 'https://commons.wikimedia.org/w/index.php?title=MediaWiki%3AGadget-Cat-a-lot.js%2Fca&action=raw&ctype=text%2Fjavascript&maxage=2419200&smaxage=2419200' );
  2. In the JS console you'll see Uncaught ReferenceError: catALot is not defined and "Script error."
  3. The event is sent to https://intake-logging.wikimedia.org/v1/events?hasty=true with "Script error."

What kind of credentials and how would those be used?

I am not familiar with the crossorigin=anonymous policy and how this works. I was thinking specifically if a script on meta.wiki uses the api to save content whether that would work or not.

I see. The lack of stack trace appears to be limited to specific scenarios where 1) the error is implicit and native e.g. ReferenceError, not so when explicitly throwing error, and 2) the error happened during the initial execution of a cross-origin script, and 3) the other origin is outside the same registrable ETLD domain (e.g. within *.wikipedia.org it might be fine, unconfirmed).

Meaning, that if the script produces an error from setTimeout, or event handlers, or mw.hook, or $() etc, then the trace already works fine. The trace also already works fine if the error happens anywhere inside a function that is later called (e.g. a function that is imported, required, or globally referenced and then called locally).

That particular subset of cross-origin errors is protected for security reasons as the stack would reveal the contents of the top-level script which would not be readable or exposed by any other means. Effectively the same reason you can't fetch random cross-origin URLs and read the response as text. The contents of functions can be read via fn.toString(), and most DOM APIs etc could be hijacked to capture such traces, but the contents of the outer/init scope are considered private.
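To illustrate the scenarios described above, here is a minimal sketch of how a handler might tell a muted report apart from a regular one, based on the arguments browsers pass to window.onerror. This is not code from MediaWiki: `describeErrorReport` is a hypothetical helper, and it assumes that a muted report carries the generic "Script error." message and no error object.

```javascript
// Hypothetical helper (not part of MediaWiki): classify the arguments that
// a browser passes to window.onerror( message, source, lineno, colno, error ).
function describeErrorReport( message, source, lineno, colno, error ) {
	// Assumption: muted cross-origin errors use the generic message below
	// instead of the real one, and come without an error object.
	var generic = /^script error\.?$/i.test( String( message ).trim() );
	return {
		// True when the browser appears to have hidden the details.
		muted: generic && !error,
		// True when a source URL and line number are available for debugging.
		hasLocation: Boolean( source ) && Number( lineno ) > 0
	};
}

// Usage in a browser (left commented out so the sketch also runs outside one):
// window.onerror = function ( message, source, lineno, colno, error ) {
// 	console.log( describeErrorReport( message, source, lineno, colno, error ) );
// };
```

An explicitly thrown error, or one raised later from a callback, would normally arrive with its real message and an error object, so `muted` would be false in those cases.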

What kind of credentials and how would those be used?

I am not familiar with the crossorigin=anonymous policy and how this works. I was thinking specifically if a script on meta.wiki uses the api to save content whether that would work or not.

In a web browser, generally all JavaScript code on a page executes in the same shared environment. The decision of whether or not a session cookie applies is based on the navigation to the HTML page (e.g. en.wikipedia/foo can see all cookies for that path, and parents of the path or domain, until the ETLD).

There is no way in browsers today to limit what any script, regardless of its origin, can or can't access or can or can't do.

The crossorigin= attribute and Fetch's cors/credentials options relate to whether or not that one web request will be made with or without cookies. The load.php URLs are publicly cached and act the same with or without cookies. Regardless of whether the executed script came from a cookieless (cors=anonymous/credentials:omit) or cookied request, the script itself will still be parsed and executed the same way, run on the same web page, and have access to the same global scope and DOM and cookies etc there.

All to say that yes, to the extent that the crossorigin attribute can be applied to any cross-origin load.php requests, it could be set to anonymous without any functional difference in behaviour. There may be a performance cost, though, as these requests may not be able to share connections, may require an additional CORS pre-flight check/roundtrip, and require the server to set Access-Control-Allow-Origin: * in all its responses, as otherwise the anonymous request will be blocked.
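For reference, a cookieless cross-origin script request can be sketched as follows. This is an illustration only, not how ResourceLoader or mw.loader.getScript actually load scripts: `loadScriptAnonymously` is a hypothetical helper and the example URL is a placeholder. The `crossOrigin` property is the standard DOM counterpart of the crossorigin= attribute.

```javascript
// Hypothetical helper (not a MediaWiki API): append a <script> element that is
// fetched in CORS mode without credentials.
// `doc` is the Document to insert into (normally window.document).
function loadScriptAnonymously( doc, url ) {
	var script = doc.createElement( 'script' );
	// Equivalent to <script crossorigin="anonymous">: no cookies are sent, and
	// the response must carry a suitable Access-Control-Allow-Origin header,
	// otherwise the browser blocks the script.
	script.crossOrigin = 'anonymous';
	script.src = url;
	doc.head.appendChild( script );
	return script;
}

// Usage in a browser (placeholder URL, commented out so the sketch also runs
// outside a browser):
// loadScriptAnonymously( document, 'https://example.org/w/load.php?modules=example' );
```

Once loaded, the script runs in the same global scope as any other script on the page. Only the request that fetched it differs.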

Krinkle triaged this task as Medium priority.Jul 31 2020, 7:34 PM
Krinkle moved this task from Limbo to Watching on the Performance-Team (Radar) board.
Krinkle moved this task from Inbox to Backlog on the MediaWiki-ResourceLoader board.
Jdlrobson added a subscriber: jlinehan.

One short-term fix would be for @jlinehan to not log, or to filter out, any "Script Error." with no file_url set. At least when file_url is set we have a chance of debugging (T259385).

Krinkle added a comment.EditedJul 31 2020, 7:45 PM

In general the URL should still always be set in those cases (afaik). If not, then I would guess it is not gadget- or MediaWiki-related but rather from a browser extension or some such. So that would be fine to filter out no matter what.

Jdlrobson added a comment.EditedAug 10 2020, 10:35 PM

In the last 24 hrs at time of writing there were 3,750 errors and 2,828 "Script errors".
Out of these script errors, 136 came from the gadget https://meta.wikimedia.org/w/index.php?title=MediaWiki:Wikiminiatlas.js with no actionable stack trace.
However, 88% had no stack trace or file uri and are likely coming from extensions.

Note there seem to be a few cases where file uri is set incorrectly as the url, e.g.
https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2020.08.10/clienterror/?id=AXPaQUmrMQ_08tQaw9aj
These should also be filtered out.

@jlinehan can filtering out errors where both file_uri and stack trace are empty be prioritized? I'm not sure whether this makes sense to do on client or as part of post-processing on the server.

Jdlrobson renamed this task from "Script error." Gadgets loaded from other domains should have more actionable error messages to "Script error." Scripts loaded from other domains with empty file_uri and no stack trace should not be included.Aug 10 2020, 10:36 PM
Jdlrobson updated the task description.

Change 619481 had a related patch set uploaded (by Jason Linehan; owner: Jason Linehan):
[mediawiki/extensions/WikimediaEvents@master] clientError: Filter out errors without stack_trace or file_uri.

https://gerrit.wikimedia.org/r/619481

jlinehan claimed this task.Aug 11 2020, 2:53 PM
jlinehan raised the priority of this task from Medium to High.
jlinehan added a comment.EditedAug 11 2020, 3:17 PM

@jlinehan can filtering out errors where both file_uri and stack trace are empty be prioritized? I'm not sure whether this makes sense to do on client or as part of post-processing on the server.

I uploaded a patch to do this, but let me make sure I understand. We want to filter cases where there is no stack trace AND no file_uri? We believe this is diagnostic of "Script Errors" that arise from:

...specific scenarios where 1) the error is implicit and native e.g. ReferenceError, not so when explicitly throwing error, and 2) the error happened during the initial execution of a cross-origin script, and 3) the other origin is outside the same registrable ETLD domain (e.g. within *.wikipedia.org it might be fine, unconfirmed).

Does that sum it up? I will want to document this clearly on the patch.

Also, could this be accomplished at the same time as https://phabricator.wikimedia.org/T259383 by requiring some pattern for the file_url field that only captures ResourceLoader URLs? Perhaps this would isolate MediaWiki errors from browser extensions etc. in an easier way? Or do we still want visibility into those?

I uploaded a patch to do this, but let me make sure I understand. We want to filter cases where there is no stack trace AND no file_uri?

Correct. In certain cases the stack trace and file_uri are both empty strings. I'm not 100% sure what the situation for these is, but I don't see any situation where knowing these are happening would be useful - at minimum we need one of stack_trace or file_uri to debug - and they should always be present for production errors we care about.

There seems to be a second condition where file_uri = current uri of page, e.g. file_uri https://en.wikipedia.org/wiki/Main_Page for an error that occurs on https://en.wikipedia.org/wiki/Main_Page, and the stack trace is empty. We also can't do much about these.

From my perspective, why these errors are occurring is not so important. We can count them if necessary (similar to what we do in Minerva) but no need to log them.
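The two conditions discussed here can be sketched as a small predicate. This is a minimal illustration and not the code from the merged WikimediaEvents patch: `shouldLogError` is a hypothetical name, the `stack_trace` and `file_uri` field names follow the discussion, and `url` is assumed to hold the address of the page on which the error occurred.

```javascript
// Hypothetical filter (not the actual WikimediaEvents code): decide whether a
// client error event is worth logging.
function shouldLogError( event ) {
	var hasStack = Boolean( event.stack_trace );
	var hasFileUri = Boolean( event.file_uri );

	// Condition 1: neither a stack trace nor a file URI, so nothing to debug.
	if ( !hasStack && !hasFileUri ) {
		return false;
	}
	// Condition 2: no stack trace, and the file URI merely repeats the URL of
	// the current page (assumed to be stored in event.url).
	if ( !hasStack && event.file_uri === event.url ) {
		return false;
	}
	return true;
}
```

Events dropped by such a predicate could still be counted separately if the volume is of interest, as suggested above.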

There seems to be a second condition where file_uri = current uri of page, e.g. file_uri https://en.wikipedia.org/wiki/Main_Page for an error that occurs on https://en.wikipedia.org/wiki/Main_Page, and the stack trace is empty. We also can't do much about these.

This is simple to test for, is it something you want to add to the patch?

Change 619481 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] clientError: Filter out errors without stack_trace or file_uri.

https://gerrit.wikimedia.org/r/619481

Thanks for taking care of that. Hopefully, this should be enough to roll out to Hebrew Wikipedia on Wednesday next week (Hebrew Wikipedia runs earlier in the train)?

Jdlrobson closed this task as Resolved.Aug 19 2020, 7:52 PM

This appears to be working well.

I believe I am still seeing the errors meant to be filtered out by this task in our chores dashboard. There are about 350 currently. @jlinehan, should I open a new ticket or reopen this one?

jlinehan reopened this task as Open.Sep 10 2020, 3:44 PM
jlinehan added a comment.EditedSep 10 2020, 3:55 PM

I believe I am still seeing the errors meant to be filtered out by this task in our chores dashboard. There are about 350 currently. @jlinehan, should I open a new ticket or reopen this one?

Seems fine to re-open this since we have a lot of discussion here.

So, there are 'Script Errors' with no stack trace which nonetheless have both a file_url and a url. I assume that these were things that we don't think are actionable but which aren't being filtered out currently. Should we target them for filtering out? If so, then since these examples avoid our criteria of having file_url and url match, how shall we target them? Is the presence of 'Script Error' and an empty stack trace diagnostic of a non-actionable error? I'll let @Jdlrobson opine.

Jdlrobson closed this task as Resolved.Sep 10 2020, 4:38 PM

No, these script errors are genuine and useful and should be included, so this ticket should remain closed. I am working on fixing these as we speak, so they should be gone by the end of the week. The bug here was about "script errors" which had no useful file uri; as far as I can see, all the existing 350 script errors have file uris associated.

Note: Usually, the bugs can be replicated by using the file_uri to get the code and running it on the target page. Filtering by file_uri will allow you to see errors that occurred with that file, provided they run on the same domain. Now that we are running scripts on meta.wikimedia.org, we are getting stack traces for these errors, which we can link to the mysterious "Script error" as the code throwing an error is also coming from meta.wikimedia.org.

To extinguish all the errors completely we'll likely need to have client-side error logging on all wikis, including English Wikipedia, so unfortunately we'll be stuck with a few for a while, but I've been filtering those out.

jlinehan added a comment.EditedSep 10 2020, 5:04 PM

Sounds good to me.