- Ensure a first version of the schema has been set up for MinervaErrorLog.
Create a new MinervaErrorLog schema. The schema should include the following fields:
|Name||Type||Required||Description|
|url||string||true||Complete URI including query parameters and fragment.|
|oldid||number||true||The revision ID for the page accessed when the report was generated.|
|msg||string||true||A short but informative summary.|
|line||number||true||The line number where the report originated.|
|column||number||true||The character offset where the report originated.|
|stack||string||true||A partial stacktrace that hopefully includes the module and function name where the report originated in MobileFrontend or MinervaNeue.|
|anon||boolean||true||False if the user is logged in, true otherwise.|
|uselang||string||true||The wiki user interface language code.|
|useskin||string||true||The skin used such as "minerva", "vector", or "monobook".|
|meta||Object||true||A JSON dictionary of extra information that was considered pertinent at the time of the report.|
|gitMobileFrontend||string||true||The MobileFrontend short Git commit hash. E.g., "deadc0de".|
|gitSkin||string||true||The skin's short Git commit hash. E.g., "deadc0de".|
The fields will be transmitted in the standard event capsule.
Note: this data is added by the server and doesn't count against the URL limit.
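To make the field list concrete, here is a minimal sketch of assembling one event from an error and some page context. The function `buildErrorReport` and every property on `errorInfo` and `context` are hypothetical names chosen for illustration; only the keys of the returned object come from the table above.

```javascript
// Hypothetical helper: builds one event with the fields listed in the schema table.
// `errorInfo` and `context` are illustrative inputs, not part of any existing API.
function buildErrorReport( errorInfo, context ) {
	return {
		url: context.url,
		oldid: context.revisionId,
		msg: String( errorInfo.message ),
		line: errorInfo.line,
		column: errorInfo.column,
		stack: errorInfo.stack || '',
		// Assumption: a null user name means the user is not logged in.
		anon: context.userName === null,
		uselang: context.userLanguage,
		useskin: context.skin,
		meta: errorInfo.meta || {},
		gitMobileFrontend: context.gitMobileFrontend,
		gitSkin: context.gitSkin
	};
}

// Example with made-up values.
const report = buildErrorReport(
	{ message: 'foo is undefined', line: 10, column: 4, stack: 'at bar (baz.js:10:4)' },
	{
		url: 'https://example.org/wiki/Foo?action=view#top',
		revisionId: 12345,
		userName: null,
		userLanguage: 'en',
		skin: 'minerva',
		gitMobileFrontend: 'deadc0de',
		gitSkin: 'deadc0de'
	}
);
console.log( JSON.stringify( report, null, 2 ) );
```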
- A new ErrorLogger module is built as a distinct file.
- The new ErrorLogger module is just an EventLogging logger that sends MobileFrontendErrorLogs.
- The ErrorLogger subscribes to global.error as early in loading as possible. The logger should not interfere with any other listeners such as the upload error reporter.
- All uncaught errors thrown should be reported.
- It should be possible to report an error without throwing. E.g., instead of throw new Error('foo') use ErrorLogger.log('foo').
- Reports that exceed 2000 characters should trim stack and then url properties until within limits. Creative abbreviation or compression may be used to make the fields more applicable and concise.
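The trimming rule in the last item could be sketched as follows. This assumes the 2000-character limit applies to the URI-encoded JSON serialization of the report, which the task does not specify. The names `MAX_REPORT_LENGTH`, `reportLength`, `trimField`, and `trimReport` are hypothetical.

```javascript
// Limit taken from the task; how the length is measured is an assumption.
const MAX_REPORT_LENGTH = 2000;

// Assumption: the size that matters is the URI-encoded JSON form of the report.
function reportLength( report ) {
	return encodeURIComponent( JSON.stringify( report ) ).length;
}

// Cut characters off the end of one string field until the report fits
// or the field is empty.
function trimField( report, field, maxLength ) {
	if ( typeof report[ field ] !== 'string' ) {
		return;
	}
	let excess = reportLength( report ) - maxLength;
	while ( excess > 0 && report[ field ].length > 0 ) {
		const keep = Math.max( 0, report[ field ].length - excess );
		report[ field ] = report[ field ].slice( 0, keep );
		excess = reportLength( report ) - maxLength;
	}
}

// Trim stack first, then url, as the requirement states. Works on a shallow copy.
function trimReport( report, maxLength = MAX_REPORT_LENGTH ) {
	const trimmed = Object.assign( {}, report );
	[ 'stack', 'url' ].forEach( ( field ) => trimField( trimmed, field, maxLength ) );
	return trimmed;
}

// Example: an oversized stack is shortened while the url is left alone.
const oversized = { url: 'https://example.org/wiki/Foo', msg: 'boom', stack: 'x'.repeat( 5000 ) };
const fitted = trimReport( oversized );
console.log( reportLength( oversized ), reportLength( fitted ), fitted.url );
```

If the other fields alone exceed the limit, this sketch empties both stack and url and returns the report as is; deciding what to do in that case is left open here.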
The following new configs shall be added:
|Name||Type||Default||Description|
|mfErrorLoggerEnabled||bool||false||Whether or not to enable ErrorLogger. When false, ErrorLogger is replaced with a console.log implementation. This config defaults to a sensible value for third-party installations, false.|
|mfMobileFrontendErrorLogSamplingRate||number||1||The rate at which reports are sampled. 1 means report all, .5 means report half, 0 means report none. This throttling mechanism is provided to ensure the backend isn't overwhelmed. The config defaults to a sensible value when mfErrorLoggerEnabled is enabled, 1.|
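How the two configs might interact is sketched below. The factory `createErrorLogger`, its `sendEvent` callback, and the injectable `random` parameter are hypothetical; only the two config names and their meanings come from the table above.

```javascript
// Hypothetical factory. `sendEvent` stands in for whatever submits the event;
// `random` is injectable so the sampling decision can be tested.
function createErrorLogger( config, sendEvent, random = Math.random ) {
	if ( !config.mfErrorLoggerEnabled ) {
		// Disabled: fall back to a console.log implementation.
		return { log: ( msg ) => console.log( msg ) };
	}
	const rate = config.mfMobileFrontendErrorLogSamplingRate;
	return {
		log: ( msg ) => {
			// random() is in [0, 1), so a rate of 1 reports everything
			// and a rate of 0 reports nothing.
			if ( random() < rate ) {
				sendEvent( { msg: String( msg ) } );
			}
		}
	};
}

// Example: with a rate of 0.5, roughly half of the calls are sent.
const sentEvents = [];
const sampledLogger = createErrorLogger(
	{ mfErrorLoggerEnabled: true, mfMobileFrontendErrorLogSamplingRate: 0.5 },
	( event ) => sentEvents.push( event )
);
for ( let i = 0; i < 1000; i++ ) {
	sampledLogger.log( 'foo' );
}
console.log( 'sent ' + sentEvents.length + ' of 1000' );
```

This sketch samples per report. Sampling once per page view or per session is an equally plausible reading of the requirement.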
- When errors go unreported, as they do now, we're operating completely in the dark and relying on the good graces of our users to report them manually. This is _scary_. Reporting problems has been extremely useful in improving the Android and iOS native apps.
- Reports sent will be monitored via quantity trends in Grafana. E.g., https://grafana.wikimedia.org/dashboard/db/eventlogging-schema?orgId=1&var-schema=VirtualPageView. When an alarming trend is noticed as part of chores, a developer will ssh into the EventLogging storage host and perform queries on the database. No special code must be written for either. It's all included with EventLogging.
- mw.eventLog.logFailure() may be used by ErrorLogger if statsd reporting is wanted.
- Example UploadWizard implementation.
- The Minerva skin should install a single error handler inside skins.minerva.scripts that forwards any client-side errors to EventLogging. Alternatively, we might put this error handler inside mobile.init (which would also log events for other skins operating in mobile mode, currently theoretical).
- Configuration exists to change the sampling rate and to disable/enable the feature.
- Configuration is not subject to caching and can be enabled/disabled within 5s (any config variables should make use of onResourceLoaderGetConfigVars to avoid being baked into the HTML).
- By default the error logger is disabled and the sampling rate is set to 0.
- Errors are not logged to the schema on desktop Vector or any other skin.
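The single handler described above might be wired roughly like this. `installErrorHandler` is a hypothetical name, and `trackSubscribe` and `logEvent` are injected stand-ins (for mw.trackSubscribe and the EventLogging call respectively) so the sketch runs outside MediaWiki. The payload properties `errorMessage`, `lineNumber`, `columnNumber`, and `errorObject` are assumptions about what global.error delivers and should be checked against the MediaWiki version in use.

```javascript
// Hypothetical wiring: subscribe once to global.error and forward to a logger.
// Subscribing adds a listener rather than replacing one, so other subscribers
// keep working.
function installErrorHandler( trackSubscribe, logEvent, skin ) {
	// Only log for Minerva; other skins are left alone.
	if ( skin !== 'minerva' ) {
		return false;
	}
	trackSubscribe( 'global.error', ( topic, data ) => {
		logEvent( {
			msg: data.errorMessage,
			line: data.lineNumber,
			column: data.columnNumber,
			stack: ( data.errorObject && data.errorObject.stack ) || ''
		} );
	} );
	return true;
}

// Minimal publish/subscribe stand-in used only to exercise the sketch.
const subscribers = [];
const fakeTrackSubscribe = ( topic, callback ) => subscribers.push( { topic, callback } );
const fakeTrack = ( topic, data ) => subscribers
	.filter( ( s ) => s.topic === topic )
	.forEach( ( s ) => s.callback( topic, data ) );

const loggedEvents = [];
installErrorHandler( fakeTrackSubscribe, ( event ) => loggedEvents.push( event ), 'minerva' );
fakeTrack( 'global.error', {
	errorMessage: 'Uncaught Error: foo',
	lineNumber: 3,
	columnNumber: 7,
	errorObject: new Error( 'foo' )
} );
console.log( loggedEvents.length, loggedEvents[ 0 ].msg );
```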
Sign off steps
- Failures are to be reported in Grafana and usable for investigation via SSH. Set up a task to make that happen.
- Chores will need to be updated to include checking the new Grafana reports.
JR: Tracked in T106915