
RfC: Server-side Javascript error logging
Open, High, Public

Description

Main document: Server-side Javascript error logging RfC.

Implementation steps:

  1. T525: Review existing JS error logging solutions
  2. T499: Create error logging JS module
  3. T500: Create basic endpoint for JS error logging (MVP for vanilla MW)
  4. T501: Create WMF endpoint for error logging - part 1 (producer)
  5. T502: Create WMF endpoint for error logging - part 2 (consumer) (experimental MVP for Wikimedia)
  6. T526: Add sampling and throttling support to JS error logging
  7. T521: Make sure JS error logging respects user privacy (stable MVP for Wikimedia)
  8. T519: Improve error id generation in JS error logging
  9. T514: Collect environment information for JS error logging
  10. T512: Deal with some browsers providing less details for JS error logging
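
As a rough illustration of the sampling and throttling step (T526), here is a minimal sketch. Every name in it (`createErrorGate`, `GateOptions`, the option fields) is hypothetical and is not the API of any existing MediaWiki module. It assumes sampling is decided once per page view and throttling is a simple fixed-window counter; the RfC may well settle on something different.

```typescript
// Hypothetical sketch: decide whether a client-side error should be reported.
// None of these names exist in MediaWiki; they are for illustration only.
interface GateOptions {
  sampleRate: number;   // fraction of page views that report at all (0..1)
  maxPerWindow: number; // maximum reports allowed within one window
  windowMs: number;     // window length in milliseconds
}

function createErrorGate(
  options: GateOptions,
  random: () => number = Math.random, // injectable for testing
  now: () => number = Date.now        // injectable for testing
): () => boolean {
  // Sampling: decided once, so a page view is either fully in or fully out.
  const sampled = random() < options.sampleRate;
  let windowStart = now();
  let sentInWindow = 0;

  return function shouldReport(): boolean {
    if (!sampled) {
      return false;
    }
    const t = now();
    // Throttling: start a fresh window once the current one has elapsed.
    if (t - windowStart >= options.windowMs) {
      windowStart = t;
      sentInWindow = 0;
    }
    if (sentInWindow >= options.maxPerWindow) {
      return false;
    }
    sentInWindow += 1;
    return true;
  };
}
```

An error handler would call the returned function before sending anything to the endpoint, which keeps a script stuck in an error loop from flooding the logging backend.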

Harder / experimental stuff:

  1. T520: Deal with minified scripts in JS error logging
  2. T507: Measure how many users have CORS-hostile proxies
  3. T508: Use CORS-enabled fetch of scripts to avoid same-domain limitations in JS error logging
  4. T513: Wrap scripts with exception handling for automatic JS error logging

(Vague) interface ideas:

  1. T522: Add JS error counts to graphite
  2. T523: Deduplicate JS error logs
  3. T524: Interface to display JS error logs
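
To make the error id (T519) and deduplication (T523) ideas a bit more concrete, here is one possible sketch. It assumes a report carries only a message, a file and a line number, and it groups reports by a fingerprint built from those fields. The `ErrorReport` shape, the digit normalization and the choice of an FNV-1a-style hash over UTF-16 code units are all assumptions made for illustration, not something the RfC specifies.

```typescript
// Hypothetical report shape; a real payload would likely carry more fields.
interface ErrorReport {
  message: string;
  file: string;
  line: number;
}

// Build a short, stable id so that equivalent errors share one fingerprint.
function fingerprint(report: ErrorReport): string {
  // Collapse digit runs so messages that differ only in numbers group together.
  const normalized = report.message.replace(/\d+/g, "N");
  const key = normalized + "|" + report.file + "|" + report.line;
  // FNV-1a-style 32-bit hash over UTF-16 code units.
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, then wrap to 32 bits.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  // Left-pad to eight hex digits.
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Count how many reports fall under each fingerprint.
function countByFingerprint(reports: ErrorReport[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const report of reports) {
    const id = fingerprint(report);
    counts[id] = (counts[id] || 0) + 1;
  }
  return counts;
}
```

Counts keyed this way could feed a per-error metric or a list view. A production version would probably also fold in stack frames, which this sketch ignores.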

Details

Reference
fl575

Event Timeline

flimport raised the priority of this task to High. Sep 12 2014, 1:46 AM
flimport added a project: Architecture.
flimport set Reference to fl575.

qgil wrote on 2014-09-03 06:01:55 (UTC)

This RFC has been scheduled to be discussed in the Architecture RfC meeting today.

qgil wrote on 2014-09-04 16:29:02 (UTC)

The logs of the meeting are available at https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-03-21.01.log.html (21:48:58)

I saw no actions, no decisions. What comes next?

Tgr added a subscriber: Tgr. Sep 28 2014, 4:21 PM
Tgr updated the task description. Oct 1 2014, 1:00 PM
Tgr set Security to None.
Tgr added a subscriber: Qgil. Oct 1 2014, 1:03 PM

@Qgil, from the IRC meeting:

21:48:58 <TimStarling> tgr: I like your proposal, do you need anything to make it happen?

I'm taking that as a green light; I added more detailed implementation steps as subtickets. I think the first few are fairly uncontroversial and the rest can be discussed as we get there. I'll propose that the Multimedia team allot some weekly time to work on this.

Tgr updated the task description. Oct 1 2014, 1:15 PM
Qgil added a comment. Oct 1 2014, 1:20 PM

Well, now I get your question about creating projects. You do need one here to organize all these tasks. You are welcome to request one, check https://www.mediawiki.org/wiki/Phabricator/Help#Requesting_a_new_project

Qgil assigned this task to Tgr. Oct 1 2014, 1:21 PM
Qgil added a comment. Oct 1 2014, 1:23 PM
In T382#6691, @Tgr wrote:

@Qgil, from the IRC meeting:

21:48:58 <TimStarling> tgr: I like your proposal, do you need anything to make it happen?

I'm taking that as green light;

Maybe... but I would check with Tim to be sure. https://www.mediawiki.org/wiki/RFC_metadata says "In draft".

Qgil edited projects, added TechCom-RFC; removed Architecture. Oct 22 2014, 8:45 PM
Gilles moved this task from Untriaged to Prototyping on the Multimedia board. Nov 24 2014, 3:57 PM
Tgr added a project: Epic. Dec 21 2014, 4:30 PM
Tgr added a comment. Dec 30 2014, 7:32 AM

Some interesting stuff from this presentation:

Tgr removed Tgr as the assignee of this task. Jan 30 2015, 2:38 AM
daniel assigned this task to brion. Feb 25 2015, 8:14 PM
daniel added a subscriber: daniel.
Spage moved this task from Inbox to Backlog on the TechCom-RFC board. Feb 26 2015, 6:51 PM
Spage removed brion as the assignee of this task. Mar 26 2015, 12:49 AM
Spage added a subscriber: brion.

It sounds like @Tgr is working on improving the RFC in T91357: Update server-side JS error-logging RfC, so assigning this to him. Tgr, when you're done move this task to "under discussion" on TechCom-RFC and the architecture committee will probably move to "approved" \o/ (The rest of the blockers are for implementation.)

Spage assigned this task to Tgr. Apr 6 2015, 8:16 AM
Gilles moved this task from Prototyping to Untriaged on the Multimedia board. Apr 6 2015, 9:23 AM
Prtksxna removed a subscriber: Prtksxna. Apr 6 2015, 9:49 AM
Restricted Application added subscribers: Matanya, Aklapper. Jul 16 2015, 5:09 PM
Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board. Sep 4 2015, 6:44 PM
DarTar removed a subscriber: DarTar. Sep 21 2015, 10:22 PM

It looks like a large number of the blockers here have been left without any projects.

daniel added a subscriber: Jonas.
Tgr added a comment. Sep 25 2015, 5:35 PM

Added them to Sentry.

Qgil removed a subscriber: Qgil. Sep 30 2015, 8:33 PM
RobLa-WMF lowered the priority of this task from High to Normal. May 25 2016, 5:32 AM
RobLa-WMF raised the priority of this task from Normal to High.

Hey @Tgr, I'm going to work on this as a side project, to get more familiar with MediaWiki. I'm going to read up on the status in the various subtasks; let me know if there's somewhere obvious I should start.

Tgr added a comment. Edited Apr 12 2019, 3:50 AM

@Milimetric cool! Note though that all this is very outdated and currently most of the work is not in MediaWiki. I wrote a more current summary in T217142#5103038.

Thanks @Tgr, that's a big pivot from what I was expecting, but hey, let's do it! How/when/who is making the decision on each bullet point? I could drive client and pipeline work, and beg ops for help with the Sentry server / Logstash part. In terms of owning the work going forward, I think Analytics is overloaded at the moment but it makes the most sense there. Extension:Sentry is broadly similar to Extension:EventLogging, and it sounds like EventGate/Kafka is the preferred choice for the pipeline, so that's squarely in our world.

I propose that I work on Extension:Sentry and use @Jdforrester-WMF's help to decide between sentry/browser and a custom implementation. Maybe we could look at sentry/browser and trim it down into some sort of sentry/browser-lite that we could convince upstream to maintain. Surely other people are interested in a client lighter than 50k.

If there are no objections by Monday, I'll reach out to James.

Tgr added a subscriber: phuedx. Apr 12 2019, 8:46 PM

@phuedx can you comment? You probably have a better grasp of the current status.

Thanks @Tgr, that's a big pivot from what I was expecting, but hey, let's do it!

<3 <3 <3

How/when/who is making the decision on each bullet point?

As yet, this is unclear. @Tgr has done an incredible job breaking down this work and implemented at least one MVP for Multimedia IIRC; I've poked and prodded a bit in T217142; and @fgiunchedi (and SRE) have very recently picked up the proverbial torch.

I could drive client and pipeline work, and beg ops for help with the Sentry server / Logstash part.

SRE seem willing to drive the Kafka/Logstash part of the pipeline. IIRC they're looking at Q1-2 FY19-20.

In terms of owning the work going forward, I think Analytics is overloaded at the moment but it makes the most sense there. Extension:Sentry is broadly similar to Extension:EventLogging, and it sounds like EventGate/Kafka is the preferred choice for the pipeline, so that's squarely in our world.

I'm a little nervous about ownership of an infrastructure piece as critical as client-side error logging being shared by more than one team, as it could lead to friction when prioritising bugfixes/maintenance/feature requests. In practice, though, this likely won't be so clear-cut, e.g. the client-side component could be maintained by Readers Infrastructure in Audiences. Let's talk about this sooner rather than later.

Shipping an MVP by any means makes sense though!

I propose that I work on Extension:Sentry and use @Jdforrester-WMF's help to decide between sentry/browser and a custom implementation. Maybe we could look at sentry/browser and trim it down into some sort of sentry/browser-lite that we could convince upstream to maintain. Surely other people are interested in a client lighter than 50k.

I wonder how many browsers @sentry/browser supports that we don't deliver JavaScript to. One path might be to not go completely custom but to trim out any parts that won't apply to the Wikipedias, if any.