Create a dashboard or alerts or something for the rate of errors in various uploading tools, and rate of completed uploads using various uploading tools
Closed, Declined (Public)

Description

The current state of upload tools is that we at Multimedia do not know that something is horribly broken until someone nicely points it out to us. This is terrible. We should be alerted about things like no cross-wiki uploads coming through (T132612) or a new error message popping up in large numbers (T130238).

Event Timeline

MarkTraceur moved this task from Untriaged to Next up on the Multimedia board.
MarkTraceur subscribed.

I believe we've spoken about this at some length - namely, I pointed out that it's entirely possible (if not likely) that uploads from a certain tool might stop for totally innocent reasons - like on-wiki admins directing people to another tool, or whatever. But, I like the principle, as long as the thresholds are appropriate and the notifications aren't too blaring or constant.

Perhaps we could have an IRC bot, for example, that checks the state of uploads on the cluster every 30 minutes (or less often?). If the upload count for the past hour is, let's say, two standard deviations below the mean, the bot would raise a warning in the -multimedia channel. I don't think filing a Phabricator task would be useful or appropriate. We could do this for different tools fairly easily (CWUs are tagged, UploadWizard has a particular edit summary), and we could compare those results to the total uploads to determine whether the entire upload system is down, or just one tool, or a couple of tools that use the same library.
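As a rough illustration of the threshold check described above, here is a minimal sketch in Python. It assumes the bot already has, for each upload source, a list of recent hourly upload counts plus the count for the latest hour; how those counts are collected (tags, edit summaries) and how the warning reaches IRC are left out. The function names, the `"total"` key and the sample numbers are all hypothetical, not taken from any existing tool.

```python
from statistics import mean, stdev


def is_anomalously_low(history, current, num_sigmas=2.0):
    """Return True if `current` is more than `num_sigmas` sample standard
    deviations below the mean of `history` (a list of past counts)."""
    if len(history) < 2:
        # stdev() needs at least two data points; with less, never alert.
        return False
    threshold = mean(history) - num_sigmas * stdev(history)
    return current < threshold


def low_upload_sources(history_by_source, current_by_source, num_sigmas=2.0):
    """Return the names of sources whose latest count is anomalously low.

    Both arguments are dicts keyed by source name (hypothetical keys such as
    "UploadWizard", "cross-wiki" or "total"). A source missing from
    `current_by_source` is treated as having zero uploads.
    """
    flagged = []
    for source in sorted(history_by_source):
        current = current_by_source.get(source, 0)
        if is_anomalously_low(history_by_source[source], current, num_sigmas):
            flagged.append(source)
    return flagged


if __name__ == "__main__":
    # Made-up hourly counts, for illustration only.
    history = {
        "total": [100, 110, 90, 105, 95],
        "UploadWizard": [40, 42, 38, 41, 39],
        "cross-wiki": [20, 22, 18, 21, 19],
    }
    latest = {"total": 80, "UploadWizard": 40, "cross-wiki": 0}
    for name in low_upload_sources(history, latest):
        print(f"warning: uploads via {name} are well below the recent mean")
```

If `"total"` is flagged along with every individual tool, the whole upload system is probably affected; if only one or two tools are flagged, the problem is more likely in that tool or a shared library. A low count can still have an innocent cause, so the output is best treated as a prompt to look, not as proof of breakage.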

I'm prioritising as "high" because I'd like to see this happen, but I think this will naturally sink to "normal" if we don't get to it in the next month or so. I believe it might be useful to do this at the dev summit.

MarkTraceur lowered the priority of this task from High to Medium. May 30 2017, 3:10 PM
dr0ptp4kt raised the priority of this task from Medium to Needs Triage. Aug 28 2017, 4:46 PM
dr0ptp4kt moved this task from Next up to Product owner backlog on the Multimedia board.

@matmarex: Do you know if there is any existing team or project tag which still cares about this task, or is this task just invalid these days?