Create Grafana graph to show number of ORES API requests per user-agent
Closed, Duplicate · Public

Assigned To

None

Authored By

	awight
	Dec 6 2017, 6:02 PM

Description

Knowing whether a single user-agent was responsible for load spikes would be helpful for diagnostics and incident response, as well as long-term growth planning.

Event Timeline

awight created this task. Dec 6 2017, 6:02 PM

Restricted Application added a project: Machine-Learning-Team. Dec 6 2017, 6:02 PM

Restricted Application added a subscriber: Aklapper.

greg moved this task from Active investigation to Follow-up prevention on the Wikimedia-Incident board. Dec 6 2017, 9:45 PM

awight moved this task from Unsorted to Maintenance/cleanup on the Machine-Learning-Team board. Jun 20 2018, 3:05 PM

Given that we already have the limit of 4 parallel connections per IP in place, and given the difficulties of implementing this, I propose we decline this ticket. We do need better protections against malicious DDoS, but I doubt this would help.

Today this would have been useful for monitoring low cache hit rate in one data center. This still feels like a useful feature, though low priority.

@Ladsgroup points out in IRC that we can get this information easily once logstash parses UA out of log messages.

My two cents: given the high cardinality of the user-agent header, I think doing this with Grafana (and thus Graphite or Prometheus) would be impractical. Kibana/Elasticsearch should have no problem displaying something like this, though.
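To illustrate the cardinality concern, here is a minimal sketch of the kind of top-N aggregation a log-based tool would run, keeping only the busiest user-agents and folding the rest into one bucket. The record shape (a dict with a `user_agent` key) and the function name `top_user_agents` are assumptions for illustration, not the actual ORES log format.

```python
from collections import Counter


def top_user_agents(records, n=3):
    """Count requests per user-agent and keep only the n busiest.

    `records` is assumed to be an iterable of dicts with a hypothetical
    "user_agent" key; records missing the key are counted as "unknown".
    Returns (top, other): a list of (user_agent, count) pairs sorted by
    count, and the number of requests from all remaining user-agents.
    """
    counts = Counter(r.get("user_agent", "unknown") for r in records)
    top = counts.most_common(n)
    other = sum(counts.values()) - sum(count for _, count in top)
    return top, other


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    sample = (
        [{"user_agent": "bot-a"}] * 3
        + [{"user_agent": "bot-b"}] * 2
        + [{"user_agent": "browser-c"}]
    )
    print(top_user_agents(sample, n=2))
```

Bounding the result to the top N is what keeps the number of series small; emitting one time series per distinct user-agent is the part that would likely be impractical in a metrics backend.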

FWIW, I'll echo @Ladsgroup and @fgiunchedi. Having the data is obviously useful. Representing it in Grafana, on the other hand, is probably not so practical. I also have my doubts as to whether a graph would help identify the culprits of load spikes, mostly due to the nature of the service, but I may be wrong here.

Ladsgroup closed this task as a duplicate of T181542: Monitoring for top IPs and User-Agents hitting the ORES service. Nov 27 2018, 1:14 AM

Krinkle edited projects, added Sustainability (Incident Followup); removed Wikimedia-Incident. Apr 28 2020, 9:50 PM
