Fix issue with database writes not keeping up with events
Closed, Resolved, Public

Description

In some cases, the database is not able to keep up with incoming events. They accumulate, and eventually EventLogging issues a massive write (e.g. 70,000 records at once) that some part of the database write path cannot handle.

This led to the outage from 2015-02-05 to 2015-02-10.
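
For illustration only, one common way to avoid a single oversized write is to flush an accumulated backlog in bounded batches. The sketch below is not taken from the EventLogging codebase and is not necessarily what the deployed patches do; `chunked`, `flush_backlog`, `write_batch`, and the batch size of 1000 are hypothetical names and values chosen for the example.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(events: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists holding at most batch_size events each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def flush_backlog(
    events: Iterable[T],
    write_batch: Callable[[List[T]], None],
    batch_size: int = 1000,  # hypothetical value, not a measured limit
) -> int:
    """Write a backlog in bounded batches and return the number of batches.

    write_batch is a stand-in for whatever performs the actual insert;
    it is an assumption of this sketch, not a real EventLogging function.
    """
    batches = 0
    for batch in chunked(events, batch_size):
        write_batch(batch)
        batches += 1
    return batches
```

With this shape, a backlog of 70,000 records would reach the database as 70 writes of 1,000 records rather than as one write of 70,000.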

Event Timeline

Mattflaschen-WMF assigned this task to Nuria.
Mattflaschen-WMF raised the priority of this task from to High.
Mattflaschen-WMF updated the task description.
Mattflaschen-WMF updated the task description.
Mattflaschen-WMF set Security to None.
Nuria moved this task from Next Up to In Progress on the Analytics-Kanban board. (Feb 20 2015, 3:05 PM)
Nuria added a comment. (Edited, Mar 17 2015, 5:12 PM)

While I do not think the recent code patches changed anything core, they definitely mitigated issues that affect db writes during bursts of traffic. Also, better error reporting will help troubleshoot issues like this in the future.

After a couple of days of monitoring since deploying the latest EL code, we see no dropped events, so I am moving this item to done.

There are improvements that can be made to the current code regarding db writes.

Nuria closed this task as Resolved. (Mar 17 2015, 5:12 PM)
Nuria moved this task from Ready to Deploy to Done on the Analytics-Kanban board.