
One row event (at least) was not correctly replicated on trwiki (db1069)
Closed, Duplicate · Public

Description

The following query returns one row of data even though the page in question was deleted on March 8th, 2016.

SELECT
    p.page_title
FROM
    trwiki_p.page p
WHERE
    p.page_namespace = 0 AND p.page_title = 'Fort'

Could this have been caused by a caching issue?
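
As a cross-check, the deletion itself can be looked up in the deletion log. The query below is only a sketch: it assumes the `logging` view is exposed in `trwiki_p` with the standard MediaWiki columns (`log_type`, `log_action`, `log_namespace`, `log_title`, `log_timestamp`). If that view was hit by the same replication problem, it may return nothing.

```sql
-- Sketch: list deletion-log entries for the page queried above.
-- Assumes trwiki_p.logging exists and follows the standard MediaWiki schema.
SELECT
    l.log_timestamp,
    l.log_action
FROM
    trwiki_p.logging l
WHERE
    l.log_type = 'delete'
    AND l.log_namespace = 0
    AND l.log_title = 'Fort'
ORDER BY
    l.log_timestamp DESC
```

A `delete` entry with no later `restore` entry, combined with a row still present in `page`, would suggest the delete never reached the replica rather than a caching problem.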

Event Timeline

Restricted Application added a subscriber: Aklapper. · Mar 11 2016, 6:59 PM
jcrespo renamed this task from Replication broken on c1 to One row event (at least) was not correctly replicated on trwiki. · Mar 13 2016, 3:24 PM
jcrespo edited projects, added Cloud-VPS, DBA; removed Tool-Database-Queries.
Restricted Application added a project: Cloud-Services. · Mar 13 2016, 3:24 PM

Replication "is working", but obviously with incorrect data on all labs hosts. The rows are correct on production.

This could be related to the findings at T129432, where a bunch of replication events were skipped entirely around the same dates. It needs further investigation, but everything points to events on s2 having been skipped for a specific period of time.

jcrespo renamed this task from One row event (at least) was not correctly replicated on trwiki to One row event (at least) was not correctly replicated on trwiki (db1069). · Mar 13 2016, 3:28 PM
jcrespo triaged this task as High priority.
jcrespo moved this task from Triage to Backlog on the DBA board.

There is one error in the logs that could be related to both events (the dates match):

160308 18:08:29 [ERROR] Slave SQL: Error 'The user specified as a definer ('root'@'208.80.154.151') does not exist'

However, this was corrected later without skipping any events (just by recreating the missing user), so it should not have been the source of the issues: by itself, it does not explain why the events were skipped.