Allow content blobs to be marked as broken in the content table
Closed, Resolved · Public
Assigned To

Authored By

	daniel
	Dec 13 2019, 4:08 PM

Description

Sometimes, revision data is lost due to data corruption (see e.g. T205936). Such corruption should not be silently ignored, but should be reported and handled gracefully, probably by treating the content as empty.

However, in instances where the corruption has been recognized and handled as well as possible, the broken entries in the content table would continue to cause warnings in the log. To avoid this, the bad entries in the content table should be marked as "known to be bad". When such "known bad" entries are read, nothing is written to the logs in production, and the content is treated as empty.

One obvious way to do this is to change the content_address field to something that represents the problem or the desired outcome. We could introduce a (pseudo-)address scheme called "bad", with possible values like "bad:gone missing" or "bad:T205936". The value after the "bad:" prefix is arbitrary and can be used for later eyeballing, investigation, or processing.
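
The proposed read path can be illustrated with a minimal sketch. It is written in Python for brevity, not in MediaWiki's PHP, and it is not the code from the actual patch. The names `split_address` and `load_blob`, the `fetch` callback, and the `tt:1` example address are all hypothetical. The sketch assumes that an address has the form `scheme:rest`, that a "bad" address yields empty content without logging, and that an unexpected load failure is logged and also treated as empty.

```python
import logging
from typing import Callable, Optional, Tuple

logger = logging.getLogger("blobstore-sketch")

# Hypothetical constant for the pseudo-scheme proposed in this task.
BAD_SCHEME = "bad"


def split_address(address: str) -> Tuple[str, str]:
    """Split 'scheme:rest' into (scheme, rest) at the first colon."""
    scheme, sep, rest = address.partition(":")
    if not sep or not scheme:
        raise ValueError(f"malformed content address: {address!r}")
    return scheme, rest


def load_blob(address: str, fetch: Callable[[str], Optional[bytes]]) -> bytes:
    """Return the blob for an address, treating bad or lost blobs as empty.

    `fetch` stands in for the real storage backend (an assumption of this
    sketch); it returns None when the blob cannot be loaded.
    """
    scheme, _note = split_address(address)
    if scheme == BAD_SCHEME:
        # Known-bad entry: already investigated, so do not log anything.
        # The part after "bad:" is free-form and is ignored here.
        return b""

    data = fetch(address)
    if data is None:
        # Unexpected loss: report it, then degrade gracefully to empty.
        logger.warning("Failed to load data blob from %s", address)
        return b""
    return data
```

With a dictionary as a stand-in backend, `load_blob("bad:T205936", store.get)` returns empty content without touching the backend or the log, while a missing ordinary address still produces a warning.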

Details

	Subject	Repo	Branch	Lines +/-
	BlobStore: support "known bad" addresses.	mediawiki/core	master	+65 -9

Related Objects

Mentioned In: T205936: Unable to view some pages due to fatal RevisionAccessException: "Failed to load data blob from tt"
Mentioned Here: T205936: Unable to view some pages due to fatal RevisionAccessException: "Failed to load data blob from tt"

Event Timeline

daniel created this task. · Dec 13 2019, 4:08 PM

Restricted Application added a subscriber: Aklapper. · Dec 13 2019, 4:08 PM

daniel updated the task description. · Dec 13 2019, 4:09 PM

Change 557068 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] BlobStore: support "known bad" addresses.

https://gerrit.wikimedia.org/r/557068

gerritbot added a project: Patch-For-Review. · Dec 13 2019, 5:00 PM

daniel mentioned this in T205936: Unable to view some pages due to fatal RevisionAccessException: "Failed to load data blob from tt". · Dec 13 2019, 9:57 PM

DannyS712 updated the task description. · Dec 15 2019, 10:53 AM

daniel claimed this task. · Dec 16 2019, 6:35 PM

daniel triaged this task as Medium priority.

daniel moved this task from Inbox to Doing(WIP:5) on the Platform Team Workboards (Clinic Duty Team) board.

daniel moved this task from Doing(WIP:5) to Waiting for Review on the Platform Team Workboards (Clinic Duty Team) board. · Jan 6 2020, 11:30 AM

Change 557068 merged by jenkins-bot:
[mediawiki/core@master] BlobStore: support "known bad" addresses.

https://gerrit.wikimedia.org/r/557068

ReleaseTaggerBot added a project: MW-1.35-notes (1.35.0-wmf.19; 2020-02-11). · Feb 6 2020, 4:01 PM

Maintenance_bot removed a project: Patch-For-Review. · Feb 6 2020, 4:12 PM