
Job cirrusSearchLinksUpdate: Failed creating job from description ("Page does not exist")
Closed, Resolved · Public

Description

Error

Request ID: 070d7a27354a68fc5d4ae980

message
channel: JobExecutor
level: ERROR
wiki: sdwiki
message: Failed creating job from description
c_message: Page *** does not exist
job_type: cirrusSearchLinksUpdate

Impact

Loss of certain jobs (type "cirrusSearchLinksUpdate" in this case). The failure appears deterministic, so with or without a retry there is no path to recovery for these updates.

I don't currently know whether these updates are meant to work (i.e. they should work but are failing), or whether one part of the system is attempting something another part doesn't support (in which case all we need to do is stop queuing these jobs).

As I understand it, our validation at execution time is not significantly different from our validation at queuing time, so it's unclear why this fails at execution time rather than at queuing time.

Source: https://gerrit.wikimedia.org/g/mediawiki/extensions/EventBus/+/aa85993e5887e6426738963810105ab307904f89/includes/JobExecutor.php#36

Notes

Recorded 560 times in WMF Logstash in recent weeks.

Most breakdown factors show a fairly even distribution, except the wiki ID. The error only affects a small subset of wikis; the others are entirely unaffected.

wiki      Count
sdwiki    188
pswiki    50
shwiki    32
angwiki   30
newiki    30
diqwiki   26
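
To put the skew in numbers, the per-wiki counts can be compared against the 560 recorded errors. This is only a rough sketch: it assumes the table and the 560 total cover the same time window, which the task does not state, and the variable names are illustrative.

```python
# Counts copied from the table above; 560 is the total given in the Notes.
# Assumption: both figures come from the same Logstash time window.
counts = {
    "sdwiki": 188,
    "pswiki": 50,
    "shwiki": 32,
    "angwiki": 30,
    "newiki": 30,
    "diqwiki": 26,
}
TOTAL_RECORDED = 560

# Share of all recorded errors attributable to each listed wiki.
shares = {wiki: n / TOTAL_RECORDED for wiki, n in counts.items()}
listed = sum(counts.values())

for wiki, share in shares.items():
    print(f"{wiki:8s} {share:6.1%}")
print(f"listed wikis: {listed} of {TOTAL_RECORDED}")
```

Under that assumption, the six listed wikis account for 356 of the 560 records, with sdwiki alone contributing about a third.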

Details

Related Gerrit Patches:
mediawiki/extensions/EventBus : master : Titles rejected by newFromDBKey are invalid not inexistent
mediawiki/extensions/CirrusSearch : master : Do not index invalid titles

Event Timeline

Krinkle created this task. Nov 7 2018, 3:44 AM
Restricted Application added a project: Discovery-Search. Nov 7 2018, 3:44 AM
Restricted Application added a subscriber: Aklapper.
Krinkle renamed this task from JobExecutor ERROR: "Failed creating job from description" to Job cirrusSearchLinksUpdate: Failed creating job from description ("Page does not exist"). Nov 7 2018, 3:44 AM
Krinkle moved this task from Untriaged to Found longer ago on the Wikimedia-production-error board.
Krinkle updated the task description.
Krinkle removed a project: WMF-JobQueue.
EBjune triaged this task as Normal priority. Nov 8 2018, 6:03 PM
EBjune moved this task from needs triage to Up Next on the Discovery-Search board.
EBjune moved this task from Up Next to Current work on the Discovery-Search board. Nov 13 2018, 6:30 PM
dcausse added a comment (edited). Dec 10 2018, 2:03 PM

The few pages I checked seem to have invalid title strings: they are attached to existing entries in the DB but fail to load using Title::newFromDBKey().

e.g. eswiki https://es.wikipedia.org/wiki/Discusi%C3%B3n:WP:MI which exists in the DB : https://es.wikipedia.org/w/api.php?action=query&prop=info&pageids=3214680&inprop=url

I suspect that the sanitizer is the source of these events in the jobqueue. I suppose that even though these titles are still in the DB, they are waiting for some cleanup action. These pages are technically still there and somewhat "viewable" through the API, but not through their URL.
I suppose that we may want to remove them from the index?
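
The distinction drawn in this comment (a row exists in the DB, but its stored key no longer parses as a valid title) can be sketched as a three-way classification. This is not the EventBus or CirrusSearch code: `PageStatus`, `is_valid_db_key`, `classify_page`, and the validity rule itself are hypothetical stand-ins for Title::newFromDBKey() and a page-table lookup.

```python
from enum import Enum


class PageStatus(Enum):
    OK = "ok"
    INVALID_TITLE = "invalid title"    # row exists, but the key no longer parses
    MISSING = "page does not exist"    # no row at all


def is_valid_db_key(db_key, reserved_prefixes):
    """Hypothetical stand-in for title parsing.

    Assumed rule, for illustration only: a key is rejected when the text
    before its first colon matches a reserved (namespace-like) prefix.
    """
    if not db_key:
        return False
    if ":" not in db_key:
        return True
    prefix = db_key.split(":", 1)[0].lower()
    return prefix not in reserved_prefixes


def classify_page(db_key, page_table, reserved_prefixes):
    """Report a missing row and an unparseable key as different outcomes."""
    if db_key not in page_table:
        return PageStatus.MISSING
    if not is_valid_db_key(db_key, reserved_prefixes):
        return PageStatus.INVALID_TITLE
    return PageStatus.OK


# Toy data: "WP:MI" is stored, but "wp" is assumed to be a reserved prefix.
page_table = {"WP:MI", "Main_Page"}
reserved = {"wp"}
for key in ("Main_Page", "WP:MI", "No_such_page"):
    print(key, "->", classify_page(key, page_table, reserved).value)
```

Reporting the middle case as an invalid title rather than a nonexistent page lets a caller decide separately whether to skip the job, clean up the row, or drop the page from the index.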

Change 478713 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/CirrusSearch@master] Do not index invalid titles

https://gerrit.wikimedia.org/r/478713

Change 478721 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/EventBus@master] Titles rejected by newFromDBKey are not necessarily inexistent

https://gerrit.wikimedia.org/r/478721

Change 478713 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Do not index invalid titles

https://gerrit.wikimedia.org/r/478713

Change 478721 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Titles rejected by newFromDBKey are invalid not inexistent

https://gerrit.wikimedia.org/r/478721

debt closed this task as Resolved. Dec 13 2018, 6:08 PM
mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:08 PM