
Flow fails to load content when running CirrusSearchLinksUpdate jobs
Closed, Resolved | Public | PRODUCTION ERROR

Description

Spotted in production:

[09a78b5c2ff178424bb3103e] /rpc/RunJobs.php?wiki=testwiki&type=cirrusSearchLinksUpdate&maxtime=30&maxmem=300M   Flow\Exception\InvalidDataException from line 366 of /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Model/AbstractRevision.php: Failed to load the content

With the following stacktrace:

#0 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Templating.php(154): Flow\Model\AbstractRevision->getContent(string)
#1 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Formatter/RevisionFormatter.php(1013): Flow\Templating->getContent(Flow\Model\PostRevision, string)
#2 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Formatter/RevisionFormatter.php(334): Flow\Formatter\RevisionFormatter->processParam(string, Flow\Model\PostRevision, Flow\Model\UUID, Flow\View, Flow\Formatter\TopicRow)
#3 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Formatter/TopicFormatter.php(45): Flow\Formatter\RevisionFormatter->formatApi(Flow\Formatter\TopicRow, Flow\View)
#4 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Block/Topic.php(642): Flow\Formatter\TopicFormatter->formatApi(Flow\Model\Workflow, array, Flow\View)
#5 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Block/Topic.php(549): Flow\Block\TopicBlock->renderTopicApi(array)
#6 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/View.php(189): Flow\Block\TopicBlock->renderApi(array)
#7 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/View.php(68): Flow\View->buildApiResponse(Flow\WorkflowLoader, array, string, array)
#8 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Content/BoardContent.php(220): Flow\View->show(Flow\WorkflowLoader, string)
#9 /srv/mediawiki/php-1.30.0-wmf.17/extensions/Flow/includes/Content/BoardContent.php(171): Flow\Content\BoardContent->generateHtml(Title, User)
#10 /srv/mediawiki/php-1.30.0-wmf.17/includes/content/ContentHandler.php(1205): Flow\Content\BoardContent->getParserOutput(Title, integer, ParserOptions)
#11 /srv/mediawiki/php-1.30.0-wmf.17/extensions/CirrusSearch/includes/Updater.php(368): ContentHandler->getParserOutputForIndexing(WikiPage, ParserCache)
#12 /srv/mediawiki/php-1.30.0-wmf.17/extensions/CirrusSearch/includes/Updater.php(199): CirrusSearch\Updater->buildDocumentsForPages(array, integer)
#13 /srv/mediawiki/php-1.30.0-wmf.17/extensions/CirrusSearch/includes/Updater.php(82): CirrusSearch\Updater->updatePages(array, integer)
#14 /srv/mediawiki/php-1.30.0-wmf.17/extensions/CirrusSearch/includes/Job/LinksUpdate.php(52): CirrusSearch\Updater->updateFromTitle(Title)
#15 /srv/mediawiki/php-1.30.0-wmf.17/extensions/CirrusSearch/includes/Job/Job.php(98): CirrusSearch\Job\LinksUpdate->doJob()
#16 /srv/mediawiki/php-1.30.0-wmf.17/includes/jobqueue/JobRunner.php(295): CirrusSearch\Job\Job->run()
#17 /srv/mediawiki/php-1.30.0-wmf.17/includes/jobqueue/JobRunner.php(193): JobRunner->executeJob(CirrusSearch\Job\LinksUpdate, Wikimedia\Rdbms\LBFactoryMulti, BufferingStatsdDataFactory, integer)
#18 /srv/mediawiki/rpc/RunJobs.php(47): JobRunner->run(array)
#19 {main}

It's impossible to tell from the trace (which I'll also remedy), but we think this is T95580: Flow data missing on Wikimedia production wikis. They're testwiki posts, but we need to make sure it doesn't break here.

Event Timeline

Does anyone know which wikis this error happens on? For example, does it only happen on testwiki and test2wiki?

Aha. But there are some fawiki occurrences too, although they look different.

I still think the cause is as stated, but this is being caught, not impacting anything, and I don't think it's a release blocker.

An example is https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2017.09.05/mediawiki?id=AV5T7G-aTqnoMLVnze58&_g=h@44136fa .

We are catching it then logging it, which is tracked via caught_by.
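A minimal sketch of that catch-then-log pattern, written in Python rather than the PHP that Flow uses. The names `InvalidDataException`, `load_content`, and `render_post` are hypothetical stand-ins; only the `caught_by` field and its value `other` come from this thread.

```python
import logging

logger = logging.getLogger("exception")


# Stand-in for Flow's InvalidDataException; not the real class.
class InvalidDataException(Exception):
    pass


def load_content(revision_id):
    # Hypothetical loader that always fails, mimicking the missing Flow data.
    raise InvalidDataException("Failed to load the content")


def render_post(revision_id):
    """Render a post, logging a content-load failure instead of propagating it."""
    try:
        return load_content(revision_id)
    except InvalidDataException as exc:
        # Tag the record so it can be told apart from uncaught errors.
        logger.error("%s", exc, extra={"caught_by": "other"})
        return ""
```

The caller gets an empty string and the job keeps running, while the log record still carries the failure and the `caught_by` tag.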

I am still changing how this works for the benefit of the dumps (T139791). That code path does not go through Templating, so the exception is actually not caught there.

However, I don't think this is a bug.

And I believe the checks for release blockers should exclude caught_by: other.
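Roughly, such a check could filter log records as below. This is only a sketch: the record shape (a list of dicts) and the function name `release_blocker_candidates` are assumptions for illustration, not the actual release tooling, and the `handler` value in the usage example is made up.

```python
def release_blocker_candidates(records):
    """Drop records that application code caught and logged itself."""
    return [r for r in records if r.get("caught_by") != "other"]


# Usage example with made-up records.
records = [
    {"message": "Failed to load the content", "caught_by": "other"},
    {"message": "Uncaught fatal", "caught_by": "handler"},
    {"message": "No caught_by field"},
]
candidates = release_blocker_candidates(records)
```

Records without a `caught_by` field are kept, on the assumption that an untagged error should still be reviewed.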

It is a release blocker because of the sheer volume at which it's logged. It would drown out everything else if I move forward.

Er, I got this confused. Yes, we still need to get the fix merged, but it wasn't as loud as I thought.

Checked logstash for this type of error. As far as I could see, no such errors are present any more.

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:10 PM