Page MenuHomePhabricator

History of old topic is broken
Closed, ResolvedPublic

Description

https://www.mediawiki.org/w/index.php?title=Topic:Rs9k21e73g4qsurn&action=history gives:

Invalid value was provided for loading Flow content.

Return to MediaWiki.
[1040191a] 2015-06-19 04:44:28: Fatal exception of type "Flow\Exception\InvalidInputException"

Event Timeline

Mattflaschen-WMF raised the priority of this task from to High.
Mattflaschen-WMF updated the task description. (Show Details)
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Mattflaschen-WMF raised the priority of this task from High to Unbreak Now!.Jun 19 2015, 4:46 AM
Mattflaschen-WMF set Security to None.

This isn't that helpful, but I think it worked the first time I went there, so maybe cache-related.

And again when I rechecked, worked on first request, then (probably due to the cache fill) started failing.

2015-06-22 18:07:57 mw1017 mediawikiwiki exception INFO: [d561d283] /w/index.php?title=Topic:Rs9k21e73g4qsurn&action=history&useskin=vector   Flow\Exception\InvalidInputException from line 39 of /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Repository/MultiGetList.php: Not scalar:NULL
#0 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Repository/TreeRepository.php(399): Flow\Repository\MultiGetList->get()
#1 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/Storage/TopicHistoryStorage.php(64): Flow\Repository\TreeRepository->fetchSubtreeNodeList()
#2 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/Index/FeatureIndex.php(611): Flow\Data\Storage\TopicHistoryStorage->findMulti()
#3 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/Index/FeatureIndex.php(403): Flow\Data\Index\FeatureIndex->backingStoreFindMulti()
#4 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/Compactor/ShallowCompactor.php(96): Flow\Data\Index\FeatureIndex->findMulti()
#5 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/Index/FeatureIndex.php(470): Flow\Data\Compactor\ShallowCompactor->expandCacheResult()
#6 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/ObjectLocator.php(80): Flow\Data\Index\FeatureIndex->findMulti()
#7 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/ObjectLocator.php(55): Flow\Data\ObjectLocator->findMulti()
#8 (): Flow\Data\ObjectLocator->find()
#9 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/ManagerGroup.php(129): call_user_func_array()
#10 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/ManagerGroup.php(141): Flow\Data\ManagerGroup->call()
#11 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Formatter/TopicHistoryQuery.php(30): Flow\Data\ManagerGroup->find()
#12 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Data/Pager/HistoryPager.php(50): Flow\Formatter\TopicHistoryQuery->getResults()
#13 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Block/Topic.php(753): Flow\Data\Pager\HistoryPager->doQuery()
#14 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Block/Topic.php(719): Flow\Block\TopicBlock->processHistoryResult()
#15 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Block/Topic.php(487): Flow\Block\TopicBlock->renderTopicHistoryApi()
#16 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/View.php(180): Flow\Block\TopicBlock->renderApi()
#17 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/View.php(71): Flow\View->buildApiResponse()
#18 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Actions/Action.php(101): Flow\View->show()
#19 /srv/mediawiki/php-1.26wmf10/extensions/Flow/includes/Actions/Action.php(46): Flow\Actions\FlowAction->showForAction()
#20 /srv/mediawiki/php-1.26wmf10/includes/MediaWiki.php(413): Flow\Actions\FlowAction->show()
#21 /srv/mediawiki/php-1.26wmf10/includes/MediaWiki.php(291): MediaWiki->performAction()
#22 /srv/mediawiki/php-1.26wmf10/includes/MediaWiki.php(634): MediaWiki->performRequest()
#23 /srv/mediawiki/php-1.26wmf10/includes/MediaWiki.php(431): MediaWiki->main()
#24 /srv/mediawiki/php-1.26wmf10/index.php(41): MediaWiki->run()
#25 /srv/mediawiki/w/index.php(3): include()
#26 {main} {"private":false}
Legoktm subscribed.

I've managed to trace this back to Flow\Data\Index\FeatureIndex->backingStoreFindMulti() getting passed a NULL. Unassigning from myself because I really don't know how to investigate or debug further.

I also note that this apparently only affects certain accounts? I could not reproduce this while logged in as Legoktm, but Roan Kattouw (WMF) could, and I was able to while logged out.

AFAICT, the only place backingStoreFindMulti is called is from FeatureIndex.php L402, guarded by an if ( $storageQueries ) ($storageQueries is the value being passed). If it's null, it shouldn't even get past that if-statement?

if ( $storageQueries ) {
    $fromStorage = $this->backingStoreFindMulti( $storageQueries );
    ...

I now see what you meant there: $storageQueries is an array( 0=> array( 'topic_root_id' => null ) )

I narrowed it down to:

$topicHistoryStorage = Flow\Container::get( 'storage.topic_history' );

$queries = array( array( 'topic_root_id' => Flow\Model\UUID::create( 'rs9k21e73g4qsurn' ) ) );
$options = array( 'sort' => 'rev_id', 'order' => 'DESC', 'limit' => 50, 'offset-id' => NULL, 'offset-dir' => 'fwd', 'offset-include' => false, 'offset-elastic' => false, );

// Next part from ObjectLocator->findMulti
$keys = array_keys( reset( $queries ) );
$options['sort'] = array( $options['sort'] );
$indexObj = $topicHistoryStorage->getIndexFor( $keys, $options ); // get_class shows this is TopicHistoryIndex

$keyToQuery = array ( 'flowdb:flow_revision:topic:rs9k21e73g4qsurn:4.7' =>   array ( 'topic_root_id' => 'rs9k21e73g4qsurn',  ), )
$fromCache = array ( 'flowdb:flow_revision:topic:rs9k21e73g4qsurn:4.7' =>  array ( 'rtsq1b0nq2nvz5kr' =>  array ( 'rev_id' => 'rtsq1b0nq2nvz5kr', ), ) );

$compactReflect = new ReflectionProperty( $indexObj, 'rowCompactor' );
$compactReflect->setAccessible( true );
$compactor = $compactReflect->getValue( $indexObj );
$compactor->expandCacheResult( $fromCache, $keyToQuery );

That $fromCache is just the first one from the original (original shown below):

array (
  'flowdb:flow_revision:topic:rs9k21e73g4qsurn:4.7' => 
  array (
    'rtsq1b0nq2nvz5kr' => 
    array (
      'rev_id' => 'rtsq1b0nq2nvz5kr',
    ),
    'rtsq1b0j1pcxylff' => 
    array (
      'rev_id' => 'rtsq1b0j1pcxylff',
    ),
    'rtb3flzn8ru9sr9q' => 
    array (
      'rev_id' => 'rtb3flzn8ru9sr9q',
    ),
    'rtb3e2475wkadti7' => 
    array (
      'rev_id' => 'rtb3e2475wkadti7',
    ),
    'rshzn8g9nzv8qwf3' => 
    array (
      'rev_id' => 'rshzn8g9nzv8qwf3',
    ),
    'rshz6psantjqpn3w' => 
    array (
      'rev_id' => 'rshz6psantjqpn3w',
    ),
    'rshz5lfgech3u9nx' => 
    array (
      'rev_id' => 'rshz5lfgech3u9nx',
    ),
    'rshxhjnawh4tq0ao' => 
    array (
      'rev_id' => 'rshxhjnawh4tq0ao',
    ),
    'rsao7b6d5df535sc' => 
    array (
      'rev_id' => 'rsao7b6d5df535sc',
    ),
    'rs9khgfx7k6yvzac' => 
    array (
      'rev_id' => 'rs9khgfx7k6yvzac',
    ),
    'rs9kbf8vtldmbjm3' => 
    array (
      'rev_id' => 'rs9kbf8vtldmbjm3',
    ),
    'rs9k7q826ik4vg3s' => 
    array (
      'rev_id' => 'rs9k7q826ik4vg3s',
    ),
    'rs9k4hyqh87ycfnp' => 
    array (
      'rev_id' => 'rs9k4hyqh87ycfnp',
    ),
    'rs9k21ea7p01h877' => 
    array (
      'rev_id' => 'rs9k21ea7p01h877',
    ),
    'rs9k21e8nkke51hf' => 
    array (
      'rev_id' => 'rs9k21e8nkke51hf',
    ),
  ),
)

It does *not* happen with all of the entries in the list. E.g. there is no exception if you swap it out for the second one (rtsq1b0j1pcxylff). Matthias said he can reproduce it locally, which I haven't done yet, so I'm going to pass it over to him.

Yeah, the problem is that those entries are expanded using TopicHistoryStorage::findMulti, which operates under the assumption that it'll only get queries for topic_root_id (not rev_id)
I'll fix that today

Change 220450 had a related patch set uploaded (by Matthias Mullie):
Fix topic history's revision cache expansion

https://gerrit.wikimedia.org/r/220450

Change 220450 merged by jenkins-bot:
Fix topic history's revision cache expansion

https://gerrit.wikimedia.org/r/220450

I wonder if the following is related:

Two other reports:

  • T93338 Unknown column 'topic_root_id' in 'where clause' when trying to see history in Flow
  • T91916 Database error on Flow history page

Specifically, the non-existent column - topic_root_id was mentioned in error message:

Error: 1054 Unknown column 'topic_root_id' in 'where clause' (10.64.16.18)

And it indeed seems not be present(at least, in beta enwiki db):

MariaDB [enwiki]> SELECT TABLE_NAME FROM information_schema.COLUMNS WHERE COLUMN_NAME like ('topic_root_id');
Empty set, 4 warnings (0.47 sec)

Re-checked https://www.mediawiki.org/w/index.php?title=Topic:Rs9k21e73g4qsurn&action=history after deploy - no error is displayed.

Also, checked some old pages in beta e.g. workflow_last_update_timestamp < 20140609211255.

I wonder if the following is related:

Two other reports:

  • T93338 Unknown column 'topic_root_id' in 'where clause' when trying to see history in Flow
  • T91916 Database error on Flow history page

Specifically, the non-existent column - topic_root_id was mentioned in error message:

Yes, topic_root_id is not a real column. It only exists for indexing (memcached) purposes. We would have to investigate those to see if it had the same underlying cause.

There were a couple of relatively recent changes that contributed to this:

  • "Properly" implemented TopicHistoryStorage::findMulti recently - it was previously assumed to only fetch when data in cache in missing, not when no index applies (e.g. out of range): https://gerrit.wikimedia.org/r/#/c/210354/
  • We recently allowed invalid (e.g. missing ExternalStore data) cache entries: previously, we likely would've immediately discarded them; now we don't (and store them compacted, for indexes), causing us to have to expand them from DB (because we still don't cache the entry), whereas previously we wouldn't even get to try to expand them.