Page MenuHomePhabricator

PHP Notice: Undefined index: h-Pages_locked_from_recreation
Closed, ResolvedPublic

Description

PHP Notice: Undefined index: h-Pages_locked_from_recreation
from /srv/mediawiki/php-1.40.0-wmf.8/extensions/DiscussionTools/includes/ThreadItemStore.php(511)
#0 /srv/mediawiki/php-1.40.0-wmf.8/extensions/DiscussionTools/includes/ThreadItemStore.php(511): MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /srv/mediawiki/php-1.40.0-wmf.8/includes/libs/rdbms/database/Database.php(2690): MediaWiki\Extension\DiscussionTools\ThreadItemStore->MediaWiki\Extension\DiscussionTools\{closure}(Wikimedia\Rdbms\DatabaseMysqli, string)
#2 /srv/mediawiki/php-1.40.0-wmf.8/includes/libs/rdbms/database/DBConnRef.php(119): Wikimedia\Rdbms\Database->doAtomicSection(string, Closure, string)
#3 /srv/mediawiki/php-1.40.0-wmf.8/includes/libs/rdbms/database/DBConnRef.php(680): Wikimedia\Rdbms\DBConnRef->__call(string, array)
#4 /srv/mediawiki/php-1.40.0-wmf.8/extensions/DiscussionTools/includes/ThreadItemStore.php(573): Wikimedia\Rdbms\DBConnRef->doAtomicSection(string, Closure, string)
#5 /srv/mediawiki/php-1.40.0-wmf.8/extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php(174): MediaWiki\Extension\DiscussionTools\ThreadItemStore->insertThreadItems(MediaWiki\Revision\RevisionStoreRecord, MediaWiki\Extension\DiscussionTools\ContentThreadItemSet)
#6 /srv/mediawiki/php-1.40.0-wmf.8/extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php(136): MediaWiki\Extension\DiscussionTools\Maintenance\PersistRevisionThreadItems->processRow(stdClass)
#7 /srv/mediawiki/php-1.40.0-wmf.8/extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php(88): MediaWiki\Extension\DiscussionTools\Maintenance\PersistRevisionThreadItems->process(Wikimedia\Rdbms\SelectQueryBuilder, array)
#8 /srv/mediawiki/php-1.40.0-wmf.8/maintenance/includes/MaintenanceRunner.php(309): MediaWiki\Extension\DiscussionTools\Maintenance\PersistRevisionThreadItems->execute()
#9 /srv/mediawiki/php-1.40.0-wmf.8/maintenance/doMaintenance.php(85): MediaWiki\Maintenance\MaintenanceRunner->run()
#10 /srv/mediawiki/php-1.40.0-wmf.8/extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php(185): require_once(string)
#11 /srv/mediawiki/multiversion/MWScript.php(120): require_once(string)
#12 {main}

https://logstash.wikimedia.org/app/discover#/doc/logstash-*/logstash-deploy-1-7.0.0-1-2022.11.07?id=9wz8U4QBkh1O0J80G_Xd
https://logstash.wikimedia.org/app/discover#/doc/logstash-*/logstash-deploy-1-7.0.0-1-2022.11.07?id=sjv-U4QB-vZ6BagG1M0-

(occurred in the maint. script for T315510)

Event Timeline

There are many more occurrences now, with different indexes. They seem to be related to processing pages that have headings with no comments (in discussion namespaces, but mostly not actual discussion pages), e.g. https://ie.wikibooks.org/wiki/Wikibooks:Nospam#Pages_locked_from_recreation.

Change 854138 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] ThreadItemStore: Fix setting parent IDs when parent already existed

https://gerrit.wikimedia.org/r/854138

The error occurred whenever the script processed a revision that was already processed, e.g. by a refreshlinks job. It processes the revision again, in case new data can be generated from it (which can happen if it contains transclusions of other talk pages).

As a result of the bug, when generating new data, it would use an incorrect value for the itr_parent_id field (always NULL, which should only be used for top-level headings – never for comments or sub-headings).

However, the incorrect data was probably never used – it was computed for every row (hence the high volume of errors), but it would only be saved for new rows, and there probably weren't actually any new rows for previously processed revisions on any of the wikis we processed this time (since we just enabled the feature a few hours ago, and there probably aren't any talk pages that contain transclusions of other talk pages on group0 wikis anyway).

Luckily, due to a little bit of redundancy in the schema, this can be verified with the following query, which returns data about all comments that have no parent assigned:

select * from discussiontools_item_revisions
where itr_level != 0 and itr_parent_id is null

– and it returns 0 rows on every wiki I tested.

@matmarex I ran that query across all group0 wikis, and it returns non-zero number of results for testwiki (all other wikis have zero results).

FTR, this is how I did it (from stat1005):

[urbanecm@stat1005 ~]$ source conda-activate-stacked
[...]
(2021-04-07T23.21.00_urbanecm) [urbanecm@stat1005 ~]$ ipython3
[...]
In [1]: from wmfdata import mariadb, utils

[...]


In [6]: dfs = []

In [7]: for wiki in wikis:
   ...:     if wiki == 'labtestwiki' or wiki == 'labswiki':
   ...:         continue
   ...:     dfs.append(mariadb.run('select database(), count(*) from discussiontools_item_revisions where itr_level != 0 and itr_parent_id is null', wiki))
   ...:

In [8]: import pandas as pd

In [9]: df = pd.concat(dfs)

In [10]: df.head()
Out[10]:
     database()  count(*)
0   aawikibooks         0
0  aawiktionary         0
0  abwiktionary         0
0  advisorywiki         0
0   akwikibooks         0

In [11]: df.loc[df['count(*)'] > 0]
Out[11]:
  database()  count(*)
0   testwiki         2

In [12]:

Change 854138 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] ThreadItemStore: Fix setting parent IDs when parent already existed

https://gerrit.wikimedia.org/r/854138

Change 854592 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@wmf/1.40.0-wmf.8] ThreadItemStore: Fix setting parent IDs when parent already existed

https://gerrit.wikimedia.org/r/854592

@Urbanecm Thanks for double-checking. So on testwiki I get two results corresponding to these comments (or in fact one comment, transcluded on 2 pages):

This is exactly the scenario I described. Given that it's testwiki, I don't think it's worth worrying about fixing it (even though they are "real" comments, not just testing).

If we really wanted to fix it, I could tweak the maint. script so that it would update the buggy rows (rather than skip them because they exist), and run it with --current --page 'Wikipedia:Requests' --page 'Wikipedia:Requests/Permissions'.

No problem. I'm not sure what kinds of issues do those invalid rows – if you say that the buggy rows aren't worth fixing at testwiki, no issues with that.

Change 854592 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@wmf/1.40.0-wmf.8] ThreadItemStore: Fix setting parent IDs when parent already existed

https://gerrit.wikimedia.org/r/854592

Mentioned in SAL (#wikimedia-operations) [2022-11-08T21:26:22Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:851182|Bump sampling rate to 0.2 for various editing schemas on a/b test wikis (T321734)]], [[gerrit:854592|ThreadItemStore: Fix setting parent IDs when parent already existed (T322599)]]

Mentioned in SAL (#wikimedia-operations) [2022-11-08T21:26:41Z] <urbanecm@deploy1002> urbanecm and kemayo and matmarex: Backport for [[gerrit:851182|Bump sampling rate to 0.2 for various editing schemas on a/b test wikis (T321734)]], [[gerrit:854592|ThreadItemStore: Fix setting parent IDs when parent already existed (T322599)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-11-08T21:32:07Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:851182|Bump sampling rate to 0.2 for various editing schemas on a/b test wikis (T321734)]], [[gerrit:854592|ThreadItemStore: Fix setting parent IDs when parent already existed (T322599)]] (duration: 05m 45s)