Page MenuHomePhabricator

GrowthExperiments: PHP Notice: Undefined index: XXXXXX
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error
labels.normalized_message
[{reqId}] {exception_url}   PHP Notice: Undefined index: XXXX
error.stack_trace
from /srv/mediawiki/php-1.42.0-wmf.14/extensions/GrowthExperiments/includes/MentorDashboard/MenteeOverview/UncachedMenteeOverviewDataProvider.php(589)
#0 /srv/mediawiki/php-1.42.0-wmf.14/extensions/GrowthExperiments/includes/MentorDashboard/MenteeOverview/UncachedMenteeOverviewDataProvider.php(589): MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /srv/mediawiki/php-1.42.0-wmf.14/extensions/GrowthExperiments/includes/MentorDashboard/MenteeOverview/UncachedMenteeOverviewDataProvider.php(403): GrowthExperiments\MentorDashboard\MenteeOverview\UncachedMenteeOverviewDataProvider->getBlocksForUsers(array)
#2 /srv/mediawiki/php-1.42.0-wmf.14/extensions/GrowthExperiments/includes/MentorDashboard/MenteeOverview/MenteeOverviewDataUpdater.php(80): GrowthExperiments\MentorDashboard\MenteeOverview\UncachedMenteeOverviewDataProvider->getFormattedDataForMentor(MediaWiki\User\User)
#3 /srv/mediawiki/php-1.42.0-wmf.14/extensions/GrowthExperiments/maintenance/updateMenteeData.php(121): GrowthExperiments\MentorDashboard\MenteeOverview\MenteeOverviewDataUpdater->updateDataForMentor(MediaWiki\User\User)
#4 /srv/mediawiki/php-1.42.0-wmf.14/maintenance/includes/MaintenanceRunner.php(698): GrowthExperiments\Maintenance\UpdateMenteeData->execute()
#5 /srv/mediawiki/php-1.42.0-wmf.14/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#6 /srv/mediawiki/multiversion/MWScript.php(158): require_once(string)
#7 {main}
Notes
  • Happens hourly from a maintenance script
  • Looks like it might be some unexpected data somewhere
  • Across multiple wikiversions

Event Timeline

Urbanecm_WMF subscribed.

This is very interesting. It is looking for example for 555éGAYSAÇO and the code branch in question expects it to be an username of a blocked user. However, https://meta.wikimedia.org/wiki/Special:CentralAuth/555%C3%A9GAYSA%C3%87O says such an account does not exist. Same goes for https://pt.wikipedia.org/wiki/Especial:Contribui%C3%A7%C3%B5es/555%C3%A9GAYSA%C3%87O from ptwiki itself. Even though the account allegedly does not exist, it apparently got blocked in 2005. What is happening? And more importanly, why it started to happen only now?

This is very interesting. It is looking for example for 555éGAYSAÇO and the code branch in question expects it to be an username of a blocked user. However, https://meta.wikimedia.org/wiki/Special:CentralAuth/555%C3%A9GAYSA%C3%87O says such an account does not exist. Same goes for https://pt.wikipedia.org/wiki/Especial:Contribui%C3%A7%C3%B5es/555%C3%A9GAYSA%C3%87O from ptwiki itself. Even though the account allegedly does not exist, it apparently got blocked in 2005. What is happening? And more importanly, why it started to happen only now?

It maybe be locked, which also suppressed the username from public lists. You can ask one from the steward group to look up, if that the case here.

It maybe be locked, which also suppressed the username from public lists. You can ask one from the steward group to look up, if that the case here.

Thanks for the suggestion. I’m a steward myself, so in this case, I am fairly certain account of that name does not exist (in theory, a hypothetical global suppression should've hidden the ptwiki entries as well). There is also no mention of the account in the production database (neither centralauth nor ptwiki):

wikiadmin2023@10.192.0.146(centralauth)> select * from globaluser where gu_name='555éGAYSAÇO';
Empty set (0.001 sec)

wikiadmin2023@10.192.0.146(centralauth)>

wikiadmin2023@10.192.16.47(ptwiki)> select * from user where user_name='555éGAYSAÇO';
Empty set (0.001 sec)

wikiadmin2023@10.192.16.47(ptwiki)>

So, something else than a suppression needs to be happening here. Considering this is an ancient block entry (nearly two decades old), I think it might be caused by data inconsistency (account renamings that were not reflected in the logging table or something similar).

However, irrespective of how the log entry was created, considering it dates to 2005, the main question here is why it only now started to log those notices (and why the code previously did not care for it).

UncachedMenteeOverviewDataProvider::getBlocksForUsers is using UserSelectQueryBuilder, which is not using local user, but local actor. Maybe a match is there for this name.

Oversight seems also no reason (another reason as locked), as quering the production database would also show that.

Pinging to request some kind of forward step toward fixing this issue. This batch of 4 errors showing up regularly in logspam-watch over the last few months is unpleasant.

Pinging to request some kind of forward step toward fixing this issue. This batch of 4 errors showing up regularly in logspam-watch over the last few months is unpleasant.

Seconding this, these errors continue to plague logspam-watch.

Explicitly pinging @KStoller-WMF to please prioritize this.

Michael raised the priority of this task from Medium to High.May 27 2024, 3:21 PM

I'll be so bold to raise it high. It is also annoying in our own logstash dashboards. Let's see if I can get it included in our next sprint, not sure though.

I looked into this a tiny bit in preparation for the estimation meeting later today, and this task is about exactly four different keys: 555_É_GAY, 555éGAYSAÇO, 555éViado, and registration. The registration issue comes from a different line than the three user names.

With the usernames, I'm suspecting an encoding issue. This might be one of the cases where the é in 555éGAYSAÇO could either be encoded as one single UTF character, or as a composite of two?
Quoting from https://en.wikipedia.org/wiki/Precomposed_character:

Technically, é (U+00E9) is a character that can be decomposed into an equivalent string of the base letter e (U+0065) and combining acute accent (U+0301).

That would explain why we do not seem to find this name in the database, if the database is encoding it differently than PHP/shell.

Turns out, it was not a unicode composite character issue. @Urbanecm and I figured out that it was a lossy mariadb/mysql type conversion issue. A bug in our code caused the number-only username 555 to make its way into a query as the number 555 instead of the as the string '555'. Then mariadb/mysql did some funny type conversion and gave us also entries for all other ever blocked usernames starting with 555:

wikiadmin2023@10.64.131.12(ptwiki)> SELECT log_title,COUNT(log_id) AS `blocks` FROM `logging`  WHERE log_type = 'block' AND log_action = 'block' AND log_namespace = 2 AND log_title IN (555,[<other usernames would be here as proper strings>]) GROUP BY log_title;
+---------------+--------+
| log_title     | blocks |
+---------------+--------+
| 555           |      5 |
| 555_É_GAY     |      2 |
| 555éGAYSAÇO   |      3 |
| 555éViado     |      1 |
+---------------+--------+
4 rows in set, 65535 warnings (12.479 sec)

So fixing that bug should resolve 3 of the 4 issues in this task.

Thanks @Michael! This makes sense. We should be able to fix this within our next sprint (starting tomorrow).

Change #1036990 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/extensions/GrowthExperiments@master] Fix PHP Notice

https://gerrit.wikimedia.org/r/1036990

Change #1036991 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/extensions/GrowthExperiments@master] Fix PHPNotice about index `registration` not being set

https://gerrit.wikimedia.org/r/1036991

Change #1036990 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Fix PHP Notice: Undefined Index <username>

https://gerrit.wikimedia.org/r/1036990

Change #1036991 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Fix PHPNotice about index `registration` not being set

https://gerrit.wikimedia.org/r/1036991

Finally merged! I'm moving this to "Test in Production" as the real test will be checking logstash after the train has run its course late next week (That is, June 6th or 7th).

These PHP Notice errors should be gone, and there should be a new error in the GrowthExperiments channel telling us about the userIds for which there seems to be no registration timestamp. That new error we should probably investigate further, maybe in a new task then.

These PHP Notice events have completely stopped in wmf.8 🙌

In turn, we now have a lot of events for T366761. We can use that task to figure out what's going on there.