Translation Notification Bot sending the same message multiple times to every translator
Closed, Resolved, Public

Description

Current status

Done:

Original report

Hello. It seems that Translation Notification Bot is broken again. It is sending the same notification up to 4 times to the translators. I've picked a random example at: https://meta.wikimedia.org/w/index.php?title=User_talk:MrLeopold&action=history. I've globally locked the account to prevent the bot from editing until this can be sorted out. Thanks.

QA plan

Post deployment: This was tested on mediawiki.org by sending a notification via every delivery method (email, local talk page, talk page on another wiki) to a language where only a single person (the tester) is subscribed.

Outcome

Translation notifications are now reliable and translation admins can send them without restrictions.

Details

Repo                                          Branch  Lines +/-
operations/mediawiki-config                   master  +0 -7
operations/mediawiki-config                   master  +0 -3
mediawiki/extensions/TranslationNotifications master  +0 -6
operations/mediawiki-config                   master  +0 -4
mediawiki/extensions/TranslationNotifications master  +26 -118
mediawiki/extensions/TranslationNotifications master  +5 -3
mediawiki/extensions/TranslationNotifications master  +123 -58
mediawiki/extensions/TranslationNotifications master  +119 -15
mediawiki/extensions/TranslationNotifications master  +242 -126
mediawiki/extensions/TranslationNotifications master  +165 -205
mediawiki/extensions/TranslationNotifications master  +511 -350
integration/config                            master  +1 -1
operations/mediawiki-config                   master  +1 -1
mediawiki/extensions/TranslationNotifications master  +88 -9
operations/mediawiki-config                   master  +2 -1

Event Timeline

The code is very simple:

		foreach ( $jobsByTarget as $wiki => $jobs ) {
			$this->logInfo( "Wiki: $wiki, Jobs: " . count( $jobs ) );
			JobQueueGroup::singleton( $wiki )->push( $jobs );
		}

Also https://grafana.wikimedia.org/d/000000400/jobqueue-eventbus?orgId=1&from=now-3h&to=now&var-site=eqiad&var-type=All does not show any other jobs besides TranslationNotificationSubmitJob (expecting TranslationNotificationTalkPageJob or TranslationNotificationEmailJob).
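Since the existing log line only records the total per wiki, a per-type breakdown logged just before the push might make it easier to see whether the talk page and email jobs are being handed to the queue at all. The following is a minimal standalone sketch, not the extension's code: StubJob stands in for MediaWiki's Job class (only getType() is used), and summarizeJobsByType is a hypothetical helper name.

```php
<?php
// Stand-in for MediaWiki's Job class; only the type accessor matters here.
final class StubJob {
	private string $type;

	public function __construct( string $type ) {
		$this->type = $type;
	}

	public function getType(): string {
		return $this->type;
	}
}

/**
 * Count jobs per type for each target wiki (hypothetical helper).
 *
 * @param array<string, StubJob[]> $jobsByTarget wiki ID => jobs for that wiki
 * @return array<string, array<string, int>> wiki ID => ( job type => count )
 */
function summarizeJobsByType( array $jobsByTarget ): array {
	$summary = [];
	foreach ( $jobsByTarget as $wiki => $jobs ) {
		$summary[$wiki] = [];
		foreach ( $jobs as $job ) {
			$type = $job->getType();
			$summary[$wiki][$type] = ( $summary[$wiki][$type] ?? 0 ) + 1;
		}
	}
	return $summary;
}

// Illustrative data only; not taken from the production logs.
$jobsByTarget = [
	'metawiki' => [
		new StubJob( 'TranslationNotificationEmailJob' ),
		new StubJob( 'TranslationNotificationEmailJob' ),
		new StubJob( 'TranslationNotificationTalkPageJob' ),
	],
];

foreach ( summarizeJobsByType( $jobsByTarget ) as $wiki => $counts ) {
	foreach ( $counts as $type => $count ) {
		echo "Wiki: $wiki, Type: $type, Jobs: $count\n";
	}
}
```

In the real loop the same breakdown could be passed to the existing logInfo() call, so the per-type counts in the log can be compared against what the Grafana dashboard shows.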

We know from the logs that $wiki is metawiki:

> $x = JobQueueGroup::singleton( 'metawiki' );

> var_dump( $x );
object(JobQueueGroup)#303 (5) {
  ["cache":protected]=>
  object(MapCacheLRU)#304 (5) {
    ["cache":"MapCacheLRU":private]=>
    array(0) {
    }
    ["timestamps":"MapCacheLRU":private]=>
    array(0) {
    }
    ["epoch":"MapCacheLRU":private]=>
    float(1582039130.9259)
    ["maxCacheKeys":"MapCacheLRU":private]=>
    int(10)
    ["wallClockOverride":"MapCacheLRU":private]=>
    NULL
  }
  ["domain":protected]=>
  string(8) "metawiki"
  ["readOnlyReason":protected]=>
  bool(false)
  ["invalidDomain":protected]=>
  bool(false)
  ["coalescedQueues":protected]=>
  NULL
}

Looking at https://gerrit.wikimedia.org/g/mediawiki/core/+/7f53ada97088b935644445509b03e5395327382b/includes/jobqueue/JobQueueGroup.php#148 I don't really see anything that would fail without causing an exception, or any kind of log entry for that matter.

I tried executing this code sample manually, and it worked (actually failed with body too short error, but that's fine as it was submitted):

// Executed manually; assumes $wgNoReplyAddress is in scope.
$params = [];
$params['replyTo'] = $params['from'] = TranslationNotificationEmailJob::buildAddress( $wgNoReplyAddress, 'Niklas', 'Testing' );
$translator = User::newFromName( 'Nikerabbit' );
$params['to'] = TranslationNotificationEmailJob::addressFromUser( $translator );
$params['subject'] = 'Test subject';
$params['body'] = 'Test body';
$title = Title::newFromText( 'Testing page title' );
// Build a single email job and push it to the metawiki queue.
$jobs = [];
$jobs[] = new TranslationNotificationEmailJob( $title, $params );
JobQueueGroup::singleton( 'metawiki' )->push( $jobs );

Maybe it's another type of job that fails? Or is it failing when submitted from inside a job that is itself running in the job queue?

Another question is, why are there only 5 jobs, if there are 11 users to notify? The only code path I can see is if the user has disabled emails, but that seems unlikely... Maybe if a user has signed up but has not selected any of the three methods?
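One way to test that hypothesis would be to split the subscriber list by whether each user has any delivery method selected, and compare the count against the number of jobs created. The sketch below is standalone and makes assumptions throughout: partitionByDeliveryMethod is a hypothetical helper, the user names and method labels are made up, and how the extension actually stores these preferences is not shown here.

```php
<?php
/**
 * Split subscribers into those with at least one delivery method and those
 * with none (hypothetical helper, not part of TranslationNotifications).
 *
 * @param array<string, string[]> $methodsByUser user name => selected delivery methods
 * @return array{deliverable: string[], skipped: string[]}
 */
function partitionByDeliveryMethod( array $methodsByUser ): array {
	$result = [ 'deliverable' => [], 'skipped' => [] ];
	foreach ( $methodsByUser as $user => $methods ) {
		// A user with no selected method would produce no job at all.
		$key = $methods === [] ? 'skipped' : 'deliverable';
		$result[$key][] = $user;
	}
	return $result;
}

// Made-up subscribers; the method labels are placeholders.
$methodsByUser = [
	'ExampleUserA' => [ 'email' ],
	'ExampleUserB' => [ 'email', 'talk-page' ],
	'ExampleUserC' => [],
];

$result = partitionByDeliveryMethod( $methodsByUser );
echo 'Deliverable: ' . count( $result['deliverable'] )
	. ', skipped: ' . count( $result['skipped'] ) . "\n";
// prints "Deliverable: 2, skipped: 1"
```

If the number of skipped users accounts for the difference between subscribers and jobs, the missing jobs are explained by preferences rather than by the queue.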

I was eyeing this code but looking at createJobEvent it seems impossible that the value is null.

Change 573070 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/TranslationNotifications@master] Rename TranslationNotification to TranslationNotifications

https://gerrit.wikimedia.org/r/573070

Change 573070 merged by jenkins-bot:
[mediawiki/extensions/TranslationNotifications@master] Rename TranslationNotification to TranslationNotifications in jobs

https://gerrit.wikimedia.org/r/573070

I've been poking the wmf job queue infrastructure and couldn't find any additional info about this. Unfortunately I don't have the rights to go to Special:NotifyTranslators to test this further. Will try more on beta cluster.

@Pchelolo If you need rights on meta.wikimedia to test this I'll be happy to assist.

daniel lowered the priority of this task from High to Medium. Apr 7 2020, 12:54 PM

Process comment: this is high priority for the Language team, but since it is on multiple boards now, this cannot be indicated while we are blocked on figuring out what's going on with the disappearing jobs.

I ran another test today (so you can have the most recent data) and sent a translation notification on Meta-Wiki for the page https://meta.wikimedia.org/wiki/Global_rights for the ast language. I have chosen to receive talk page and email notifications when there's something new to translate. I have not received anything regarding this event though. Regards.

Hi @Krinkle Could you please browse through the logs regarding this action before they vanish? I can coordinate with you for live debugging in order to trigger Special:NotifyTranslators and see what Logstash & friends show, if that's an option. Thanks.

@MarcoAurelio I have scanned the task and all its comments, but I am not sure what or where you'd like me to search in Logstash. Note that the Language team do (or should) all have Logstash access as well.

In any case, feel free to ping me on IRC for live debugging if you see me active.

In my latest check, email delivery worked, but talk page notification did not. The job is created, but there are zero messages after that (no failures or warnings of any kind) and run() is never called on TranslationNotificationsTalkPageJob. It's as if the job was never added to the queue (or it stays in the queue forever and is never picked up; I am unable to check what's in the queue). This is the mystery that needs solving.

Change 596200 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/TranslationNotifications@master] Avoid using Job::factory, instead create the job object directly

https://gerrit.wikimedia.org/r/596200

Change 596200 merged by jenkins-bot:
[mediawiki/extensions/TranslationNotifications@master] Avoid using Job::factory, instead create the job object directly

https://gerrit.wikimedia.org/r/596200

CCicalese_WMF subscribed.

Moving on workboard from Backlog to Next, since the Backlog column was hidden some time ago. Is there work remaining for Platform Engineering to do on this task?

@CCicalese_WMF We are still blocked as we cannot figure out why a specific type of job (TranslationNotificationsTalkPageJob) is either not inserted into the job queue or not executed. There is nothing in the logs. We have one more shot in the dark going out with the next train, but after that we will have exhausted all our options.

Change 598997 had a related patch set uploaded (by Nikerabbit; owner: Nikerabbit):
[mediawiki/extensions/TranslationNotifications@master] Remove TranslationNotificationsTalkPageJob

https://gerrit.wikimedia.org/r/598997

Looks like our shot in the dark worked: we were able to get the job to execute locally (but not on the remote wiki, due to an unrelated issue). The fact remains that there was no warning whatsoever, which made debugging this issue extremely difficult.

I'll do another test on Meta when 1.35-wmf.34 arrives there, and see if I can get talk page/email messages, and report back. Regards.

@MarcoAurelio FYI the results in my previous comment were from mediawiki.org running wmf.34. You will probably see the same thing: email works, as does delivery to wikis where TranslationNotifications is installed, but delivery to other wikis does not.

Change 598997 merged by jenkins-bot:
[mediawiki/extensions/TranslationNotifications@master] Remove TranslationNotificationsTalkPageJob

https://gerrit.wikimedia.org/r/598997

Tested today on Meta. This time it worked for aa. I shall run a test for a language with more subscribers and see how it behaves.

The fix for remote wiki delivery is now deployed.

What'll happen with User:Translation Notification Bot? Is this account no longer to be used? Thanks.

It should either be removed from code, or kept if it is possible to use it as sender name in MassMessage. Any opinions which way to go?

Considering that (a) TranslationNotifications is now dependent on Extension:MassMessage and (b) Extension:MassMessage already uses the system account User:MediaWiki message delivery as sender, I'd say it'd be better to retire that account.

Change 603163 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/TranslationNotifications@master] Remove username / password for sending notification to other wikis

https://gerrit.wikimedia.org/r/603163

Change 603167 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[operations/mediawiki-config@master] Remove TranslationNotifications user settings

https://gerrit.wikimedia.org/r/603167

Change 603169 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[operations/mediawiki-config@master] TranslationNotifications: Remove username / password for sending messages

https://gerrit.wikimedia.org/r/603169

Change 603169 abandoned by Abijeet Patro:
TranslationNotifications: Remove username / password for sending messages

Reason:
Abandoned in favor of I0315b59aae71290ff5def8c0f9992a6263e891eb

https://gerrit.wikimedia.org/r/603169

So based on my QA this works. I am moving this to Done as development is complete.

If this is confusing, let's place the follow-ups in separate tasks. For now I placed them in the task description.

Change 603163 merged by jenkins-bot:
[mediawiki/extensions/TranslationNotifications@master] Remove username / password for sending notification to other wikis

https://gerrit.wikimedia.org/r/603163

Nikerabbit updated the task description.

Change 607414 had a related patch set uploaded (by Nikerabbit; owner: Nikerabbit):
[operations/mediawiki-config@master] Remove TranslationNotifications user settings 2/2

https://gerrit.wikimedia.org/r/607414

Change 603167 merged by jenkins-bot:
[operations/mediawiki-config@master] Remove TranslationNotifications user settings 1/2

https://gerrit.wikimedia.org/r/603167

Change 607414 merged by jenkins-bot:
[operations/mediawiki-config@master] Remove TranslationNotifications user settings 2/2

https://gerrit.wikimedia.org/r/607414