
Discuss batching requirements for finalizing the donor preference sync script
Closed, Resolved · Public · 3 Estimated Story Points

Description

Background

T420548 contains an initial implementation of a maintenance script that syncs donor preferences from a CSV file to a MediaWiki global preference, taking into account whether a user has withheld consent for this.

The CSV file could potentially have 1 million rows. The script currently processes users one row at a time, which can be slow if the CSV is very large.

User story

As a developer I want to know how to finalize the script so it is ready to run in production.

Requirements

  • The script can operate safely on ~250k contacts normally, ~1m when donation volume is higher. See {T418164#11643063}
  • The script should be able to handle large CSVs e.g. 1 million rows in a way that doesn't cause problems for SRE.
  • Review developer notes and discuss with SRE and Fundraising to clearly define what is remaining for the script to be ready for processing large data sets.

Developer notes

  • Read and process a batch of email addresses, and do a batch db lookup WHERE gu_email IN (...) AND gu_email_authenticated IS NOT NULL
  • Is there a way to do localUserFromCentralId with a batch of central ids?
  • Avoid calling saveOptions once per user.
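
The notes above could be sketched roughly as follows. This is a minimal illustration in Python against an in-memory-style SQL connection, not the MediaWiki maintenance script itself. The gu_email and gu_email_authenticated columns come from the notes; the globaluser and prefs table names, the gu_id column, the "email" CSV header, the "donor" preference key, and every function name are assumptions made for the example. Consent handling is left out.

```python
import csv
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, TextIO


def batched(items: Iterable[str], size: int) -> Iterator[list]:
    """Yield lists of at most `size` items, reading the input lazily."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def lookup_confirmed(conn: sqlite3.Connection, emails: list) -> list:
    """Run one IN (...) query per batch instead of one query per row."""
    placeholders = ",".join(["?"] * len(emails))
    sql = (
        "SELECT gu_id, gu_email FROM globaluser "  # table name is assumed
        f"WHERE gu_email IN ({placeholders}) "
        "AND gu_email_authenticated IS NOT NULL"
    )
    return conn.execute(sql, emails).fetchall()


def sync_donor_preference(
    conn: sqlite3.Connection, csv_file: TextIO, batch_size: int = 500
) -> int:
    """Process the CSV in batches and return the number of users updated."""
    reader = csv.DictReader(csv_file)
    emails = (row["email"] for row in reader)  # "email" header is assumed
    updated = 0
    for chunk in batched(emails, batch_size):
        matches = lookup_confirmed(conn, chunk)
        # One multi-row write per batch, rather than one save per user.
        conn.executemany(
            "INSERT OR REPLACE INTO prefs (user_id, pref, value) "
            "VALUES (?, 'donor', '1')",  # hypothetical table and key
            [(gu_id,) for gu_id, _email in matches],
        )
        conn.commit()  # one commit per batch keeps transactions short
        updated += len(matches)
    return updated
```

With a batch size of 500, a 1 million row CSV becomes 2,000 lookup queries and 2,000 write transactions instead of 1 million of each. The real script would presumably also need to wait for replication between batches; that is not shown here.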

See the SetReadingListHiddenPreference script for some other ideas.

Maybe the UserPreferenceBatchUpdater could be moved to MediaWiki core and then be used for the donor preference.

https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/ReadingLists/+/refs/heads/master/src/Service/UserPreferenceBatchUpdater.php

https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/ReadingLists/+/refs/heads/master/maintenance/setReadingListHiddenPreference.php

BDD

  • For QA engineer to fill out

Test Steps

  • For QA engineer to fill out

Design

  • Add mockups and design requirements

Acceptance criteria

  • Add acceptance criteria

Communication criteria - does this need an announcement or discussion?

  • Add communication criteria

Rollback plan

  • What is the rollback plan in production for this task if something goes wrong?

Sign off steps

  • @SToyofuku-WMF to create new ticket based on outcome of work in this ticket.

This task was created by Version 1.3.0 of the Web team task template using phabulous

Event Timeline

Jdlrobson-WMF set the point value for this task to 3. Apr 27 2026, 5:10 PM
SToyofuku-WMF subscribed.

I have thoughts/feelings/opinions about this task, so would love to call dibs if that's not too controversial

In progress although semi-stalled on discussing approach in fundraising x product collab channel

Jdlrobson-WMF renamed this task from "Add batching to the donor preference sync script" to "Discuss batching requirements for finalizing the donor preference sync script". Fri, May 15, 7:41 PM
Jdlrobson-WMF updated the task description.
Jdlrobson-WMF subscribed.

@SToyofuku-WMF following our chat I've converted this into a discussion task so we can reflect the fact that implementing this is more than 3 points. If you could create a new task for the implementation documenting the requirements we can create a new estimation thread and look to wrap this up next sprint. Thanks in advance!