Background
The Growth team tried increasing the maximum number of edits in the Impact module to 10,000 on the English Wikipedia (see T341599: Impact Module: improvements for former newcomers). Unfortunately, that increase triggered a large number of fatal errors (see T398418: TypeError: array_map(): Argument #2 ($array) must be of type array, int given). We fixed them by reverting the change, but we need to adjust the Impact module so that it works with larger values of wgGEUserImpactMaxEdits (10,000 is the current goal).
The underlying technical problem is related to the refreshUserImpactJob. On every edit, we:
- compute user impact data,
- store it into the database,
- pass the computed value to the refreshUserImpactJob,
- let the job compute its own version of the user impact data and update the database again.
The issue occurs because the job created in the third step above, which carries the computed impact data in its parameters, exceeds the maximum allowed size and is rejected by EventGate.
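To illustrate why the job grows with wgGEUserImpactMaxEdits, here is a minimal sketch in Python (not the extension's actual PHP code). It assumes the impact data contains roughly one record per edit; the record fields, the `impactData` parameter name, and the helper functions are all hypothetical. It compares the serialized size of job parameters that carry the impact blob against parameters that carry only the user ID.

```python
import json


def build_impact_blob(edit_count: int) -> dict:
    """Hypothetical stand-in for computed user impact data.

    Assumption: the real data holds roughly one record per edit,
    so its size grows linearly with the configured maximum.
    """
    return {
        "userId": 1,
        "edits": [
            {"revId": i, "timestamp": "20250101000000", "page": f"Page_{i}"}
            for i in range(edit_count)
        ],
    }


def job_params_size(params: dict) -> int:
    """Return the size in bytes of the JSON-serialized job parameters."""
    return len(json.dumps(params).encode("utf-8"))


def fits(params: dict, max_bytes: int) -> bool:
    """Check the parameters against a size limit.

    max_bytes is a placeholder: the real limit is one of the open questions.
    """
    return job_params_size(params) <= max_bytes


for max_edits in (1_000, 10_000):
    with_blob = {"userId": 1, "impactData": build_impact_blob(max_edits)}
    id_only = {"userId": 1}
    print(max_edits, job_params_size(with_blob), job_params_size(id_only))
```

Under these assumptions, the parameters that include the blob grow about tenfold when the maximum goes from 1,000 to 10,000 edits, while the ID-only parameters stay constant. That is the motivation for asking whether the job could recompute the data itself instead of receiving it.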
Questions to resolve
- Why are we using the "compute-schedule-compute-again" approach? If we want the computation to happen within the job, wouldn't it make more sense to just schedule the job and let it compute?
- Is it necessary for us to pass the impact data blob to the job? Can we avoid that, so that the job does not become too large?
- What is the maximum size of a job's parameters?
Acceptance Criteria
Determine the next steps that are needed to increase the wgGEUserImpactMaxEdits value.