Ingest user similarity data for May 2021
Closed, Resolved | Public | 1 Estimated Story Points


The Similarusers database should be refreshed with May 2021 data.
This is a maintenance ticket to coordinate all parties involved and to set an ETA.

This action requires:

New run of the algorithm that generates user similarity data.
MySQL ingestion.
During ingestion, the service will enter a maintenance window of approximately 4 to 6 hours. Recommendations won't be served during maintenance.


Event Timeline

gmodena set the point value for this task to 1.

The May training/ingestion run completed successfully.

Loading /home/gmodena/similar-users-private/data/2021-05/temporal.tsv: 19341597rows [57:43, 5584.90rows/s]
Loading /home/gmodena/similar-users-private/data/2021-05/metadata.tsv: 8634325rows [29:55, 4809.05rows/s]
Loading /home/gmodena/similar-users-private/data/2021-05/coedit_counts.tsv: 116052554rows [5:35:37, 5762.88rows/s]
Model=Temporal  Read=19341597   Skipped=0       Inserted=19341597
Model=UserMetadata      Read=8634325    Skipped=0       Inserted=8634325
Model=Coedit    Read=116052554  Skipped=0       Inserted=116052554
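
For anyone who wants to double-check a run like this one, the log above can be sanity-checked with a short script. What follows is only a sketch, not part of the ingestion tooling: the function names (`to_seconds`, `parse_progress`, `parse_summary`) are hypothetical, the mapping from TSV file to model is an assumption inferred from the names and matching row counts, and the total time assumes the three loads ran one after another.

```python
import re

# Log lines copied verbatim from the ingestion output.
PROGRESS = """\
Loading /home/gmodena/similar-users-private/data/2021-05/temporal.tsv: 19341597rows [57:43, 5584.90rows/s]
Loading /home/gmodena/similar-users-private/data/2021-05/metadata.tsv: 8634325rows [29:55, 4809.05rows/s]
Loading /home/gmodena/similar-users-private/data/2021-05/coedit_counts.tsv: 116052554rows [5:35:37, 5762.88rows/s]
"""
SUMMARY = """\
Model=Temporal  Read=19341597   Skipped=0       Inserted=19341597
Model=UserMetadata      Read=8634325    Skipped=0       Inserted=8634325
Model=Coedit    Read=116052554  Skipped=0       Inserted=116052554
"""

# Assumption: which file feeds which model (inferred, not stated in the log).
FILE_TO_MODEL = {
    "temporal.tsv": "Temporal",
    "metadata.tsv": "UserMetadata",
    "coedit_counts.tsv": "Coedit",
}


def to_seconds(clock):
    """Convert an elapsed time such as 57:43 or 5:35:37 to seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def parse_progress(text):
    """Return {file name: (rows, elapsed seconds)} from the 'Loading' lines."""
    pattern = re.compile(r"/([\w-]+\.tsv): (\d+)rows \[([\d:]+),")
    return {
        m.group(1): (int(m.group(2)), to_seconds(m.group(3)))
        for m in pattern.finditer(text)
    }


def parse_summary(text):
    """Return {model: {'Read': n, 'Skipped': n, 'Inserted': n}}."""
    result = {}
    for line in text.splitlines():
        fields = dict(token.split("=", 1) for token in line.split())
        if "Model" in fields:
            model = fields.pop("Model")
            result[model] = {key: int(value) for key, value in fields.items()}
    return result


progress = parse_progress(PROGRESS)
summary = parse_summary(SUMMARY)

# Every row that was read should be either inserted or skipped.
for model, counts in summary.items():
    assert counts["Read"] == counts["Inserted"] + counts["Skipped"], model

# Rows reported while loading should match the rows read per model.
for file_name, model in FILE_TO_MODEL.items():
    assert progress[file_name][0] == summary[model]["Read"], file_name

# Total wall-clock time, assuming the loads ran sequentially.
total_seconds = sum(elapsed for _, elapsed in progress.values())
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"Total load time: {hours}:{minutes:02d}:{seconds:02d}")
```

If the loads were indeed sequential, the three elapsed times add up to 7:03:15.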

Throughout the ingestion, all looked fine on the database I/O front.

cc / @Marostegui.