Save campaign images to the database
Closed, Resolved (Public)

Description

This would significantly reduce the tool's load on the Wikimedia API, because the complex, depth-based categorymember calls are currently run every time a user participates. Instead, we would run the query once, when the manager creates the campaign, and store the image list in the database under a new campaign_images column in the campaigns table.

The downside is that users would not necessarily see the latest contents of the category. The campaign manager would therefore need an "update images" button that re-runs the category query. This would be needed if big changes happened to the category tree and the manager wanted to refresh the list.

This change would also make it easier to:

  • Track the current image the user is on (the image index could be added to the URL)
  • Give counts for total images in campaigns

Event Timeline

NavinoEvans raised the priority of this task from Medium to High. Mar 12 2020, 5:03 PM

Change 579600 had a related patch set uploaded (by Gabrielchihonglee; owner: Gabrielchihonglee):
[labs/tools/Isa@master] Save campaign images to the database

https://gerrit.wikimedia.org/r/579600

With the above patch, the tool now:

  • Uses the images in the db to calculate the number of images
  • Uses the images in the db to show images on the participation page

Future patches:

  • Allow an image id to be specified on the participation page
  • Add new test cases for the new route and util functions

(I'll work on them after the above patch gets approved :) )

Structure of the 2 new tables:

CREATE TABLE `image` (
  `id` int NOT NULL AUTO_INCREMENT,
  `page_id` int NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

CREATE TABLE `campaign_image` (
  `id` int NOT NULL AUTO_INCREMENT,
  `campaign_id` int NOT NULL,
  `image_id` int NOT NULL,
  PRIMARY KEY (`id`),
  KEY `campaign_id` (`campaign_id`),
  KEY `image_id` (`image_id`),
  CONSTRAINT `fk_campaign_id` FOREIGN KEY (`campaign_id`) REFERENCES `campaign` (`id`),
  CONSTRAINT `fk_image_id` FOREIGN KEY (`image_id`) REFERENCES `image` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
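
As a rough illustration of how these two tables can answer the "number of images" and "which image is at index N" questions, here is a minimal sketch. It is an assumption-laden stand-in, not the patch's code: the schema is translated to SQLite so the example runs standalone, and the helpers add_image, image_count and page_id_at are hypothetical names.

```python
import sqlite3
from typing import Optional

# SQLite approximation of the MySQL tables above (no ENGINE/CHARSET clauses).
SCHEMA = """
CREATE TABLE campaign (id INTEGER PRIMARY KEY);
CREATE TABLE image (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  page_id INTEGER NOT NULL
);
CREATE TABLE campaign_image (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  campaign_id INTEGER NOT NULL REFERENCES campaign (id),
  image_id INTEGER NOT NULL REFERENCES image (id)
);
"""


def add_image(conn: sqlite3.Connection, campaign_id: int, page_id: int) -> None:
    # Insert the image row, then link it to the campaign.
    cur = conn.execute("INSERT INTO image (page_id) VALUES (?)", (page_id,))
    conn.execute(
        "INSERT INTO campaign_image (campaign_id, image_id) VALUES (?, ?)",
        (campaign_id, cur.lastrowid),
    )


def image_count(conn: sqlite3.Connection, campaign_id: int) -> int:
    row = conn.execute(
        "SELECT COUNT(*) FROM campaign_image WHERE campaign_id = ?",
        (campaign_id,),
    ).fetchone()
    return row[0]


def page_id_at(
    conn: sqlite3.Connection, campaign_id: int, index: int
) -> Optional[int]:
    """Return the page_id of the index-th image (0-based), or None."""
    row = conn.execute(
        "SELECT image.page_id FROM campaign_image"
        " JOIN image ON image.id = campaign_image.image_id"
        " WHERE campaign_image.campaign_id = ?"
        " ORDER BY campaign_image.id LIMIT 1 OFFSET ?",
        (campaign_id, index),
    ).fetchone()
    return row[0] if row else None
```

Ordering by campaign_image.id gives a stable position for each image, which is what an image index in the URL would rely on.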

Fixing the current issues in the above patch:

  • The number of images fetched is far lower than expected

(Update: fixed. The API call has to handle the continue argument.)

  • The API accepts at most 50 pageids per query; figuring out the best way to avoid hitting that limit.

(Update: fixed. Instead of translating all pageids to image names at once, the participation manager will do it when each image is loaded.)
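
Both issues can be illustrated with a short sketch. The first function follows MediaWiki API continuation by merging the continue object of each response into the next request. The chunking helper shows one generic way to stay under the 50-pageid limit; note that the patch chose a different fix (resolving names lazily, one image at a time), so the chunking here is an alternative, not what was implemented. The fetch callable and all function names are hypothetical; fetch stands in for whatever HTTP client sends the parameters to the API and returns the decoded JSON.

```python
from typing import Callable, Dict, Iterator, List

Params = Dict[str, str]


def iter_file_page_ids(
    fetch: Callable[[Params], dict], category: str
) -> Iterator[int]:
    """Yield page ids of all files in a category, following continuation."""
    params: Params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmtype": "file",
        "cmlimit": "max",
        "format": "json",
    }
    cont: Params = {}
    while True:
        data = fetch({**params, **cont})
        for member in data.get("query", {}).get("categorymembers", []):
            yield member["pageid"]
        if "continue" not in data:
            break  # last batch reached
        # Pass every key of the continue object back with the next request.
        cont = data["continue"]


def chunks(items: List[int], size: int = 50) -> Iterator[List[int]]:
    """Split items into lists of at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def pageids_params(batch: List[int]) -> Params:
    """Build query parameters for one batch of page ids."""
    return {
        "action": "query",
        "pageids": "|".join(str(p) for p in batch),
        "format": "json",
    }
```

Without the continuation loop only the first batch of category members is returned, which would explain a count far below the expected one.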

A problem with the above patch is that it doesn't work for Wiki Love campaigns, as they require specific category information of the images.
I've considered fetching the info for each image and storing it in the database, but that would take a lot of time. @NavinoEvans, any suggestions?

Another problem is that storing images in the db seems to be very slow (roughly 2000 images per 60s). I'll have to investigate why.

Aklapper subscribed.

@Gabrielchl: Hi, are you still working on this? Any idea which ISA maintainer or developer could provide a patch review?

Change 743201 had a related patch set uploaded (by Sebastian Berlin (WMSE); author: Sebastian Berlin (WMSE)):

[labs/tools/Isa@master] Collect images on the server side when a campaign is updated

https://gerrit.wikimedia.org/r/743201

Change 579600 abandoned by Gabrielchihonglee:

[labs/tools/Isa@master] Save campaign images to the database

Reason:

task picked up by Sebastian Berlin (WMSE)

https://gerrit.wikimedia.org/r/579600

Sebastian_Berlin-WMSE changed the task status from Open to In Progress. Dec 23 2021, 7:14 AM

Change 743201 merged by jenkins-bot:

[labs/tools/Isa@master] Collect images on the server side when a campaign is updated

https://gerrit.wikimedia.org/r/743201