Measure success of Media Matching bots
Closed, Resolved, Public

Description

As described in the parent task, we are working on an API that will allow bot writers to automatically add highly relevant images to specific articles.

The goal is to have bots running on a few trial wikis in the experimental stage at the end of March 2021, so we would like to begin gathering data then, and be able to answer these questions in April 2021.

There are two categories of metrics we would like to collect:

  1. Metrics about the health of the project, that will help us understand how and whether to continue to move forward or if we need to make major changes
    1. How many edits are made by bots to add images, per wiki?
    2. What proportion of those edits are reverted within 48 hours (aka “unconstructive edits”)? Does this change by wiki?
    3. How many images are added to an article in each edit? Does the number of images added per edit relate to revert rate? Does this differ by wiki?
    4. Are there certain topic areas where images added by bots are more likely to be reverted?
  2. Larger analytical questions or hypotheses to answer
    1. Do page views increase when an article has more images?
    2. Do revert rates relate to page views?

Reports

  • JarBot in Arabic Wikipedia, Mar 2021 - Aug 2021
  • Lsjbot in Cebuano Wikipedia, Nov 2021 - Jan 2022

Event Timeline

LGoto triaged this task as Medium priority.Mar 2 2021, 6:11 PM
LGoto moved this task from Triage to Upcoming Quarter on the Product-Analytics board.

Info pasted from Slack:

Morten Warncke-Wang: JarBot does thousands of edits on a daily basis. I took a look at the bot’s contributions and went looking to see if I could find it, but nothing showed up yet (because of the thousands of other edits). It looks like JarBot typically has well-structured edit comments, so I think we can find them by process of eliminating all the other ones. It would be easier if Jarallah gave us the edit comment structure for the image edits, or revision IDs if he has them. If neither of those are options, I can look into this next week.

Cormac Parle: here are his image-adding edits https://quarry.wmflabs.org/query/52973

Marshall Miller: @Cormac — thanks! is it possible to modify the query to output the page title and revision IDs so we can make them into URLs and check them out? or how do you recommend we use the result?

Morten Warncke-Wang: @mmiller: Yes, but I didn’t bother to make the revision IDs into full URLs to arwiki: https://quarry.wmflabs.org/query/57516
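Turning the query output into clickable links is a small transformation. A minimal sketch, assuming the query returns (page title, revision ID) pairs: the sample rows and the helper name `diff_url` are made up for illustration, and the link uses the standard MediaWiki `index.php` parameters `oldid` and `diff=prev`.

```python
from urllib.parse import urlencode

ARWIKI_INDEX = "https://ar.wikipedia.org/w/index.php"


def diff_url(page_title: str, rev_id: int) -> str:
    """Build a link to the diff between rev_id and the revision before it."""
    query = urlencode({"title": page_title, "diff": "prev", "oldid": rev_id})
    return f"{ARWIKI_INDEX}?{query}"


# Hypothetical rows standing in for the Quarry output (not real revisions).
rows = [("Example_page", 12345), ("Another_page", 67890)]
urls = [diff_url(title, rev_id) for title, rev_id in rows]
for url in urls:
    print(url)
```

`urlencode` also percent-encodes Arabic page titles, so the resulting links can be pasted anywhere.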

nettrom_WMF added subscribers: cchen, nettrom_WMF.

Reassigning this to @cchen to pick up the analysis of JarBot's edits.

@CBogen The numbers and analysis for media matching bots can be found in this Jupyter notebook. For this analysis, we are collecting metrics for JarBot running on Arabic Wikipedia from 01 March 2021 to 31 August 2021. It uses this query to find the image edits.

The results are as follows:

How many edits are made by bots to add images?

From 01 March 2021 to 31 August 2021, 19,426 image edits have been made by JarBot.

What proportion of those edits are reverted within 48 hours (aka “unconstructive edits”)?

The proportion of image edits reverted within 48 hours is 1.9% (369 out of 19,426 edits).
As a reference, the 48-hour revert rate is 5.0% for overall edits and 2.4% for bot edits in Arabic Wikipedia.
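The 48-hour revert rate is a simple ratio: edits reverted within the window divided by all image edits. The sketch below is only an illustration of that definition, not the notebook's code; the record layout (edit timestamp plus an optional revert timestamp) and the sample rows are assumptions.

```python
from datetime import datetime, timedelta

REVERT_WINDOW = timedelta(hours=48)


def reverted_within_window(edit_time, revert_time):
    """True if the edit was reverted no later than 48 hours after it was made."""
    return revert_time is not None and revert_time - edit_time <= REVERT_WINDOW


# Hypothetical (edit timestamp, revert timestamp or None) pairs.
edits = [
    (datetime(2021, 3, 1, 12, 0), datetime(2021, 3, 2, 9, 0)),   # reverted after 21 hours
    (datetime(2021, 3, 1, 12, 0), datetime(2021, 3, 5, 12, 0)),  # reverted after 4 days
    (datetime(2021, 3, 2, 8, 0), None),                          # never reverted
]
sample_rate = sum(reverted_within_window(e, r) for e, r in edits) / len(edits)

# The reported JarBot figure is the same ratio on the real counts.
jarbot_rate = 369 / 19426
print(f"{jarbot_rate:.1%}")  # 1.9%
```

Only the first sample edit counts as reverted here, because the second revert falls outside the 48-hour window.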

How many images are added to an article in each edit? Does the number of images added per edit relate to the revert rate?

99.1% of the image edits by JarBot add 1 image to an article in each edit. Please refer to the table below for a detailed distribution of image count per edit.

image_count  edit_count  pct_total
1            19255       99.1%
2            3           0.015%
3            126         0.65%
5            25          0.13%
7            7           0.036%
9            4           0.02%
11           1           0.0051%
13           2           0.01%
21           2           0.01%

From the analysis, we see that all 369 reverts are happening in image edits with 1 image added by JarBot. In this case, we don't have sufficient data to draw a relationship between the number of images added per edit and the revert rate.
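For reference, relating images per edit to revert rate amounts to grouping edits by image count and computing a rate per group. A minimal sketch with made-up records (the `(image_count, was_reverted)` layout is an assumption, not the notebook's schema):

```python
from collections import defaultdict

# Hypothetical (image_count, was_reverted) records, one per image edit.
edits = [(1, True), (1, False), (1, False), (1, False), (2, False), (3, False)]

totals = defaultdict(int)
reverts = defaultdict(int)
for image_count, was_reverted in edits:
    totals[image_count] += 1
    reverts[image_count] += was_reverted  # True counts as 1, False as 0

revert_rate_by_count = {n: reverts[n] / totals[n] for n in sorted(totals)}
for n, rate in revert_rate_by_count.items():
    print(f"{n} image(s): {totals[n]} edits, revert rate {rate:.1%}")
```

A 0% rate in a group with only a handful of edits says very little, which is why the multi-image groups here cannot support a conclusion.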

Are there certain topic areas where images added by bots are more likely to be reverted?

The image edits by JarBot were made across all 64 topics (please refer to this taxonomy for a detailed list of article topics). The most edited main topics are STEM (38.8% of total image edits) and Geography (23.5% of total image edits).

Below are the top 10 topics JarBot edited:

topic                                          edit_count  pct_total
STEM.STEM*                                     8087        18.5%
Geography.Regions.Europe.Europe*               2149        4.9%
Culture.Biography.Biography*                   2054        4.7%
Culture.Philosophy_and_religion                2002        4.6%
STEM.Medicine_&_Health                         1990        4.6%
Geography.Regions.Asia.Asia*                   1861        4.3%
History_and_Society.Politics_and_government    1660        3.8%
History_and_Society.History                    1411        3.2%
STEM.Biology                                   1330        3.1%
STEM.Technology                                1279        2.9%

Most of the reverts were to edits in the Culture.Biography.Biography* topic, which has a 17.7% revert rate (363 reverts), followed by History_and_Society.History with a 4.9% revert rate (69 reverts). Revert rates in the other topics are lower.

(Note that one article may have multiple topics. We are counting edits and reverts per article topic. When topics are aggregated, this results in double counting of articles and makes the total edits and reverts look much bigger than they are.)
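The double counting is easy to see on a toy example. In this sketch the edits and their topic lists are invented; only the topic names come from the taxonomy used above.

```python
from collections import Counter

# Hypothetical edits: each one touches a single article that carries one or more topics.
edits = [
    {"rev_id": 1, "topics": ["STEM.STEM*", "STEM.Biology"]},
    {"rev_id": 2, "topics": ["Culture.Biography.Biography*"]},
    {"rev_id": 3, "topics": ["STEM.STEM*", "Geography.Regions.Asia.Asia*"]},
]

# Counting per topic credits a multi-topic edit once for every topic it has.
per_topic = Counter(topic for edit in edits for topic in edit["topics"])
summed_over_topics = sum(per_topic.values())
distinct_edits = len({edit["rev_id"] for edit in edits})

print(per_topic["STEM.STEM*"], summed_over_topics, distinct_edits)  # 2 5 3
```

Three edits produce five topic-level counts, so per-topic numbers should be read within a topic and not summed across topics.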

As discussed in the meeting, we will pause the “larger analytical questions” noted in the description until this has run on more wikis.

Reopening so that @cchen can do an analysis on the work done by lsjbot on Cebuano, now that those runs are completed.

The numbers and analysis for media matching bots can be found in this Jupyter notebook. For this analysis, we are collecting metrics for Lsjbot running on Cebuano Wikipedia from 01 Nov 2021 to 31 Jan 2022. We use the comment text “Images from API” to identify the image edits.
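Selecting edits by comment text can be sketched as a plain substring filter. Whether the notebook matches the comment exactly, by substring, or case-insensitively is not stated, so the substring test below is an assumption, as are the sample revisions.

```python
IMAGE_EDIT_MARKER = "Images from API"


def is_image_edit(edit_comment: str) -> bool:
    """True if the edit summary contains the marker text."""
    return IMAGE_EDIT_MARKER in edit_comment


# Hypothetical revisions; only the marker text comes from the analysis.
revisions = [
    {"rev_id": 101, "comment": "Images from API"},
    {"rev_id": 102, "comment": "Fix category"},
    {"rev_id": 103, "comment": "Robot: Images from API (2 files)"},
]
image_edit_ids = [r["rev_id"] for r in revisions if is_image_edit(r["comment"])]
print(image_edit_ids)  # [101, 103]
```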

How many edits are made by bots to add images?

49,217 image edits have been made by Lsjbot.

What proportion of those edits are reverted within 48 hours (aka “unconstructive edits”)?

The proportion of image edits reverted within 48 hours is 0.016% (8 out of 49,217 edits).
As a reference, the 48-hour revert rate is 1.0% for overall edits in Cebuano Wikipedia, given that there is a smaller number of active editors in this wiki.

How many images are added to an article in each edit? Does the number of images added per edit relate to the revert rate?

There are in total 128,294 images added through all the image edits.
49% of the image edits added 1 image to an article in each edit, and 20.4% of them added 2 images.
Compared to JarBot, Lsjbot more often adds multiple images to a single article in one edit. The images were mostly added to the Image Gallery section of the articles.

Please refer to the table below for a detailed distribution of image count per edit.

image_count  edit_count  pct_total
1            24129       49.0%
2            10038       20.4%
3            4616        9.4%
4            1926        3.9%
5            1465        3.0%
6            1245        2.5%
7            1914        3.9%
8            1759        3.6%
9            1079        2.2%
10           1055        2.1%
11           1           0.002%
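
The total number of images follows from the distribution table above as a weighted sum of image count times edit count. A short sketch of that arithmetic, using the table values as read from the rows above (the variable names are illustrative):

```python
# image_count -> edit_count, copied from the Lsjbot distribution table.
distribution = {1: 24129, 2: 10038, 3: 4616, 4: 1926, 5: 1465, 6: 1245,
                7: 1914, 8: 1759, 9: 1079, 10: 1055, 11: 1}

# Each edit with n images contributes n images to the total.
total_images = sum(n * edits for n, edits in distribution.items())
table_edits = sum(distribution.values())

mean_images_per_edit = total_images / table_edits
multi_image_share = 1 - distribution[1] / table_edits

print(total_images)  # 128294
print(f"mean images per edit: {mean_images_per_edit:.2f}")
print(f"share of edits adding more than one image: {multi_image_share:.1%}")
```

The weighted sum reproduces the 128,294 images reported above.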

From the analysis, 8 reverts are happening in image edits with different numbers of images added. Since there are very few reverts, we don't have sufficient data to draw a relationship between the number of images added per edit and the revert rate.

Are there certain topic areas where images added by bots are more likely to be reverted?

The image edits by Lsjbot were made across all 64 topics (please refer to the taxonomy for a detailed list of article topics). The most edited main topic is STEM (64.7% of total edits).

Below are the top 10 topics Lsjbot edited, with revert counts:

topic                                      edit_count  pct_total  reverted  revert_rate
STEM.STEM*                                 33651       32.3%      4         0.012%
Geography.Regions.Europe.Europe*           33520       32.2%      4         0.012%
Culture.Biography.Biography*               13130       12.6%      2         0.4%
Geography.Regions.Europe.Western_Europe    12863       12.4%      1         0.14%
Geography.Regions.Oceania                  2648        2.5%       1         0.5%
Geography.Regions.Asia.Asia*               1472        1.4%       0         0%
Geography.Regions.Africa.Africa*           733         0.7%       0         0%
Geography.Regions.Americas.South_America   670         0.6%       0         0%
Geography.Regions.Asia.Southeast_Asia      615         0.6%       0         0%
Geography.Regions.Americas.North_America   538         0.5%       0         0%

(Note that one article may have multiple topics. We are counting edits and reverts per article topic. When topics are aggregated, this results in double counting of articles and makes the total edits and reverts look much bigger than they are.)

@cchen this is an awesome analysis! I wanted to highlight something for any future ones (as I don't think this affects either of these two analyses): the compare API that you used doesn't just return what wikitext was changed, it also returns a few lines of context. So if there was an image file near the edit, that would potentially be counted. I say it probably doesn't affect either of these two analyses because these bots were generally adding e.g., new galleries and in my checking of some of the edits, I didn't see any instances where the context had an image in it. To improve this in future analyses, you might be able to extract just the changed content from the diff or just use the regexes that I compiled and remove any images that show up more than once in the diff HTML (as you'll know then that those were unchanged by the edit) -- e.g.:

import re

IMAGE_EXTENSIONS = ['.jpg', '.png', '.svg', '.gif', '.jpeg', '.tif', '.bmp', '.webp', '.xcf']
VIDEO_EXTENSIONS = ['.ogv', '.webm', '.mpg', '.mpeg']
AUDIO_EXTENSIONS = ['.ogg', '.mp3', '.mid', '.webm', '.flac', '.wav', '.oga']
# set() drops '.webm', which appears in both the video and audio lists
MEDIA_EXTENSIONS = list(set(IMAGE_EXTENSIONS + VIDEO_EXTENSIONS + AUDIO_EXTENSIONS))

# build regex that checks for all media extensions
EXTEN_REGEX = ('(' + '|'.join([e + r'\b' for e in MEDIA_EXTENSIONS]) + ')').replace('.', r'\.')
# join in the extension regex with one that requires at least one alphanumeric and/or a few special characters before it
EXTEN_PATTERN = re.compile(fr'([\w ,().-]+){EXTEN_REGEX}', flags=re.UNICODE)

...

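# findall returns one (file name, extension) tuple per match in the diff HTML
images = EXTEN_PATTERN.findall(diff_html)
# keep only files that appear exactly once; repeated ones were unchanged by the edit
image_count = len([i for i in images if images.count(i) == 1])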
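
The other option mentioned above, extracting just the changed content from the diff, could look roughly like the sketch below. It assumes the diff HTML marks added lines with `td` cells whose class includes `diff-addedline`, which is how MediaWiki table diffs are usually rendered; that should be checked against real API output. The class name `AddedLineExtractor`, the helper `added_text`, the sample HTML, and the cut-down `MEDIA_NAME` pattern (a stand-in for the fuller regex above) are all illustrative.

```python
import re
from html.parser import HTMLParser


class AddedLineExtractor(HTMLParser):
    """Collect the text of table cells whose class includes 'diff-addedline'."""

    def __init__(self):
        super().__init__()
        self.in_added_cell = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and "diff-addedline" in classes:
            self.in_added_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_added_cell = False

    def handle_data(self, data):
        if self.in_added_cell:
            self.chunks.append(data)


def added_text(diff_html: str) -> str:
    """Return only the text that sits in added-line cells of the diff."""
    parser = AddedLineExtractor()
    parser.feed(diff_html)
    parser.close()
    return "\n".join(parser.chunks)


# Made-up diff fragment: one unchanged context line and one added line.
sample_diff = (
    '<tr><td class="diff-context"><div>[[File:Old photo.jpg|thumb]]</div></td>'
    '<td class="diff-context"><div>[[File:Old photo.jpg|thumb]]</div></td></tr>'
    '<tr><td class="diff-empty"></td>'
    '<td class="diff-addedline"><div>[[File:New map.png|thumb]]</div></td></tr>'
)

# Simplified pattern covering a few image extensions only.
MEDIA_NAME = re.compile(r"[\w ,().-]+\.(?:jpg|jpeg|png|svg|gif)\b")
added_files = MEDIA_NAME.findall(added_text(sample_diff))
print(added_files)  # ['New map.png']
```

One caveat: when an existing line is modified, the whole new line appears in the added cell, so an unchanged image on that line would still be picked up.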

Thanks for the suggestion, Isaac! This one was created before, and I ran this analysis with the notebook you shared as well. I will update the regex for future ones.

Hi @cchen! JarBot just completed a run of his bot in arz.wp. Can you run the same analysis there?

Hi @CBogen, I uploaded the analysis for JarBot in this Jupyter notebook. For this analysis, we are collecting metrics for JarBot running on Egyptian Arabic Wikipedia from April 2022 to May 2022.

How many edits are made by bots to add images?

33,281 image edits have been made by JarBot.

What proportion of those edits are reverted within 48 hours (aka “unconstructive edits”)?

The proportion of image edits reverted within 48 hours is 0.009% (3 out of 33,281 edits).
As a reference, the 48-hour revert rate is 2.3% for overall edits in Egyptian Arabic Wikipedia, given that there is a smaller number of active editors in this wiki.

How many images are added to an article in each edit? Does the number of images added per edit relate to the revert rate?

There are in total 34,687 images added through all the image edits.
95.8% of the image edits added 1 image to articles in each edit, and 4.2% of them added 2 images.

All the reverts are in image edits that added 1 image, given that most of the edits JarBot made added 1 image to an article.

Are there certain topic areas where images added by bots are more likely to be reverted?

The image edits by JarBot were made across all 64 topics (please refer to the taxonomy for a detailed list of article topics). The most edited main topics are Geography (~50% of image edits) and Culture (~41% of image edits).

Most of the reverts are in the Culture.Biography.Biography* and Culture.Philosophy_and_religion topics.

Thanks @cchen , this is great! Closing as Resolved.