Track whether image suggestions notifications lead to media additions
Closed, Resolved (Public)

Description

As a product manager, I want to know whether image suggestions notifications (T292142) lead to the addition of media on the suggested articles, so that I can track whether the feature is a success.

Acceptance Criteria:

Create a dashboard that shows the following, per wiki, updated monthly:

  • Number of notifications sent
  • Revert rate for image additions
    • For all image additions
    • Filtered by experienced users with over 500 edits
    • Filtered by additions made after opening one of our notifications
  • Percentage of users who make image edits
    • For all image additions
    • Filtered by experienced users with over 500 edits
    • Filtered by additions made after opening one of our notifications
  • Total users who make image edits
    • For all image additions
    • Filtered by experienced users with over 500 edits
    • Filtered by additions made after opening one of our notifications
  • Percentage of notifications read
    • For all notifications
    • Filtered by image suggestions notifications only (so we can compare the overall engagement rate)
  • Images added
    • Total images added
    • Filtered by experienced editors with over 500 edits only
    • Filtered by users who opened an image suggestions notification
  • Notification opt out rate

  • Media added to infoboxes are included in the total
  • All available media types are included in the total (video, audio, images, PDFs, etc.)
  • Icons and other unwanted media types are filtered out of the total
  • Additions that have been reverted within 48 hours are filtered out of the total
  • Data is kept permanently so the process can be revised

Technical Approach Notes:

  • Using the list of sent notifications, find all revisions made by the notified users on the suggested pages.
  • Parse diffs offline on the cluster, using code that detects whether media was added in a given diff.
  • First gather all edits to pages for which a notification was sent, made by the user who received it, and check the diffs for evidence of image-related activity.
  • Media in infoboxes won't be caught by default unless there's a link. To include it, write something more specific that looks for .jpeg, .png, and other media extensions being added anywhere in the wikitext; this could be a lower-cost approach.
  • Use this tool to test and have example revisions: https://wiki-topic.toolforge.org/diff-tagging
    • Extract the image-detection logic from this tool. Run one regex on the wikitext for links that start with File, Media, Image, or any other alias, and another that looks for .jpg, .png, or other known media extensions; apply both to other diffs and compare the results. In short: simpler regexes plus a comparison.
  • This would be an ad hoc run that fetches all revisions in the latest x timespan and parses each revision together with the previous one. We have the history dumps on the cluster, so you would identify potential edits by user + page and then parse them to discover what was done.
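
The "simpler regex plus compare" idea in the notes above could be sketched roughly as follows. This is a minimal illustration, not the code that was actually run: the alias list, the extension list, and the function names (extract_media, media_added) are all assumptions made for the example, and localized namespace aliases would need to be added per wiki.

```python
import re

# Assumption: English namespace aliases only; other wikis have localized ones.
NAMESPACE_ALIASES = ("file", "image", "media")
# Assumption: illustrative, non-exhaustive list of media extensions.
MEDIA_EXTENSIONS = ("jpg", "jpeg", "png", "gif", "svg", "tif", "tiff",
                    "webm", "ogv", "ogg", "oga", "mp3", "wav", "pdf")

# Regex 1: explicit links such as [[File:Foo.jpg|thumb]].
LINK_RE = re.compile(
    r"\[\[\s*(?:%s)\s*:\s*([^|\]\n]+)" % "|".join(NAMESPACE_ALIASES),
    re.IGNORECASE,
)
# Regex 2: bare filenames with a known extension anywhere in the wikitext,
# e.g. "| image = Foo.jpg" inside an infobox.
BARE_RE = re.compile(
    r"([^\[\]|={}<>\n:/]+?\.(?:%s))\b" % "|".join(MEDIA_EXTENSIONS),
    re.IGNORECASE,
)


def normalize(name: str) -> str:
    """Treat underscores as spaces and uppercase the first letter."""
    name = " ".join(name.replace("_", " ").split())
    return name[:1].upper() + name[1:]


def extract_media(wikitext: str) -> set:
    """Return the normalized media filenames found by either regex."""
    found = set()
    for pattern in (LINK_RE, BARE_RE):
        for match in pattern.finditer(wikitext):
            name = normalize(match.group(1))
            if name:
                found.add(name)
    return found


def media_added(previous_wikitext: str, revision_wikitext: str) -> set:
    """Media present in the revision but absent from the previous one."""
    return extract_media(revision_wikitext) - extract_media(previous_wikitext)


if __name__ == "__main__":
    before = "{{Infobox person\n| name = Ada\n}}\n[[File:Old_photo.jpg|thumb]]"
    after = ("{{Infobox person\n| name = Ada\n| image = Ada portrait.png\n}}\n"
             "[[File:Old_photo.jpg|thumb]] [[image:new chart.svg|thumb]]")
    print(sorted(media_added(before, after)))
```

The bare-extension regex is deliberately loose, so it can also pick up filenames in external URLs or in running prose; comparing its output against the link-only regex on known example diffs is one way to gauge how much noise it adds.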

Dashboard in Superset: Image Suggestion Notification Dashboard

Event Timeline

Question maybe for @cchen or @Isaac or @matthiasmullie : can we use this method to also track how many reverts there were? If I added something like "Track, on a monthly basis, how many articles with image suggestions added as a result of notifications had those images reverted", would that be possible?

I'll let @cchen/@Isaac confirm, but I assume they could add a step at the end where we fetch all latest revisions of the relevant affected pages and see how many of the images we've discovered are still around.
Probably worth adding to the description to make sure that is also captured.

Added, thanks!

@CBogen @matthiasmullie yes, we can look for all the revisions of relevant pages and check the revert status.
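
As a rough illustration of the check discussed above (fetch the latest revision of each affected page and see which of the discovered images are still there), something like the following could work. It is only a sketch under assumptions: the function name still_present is hypothetical, the wikitext is assumed to be already fetched, and plain substring matching is used, which can over-count when one filename is contained in another.

```python
def normalize_title(name: str) -> str:
    """Treat underscores as spaces and uppercase the first letter."""
    name = " ".join(name.replace("_", " ").split())
    return name[:1].upper() + name[1:]


def still_present(added_images, latest_wikitext: str):
    """Split added images into (surviving, removed) for one page.

    added_images: filenames previously detected as added on this page.
    latest_wikitext: wikitext of the page's latest revision.
    """
    # Normalize the page text the same way as the filenames.
    haystack = " ".join(latest_wikitext.replace("_", " ").split())
    surviving, removed = set(), set()
    for raw in added_images:
        name = normalize_title(raw)
        # First letter of a title is usually case-insensitive.
        variants = (name, name[:1].lower() + name[1:])
        if any(v in haystack for v in variants):
            surviving.add(name)
        else:
            removed.add(name)
    return surviving, removed


if __name__ == "__main__":
    latest = "{{Infobox\n| image = Ada portrait.png\n}}\n[[File:New_chart.svg|thumb]]"
    kept, gone = still_present(["Ada_portrait.png", "new chart.svg", "Gone.jpg"], latest)
    print(sorted(kept), sorted(gone))
```

Note that this measures whether an image is still on the page, which is not quite the same as whether the adding edit was formally reverted; the two could be reported separately.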

@cchen Alexandra and I took a look at the data you shared and would like to keep the following data, per wiki, in a dashboard updated monthly. I'll update the description of the ticket as well.

  • Revert rate for image additions
    • For all image additions
    • Filtered by experienced users with over 500 edits
    • Filtered by additions made after opening one of our notifications
    • Correlated to the number of page views
  • Percentage of users who make image edits
    • For all image additions
    • Filtered by experienced users with over 500 edits
    • Filtered by additions made after opening one of our notifications
  • Total users who make image edits
    • For all image additions
    • Filtered by experienced users with over 500 edits
    • Filtered by additions made after opening one of our notifications
  • Percentage of notifications read
    • For all notifications
    • Filtered by image suggestions notifications only (so we can compare the overall engagement rate)
  • Images added
    • Total images added
    • Filtered by illustrated articles only
    • Filtered by experienced editors with over 500 edits only
    • Filtered by users who opened an image suggestions notification
  • Notification opt out rate

mpopov subscribed.

@cchen: Looks like you've been working on this, so I'm bringing it into our team's kanban for visibility.

FYI @cchen I just added

  • Number of notifications sent

to the list of dashboard contents in the description of this ticket -- it was mentioned in the parent task but we accidentally left it off of this list. Thanks!

mpopov triaged this task as Medium priority. Feb 1 2023, 5:36 PM
mpopov moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.
cchen updated the task description.

@cchen I noticed you removed "Correlated to the number of page views" from the "Revert rate for image additions", and "Filtered by illustrated articles only" from "Images added". Are those not possible to measure?

Update from meeting with @cchen today:

  • There's no evidence that page views are correlated at all to revert rate, so we're taking that requirement out.
  • After discussion with product, the "illustrated articles only" metric isn't necessary and is difficult to measure, so we're taking that requirement out.

Dashboard in Superset: Image Suggestion Notification Dashboard

Updated the dashboard from the meeting:

  • added a note in the Number of Images Added graph explaining how we’re linking image notifications and images added
  • added a reference line in the graph showing when we decreased the threshold of experienced editors for PTwiki from 500 to 300

Added charts for total number of images added and the number of users reached through image suggestion notifications.