
Pilot social media traffic reports for English Wikipedia
Closed, ResolvedPublic

Description

Develop and pilot a set of on-wiki reports that display articles that have recently received a high volume of traffic from social media sites (broken down platform-by-platform). The goal of the project is to provide volunteer patrollers with useful, timely information about articles that are receiving sudden spikes in traffic from particular social media sites.

Traffic spikes, especially spikes to otherwise low-traffic articles, may indicate that the topic of a Wikipedia article is associated with viral content, is the target of a coordinated disinformation campaign, and/or is being used by social media platforms to "fact-check" controversial claims. In all of these cases, editors currently have no effective way of identifying these articles and monitoring them for suspicious or problematic edits.

The project will involve developing and launching a time-limited pilot of these traffic reports, and evaluating their impact. The results of the pilot will inform subsequent decisions to develop or improve tools to help editors monitor particular kinds of traffic flows. The results of the pilot can also help researchers identify useful case studies of 1) signatures of disinformation campaigns, 2) the impact of platforms using Wikipedia as a free fact-checking service, and 3) the overall impact of social media traffic spikes on article quality.

Event Timeline

Weekly update: meetings scheduled with the WMF Privacy team and community stakeholders. Research brief updated with a more specific timeline, research questions, and implementation plans.

Week 1/13 updates: met with Legal and Security reps to schedule privacy review; updates to the research brief; new Meta page

Weeks 1/20 and 1/27 updates: building out research plan and documenting some implementation details.

Week 2/3 update: JM is trying to regain access to production so he can test some sample queries. Currently blocked on technical difficulties.

Week 2/10 update: requested access to stat1007 so I can start exploring the data (T244785). Waiting on approval from Nuria Ruiz.
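
For context, the exploration will run against the private webrequest logs on the analytics cluster. The exact queries aren't recorded in this task; the sketch below is a hypothetical example of the kind of Hive query that access enables (table, field, and partition names follow the public Analytics schema docs; the platform list is illustrative):

```python
# Hypothetical sketch only: count enwiki pageviews by social-media
# referrer for one day from the private wmf.webrequest table. The exact
# queries used in the pilot are not documented in this task.
import subprocess

QUERY = """
SELECT parse_url(referer, 'HOST') AS referer_host,
       COUNT(1) AS pageviews
FROM wmf.webrequest
WHERE year = 2020 AND month = 2 AND day = 10
  AND is_pageview
  AND uri_host = 'en.wikipedia.org'
  AND parse_url(referer, 'HOST') IN
      ('www.facebook.com', 't.co', 'www.reddit.com', 'www.youtube.com')
GROUP BY parse_url(referer, 'HOST')
"""

# One way to run it from a stats machine (after kinit) is the hive CLI:
subprocess.run(['hive', '-e', QUERY], check=True)
```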

Or simply a Michael Jackson event! (which we'd better be aware of, too)

Week 2/17 update: as of Friday, still blocked on both privacy review and stat1007 access. Discussed a traffic-spike detection algorithm with Isaac.
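
The detection algorithm itself isn't written up in this task. As a rough illustration of the general idea, here is a minimal sketch that flags articles whose latest daily referral count sits far above their trailing baseline; the function name, thresholds, and data shapes are all hypothetical, not what we settled on:

```python
# Hypothetical spike-detection sketch: flag articles whose most recent
# day of social-media referrals is several standard deviations above
# their own recent baseline. Thresholds here are illustrative only.
from statistics import mean, stdev

def detect_spikes(daily_counts, min_views=500, z_threshold=3.0):
    """daily_counts: {article: [day1_views, ..., dayN_views]},
    with the most recent day last."""
    spikes = []
    for article, counts in daily_counts.items():
        baseline, today = counts[:-1], counts[-1]
        if len(baseline) < 7 or today < min_views:
            continue  # too little history, or too few views to matter
        mu, sigma = mean(baseline), stdev(baseline)
        z = (today - mu) / sigma if sigma > 0 else float('inf')
        if z >= z_threshold:
            spikes.append((article, today, round(z, 1)))
    return sorted(spikes, key=lambda s: s[2], reverse=True)

# A quiet article that suddenly gets 2000 referrals in a day is flagged:
print(detect_spikes({'Example_article': [40, 55, 38, 61, 44, 52, 47, 2000]}))
```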

@Capt_Swing can you make the two tasks you're blocked on subtasks of this task? (one should be resolved by now, the other I will escalate.)

@leila thanks! Subtasks added (see the privacy review one, T246041). I'm coordinating with @JFishback_WMF directly via the tracking task on Asana, and I don't think we need to escalate at this time. He's been down with the flu but is back at work now and aims to have an update by the end of the week.

@Capt_Swing @leila I'm actually working on this right now, reading through the Research Brief. At first glance I think this is probably fairly low risk, but I will work up our standard risk analysis and send it to you. Typically I also send it to WMF-Legal for approval and to Analytics for feedback / sanity check (on data releases like this). Do you know if Legal has taken a look at this yet?

@JFishback_WMF I know that @APalmer_WMF was part of the meeting where you and I discussed this privacy review, and Aeryn and Stephen have also been provided the Research brief for review. See email thread "Meet to discuss privacy requirements for social media traffic reports research project?" from 1/7/2020. I don't know if Aeryn or anyone else in Legal has taken any action since then. I don't believe that anyone from Analytics has performed any review on this yet.

@Capt_Swing Ah yes, I remember now. I think the plan was for me to send my privacy risk analysis to @APalmer_WMF once I was done. Re: Analytics, it's not really a formal review as such, more a second set of eyes. They know more than I do about what data is already released into the wild, so it's more like a double-check of my assumptions to make sure I didn't miss something. If we already publish A^2, we should know that by also publishing B^2, a malicious actor will now be able to figure out C^2. So Analytics helps me out by making sure that there is no A^2 already out there that I don't know about. Hopefully that makes sense.
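
(A toy illustration of that concern, with made-up numbers: two releases that are each harmless on their own can jointly expose a third quantity that was never published.)

```python
# Toy example of the A^2 / B^2 / C^2 point above, with made-up numbers:
# each release is harmless alone, but together they expose a third value.
published_total_views = {'Some_article': 12000}      # release A: already public
published_nonsocial_views = {'Some_article': 11300}  # release B: proposed

# A malicious actor can now derive C, which was never published directly:
inferred_social_views = {
    title: published_total_views[title] - views
    for title, views in published_nonsocial_views.items()
}
print(inferred_social_views)  # {'Some_article': 700}
```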

Week 2/24: validated the basic workflow for collecting and processing traffic data; received full approval from Privacy, Legal, and Analytics.

Week 3/2 updates: Sketched out the production pipeline and began finalizing requirements and dependencies with @Isaac.

Week 3/9 update: We've developed an end-to-end workflow for generating the reports! Repo on GitHub. A sample report (generated by script, using fake data) is available here: https://test.wikipedia.org/wiki/User:Jmorgan_(WMF)/sandbox
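
The real pipeline is in the linked repo; as a rough sketch of the final publish step, assuming the mwclient library and a bot account (the page title, credentials, and wikitext below are placeholders):

```python
# Rough sketch of the publish step, assuming the mwclient library and a
# bot account with edit rights. Page title, credentials, and wikitext
# are placeholders; see the GitHub repo for the actual pipeline.
import mwclient

def publish_report(wikitext, page_title, user, password):
    site = mwclient.Site('test.wikipedia.org')
    site.login(user, password)
    page = site.pages[page_title]
    page.save(wikitext, summary='Bot: updating social media traffic report')

report = '== Social media traffic report ==\n{| class="wikitable"\n...\n|}'
publish_report(report, 'User:Example_bot/sandbox', 'Example_bot', 'botpassword')
```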

Week 3/23 update: the report has been launched and announced. Still squashing bugs.

Week 3/30 update: more or less the same as last week.

I'll add that we've now had 4-5 days of the pipeline running fully automated (including kinit) via crontab with no errors! That bodes well for our ability to sustain this sort of pipeline.

Week of 4/6 update: the pipeline wasn't *really* fully automated by 4/6... but it is now! We've been running on full automation for the past 4 days. I'm also working on a blog post (see subtask above).
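
For the curious, the automation boils down to a scheduled job that renews the Kerberos ticket before each run. A minimal sketch of such a wrapper; the paths, principal, and schedule are hypothetical, not our actual setup:

```python
# Minimal sketch of a cron-driven wrapper: renew the Kerberos ticket
# non-interactively (kinit with a keytab), then run the pipeline.
# Paths, principal, and the crontab line are hypothetical, e.g.:
#   0 6 * * * /usr/bin/python3 /home/user/run_pipeline.py >> pipeline.log 2>&1
import subprocess

def main():
    # Keytab-based kinit, so no password prompt when running under cron.
    subprocess.run(
        ['kinit', '-kt', '/home/user/user.keytab', 'user@EXAMPLE.REALM'],
        check=True,
    )
    # Then generate and publish the day's report.
    subprocess.run(['python3', '/home/user/generate_report.py'], check=True)

if __name__ == '__main__':
    main()
```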

@Isaac can you update this task based on what you and Jonathan have agreed on in terms of next steps? My understanding is that we can move this work to the Q4 lane and resolve it at the end of May when the pilot stops.

Yes, the following is what we've agreed to:

  • Our current view is that while some people find the report useful, we have not seen clear, broad adoption (based on pageviews and feedback). We're glad to have piloted it: we collected some useful feedback, and it will be much easier to reimplement in the future if there's a clear request / need. But our initial privacy review covered this as a pilot only, and while the script has so far run without any need for maintenance, we do not want to be indefinitely committed to maintaining it in case it breaks.
  • I will notify community members around May 15th that we are planning to officially end the report at the end of May (to give some time for additional feedback in case we should reconsider)
  • I will continue to collect the data on our end (but not publish the report) for at least another month while we determine next steps, what datasets we would need for research, etc.

Weekly update:

  • Presented this work to the research team
  • I did not put out the community notifications regarding the report today as I had hoped, but I am in the process of drafting them and will hopefully have them out by early next week.

I'll shortly be posting notice that we intend to end our maintenance of the report to the channels Jonathan initially announced the project on (wiki-research-l, analytics-l). Adding to what I wrote in T241768#6129556, I also did a few additional analyses to check whether the decision made sense:

  • We have not seen much vandalism, edit activity, or protection associated with these articles (see analysis below)
  • I also reached out to James about the privacy evaluation for this pilot and whether it allows us to continue reporting indefinitely or would require additional review.

Quick Analysis Details

Based on the 3,324 unique articles that had been posted to the Social Media Traffic Report as of May 18th, I gathered the first date on which each article was included in the report. I then gathered statistics about page protections before/after that date, along with the # of edits and the # of identity reverts made in the two weeks prior to and following it. This was a basic analysis (I lack a strong control group of articles, and many articles were posted multiple times, which ideally we would account for). 99% confidence intervals (via bootstrap resampling with 1000 iterations) are given as well:

Metric               | Two weeks prior to post | Two weeks following post
average # of edits   | 10.698 [8.347-13.726]   | 10.848 [8.551-13.286]
average # of reverts | 0.833 [0.695-0.994]     | 0.831 [0.698-0.981]

In summary, we see no significant changes in editor activity on the articles that appear on the social media traffic report.

For page protections, I found that 98% of the articles had no change in page protections after being posted to the social media traffic report. For 64 pages (2%), additional page protections were put into place. For 13 (<1%), page protections were removed or expired after the post.

Code: https://github.com/jtmorgan/social-media-traffic-reports/tree/master/analysis
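
For reference, the intervals above can be reproduced with a plain percentile bootstrap. A minimal sketch (the real code is in the repo linked above; the input is just the list of per-article counts):

```python
# Minimal percentile-bootstrap sketch: resample the per-article values
# with replacement 1000 times and take the 0.5th/99.5th percentiles of
# the resampled means, giving a 99% confidence interval for the mean.
import random

def bootstrap_ci(values, n_iterations=1000, alpha=0.01):
    means = sorted(
        sum(random.choices(values, k=len(values))) / len(values)
        for _ in range(n_iterations)
    )
    lower = means[int((alpha / 2) * n_iterations)]
    upper = means[int((1 - alpha / 2) * n_iterations) - 1]
    return sum(values) / len(values), lower, upper

# e.g., edits_prior = [# of edits in the two weeks before the post,
#                      one entry per article]
# mean, lower, upper = bootstrap_ci(edits_prior)
```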

@Isaac thanks for capturing this. You may want to look at the number of editors who have the page on their watchlist 2 (or 4) weeks before and after the introduction of the service.

The need we were asked to satisfy was that some enwiki community members were manually checking social media groups/sites and notifying the enwiki community when they expected a spike based on a conversation on those sites. One possible reaction on the editors' end, then, would be to add the page to their watchlist.

@leila good point. I will quickly add that check, though I'll say in advance that for this one, without the more robust comparison to a control set of articles, it'll be really hard to interpret the results. I don't have historical watchlist data (i.e., # of watchers in the weeks before an article was posted), so without pulling a control set of articles, I'll really only be able to compute the "two weeks following post" column, and it'll be hard to tell whether the increase is expected or not. But I very much agree that it's the right metric to pay attention to, especially if we expand these analyses.

Results suggest no strong uptick in people with these articles on their watchlist:

  • Change (99% confidence interval) in # of watchers after post: 3.346 [2.689-4.090]
  • Change (99% confidence interval) in # of visiting watchers after post: 1.950 [0.501-3.066]

Like I said, I don't love this analysis because there's no comparison group and it's hard to estimate what the baseline should be, though I highly doubt it's 0 -- i.e., a lot of these articles are relatively new or high-traffic, so we'd expect a fair number of editors to add them to their watchlist as part of normal activity outside of the Social Media Traffic Report. Unfortunately, what I have to go on is the difference between the # of watchers when the article was first posted and the # of watchers today. There's no way that I know of (e.g., dumps, streams, or eventlogging) to pull historical data on # of watchers, so that's what we've got :/

Code: https://github.com/jtmorgan/social-media-traffic-reports/blob/master/analysis/watchlist_analysis.ipynb
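
For reference, the current watcher counts come from the MediaWiki API's info properties. A minimal sketch of that lookup (batching, throttling, and error handling omitted; see the notebook above for the real code):

```python
# Minimal sketch: fetch current watcher counts for one article via the
# MediaWiki API (prop=info with inprop=watchers|visitingwatchers).
# Batching, throttling, and error handling are omitted here.
import requests

API_URL = 'https://en.wikipedia.org/w/api.php'

def get_watcher_counts(title):
    params = {
        'action': 'query',
        'prop': 'info',
        'inprop': 'watchers|visitingwatchers',
        'titles': title,
        'format': 'json',
    }
    pages = requests.get(API_URL, params=params).json()['query']['pages']
    page = next(iter(pages.values()))
    # The API omits 'watchers' when the count is below a privacy threshold.
    return page.get('watchers', 0), page.get('visitingwatchers', 0)

print(get_watcher_counts('Earth'))
```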

Thanks for checking this. I agree that not having a proper comparison group makes it challenging to understand the true effect. In the interest of time and given the limits of our resources, let's leave it at this. The biggest indicator of whether we should continue this work will be whether we hear from editors that they want it after the service is stopped. Not ideal at all, but given the constraints we have, I don't see another short-term option.

Weekly update:

  • Did rough analysis of first two months of report
  • Sent out emails to wiki-research-l + analytics-l about ending of pilot
  • In the process of confirming with Privacy that the pilot could run beyond May 30th without raising any additional privacy concerns.

Weekly update:

  • Monitored talk pages / email thread but no responses yet
  • Confirmed that data could continue to be collected beyond May 31st
  • We will shut down the public-facing report after May 31st, though, so there is clarity that it is not being maintained (we will of course still be open to feedback after that point, should we hear any)

Weekly update:

  • Turned off public report -- haven't heard anything via email / talk pages

@leila with your permission, I'll close out this task. Any additional research around these reports / social media traffic would be under a new task.

Sounds great. Thanks for all the work you and Jonathan did on this.