Define research agenda for external re-use
Closed, Resolved (Public)

Description

Based on the findings from T235780, define research questions that we should be addressing as part of better understanding and improving external re-use of Wikimedia content.

Event Timeline

Weekly update: continued progress on organizing panel on re-use for All-Hands, which will provide an informal venue for me to share out some of what we know and gather concerns, feedback, suggestions from staff about this area.

Weekly update:

  • Outlined basic narrative and desired data around interplay of Wikipedia and Search
  • Related:
    • Patch for external data referers that I was supporting is now in production! See: T239625
    • Began planning expanded queries for social media traffic report (T241768) as part of initial inquiries into editor impact of external traffic

Weekly update:

  • continued preparation for All-Hands panel on reuse
  • iterated with Jonathan on social media traffic report planning -- in particular looking for anecdotal evidence of a link between external referrals and editing to help us gauge what we think those relationships will look like. Early observations suggest that YouTube fact-checking links lead to a low level of vandalism; Reddit links seem to lead to minor, positive edits but can have larger positive impacts depending on the community; and Twitter/Facebook links showed little evidence of leading to editing.

Weekly update:

  • completed panel on re-use, which will turn into quarterly gatherings to discuss re-use related topics
  • research questions around search on hold until April

Weekly update: began thinking about how reuse might fit into annual planning but need to set aside time to focus on this goal.

Weekly update:

  • began moving final piece of literature review to Meta
  • provided input to product analytics on measuring the potential effect of changes in how Google handles featured snippets

Per a long discussion with @leila :

Decision

The research agenda will focus on re-use of Wikimedia projects within Wikimedia -- specifically starting with having a better understanding of how Wikidata is used within Wikipedia. I will use the next few weeks to help solidify / shape this, but here are my thoughts at this time:

  • This internal-to-Wikimedia between-project re-use is my understanding of how external reuse is defined as a metric of interest per the MTP. Regardless, understanding cross-project usage will help inform internal metrics.
  • Focusing inward means that we can more completely measure and experiment with approaches to reuse.
  • Focusing on Wikidata (and hopefully other projects) will bring research attention to these relatively understudied projects and push Wikipedia research in the direction of not studying a single language in isolation.
  • Many of the challenges that we face externally have analogues inside of Wikimedia -- e.g., page previews can be viewed as rich search results, editing transcluded Wikidata information from Wikipedia is similar to proposals to allow editing of Wikipedia from search results. This focus on the usage of Wikidata within Wikipedia is itself the start of research that looks at the broader value of linked open data.
  • As opportunities arise, we may consider research into other facets, but this will at least be the focus.

Related tasks: T247099 and T246709

Other projects considered
  • Building up a set of articles to track referrals and how they are surfaced in search results. These articles could then be tracked over time to watch for changes in how searchers encounter Wikipedia. For example: T246737
  • Better tracking re-use of text between Wikipedia articles and language editions. This comes out of Martin Potthast's work and is similar to the approaches being considered under T243256.
  • Research into the impact of external reuse on Wikimedia brand, pageviews, and ability to attract editors. There are anecdotally many instances of external usage of Wikimedia content that do not fully adhere to attribution standards, but, with limited resources, what types of reuse should we focus on improving? While this research area is incredibly important, we do not yet have a clear path for research that looks into these questions.

@Isaac thanks for the update. Some points on my end:

  • You have my sign-off for closing this quarterly task. Please resolve when you're ready for it.
  • I recommend you have a conversation with Kate and Toby (and maybe Jon Katz) to make sure they're aware of where you are with this research.
  • My read of the MTP, considering last year's discussions, is that "external" in the language of the MTP means external to Wikimedia projects. I am, however, up for expanding the definition of "external", though I encourage you to talk about this with Toby. Please bring me in as needed. Based on what you have learned in this first year, some updates to the language of the MTP may be warranted.
  • Consider presenting what you've learned in one of the Research showcases. It's important that we share this first year of learning. You've done a lot of research to arrive here and it's good to have it communicated (both on Meta and via the showcase).

Thanks for all your work on this front.

You have my sign-off for closing this quarterly task. Please resolve when you're ready for it.

Thanks -- I will do so, probably in the next week.

I recommend you have a conversation with Kate and Toby (and maybe Jon Katz) to make sure they're aware of where you are with this research.

Thanks for setting up the meeting with Kate. We had discussed in December but good to touch base again as things have changed.

My read of the MTP, considering last year's discussions, is that "external" in the language of the MTP means external to Wikimedia projects. I am, however, up for expanding the definition of "external", though I encourage you to talk about this with Toby. Please bring me in as needed. Based on what you have learned in this first year, some updates to the language of the MTP may be warranted.

Yeah, definitely confusion here that we could work out. If it really is external to Wikimedia sites, then perhaps that might result in us making some adjustments to our research goals.

Consider presenting what you've learned in one of the Research showcases. It's important that we share this first year of learning. You've done a lot of research to arrive here and it's good to have it communicated (both on Meta and via the showcase).

Thanks. I'd like to do some additional work beyond the literature review and some of the cleaning up of our internal analytics that I kicked off. But hopefully in a few months, I'll feel it's the right time to share more broadly.

Closing this task as next directions have been defined per T242170#5966074 and the literature review has been uploaded to Meta: https://meta.wikimedia.org/wiki/Research:External_Reuse_of_Wikimedia_Content/Background#Snippets_%28Search%29