Submit paper on reader demographics surveys for peer-review
Closed, Resolved (Public)

Description

Submit academic paper based on reader demographics surveys to bring results and implications to the research community.

Event Timeline

Weekly update: proposed a narrative for the paper that is less narrow and gives a more comprehensive view of what we know about Wikipedia readership. Summary:

In this paper, we seek to provide a comprehensive overview of Wikipedia readers. We compile extensive data from past surveys and complement it with a large-scale survey of 14 different language communities. We approach our analyses through the lens of understanding to what degree Wikipedia is fulfilling its mission of providing free access to the sum of all human knowledge to every single person on the planet.

We ask the following research questions:
* How should we define a reader of Wikipedia?
* Are observed differences between reader motivations and needs across languages a function of language/culture or individual demographics and the reach of those language editions?
* How do readership gaps relate to content and contributor gaps?

We make the following contributions:
* We coalesce past large-scale surveys of Wikipedia readers and identify trends and questions raised by these surveys.
* We present the results from a new large-scale survey of Wikipedia readers across 14 different language communities that allow us to relate individual reader motivations, demographics, and reading behavior.
* We develop an approach to labeling Wikipedia articles according to a consistent topic hierarchy across all language editions.
* We identify the presence of self-focus bias amongst Wikipedia readers -- i.e. readers read content that reflects their identities -- demonstrating a connection between readership gaps and content and contributor gaps.
* We present a number of hypotheses for why we see certain barriers to readership with a specific focus on how the gender of readers relates to readership.

Weekly update:

  • Waiting to hear on Science abstract before proceeding with narrative described above
  • A few additional analyses:
    • Changing confidence intervals from 99% to 95% does not change the results much -- I will continue to use 99% given that it's the more appropriate option for the number of comparisons being made
    • Verified no relationship between gender and day of week / time of day
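
The reasoning about interval levels and the number of comparisons can be illustrated with a minimal sketch. It assumes a normal-approximation (Wald) interval for a proportion and a Bonferroni-style bound, neither of which is confirmed as the method used for the paper. The counts and the choice of 14 comparisons (one per language) are hypothetical, for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, level):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def familywise_level(level, comparisons):
    """Bonferroni lower bound on the chance that all intervals cover at once."""
    return max(0.0, 1 - comparisons * (1 - level))


# Hypothetical counts (not survey data): 650 of 1000 respondents are men.
men, respondents = 650, 1000
ci95 = proportion_ci(men, respondents, 0.95)
ci99 = proportion_ci(men, respondents, 0.99)

# Hypothetical number of comparisons: one per surveyed language.
joint95 = familywise_level(0.95, 14)
joint99 = familywise_level(0.99, 14)
```

Under these assumptions the 99% interval is only modestly wider than the 95% one, while the guaranteed joint coverage across 14 intervals is much higher at the 99% level. This is consistent with the note that the results change little but 99% is the safer choice.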

Weekly update:

  • Provided concrete examples of top-level findings in support of Science abstract
  • Computed what % of page views come from men/women for each language based on our survey results -- in general, the share of pageviews from men is about 5-10% higher than the proportion of readers who are men, because men also consistently had slightly longer reading sessions than women.
  • Green light from team to begin expanding out comprehensive paper on Wikipedia readership
  • Florian will get back to me on deleting data from WtWRW project to free up space on stat1007
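
The gap between reader share and pageview share noted above follows from simple weighting, sketched below. The function name and all numbers are hypothetical, and session length is modeled as average pages viewed per reader, which is an assumption rather than the definition used in the analysis.

```python
def pageview_share(reader_share, pages_per_reader_group, pages_per_reader_other):
    """Share of pageviews from a group, given its share of readers and
    the average pages viewed per reader in that group and in the rest."""
    group_views = reader_share * pages_per_reader_group
    other_views = (1 - reader_share) * pages_per_reader_other
    return group_views / (group_views + other_views)


# Hypothetical inputs (not survey results): 60% of readers are men,
# averaging 3.0 pages per session versus 2.5 for everyone else.
share_of_views = pageview_share(0.60, 3.0, 2.5)
```

With equal session lengths the pageview share equals the reader share; any group with longer sessions contributes a larger share of pageviews than of readers.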

Weekly update:

  • Waiting to hear on Science abstract

Weekly update:

  • Attention turned towards Nature Human Behaviour

Weekly update:

  • No update on NHB paper abstract
  • Reacquainting myself with the broader readership paper proposal so that I can begin fleshing it out, potentially for CSCW (Apr 15)

Weekly update:

  • Plan forward for NHB writing (4/30 internal deadline) -- note that this is a change from the end of this quarter

@Isaac Please move this task to the next quarter's goals, in line with the new deadline you mentioned here.

(And for others reading this task: the context for pushing the timeline for this task to Q4 is that we need more time to submit the work to Nature Human Behaviour, which is the venue that has indicated early interest in receiving the full manuscript. Depending on NHB's decision, we will decide whether to submit the rest of the work to CSCW; we will make this decision by July 2020, in time for the next CSCW deadline.)

@leila done in that I moved the task to the Q4 lane on the Phabricator board. If there is more to it than that, please let me know.

Weekly update: tentative start on writing the paper, though this has mostly been on hold while I handle the emergent Covid-19 dataset generation work.

Weekly update:

  • Wrote the introduction to the paper -- feedback was that a higher-level connection to things outside of Wikimedia (i.e. not just knowledge equity etc.) should be worked into the introduction.
  • Because that higher-level connection is not yet clear, the rest of the team requested that the data/results be written before returning to the introduction.

Weekly update:

  • Drafted results and methods sections
  • Generated results figures to support the text

Weekly update:

  • Iteration with team on results section

Weekly update:

  • Continued iteration on narrative / results with team

Weekly update:

  • Continued iteration on narrative / results with team

Weekly update:

  • Progress on writing -- goal to submit early next week

Weekly update:

  • Paper complete. Waiting for the go-ahead from everyone to submit, and for the accompanying letter.

Weekly update:

  • Paper submitted!
  • Will wait to hear the initial response from NHB before choosing whether to upload the submission to arXiv (if positive, then upload; if negative, then the decision to upload depends on what we choose to do with the paper)

@leila permission to close this? If we need to adjust course etc., that can be part of the parent task: T230677