Global Readers Demographic Survey 2023
Open, Needs Triage, Public

Assigned To
Authored By
YLiou_WMF
Jul 14 2023, 4:12 PM

Description

In service of SDS1.1: (Defining Essential Metrics) Scientific Metrics, we are planning to field a representative survey of global Wikipedia readers. This survey will be fielded in multiple languages (currently planned to represent at least 90% of current Wikipedia readership) and will be completed, cleaned, and analyzed by the end of Q2 FY2023-2024.

This project relies extensively on previous survey work by @Isaac documented here.

The draft project plan will be updated here.

Survey instrument drafts are currently located here.

Major milestones are listed below:

  • Survey instrument draft (English)
  • Draft messaging (Quicksurveys opt-in, survey intro, research meta page, post-survey thank you page, etc.)
  • Survey translation
  • English-only pilot survey programmed, tested, and fielded
    • Programmed
    • Tested
    • Quicksurvey deployment requested
    • Fielded
  • Analysis of pilot survey
  • Revision of draft global survey
  • Global (multilingual) surveys programmed, tested, and fielded
  • Data cleaning, weighting, and preliminary analysis of global survey

Event Timeline

YLiou_WMF added a project: Research.
YLiou_WMF updated the task description.
YLiou_WMF moved this task from Backlog to In Progress on the Research board.

No major updates. Continuing work on survey draft. Length will be limited to 15-17 questions (including age screener). Length variation depends solely on whether a modified series of urbanity questions is adopted (vs. the current single-question format).

TBD:

  • Best format for language questions (current variations in both the CI and 2019 readers surveys are unwieldy from a UX perspective)
  • Urbanity question (current CI and 2019 readers surveys have a "coverage hole" for middle-sized cities and the suburban areas of those cities). See also this Pew note, although that research applies only to the United States.

Update:

Continuing to iterate on survey draft. Current working version here

Update:

Draft survey script has been completed and submitted for Legal review.

Update:

  • Draft privacy statement has been received from Legal
  • Continuing tweaks to draft survey script
  • Preparing translation "drafts": documents that ease translation (see here for past examples)

Update:

  • Workflow for translation has been determined and key decision points identified and described (see here: https://phabricator.wikimedia.org/T344947)
  • Translation templates are being created for languages using Movement Comms translation

Update:

  • English survey has been programmed in Limesurvey (testing underway)
  • Translation templates created for Movement Comms translation (being reviewed)
  • Reached out to Qualtrics rep for translation quote for Farsi, Simplified Chinese, and Dutch translations
YLiou_WMF updated the task description.

Update: lots more translation prep happened this week!

  • Translation templates completed (need to be reviewed again) for Movement Comms
  • Translation templates for Qualtrics translation created
  • Qualtrics survey for translation programmed
  • English pilot quicksurvey deployment requested

Update:

  • English pilot survey QS configuration updated to create a unique ID connecting QS logging with a prefilled Limesurvey variable (QS question in our Limesurvey programmed survey). Thank you to @Isaac for your invaluable assistance on this!
  • Thanks to the whole Research team for serving as our survey testers!
    • Will continue to iterate on how best to solve for the bot problem while minimizing weird UX problems with Limesurvey text entry
  • enwiki pilot survey ready to go live on September 25
  • Translation updates (see T344947 for details):
    • Most movement comms translations have been completed
    • Coupa request is being processed for Qualtrics translation.
    • Qualtrics contract has been signed
    • Qualtrics invoice has been forwarded to Merve Mursaloglu at Finance

Update:

  • pilot survey fielded with 1/300 coverage on enwiki beginning on Monday September 25
  • as of ~5:30 UTC on September 29, 1001 completes collected
  • pilot survey to be undeployed on October 2
  • Qualtrics translations (Farsi, Dutch, Simplified Chinese) to be completed and received by October 2 at latest

Update:

  • pilot survey undeployed
  • Qualtrics translations completed
  • analysis is underway!

Update:

  • Data cleaning and processing continues.
  • Note for future reference: coding skipped questions is onerous and complicated because Limesurvey does not currently record skips explicitly. We may need to consider an alternative survey programming practice, e.g., requiring a response while offering a prominent opt-out option.
    • almost all questions (apart from pseudo-CAPTCHA and 18+ question) currently offer opt-out option, but it can be made more prominent
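
To illustrate the ambiguity: when skips are not recorded, a blank answer can mean either that the respondent saw the question and skipped it, or that they never reached it. The following is only a minimal sketch of one way to separate the two, assuming the export carries a per-question time-on-question value (an assumption, not a description of the actual cleaning code). The function names, labels, and dictionary layout are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical labels for the three states a question can be in.
ANSWERED = "answered"
SKIPPED = "skipped"
NOT_SHOWN = "not_shown"


def code_response(answer: Optional[str], seconds_on_question: Optional[float]) -> str:
    """Classify one question for one respondent.

    Assumption: a positive time-on-question value means the question
    was displayed to the respondent.
    """
    if answer is not None and answer.strip() != "":
        return ANSWERED
    # Blank answer: use timing to tell a skip from a question never displayed.
    if seconds_on_question is not None and seconds_on_question > 0:
        return SKIPPED
    return NOT_SHOWN


def code_row(answers: Dict[str, Optional[str]],
             timings: Dict[str, Optional[float]]) -> Dict[str, str]:
    """Apply code_response to every question in one respondent's row."""
    return {q: code_response(answers.get(q), timings.get(q)) for q in answers}


# Hypothetical respondent: answered q1, skipped q2, broke off before q3.
example = code_row(
    {"q1": "A", "q2": "", "q3": ""},
    {"q1": 4.2, "q2": 3.1, "q3": None},
)
```

If several questions share a page, per-question timing may not be available, and the same rule would have to be applied at page level instead.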

Update:

  • Translations are still missing for Hungarian and Vietnamese and incomplete for a few others; see translation status update in T344947 for more details
  • Data cleaning and processing is complete for the enwiki pilot survey.
    • RE skip-coding: solution has been found using question-timing metadata. This should work in the future for all Limesurvey-hosted surveys
  • Was able to calculate peak Limesurvey "usage" at 316 simultaneously open surveys. Tanja has contacted Limesurvey to see if they can share their data on this (would be useful to compare to see if our calculations are accurate).
    • Posted density plot shows that almost all surveys are open at a time when few other surveys are open.
    • Similarly, simple scatterplot of number of simultaneously open surveys against total time survey is open (each point is an individual survey instance) shows that most surveys are closed quickly and "high demand" appears disproportionately due to a relatively small number of surveys that are left open for long periods. This suggests we might want to have Limesurvey time out after 20-30 minutes (1200-1800 seconds).
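
For reference, a peak count of simultaneously open surveys can be derived from per-survey open and close times with a simple sweep over sorted events. This is a rough sketch of that technique, not necessarily how the 316 figure above was computed. It assumes each survey instance is available as a (start, end) pair in seconds; the function names and the sample data are hypothetical. The second function shows how a time-out like the 20-30 minute one suggested above would be applied before recounting.

```python
from typing import Iterable, List, Tuple

Session = Tuple[float, float]  # (open time, close time) in seconds


def max_simultaneous(sessions: Iterable[Session]) -> int:
    """Return the largest number of sessions open at the same moment."""
    events: List[Tuple[float, int]] = []
    for start, end in sessions:
        events.append((start, 1))   # a survey opens
        events.append((end, -1))    # a survey closes
    # At equal timestamps, process closes (-1) before opens (+1) so that
    # back-to-back sessions are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def apply_timeout(sessions: Iterable[Session], timeout: float) -> List[Session]:
    """Cap every session at `timeout` seconds after it was opened."""
    return [(start, min(end, start + timeout)) for start, end in sessions]


# Hypothetical data: one survey left open for a long time, three short ones.
sample = [(0, 5000), (10, 60), (3000, 3050), (3010, 3060)]
peak_uncapped = max_simultaneous(sample)                      # 3
peak_capped = max_simultaneous(apply_timeout(sample, 1800))   # 2
```

In the sample, the single long-running session is what pushes the peak from 2 to 3, which mirrors the pattern described in the scatterplot bullet.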

Simple descriptives for extrapolating full sample survey demand:

  • Total impressions: 2,552,634
  • Total Quicksurvey interactions: 36,431 (1.4%)
  • Total "yes" interactions: 12,333 (0.5%)
  • Total surveys opened: 6,005
  • Max simultaneous surveys: 316
  • Last page: 1,576 (26.2%)
  • Saw first "real" page: 1,668 (27.8%)
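
The percentages in the descriptives above can be reproduced as follows. The choice of denominators is inferred from the numbers matching (impressions for the Quicksurvey rows, surveys opened for the page rows), not stated in the source. The `expected_opens` helper is a hypothetical, back-of-envelope extrapolation that assumes the pilot's opens-per-impression rate carries over to other wikis and coverage levels, which may well not hold.

```python
# Pilot counts copied from the descriptives above.
impressions = 2_552_634
qs_interactions = 36_431
yes_interactions = 12_333
surveys_opened = 6_005
reached_last_page = 1_576
saw_first_real_page = 1_668


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


rates = {
    "interaction rate (of impressions)": pct(qs_interactions, impressions),        # 1.4
    "yes rate (of impressions)": pct(yes_interactions, impressions),               # 0.5
    "last page (of surveys opened)": pct(reached_last_page, surveys_opened),       # 26.2
    "first real page (of surveys opened)": pct(saw_first_real_page, surveys_opened),  # 27.8
}


def expected_opens(planned_impressions: float) -> float:
    """Hypothetical extrapolation: scale opened surveys by the pilot's
    opens-per-impression rate (assumes that rate stays the same)."""
    return planned_impressions * surveys_opened / impressions
```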

{F38970374}
{F38973789}

Update:

  • Survey translations in all 24 versions programmed into limesurvey (see T350439)
  • Projects prepared for Quicksurvey deployment (see T344393)
  • Projects planned for full deployment this week following testing!

Update:

  • enwiki global readers survey launched this week at 2% coverage and is expected to complete data collection by November 23
  • As of 4:31 PM on November 17, we have collected 4375 complete responses (16,118 surveys opened)
  • the other 23 language versions will launch next week

Update:

  • enwiki global readers survey completed data collection on November 22!
  • 9476 completes, 8898 18+ completes.

Update:

  • As detailed in T344393, global readers survey (non-english) launched on November 28
  • Based on initial data collection November 28-29, surveys were relaunched November 29 at higher coverage for hiwiki, kowiki, elwiki, and idwiki
    • Note that we are continuing to face some challenges with hiwiki that may require further investigation: specifically, idiosyncratically low response rates among those clicking through from Quicksurvey

Update:

  • Global readers survey data collection continues (currently planned to continue through December 15)

Update:

  • survey data cleaning complete (see T353843 for details)
  • working on visualization code