Enable microsurveys for long-term tracking of editing experience
Open, Medium, Public

Assigned To
None
Authored By
Whatamidoing-WMF
Feb 19 2015, 6:05 PM

Description

The WMF needs to know how well the editing experience is working, both for VisualEditor and for the wikitext editors. We should be asking editors for their feedback all the time, not just when they have signed up for research.

I request a simple, single-question microsurvey. It should take the editor no more than a few seconds to either answer it or cancel it.

My preferred "trigger" is upon saving an edit. A generic question like "Would you recommend editing Wikipedia to other people?" might be best, with a simple Likert scale [1] or perhaps yes-maybe-no options.

I want the data automatically collected and anonymized. Ideally, it would record some basic information about the account, such as the editing environment just used, whether the editor is logged in or logged out, the approximate age of the account, approximate number of edits, and/or userrights (e.g., autoconfirmed), so that it's possible to see whether brand-new editors have different views compared to experienced editors.
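To make the request concrete, here is a rough sketch of what one anonymized response record might contain; the field names and buckets below are illustrative assumptions, not an existing schema:

```python
# Hypothetical shape for one anonymized response record; field names
# and bucket boundaries are illustrative, not an existing schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SurveyResponse:
    answer: str                  # e.g. "yes", "maybe", "no"
    editing_environment: str     # e.g. "visualeditor" or "wikitext"
    logged_in: bool
    account_age_bucket: str      # e.g. "<1 month", "1-12 months", ">1 year"
    edit_count_bucket: str       # e.g. "<10", "10-100", ">100"
    user_rights: List[str] = field(default_factory=list)  # e.g. ["autoconfirmed"]
```

Bucketing the account age and edit count, rather than recording exact values, keeps individual responses harder to de-anonymize.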

This should not require a major engineering effort, since there are existing tools that could be adapted to collect this information.

If the rate is set to just 1% of edits (or even lower), with a cookie (or an account setting, for logged-in users) to prevent anyone from being asked more than once or twice a month, then we should be able to get enough information without spamming people all the time. A preference to opt out might be ideal; alternatively, logged-in users could hide the survey via personal CSS or JS.
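As a rough illustration of the sampling-plus-throttle rule above (the function and its inputs are hypothetical, not an existing MediaWiki API):

```python
# Minimal sketch of the 1%-of-edits rule with a monthly throttle.
import random
from datetime import datetime, timedelta
from typing import Optional

SAMPLE_RATE = 0.01              # 1% of edits, per the proposal
SNOOZE = timedelta(days=14)     # "more than once or twice a month"

def should_show_survey(last_shown: Optional[datetime],
                       opted_out: bool,
                       now: datetime) -> bool:
    if opted_out:                # opt-out preference, or personal CSS/JS
        return False
    # The cookie (or account setting) records when the user last saw one.
    if last_shown is not None and now - last_shown < SNOOZE:
        return False
    return random.random() < SAMPLE_RATE
```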

[1] https://en.wikipedia.org/wiki/Likert_scale

Event Timeline

https://en.wikipedia.org/w/index.php?title=Wikipedia:Village_pump_(technical)&oldid=801877570#Is_content_font_size_too_small.3F could be understood as a commentary on the absence of this feature. An editor wanted a straw poll, couldn't find a way to do it on wiki, and so resorted to creating a survey in Google Docs and asking editors to go off wiki to vote.

We should not pre-define the trigger. And we should make this tool more general, so that it can appear on any page we're interested in, such as Recent Changes. There are some other fine points to be worked out, but I definitely could use something like this.

> We should not pre-define the trigger. And we should make this tool more general, so that it can appear on any page we're interested in, such as Recent Changes. There are some other fine points to be worked out, but I definitely could use something like this.

I agree that the trigger shouldn't be limited to saving an edit. The search platform team has run some mini-surveys like this, and while the data we collected was very useful, some people were upset about the inability to opt out. I can imagine that having multiple surveys running at once would be very frustrating for those folks. They might not distinguish between the surveys, and opting out of one should opt them out of all of them.

I suggest that, at the very least, we have some mechanism for opting out of all mini-surveys, and that all mini-surveys honor it. I think we used a user setting for logged-in users and a multi-month cookie for logged-out users. The cookie may prevent people on a shared machine from seeing the survey, but that seems acceptable. Rather than making it binary, we could instead set a "time-of-next-survey" value, which would keep any survey taker from seeing another one for a day/week/whatever; for those who opt out, logged-in users would get "never" and logged-out users would get a date three months in the future, or something similar.
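A sketch of what that "time-of-next-survey" value could look like; the concrete intervals and the helper name here are assumptions, not a spec:

```python
# Sketch of "time-of-next-survey" replacing a binary opt-out flag.
from datetime import datetime, timedelta

NEVER = datetime.max  # sentinel for logged-in users who opt out

def time_of_next_survey(now: datetime, opted_out: bool,
                        logged_in: bool) -> datetime:
    if opted_out:
        # Logged-in users opt out forever; logged-out users are
        # suppressed for roughly three months via a cookie.
        return NEVER if logged_in else now + timedelta(days=90)
    # Anyone who just saw a survey waits before seeing the next one
    # ("a day/week/whatever"); a week is an arbitrary placeholder.
    return now + timedelta(days=7)
```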

Better, of course, would be a unified mini-survey platform, which would not only prevent the same person from seeing multiple surveys close together, but would also handle conflicts if multiple surveys end up on the same page. I realize that this may be too much scope creep for this particular ticket, but our current use cases include (with sample wordings and durations):

  • On the search results page: "Are you happy with these search results?", with 3-5 answers (possibly smiley icons), placed under the sister-projects results. Triggered at a fixed random rate; shown immediately; no timeout.
  • On specific article pages: "Would someone who searched for <X> want to visit on this page?", with 3 text answers (Yes/No/I don't know), floating on the page. Triggered by a random number specific to the page; shown after the user has been on the page for 30 seconds, with a 60-second timeout; don't count time while the tab doesn't have focus.

In both cases, we need an opt-out button and a link for more info. See T171740: [Epic] Search Relevance: graded by humans for more on what we've been doing so far in the absence of a unified mini-survey platform.
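For what it's worth, "a random number specific to the page" could be implemented deterministically, e.g. by hashing the page title together with a per-visitor token, so that each visitor gets an independent draw at that page's own rate. This is purely a sketch; none of these names exist anywhere:

```python
# Deterministic per-page sampling sketch.
import hashlib

def page_sample(title: str, session_token: str, rate: float) -> bool:
    digest = hashlib.sha256(f"{title}:{session_token}".encode()).hexdigest()
    # Map the first 8 hex digits onto [0, 1) and compare to the page's rate.
    return int(digest[:8], 16) / 0x100000000 < rate
```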

Talking about triggers, I can see the need for a few different ones just off the top of my head:

  • Visiting a specific page or specific pages (or potentially a namespace).
  • Performing a specific action.
  • Using a specific tool.

What is meant here by "long term"? Feedback must not be collected unless it's acted upon. Either it goes to a public place where people can make use of it, or an end date should be configured for the collection (say, one year), after which whoever is still using it can ask for the term to be extended.

Nemo, I want a "generic" satisfaction-oriented survey running indefinitely, with the basic data being posted publicly (and more or less instantly), so that anyone who is interested can see what happens. Think of the story this way:

  • We have a baseline satisfaction level of approximately X.
  • Someone changes the font.
  • The level drops dramatically.
  • Someone says "Oops" and reverts the font change.

Since I started this idea, other teams have named some interesting short-term uses. We'll need to have some way to schedule surveys, so that they can happen at relevant times. I've added it at T184752: Make it easy to schedule short-term microsurveys, to start and stop at pre-defined times.

> I suggest that, at the very least, we have some mechanism for opting out of all mini-surveys, and that all mini-surveys honor it. I think we used a user setting for logged-in users and a multi-month cookie for logged-out users. The cookie may prevent people on a shared machine from seeing the survey, but that seems acceptable. Rather than making it binary, we could instead set a "time-of-next-survey" value, which would keep any survey taker from seeing another one for a day/week/whatever; for those who opt out, logged-in users would get "never" and logged-out users would get a date three months in the future, or something similar.

@TJones, I completely agree with you. See T183032: Registered editors should be able to permanently opt-out of (micro)surveys, and please expand with any useful advice/information you have. I created a separate task about not over-spamming logged-out users at T184761: Logged-out users should be able to temporarily suppress microsurveys.

I wonder what you would think about having different rates for asking a question of logged-in and logged-out users? For example, maybe a logged-in user should have a 1% chance of a microsurvey, and a logged-out user should have a chance of 0.1%.

> Better, of course, would be a unified mini-survey platform, which would not only prevent the same person from seeing multiple surveys close together, but would also handle conflicts if multiple surveys end up on the same page.

I completely agree. See T182018: In the microsurvey tool, prevent highly active users from being spammed by multiple surveys and T183034: In the microsurvey tool, prevent highly active users from being spammed with the same survey on multiple wikis, and please expand them with any useful advice/information you have.

> In both cases, we need an opt-out button and a link for more info. See T171740: [Epic] Search Relevance: graded by humans for more on what we've been doing so far in the absence of a unified mini-survey platform.

You're right. I've added a sub-task about a cancel/go-away option at T184760: It should be easy to make a microsurvey question go away, without answering it.

Another important aspect, at least for how search is using micro-surveys, is per-page sampling rates: more specifically, targeting a number of impressions per week on a per-page basis rather than site-wide.

For our particular surveys, because they use a unique question for each page (e.g., "Would someone who searched for <X> want to visit on this page?"), we set sampling rates to target a particular number of impressions per week rather than using one top-level sampling rate for all pages. We implement this with a tool that queries the pageviews API to calculate average weekly views, from which we calculate a per-article sampling rate. This rate is stored in the cached HTML of each page, so the survey can do its sampling without additional web requests on every page view.
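A condensed sketch of that offline calculation, using the public Wikimedia Pageviews REST API; the target number and the helper name are illustrative, not our actual tool:

```python
# Compute a per-article sampling rate from average weekly pageviews.
import requests

API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "{project}/all-access/user/{article}/daily/{start}/{end}")

def per_article_rate(project: str, article: str, start: str, end: str,
                     target_per_week: int = 100) -> float:
    url = API.format(project=project, article=article, start=start, end=end)
    resp = requests.get(url, headers={"User-Agent": "microsurvey-sketch/0.1"})
    resp.raise_for_status()
    daily_views = [item["views"] for item in resp.json()["items"]]
    avg_weekly = sum(daily_views) / max(len(daily_views), 1) * 7
    # A page with fewer weekly views than the target is sampled on every view.
    return min(1.0, target_per_week / max(avg_weekly, 1.0))

# e.g. per_article_rate("en.wikipedia", "Likert_scale", "20180101", "20180131")
```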

@Whatamidoing-WMF, I don't have any additional brilliant insight on the other tickets you've opened, other than to say that they look like you've got the right idea.

Below is a summary of the points people made in the discussions that came out of our own initial survey efforts.

Highlights:

  • People can be spurred to help!
  • People can be upset by semi-automated/semi-random decisions made by the survey (in this case, pairing a particular query with a particular article).
  • Links to simple, high-level documentation and a place for feedback are good.
  • More detailed documentation is good.
  • People want to be able to opt out.
  • Other communication (like a blog post) is good. In particular, a user-friendly explanation/description you can point people to is good to have around.
  • Start small when you first roll it out, so people can let you know all the mistakes you made!
  • Documentation is good, so people can understand the point of the survey.
  • A popup is inherently annoying to some people.
  • Expert UI advice on placement & timing (when does it pop up, how long does it stay) and other stuff would be helpful.
  • People really want to be able to opt out.
  • Don’t ever repeat the same survey on the same page to a user.
  • One should probably look at the Article Feedback Tool—and its Talk page archives—and try to avoid the mistakes made on that project.

> I wonder what you would think about having different rates for asking a question of logged-in and logged-out users? For example, maybe a logged-in user should have a 1% chance of a microsurvey, and a logged-out user should have a chance of 0.1%.

That seems like a reasonable feature, but it would depend on your survey and how you model users (e.g., logged-in users are more likely to be motivated to take a survey, or more knowledgeable on the survey topic). It makes sense to me to be able to condition survey sample rates on lots of things. For our relevance-survey use case, we determined sample rates based on pageview data (computed offline before the survey) because what we really wanted was, say, 100 responses per page, so pages that get 10x as many pageviews can be surveyed at one tenth the sample rate. (As @EBernhardson explained above while I was composing this!)

So that brings up another feature request: limits on the total number of people surveyed. That is, stop the survey after n results have been gathered.

OTOH, I do worry a bit about feature creep as I keep making more suggestions. ;)

> So that brings up another feature request: limits on the total number of people surveyed. That is, stop the survey after n results have been gathered.
>
> OTOH, I do worry a bit about feature creep as I keep making more suggestions. ;)

Good suggestions can always be de-prioritized for an initial version, but added later, so please keep them coming. T184767: Make it possible to stop a survey after receiving a certain number of responses is the simple case. @EBernhardson, I feel like your needs might be somewhere between T184767 and T183941: Let me choose which page(s) my microsurvey question will be presented on. What do you think?

T183941 is probably relevant, but T184767 less so. The problem with T184767 is that we want to spread the surveys out: if we ask for 100 impressions per week, we don't want to get all 100 impressions in the first hour and then end the survey, which is what T184767 seems to imply. For 100/week, we want around 14/day, spread across the days and hours of operation, to reach all the different kinds of users.

> For 100/week, we want around 14/day, spread across the days and hours of operation, to reach all the different kinds of users.

Good point; I hadn't thought about that particular sampling issue. On the other hand, the ability to set an upper limit (e.g., I want 100 surveys over two weeks, so I set my limit to 500) is another backstop against some other misconfiguration that makes the survey fire 100x more often than planned.
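Something like the following could combine the two ideas, pacing against a daily budget while keeping the hard cap as a backstop; the counters are assumed to come from some shared store, and all names are illustrative:

```python
# Pacing plus a hard cap as a misconfiguration backstop.
from dataclasses import dataclass

@dataclass
class SurveyQuota:
    weekly_target: int   # e.g. 100 impressions/week
    hard_cap: int        # overall backstop, e.g. 5x the planned total

    def may_show(self, shown_today: int, shown_total: int) -> bool:
        if shown_total >= self.hard_cap:         # runaway-sampling backstop
            return False
        daily_budget = self.weekly_target / 7    # ~14/day for 100/week
        return shown_today < daily_budget
```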

> Nemo, I want a "generic" satisfaction-oriented survey running indefinitely, with the basic data being posted publicly (and more or less instantly), so that anyone who is interested can see what happens. Think of the story this way:
>
>   • We have a baseline satisfaction level of approximately X.
>   • Someone changes the font.
>   • The level drops dramatically.
>   • Someone says "Oops" and reverts the font change.
>
> Since I started this idea, other teams have named some interesting short-term uses. We'll need to have some way to schedule surveys, so that they can happen at relevant times. I've added it at T184752: Make it easy to schedule short-term microsurveys, to start and stop at pre-defined times.

You may need a way to periodically re-survey people who have already replied to an indefinitely running survey. If you don't, how can someone who has seen the change be surveyed?

A brilliant observation (as usual), Trizek. I've turned that into T184944: Microsurveys: Don't keep asking me too often, but do ask me again, eventually, on some surveys, which I invite you and anyone else who's interested to boldly improve.

No longer using the QuickSurveys component to track the creation/execution of surveys.

Sorry to split hairs in this ancient task, but I often see references to "a simple Likert scale" and I feel it's important to point out that, technically, there's no such thing. The intended use of the scale is to take a set of several "Likert item" questions, phrased in both positive and negative terms and exploring a wide range around a focused topic, which helps compensate for tendencies such as an individual respondent only ever choosing "agree" and "strongly agree". It's also contested whether a single Likert-type item yields a [1-5] value that can be meaningfully averaged across respondents, and so on. (These details are all documented in the wiki article linked from the description.)

I'm bringing up these questions here because we should be cautious about establishing best practices whose results may be hard to interpret statistically.