
[2 hours]: Determine relative level of effort for targeting Quick Surveys for specific cases
Closed, ResolvedPublic

Description

The Global Reach team is interested in the ability to run Quick Surveys targeted for browser access originating from specific Wikipedia Zero operator networks.

@Lydia_Pintscher has also expressed interest in the ability to target surveys for editors with a particular number of edits.

This task is dedicated to identifying how much work would be involved with:

  • A new configuration variable for surveys for the operator code to facilitate such a survey.
  • A new configuration variable for surveys for a number of edits threshold upon which a user would become eligible for a survey.

Some (not all, there may be more) questions to spur thinking...

  1. Does ZeroBanner JavaScript provide sufficient context to the browser to make it possible to identify that the survey's operator code matches that of the operator?
  2. How would sampling and eligibility work to ensure that a large percentage of ResourceLoader capable UAs on a given zero-rated network would be presented with the survey, but repeated survey presentation would be avoided?
  3. How would sampling and eligibility work to ensure that a large percentage of editors exceeding a particular number of edits threshold would be presented with the survey, but repeated survey presentation would be avoided?

Event Timeline

dr0ptp4kt created this task.Jun 6 2016, 8:29 PM
Restricted Application added subscribers: Zppix, Aklapper. · View Herald TranscriptJun 6 2016, 8:29 PM
dr0ptp4kt renamed this task from Spike [2 hours]: Determine relative level of effort for targeting Quick Surveys for specific operators to Spike [2 hours]: Determine relative level of effort for targeting Quick Surveys for specific cases.Jun 8 2016, 12:43 PM
dr0ptp4kt added a project: Wikidata.
dr0ptp4kt updated the task description. (Show Details)
dr0ptp4kt updated the task description. (Show Details)
dr0ptp4kt added subscribers: Lydia_Pintscher, DFoy.

Deferring to sprint 76, to make way for the primary Q1 goal's investigation spike on the ULS/CLS extension.

Addshore moved this task from incoming to monitoring on the Wikidata board.Jun 22 2016, 1:02 PM
dr0ptp4kt updated the task description. (Show Details)Jun 30 2016, 4:52 PM
dr0ptp4kt raised the priority of this task from Normal to High.Jul 1 2016, 5:51 PM
atgo added a subscriber: atgo.Jul 6 2016, 9:56 PM
bmansurov renamed this task from Spike [2 hours]: Determine relative level of effort for targeting Quick Surveys for specific cases to [2 hours]: Determine relative level of effort for targeting Quick Surveys for specific cases.Jul 8 2016, 12:40 PM

Removed "Spike" from the title as the task has been marked with Spike.

Jhernandez claimed this task.

Regarding:

A new configuration variable for surveys for a number of edits threshold upon which a user would become eligible for a survey.

Estimated: 2 points

Tasks

  • Add config variable for edit buckets (validate values)
    • All if not defined
    • Or an array of bucket names:
      Bucket name | Edits
      0           | 0 or not defined
      1-4         | 1 ≤ edits ≤ 4
      5-99        | 5 ≤ edits ≤ 99
      100-999     | 100 ≤ edits ≤ 999
      1000+       | edits ≥ 1000
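
To make the bucket table above concrete, here is a minimal sketch of how an edit count might be mapped to a bucket name and checked against a survey's configured buckets. The function names `edit_bucket` and `is_eligible` are hypothetical and not part of QuickSurveys; only the bucket names and the "all if not defined" rule come from the task list.

```python
from typing import Optional, Sequence

# Bucket names taken from the table above.
VALID_BUCKETS = ("0", "1-4", "5-99", "100-999", "1000+")


def edit_bucket(edit_count: Optional[int]) -> str:
    """Return the bucket name for an edit count (None means not defined)."""
    if edit_count is None or edit_count <= 0:
        return "0"
    if edit_count <= 4:
        return "1-4"
    if edit_count <= 99:
        return "5-99"
    if edit_count <= 999:
        return "100-999"
    return "1000+"


def is_eligible(edit_count: Optional[int],
                buckets: Optional[Sequence[str]]) -> bool:
    """Hypothetical eligibility check for one survey.

    A survey with no buckets configured targets everyone; otherwise the
    configured names are validated and the user's bucket must be listed.
    """
    if buckets is None:
        return True
    unknown = [name for name in buckets if name not in VALID_BUCKETS]
    if unknown:
        raise ValueError(f"Unknown edit buckets: {unknown}")
    return edit_bucket(edit_count) in buckets
```

For example, a survey configured with `["100-999", "1000+"]` would be limited to users with at least 100 edits.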

Regarding the related question 3:

How would sampling and eligibility work to ensure that a large percentage of editors exceeding a particular number of edits threshold would be presented with the survey, but repeated survey presentation would be avoided?

QuickSurveys handles bucketing percentages, so a config variable lets you choose what percentage of the users matching the criteria will see the survey.

Regarding repeated survey presentation, QuickSurveys remembers if the user has dismissed or answered a survey per browser session and won't show it again to that user.

That said, and given we're talking about logged-in users with edits, if they log in from a different browser, they'll be bucketed again and may see the same survey again on a different device.

That's how it is implemented right now, and syncing user buckets across devices would be non-trivial (could probably be done via user prefs though).
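
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the QuickSurveys code: `should_show_survey`, `mark_dismissed`, the token values, and the `storage` dictionary (standing in for whatever per-browser storage is used) are all hypothetical.

```python
import random
from typing import Callable, Dict


def should_show_survey(survey_name: str,
                       coverage: float,
                       storage: Dict[str, str],
                       rng: Callable[[], float] = random.random) -> bool:
    """Bucket the browser once, remember the result, honour dismissals.

    coverage is the fraction (0.0 to 1.0) of matching users to survey.
    """
    key = f"survey-token-{survey_name}"
    token = storage.get(key)
    if token == "dismissed":
        # The user already dismissed or answered in this browser.
        return False
    if token is None:
        # First visit in this browser: draw a bucket and store it.
        token = "in-sample" if rng() < coverage else "out-of-sample"
        storage[key] = token
    return token == "in-sample"


def mark_dismissed(survey_name: str, storage: Dict[str, str]) -> None:
    """Record that the survey was dismissed or answered in this browser."""
    storage[f"survey-token-{survey_name}"] = "dismissed"
```

Because the token lives in per-browser storage, a second device starts with empty storage and draws again, which is the cross-device repeat described above.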

@Lydia_Pintscher Can you check T137151#2457107 and see if it answers your questions?


I'll get to the zero banner part now.

Regarding:

A new configuration variable for surveys for the operator code to facilitate such a survey

Surveys are tied to configuration changes to the QuickSurveys extension that are deployed by engineering personnel.

It doesn't seem like a good idea that operators all around the world will be sending us surveys and configuration changes that will need to be modified and deployed by engineers.

Besides that, technically:

  1. Does ZeroBanner JavaScript provide sufficient context to the browser to make it possible to identify that the survey's operator code matches that of the operator?

After researching, it doesn't seem like it. No config variables seem to be sent to the client (1) from what I've been looking at.

They could probably be exposed if it was really necessary, or checked on the server so that only the relevant surveys are sent to the client. Tying QuickSurveys to explicitly depend on ZeroBanner doesn't seem like a good idea technically, though.

I'd estimate this to be an 8 pointer if we really wanted to work on it. Maybe more (it may take some time to come up with a reasonable approach).
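
As a rough sketch of the server-side option mentioned above, the server could filter the survey list by operator before sending anything to the client. Everything here is an assumption for illustration: the `operator` config key, the `surveys_for_request` function, and the way the operator code for a request would be obtained are hypothetical, and the codes in the usage example are placeholders.

```python
from typing import Dict, List, Optional


def surveys_for_request(surveys: List[Dict[str, str]],
                        operator_code: Optional[str]) -> List[Dict[str, str]]:
    """Return only the surveys relevant to the requesting network.

    A survey without an "operator" key (hypothetical) is unrestricted;
    otherwise it is kept only when its operator matches the request's.
    """
    selected = []
    for survey in surveys:
        target = survey.get("operator")
        if target is None or target == operator_code:
            selected.append(survey)
    return selected


# Usage with placeholder operator codes:
configured = [
    {"name": "general-survey"},
    {"name": "operator-survey", "operator": "000-00"},
]
on_network = surveys_for_request(configured, "000-00")
off_network = surveys_for_request(configured, None)
```

Filtering on the server would keep QuickSurveys' client code unaware of ZeroBanner, at the cost of the server needing to know the operator for each request.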

  2. How would sampling and eligibility work to ensure that a large percentage of ResourceLoader capable UAs on a given zero-rated network would be presented with the survey, but repeated survey presentation would be avoided?

I don't see what the issue would be here, normal QuickSurvey bucketing and selection mechanisms apply. Any thoughts on why this could be problematic @dr0ptp4kt?

@Lydia_Pintscher Can you check T137151#2457107 and see if it answers your questions?


I'll get to the zero banner part now.

Yes this kind of bucketing would work for my use-case (taking samples of the attitude of experienced Wikipedia editors towards Wikidata over a longer period of time). Thanks for looking into it.

It doesn't seem like a good idea that operators all around the world will be sending us surveys and configuration changes that will need to be modified and deployed by engineers.

I believe this would be facilitated by @DFoy on a case by case basis.

They could probably be exposed if it was really necessary, or checked on the server so that only the relevant surveys are sent to the client. Tying QuickSurveys to explicitly depend on ZeroBanner doesn't seem like a good idea technically, though.
I'd estimate this to be an 8 pointer if we really wanted to work on it. Maybe more (it may take some time to come up with a reasonable approach).

It occurred to me I could maybe check this with a live connection. I added my IP to a test configuration and it looks like the interstitial that comes in by way of ResourceLoader goes directly after an encoded JSON object. Does that change the tenability of accessing a variable or change the point value materially?

  2. How would sampling and eligibility work to ensure that a large percentage of ResourceLoader capable UAs on a given zero-rated network would be presented with the survey, but repeated survey presentation would be avoided?

I don't see what the issue would be here, normal QuickSurvey bucketing and selection mechanisms apply. Any thoughts on why this could be problematic @dr0ptp4kt?

No. Based on your note in T137151#2457107, it sounds like the first step is to check whether the user could in principle be surveyed, and then eligibility is calculated to determine whether to show the survey, with a session cookie or the like helping to avoid repeat sampling. In practice this seems to imply that surveys would best be run over a short period to avoid showing the same survey too many times to a user (and an analyst looking at the data should be aware of the possibility of a user answering multiple times if the eligibility rate is cranked up).
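
To put a rough number on that concern, here is a back-of-the-envelope sketch. It assumes a simplified model that is not confirmed by the thread: each new session (or device) is an independent draw with the same eligibility rate, and nothing carries over between sessions. The function name is hypothetical.

```python
def prob_repeat_presentation(sessions: int, coverage: float) -> float:
    """Probability that a user is shown the survey in two or more sessions.

    Assumes each session is an independent draw with probability
    `coverage`, so the count of presentations is binomial.
    """
    if sessions < 2:
        return 0.0
    shown_never = (1 - coverage) ** sessions
    shown_once = sessions * coverage * (1 - coverage) ** (sessions - 1)
    return 1 - shown_never - shown_once
```

Under this model the repeat probability grows with both the number of sessions and the eligibility rate, which is consistent with running surveys over a short period when the rate is high.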

dr0ptp4kt closed this task as Resolved.Jul 16 2016, 2:04 AM

Marking as Resolved, although quick question for @Jhernandez.

@dr0ptp4kt Ohh, I missed that :) It clarifies where to get information from. Regarding points I'd estimate around an 8 pointer.

I thought that might be the case, but figured I should ask. Thanks @Jhernandez!