
Update QuickSurvey initiation to collect editCountBucket
Closed, Resolved, Public

Description

User Story

Update QuickSurvey Initiation to collect editCountBucket, similar to QuickSurvey Responses, based on discussions between @TAndic, @jsn.sherman, and @eigyan.

This will let QuickSurveys users adjust for nonresponse bias and weight data by editor type, because impression data and response data can then be compared per edit count bucket.

(Note: T303736 deployment date will need to be decided based on effort and timeline of this task.)

Technical information

Based on conversation with Sam Smith (Readers Web), the process to do this is:

  1. Update the analytics/legacy/quicksurveyinitiation schema (defined in the schemas/event/secondary repository)
    • Has been started here: https://gerrit.wikimedia.org/r/c/schemas/event/secondary/+/768014/
  2. Update both parts of the QS codebase that log QuickSurveyInitiation events
  3. Update any tests (IIRC there aren't any but we'd have to check)

Relevant links based on conversation with @jsn.sherman @eigyan:
We'd need to update logInitialized:
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/QuickSurveys/+/refs[…]er/resources/ext.quicksurveys.lib/vue/QuickSurveyLogger.js
As well as the schema:
https://meta.wikimedia.org/wiki/Schema:QuickSurveyInitiation
The actual schema file:
https://gerrit.wikimedia.org/r/plugins/gitiles/schemas/event/secondary/+/refs/heads/mas[…]onschema/analytics/legacy/quicksurveyinitiation/1.0.0.yaml
You can see where the edit count bucket lives in the response schema:
https://gerrit.wikimedia.org/r/plugins/gitiles/schemas/event/secondary/+/refs/heads/mas[…]onschema/analytics/legacy/quicksurveysresponses/1.0.0.yaml
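
As a rough illustration of step 2, the sketch below shows one way a bucketised edit count could be attached to an initiation event. It is not the code from the linked patches: the function names, the bucket labels, and the event fields (surveyCodeName, eventName) are assumptions made for the example, as is the choice to omit the property when no edit count is available (for example for logged-out users).

```javascript
// Hypothetical helper: maps a raw edit count to a coarse bucket label.
// The label strings are assumed for illustration and should be checked
// against the enum in the schema before relying on them.
function getEditCountBucket( editCount ) {
	if ( editCount >= 1000 ) {
		return '1000+ edits';
	}
	if ( editCount >= 100 ) {
		return '100-999 edits';
	}
	if ( editCount >= 5 ) {
		return '5-99 edits';
	}
	if ( editCount >= 1 ) {
		return '1-4 edits';
	}
	return '0 edits';
}

// Hypothetical event builder: adds editCountBucket only when an edit
// count is known, so the property is never sent as null.
function buildInitiationEvent( surveyCodeName, eventName, editCount ) {
	const event = { surveyCodeName: surveyCodeName, eventName: eventName };
	if ( editCount !== null && editCount !== undefined ) {
		event.editCountBucket = getEditCountBucket( editCount );
	}
	return event;
}

console.log( buildInitiationEvent( 'example-survey', 'impression', 42 ) );
```

Leaving the property out rather than sending null keeps the event valid if the schema declares editCountBucket as an optional string, which is also an assumption here.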

Testing and QA steps

  • TBD (e.g. beta cluster)
  • In the developer console, under the NETWORK tab, locate the events request and validate that the userEditCountBucket exists in the Request Payload
  • Validate that the new editCountBucket is being captured in the data farm.
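
For the Request Payload check, a small helper like the one below could be pasted into the console to confirm that a copied payload contains the property. This is only a sketch: the function name is made up, the sample payload is invented for illustration, and the real payload shape may differ, which is why the helper searches the whole object instead of assuming where the property lives.

```javascript
// Hypothetical helper: recursively searches a parsed JSON value for a
// property name and returns the first value found, or undefined.
function findField( value, fieldName ) {
	if ( value === null || typeof value !== 'object' ) {
		return undefined;
	}
	if ( Object.prototype.hasOwnProperty.call( value, fieldName ) ) {
		return value[ fieldName ];
	}
	for ( const child of Object.values( value ) ) {
		const found = findField( child, fieldName );
		if ( found !== undefined ) {
			return found;
		}
	}
	return undefined;
}

// Invented sample payload; replace with the payload copied from the
// NETWORK tab.
const samplePayload = JSON.stringify( {
	schema: 'QuickSurveyInitiation',
	event: { eventName: 'impression', editCountBucket: '5-99 edits' }
} );

console.log( findField( JSON.parse( samplePayload ), 'editCountBucket' ) );
```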

Acceptance Criteria

  • QuickSurvey initiation (impressions) collects editCountBucket.

Event Timeline

Very helpful update from @phuedx !

Following up on this thread: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/QuickSurveys/+/768194 covers adding the property to the QuickSurveyInitiation events published by QuickSurveys
I think it's worth creating a task to attach these patches to and to seek reviewers

Madalina triaged this task as Medium priority. Mar 16 2022, 10:42 AM
Madalina raised the priority of this task from Medium to High. Mar 16 2022, 3:50 PM

@TAndic, as per my conversation with @phuedx, I will be attaching his patch to this ticket in hopes of a beta deployment this evening.

Change 768014 had a related patch set uploaded (by Phuedx; author: Phuedx):

[schemas/event/secondary@master] analytics/legacy/quicksurveyinitiation: Add editCountBucket property

https://gerrit.wikimedia.org/r/768014

Change 768194 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/QuickSurveys@master] Log bucketised edit count for eligible/impression events

https://gerrit.wikimedia.org/r/768194

Change 768194 merged by Eigyan:

[mediawiki/extensions/QuickSurveys@master] Log bucketised edit count for eligible/impression events

https://gerrit.wikimedia.org/r/768194

Change 768014 merged by jenkins-bot:

[schemas/event/secondary@master] analytics/legacy/quicksurveyinitiation: Add editCountBucket property

https://gerrit.wikimedia.org/r/768014

Greetings, I am currently investigating why the EventLogging endpoint on the beta cluster is returning a 502 error for all events, including the QuickSurveyInitiation event that captures our new editCountBucket. Until the QuickSurveyInitiation event works on the beta cluster, we can't validate the new editCountBucket end-to-end. @TAndic


Hi folks,

Not sure if it is https://gerrit.wikimedia.org/r/c/mediawiki/extensions/QuickSurveys/+/768194 or something else, but QuickSurveyInitiation events now carry attributes like "performanceNow":1648718876207.3, whereas the schema specifies integers, so validation fails. A change in the schema may be required, or a follow-up in the JS code :)

Greetings @elukey, the performanceNow attribute was recently added to the schema to support a new timestamp, and I believe keeping it as an integer was an oversight. We will create a patch to address the issue.

Greetings @elukey, after talking with the team we determined that we need to wrap the performanceNow variable in Math.round() so that it is converted to an integer, which will resolve the validation error.
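
To illustrate the fix described above, here is a minimal sketch using the value from the failing event. The helper name is hypothetical and this is not the actual patch; it only shows that the fractional value is not an integer and that Math.round() produces one.

```javascript
// Hypothetical helper: rounds a fractional timestamp to the nearest
// integer so that it satisfies an integer-typed schema property.
function toIntegerTimestamp( value ) {
	return Math.round( value );
}

// Value taken from the failing event quoted in this task.
const rawPerformanceNow = 1648718876207.3;

console.log( Number.isInteger( rawPerformanceNow ) ); // false
console.log( Number.isInteger( toIntegerTimestamp( rawPerformanceNow ) ) ); // true
```

Rounding discards the sub-millisecond part of the value, so this only fits if integer millisecond precision is enough for the analysis.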

@TAndic is checking whether anything changed on her end, and will update the ticket with her findings.

For existing surveys that were deployed before this change (e.g. the safety survey and the performance survey), editCountBucket on impression shows up as a variable/data point that can be queried, but the query returns blanks, meaning this data is not being collected by these surveys. This assessment covers data from both before and after the T305171 patch was completed, and is based on looking at the ptwiki safety survey data and querying the performance survey impressions on Superset.

My understanding as of today:

  • @eigyan and I confirmed that this update was functional on beta and was able to collect editCountBucket on impression before deploying to production.
  • A potential reason that values were returning as NULL before the T305171 patch is the issue described in that ticket. The patch may not have resolved it, BUT:
  • An open question is whether surveys that were deployed before this update might not pick up editCountBucket at all.
    • To test this, a potential next step could be to deploy a new test survey now that all current hurdles we're aware of are fixed to see if it picks up editCountBucket on impression.

Summary: from my view this is not yet picking up data, but we do not know whether it's an issue of when a survey was deployed or an issue separate from that.

Greetings @TAndic and @Madalina, I wanted to bring you up to speed on this ticket and where we are now. Deploying T305171 did resolve the datatype mismatch, but we found an additional issue that caused NULL values to be passed into the database, and we corrected it with T306638. We are confident this will resolve all issues with editCountBucket based on our testing, but in order to fully test end-to-end we agree with the suggested path:

To test this, a potential next step could be to deploy a new test survey now that all current hurdles we're aware of are fixed to see if it picks up editCountBucket on impression.

Thanks @eigyan ! If we want to deploy a new test survey for this, would it make sense to begin thinking about the next iteration of the safety survey for June to avoid doubling up work?

Context: we're unsure if deploying the safety survey with the same configuration/name will make it so that users who answered in the 1st round but haven't cleared their browsers in the last 3 months will be sampled by QuickSurveys in the 2nd round. A potential way to work around this is to configure a new survey copying everything from the first one, which could also be used to test editCountBucket when we do QA.

I defer to your and @Madalina's judgement on how to proceed, and only suggest the above if you think it's helpful :) Happy to provide more information as well!

Greetings @TAndic, I shared my thoughts on this with @Madalina in a retro meeting today, and I am on board with a test if you like. @Madalina said she will follow up with you in your meeting today and share the details.