QA replying workflow (v1.0) instrumentation
Closed, Resolved (Public)

Description

This task covers verifying that we are logging data in a way that lets us conduct a quantitative analysis of how people are engaging with the new replying workflow and of the impact its deployment is having on user behavior.

Events to check

See the Event::Schema sheet in this Google Sheets workbook: Talk pages/Replying/Instrumentation spec

Open questions

  • Where and how should results of this QA be documented?

Results can be viewed and documented in this Template created for QAing the Replying features

Done

Status | Description | Owner
not started | For each action [1] in the Talk pages/Replying/Instrumentation spec, document what event is fired [2] | @Ryasmeen
not started | For each event that is fired [2], document whether that event is being logged in the database | @Mayakp.wiki

  1. "Action" = workbook: Talk pages/Replying/Instrumentation spec > sheet: Event::Schema > column: Event Description
  2. "Event that is fired" = workbook: Talk pages/Replying/Instrumentation spec > sheet: Event::Schema > column: Schema + Field
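
The two-step check in the table above could be tracked with something like the following. This is only a minimal sketch: the `QaRow` class, its field names, and the status strings are hypothetical and are not taken from the spec workbook or the QA template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QaRow:
    """One row of a hypothetical QA sheet (names are illustrative only)."""
    action: str                    # "Event Description" column in the spec
    expected_event: str            # "Schema + Field" column in the spec
    fired: Optional[bool] = None   # step 1: was the event seen firing?
    logged: Optional[bool] = None  # step 2: did the event land in the database?


def status(row: QaRow) -> str:
    """Summarize how far a row has progressed through the two QA steps."""
    if row.fired is None:
        return "not started"
    if not row.fired:
        return "event not fired"
    if row.logged is None:
        return "fired, awaiting data QA"
    return "verified" if row.logged else "fired but not logged"
```

A row starts as "not started", moves to "fired, awaiting data QA" once the first owner confirms the event fires, and becomes "verified" only after the second owner confirms it in the database.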

Event Timeline

Next step

  • This week, we will meet to discuss the QA process and determine where and how the results of this QA should be documented.

nshahquinn-wmf triaged this task as Medium priority.

Discussed the QA process with DL, PP, and RY on 02-18, and met with Rummana on 02-19 this week to walk through the events.
Rummana will do one round of testing on Beta Cluster prior to the production release to see whether events are firing correctly. The QC template for recording the events is available here and is also posted in the task description.

When @Ryasmeen tries to abandon the page by navigating to another page, the URL is cleared and the events do not get tracked. There might be other ways to test these, but we need feedback from @DLynch to confirm them. Please reference https://docs.google.com/spreadsheets/d/1txtypYKZHCiZEFnkye9LX7riF3VEippMmvN6ILs_BUQ/edit#gid=1596206308

Change 574919 had a related patch set uploaded (by DLynch; owner: DLynch):
[mediawiki/extensions/DiscussionTools@master] Instrumentation: abort-navigate case

https://gerrit.wikimedia.org/r/574919

There's a combination of missing and questionably-applicable involved here:

First, the navigate case wasn't firing, and the patch fixes that.

After that we have navigate-back, which represents a way of exiting the editor in VE that is not actually browser navigation. It is an artifact of VE/NWE being full-page takeovers of an article page that aren't actually fresh page-loads. A side-effect of this is that we can't reliably detect navigate-back when not in a page-takeover, so I don't think we can detect it in this DT experience at all.

navigate-read fires in VE when you click the "read" tab or when you exit the editor by pressing the escape key. The former we could detect but it seems strange and pointless. The latter does still close the DT widget, but it's wildly semantically different, so maybe we shouldn't?

I personally would consider the patch landing and adding in just navigate to leave us in a reasonable state.

There's a combination of missing and questionably-applicable involved here...

Thank you for laying this out, David. Would I be correct to understand https://gerrit.wikimedia.org/r/574919 as having the effects listed below? If yes, then +1 to what you suggested in T244874#5918759: "...consider the patch landing and adding in just navigate to leave us in a reasonable state."


Patch: 574919 effects:

  • As it is currently written, https://gerrit.wikimedia.org/r/574919 enables us to detect when a contributor aborts their comment by navigating to another page and whether they have started typing a comment at the time when they are aborting their comment.
  • As it is currently written, https://gerrit.wikimedia.org/r/574919 does NOT enable us to detect what exactly a contributor did (e.g. clicks "Read" tab or clicks the browser's "back button") to abort their comment.
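
To make the two effects above concrete, here is a minimal sketch. The function name, the `exit_kind` values, and the event keys (`mechanism`, `started_typing`) are hypothetical and are not the actual schema fields. It only illustrates that every page exit collapses into a single "navigate" abort, while the typing state is still recorded.

```python
def build_abort_event(exit_kind: str, started_typing: bool) -> dict:
    """Build an illustrative abort event for a contributor leaving the page.

    exit_kind describes what the contributor actually did (for example
    "read-tab" or "back-button"). It is deliberately ignored: in this
    sketch the specific way of leaving is not distinguishable, so every
    exit is reported as a plain "navigate".
    """
    return {
        "action": "abort",
        "mechanism": "navigate",           # hypothetical key name
        "started_typing": started_typing,  # hypothetical key name
    }


# Two different ways of leaving produce the same event...
same = build_abort_event("read-tab", True) == build_abort_event("back-button", True)
# ...but whether a comment was started is still visible.
differs = build_abort_event("link", True) != build_abort_event("link", False)
```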

Change 574919 merged by jenkins-bot:
[mediawiki/extensions/DiscussionTools@master] Instrumentation: abort-navigate case

https://gerrit.wikimedia.org/r/574919

matmarex removed a project: Patch-For-Review.
matmarex subscribed.

We found that https://gerrit.wikimedia.org/r/574919 doesn't actually work right now due to a recent regression in EventLogging: T246382. @DLynch debugged that and proposed a patch, but we'll probably want someone from Analytics and/or Performance to review it. Once that bug is fixed, our abandon logging should start working.

The not-logging issue should also currently be affecting VisualEditor, as I understand it.

Proposed plan for QA, as discussed in the 02-27 meeting with PP, DL, RY, MK, JT:

Thursday, 27-Feb: instrumentation is live in production (does not include navigate)
RY is not able to test any abandon (navigate) events in production
RY finishes testing all events in production EXCEPT for abandon (navigate) events
✅ MK: try to give DL numbers on abandon (navigate)

Friday, 28-Feb – Monday, 2-Mar: all events that have been "fired" by RY
MK: completes data QA to ensure events are landing in DB as we expect them to

Monday/Tuesday, 2/3-March:
As soon as T244874 (Patch: 574919) and T246382 (Patch: 575314) are +2'd, RY can start testing abandon-navigate events on Beta Cluster using ?trackdebug=1. For Maya to check the logs on Beta, however, the second patch needs to be merged.
Maya will provide a list of issues found
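
For testers who want to add the ?trackdebug=1 parameter to an arbitrary page URL without hand-editing it, a small helper could look like the sketch below. The helper name and the example URLs are hypothetical. Only the parameter name and value come from the plan above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_trackdebug(url: str) -> str:
    """Return url with trackdebug=1 set, keeping other query parameters."""
    parts = urlsplit(url)
    # Drop any existing trackdebug value so the parameter appears once.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "trackdebug"
    ]
    query.append(("trackdebug", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL, not a real test page.
print(with_trackdebug("https://example.org/wiki/Sandbox"))
```

As noted earlier in this task, navigating to another page clears the URL, so the parameter has to be re-applied on each page being tested.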

Thursday, 5-March: T244874 and 575314 land in production (assuming train runs as we expect)
Soonest MK can do production Data QA for abandon-navigate events
Soonest RY can make sure abandon-navigate events are firing in production

Note: Any impacts due to the patch not landing on time will alter the timeline given above.

Change 575378 had a related patch set uploaded (by Bartosz Dziewoński; owner: DLynch):
[mediawiki/extensions/DiscussionTools@master] Correct the integration for logging

https://gerrit.wikimedia.org/r/575378

Change 575378 merged by jenkins-bot:
[mediawiki/extensions/DiscussionTools@master] Correct the integration for logging

https://gerrit.wikimedia.org/r/575378

My understanding of the current state is: almost everything is as-expected.

There are two issues which fall under the aegis of being not-right but also known-and-expected:

  • T244942 means there's one way to escape the page without triggering an abort (and, more importantly, without asking "are you sure?" for people with unposted comments)
  • T246382 means that any abort events of type navigate will be recorded very-unreliably. That's a platform-wide issue, not one with the code in DiscussionTools, though.
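
As a rough illustration of why navigate aborts in particular could be recorded unreliably, here is a toy model. It is not EventLogging's actual code. It assumes, for illustration only, that events wait in an in-memory queue until a flush sends them, and that anything still pending when the page goes away is lost. All names are hypothetical.

```python
class BatchingQueue:
    """Toy queue: events wait in memory until flush() sends them."""

    def __init__(self, send):
        self._send = send
        self._pending = []

    def add(self, event):
        self._pending.append(event)

    def flush(self):
        for event in self._pending:
            self._send(event)
        self._pending.clear()


def simulate_page(flush_on_unload: bool) -> list:
    """Return the events that were actually sent during one page view."""
    sent = []
    queue = BatchingQueue(sent.append)

    queue.add({"action": "init"})
    queue.flush()  # a flush that happens while the page is still open

    # The abort is queued at the very moment the contributor leaves.
    queue.add({"action": "abort", "mechanism": "navigate"})
    if flush_on_unload:
        queue.flush()

    # The page is gone now; anything still pending is never sent.
    return sent
```

In this model `simulate_page(False)` sends only the init event, while `simulate_page(True)` also sends the abort, which is why a queue that flushes during unload matters most for events fired while the page is closing.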

My understanding of the current state is: almost everything is as-expected...

This is a helpful summary – thank you, David.

Below are the next steps we talked about in chat.

Next steps:

Change 578317 had a related patch set uploaded (by Bartosz Dziewoński; owner: DLynch):
[mediawiki/extensions/EventLogging@wmf/1.35.0-wmf.22] Make BackgroundQueue more aware of page unload flow

https://gerrit.wikimedia.org/r/578317

Great. @Ryasmeen, the above means we are technically [1] able to start testing the below as soon as this afternoon (PST):

  • 2. Timing TBD (depends on SWAT): Confirm whether the abandon-navigate events are firing in the console

Next steps

  • 2. As soon as 9-March, afternoon PST: Confirm whether the abandon-navigate events are firing in the console
  • 3. Timing TBD (depends on event firing QA): Confirm whether the abandon-navigate events are showing up in the DB.

  1. Emphasis on "technically" considering today is a staff holiday

Nobody was available to actually deploy the patches… I rescheduled it for tomorrow.

Change 578317 merged by jenkins-bot:
[mediawiki/extensions/EventLogging@wmf/1.35.0-wmf.22] Make BackgroundQueue more aware of page unload flow

https://gerrit.wikimedia.org/r/578317

Mentioned in SAL (#wikimedia-operations) [2020-03-10T11:40:35Z] <lucaswerkmeister-wmde@deploy1001> Synchronized php-1.35.0-wmf.22/extensions/EventLogging/: SWAT: [[gerrit:578317|Make BackgroundQueue more aware of page unload flow (T246382, T244874)]] (duration: 00m 58s)

matmarex removed a project: Patch-For-Review.

The fix is deployed now and ready for QA in production.

I am done checking all the event logging for this on both Beta and Production. Once Maya verifies her part, we can close it.

Checked the events fired specifically by @Ryasmeen, as well as events from other users, using the integration field. Confirmed that data is flowing to the data lake as expected, with the correct action and dependent fields, on all 4 target wikis.
Here is the link to the data-QA status
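
A data-QA pass of this kind could be sketched as below. This is only an illustration: the event keys (`integration`, `wiki`, `action`, `mechanism`), the default integration value, and the required-field mapping are assumptions and are not the real schema or the queries that were actually run.

```python
# Hypothetical mapping from an action to the fields that must accompany it.
REQUIRED_FIELDS = {
    "abort": ("mechanism",),
}


def find_problems(events, target_wikis, integration="discussiontools"):
    """Return (wikis with no events, events missing a dependent field).

    Only events whose integration field matches are considered.
    """
    relevant = [e for e in events if e.get("integration") == integration]

    seen_wikis = {e.get("wiki") for e in relevant}
    missing_wikis = sorted(set(target_wikis) - seen_wikis)

    incomplete = [
        e for e in relevant
        if any(field not in e for field in REQUIRED_FIELDS.get(e.get("action"), ()))
    ]
    return missing_wikis, incomplete
```

An empty result for both values would correspond to the "data is flowing as expected" outcome reported above. The wiki names passed in would be the actual target wikis, which are not listed in this task.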


!!