
Replies v2.0: determine what additional instrumentation is needed
Closed, ResolvedPublic

Description

In T243364, we implemented the instrumentation needed to understand how people are engaging with the new replying workflow.

This task is about defining what additional actions/features will be introduced in v1.1 and v2.0 [1] that will require instrumentation.

Methodology

What we instrument is dependent upon the questions we are trying to answer.

In the context of V2.0 of the Replying feature, below (see: "Instrumentation needs") are the questions we are depending on engagement metrics to help us answer.

Instrumentation needs

| Question | Ticket |
| --- | --- |
| What – if any – changes should be made to the workflow before we begin an A/B test of the feature | T247139 (see questions #1-#4) |
| What affects the likelihood someone gets a response to a comment they post/a conversation they start? | T249762 |
| What text input mode (source or visual) should be shown to people who do not already have a preference set? | T247139 (see question #5) |

Open questions

  • What event will be fired when someone types @ in the Reply tool's visual mode?
    • When someone types @, the following event will be fired: mwUsernameCompletion:window-open-from-sequence.
    • No additional work is needed for us to track mwUsernameCompletion:window-open-from-sequence events; it has already been implemented.
  • When Product Analytics queries for all of the events emitted when integration = discussiontools, what events should they currently expect to see? What are the actions that will have caused these events to be fired?

Done


  1. T235593

Event Timeline

ppelberg created this task. Feb 6 2020, 4:21 PM

Tentative steps decided in 1:1 with @ppelberg today -
V1.1 ---> V2.0 instrumentation: document features needing instrumentation as well as process for instrumenting those on T244498 for Maya to review; once reviewed, we will incorporate into Instrumentation spec
Step 1: document features, ordered by priority of release + metrics associated with each one
Step 2: ask Maya to review features + metrics to determine what – if any – additional instrumentation is needed
Step 3: create tickets for new instrumentation
Step 4: document new instrumentation needs in instrumentation spec

Mayakp.wiki moved this task from Triage to Current Quarter on the Product-Analytics board.
ppelberg claimed this task. Feb 19 2020, 8:14 PM
ppelberg updated the task description. Edited Feb 26 2020, 5:11 PM

Tentative steps decided in 1:1 with @ppelberg today -
V1.1 ---> V2.0 instrumentation: document features needing instrumentation as well as process for instrumenting those on T244498 for Maya to review; once reviewed, we will incorporate into Instrumentation spec
Step 1: document features, ordered by priority of release + metrics associated with each one

I've updated the task description to include a list of the features we are planning to implement and instrument in v2.0 (T235924).

Next steps

  • "Step 2: @Mayakp.wiki to review features + metrics to determine, what – if any – additional instrumentation is needed"

This is waiting on @ppelberg

This ticket depends on us first knowing what questions we are seeking to answer with the data we are seeking to collect.

The work of defining the "questions we are seeking to answer" is happening in this ticket: T247139.

ppelberg updated the task description. Apr 8 2020, 7:38 PM

Next step
Next steps that came up in the conversation Maya and I had today:

  • Can EditAttemptStep be used to measure switching events between source and visual text input modes?

6-May meeting notes
@DLynch, @Mayakp.wiki, @MNeisler and I discussed the "approaches" listed below.

Next steps

  • Step 1: @Mayakp.wiki to review approaches and express an opinion about which she thinks is best
  • Step 2: we decide which approach to go with, balancing the needs of product, analytics and engineering.

Approaches

  • Approach #1: add new field to EditAttemptStep with updates to other tools
    • Considerations
      • Would require updating all tools that feed data into EditAttemptStep
      • Could introduce more complexity and therefore higher likelihood for mistakes/bugs stemming from needing to implement the same functionality differently across different tools.
    • Effort
      • The most time-consuming part of this would be finding the different tools and making adjustments to them.
  • Approach #1a: Put switching event in EditAttemptStep without updating other tools
    • Considerations
      • Tech debt: if we don't update other tools, schema is harder to understand b/c it becomes more conditional. E.g. some tools will log this action here, other tools will log this action elsewhere.
      • People in the future will have a harder time using schema
      • Number of changes is limited to DiscussionTools which lowers complexity and likelihood for mistakes/bugs
    • Effort
      • Least effort; most cost to our future selves
  • Approach #2: Put switching event in VEFeatureUse (proof of concept implemented in T247139#6114691)
    • Considerations
      • This is a more robust approach; would enable us to more easily add instrumentation in the future (e.g. use of tools within visual mode)
    • Effort
      • Similar to the effort for Approach #1.
        • Adding logging to diff. schema that isn't hooked up
        • Not a lot being added; repurposing existing events/code
  • Approach #3: create new schema for DiscussionToolsFeatureUse
    • Considerations
      • Deciding between 2 and 3: analysis becomes cleaner (would require one fewer join)
    • Effort
      • New schema
      • Exact copy of VEFeatureUse with a different name

6-May meeting notes
Next steps

  • Step 1: @Mayakp.wiki to review approaches and express an opinion about which she thinks is best

See below.

  • Step 2: we decide which approach to go with, balancing the needs of product, analytics and engineering.

Still #todo.


@Mayakp.wiki expressed a preference for "Approach #2: Put switching event in VEFeatureUse (proof of concept implemented in T247139#6114691)" with the following caveat:

  • Add some essential fields from EditAttemptStep (EAS) into VEFeatureUse (VEFU), where "essential fields" include the following (prioritized per Maya):
    • Integration ↑↑↑↑
    • Editor_interface ↑↑↑↑
    • Platform ↑↑↑↑
    • User_id ↑↑↑↑
    • User_editcount ↑↑
    • Bucket ↑↑

Doing the above would make analysis easier as it would relieve Product Analytics from having to join VEFeatureUse with EditAttemptStep.
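To illustrate the point about the join, here is a minimal Python sketch. It is not the query Product Analytics runs; the sample rows, the `editing_session_id` join key and the lowercase field names are assumptions made for illustration only.

```python
# Hypothetical sample rows; keys and values are assumptions for illustration.
EDIT_ATTEMPT_STEP = [
    {"editing_session_id": "s1", "integration": "discussiontools"},
    {"editing_session_id": "s2", "integration": "page"},
]

# VEFeatureUse rows as they look today: no integration field.
FEATURE_USE = [
    {"editing_session_id": "s1", "feature": "mwUsernameCompletion",
     "action": "window-open-from-sequence"},
    {"editing_session_id": "s2", "feature": "link",
     "action": "window-open-from-trigger"},
]

# The same rows if integration were carried on each VEFeatureUse event.
FEATURE_USE_WITH_FIELDS = [
    dict(FEATURE_USE[0], integration="discussiontools"),
    dict(FEATURE_USE[1], integration="page"),
]


def with_join(feature_use, edit_attempt_step):
    """Filter feature events by integration looked up in EditAttemptStep."""
    integration_by_session = {
        row["editing_session_id"]: row["integration"] for row in edit_attempt_step
    }
    return [
        row for row in feature_use
        if integration_by_session.get(row["editing_session_id"]) == "discussiontools"
    ]


def without_join(feature_use):
    """Filter feature events using the integration field on the row itself."""
    return [row for row in feature_use if row.get("integration") == "discussiontools"]


joined = with_join(FEATURE_USE, EDIT_ATTEMPT_STEP)
direct = without_join(FEATURE_USE_WITH_FIELDS)
```

Both paths select the same events here; the second needs only the one table.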

Notes:

  • If we do what is described above (add new fields to VEFeatureUse), we'd need [1] to write patches to make it so every other tool that uses VEFeatureUse is updated to include these additional fields as well. Where "every other tool..." means:
    • MobileFrontEnd
    • VisualEditor
    • WikiEditor
    • DiscussionTools

  1. Product Analytics shared that not doing this would be bad practice; tables should not have partial data.
  • Step 2: we decide which approach to go with, balancing the needs of product, analytics and engineering.

During the conversation @DLynch, @Mayakp.wiki, @MNeisler and I had yesterday, we decided DiscussionTools-related events will live in the VisualEditorFeatureUse schema and defined the series of subsequent changes that are needed to make this happen.

These changes and their associated tickets are described below. Also below: the rationale for how we arrived at this decision.

Decision

Put the Reply tool's switching event, and future #discussiontools-related events, in the VisualEditorFeatureUse schema [1][2] with the addition of the fields listed in "Implementation details" below.

  • This approach is an extension of Approach #2 described in T244498#6127236.

Rationale

We came to see putting the Reply tool's switching event, and future #discussiontools-related events, in the VisualEditorFeatureUse schema as being the approach that most closely satisfied the following requirements:

  • The time and effort required to do analysis should be minimized → All events should live in one schema
  • Coordination across teams to implement this change should be minimized → Now is not the right time to redesign EditAttemptStep (T118063)
  • We need to be able to extend whatever schema we decide on to include new events/tools as part of the Talk pages project.
  • It is not acceptable for a table to be populated with partial data

Implementation details

Step 1

Step 2

  • Write patches to replicate T252924 changes in the following interfaces: T252925
    • Wikieditor
    • Visual editor
    • MobileFrontend

Step 3

  • Pre-deployment QA (client): T252926
    • Ensure newly instrumented events are firing correctly by checking feature on beta and viewing events in the browser's console.
    • Ensure events are being logged correctly by checking events in log file

Step 4

  • Post-deployment QA (client): T252927
    • Ensure newly instrumented events are firing correctly by checking feature on production and viewing events in the browser's console.
  • Post-deployment QA (DB): T252930
    • Ensure newly instrumented events are being logged correctly by checking events in database.

  1. https://www.mediawiki.org/wiki/VisualEditor/FeatureUse_data_dictionary
  2. https://meta.wikimedia.org/wiki/Schema:VisualEditorFeatureUse

Hi @Nuria, we plan to add the following new fields to the VisualEditorFeatureUse schema.

  • Integration
  • Editor_interface
  • Platform
  • User_id
  • User_editcount
  • Bucket

Once the instrumentation is deployed, will this be handled automatically or are there any manual steps (by Analytics or other team) that will be required to make this happen?

ppelberg updated the task description. May 21 2020, 1:35 AM

Meeting notes
Notes from today's meeting with @Mayakp.wiki and @MNeisler:

Next steps

Nuria added a comment. May 22 2020, 4:58 AM

@Mayakp.wiki, the addition of fields is handled automatically.

ppelberg added a comment. Edited May 22 2020, 9:02 PM

Consequences of using VEFeatureUse
Documenting a conversation David L. and I had over chat and the questions that resulted from it...

CONFIRMED

  • By using Schema:VisualEditorFeatureUse for the Reply tool, we will be able to track usage of tools like @-mentioning (e.g. window-open-from-trigger) without needing to add any additional events. [i]

OPEN QUESTIONS
@DLynch: a couple additional questions we talked about in chat, posting them here so we have the answers in context.

  • 1. How – if at all – would we currently distinguish between the wikilink and user link (@-mention) sequences? I'm assuming both of the below will produce link:window-open-from-trigger events. [ii]
    • Links: [ + [
    • @ mention: @
  • 2. Below is a list of other actions we may be curious to know about and why. Can you review and comment on what – if anything – in the below needs adjusting? "B)" in the table below is pretty much the same as question "1." above.

| Decision | Question | Associated events |
| --- | --- | --- |
| A) Which tools should be shown in the visual mode's toolbar? How should they be arranged? | Which of the visual mode's tools are people using? | E.g. bolding textStyle/bold; italicizing textStyle/italic |
| B) Should an @ icon be added to the visual mode's toolbar? | What percentage of comments posted using the reply tool involve people invoking the @-mention tool? | E.g. ???:window-open-from-trigger |
| C) Should we prioritize adding support for extensions within indented comments (T251633)? | What percentage of comments posted using the reply tool involve people using syntaxhighlighting? | E.g. syntaxhighlightDialog:window-open-from-context |

i. Note for posterity: we would get the same advantages had we taken the "make a new schema DiscussionToolsFeatureUse that's just a clone of VEFU" approach.
ii. I see there is an event for flowMention in the data dictionary, although I assume said event would not be fired for @ mentions considering the tool is linking to user pages rather than a specialized template.

Once the instrumentation is deployed, will this be handled automatically or are there any manual steps (by Analytics or other team) that will be required to make this happen?

As far as I know, it's all kinda magic -- once I update the Schema:VisualEditorFeatureUse wiki-page and bump the version number that the events are being sent to, it creates the table automatically when events are sent.

  1. How – if at all – would we currently distinguish between the wikilink and user link (@-mention) sequences? I'm assuming both of the below will produce link:window-open-from-trigger events. [ii]

Depends when you mean. The @-mention dropdown will use the mwUsernameCompletion feature (and window-open-from-sequence) rather than link -- once the link is actually created, we haven't done anything with T252083 yet, so that'd still be link.

Change 598542 had a related patch set uploaded (by DLynch; owner: DLynch):
[mediawiki/extensions/WikimediaEvents@master] VisualEditorFeatureUse: bump schema version

https://gerrit.wikimedia.org/r/598542

Nuria added a comment. May 26 2020, 4:13 AM

it's all kinda magic -- once I update the Schema:VisualEditorFeatureUse wiki-page and bump the version number that the events are being sent to, it creates the table automatically when events are sent.

To be super clear, a table is created just once, when the first event flows in; subsequent changes to the schema (that are backwards compatible) only add columns to the table. Non-backwards-compatible changes are not supported. In this case we are just adding three fields, so the table will get three more 'columns'.
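As a rough mental model of the behaviour described above, the following Python sketch treats a table as an ordered list of column names. It is a toy illustration, not the EventLogging ingestion code; the function name `evolve_columns` and the field names are hypothetical.

```python
def evolve_columns(existing_columns, schema_fields):
    """Return the table's columns after a schema revision.

    Existing columns keep their position and any field not seen before is
    appended. A revision that drops an existing column is rejected, which
    stands in for "non-backwards-compatible changes are not supported".
    """
    removed = [c for c in existing_columns if c not in schema_fields]
    if removed:
        raise ValueError(f"not backwards compatible, fields removed: {removed}")
    added = [f for f in schema_fields if f not in existing_columns]
    return list(existing_columns) + added


# First event: the table is created with the fields of the current schema.
columns = evolve_columns([], ["feature", "action"])
# Later revision: new fields only add columns; nothing is rebuilt.
columns = evolve_columns(columns, ["feature", "action", "integration", "platform"])
```

Rows written before the revision would simply have no value in the added columns.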

Nuria added a comment. May 26 2020, 4:47 PM

@DLynch : I think this is the case but, again, to be super clear: to keep things backwards compatible the new properties should not be required, otherwise all events coming with the old schema version will fail validation. They should be optional.

@DLynch : I think this is the case but, again, to be super clear: to keep things backwards compatible the new properties should not be required, otherwise all events coming with the old schema version will fail validation. They should be optional.

My understanding was that we wanted to update everything in one quick burst of patches, so everything is sending those properties. If you think it's better to make them optional first, and later shift it to required, I can do that as well.

Nuria added a comment. May 26 2020, 4:53 PM

My understanding was that we wanted to update everything in one quick burst of patches, so everything is sending those properties

If these events are client side, the clients that have cached the old JavaScript (and there will be quite a few) will still send old events. Schemas (to be persisted) need to be thought of as always evolving with backwards-compatible changes; thus an added field needs to be optional.

Please see: https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Schema_Guidelines#Guidelines

Okay, schema-change updated to not have anything new that's required.
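The difference between the two options discussed above can be sketched in a few lines of Python. This is a simplified stand-in for schema validation, not the real validator; which fields are required in the actual schema is an assumption here, and `validate` is a hypothetical helper.

```python
def validate(event, required, optional):
    """Accept an event that has every required field and no unknown fields."""
    fields = set(event)
    return required <= fields and fields <= (required | optional)


# Assumed for illustration: the pre-existing required fields and two of the
# new fields proposed in this task.
BASE_REQUIRED = {"feature", "action"}
NEW_FIELDS = {"integration", "platform"}

# An event from a client still running cached, older JavaScript.
old_event = {"feature": "link", "action": "window-open-from-trigger"}
# An event from an updated client.
new_event = dict(old_event, integration="discussiontools", platform="desktop")

# Option 1: new fields are required, so events from old clients are rejected.
old_ok_if_required = validate(old_event, BASE_REQUIRED | NEW_FIELDS, set())

# Option 2: new fields are optional, so both old and new events validate.
old_ok_if_optional = validate(old_event, BASE_REQUIRED, NEW_FIELDS)
new_ok_if_optional = validate(new_event, BASE_REQUIRED, NEW_FIELDS)
```

Only the second option keeps events from cached clients flowing while the patches roll out.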

  1. How – if at all – would we currently distinguish between the wikilink and user link (@-mention) sequences? I'm assuming both of the below will produce link:window-open-from-trigger events. [ii]

Depends when you mean. The @-mention dropdown will use the mwUsernameCompletion feature (and window-open-from-sequence) rather than link -- once the link is actually created, we haven't done anything with T252083 yet, so that'd still be link.

Got it. To be doubly sure I'm understanding, @DLynch: can you correct any inaccuracies and/or add any information that's missing to the below?

  1. When someone types @, the following event will be fired: mwUsernameCompletion:window-open-from-sequence.
  2. No additional work is needed for us to track mwUsernameCompletion:window-open-from-sequence events.
ppelberg updated the task description. May 27 2020, 8:50 PM

Change 598542 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] VisualEditorFeatureUse: bump schema version

https://gerrit.wikimedia.org/r/598542

Got it. To be doubly sure I'm understanding, @DLynch: can you correct any inaccuracies and/or add any information that's missing to the below?

  1. When someone types @, the following event will be fired: mwUsernameCompletion:window-open-from-sequence.
  2. No additional work is needed for us to track mwUsernameCompletion:window-open-from-sequence events.

The above is correct per conversation with @DLynch.
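For reference, a hedged Python sketch of how the confirmed event could feed the @-mention question raised earlier. It counts editing sessions rather than posted comments, which is only a proxy; the `editing_session_id`, `feature` and `action` keys and the sample rows are assumptions for illustration.

```python
MENTION_EVENT = ("mwUsernameCompletion", "window-open-from-sequence")


def mention_share(events):
    """Fraction of editing sessions with at least one @-mention dropdown open."""
    sessions = set()
    sessions_with_mention = set()
    for event in events:
        session = event["editing_session_id"]
        sessions.add(session)
        if (event["feature"], event["action"]) == MENTION_EVENT:
            sessions_with_mention.add(session)
    if not sessions:
        return 0.0
    return len(sessions_with_mention) / len(sessions)


# Hypothetical sample: one session opens the @-mention dropdown, one does not.
sample_events = [
    {"editing_session_id": "s1", "feature": "mwUsernameCompletion",
     "action": "window-open-from-sequence"},
    {"editing_session_id": "s1", "feature": "link",
     "action": "window-open-from-trigger"},
    {"editing_session_id": "s2", "feature": "link",
     "action": "window-open-from-trigger"},
]
share = mention_share(sample_events)
```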

The above is now reflected in the task description.

ppelberg updated the task description. Jun 2 2020, 9:17 PM
ppelberg reassigned this task from ppelberg to Mayakp.wiki. Jun 10 2020, 7:16 PM

@ppelberg to ensure this is closed out when talking to @Mayakp.wiki today

Assigning this task over to @Mayakp.wiki to confirm the below with @DLynch.

  • When Product Analytics queries for all of the events emitted when integration = discussiontools, what events should they currently expect to see? What are the actions that will have caused these events to be fired?

Confirmed with @DLynch. All events and the data we expect to see in the VisualEditorFeatureUse schema have been updated in the Instrumentation Spec.
We will update the data dictionary only after we complete data QA of the fields in production.

Mayakp.wiki updated the task description. Jun 12 2020, 11:10 PM
ppelberg closed this task as Resolved. Jun 15 2020, 10:21 PM

Confirmed with @DLynch. All events and the data we expect to see in the VisualEditorFeatureUse schema have been updated in the Instrumentation Spec.
We will update the data dictionary only after we complete data QA of the fields in production.

Excellent. I am resolving this task considering the work to update the data dictionary will happen after T252930 AND the work to update the data dictionary is already represented in T254291.