Instrument editing pipeline to be able to figure out which common editing features are used
Closed, Resolved, Public

Description

T202133 aims to improve our understanding of which common editing features are used on mobile and desktop, and how they differ. This data is presently not recorded in the editing data pipeline in Schema:Edit on Meta-Wiki, so computing these statistics will not be possible right now. The pipeline should be instrumented to record the necessary data.

From the description of T202133, the common tasks that need instrumentation are:

  • add or modify image
  • add or modify citation
  • add or modify links
  • add or remove text formatting, for example bold and italics

There are lots of other tasks that could also have instrumentation added, but this is the set we'll start with.

The changes to the schema should be checked with @Neil_P._Quinn_WMF to make sure that they will give him the data he needs to compute the statistics.

Details

Related Gerrit Patches:
mediawiki/extensions/VisualEditor : master : Update VE core submodule to master (2465e0e60)
VisualEditor/VisualEditor : master : Track feature-use activity for formatting
mediawiki/extensions/VisualEditor : master : Update VE core submodule to master (6038e6946)
VisualEditor/VisualEditor : master : Track feature-use activity for tables
mediawiki/extensions/VisualEditor : master : trackSubscriber: log activity events for VisualEditorFeatureUse

Event Timeline

Deskana triaged this task as High priority. Aug 17 2018, 4:20 PM
Deskana created this task.
Deskana moved this task from To Triage to Up next on the VisualEditor board.
Esanders added a subscriber: Esanders. Edited Aug 17 2018, 4:59 PM

It is likely that the implementation of this will hook into things like ve.ui.Action (where most toolbar/user commands get routed through) and the WindowManager (which opens all dialogs and inspectors). This means we will likely end up tracking more actions than just those listed here by default (which is fine). We would also track *every* invocation of that action by default, which could either be useful (e.g. how many references are added per session) or could result in too much data being logged.

If we can instrument all actions in one fell swoop, then we should do that. Instrumenting everything like that will definitely require per-session sampling though, which is a question that @Neil_P._Quinn_WMF can best answer. Neil, what kind of sampling rate should we use for that?

FYI the current VE sampling rate is 1/16 (6.25%)
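
To make the per-session sampling idea concrete, here is a minimal sketch rather than the actual VisualEditor implementation. It assumes the session ID is a hex string and that a session is in the sample when its leading 32 bits are divisible by the rate's denominator. The function name isSessionSampled and the example session IDs are hypothetical.

```javascript
// Hypothetical helper: decide once per session whether its events are logged.
// Assumes sessionId is a hex string (an assumption, not confirmed in this task).
function isSessionSampled( sessionId, denominator ) {
	// Treat the first 8 hex digits (32 bits) as a pseudo-random bucket number.
	var bucket = parseInt( sessionId.slice( 0, 8 ), 16 );
	// With a denominator of 16, roughly 1 in 16 sessions (6.25%) is kept.
	return bucket % denominator === 0;
}

// The answer depends only on the session ID, so a session is kept or dropped
// as a whole, however many actions it contains.
var sampled = isSessionSampled( '00000010deadbeef', 16 );
var notSampled = isSessionSampled( '00000011deadbeef', 16 );
```

Because the decision is made per session rather than per event, counts such as "references added per session" stay meaningful within the sampled sessions.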

Deskana moved this task from Up next to Current work on the VisualEditor board. Aug 22 2018, 5:25 PM
Deskana edited projects, added VisualEditor (Current work); removed VisualEditor.
Esanders added a comment. Edited Aug 23 2018, 11:05 AM

@Deskana: Perhaps we should have a meeting to brainstorm on this? There are probably other things to measure we may have missed (e.g. auto-save being invoked), and smart ideas about how to implement this.

Good plan! Arranged something for later today.

dchan added a subscriber: dchan. Aug 23 2018, 5:19 PM

I don't know much about this, but will/can we get the data broken down by wiki (and therefore effectively by language)? That could be pretty useful for script / IME support purposes

DLynch added a subscriber: DLynch. Aug 23 2018, 6:05 PM

Meeting outcome: we seemed happy with the idea that we'd log some specific window-opened events for everything non-annotation, as we're more concerned with interacting with features rather than success. (I.e. "add image", "modify image", "opened the image inspector and didn't modify it", and "opened the add image dialog and then canceled" would all be in the same "opened the image dialog" event.)
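
As a rough illustration of that outcome (a sketch with hypothetical handler names, not the VisualEditor code), logging only on window open means every path through the image dialog produces the same event:

```javascript
// Hypothetical in-memory log standing in for the real event pipeline.
var log = [];

// Log when a window opens; the feature name is taken from the window name.
function onWindowOpened( windowName ) {
	log.push( { feature: windowName, action: 'window-open' } );
}

// Deliberately logs nothing: the outcome of the dialog is not recorded.
function onWindowClosed( windowName, outcome ) {}

// Four different user outcomes, all passing through the same dialog.
[ 'added', 'modified', 'unchanged', 'canceled' ].forEach( function ( outcome ) {
	onWindowOpened( 'image' );
	onWindowClosed( 'image', outcome );
} );
// log now holds four identical { feature: 'image', action: 'window-open' } events.
```

The trade-off is that the data measures interaction with a feature, not successful use of it.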

I'm going to go investigate some technical details of how annotations are set and cleared, to see what events might be triggered. (e.g. does clearing an annotation go through AnnotationAction->clear or surfaceFragment->annotateContent? When toggling via keyboard shortcut vs toolbar button vs context remove-button?) Basically, do we need to add logging to more than just ve.ui.AnnotationAction?

JTannerWMF assigned this task to DLynch. Aug 23 2018, 6:17 PM

@dchan I think we do by default, but perhaps @Neil_P._Quinn_WMF can confirm?

Summary of what I observed: SurfaceFragment.annotateContent is the underpinning of most of this. Logging on it would catch use of formatting annotations and links, but not using formatting annotations without any text content -- that's done by changing the insertion annotations, not actually setting any annotation.

(If we wanted to learn things like how common it is to turn on formatting and then type, versus adding formatting to existing text, the already-existing distinction in AnnotationAction.toggle would make that really easy to log. I don't know if it matters, but it does intuitively seem like a thing which could change greatly between mobile and desktop.)

Clearing formatting via the toolbar command (or cmd-\) goes through AnnotationAction. It results in a flurry of annotateContent calls, because it individually clears each one.

Link actions use SurfaceFragment.annotateContent directly rather than going through AnnotationAction. The context item's remove-annotation button uses SurfaceFragment directly, unlike the other clear-formatting method. Setting or editing a link via the popup does an annotateContent clear and then set. Because of the live preview, opening the add-link dialog at all gets us a set, though opening it to edit an existing link doesn't.

If we log via SurfaceFragment.annotateContent, we should either filter out all the link-related actions in favor of logging attached to the inspector, or exempt link/language dialogs from the inspector logging.

That said, it's slightly more work for us in terms of places to add logging to, but I think we'd get cleaner results by logging the various AnnotationAction methods, and separately adding logging to opening the link dialog and the link context's remove-link button. Avoids the mixing up of link annotation actions with the rest of them, and doesn't overrepresent the user-action weight of things like clearing annotations from a heavily-annotated bit of text. (I.e. Neil probably has to do less data-wrangling after we've logged it.)
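
A small sketch of why the logging point matters. The classes below are simplified stand-ins invented for illustration (MockFragment and MockAnnotationAction are not the real ve.dm.SurfaceFragment or ve.ui.AnnotationAction), and the event field values are assumptions:

```javascript
var fragmentLog = []; // what logging inside annotateContent would record
var actionLog = []; // what logging inside the action method would record

// Stand-in for a surface fragment holding a list of annotation names.
function MockFragment( annotations ) {
	this.annotations = annotations.slice();
}
MockFragment.prototype.annotateContent = function ( method, name ) {
	fragmentLog.push( { method: method, name: name } );
	if ( method === 'clear' ) {
		this.annotations = this.annotations.filter( function ( a ) {
			return a !== name;
		} );
	} else {
		this.annotations.push( name );
	}
};

// Stand-in for the annotation action behind the clear-formatting command.
function MockAnnotationAction( fragment ) {
	this.fragment = fragment;
}
MockAnnotationAction.prototype.clearAll = function () {
	var fragment = this.fragment;
	// One event per user action, logged before the per-annotation work.
	actionLog.push( { feature: 'allAnnotations', action: 'clear-all' } );
	// Clears each annotation individually, as described above.
	fragment.annotations.slice().forEach( function ( name ) {
		fragment.annotateContent( 'clear', name );
	} );
};

var fragment = new MockFragment( [ 'bold', 'italic', 'underline' ] );
new MockAnnotationAction( fragment ).clearAll();
// actionLog has 1 entry; fragmentLog has 3, one per cleared annotation.
```

In this sketch a single user command yields one action-level event but three fragment-level events, which is the overrepresentation the comment above wants to avoid.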

I've worked on a few mobile bugs recently around the copy/paste menus. It would be good to have numbers on the usage of copy/paste on mobile, to help us make UX decisions.

Useful thing Ed suggested in meeting: auto-save recovery?

To add to @DLynch's comment, @Esanders proposed adding auto-save recovery to our logging in order to provide a metric that would get us close to understanding how many users are affected by mid-edit memory unloading due to tab or application switching on mobile (especially iOS). Though that is not the only reason we would see autosave recoveries on mobile, it may help us understand whether this is a significant use case and prioritize it appropriately.

From the mobile report:

Mobile OSes often unload pages from memory if you switch tab or application, especially iOS. This would result in the editor being reloaded when you return to that tab. Fortunately with auto-save this is somewhat less catastrophic, but there may be more work to remember other user state (e.g. if the user was in the middle of creating a citation)

> I don't know much about this, but will/can we get the data broken down by wiki (and therefore effectively by language)? That could be pretty useful for script / IME support purposes

For the record, yes, it will. EventLogging events are automatically bundled with the wiki, among other things.

> I've worked on a few mobile bugs recently around the copy/paste menus. It would be good to have numbers on the usage of copy/paste on mobile, to help us make UX decisions.

> Useful thing Ed suggested in meeting: auto-save recovery?

I don't see any problem with adding these (except potentially data volume, but we'd deal with that by dialing back the sampling rather than removing instrumentation). Auto-save recovery in particular is a bit different from the things we have in there currently, since it's not manually triggered by the user, but that's fine.

Auto-save could go in the other schema (for editor life cycle events)

It is somewhat conceptually akin to the bits in there already. It's sort of a variant on init or ready, after all.

Oh, thing to consider with this: we explicitly disable tracking for any session where we have to ask about restoring the autosave (i.e. any session where a new revision has been made since the autosave happened).

This is probably a small source of sessions which have an init and then nothing else, since I think the sequencing would work out that way.

Oh, interesting. Out of curiosity, why do we disable tracking? Would those sessions produce weird data?

It's because those sessions stick a confirmation dialog in the middle of the loading process, between init and ready, and pause the process until the user chooses what to do with their autosave. Since most of this stuff was added for us to do performance testing, introducing a fairly randomly large delay into the timing there would throw the numbers off.

(The exact method chosen for disabling it is also going to turn out to disable the activity metrics we're adding here. I could work around that if you want, or not if you don't think it's a big deal.)
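
To illustrate the side effect described here, a minimal sketch assuming tracking is switched off with a single session-wide flag. The createTracker helper and the topic names are hypothetical, not the real VE tracking code:

```javascript
// Hypothetical tracker with one on/off switch for the whole session.
function createTracker() {
	var enabled = true;
	var events = [];
	return {
		disable: function () {
			enabled = false;
		},
		track: function ( topic, data ) {
			// Once disabled, every topic is dropped, not just timing events.
			if ( enabled ) {
				events.push( { topic: topic, data: data } );
			}
		},
		getEvents: function () {
			return events.slice();
		}
	};
}

var tracker = createTracker();
tracker.track( 'init', {} );
// The autosave confirmation dialog appears here, so tracking is turned off.
tracker.disable();
tracker.track( 'ready', {} );
tracker.track( 'activity', { feature: 'link', action: 'window-open' } );
// Only the init event was recorded.
```

Under that assumption the session shows an init and nothing else, and the new activity events are lost along with the timing events the switch was meant to suppress.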

What's the progress on this task? It appears to have sat here for quite a while, but I know from talking to people that the implementation is in progress.

@Deskana: Sorry, we kinda answered that on the call yesterday, but putting it here for documentation and expansion... there's a patch for the actual instrumentation, and another for adding the schema to the events extension so we can use it. Once both of those are there, one last patch will actually hook up the logging in the MW-VE extension itself. (It can't go in with the WikimediaEvents patch, since it needs access to the session ID which the VE extension generates and doesn't expose...)

Change 466695 had a related patch set uploaded (by DLynch; owner: DLynch):
[mediawiki/extensions/VisualEditor@master] trackSubscriber: log activity events for VisualEditorFeatureUse

https://gerrit.wikimedia.org/r/466695

Change 466695 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] trackSubscriber: log activity events for VisualEditorFeatureUse

https://gerrit.wikimedia.org/r/466695

Neil_P._Quinn_WMF closed this task as Resolved. Nov 1 2018, 2:36 PM

This data started flowing in for real on October 26, and so far everything looks good. See P7750 for some quick validation queries I ran.

Change 490661 had a related patch set uploaded (by DLynch; owner: DLynch):
[VisualEditor/VisualEditor@master] Track feature-use activity for tables

https://gerrit.wikimedia.org/r/490661

@DLynch, if you look at the full list of logged features we already have a table, window-open event being logged. Will this be different from that?

Oh, your commit message says tables are "the only part of the "insert" menu which wasn't being tracked by the dialog-tracking (apart from opening the properties on an existing table.)" That makes sense.

Thank you, this is a very nice addition!

Change 490661 merged by jenkins-bot:
[VisualEditor/VisualEditor@master] Track feature-use activity for tables

https://gerrit.wikimedia.org/r/490661

Change 491389 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (6038e6946)

https://gerrit.wikimedia.org/r/491389

Change 491389 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (6038e6946)

https://gerrit.wikimedia.org/r/491389

Change 491807 had a related patch set uploaded (by DLynch; owner: DLynch):
[VisualEditor/VisualEditor@master] Track feature-use activity for formatting

https://gerrit.wikimedia.org/r/491807

Change 491807 merged by jenkins-bot:
[VisualEditor/VisualEditor@master] Track feature-use activity for formatting

https://gerrit.wikimedia.org/r/491807

Change 492195 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (2465e0e60)

https://gerrit.wikimedia.org/r/492195

Change 492195 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (2465e0e60)

https://gerrit.wikimedia.org/r/492195