Page MenuHomePhabricator

Write query for Editing Team to access edit completion rate
Open, MediumPublic

Description

Overview

To detect regressions and know whether our team's efforts are having the impact they are intended to, we'd like to be able to independently access the latest edit completion rate, one of our team's core metrics.

Use Cases

  • Detecting regressions in peoples' likelihood of completing the edits they start
    • T209995 (a drop-off in the percentage of edit sessions that reach ready) is an example of the kind of issue we anticipate being able to detect more quickly by having access to this data.

"Done"

  • A query that outputs edit completion rate broken out by
    • operating system
    • device
    • browser
    • editing interface (VisualEditor, 2017 wikitext editor)
  • Product has the ability to run the "Query" above, independent of Product Analytics

Event Timeline

FYI that i do not think "edit completion rate" exists as a query, @Neil_P._Quinn_WMF can probably be of help . Is this metric documented anywhere?

FYI that i do not think "edit completion rate" exists as a query, @Neil_P._Quinn_WMF can probably be of help . Is this metric documented anywhere?

I answered this directly in a meeting, but for the record, the metric isn't exhaustively documented anywhere, so the plan is for me to write the query. Might be good to do the documentation as part of this.

The only relevant work I can think of is the comparison of VE and Wikitext editing we did with these queries: https://github.com/wikimedia/analytics-limn-edit-data/tree/master/edit

The metrics were defined by James and implemented by me with oversight from James. We never did a proper assessment of what signal they carried or anything, but the data is still there:

https://analytics.wikimedia.org/datasets/periodic/reports/metrics/ (in the rates, sessions, failure_rates_by_type, and success_by_user_type folders, split out by wikitext vs. VE)

Reassigning to @MNeisler to discuss with @ppelberg as they go through priorities

I'm documenting some available example queries that can be used to calculate edit completion rates using eventlogging data.

For the queries below, edit completion rate is defined as the percent of edit attempt sessions that reach ready (identified by the 'ready' action) and complete an edit (identified by the 'saveSuccess' ) action.

Please note you will need to switch to the even_sanitized.editattemptstep table if you would like to look at data older than 90 days.

Edit completion rate by date, device, interface, browser and os.

Note: This groups by all suggested breakdowns. You can modify the select and group by portion of the query below to include only the relevant fields.

with attempts as (
    select
        to_date(min(dt)) as date,
        event.platform as device,
        event.editor_interface as interface,
        useragent.browser_family as browser,
        useragent.os_family as os,
        sum(cast(event.action = 'saveSuccess' as int)) >= 1 as completed
    from event.editattemptstep
    where
        event.action in ('ready', 'saveSuccess') and
    --Substitute with desired date range
        year >=2019
    group by event.editing_session_id, event.platform, event.editor_interface, useragent.browser_family,  useragent.os_family 
)
select
    date,
    device,
    interface,
    browser,
    os,
    sum(cast(completed as int)) / count(*) as edit_completion
from attempts
group by date, device, interface, browser, os

Edit completion rate by VE as Default bucket

with attempts as (
    select
        date_format(min(dt), "yyyy-MM-dd") as date,
        event.bucket as bucket,
        sum(cast(event.action = "saveSuccess" as int)) >= 1 as completed
    from event.editattemptstep
    where
        event.action in ("ready", "saveSuccess") and
        event.bucket in ("default-source", "default-visual") and
--Substitute with desited date range
        year >= 2019
    group by event.bucket, event.editing_session_id
)
select
    date,
    bucket,
    sum(cast(completed as int)) / count(*) as edit_completion_rate
from attempts
group by date, bucket

Mobile VE Edit Completion Rate on Wikipedias by Experience

with attempts as (
    select
        to_date(min(dt)) as date,
        sum(cast(event.action = 'saveSuccess' as int)) >= 1 as completed,
        min(event.user_id) = 0 as unregistered,
        min(event.user_editcount) between 1 and 4 as one_to_four_edits,
        min(event.user_editcount)  >= 5 AND min(event.user_editcount) < 100 as five_99_edits,
        min(event.user_editcount)  >= 100 AND min(event.user_editcount) < 1000 as onehunderd_999_edits,
        min(event.user_editcount)  >= 1000 as over_1000
    from event_sanitized.editattemptstep
    inner join canonical_data.wikis 
    on wiki = database_code and
    database_group ==  'wikipedia'
    where
        event.platform = 'phone' and
        event.editor_interface = 'visualeditor' and
        event.action in ('ready', 'saveSuccess') and
        year >=2019
    group by event.editing_session_id
)
select
    date,
    sum(cast(completed as int)) / count(*) as edit_completion,
    sum(cast(completed and unregistered as int)) / sum(cast(unregistered as int)) as unregistered,
    sum(cast(completed and one_to_four_edits as int)) / sum(cast(one_to_four_edits as int)) as one_to_four_edits,
    sum(cast(completed and five_99_edits as int)) / sum(cast(five_99_edits as int)) as five_99_edits,
    sum(cast(completed and onehunderd_999_edits as int)) / sum(cast(onehunderd_999_edits as int)) as onehunderd_999_edits,
    sum(cast(completed and over_1000 as int)) / sum(cast(over_1000 as int)) as over_1000_edits
from attempts
group by date

I will look into making this data accessible in Superset using Presto and aim to complete in Q2.

Based on current team priorities, I am aiming to complete this in Q3.

Based on current team priorities, I am aiming to complete this in Q3.

Today, @MNeisler and I came to agree this ticket is not a priority for this current quarter (Q4: April through June 2021).

We will consider this ticket again in the context of creating a set of dashboards to help the Editing Team monitor the performance of the various editing interfaces the team maintains.


Next step(s)

  • @ppelberg to create a ticket for the below:

...a set of dashboards to help the Editing Team monitor the performance of the various editing interfaces the team maintains.

Removing task assignee due to inactivity, as this open task has been assigned for more than two years. See the email sent to the task assignee on February 06th 2022 (and T295729).

Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.

If this task has been resolved in the meantime, or should not be worked on ("declined"), please update its task status via "Add Action… 🡒 Change Status".

Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator.