Page MenuHomePhabricator

Write query for Editing Team to access edit completion rate
Open, MediumPublic

Description

Overview

To detect regressions and know whether our team's efforts are having the impact they are intended to, we'd like to be able to independently access the latest edit completion rate, one of our team's core metrics.

"Done"

  • A query that outputs edit completion rate broken out by
    • operating system
    • device
    • browser
    • editing interface (VisualEditor, 2017 wikitext editor)
  • Product has the ability to run the "Query" above, independent of Product Analytics

Event Timeline

ppelberg created this task.May 17 2019, 2:00 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMay 17 2019, 2:00 AM
ppelberg updated the task description. (Show Details)May 17 2019, 2:12 AM
JTannerWMF moved this task from To Triage to Q4 on the VisualEditor board.May 21 2019, 3:06 PM
Nuria added a subscriber: Nuria.Jun 10 2019, 4:07 PM

FYI that i do not think "edit completion rate" exists as a query, @Neil_P._Quinn_WMF can probably be of help . Is this metric documented anywhere?

FYI that i do not think "edit completion rate" exists as a query, @Neil_P._Quinn_WMF can probably be of help . Is this metric documented anywhere?

I answered this directly in a meeting, but for the record, the metric isn't exhaustively documented anywhere, so the plan is for me to write the query. Might be good to do the documentation as part of this.

The only relevant work I can think of is the comparison of VE and Wikitext editing we did with these queries: https://github.com/wikimedia/analytics-limn-edit-data/tree/master/edit

The metrics were defined by James and implemented by me with oversight from James. We never did a proper assessment of what signal they carried or anything, but the data is still there:

https://analytics.wikimedia.org/datasets/periodic/reports/metrics/ (in the rates, sessions, failure_rates_by_type, and success_by_user_type folders, split out by wikitext vs. VE)

nshahquinn-wmf triaged this task as Medium priority.Jul 12 2019, 5:01 PM

Reassigning to @MNeisler to discuss with @ppelberg as they go through priorities

MNeisler moved this task from Next Up to Backlog on the Product-Analytics board.Sep 24 2019, 4:30 PM

I'm documenting some available example queries that can be used to calculate edit completion rates using eventlogging data.

For the queries below, edit completion rate is defined as the percent of edit attempt sessions that reach ready (identified by the 'ready' action) and complete an edit (identified by the 'saveSuccess' ) action.

Please note you will need to switch to the even_sanitized.editattemptstep table if you would like to look at data older than 90 days.

Edit completion rate by date, device, interface, browser and os.

Note: This groups by all suggested breakdowns. You can modify the select and group by portion of the query below to include only the relevant fields.

with attempts as (
    select
        to_date(min(dt)) as date,
        event.platform as device,
        event.editor_interface as interface,
        useragent.browser_family as browser,
        useragent.os_family as os,
        sum(cast(event.action = 'saveSuccess' as int)) >= 1 as completed
    from event.editattemptstep
    where
        event.action in ('ready', 'saveSuccess') and
    --Substitute with desired date range
        year >=2019
    group by event.editing_session_id, event.platform, event.editor_interface, useragent.browser_family,  useragent.os_family 
)
select
    date,
    device,
    interface,
    browser,
    os,
    sum(cast(completed as int)) / count(*) as edit_completion
from attempts
group by date, device, interface, browser, os

Edit completion rate by VE as Default bucket

with attempts as (
    select
        date_format(min(dt), "yyyy-MM-dd") as date,
        event.bucket as bucket,
        sum(cast(event.action = "saveSuccess" as int)) >= 1 as completed
    from event.editattemptstep
    where
        event.action in ("ready", "saveSuccess") and
        event.bucket in ("default-source", "default-visual") and
--Substitute with desited date range
        year >= 2019
    group by event.bucket, event.editing_session_id
)
select
    date,
    bucket,
    sum(cast(completed as int)) / count(*) as edit_completion_rate
from attempts
group by date, bucket

Mobile VE Edit Completion Rate on Wikipedias by Experience

with attempts as (
    select
        to_date(min(dt)) as date,
        sum(cast(event.action = 'saveSuccess' as int)) >= 1 as completed,
        min(event.user_id) = 0 as unregistered,
        min(event.user_editcount) between 1 and 4 as one_to_four_edits,
        min(event.user_editcount)  >= 5 AND min(event.user_editcount) < 100 as five_99_edits,
        min(event.user_editcount)  >= 100 AND min(event.user_editcount) < 1000 as onehunderd_999_edits,
        min(event.user_editcount)  >= 1000 as over_1000
    from event_sanitized.editattemptstep
    inner join canonical_data.wikis 
    on wiki = database_code and
    database_group ==  'wikipedia'
    where
        event.platform = 'phone' and
        event.editor_interface = 'visualeditor' and
        event.action in ('ready', 'saveSuccess') and
        year >=2019
    group by event.editing_session_id
)
select
    date,
    sum(cast(completed as int)) / count(*) as edit_completion,
    sum(cast(completed and unregistered as int)) / sum(cast(unregistered as int)) as unregistered,
    sum(cast(completed and one_to_four_edits as int)) / sum(cast(one_to_four_edits as int)) as one_to_four_edits,
    sum(cast(completed and five_99_edits as int)) / sum(cast(five_99_edits as int)) as five_99_edits,
    sum(cast(completed and onehunderd_999_edits as int)) / sum(cast(onehunderd_999_edits as int)) as onehunderd_999_edits,
    sum(cast(completed and over_1000 as int)) / sum(cast(over_1000 as int)) as over_1000_edits
from attempts
group by date