
Investigate what proportion of edits are done by new and existing editors
Open, Medium, Public

Description

Audiences leaders would like to better understand how much content and value different groups of editors create, so that we can decide how to balance our engineering efforts between audiences. One aspect of this is how much content and value new editors contribute directly, which would tell us how valuable new editors are in themselves, setting aside their potential to become established editors.

Event Timeline

@Neil_P._Quinn_WMF please add stakeholders and "why"/value for this task, per triage. :)

MBinder_WMF moved this task from Triage to Backlog on the Product-Analytics board. Jun 21 2018, 8:18 PM
Vvjjkkii renamed this task from Investigate what proportion of edits are done by new and existing editors to hvbaaaaaaa. Jul 1 2018, 1:06 AM
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
CommunityTechBot renamed this task from hvbaaaaaaa to Investigate what proportion of edits are done by new and existing editors. Jul 2 2018, 1:48 PM
CommunityTechBot lowered the priority of this task from High to Medium.
CommunityTechBot updated the task description.


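Maybe the following is helpful? I've been working on this query to get the proportion of surviving pages created by new and experienced editors. It counts page creations per wiki and month, split by whether the creator's account was more than 252 days old at the time of creation (creators with no matching account-creation record count as experienced) and by whether the page was later deleted. It is adapted from https://github.com/wikimedia-research/2018-19-Language-annual-plan-metrics/blob/master/Language-metrics.ipynb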

# Assumption: `hive` is a Hive query helper already imported in the notebook, and
# `query_vars` is a dict supplying MWH_SNAPSHOT, Y_START_DATE and india_wiki_dbs.
acc_r = hive.run("""
SELECT
    wiki_db,
    -- lowercase "yyyy" is the calendar year; "YYYY" is the week-based year and
    -- misassigns dates around New Year
    date_format(event_timestamp, "yyyy-MM") AS month,
    -- accounts older than 252 days, or with no creation record, count as experienced
    IF(coalesce(datediff(event_timestamp, ssac.dt) > 252, true), "experienced", "new") AS user_experience,
    IF(revision_is_deleted_by_page_deletion, "deleted", "survived") AS status,
    count(*) AS num_articles_created
FROM wmf.mediawiki_history mh
LEFT JOIN event_sanitized.serversideaccountcreation ssac
ON
    ssac.event.username = event_user_text AND
    -- partition predicate: matches every year partition
    ssac.year >= 0
WHERE
    mh.snapshot = "{MWH_SNAPSHOT}"
    AND mh.event_timestamp >= "{Y_START_DATE}"
    AND event_entity = "page"
    AND event_type = "create"
    AND wiki_db IN ({india_wiki_dbs})
GROUP BY
    wiki_db,
    date_format(event_timestamp, "yyyy-MM"),
    IF(coalesce(datediff(event_timestamp, ssac.dt) > 252, true), "experienced", "new"),
    IF(revision_is_deleted_by_page_deletion, "deleted", "survived")
""".format(**query_vars))
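
The query returns counts rather than proportions, so one more step is needed. Here is a minimal sketch of that step, assuming the result rows are available as dictionaries keyed by the query's output columns (for example via DataFrame.to_dict("records") if acc_r is a pandas DataFrame, which is an assumption). The helper name share_by_experience and the sample rows are made up for illustration and are not from the linked notebook.

```python
from collections import defaultdict


def share_by_experience(rows, status="survived"):
    """For each (wiki_db, month), return each experience group's share
    of the articles that have the given status."""
    counts = defaultdict(int)  # (wiki_db, month, user_experience) -> articles
    totals = defaultdict(int)  # (wiki_db, month) -> articles
    for row in rows:
        if row["status"] != status:
            continue
        n = row["num_articles_created"]
        counts[(row["wiki_db"], row["month"], row["user_experience"])] += n
        totals[(row["wiki_db"], row["month"])] += n
    # Skip wiki-months with no matching articles to avoid dividing by zero.
    return {
        key: n / totals[key[:2]]
        for key, n in counts.items()
        if totals[key[:2]] > 0
    }


# Made-up rows in the shape the query produces; not real data.
example_rows = [
    {"wiki_db": "hiwiki", "month": "2018-07", "user_experience": "new",
     "status": "survived", "num_articles_created": 30},
    {"wiki_db": "hiwiki", "month": "2018-07", "user_experience": "new",
     "status": "deleted", "num_articles_created": 10},
    {"wiki_db": "hiwiki", "month": "2018-07", "user_experience": "experienced",
     "status": "survived", "num_articles_created": 90},
    {"wiki_db": "hiwiki", "month": "2018-07", "user_experience": "experienced",
     "status": "deleted", "num_articles_created": 10},
]

shares = share_by_experience(example_rows)
print(shares[("hiwiki", "2018-07", "new")])  # 30 of the 120 surviving articles
```

Passing status="deleted" gives the same breakdown for deleted pages, which may be a useful comparison when judging how much of new editors' output sticks.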
Restricted Application edited projects, added Product-Analytics; removed Product-Analytics (Kanban). Oct 16 2019, 5:47 PM

We don't have any specific plans to work on this, although it would still be useful to research.