Page MenuHomePhabricator

Analysis of disambiguator-link-added tag
Closed, ResolvedPublic3 Estimated Story Points

Description

Now that the disambiguator-link-added tag is in production (T287549), we can do analysis to determine what kind of editors and user experience levels are most likely to add dablinks. This can help us decide what areas we need to focus on for further work on the disambiguation wish.

Acceptance criteria:

Among edits that add dablinks, get the percentages for:

  • VE vs wikitext
  • Mobile vs desktop
  • New user (< 50 edits, let's say) vs non-new
    • For easier analysis, in this case we'll consider logged out users as new users

Event Timeline

For all the data I gathered, I used some variation of the following queries (they could probably be written better, but these should work):

Counting total dablinks
SELECT COUNT(rc_id) FROM recentchanges
JOIN change_tag ON rc_id = ct_rc_id
WHERE rc_namespace = 0
  AND rc_bot = 0
  AND ct_tag_id = 612 /* disambiguator-link-added */
  /* Only query against a finite sample of edits */
  AND rc_id BETWEEN 1413974685 AND 1415351713
Counting dablinks and filter by another tag
SELECT COUNT(DISTINCT rc_id) FROM (
  SELECT rc_id FROM recentchanges
  JOIN change_tag ON rc_id = ct_rc_id
  WHERE rc_namespace = 0
  AND rc_bot = 0
  AND ct_tag_id = 612 /* disambiguator-link-added */
  AND rc_id BETWEEN 1413974685 AND 1415351713
) a
JOIN change_tag ON rc_id = ct_rc_id
WHERE ct_tag_id = (
  SELECT ctd_id FROM change_tag_def
  WHERE ctd_name = 'visualeditor'
)
Filtering by multiple tags
SELECT COUNT(DISTINCT rc_id) FROM (
  SELECT rc_id, rc_actor FROM (
    SELECT rc_id, rc_actor FROM recentchanges
    JOIN change_tag ON rc_id = ct_rc_id
    WHERE rc_namespace = 0
    AND rc_bot = 0
    AND ct_tag_id = 612 /* disambiguator-link-added */
    AND rc_id BETWEEN 1413974685 AND 1415351713
  ) a
  JOIN actor_recentchanges ON actor_id = rc_actor
  JOIN change_tag ON rc_id = ct_rc_id
  WHERE actor_user IS NULL
    AND ct_tag_id = (
      SELECT ctd_id FROM change_tag_def
      WHERE ctd_name = 'visualeditor'
    )
) b
JOIN change_tag ON ct_rc_id = rc_id
WHERE ct_tag_id = (
  SELECT ctd_id FROM change_tag_def
  WHERE ctd_name = 'mobile web edit'
)
Filtering by user experience
SELECT COUNT(DISTINCT rc_id) FROM (
  SELECT rc_id, rc_actor FROM recentchanges
  JOIN change_tag ON rc_id = ct_rc_id
  WHERE rc_namespace = 0
  AND rc_bot = 0
  AND ct_tag_id = 612 /* disambiguator-link-added */
  AND rc_id BETWEEN 1413974685 AND 1415351713
) a
JOIN actor_recentchanges ON actor_id = rc_actor
JOIN user ON actor_user = user_id AND actor_user IS NOT NULL
WHERE user_editcount < 1000

The following data is gathered from a sample size of 3,954 mainspace, non-bot edits on the English Wikipedia.

All users

  • Total edits: 3,954
  • Mobile edits (app and mobile web): 950 (24%)
  • Mobile web edits (VE and wikitext): 873 (22%)
  • Mobile app edits (wikitext): 77 (2%)
  • Desktop edits: 3004 (76%)
  • VisualEditor (desktop and mobile): 278 (7%)
  • VisualEditor (mobile): 99 (2%)
  • VisualEditor (desktop): 179 (5%)

Logged out users

  • Total edits: 1,104 (28% of total sample size)
  • Mobile edits (app and mobile web): 465 (42% of logged out edits)
  • Mobile web edits (VE and wikitext): 441 (40%)
  • Mobile app edits (wikitext): 24 (2%)
  • Desktop edits: 639 (58%)
  • VisualEditor (desktop and mobile): 70 (6%)
  • VisualEditor (mobile): 40 (4%)
  • VisualEditor (desktop): 30 (3%)

Total dablink edits by users of different experience levels

Total dablink edits for registered accounts: 2,846

  • Users with < 100 edit count: 412
  • < 500 edit count: 740
  • < 1000 edit count: 918
  • < 5000 edit count: 1,395

Conclusions

Analysis of patterns could be very informative - but I've seen DABlinks which had been added by admins and >10-year veterans, so it isn't a straightforward problem. WP:TESTLINK is more a pious hope than a guideline.