Measure % of edits coming from users without JS
Closed, Resolved · Public

Description

We need to find out how many people are editing with no JS support (either in a browser that doesn't have JS support, or a regular browser with JS turned off). See the parent task for more info.

Acceptance criteria:

  • snapshot % of user edits done with no JS support
  • snapshot % of anon edits done with no JS support
  • snapshot % of all edits done with no JS support

We just need some ballpark numbers here, sampling is fine.

Event Timeline

kaldari created this task. Dec 13 2019, 6:30 PM
DLynch added a subscriber: DLynch. Dec 14 2019, 1:04 AM

Define “edits done” — do you want all visits to the edit page, or can we restrict it to successfully completed edits? If the latter, I think we could use analysis of current EditAttemptStep data to compare sessions that are just made of ‘init’ and ‘saveSuccess’ to ones that also include ‘ready’/‘firstChange’.

If the former it’s more complicated because we’d have to do more to filter out the potential effects of background loading / crawlers.

mpopov added a subscriber: mpopov. Dec 16 2019, 6:15 PM

Howdy! o/ Product Analytics team will review and prioritize the request during our next board review meeting (today, 2019-12-16).

Define “edits done” — do you want all visits to the edit page, or can we restrict it to successfully completed edits? If the latter, I think we could use analysis of current EditAttemptStep data to compare sessions that are just made of ‘init’ and ‘saveSuccess’ to ones that also include ‘ready’/‘firstChange’.

If the former it’s more complicated because we’d have to do more to filter out the potential effects of background loading / crawlers.

Server-side PHP sends EditAttemptStep events, right? And those are limited to action='init' & action='saveSuccess'?

We may also want to repeat File:Analysis of Wikipedia Portal Traffic and JavaScript Support.pdf, but for other page(s)?

"Edits done" means edits completed.

We may also want to repeat File:Analysis of Wikipedia Portal Traffic and JavaScript Support.pdf, but for other page(s)?

@mpopov - Yes, that would be super helpful for the other part of parent task T234695!

LGoto triaged this task as Medium priority. Dec 16 2019, 7:11 PM
LGoto moved this task from Triage to Backlog on the Product-Analytics board.

@kaldari @JKatzWMF moving to backlog right now because we don't have bandwidth and this looks like it will be an extensive task. Please let me know if it's high priority enough to bump currently planned work - perhaps we can touch base in a meeting?

@kzimmerman - Thanks for looking into it.

@DLynch - Is there any chance this could be done by the engineers on the Editing team without the help of Product Analytics (say in January), or would that be difficult?

DLynch added a comment. Edited Dec 16 2019, 10:24 PM

Server-side PHP sends EditAttemptStep events, right? And those are limited to action='init' & action='saveSuccess'?

@mpopov Yeah, that's my logic. It does also do saveAttempt and saveFailure server-side. Perhaps checking for saveAttempt would be most inclusive, since that'll always fire.

Is there any chance this could be done by the engineers on the Editing team without the help of Product Analytics (say in January), or would that be difficult?

@kaldari I mean, assuming that I'm not missing anything here, I think it's something we are at least capable of doing -- I couldn't commit without running it by @ppelberg for the timeline. There'd be a bit of "we don't normally do this, quickly educating ourselves on the tools" delay involved. Since I've not done this analysis step before I can't really say how much of a delay that'll be.

To summarize, I think the analysis which makes sense is to look at EditAttemptStep in a chosen time period, filter for editor_interface === 'wikitext', then filter with user_id === 0 and user_id !== 0 for anon / loggedin statistics:

  • No-JS: sessions containing init and saveSuccess and not ready
  • JS: sessions containing init, ready, and saveSuccess (this will include all VisualEditor sessions inherently)

This wouldn't require adding any further instrumentation or waiting for data to be gathered.
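As a rough illustration of the classification described above, here is a minimal Python sketch. It assumes the events are already available as plain dicts; the `editing_session_id` key and the `classify_sessions` helper are assumptions made for this example, while `action`, `editor_interface` and `user_id` are the fields named in the comment. It is not the query anyone actually ran.

```python
from collections import defaultdict


def classify_sessions(events):
    """Label each successfully saved wikitext session as JS or no-JS.

    `events` is an iterable of dicts with the keys `editing_session_id`
    (assumed name), `action`, `editor_interface` and `user_id`.
    Returns {session_id: (audience, kind)} where audience is "anon" or
    "loggedin" and kind is "js" or "no_js".
    """
    sessions = defaultdict(lambda: {"actions": set(), "anon": None})
    for event in events:
        if event["editor_interface"] != "wikitext":
            continue  # the proposal filters for the wikitext editor only
        session = sessions[event["editing_session_id"]]
        session["actions"].add(event["action"])
        session["anon"] = event["user_id"] == 0

    labels = {}
    for session_id, session in sessions.items():
        # Only sessions that started and ended in a successful save count.
        if not {"init", "saveSuccess"} <= session["actions"]:
            continue
        kind = "js" if "ready" in session["actions"] else "no_js"
        audience = "anon" if session["anon"] else "loggedin"
        labels[session_id] = (audience, kind)
    return labels
```

Sessions that never reach saveSuccess are dropped rather than counted, which matches the decision to restrict the analysis to completed edits.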

Things that might confuse our results:

  • This would be using saveSuccess as a way to limit it to sessions that resulted in successful edits. If either having-JS or not-having-JS makes saving substantially harder (lack of tools / bugs), our numbers would be misleading.
  • This would exclude VisualEditor users, depressing the overall JS numbers. This would be easy to compensate for by showing a "number of successful edits from VE" figure in the same time period.
  • Bots would probably still be included. Depending on the bot's methodology, it could potentially be classed as JS or no-JS, or bypass this editor entirely and use the API to make its edits.

Thank you for the ping, @DLynch.

To answer the question you posed about our ability to take this on, I asked JK [1] how important the information requested in this task is and how urgently it is needed. JK said we can expect answers on both points this week.

Also, @kaldari, if there is additional context here, please let us know.


  1. My understanding is that this task is intended to inform a higher-level decision that JK and others need to make.

@ppelberg - JK says it isn't urgent. Let's circle back on this after the holidays.

Sounds good. Thanks for following up, @kaldari.

This would be using saveSuccess as a way to limit it to sessions that resulted in successful edits. If either having-JS or not-having-JS makes saving substantially harder (lack of tools / bugs), our numbers would be misleading.

I don't actually think this would be misleading, as we want to find out how many actual edits are made with no-JS (i.e. how many edits would we lose by disabling no-JS editing support).

This would exclude VisualEditor users, depressing the overall JS numbers. This would be easy to compensate for by showing a "number of successful edits from VE" figure in the same time period.

Sounds like a good plan.

Bots would probably still be included. Depending on the bot's methodology, it could potentially be classed as JS or no-JS, or bypass this editor entirely and use the API to make its edits.

@DLynch - What we're specifically looking for is no-JS edits made through any editing interface besides the API or mobile apps (regardless of whether they are by bots or not). Is there a way to exclude API edits from the totals? Basically we just need to justify with actual data whether or not we should continue to maintain a no-JS editor (as part of a broader evaluation of all of our no-JS support). I imagine our no-JS editor is used a fair bit, but we need data rather than speculation. The rationale for having an editing API is separate and doesn't need further justification.

kaldari added a comment. Edited Apr 10 2020, 5:38 PM

To answer my own question, it looks like we could limit it to cases where editor_interface = wikitext and integration = page (to make sure we exclude app edits) for the no-JS number.

@ppelberg - Well, it's after the holidays, but probably an even worse time to bring this up. Regardless, we need this data to move forward with our no-JS guidelines for engineering. From David's analysis, it sounds like this would be a relatively small task (maybe a few days for one engineer). Is there any chance the Editing team could do this in Q4?

@kaldari Yeah, you beat me to it -- API requests shouldn't do any of this logging, and (assuming that the apps are both doing their logging and tagging it correctly) we can filter it down to just on-page stuff with the filter you suggested.

Do you care about restricting it to just desktop browsers, as well? Or breaking out the platforms in reporting? (I assume that disabling JS on mobile browsers is vanishingly rare, but haven't verified this. Although... I think anyone with JS disabled would have to jump through some hoops to get out of our completely JS-dependent MobileFrontend, so there are enough barriers already that I suspect we wouldn't get helpful numbers.)

@DLynch - Interestingly, disabling JS on the mobile web relegates you to using the core/WikiEditor editor, which always reports as "desktop" (T249944), so breaking it out by platform would probably be pointless.

Mayakp.wiki moved this task from Backlog to Triage on the Product-Analytics board. Apr 15 2020, 7:57 PM
Mayakp.wiki added a subscriber: Mayakp.wiki.
kaldari added a comment. Edited Apr 29 2020, 6:17 PM

FYI, if T249945 is completed soon this task may become easier to complete.

kaldari updated the task description. Apr 29 2020, 6:27 PM
kaldari added a comment. Edited Apr 29 2020, 6:40 PM

After doing a bit more sussing out, I think I've come up with specific steps that could be used to answer each of the 3 questions in the task description:

Snapshot % of user edits done with no JS support:
Within a specific timespan, take:
number of sessions where user_id !== 0, integration === page, actions include init, saveSuccess, and not ready.
Divide by:
number of sessions where user_id !== 0, integration === page, actions include init, ready, and saveSuccess.

Snapshot % of anon edits done with no JS support:
Within a specific timespan, take:
number of sessions where user_id === 0, integration === page, actions include init, saveSuccess, and not ready.
Divide by:
number of sessions where user_id === 0, integration === page, actions include init, ready, and saveSuccess.

Snapshot % of all edits done with no JS support:
Within a specific timespan, take:
number of sessions where integration === page, actions include init, saveSuccess, and not ready.
Divide by:
number of sessions where integration === page, actions include init, ready, and saveSuccess.

Note that I've decided not to worry about which editor the user is using (editor_interface), which should simplify things a bit.
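The three calculations above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: each session is assumed to be pre-aggregated into a dict with `user_id`, `integration` and a set of `actions`, and the `summarize` helper and its output keys are hypothetical names. Note that the steps as written divide the no-JS count by the JS count, which gives a no-JS-to-JS ratio; the sketch also reports the share of all completed sessions (no-JS divided by no-JS plus JS) for comparison.

```python
def summarize(sessions):
    """Compute no-JS figures for logged-in, anon and all on-page sessions.

    `sessions` is a list of dicts with `user_id`, `integration` and
    `actions` (a set of action names seen in that session).
    """
    # Keep on-page sessions that started and ended in a successful save.
    completed = [
        s for s in sessions
        if s["integration"] == "page"
        and {"init", "saveSuccess"} <= s["actions"]
    ]
    groups = {
        "user": [s for s in completed if s["user_id"] != 0],
        "anon": [s for s in completed if s["user_id"] == 0],
        "all": completed,
    }
    result = {}
    for name, rows in groups.items():
        no_js = sum(1 for s in rows if "ready" not in s["actions"])
        js = len(rows) - no_js
        result[name] = {
            "no_js": no_js,
            "js": js,
            # As described in the steps: no-JS sessions divided by JS sessions.
            "pct_as_described": 100 * no_js / js if js else None,
            # Alternative: no-JS sessions as a share of all completed sessions.
            "pct_of_completed": 100 * no_js / len(rows) if rows else None,
        }
    return result
```

The two percentages diverge as the no-JS count grows, so whichever one is reported should be stated explicitly alongside the numbers.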

Assignment of this depends on T251464; moving to Needs Investigation

@kaldari I tried the above method and have posted the detailed numbers in this GitHub notebook.

TL;DR

Results for 2020:

  • snapshot % of user edits done with no JS support: 26.74%
  • snapshot % of anon edits done with no JS support: 12.79%
  • snapshot % of all edits done with no JS support: 24.08%

Results for 2019:

  • snapshot % of user edits done with no JS support: 28.92%
  • snapshot % of anon edits done with no JS support: 13.98%
  • snapshot % of all edits done with no JS support: 25.83%

The percentages listed for non-JS editors are much higher than expected, which suggests that a large portion of these users have ad blockers installed and/or DNT enabled rather than JS disabled: if client-side events such as ready are blocked, a JS session looks like a no-JS session. As a result, we don't think this data is useful for determining the percentage of non-JS users, and we recommend adding instrumentation if more accurate numbers are needed. A breakdown by editor interface helps clarify these numbers: it shows a small no-JS percentage even for VisualEditor, which requires JS, so those sessions are probably users with client-side event blocking enabled.

Thanks for taking a look @Mayakp.wiki! I'll create a new task for creating new instrumentation that isn't affected by ad blocking.

kzimmerman closed this task as Resolved. Aug 11 2020, 5:04 PM
Restricted Application added a project: User-Ryasmeen. Aug 11 2020, 5:04 PM