
Set up a process to keep a tally of problematic VisualEditor edits
Closed, Resolved (Public)


Goal: Set up a process for regularly tallying VisualEditor edits and classifying them according to whether they're problematic. This will be in addition to the existing automated tests and QA efforts.


  • Review a batch of edits from recent changes with the VisualEditor tag on a selection of wikis. Initial numbers: ~500 edits total from the ~5 biggest wikis, but this can be tweaked as long as we get a reasonably representative sample of edits.
  • Classify them by handcoding them according to the categories below.
  • Do this before the weekly VisualEditor triage meetings so the bugs can be reviewed then.
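As a rough illustration of the sampling step, the sketch below builds MediaWiki API URLs that list recent edits carrying the `visualeditor` tag. The helper name `buildSampleUrl`, the example hosts, and the even split of about 100 edits per wiki are assumptions made for this example, not part of the process described above.

```javascript
// Hypothetical helper: builds a list=recentchanges query restricted to
// edits tagged "visualeditor" on one wiki.
function buildSampleUrl(wikiHost, limit) {
  const params = new URLSearchParams({
    action: 'query',
    list: 'recentchanges',
    rctag: 'visualeditor',        // only edits made with VisualEditor
    rctype: 'edit',               // skip log entries and page creations
    rcprop: 'ids|user|timestamp', // revision IDs, user name, time
    rclimit: String(limit),
    format: 'json'
  });
  return 'https://' + wikiHost + '/w/api.php?' + params.toString();
}

// ~500 edits over ~5 wikis is roughly 100 per wiki (assumed even split).
const sampleHosts = ['en.wikipedia.org', 'fr.wikipedia.org']; // example hosts only
const sampleUrls = sampleHosts.map(function (host) {
  return buildSampleUrl(host, 100);
});
```

Each URL can then be fetched and the returned revision IDs opened as diffs for handcoding.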

Possible categories for edits:

  • no problem
  • unavoidable user error
  • borderline (could be improved with UX)
  • obvious bug
  • unknown (not apparent what the user attempted to do, or what the problem was)
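A tally over these categories could be computed along the following lines. This is only a minimal sketch: the short category strings and the `tally` function are illustrative names, and each review is assumed to be an object with a `category` field.

```javascript
// Short labels for the five categories listed above (wording is illustrative).
const CATEGORIES = [
  'no problem',
  'unavoidable user error',
  'borderline',
  'obvious bug',
  'unknown'
];

// Counts how many reviews fall into each category.
// Throws on a category that is not in the list, to catch typos early.
function tally(reviews) {
  const counts = {};
  CATEGORIES.forEach(function (category) {
    counts[category] = 0;
  });
  reviews.forEach(function (review) {
    if (CATEGORIES.indexOf(review.category) === -1) {
      throw new Error('Unknown category: ' + review.category);
    }
    counts[review.category] += 1;
  });
  return counts;
}
```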

In order to save time, a simple user script or similar may help make the process faster. For example, the script could add a few buttons in the diff interface that match the different classifications, and automatically add the diff to the appropriate wikipage / spreadsheet / tally, instead of manually copy-pasting each URL.

Event Timeline

gpaumier claimed this task.
gpaumier raised the priority of this task from to High.
gpaumier updated the task description.

An update as promised: I now have a partial user script for this. I'm working on a local file but I've saved a copy at

Basically, the script adds a series of buttons, one for each category of diff. When the user clicks a button, the script records the revision ID, user name, time and diff category, and temporarily saves all that into the browser's localStorage. When the user is done with a review session, they can save the data on a wiki page.
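The recording step described here might look roughly like the sketch below. It is not the actual script: the storage key, `recordReview`, and `createMemoryStorage` are hypothetical names, and the storage object is passed in as a parameter so the example also runs outside a browser.

```javascript
const STORAGE_KEY = 'veDiffReviews'; // hypothetical key name

// Appends one review to the list kept in a localStorage-like object
// and returns the number of reviews saved so far.
function recordReview(storage, review) {
  const saved = JSON.parse(storage.getItem(STORAGE_KEY) || '[]');
  saved.push({
    revId: review.revId,       // revision ID of the diff
    user: review.user,         // author of the edit
    time: review.time,         // timestamp of the edit
    category: review.category  // category chosen by the reviewer
  });
  storage.setItem(STORAGE_KEY, JSON.stringify(saved));
  return saved.length;
}

// In-memory stand-in exposing the same getItem/setItem methods as localStorage.
function createMemoryStorage() {
  const data = {};
  return {
    getItem: function (key) {
      return Object.prototype.hasOwnProperty.call(data, key) ? data[key] : null;
    },
    setItem: function (key, value) {
      data[key] = String(value);
    }
  };
}
```

In a browser, `window.localStorage` can be passed as the `storage` argument, so the reviews survive page loads until the session is saved to the wiki.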

All but the last part (saving on a wiki page) is now implemented and seems to work. This is the first JavaScript I've written, so it's probably far from pretty or optimized, but it seems to mostly do what it's supposed to do. I should be able to finish this early next week and start classifying the diffs.

You're fast! I forgot to point you to one related piece of work -- the efforts by @Halfak and friends to create handcoding tools as part of the revision scoring project. See -- there might be opportunities to share code (though obviously if you've got something that works for you, no need to complicate it).

Cool! As I suspected, the use cases for manually classifying edits are quite broad and common. We've been working on a somewhat general solution. It won't be ready in time for your work, but I'm happy to know that (1) this need exists here and (2) we agree on what good strategies look like.

@gpaumier Could you review and give us any feedback, insights or pull requests you've got? :) The most up-to-date mocks and discussions are on the talk page.

@Eloquence and @Halfak: Thank you for the links! Since our timeline for the VE diff monitoring is more pressing, I've proceeded with my user script for now.

I now have a working script at ; anyone can test it by adding this to their common.js


Once a user is done tagging the diffs, they can save the tallies for their session on a wiki page, which for now is one of my sandboxes. I'm not sure if it should be in the user's own sandbox, or a central page for all users. I guess this depends on whether we're actually expecting people other than me to use this.

At the moment, the script creates a timestamped section every time the user saves a set of tags. If that's good enough for our use, then we're mostly done. Do we want better tallying / aggregation?
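For illustration, a timestamped section of this kind could be generated as wikitext with a small function like the one below. The table layout, the column names, and the function name `buildSection` are assumptions for the example; the actual script's output may differ.

```javascript
// Builds a wikitext section whose heading is the save time and whose body
// is a table with one row per reviewed diff.
function buildSection(reviews, savedAt) {
  const heading = '== ' + savedAt.toISOString() + ' ==';
  const rows = reviews.map(function (r) {
    return '|-\n| [[Special:Diff/' + r.revId + ']] || ' + r.user +
      ' || ' + r.time + ' || ' + r.category;
  });
  return [
    heading,
    '{| class="wikitable"',
    '! Diff !! User !! Time !! Category'
  ].concat(rows, ['|}']).join('\n');
}
```

Appending one such section per save keeps sessions separate; aggregation across sessions would then need a second pass over the page.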

There are a number of things that can be improved if we want:

  • Clean up that messy code and rearrange the buttons a bit.
  • Only show the tool when we're looking at a diff.
  • Make the script translatable, and show it in the user's language, if we want to make it usable by other users.
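One possible way to make the script translatable without copying it is a message table keyed by language code, with English as the fallback. The sketch below assumes this approach; the message keys, the `msg` helper, and the French strings are illustrative only.

```javascript
// Illustrative message table; keys and translations are examples.
const MESSAGES = {
  en: { noProblem: 'No problem', obviousBug: 'Obvious bug', save: 'Save reviews' },
  fr: { noProblem: 'Aucun problème', obviousBug: 'Bogue évident' }
};

function hasOwn(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns the message in the requested language, falling back to English,
// and finally to the key itself if no translation exists.
function msg(lang, key) {
  if (hasOwn(MESSAGES, lang) && hasOwn(MESSAGES[lang], key)) {
    return MESSAGES[lang][key];
  }
  return hasOwn(MESSAGES.en, key) ? MESSAGES.en[key] : key;
}
```

In a MediaWiki user script, the language code would typically come from `mw.config.get('wgUserLanguage')`.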

An update after talking with @Eloquence today:

I'm not sure if it should be in the user's own sandbox, or a central page for all users. I guess this depends on whether we're actually expecting people other than me to use this.

We want to encourage other people to participate, so the script needs to be a bit more robust and polished than if it was just me using it; this means going ahead with a few improvements like those mentioned above.

In addition to those, Erik suggested adding a free-form comment field, for diffs that are difficult to classify or to record the rationale for a tag. I'm thinking a Twitter-style length limit would work well.
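A length check for such a comment field could be as simple as the following sketch. The 140-character value is an assumption based on Twitter's limit at the time, and `checkComment` is a hypothetical name.

```javascript
const MAX_COMMENT_LENGTH = 140; // assumed limit, modeled on Twitter

// Trims the comment and reports whether it fits within the limit,
// along with how many characters remain (negative when too long).
function checkComment(text) {
  const trimmed = text.trim();
  return {
    text: trimmed,
    remaining: MAX_COMMENT_LENGTH - trimmed.length,
    ok: trimmed.length <= MAX_COMMENT_LENGTH
  };
}
```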

I've updated the script with a newer version. There's now a comment field and a submit button, and the code is a little cleaner.

[Screenshot: diff_review.png]

See for an example of what the table looks like after a review session.

I've finished the first batch of diffs for en.wp and posted the results at .

An update:

  • @Catrope has agreed to take a look at the "problematic" diffs identified during the review sessions, to help me figure out if they're due to a known bug, or if a new ticket should be created. We did this for the first batch and agreed to take ~10 minutes every week to go over the new ones.
  • The script now posts to the Wikipedia:VisualEditor/Feedback/Diffs page on the English Wikipedia.
  • The script now only shows on diff pages, and more specifically on diffs of edits made with VisualEditor.
  • There is a translated version of the script on the French Wikipedia. It's basically a copy with translated strings, and there are probably better ways to handle this without duplicating the code on every wiki, but this is good enough for now and it allows people to use it there.
  • I've made a second review of 100 diffs on the English Wikipedia, and a first review of 100 diffs on the French Wikipedia.
  • I've reached out to NicoV to get his feedback on the process and the classification of diffs, since he has diligently been reviewing so many VE diffs during the past few months. After I get his feedback, I'll reach out to more users to let them know about the tool.
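The "only on VisualEditor diffs" condition mentioned above can be sketched as a small predicate. This is an assumption about how such a check might work, not the script's actual logic: `shouldShowTool` is a hypothetical name, and the list of change tags is assumed to have been read from the page beforehand.

```javascript
// Returns true only when the page is a diff view (the URL carries a "diff"
// parameter) and the edit has the "visualeditor" change tag.
function shouldShowTool(search, tags) {
  const params = new URLSearchParams(search);
  const isDiff = params.has('diff');
  const isVisualEditor = tags.indexOf('visualeditor') !== -1;
  return isDiff && isVisualEditor;
}
```

In a browser, `search` would be `window.location.search`. Diffs reached through other URL forms, such as `Special:Diff/123`, carry no `diff` parameter, so a real script would likely need an additional check.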

I tried VEDiff following Guillaume's suggestion, but I don't see any button to manage the set of reviews.
I do have the buttons to review a given diff, but I can't publish the reviews.

Here's a screenshot (Chrome, Windows 7)

[Screenshot: VEDiff.png]

@NicoV: Thank you. It's most likely due to the RevertDiff gadget that removes some elements from the page and prevents the vediffs buttons from loading. I should be able to work around it.

@NicoV: I tested, and the problem doesn't come from RevertDiff. Do you have any other gadgets enabled?

@gpaumier:
I disabled the gadgets one by one in Preferences, with no effect. The gadgets I have enabled in Preferences are AncresTitres, HomonymiesEnCouleur, NewCollapsible, Popups, WikiOpenStreetMap, WikiMiniAtlas, SousPages, EditZeroth, OngletPurge, ResumeDeluxe, MonobookToolbarStandard, SpecialChars, RevertDiff, Wdsearch and BugStatusUpdate.

Other than that, I have a few things in my common.js (OneClickArchiver, SyntaxHighlighter, VEDiff) and in my vector.js (TemplateDataEditor, LiveRC, verifHomon, xpatrol, couleur contributions).

@NicoV Thanks for the details. I tested everything I could, enabled the same gadgets, copied your common.js and your vector.js, and the buttons still work for me :( Without more information, I don't see what else I can test. If all else fails, you can retrieve the reviews you've already made; they're in your browser's localStorage. It's not ideal, but it's the best I can do.

The user script now exists and the process is in place. I'm going to continue to review VE diffs on a weekly basis but that's an ongoing task, so I'm going to close this as "resolved".

@Whatamidoing-WMF: At the moment this is happening on three wikis: the English, French and Italian Wikipedias. See T94767 for the details. should be fixed, I think? That line of code needs to be added to common.js (not .css).