Many file descriptions added from Commons Android app (Suggested Edits) are unhelpful
Open, Needs Triage · Public


As the uploader of a very large number of generic Commons images (about 250,000 BSicons), it seems to be common for random users to find my uploads through Special:Random. Since about August, some of those users have started making edits, like this one. (Virtually all of the edits are tagged as Mobile edit, Mobile app edit, Android app edit, Suggested Edits edit.)

Unfortunately, many of those edits have been unproductive, and their quality has decreased over time. Even when users make a genuine attempt, the descriptions are often poor: for a specific group of images like BSicons, generic descriptions like "wheelchair icon" are not particularly helpful. Many users simply copy text from other parts of the page (including the page title and the URL), and I have had to remove or replace almost all of those descriptions. While this is not a problem with the code, I thought it was worth filing a Phabricator task anyway, since the problem seems to be systematic and frequently produces bad outcomes, and it is unclear to me whether the developers of this feature are aware of it.

In the past 30 days, there were sixteen edits resulting from this feature to pages on my watchlist, and the majority of them were unhelpful.

  • Five edits (1 2 3 4 5) appeared to be blatant self-promotion (e.g. the user's own username).
  • Five edits (1 2 3 4 5) appeared to be test edits, and the added descriptions were unrelated to the content.
  • Six edits (1 2 3 4 5 6) appeared to be coherent descriptions.
    • Of these, four were variations of "Image for BSicon diagrams", the generic description for most of my BSicon uploads; one in English (somehow worded worse than the original), two in Turkish, and one in German. It's obviously somewhat useful to have this translated, and it's not a terrible description, but the process is insanely inefficient compared to, say, doing this in a more semi-automated fashion (e.g. adding lots of descriptions for whole groups of similar images in multiple languages through QuickStatements).
    • The other two were variations of "wheelchair logo": one in Turkish, and one in English (but incorrectly inserted into the German field, so I removed it). Again, not terrible, generally accurate, but it could be better. (Almost every not-terrible description which isn't "Image for BSicon diagrams" has been something like "wheelchair logo", for some reason.)

There aren't really any standards for descriptions in structured data, but most of the descriptions I've written myself sound something like "BSicon, metro (dark blue), representing an open accessible stop on a smooth reverse curve on a closed underground line". (I've modelled these on the tone of Wikidata descriptions.) Unfortunately I haven't added a lot of these yet, because I currently mass-upload using pywikibot and would need to significantly complicate my workflow in order to add detailed descriptions to my new uploads.

Looking at the recent changes resulting from Suggested Edits, some of the descriptions seem to be somewhat-not-terrible, but many of them are either lifted directly from the existing non-structured descriptions (without translation) or are obviously vandalism. They seem to be a little better than the ones on my uploads, though. I don't have any concrete suggestions for how the situation could be improved, although some possibilities that come to mind are:

  • Excluding certain groups of images from the feature (e.g. images in X category or with X extension).
  • Disabling the feature if e.g. the user has recently visited Special:Random and has made fewer than ten edits.
  • Limiting the edits to translations.
  • Introducing edit filters for descriptions (e.g. URLs, usernames, emoji).
  • Turning off the feature until e.g. a clearer baseline has been established for how to write good descriptions.
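To make the edit-filter idea above more concrete, here is a minimal Python sketch of the kind of check I have in mind. It is not how the app or AbuseFilter works; the function names, the rejection messages and the emoji heuristic are all my own assumptions, and a real filter would need tuning (for example, very short usernames would cause false positives).

```python
import re
import unicodedata
from typing import Optional

# Hypothetical pattern: anything that looks like a web address.
URL_RE = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def normalize(text: str) -> str:
    """Case-fold and keep only letters and digits, for loose comparison."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def contains_emoji(text: str) -> bool:
    # Approximation (assumption): most emoji are in Unicode category "So".
    return any(unicodedata.category(ch) == "So" for ch in text)


def title_variants(page_title: str) -> set:
    """Normalized forms of the title with and without namespace and extension."""
    name = page_title.split(":", 1)[-1]   # drop a prefix such as "File:"
    stem = name.rsplit(".", 1)[0]         # drop an extension such as ".svg"
    return {normalize(page_title), normalize(name), normalize(stem)}


def rejection_reason(description: str, username: str, page_title: str) -> Optional[str]:
    """Return why a description should be rejected, or None if it passes."""
    if URL_RE.search(description):
        return "contains a URL"
    if contains_emoji(description):
        return "contains emoji"
    norm = normalize(description)
    if not norm:
        return "is empty"
    user = normalize(username)
    if user and user in norm:
        return "contains the editor's username"
    if norm in title_variants(page_title):
        return "copies the page title"
    return None
```

A check like this would only catch the most mechanical cases (copied titles, URLs, usernames, emoji). It would do nothing about generic but technically accurate descriptions like "wheelchair logo".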

Event Timeline

@Jc86035 Thank you, we very much appreciate this constructive and thoughtful piece of feedback, and we had a meeting to connect the things you bring up here with other feedback we've been getting. We're going to have a technical discussion about the feasibility of some of the suggested solutions, and about other things that have come up in discussions around this feature. Among other things, we're aware that we need a better metric for finding bad edits, so that we don't encourage people who don't understand the system to edit and add to the burden on the Commons community.

(In the very short term, if you've got problems with specific users but don't want them blocked, reverting the edits instead of tweaking them will tell the system to lock those editors out of the suggested edits feature. But this is of course not how we plan to deal with this.)

Has there been any progress on this issue? Within my watchlist, the distribution of edits seems to remain about the same as it was three months ago:

  • A few users are labelling some of the images based on the wheelchair. (Those images form a minority of my uploads but the wheelchair still seems to be by far the most or only recognizable feature.)
  • A few users are copying or translating the generic English caption.
  • A few users appear to have a fairly poor command of English but are trying anyway, with mixed results (1, 2, 3, 4).
  • The rest – including some of the former group – appear to be disseminating their own personal information (primarily names and locations), which could be problematic from a privacy perspective if it's not considered self-promotion.

I haven't been following the development of Structured Data too closely; it's understandable if the team developing it has been focused on its other facets, since this only affects a small minority of files.

If this is still true:

Access to the Suggested Edits feature will now be paused for a user for one week the first time their edit quality drops below a certain threshold. They will be given information on how to improve their edits so that they can productively contribute once the feature is re-enabled. The Suggested Edits feature will be permanently disabled for a user in case of consistently low edit quality. The user's account is left unchanged - only the Suggested Edits feature becomes disabled. If the user's account is banned by an admin on either the Commons or Wikidata projects, the SE feature will also be disabled for the user.

Then there is some kind of bug, as there is a fairly continuous stream of reverted edits from user Cubaces4ever2, and I would expect the system to have disabled the feature for that user by now.
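For reference, this is how I read the quoted policy, written as a small Python sketch. The revert-rate threshold, the minimum number of edits, and the reading of "consistently low" as "falls below the threshold a second time" are all assumptions of mine; only the one-week pause comes from the quoted text. The names are hypothetical and are not taken from the app's code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

REVERT_RATE_THRESHOLD = 0.5   # assumed; the quoted text only says "a certain threshold"
MIN_EDITS = 5                 # assumed; avoids judging a user on one or two edits
PAUSE = timedelta(weeks=1)    # from the quoted text


@dataclass
class EditorState:
    paused_until: Optional[datetime] = None
    pauses: int = 0
    disabled: bool = False


def review(state: EditorState, edits: int, reverted: int, now: datetime) -> EditorState:
    """Update a user's Suggested Edits access after a review of recent edits."""
    if state.disabled or edits < MIN_EDITS:
        return state
    if reverted / edits < REVERT_RATE_THRESHOLD:
        return state                        # quality is acceptable, no change
    if state.pauses == 0:
        state.pauses = 1                    # first offence: pause for one week
        state.paused_until = now + PAUSE
    else:
        state.disabled = True               # repeated low quality: disable permanently
    return state


def can_use_feature(state: EditorState, now: datetime) -> bool:
    if state.disabled:
        return False
    return state.paused_until is None or now >= state.paused_until
```

Under this reading, a user whose edits keep getting reverted should be paused after the first review and locked out after the second, which is not what I am seeing.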

Just wanted to provide an update that we conducted an in-depth audit of this and have some next steps to enhance the feature. We are currently awaiting guidance from another team before making updates. You can read more about the research conducted at