
Create a new API endpoint which returns Commons images in need of a caption or caption translation
Closed, Resolved, Public

Description

Requirements:
Given a language, return an image in need of a caption in that language
Given source and destination languages, return an image with a caption in the source language that needs a caption in the destination language

Implementation details
Initially, utilize action=query&generator=random to seed a filter that looks for images matching the criteria requested. Continue to investigate other approaches with a higher guarantee of correctness if/when the random approach is no longer viable
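A minimal sketch of the random-seed approach, assuming Python and only the standard library. The function names are hypothetical, and fetching the captions for each returned file is left out; `captions` is assumed to be a mapping from language code to caption text.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def random_files_url(limit=50):
    """Build a query URL asking Commons for random pages in the File: namespace."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "generator": "random",
        "grnnamespace": "6",  # namespace 6 is File:
        "grnlimit": str(limit),
    }
    return COMMONS_API + "?" + urlencode(params)


def needs_caption(captions, lang):
    """True if the file has no caption in `lang`."""
    return not captions.get(lang)


def needs_translation(captions, source, target):
    """True if the file has a caption in `source` but none in `target`."""
    return bool(captions.get(source)) and not captions.get(target)
```

Because the seed is random, most candidates will fail the filter, so a caller would likely need to request several batches before finding a match.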

Working deadline:
28 June 2019 for API production launch


Event Timeline

LGoto changed the task status from Open to Stalled.Jan 9 2019, 4:53 PM
JoeWalsh renamed this task from Create a new API endpoint which returns Commons images in need of a caption translation to Create a new API endpoint which returns Commons images in need of a caption or caption translation.Jan 9 2019, 9:34 PM
Mholloway changed the task status from Stalled to Open.Apr 5 2019, 6:23 PM
Mholloway updated the task description. (Show Details)

It's not going to be possible in this case to run a series of MySQL queries to populate suggestions as we did for descriptions. Captions are stored along with other media info in JSON blobs in External Storage. From within MediaWiki, they are accessed from a revision record's mediainfo slot. Take, for example, the current latest revision (344850849) of File:Pluto-01 Stern 03 Pluto Color TXT.jpg:

mholloway-shell@mwmaint1002:~$ PHP=php7.2 mwscript shell.php --wiki=commonswiki
Psy Shell v0.9.9 (PHP 7.2.8-1+0~20180725124257.2+stretch~1.gbp571e56 — cli) by Justin Hileman
>>> sudo MediaWiki\MediaWikiServices::getInstance()->getRevisionStore()->getRevisionById(344850849)->getContent('mediainfo')->getMediaInfo()->getLabels()->toTextArray();
=> [
     "en" => "High-resolution MVIC image of Pluto in enhanced color to bring out differences in surface composition.",
     "de" => "Dies ist ein hochaufgelöstes Photo des Planeten Pluto, wobei die Farben verstärkt wurden, um Oberflächenstrukturen hervorzuheben.",
     "ru" => "MVIC-изображение Плутона с высоким разрешением в улучшенном цвете, чтобы выявить различия в составе поверхности.",
     "uk" => "MVIC-зображення Плутона з високою роздільною здатністю у покращеному кольорі, щоб виявити відмінності у складі поверхні.",
     "be" => "MVIC-выява Плутона з высокім разрозненнем у палепшаным колеры, каб выявіць адрозненні ў складзе паверхні.",
     "fi" => "Korkearesoluutioinen MVIC-kuva Plutosta, jonka korostetut värit tuovat esiin pinnanmuotojen erot.",
     "ja" => "冥王星の高解像度MVICイメージ。表面上の性質の違いを出すために、色をはっきりさせた。",
     "eu" => "Erresoluzio altuko Plutonen MVIC irudia azaleko ezaugarriak hobeto bereizteko koloreztatuta.",
   ]

This being the case, perhaps the most promising way for populating a suggestions table is to run a deferred update on page save (as we're currently doing for the counters) and insert/update suggestion data about the newly created revision.

See also T219502: wb_terms is deprecated for background on the current Wikibase storage scheme. wb_terms is not in use on commonswiki (it exists, but is empty).

It looks like we can piggyback on Extension:GlobalUsage to get the pages using a file per wiki. (I am curious as to how this flow will work in a suggestions scenario in the app, and will inquire with the team.)

It's also worth exploring whether/how we can leverage the WikibaseCirrusSearch extension here -- maybe we don't need to be doing so much work in large, custom MySQL tables.

To update where we're at: the new hascaption search keyword can be used like this to get images without captions in lang X, or with captions in lang X but not Y, along with their global wiki usage:

https://commons.wikimedia.org/w/api.php?action=query&format=json&formatversion=2&prop=globalusage&guprop=url|pageid|namespace&gulimit=500&gufilterlocal=&generator=search&gsrsearch=hascaption:en%20-hascaption:ru&gsrsort=just_match&gsrnamespace=6&gsrlimit=50

For association with Wikipedia articles, you could do something like continue the query until you find an image used in one or more articles on the wiki(s) of interest. (N.B. The indices that back this search are updated in real time as the wiki is edited, but the results are not random. Per T220282#5094412 random results are something we could explore, but it's not provided for at this point).
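The "continue the query until you find an image used on the wiki of interest" idea could be sketched roughly as follows, in Python. The query parameters are copied from the URL above; the function names, the injected `fetch` callable (which takes a URL and returns the decoded JSON), and the assumption that each `globalusage` entry carries a `wiki` field such as `ru.wikipedia.org` are illustrative and not taken from any deployed code.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def caption_search_url(has_lang, lacks_lang, cont=None):
    """Build the hascaption search URL, merging in any continuation values."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "globalusage",
        "guprop": "url|pageid|namespace",
        "gulimit": "500",
        "gufilterlocal": "",
        "generator": "search",
        "gsrsearch": f"hascaption:{has_lang} -hascaption:{lacks_lang}",
        "gsrsort": "just_match",
        "gsrnamespace": "6",
        "gsrlimit": "50",
    }
    params.update(cont or {})
    return COMMONS_API + "?" + urlencode(params)


def first_used_on(fetch, has_lang, lacks_lang, wiki, max_requests=5):
    """Return the first result page used on `wiki`, or None if none is found."""
    cont = None
    for _ in range(max_requests):
        data = fetch(caption_search_url(has_lang, lacks_lang, cont))
        for page in data.get("query", {}).get("pages", []):
            usage = page.get("globalusage", [])
            if any(entry.get("wiki") == wiki for entry in usage):
                return page
        cont = data.get("continue")
        if not cont:
            return None  # results exhausted
    return None  # gave up after max_requests
```

The cap on requests is a design choice for the sketch: given how few non-English captions exist, an unbounded loop could page through a very large result set before finding a match.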

With that said, there are a couple of tricky practical issues I've noticed in the SDoC context:

(1) Unlike with Wikidata, where there is in principle a 1:1 relationship between a Wikidata item and an article on a given Wikipedia, the relationship between Commons files and articles on a given Wikipedia is one-to-many (or one-to-none).

(2) Because SDoC is so new, many images currently have only a single English structured caption, or no structured captions at all. A search for images that have a caption in English and don't have one in pretty much any other language, or simply that don't have a caption in ${non-English-lang}, will have tons of results, many of which may be pretty obscure and not used in any Wikipedia article. Conversely, a search for images with a caption in ${non-English-lang} and not English will likely turn up relatively few candidates.

Have you considered strategies for identifying candidates for caption addition/translation based on their usage in Wikipedia articles of interest, rather than attempting to identify random image candidates directly from Commons based on their captions and work back to Wikipedia articles? One idea that comes to mind is that we could populate caption addition/translation suggestions by querying caption info for articles the user visits as the user browses. Another idea is that we could do something like identify candidates used in (random members of) the top 100 or 1000 articles for a target wiki.

(2) Because SDoC is so new, many images currently have only a single English structured caption, or no structured captions at all. A search for images that have a caption in English and don't have one in pretty much any other language, or simply that don't have a caption in ${non-English-lang}, will have tons of results, many of which may be pretty obscure and not used in any Wikipedia article.
[...]
Have you considered strategies for identifying candidates for caption addition/translation based on their usage in Wikipedia articles of interest, rather than attempting to identify random image candidates directly from Commons based on their captions and work back to Wikipedia articles? One idea that comes to mind is that we could populate caption addition/translation suggestions by querying caption info for articles the user visits as the user browses. Another idea is that we could do something like identify candidates used in (random members of) the top 100 or 1000 articles for a target wiki.

This is a good point @Mholloway. I'm all for making sure that the translation work people do will be seen/used by others. I'm not sure how resource intensive it would be to get captions for articles from a user's history (perhaps @Dbrant can weigh in), but it seems feasible to query the top N articles in a wiki first for images lacking structured captions in that language. (We could increase the value of N presumably as more and more images got these types of captions.)

It makes total sense to start with identifying popular Wiki articles, and seeing if the images they contain have a structured caption. This will certainly be better than picking Commons images at random, most of which won't be linked to any Wikipedia article.

The only other thing to note is that the caption of the image (structured or not) is generally *not* what's used in the Wikipedia article! The actual captions used on Wikipedia are local to the specific article to provide the correct context for that usage. So, in order to see their contributions, the user would have to explicitly click on the image to go to our Gallery, where we can show the structured caption (presumably alongside the article-specific caption for the image).

Have you considered strategies for identifying candidates for caption addition/translation based on their usage in Wikipedia articles of interest, rather than attempting to identify random image candidates directly from Commons based on their captions and work back to Wikipedia articles? One idea that comes to mind is that we could populate caption addition/translation suggestions by querying caption info for articles the user visits as the user browses. Another idea is that we could do something like identify candidates used in (random members of) the top 100 or 1000 articles for a target wiki.

Also, in the long run, distinguishing images used anywhere outside of Commons from images not used anywhere would be a good guideline. Captioning an image of a small town that has articles in five languages is very relevant compared to captioning the average image no one has ever used for anything.

I like the idea of using popular pages, but that's a case for which I think we're back to needing to store candidates in some kind of efficient queue (if only to ensure that we don't end up giving a number of editors the same suggestion at the same time). We don't have such a queue at the moment. I've attempted to reopen that conversation at T206504#5178006, but I don't think it's wise to block on the outcome.
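To illustrate only the "don't give several editors the same suggestion" requirement, here is a toy single-process sketch in Python. The class and method names are hypothetical, it is not thread-safe, and it is not the queue under discussion at T206504.

```python
from collections import deque


class SuggestionQueue:
    """Hands out each candidate file title at most once."""

    def __init__(self):
        self._pending = deque()
        self._seen = set()  # every title ever added, to block re-queuing

    def add(self, title):
        """Queue a candidate unless it has been queued before."""
        if title not in self._seen:
            self._seen.add(title)
            self._pending.append(title)

    def take(self):
        """Return the next candidate, or None if the queue is empty."""
        return self._pending.popleft() if self._pending else None
```

A real deployment would need shared storage and expiry so that abandoned suggestions return to the pool; both are out of scope here.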

@Dbrant, what do you think of the idea of a /page/media-like endpoint that simply reports the captions present for all of the images on a given page? It could be queried immediately upon the user visiting a page, and any suitable images could be immediately added to the suggestions list. Given how few structured captions currently exist, especially for languages other than English, I suspect that strategy would generate suggestions quite rapidly; and it could be prototyped for testing very quickly (as in an hour or two).

In preparation for tomorrow's check-in, here are some additional, high-level notes on where the description and image caption suggestion APIs are:

Captions

As noted above in T209997#5170452, the new haslabel/hascaption search keyword provides for fast CirrusSearch-powered querying of Commons files based on the presence or absence of structured captions per specified language. However, some issues remain:

(1) It's not currently possible to use file usage per wiki as a query filter. It is possible to obtain the global usage of a provided set of files (including by using haslabel/hascaption as a generator), but it may require continuing a query several times to find even one suitable candidate for a given language or language pair (especially given the current relative lack of structured captions for languages other than English).

(2) In the Commons context, finding random candidates with the required caption language characteristics is probably not sufficient on its own to generate suggestions that are of interest to the user.

(3) Generating suggestions from the set of images present on top-viewed pages is a promising strategy, but probably requires some viable intermediary queuing setup to avoid giving several editors the same suggestion at the same time. That doesn't currently exist, though technical negotiations have been reopened (T206504#5178006).

Open questions:

  • Should we continue to push for an in-memory suggestions queue or once again attempt to work around its absence?
  • What does the minimum viable product look like here?

Descriptions

The CirrusSearch hasdescription keyword can be used to query Wikidata items based on the presence or absence of descriptions by language. CirrusSearch does not currently support searches by sitelink, but that is under discussion at T220282 as a new search feature. In the meantime, I would expect that post-hoc filtering by sitelink would be much more effective here than in the SDoC context given the mature Wikidata data set, and randomness could be "fudged" as needed with search offsets and/or post-hoc result sorting.
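One way to read "fudging" randomness with search offsets and post-hoc sorting is sketched below in Python. The function name and the `fetch_page(offset, limit)` callable are hypothetical stand-ins for a search request; the caller is assumed to know roughly how many hits the search has.

```python
import random


def fudged_random_pick(fetch_page, total_hits, page_size=50, rng=random):
    """Fetch one page of results at a random offset and shuffle it."""
    # Keep the window inside the result set so the page is full when possible.
    max_start = max(total_hits - page_size, 0)
    offset = rng.randrange(0, max_start + 1)
    results = list(fetch_page(offset, page_size))
    rng.shuffle(results)
    return results
```

This is not uniformly random over individual results, only over windows of consecutive results, which may be good enough for suggestions. Search backends typically cap how deep an offset may go, so `total_hits` would need clamping to that cap.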

Given that the existing client-side search strategy seems viable for the time being, further work on this is on hold until the caption suggestions work is complete, unless the Android team requests otherwise.

Mholloway updated the task description. (Show Details)

@Charlotte @Mholloway

These are the criteria suggestions for unlocking the image caption tasks:

  • Add image captions:
    • 3 edited image captions in the app (non-reverted)
  • Translate image captions:
    • More than one language set in the app
    • 25 edited image captions (non-reverted) in the app or via “Suggested edits”.

Note: General image caption editing in the app is available to everyone and does not depend on previously unlocked “Suggested edits” tasks.
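The proposed unlock criteria could be expressed as a small check like the one below (Python). The function name is hypothetical, the thresholds are read as "at least", and `caption_edits` is assumed to already exclude reverted edits; none of this reflects the final counter configuration.

```python
def unlocked_caption_tasks(caption_edits, app_languages):
    """Return which caption tasks the proposed criteria would unlock."""
    tasks = []
    if caption_edits >= 3:
        tasks.append("add")
    # Translation additionally requires more than one language set in the app.
    if caption_edits >= 25 and len(app_languages) > 1:
        tasks.append("translate")
    return tasks
```

For example, a user with 30 non-reverted caption edits but only one app language would get the add task and not the translate task.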


@Charlotte: a conceptual thought – I’m getting more and more into the idea of including caption editing directly on article pages. Why? We’re already offering editing features right on the article page, and if we’re serious about editing captions in the app, we shouldn’t hide it in the gallery view. Here are some explorations that we could include:

Edit image caption: edit-image-caption.png (2×1 px, 2 MB)
Translate image caption (if there’s none in the user’s second language): translate-image-caption.png (2×1 px, 2 MB)

There is also a way to make this a bit more subtle, e.g. by displaying just an edit pencil at the bottom right of the image. Let me know what you think @Charlotte.

@schoenbaechler - Can I see the subtle version? As it is now, having 2 edit pencils stacked one above the other is confusing to me, especially since we're not actually showing the image caption we're asking them to edit. My initial thought is that they're going to mistake the wikidata description for the image caption.

The caption translation CTA is less problematic I think. Harder to mistake that.

@schoenbaechler I like that idea a lot in principle; but it's important to bear in mind here that the image captions shown in-article are entirely local to the article itself and stored in the article wikitext, and have nothing to do with the Commons captions (structured or otherwise). It could be confusing if the workflow leads directly from the article to adding or translating a Commons caption, because the update wouldn't be reflected in the article itself.

If, on the other hand, the intent is to allow for editing the local, in-article image captions which are seen in the article itself, it will need some technical investigation from both RI and the Android devs into parsing these out from article wikitext or HTML and re-saving as updated. We haven't attempted that before, and it's probably more complicated than it seems.

Would this be enabled for only the lead image (as depicted in your mocks above) or for every image in an article?

In T209997#5207511, @schoenbaechler wrote:

@Charlotte @Mholloway

These are the criteria suggestions for unlocking the image caption tasks:

  • Add image captions:
    • 3 edited image captions in the app (non-reverted)
  • Translate image captions:
    • More than one language set in the app
    • 25 edited image captions (non-reverted) in the app or via “Suggested edits”.

Please let me know when these are final, and I'll update the counter config accordingly. Note that the caption edit counter is currently active and configured with a target count of 50.

I like that idea a lot in principle; but it's important to bear in mind here that the image captions shown in-article are entirely local to the article itself and stored in the article wikitext, and have nothing to do with the Commons captions (structured or otherwise).

And the local image captions are often contextualised or re-contextualised to relate to or explain something in the specific article, explaining how they illustrate the concept at hand in a way the more general Commons captions can't do.

Thanks for the feedback @Charlotte, @Mholloway and @Johan, all good points. There’s definitely potential for confusing our users if it’s designed poorly or isn’t clear enough. The mocks above are pretty early stage. I’m also still exploring whether this is worth pursuing, and will post some updates about it tomorrow.

On another note – I just discussed it with fellow designers who have expertise in editing questions. Below is their initial feedback:

@cmadeo:

  • Could push down the article title and description, which hasn’t been received well before
  • Caption is currently not displayed, which could lead to confusion about what users are actually editing
  • Mentions that there are different types of captions/descriptions, e.g. Commons (structured, unstructured) and Wikipedia image captions
  • Another option: a call to action on the image itself

@Pginer-WMF:

  • Too many pencils (yes!)
  • There are different use cases for editing, e.g.
    1. I want to make a change (dedicated editor)
    2. Discovering editing while reading
  • Intermediate solution can be considered, e.g. tapping the element (image), then reveal actions (based on Carolyn’s input above)

Agreed with @Mholloway that at least initially we should probably constrain this to the Commons captions for technical reasons - so not show the edit pencil anywhere that an in-page caption is used. Otherwise, parsing nightmare.

Change 512388 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/extensions/WikimediaEditorTasks@master] Add caption suggestion support to ApiWikimediaEditorTasksSuggestions

https://gerrit.wikimedia.org/r/512388

Change 512388 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEditorTasks@master] Add caption suggestion support to ApiWikimediaEditorTasksSuggestions

https://gerrit.wikimedia.org/r/512388

Ok, I investigated this further. As @Johan mentioned:

(...) local image captions are often contextualised or re-contextualised to relate to or explain something in the specific article, explaining how they illustrate the concept at hand in a way the more general Commons captions can't do.

It’s important to keep it that way. There’s a general caption of an image (unstructured/structured from Commons) and there’s a caption specific to the article. We already let users edit the article-specific captions.

Therefore, entering the gallery view to edit a caption on Commons makes a lot of sense. However, to promote editing in the app and increase the pool of structured captions, it could be interesting to introduce image caption editing, but only for title images in articles. To be more precise, I’m thinking about these two additions:

  • Display an “Edit caption” CTA below the image, only when there is no structured caption available yet on Commons. Tapping the edit button would lead users to the edit screen that displays the unstructured description.
  • Display “Add $Language caption” below the article image when there’s no structured image caption available in the user’s second language of the app. If there’s a structured caption available in the second language, the CTA will adapt to ask for a caption in the third language (if set), and so forth. Tapping the button will lead users to the dedicated “Translate” edit screen.

Edit caption: edit-caption.png (2×1 px, 2 MB)
Add $language caption: add-x-caption.png (2×1 px, 2 MB)

After adding a caption, users will get back to the article itself. A snackbar at the bottom informs users that the caption’s been added and offers the possibility to view it. Tapping view takes users to the gallery view with the updated, structured caption. To emphasize again, these CTAs below images would only be displayed when there’s no structured image caption available on Commons.

How does that sound @Charlotte, @Mholloway, @Johan and @Dbrant?

Ok, I further investigated the idea of including caption editing directly on article pages (T209997#5207511). As @Johan mentioned earlier:

(...) local image captions are often contextualised or re-contextualised to relate to or explain something in the specific article, explaining how they illustrate the concept at hand in a way the more general Commons captions can't do.

It’s important to keep it that way. There’s a general caption of an image (unstructured/structured from Commons) and there’s a caption specific to the article. We already offer the possibility to edit the article-specific captions via the app (not very convenient yet, but it’s there). I think tackling everything together right now is out of scope. Improving the general edit experience via the app is its own EPIC.

However, to promote “Suggested edits” in the app and increase the pool of structured captions on Commons, it could be very interesting to pursue the idea of image caption editing, but only for title images in articles. To be more precise, I’m thinking about these two additions:

  • Display an “Edit caption” CTA below the image, only when there is no structured caption available yet on Commons. Tapping the edit button would lead users to the edit screen that displays the unstructured description.
  • Display “Add $Language caption” below the article image when there’s no structured image caption available in the user’s second language of the app. If there’s a structured caption available in the second language, the CTA will adapt to ask for a caption in the third language (if set), and so forth. Tapping the button will lead users to the dedicated “Translate” edit screen.

Edit caption: edit-caption.png (2×1 px, 2 MB)
Add $language caption: add-x-caption.png (2×1 px, 2 MB)

After adding a caption, users will get back to the article itself. A snackbar at the bottom informs users that the caption’s been added and offers the possibility to view it. Tapping view takes users to the gallery view with the updated, structured caption.

To emphasize again, these CTAs below images would only be displayed when there’s no structured image caption available on Commons.
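One possible reading of the CTA rules above, sketched in Python rather than app code. The function name and return shape are hypothetical. It assumes `structured` maps language codes to structured captions, `app_languages` lists the user's app languages in order, and the translation CTA applies only once at least one structured caption exists.

```python
def caption_cta(structured, app_languages):
    """Pick the CTA to show below the title image, or None for no CTA."""
    if not structured:
        # No structured caption on Commons at all: offer "Edit caption".
        return ("edit", None)
    # Otherwise walk the second, third, ... app languages in order and
    # offer "Add $Language caption" for the first one lacking a caption.
    for lang in app_languages[1:]:
        if not structured.get(lang):
            return ("add", lang)
    return None
```

With app languages `["en", "de", "fr"]` and captions in English and German, this returns `("add", "fr")`; with only one app language set and an English caption present, it returns `None`.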

How does that sound @Charlotte, @Mholloway, @Johan and @Dbrant?

I think the main problem here would be that you're adding a caption we're not showing in the article unless you tap the image, which could be potentially confusing, unless I'm misunderstanding something. On the other hand, there's a value in surfacing the editing capabilities in the app. I have no strong opinion either way.

(A related problem, I suppose, is that the app already strips the aforementioned local contextualisation from the title image. But this is probably a problem less often than it would be for any other image in the article.)

Thanks for your feedback @Johan.

I think the main problem here would be that you're adding a caption we're not showing in the article unless you tap the image, which could be potentially confusing, unless I'm misunderstanding something.

You’re understanding it correctly. To counteract these concerns, we’re guiding users with the established review process, which consists of:

  • Review screen after users have added a caption (View)
  • Snackbar after returning to the article that informs users that the caption’s been added. It also offers the possibility to view their contribution by tapping “View”. “View” takes users to the gallery view where they see the updated caption.

Change 513340 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/services/recommendation-api@master] Suggestions endpoints for SDC image caption addition/translation

https://gerrit.wikimedia.org/r/513340

Mholloway raised the priority of this task from Medium to High.May 31 2019, 7:45 PM

Memorializing the outcomes of our sync earlier, we decided that:

  • Unstructured captions will be gathered and returned, but results will not be filtered for them;
  • Target wiki articles using the image, if any, are provided in a new 'globalusage' property, but again these are not used as a filter.

Of course this doesn't preclude the app from doing its own filtering or sorting on the results provided.


I also updated the target public deploy date to the Monday after next, since the SRE team is at an offsite next week and we can't deploy then.

Change 513340 merged by Mholloway:
[mediawiki/services/recommendation-api@master] Suggestions endpoints for SDC image caption addition/translation

https://gerrit.wikimedia.org/r/513340

Mentioned in SAL (#wikimedia-operations) [2019-06-20T17:22:48Z] <mholloway-shell@deploy1001> Started deploy [recommendation-api/deploy@7dc63ab]: Deploy Suggested Edits endpoints (T209997, T224233)

Mentioned in SAL (#wikimedia-operations) [2019-06-20T17:25:43Z] <mholloway-shell@deploy1001> Finished deploy [recommendation-api/deploy@7dc63ab]: Deploy Suggested Edits endpoints (T209997, T224233) (duration: 02m 55s)