[SPIKE] Investigate inserting image on article infobox or lead image on iOS image recommendations
Closed, Resolved (Public)

Description

This task covers investigating how to determine whether a recommended image should be inserted into the article infobox or as the lead image, and how to insert it into the wikitext correctly.

Event Timeline

Mazevedo renamed this task from [SPIKE] Investigate inserting imagem on article infobox or lead image on iOS image recommendations to [SPIKE] Investigate inserting image on article infobox or lead image on iOS image recommendations. Jan 29 2024, 3:55 PM

We synced with Android to learn more about their implementation. They're checking if the article already has an image, and if the article has an infobox (currently only in en-wiki, as we look for specific keywords on templates, and only on stable templates that have the images on a recognizable label).

Some notes:

  • On the Android side, they reused as much existing functionality as possible: the native editing screen, plus the existing workflow for inserting an image into it.
  • When the user accepts an image, you can bounce straight to the native editor with the image prepopulated at the right spot. On Android the editor is technically there in the background; the user just doesn’t see it. Then show the publish screen.
  • Fair/good approach: You get the wikitext and the name of the image. If all else fails, you can always insert the image at the very top of the wikitext; inserting there doesn’t break anything. If you’re doing this in your editing workflow, you’ll have a preview that shows how it looks.
  • Pretty good: Insert after any templates at the top of the article, right before the actual text. This requires custom parsing logic, maybe regex; Android parses out templates ({{ }}). Once you know where a template ends, you ignore that template bit. Keep nested templates in mind and keep track of where you are. Keep parsing until you hit a character of text that is not part of a template.
  • Best: Insert into the infobox. There is more than one type of infobox; there are thousands of them. Infoboxes have parameters, and one is often called image, but not always: different types of infoboxes use different names for the image parameter and for the image caption field, and those names can also change by language. If that fails, fall back to parsing for templates at the top.
  • It is achieved through a combo of many things: regex and manual parsing.
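To make the "Pretty good" approach concrete, here is a minimal sketch of the depth-counting scan it describes. It is written in Python for brevity and is not a port of the Android code; the function names are hypothetical. It assumes templates are delimited only by {{ and }} and falls back to the very top of the wikitext (the "Fair/good" approach) when the braces never balance.

```python
def index_after_leading_templates(wikitext: str) -> int:
    """Return the index of the first character that is not part of a
    leading template (or whitespace between leading templates)."""
    i, depth, n = 0, 0, len(wikitext)
    while i < n:
        if wikitext.startswith("{{", i):
            depth += 1          # entering a (possibly nested) template
            i += 2
        elif depth > 0 and wikitext.startswith("}}", i):
            depth -= 1          # leaving the innermost open template
            i += 2
        elif depth > 0 or wikitext[i].isspace():
            i += 1              # template content, or whitespace between templates
        else:
            break               # first character of actual article text
    # Unbalanced braces: fall back to inserting at the very top.
    return i if depth == 0 else 0


def insert_image(wikitext: str, image_wikitext: str) -> str:
    """Insert the image wikitext on its own line before the article text."""
    idx = index_after_leading_templates(wikitext)
    return wikitext[:idx] + image_wikitext + "\n" + wikitext[idx:]
```

For example, insert_image("{{A|{{B}}}}\nText", "[[File:Cat.jpg]]") places the image link between the template line and "Text".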

Link to Android code: https://github.com/wikimedia/apps-android-wikipedia/blob/main/app/src/main/java/org/wikipedia/edit/insertmedia/InsertMediaViewModel.kt#L167

On magic words:

Tsevener raised the priority of this task from Low to Medium. Feb 12 2024, 5:10 PM

I'm not sure if this is covered by this spike, but I went ahead and looked into how to generate the image wikitext for insertion into the article wikitext:

To construct something like this from our flow:
[[Datei:Hauskatze_in_Abendsonne.jpg|mini|rechts|320x224px|alternativtext=User Alt Text|User Caption Text]]

We follow the format:
[[File:Name|Type|Location|Size|alt=Alt|Caption]]

(Taken from https://en.wikipedia.org/wiki/Wikipedia:Extended_image_syntax, and limited to just our wizard parameters)
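As a rough illustration of that format, the pieces can be joined with "|" once each one has been resolved to its final string. This is a sketch in Python, not the iOS wizard code; the function name and parameter names are hypothetical. Empty pieces are dropped so that an omitted option leaves no stray separator.

```python
def build_image_wikitext(namespace: str, file_name: str, image_type: str,
                         location: str, size: str, alt: str, caption: str) -> str:
    """Join the non-empty parts of an image link in the order
    File:Name | Type | Location | Size | alt=Alt | Caption."""
    parts = [f"{namespace}:{file_name}", image_type, location, size, alt, caption]
    return "[[" + "|".join(part for part in parts if part) + "]]"


# Reproduces the German example above from already-localized pieces.
print(build_image_wikitext(
    "Datei", "Hauskatze_in_Abendsonne.jpg", "mini", "rechts",
    "320x224px", "alternativtext=User Alt Text", "User Caption Text"))
```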

Our image wizard already creates image wikitext like this, but it always uses the English namespace and parameter names, regardless of which wiki we're editing. If we wanted to go the extra mile and localize the File, Type, Location, and alt pieces (which matches Android), here is where each one comes from:

File:

https://de.wikipedia.org/w/api.php?action=query&format=json&prop=&list=&meta=siteinfo&siprop=namespaces%7Cgeneral%7Cnamespacealiases&formatversion=2&origin=*

Call the above API, and get the name at query.namespaces["6"].name.
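A minimal sketch of that lookup, in Python, run against a trimmed sample rather than a live request. The sample JSON is illustrative and contains only the fields used here; the function name and the "File" fallback are assumptions of this sketch.

```python
import json

# Trimmed, illustrative sample of the siteinfo response (not a full payload).
SAMPLE_RESPONSE = '{"query": {"namespaces": {"6": {"id": 6, "name": "Datei"}}}}'


def file_namespace_name(siteinfo: dict, default: str = "File") -> str:
    """Read query.namespaces["6"].name, falling back to the English name
    if the response does not contain it."""
    namespaces = siteinfo.get("query", {}).get("namespaces", {})
    return namespaces.get("6", {}).get("name") or default


print(file_namespace_name(json.loads(SAMPLE_RESPONSE)))
```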

Type

https://de.wikipedia.org/w/api.php?action=query&meta=siteinfo&formatversion=2&siprop=general%7Cmagicwords

Call the above API, and use these values, depending on the user's choice:

thumbnail = query.magic_words.img_thumbnail.aliases.first
frame = query.magic_words.img_framed.aliases.first
frameless = query.magic_words.img_frameless.aliases.first
basic = Insert nothing (Note: Currently iOS inserts "basic" here, which I think is a bug)
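The mapping above could be sketched like this in Python. It assumes the raw response lists magic words under query.magicwords as objects with a name and an aliases array; the sample entries below are illustrative, not copied from a live response, and all function and constant names are hypothetical. When a magic word is missing, the sketch falls back to the English keyword.

```python
# Illustrative sample of magic word entries (assumed shape: name + aliases).
SAMPLE_MAGIC_WORDS = [
    {"name": "img_thumbnail", "aliases": ["mini", "thumb"]},
    {"name": "img_framed", "aliases": ["gerahmt", "frame"]},
    {"name": "img_frameless", "aliases": ["rahmenlos", "frameless"]},
]

# The wizard's type choices and the magic word each one maps to.
TYPE_TO_MAGIC_WORD = {
    "thumbnail": "img_thumbnail",
    "frame": "img_framed",
    "frameless": "img_frameless",
}


def first_aliases(magic_words: list) -> dict:
    """Index magic words by name, keeping only the first alias of each."""
    return {mw["name"]: mw["aliases"][0] for mw in magic_words if mw.get("aliases")}


def localized_type(choice: str, aliases: dict) -> str:
    """Return the localized keyword for the chosen type; 'basic' inserts nothing."""
    if choice == "basic":
        return ""
    return aliases.get(TYPE_TO_MAGIC_WORD[choice], choice)
```

Returning an empty string for "basic" lets the caller drop the Type piece entirely instead of inserting the literal word.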

Location

https://de.wikipedia.org/w/api.php?action=query&meta=siteinfo&formatversion=2&siprop=general%7Cmagicwords

Call the above API, and use these values, depending on the user's choice:

right = query.magic_words.img_right.aliases.first
left = query.magic_words.img_left.aliases.first
center = query.magic_words.img_center.aliases.first
(If user toggled wrap around image off) none = query.magic_words.img_none.aliases.first
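One way to read the location rules, sketched in Python: when text wrapping is toggled off, "none" is used regardless of the chosen side. That interpretation is an assumption, as are the function name and the aliases argument, which stands for a dictionary from magic word name to its first alias built from the API response.

```python
# The wizard's location choices and the magic word each one maps to.
LOCATION_TO_MAGIC_WORD = {
    "right": "img_right",
    "left": "img_left",
    "center": "img_center",
    "none": "img_none",
}


def localized_location(choice: str, wrap_text: bool, aliases: dict) -> str:
    """Return the localized location keyword, or the English keyword if the
    wiki's magic words do not include it."""
    key = choice if wrap_text else "none"   # wrap toggled off overrides the side
    return aliases.get(LOCATION_TO_MAGIC_WORD[key], key)
```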

Size

Looks like no magic words are needed here. Just enter the user's custom values with string interpolation: "\(width)x\(height)px"

alt=Alt

https://de.wikipedia.org/w/api.php?action=query&meta=siteinfo&formatversion=2&siprop=general%7Cmagicwords

For alt template parameter, pull query.magic_words.img_alt.aliases.first. Note the value here looks like alternativtext=$1, so we'll need to replace the $1 with the user's alt text.
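The $1 substitution is a plain string replacement. A small Python sketch, with a hypothetical function name; it returns an empty string when the user entered no alt text so the piece can be omitted.

```python
def localized_alt(alias_pattern: str, alt_text: str) -> str:
    """Fill the user's alt text into a pattern such as 'alternativtext=$1'."""
    if not alt_text:
        return ""   # nothing to insert when the user left alt text empty
    return alias_pattern.replace("$1", alt_text)


print(localized_alt("alternativtext=$1", "User Alt Text"))
```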

Caption

Nothing special here, just add the caption in the correct spot.

I will look at how to insert this wikitext into article text tomorrow.

For inserting image wikitext into an article, I created a prototype here in our new wikitext utils class, with some unit tests across a few languages to confirm it's working.

This prototype aims to accomplish the Pretty good bullet point above:

Pretty good: insert after any templates at the top of the article, right before actual text. Requires custom parsing logic, maybe regex, Android parses out templates {{ }}. When you know it ends, you ignore that template bit. Keep in mind nested templates, keep track of where you’re at. Keep parsing until you hit a character of text that is not part of a template.

This should match Android's non-EN insertion logic. (Though note I didn't port this directly from Android's code - rather wrote my own approach to it based on something similar I made for native editor).

One extra thing I did is also insert the image after any comments (<!-- ... -->), because the very first article I tested with (Cat on EN) has a comment in between templates, which threw things off. The remaining work on this prototype is any cleanup we may want and further testing before it can go to PR.
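The comment handling can be illustrated with a short scan that treats whitespace, comments, and templates as skippable leading markup. This is a Python sketch of the idea only, not the prototype from the wikitext utils class; the function names are hypothetical. It falls back to index 0 (the very top) on unterminated markup and does not handle comments nested inside templates.

```python
def _template_end(wikitext: str, start: int) -> int:
    """Return the index just past the template opening at start, or -1
    if its braces never balance. Nested templates are tracked by depth."""
    depth, i = 0, start
    while i < len(wikitext):
        if wikitext.startswith("{{", i):
            depth += 1
            i += 2
        elif wikitext.startswith("}}", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    return -1


def index_after_leading_markup(wikitext: str) -> int:
    """Skip leading whitespace, <!-- comments -->, and {{templates}}."""
    i, n = 0, len(wikitext)
    while i < n:
        if wikitext[i].isspace():
            i += 1
        elif wikitext.startswith("<!--", i):
            end = wikitext.find("-->", i + 4)
            if end == -1:
                return 0        # unterminated comment: insert at the top
            i = end + 3
        elif wikitext.startswith("{{", i):
            end = _template_end(wikitext, i)
            if end == -1:
                return 0        # unbalanced template: insert at the top
            i = end
        else:
            break               # first character of actual article text
    return i
```

With "{{A}}\n<!-- note -->\n{{B|{{C}}}}\nCats are" as input, the returned index points at "Cats are", so a comment sitting between two templates no longer stops the scan early.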

For the Best approach above, to match Android's EN logic of inserting directly into the infobox, the lines of code to look at would be here:

https://github.com/wikimedia/apps-android-wikipedia/blob/main/app/src/main/java/org/wikipedia/edit/insertmedia/InsertMediaViewModel.kt#L191-L306

I did not get into it for the purposes of moving on from this spike. But we could potentially try to port that code if we want to aim for infobox insertion.