Identification of eligible image for Alt Text Flow C: Article Editor
Closed, Resolved · Public

Assigned To

None

Authored By

	HNordeenWMF
	Jun 18 2024, 6:23 PM

Description

Background

For experiment group C (Article flow), we want to prompt users to add missing alt text after they have published an edit on any article containing an image in need of alt text.

In order to know when to show the prompt, we need to

Check whether there is an eligible image in the article, and surface the first eligible image for the alt text prompt (this is what this task focuses on).
Check whether an edit was eligible (any edit made by a logged-in user on an article in the mainspace).
If there is an eligible image in that article, and the edit was eligible, show the Alt text prompt.
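
The three checks above combine into a single gate. A minimal Python sketch, purely illustrative and not taken from the app: the names `Edit`, `edit_is_eligible` and `should_show_alt_text_prompt` are hypothetical, and group assignment is treated as an input decided elsewhere.

```python
from dataclasses import dataclass
from typing import Optional

MAIN_NAMESPACE = 0  # MediaWiki's article namespace ("mainspace")


@dataclass
class Edit:
    namespace: int         # namespace ID of the edited page
    user_logged_in: bool   # True if the editor was logged in


def edit_is_eligible(edit: Edit) -> bool:
    # Any edit by a logged-in user on a mainspace article counts.
    return edit.user_logged_in and edit.namespace == MAIN_NAMESPACE


def should_show_alt_text_prompt(edit: Edit,
                                first_eligible_image: Optional[str],
                                in_group_c: bool) -> bool:
    # The prompt needs an eligible edit, an eligible image, and group C.
    return (edit_is_eligible(edit)
            and first_eligible_image is not None
            and in_group_c)
```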

Sampling into control or group C should happen immediately after a user publishes an eligible edit from the main editor. If they are sorted into group C, they should see the alt text prompt.

Requirements

Location & wikitext of one eligible image in need of alt text is identified
Latency under 2 seconds for eligible image check and identification, to minimize the lag between when a user publishes the first edit, and when we show the prompt
Localized parameters should be supported
We should not suggest images to users for adding alt text where any of the following are true:
- The image is not linked to its Commons page, and has “|link=|alt=” or the local equivalent in magic words
- The image is below 100x100 pixels as defined in the wikitext, indicating it is an icon.
- The image has any ARIA accessibility attributes such as aria-hidden=true or role=presentation
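
A rough sketch of the first two exclusion rules applied to a single file link, in Python for brevity. Everything here is an assumption for illustration: the function names are hypothetical, any `alt=` parameter (even an empty one) is treated as "alt text already set", and "below 100x100" is read as "every dimension given in the wikitext is under 100". Localized keywords are passed in via `alt_keys`. The ARIA rule is not covered because those attributes live in surrounding HTML rather than in the file link, and captions containing nested links or a localized `px` suffix are not handled.

```python
import re

SIZE_RE = re.compile(r"^(\d+)?(?:x(\d+))?px$")  # 100px, x100px, 100x100px
ICON_LIMIT = 100


def split_params(file_link):
    # "[[File:X.jpg|a|b]]" -> ["File:X.jpg", "a", "b"] (no nested links)
    inner = file_link.strip()
    if inner.startswith("[[") and inner.endswith("]]"):
        inner = inner[2:-2]
    return [part.strip() for part in inner.split("|")]


def has_param(params, keys):
    # True if any parameter looks like "<key>=..." for one of the keys.
    for p in params:
        key, sep, _ = p.partition("=")
        if sep and key.strip().lower() in keys:
            return True
    return False


def is_icon_sized(params):
    # Only sizes written in the wikitext are considered (no Commons lookup).
    for p in params:
        m = SIZE_RE.match(p.lower())
        if m and (m.group(1) or m.group(2)):
            dims = [int(g) for g in m.groups() if g]
            return all(d < ICON_LIMIT for d in dims)
    return False


def needs_alt_text(file_link, alt_keys=("alt",)):
    params = split_params(file_link)[1:]  # drop the "File:Name" part
    if has_param(params, alt_keys):       # covers "|link=|alt=" too
        return False
    if is_icon_sized(params):
        return False
    return True
```

For a localized wiki the caller would pass the local keywords as well, e.g. `needs_alt_text(link, alt_keys=("alt", "alternativtext"))`.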

Open question:

Should we exclude images in templates (infoboxes, galleries, timelines, or math formulas) from our suggestions, since they require specific formatting?

References

T344378: Spike: How to obtain articles that have images with missing alt text answered these questions:

Question 1: If we have the wikitext of an article, how do we tell if it has images with missing alt text?
Question 2: How do we insert alt text into an existing File link?
Question 3: How do we get a list and/or queue of articles that have images with missing alt text?

Related Objects

		Status	Subtype	Assigned	Task
		Open		None	T357437 [Epic] Alt-Text Suggested Edit Experiment on iOS
		Resolved		None	T367908 Identification of eligible image for Alt Text Flow C: Article Editor

Event Timeline

HNordeenWMF created this task. · Jun 18 2024, 6:23 PM

Restricted Application added a subscriber: Aklapper. · Jun 18 2024, 6:24 PM

HNordeenWMF updated the task description. · Jun 18 2024, 6:48 PM

HNordeenWMF renamed this task from Identification of eligible articles for Alt Text article flow C to Identification of eligible image for Alt Text Flow C: Article Editor. · Jun 18 2024, 6:54 PM

HNordeenWMF updated the task description.

HNordeenWMF added a parent task: T357437: [Epic] Alt-Text Suggested Edit Experiment on iOS.

Taking this for feasibility spike. :)

HNordeenWMF edited projects, added Wikipedia-iOS-App-Backlog (iOS Release FY2023-24); removed Wikipedia-iOS-App-Backlog. · Jun 25 2024, 5:43 PM

HNordeenWMF moved this task from Tasks from Product Backlog to Doing on the Wikipedia-iOS-App-Backlog (iOS Release FY2023-24) board.

(notes for alternate method using mostly client-side logic)

size has to be checked via lookup; this can be done easily after the listing, though

parse for local links in original and post-edit versions
find any that looks like file links, with or without alt
- discard any already present
- consider empty equivalent to set (present)
compare against the previous set to look only for additions
if we see obvious aria fixes in wrapper, strip it
if we find new images with no alt text:
- check their sizes
- discard small ones
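
(a possible shape for the "find anything that looks like a file link" step, sketched in Python rather than JS or Swift. The function name and the default namespace list are assumptions for illustration; the bracket matching exists so that captions containing ordinary links stay inside the file link. External links in captions, written with single brackets, can still confuse it.)

```python
def find_file_links(wikitext, namespaces=("file", "image")):
    """Return outermost [[...]] links whose prefix is a file namespace."""
    links = []
    stack = []  # start indexes of "[[" that are still open
    i = 0
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "[[":
            stack.append(i)
            i += 2
        elif pair == "]]" and stack:
            start = stack.pop()
            if not stack:  # the outermost link just closed
                link = wikitext[start:i + 2]
                # "[[:File:X]]" has an empty prefix, so plain links to
                # the file page are skipped; only embeds are kept.
                prefix = link[2:].split(":", 1)[0].strip().lower()
                if prefix in namespaces:
                    links.append(link)
            i += 2
        else:
            i += 1
    return links
```

On a localized wiki the `namespaces` argument would carry the local names too, for example `("file", "image", "datei")` on German Wikipedia.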

need to be able to fetch localization lists from mediawiki source

(namespaces, alt/link markers)
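
(one way to get those lists is MediaWiki's siteinfo API, which exposes magic words and namespaces per wiki. The sketch below is Python and only illustrative: the helper names are hypothetical, `SAMPLE` is a hand-written, heavily abbreviated response rather than real output, and this is not necessarily how the app's own tooling obtains the data.)

```python
from urllib.parse import urlencode


def siteinfo_url(lang):
    # Build a siteinfo query for one language edition of Wikipedia.
    query = urlencode({
        "action": "query",
        "meta": "siteinfo",
        "siprop": "magicwords|namespaces|namespacealiases",
        "format": "json",
        "formatversion": "2",
    })
    return f"https://{lang}.wikipedia.org/w/api.php?{query}"


def image_keywords(siteinfo, magic_word):
    # Turn aliases like "alt=$1" into bare keys like "alt".
    for word in siteinfo["query"]["magicwords"]:
        if word["name"] == magic_word:
            return [alias.split("=")[0].lower() for alias in word["aliases"]]
    return []


# Hand-written sample in the shape of a siteinfo response (abbreviated).
SAMPLE = {
    "query": {
        "magicwords": [
            {"name": "img_alt", "aliases": ["alternativtext=$1", "alt=$1"]},
            {"name": "img_link", "aliases": ["verweis=$1", "link=$1"]},
        ]
    }
}

alt_keys = image_keywords(SAMPLE, "img_alt")
link_keys = image_keywords(SAMPLE, "img_link")
```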

Work for Brooke (?):

helper script in js to fetch the keywords out of mediawiki
prototype logic in js to run some tests
port to swift to stick in the app

Open questions:

do we deal with _only new additions in the edit_ or _all alt-less template-less images in the text as of this revision_?
- the former is implementable by doing the extraction of links on both old & new and diffing them
do we need to exclude full image markup passed as wikitext into a template as a parameter?
- to implement: exclude all the {{....}} stuff
do we need to check very small icon images when sizes aren't passed in the params?
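
(sketch of the two "to implement" ideas above, in Python and under assumptions: `strip_templates` drops everything inside `{{...}}` including nested templates, and `added_links` compares the file links extracted from the old and new revisions. Both names are hypothetical, and triple-brace template parameters are not handled.)

```python
from collections import Counter


def strip_templates(wikitext):
    """Remove {{...}} spans, including nested ones."""
    out = []
    depth = 0
    i = 0
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth:
            depth -= 1
            i += 2
        else:
            if depth == 0:  # keep only text outside any template
                out.append(wikitext[i])
            i += 1
    return "".join(out)


def added_links(old_links, new_links):
    """Links present in the new revision but not in the old one."""
    remaining = Counter(old_links)  # multiset, so duplicates are respected
    added = []
    for link in new_links:
        if remaining[link] > 0:
            remaining[link] -= 1
        else:
            added.append(link)
    return added
```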

Notes:

magic words fetcher is already present! should be able to use that

@bvibber Notes on how to use the magic words stuff:

These methods give you some examples of how I'm using our MagicWordUtils struct. Note I also bundled the file namespace logic into this as well for simplicity at the time, so I use the term "MagicWordUtils" loosely here. Sorry if that adds confusion.
MagicWordUtils references these generated json files (DE example). Those json files can be updated by running a command line tool. This can be done by selecting the UpdateLanguages scheme in Xcode and running. Note it will likely change additional non-magicwords .json files - I recommend reverting those changes and only committing the changes to the magic words json files.

If you need to change the languages script and capture more magic words, you can do so by updating the command line utility script.

Let us know if you have any issues or questions!

Hi @HNordeenWMF! We had some questions about the task description:

Check whether an edit was eligible (any non-image-related edit made by a logged-in user on an article in the mainspace).

By this do you mean that an edit is potentially eligible if it was made using our article editor, as opposed to going through the image recommendations flow? Or is it that we need to inspect the article editor edit, and confirm they didn't make any changes related to images throughout the wikitext, and only then is it eligible?

Location & wikitext of one eligible image in need of alt text is identified

For determining an eligible image, are we expected to only consider new images added in that last article edit? Or do we consider any article image in the wikitext missing alt text, even if it wasn't touched in that edit?

The image is below 100x100 pixels, indicating it is an icon.
do we need to check very small icon images when sizes aren't passed in the params?

There may be image wikitext that does not specify pixels; in that case we would need to do an additional fetch to Commons to confirm its size. Just making sure you need us to do this additional check. I'm not sure how common this situation will be so sorry about the lack of guidance!

@bvibber Feel free to followup if I missed anything, thanks!

Thanks @bvibber and @Tsevener for connecting on this. I understand that the current plan is to execute Flow C's check for the image without using the linter, doing it all client-side instead.

By this do you mean that an edit is potentially eligible if it was made using our article editor, as opposed to going through the image recommendations flow? Or is it that we need to inspect the article editor edit, and confirm they didn't make any changes related to images throughout the wikitext, and only then is it eligible?

The second: it should be an article editor edit on an article in the main namespace, by a logged-in editor. The only edits we really want to avoid are edits where someone has just added alt text to an image in the article (a very small possibility, I'm guessing). I thought that, to avoid this situation, it would be easiest to exclude all edits to images, but what do you think?

For determining an eligible image, are we expected to only consider new images added in that last article edit? Or do we consider any article image in the wikitext missing alt text, even if it wasn't touched in that edit?

Any article image in the wikitext missing alt text, even if it wasn't touched in that edit. We actually expect this to be the most frequent situation: someone will have made a non-image related edit, and we suggest another way to improve that article that they demonstrated interest in.

There may be image wikitext that does not specify pixels; in that case we would need to do an additional fetch to Commons to confirm its size. Just making sure you need us to do this additional check. I'm not sure how common this situation will be so sorry about the lack of guidance!

This was an additional check suggested by Shay that we saw in other alt text research. It would help us avoid suggesting that decorative icons are in need of alt text. If the extra check is a medium-to-large lift, we can perform the size check only when the pixel sizes are included in the wikitext and forgo the additional Commons check for the experimental version, acknowledging that if this were ever built out permanently we would add the extra Commons check.

I thought that, to avoid this situation, it would be easiest to exclude all edits to images, but what do you think?

@HNordeenWMF Unfortunately I think it will be difficult to determine that they are making an edit to image wikitext, and just as hard to know if they are adjusting alt text within an image wikitext. @bvibber Feel free to counteract if I'm being too pessimistic.

Since we think this is a small possibility, can we scrap this requirement? If they happen to fill in all the alt texts on the page, they should still not see the popup, because the followup logic to fetch any images without alt text should come up empty. If they fill in only one and others remain, then they will see the popup.

OK agreed, if the logic to fetch the image is happening after they submit their edit, it should not be an issue. I'll update the ticket to remove that requirement.

HNordeenWMF updated the task description. · Jun 26 2024, 11:57 PM

HNordeenWMF updated the task description.

Seddon moved this task from iOS Release FY2023-24 to iOS Release FY2024-25 on the Wikipedia-iOS-App-Backlog board. · Jul 16 2024, 1:22 PM

Seddon edited projects, added Wikipedia-iOS-App-Backlog (iOS Release FY2024-25); removed Wikipedia-iOS-App-Backlog (iOS Release FY2023-24).

Seddon moved this task from Tasks from Product Backlog to Doing on the Wikipedia-iOS-App-Backlog (iOS Release FY2024-25) board. · Jul 16 2024, 1:25 PM

HNordeenWMF mentioned this in T370305: [M] Trigger Alt-Text task from Edit (flow C). · Jul 17 2024, 4:00 PM

HNordeenWMF triaged this task as Medium priority. · Jul 23 2024, 10:07 PM

Provisional logic https://github.com/wikimedia/wikipedia-ios/pull/4903

Tsevener moved this task from Doing to Needs Code Review on the Wikipedia-iOS-App-Backlog (iOS Release FY2024-25) board. · Aug 6 2024, 5:10 PM

Should we exclude images in templates (infoboxes, galleries, timelines, or math formulas) from our suggestions, since they require specific formatting?

@HNordeenWMF Just a heads up, I've been playing with this library and wanted to answer this question. I think as long as the format is (loosely) [[{FileNamespace}:{filename}|{additional parameters}]], it will be detected by this logic even if it is within a template. For example, the map picture of Carrollton, TX on German Wikipedia was detected, even though it's in an infobox:

Screenshot 2024-08-14 at 5.22.54 PM (214×316 px, 29 KB)

If you dig into the source it looks like this:

{{Infobox Ort in den Vereinigten Staaten
| Name = Carrollton
| Stadtspitzname = 
| Bundesstaat = Texas
| County = Dallas County
| County2 = Denton County
| County3 = Collin County
| Bild1 = Carrollton July 2019 11 (Carrollton Square gazebo).jpg
| Bildgröße1 = 
| Bildbeschreibung1 = Carrollton Square
| Siegel = 
| Flagge = Flag of Carrollton, Texas.svg
| Karte = [[Datei:Dallas County Texas Incorporated Areas Carrollton highighted.svg|250px]]
...

So that [[Datei:Dallas County Texas Incorporated Areas Carrollton highighted.svg|250px]] was detected, but I think Carrollton July 2019 11 (Carrollton Square gazebo).jpg will not be detected.

I think this will be fine for now, and our logic should be able to properly add alt text to something like [[Datei:Dallas County Texas Incorporated Areas Carrollton highighted.svg|250px]], even if it's in an infobox. Just note that detection may feel a little inconsistent within templates. We will detect them as best we can, but may miss some if they are lacking the surrounding brackets.
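
To make the infobox behaviour concrete, here is a small Python sketch run against a few lines of the source quoted above. It is an illustration only, not the library's actual logic: the regex, the `add_alt` helper, and the choice to append the parameter at the end of the link are all assumptions, and the regex ignores captions with nested links.

```python
import re

# Matches [[Datei:...]] / [[File:...]] links that contain no nested brackets.
FILE_LINK_RE = re.compile(r"\[\[\s*(?:Datei|File)\s*:[^\[\]]*\]\]",
                          re.IGNORECASE)

SNIPPET = """{{Infobox Ort in den Vereinigten Staaten
| Bild1 = Carrollton July 2019 11 (Carrollton Square gazebo).jpg
| Flagge = Flag of Carrollton, Texas.svg
| Karte = [[Datei:Dallas County Texas Incorporated Areas Carrollton highighted.svg|250px]]
"""


def add_alt(file_link, alt_text, alt_key="alt"):
    # Append "|alt=..." just before the closing brackets. Alt text that
    # contains "|" or "]]" would need escaping, which is not handled here.
    return f"{file_link[:-2]}|{alt_key}={alt_text}]]"


matches = FILE_LINK_RE.findall(SNIPPET)
updated = add_alt(matches[0], "Karte")
```

Only the bracketed `Karte` value is matched; the bare filenames in `Bild1` and `Flagge` have no `[[Datei:` wrapper and are skipped, which is the inconsistency described above.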

Note: I was playing with DE Wiki because they have more localized wikitext expectations. I realize DE Wiki will not actually participate in this experiment. :)

Tsevener moved this task from Needs Code Review to Waiting for Build on the Wikipedia-iOS-App-Backlog (iOS Release FY2024-25) board. · Aug 14 2024, 10:32 PM

Ok @Tsevener I think that's fine as long as we're only adding alt text to images within templates that are already formatted like a typical image found in an article.

Tsevener moved this task from Waiting for Build to Needs QA signoff on the Wikipedia-iOS-App-Backlog (iOS Release FY2024-25) board. · Aug 19 2024, 12:52 PM

Tsevener moved this task from Needs QA signoff to Waiting for Build on the Wikipedia-iOS-App-Backlog (iOS Release FY2024-25) board.

Can be tested in TestFlight 7.5.8 (3895).

Looks good to me on 7.5.8 (3979)

ABorbaWMF moved this task from Needs QA signoff to Ready for PM Signoff on the Wikipedia-iOS-App-Backlog (iOS Release FY2024-25) board. · Aug 30 2024, 12:48 AM

HNordeenWMF moved this task from Ready for PM Signoff to Ready for Release on the Wikipedia-iOS-App-Backlog (iOS Release FY2024-25) board. · Aug 30 2024, 4:41 PM

HNordeenWMF closed this task as Resolved. · Sep 5 2024, 11:31 PM

	F57274458: Screenshot 2024-08-14 at 5.22.54 PM
	Aug 14 2024, 10:31 PM

	F55898791: Screenshot 2024-06-26 at 12.45.02 PM
	Jun 26 2024, 5:47 PM
