Add a linter category for inline images with captions
Open, Medium, Public

Description

Captions on inline images aren't visible in the output, except as a fallback for the "alt" attribute. We should stop doing that and get content to move them to explicit |alt= options.

Longer discussion,


What to do for alt?

  • Core has a fallback to use the caption as the alt text when no alt is provided, and a further fallback to the filename if neither is present
  • Parsoid, for its part, only adds an alt if it's explicitly requested with an |alt= option
  • We can create a linting category for inline images that have a caption but no alt, and then change core to behave more like Parsoid
  • Longer term, the description stored with images seems like the better place for alt fallbacks, but that has the editability/truth complications (see https://phabricator.wikimedia.org/T63566)
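
The difference between the two parsers, and the condition the proposed lint would check, can be sketched roughly as follows. This is an illustrative Python model only, not MediaWiki or Parsoid code: the `ImageOptions` class, its fields, and the function names are all hypothetical, and deciding whether an image counts as inline is assumed to have happened elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageOptions:
    """Hypothetical, simplified view of a parsed [[File:...]] link."""
    filename: str
    caption: str = ""           # the unnamed parameter
    alt: Optional[str] = None   # value of an explicit |alt= option, if any
    inline: bool = True         # assumed to be decided elsewhere


def core_inline_alt(img: ImageOptions) -> str:
    """Core's fallback chain for inline images: alt, then caption, then filename."""
    if img.alt is not None:
        return img.alt
    if img.caption != "":
        return img.caption
    return img.filename


def parsoid_alt(img: ImageOptions) -> Optional[str]:
    """Parsoid only emits an alt when |alt= was given explicitly."""
    return img.alt


def would_lint(img: ImageOptions) -> bool:
    """Proposed lint: inline image with a caption but no explicit alt."""
    return img.inline and img.caption != "" and img.alt is None
```

Under this model, an inline image with the caption "A red fox" and no |alt= gets that caption as its alt in core, gets no alt in Parsoid, and is the case the lint would flag.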

Event Timeline

Change 745635 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] Lint captions on inline media

https://gerrit.wikimedia.org/r/745635

Change 745964 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/extensions/Linter@master] Add linter category for "inline-media-caption"

https://gerrit.wikimedia.org/r/745964

Change 745635 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Lint captions on inline media without an explicit alt options

https://gerrit.wikimedia.org/r/745635

Change 745964 merged by jenkins-bot:

[mediawiki/extensions/Linter@master] Add linter category for "inline-media-caption"

https://gerrit.wikimedia.org/r/745964

A help page exists at https://www.mediawiki.org/wiki/Help:Lint_errors/inline-media-caption

But I suppose we need to communicate this more broadly, in TechNews and whatnot

Change 752716 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.15.0-a14

https://gerrit.wikimedia.org/r/752716

Change 752716 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.15.0-a14

https://gerrit.wikimedia.org/r/752716

This change appears to have been rolled out without notification to the community, and the help page is woefully inadequate. I request that it be reverted until there can be a discussion about why this change is needed, what priority it should have, and how instances of this condition (I hesitate to call it an error at this time) should be addressed. Certainly the advice to put the caption into alt= is not good advice in many cases (adding thumb or some other parameter that displays the caption is better in many cases).

On a more meta note, how does this sort of change become such a high priority that it gets implemented when bugs like T216003, which cause actual problems, have gone unaddressed for three years?

I agree with the preceding remarks by Jonesey95.

Yesterday the Linter was flagging pages but immediately removed them from the list after a null edit.
Today the Linter is flagging pages like extrovert, which don't have any inline images.

If this really is a high-priority issue, it deserves better preparation.

This morning I found this new lint error category: https://it.wikivoyage.org/wiki/Speciale:LintErrors/inline-media-caption
but honestly I don't understand the issue in most cases. If I show a File: without thumb/right/left, the image, its link and its caption are rendered inside a SPAN, so it's fine to show them inline.
But if I use thumb/right/left, the same image is rendered inside a DIV, so I can understand where the issue is and I know how to solve it.
Any comment?

The use of caption in this way is specifically documented at [[mw:Help:Images]]: "Caption text shows below the image in thumb and frame formats, or as tooltip text in any other format." Please revert this change pending discussion.

This error-detection code appears to be insufficiently thought out. This syntax should probably cause an error but does not:

https://en.wikipedia.org/w/index.php?title=User:Jonesey95/sandbox&oldid=1065637288

Also, this error typically indicates that alt text is missing, which is a low-priority error, not a high-priority error.

Can this change please be reverted while more thorough testing and discussion and documentation happens?

Sorry about the disruption. We expected only a small number of lints to be triggered, but given what we are seeing, we will revisit this lint and yes, probably revert it as well for now.

Thanks. I'll be happy to participate in testing and discussion to determine if and how this change should be rolled out in some way. It probably has some utility, but it is overkill in its current form.

We should stop doing that and get content to move them to explicit |alt= options.

Why? It seems to be a perfectly fine way to not duplicate the markup when wanting to write title="X" alt="X" in the HTML. I see that you are prepared to reconsider this already, and also want to say that I don’t see much reason for the switch.

As an aside, this linter category would have been a lot better and more useful if it had instead been ‘no alternative text present on the image’ (without an explicit |alt= set, of course). We as a movement are not doing enough to provide accessibility help to people, and a LintErrors count showing how bad the problem is would be a real help in combating its prevalence. (I’d also say that it’s not a low-priority error, because typically it means that the file name is read out instead, which is usually really unhelpful, especially to non-English readers.)

Thanks for all the feedback! We'll discuss this in our team meeting next week and figure out next steps forward here. Till such time, please ignore this lint on wikis.

Change 754109 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/extensions/Linter@master] Disable "inline-media-caption" category

https://gerrit.wikimedia.org/r/754109

Change 754110 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] Stop emitting "inline-media-caption" lints

https://gerrit.wikimedia.org/r/754110

Change 754110 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Stop emitting "inline-media-caption" lints

https://gerrit.wikimedia.org/r/754110

Change 754109 merged by jenkins-bot:

[mediawiki/extensions/Linter@master] Disable "inline-media-caption" category

https://gerrit.wikimedia.org/r/754109

Change 754558 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/extensions/Linter@master] Drop 'inline-media-caption' lint requests

https://gerrit.wikimedia.org/r/754558

I still see thousands of lint errors on https://it.wikivoyage.org/wiki/Speciale:LintErrors/inline-media-caption; is that normal?

The fixes that revert this haven't yet rolled out into production. They will ride this week's train and these will then gradually clear out.

Change 754564 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.15.0-a15

https://gerrit.wikimedia.org/r/754564

Change 754144 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/extensions/Linter@wmf/1.38.0-wmf.17] Drop 'inline-media-caption' lint requests

https://gerrit.wikimedia.org/r/754144

Change 754145 had a related patch set uploaded (by Subramanya Sastry; author: Arlolra):

[mediawiki/extensions/Linter@wmf/1.38.0-wmf.17] Disable "inline-media-caption" category

https://gerrit.wikimedia.org/r/754145

Change 754558 merged by jenkins-bot:

[mediawiki/extensions/Linter@master] Drop 'inline-media-caption' lint requests

https://gerrit.wikimedia.org/r/754558

Change 754564 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.15.0-a15

https://gerrit.wikimedia.org/r/754564

Change 754144 merged by jenkins-bot:

[mediawiki/extensions/Linter@wmf/1.38.0-wmf.17] Drop 'inline-media-caption' lint requests

https://gerrit.wikimedia.org/r/754144

Mentioned in SAL (#wikimedia-operations) [2022-01-18T08:12:50Z] <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.17/extensions/Linter/includes/RecordLintJob.php: Backport: [[gerrit:754144|Drop 'inline-media-caption' lint requests (T297443 T299302)]] (duration: 00m 52s)

Change 754145 merged by jenkins-bot:

[mediawiki/extensions/Linter@wmf/1.38.0-wmf.17] Disable "inline-media-caption" category

https://gerrit.wikimedia.org/r/754145

Mentioned in SAL (#wikimedia-operations) [2022-01-18T09:31:16Z] <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.17/extensions/Linter/extension.json: Backport: [[gerrit:754145|Disable "inline-media-caption" category (T297443)]] (duration: 00m 51s)

Thanks, now it has disappeared.

If you need an example of a reasonable use of ‘inline media captions’: https://ru.wikipedia.org/wiki/Шаблон:Злв-статус
It is an inline media file that has a caption that serves both as the accessible description (alt="…") and hoverable description (title="…") for users. While this can be hacked around in another way (say, wrapping it as a span) if it is decided that inline media captions are wrong for some reason, I personally struggle to see anything wrong with this markup. I am not sure why Parsoid decided 8 years ago that this is wrong, as it is certainly a normal expectation for an inline file.

Dutch Wiktionary uses inline media captions on tens of thousands of pages, in a way similar to that described by @stjn above. The captions were intentionally added to improve usability. Starting a process where the easy way to get rid of an error message is simply to remove this information seems counterproductive to me.

The reasons above, and more, are why this detection has been rolled back in the MediaWiki code (and the rollback will be deployed soon). On the English Wikipedia, many millions of pages were affected. I do not think that there is a need to comment further on this Phabricator ticket. If the developers decide that this situation is a problem, they will start a discussion and a link to that discussion will be posted here.

This error-detection code appears to be insufficiently thought out. This syntax should probably cause an error but does not:

https://en.wikipedia.org/w/index.php?title=User:Jonesey95/sandbox&oldid=1065637288

The example there has a |center option, which gets rendered by the legacy parser in a <div class="center">, making it not an inline image. Parsoid renders this as,

<figure class="mw-default-size mw-halign-center" typeof="mw:Image" id="mwAg"><a href="./File:Commons-logo.svg" class="mw-file-description" id="mwAw"><img resource="./File:Commons-logo.svg" src="//upload.wikimedia.org/wikipedia/en/thumb/4/4a/Commons-logo.svg/1024px-Commons-logo.svg.png" decoding="async" data-file-width="1024" data-file-height="1376" data-file-type="drawing" height="1376" width="1024" srcset="//upload.wikimedia.org/wikipedia/en/thumb/4/4a/Commons-logo.svg/1536px-Commons-logo.svg.png 1.5x, //upload.wikimedia.org/wikipedia/en/thumb/4/4a/Commons-logo.svg/2048px-Commons-logo.svg.png 2x" id="mwBA"/></a><figcaption id="mwBQ">Search Wikimedia Commons</figcaption></figure>

The linter wouldn't have flagged this case since there's a representation of the caption in the DOM.
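
As a rough illustration of that distinction, the sketch below checks whether a snippet of rendered output contains a non-empty <figcaption>. It is written in Python with the standard-library `html.parser` module and is not how the linter is implemented (the actual lint operated on Parsoid's DOM); the `CaptionFinder` class and `has_visible_caption` function are hypothetical names.

```python
from html.parser import HTMLParser


class CaptionFinder(HTMLParser):
    """Collects the text found inside any <figcaption> element."""

    def __init__(self) -> None:
        super().__init__()
        self._depth = 0          # > 0 while inside a <figcaption>
        self.caption_text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "figcaption":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "figcaption" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.caption_text += data


def has_visible_caption(html: str) -> bool:
    """True if the markup carries a non-empty <figcaption>."""
    finder = CaptionFinder()
    finder.feed(html)
    finder.close()
    return finder.caption_text.strip() != ""
```

Applied to an abbreviated form of the markup above, `has_visible_caption('<figure typeof="mw:Image"><a href="./File:Commons-logo.svg"><img src="x.png"/></a><figcaption>Search Wikimedia Commons</figcaption></figure>')` is true, so under this model there would be nothing to flag. An image wrapper with no <figcaption> at all yields false.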

On a more meta note, how does this sort of change become such a high priority that it gets implemented when bugs like T216003, which cause actual problems, have gone unaddressed for three years?

Hopefully I've addressed everything in T216003 now.

My understanding (and I'm totally prepared to be wrong about this) is that the priority classes in the linter aren't so much about how egregious the errors caused by the syntax are, although maybe that's one competing interest. The Linter extension was introduced when we were doing the Tidy migration, with a goal of trying to evolve the language. That's why a lot of the higher-priority categories are about HTML4 vs HTML5 semantics. We wanted to minimize the damage of switching from Tidy to Remex, an HTML5 parser. It's possible that work on some of the current categories can be deprioritized since Remex has been in production for quite some time and maybe the current rendering is acceptable?

In a similar vein, we're trying to change the structure around how media is rendered as a step towards making Parsoid the default wikitext parser.

The use of caption in this way is specifically documented at [[mw:Help:Images]]: "Caption text shows below the image in thumb and frame formats, or as tooltip text in any other format." Please revert this change pending discussion.

The legacy parser says,

		# In the old days, [[Image:Foo|text...]] would set alt text.  Later it
		# came to also set the caption, ordinary text after the image -- which
		# makes no sense, because that just repeats the text multiple times in
		# screen readers.  It *also* came to set the title attribute.
		# Now that we have an alt attribute, we should not set the alt text to
		# equal the caption: that's worse than useless, it just repeats the
		# text.  This is the framed/thumbnail case.  If there's no caption, we
		# use the unnamed parameter for alt text as well, just for the time be-
		# ing, if the unnamed param is set and the alt param is not.
		# For the future, we need to figure out if we want to tweak this more,
		# e.g., introducing a title= parameter for the title; ignoring the un-
		# named parameter entirely for images without a caption; adding an ex-
		# plicit caption= parameter and preserving the old magic unnamed para-
		# meter for BC; ...
		if ( $imageIsFramed ) { # Framed image
			// @phan-suppress-next-line PhanImpossibleCondition
			if ( $caption === '' && !isset( $params['frame']['alt'] ) ) {
				# No caption or alt text, add the filename as the alt text so
				# that screen readers at least get some description of the image
				$params['frame']['alt'] = $link->getText();
			}
			# Do not set $params['frame']['title'] because tooltips don't make sense
			# for framed images
		} else { # Inline image
			// @phan-suppress-next-line PhanImpossibleCondition
			if ( !isset( $params['frame']['alt'] ) ) {
				# No alt text, use the "caption" for the alt text
				if ( $caption !== '' ) {
					$params['frame']['alt'] = $this->stripAltText( $caption, $holders );
				} else {
					# No caption, fall back to using the filename for the
					# alt text
					$params['frame']['alt'] = $link->getText();
				}
			}
			# Use the "caption" for the tooltip text
			$params['frame']['title'] = $this->stripAltText( $caption, $holders );
		}

https://github.com/wikimedia/mediawiki/blob/master/includes/parser/Parser.php#L5486-L5523
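
To make the branches in that excerpt easier to follow, here is a simplified Python model of the same decision table. It is only a sketch: the function and parameter names are hypothetical, the caption is treated as plain text (the `stripAltText` step is left out), and `link_text` stands in for `$link->getText()`.

```python
from typing import Dict, Optional


def resolve_alt_and_title(
    is_framed: bool,
    caption: str,
    explicit_alt: Optional[str],
    link_text: str,
) -> Dict[str, str]:
    """Model of how the quoted excerpt fills in 'alt' and 'title'."""
    attrs: Dict[str, str] = {}
    if explicit_alt is not None:
        attrs["alt"] = explicit_alt

    if is_framed:
        # Framed/thumbnail case: the caption is shown as text, so it is
        # not repeated in alt, and no tooltip (title) is set.
        if caption == "" and "alt" not in attrs:
            attrs["alt"] = link_text
    else:
        # Inline case: caption is the alt fallback, then the link text.
        if "alt" not in attrs:
            attrs["alt"] = caption if caption != "" else link_text
        # The caption is always used for the tooltip text.
        attrs["title"] = caption
    return attrs
```

For example, an inline image with the caption "Search Wikimedia Commons" and no |alt= ends up with that text as both alt and title, while the same caption on a framed image sets neither attribute.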

Change 791063 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] Revert "Lint captions on inline media without an explicit alt options"

https://gerrit.wikimedia.org/r/791063

Change 791063 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Revert "Lint captions on inline media without an explicit alt options"

https://gerrit.wikimedia.org/r/791063

Change 791067 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/extensions/Linter@master] Revert "Add linter category for "inline-media-caption""

https://gerrit.wikimedia.org/r/791067

Change 791067 merged by jenkins-bot:

[mediawiki/extensions/Linter@master] Revert "Add linter category for "inline-media-caption""

https://gerrit.wikimedia.org/r/791067

Change 791474 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Set title for tooltips if caption isn't visible

https://gerrit.wikimedia.org/r/791474

Change 791475 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/core@master] [WIP] Tooltips

https://gerrit.wikimedia.org/r/791475

Change 791696 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/core@master] [WIP] Tooltips in galleries

https://gerrit.wikimedia.org/r/791696

Change 792236 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/vendor@master] Bump parsoid to 0.16.0-a8

https://gerrit.wikimedia.org/r/792236

Change 792236 merged by jenkins-bot:

[mediawiki/vendor@master] Bump parsoid to 0.16.0-a8

https://gerrit.wikimedia.org/r/792236

Change 792286 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/extensions/ImageMap@master] Disable test temporarily

https://gerrit.wikimedia.org/r/792286

Change 792288 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/extensions/ImageMap@master] [WIP] Tooltips

https://gerrit.wikimedia.org/r/792288

Change 792286 merged by jenkins-bot:

[mediawiki/extensions/ImageMap@master] Disable test temporarily

https://gerrit.wikimedia.org/r/792286

Change 791475 merged by jenkins-bot:

[mediawiki/core@master] Clarify tooltips are set if captions aren't visible

https://gerrit.wikimedia.org/r/791475

Change 792288 merged by jenkins-bot:

[mediawiki/extensions/ImageMap@master] Place tooltips if caption isn't visible

https://gerrit.wikimedia.org/r/792288

Change 791696 merged by jenkins-bot:

[mediawiki/core@master] Set tooltips in galleries, despite caption being visible

https://gerrit.wikimedia.org/r/791696

Change 791474 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Set title for tooltips if caption isn't visible

https://gerrit.wikimedia.org/r/791474