
Make alt captions of an image found by web browser text search
Closed, Invalid · Public · Feature

Description

Feature summary: Using 'find' in an article should get hits from text in image descriptions. That is, whatever text is copied when you copy a page and paste it into a word-processor should also be searchable directly on WP without requiring that extra step.

Use case(s): On https://en.wikipedia.org/wiki/Kaktovik_numerals we have Unicode characters that don't yet have font support. (I've submitted a font here on Phabricator, but that hasn't gone through yet.) Thus graphics are needed for display. The Unicode characters are present, but only in the coding. E.g.

[[File:Kaktovik digit 3.svg | x32px | 𝋃]]

That displays the character 𝋃 as an SVG file (since the browser isn't capable of displaying it directly), and when editing the article, 'find' is able to locate any instances of 𝋃, but when just reading the article, the browser 'find' function can't locate any instances of it.

This is an example. There are other cases where we use graphics for Unicode ranges that have poor font support.

Benefits: Currently, if you want to search where a character is, you need to copy the article (or a paragraph or table) and paste it somewhere else, or open the page for editing and search the coding. However, readers will expect to be able to search the article without any such workarounds. We're currently having a debate on the article talk page because some editors object to the 'find' function not working with these characters, and say that graphics should not be used. However, if graphics are not used, those sections will be illegible. E.g. the Unicode character table will uselessly say that at code point 1D2C3 we have character 1D2C3, without showing which character 1D2C3 is.
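To make the code point example concrete, here is a minimal Python sketch (an illustration added for clarity, not part of the original report) that turns the code point 1D2C3 into the character itself and shows how it is encoded. The variable names are made up for the example.

```python
# Kaktovik digit 3, the code point named in the task description.
CODE_POINT = 0x1D2C3

# The character itself; without font support it renders as a placeholder box.
char = chr(CODE_POINT)

# The usual "U+XXXX" label used in Unicode character tables.
label = f"U+{CODE_POINT:04X}"

# UTF-8 bytes of the character, as space-separated hex.
utf8_hex = char.encode("utf-8").hex(" ")

# Number of 16-bit units in UTF-16 (2 means a surrogate pair).
utf16_units = len(char.encode("utf-16-le")) // 2

print(label, utf8_hex, utf16_units)
```

The character is a single code point outside the Basic Multilingual Plane, so it takes four bytes in UTF-8 and a surrogate pair in UTF-16. A table that can only print the label tells the reader nothing about the glyph, which is the problem described above.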

Allowing the graphics to be searchable will mean that the behaviour of an article will be the same, whether the browser is able to handle the graphic display or whether we need a manual workaround to display some characters, as here.

Question: it may be that only the title of an image should be searchable, and not the alt text. I'm not familiar enough with the difference to have an opinion on that. But when a graphic is used for a character for accessibility (say when reader font support is unlikely), and the Unicode point is also supplied so that the character can be copy-pasted, that character should show up in a search.

Event Timeline

Restricted Application added a subscriber: Aklapper. Jan 27 2023, 6:49 PM
Aklapper changed the task status from Open to Stalled. Jan 29 2023, 10:48 PM

Hi @Kwamikagami, thanks for taking the time to report this!

the browser 'find' function can't locate any instances of it.

The markup [[File:Kaktovik digit 3.svg | x32px | 𝋃]] is explained on https://www.mediawiki.org/wiki/Help:Images : The last item is an image caption.
Image captions are displayed in a browser and thus searchable (in contrast to e.g. alternative text for an image).
Note that https://en.wikipedia.org/wiki/Kaktovik_numerals does not have that markup, so it's unclear to me what is expected in this ticket after which exact steps.

Unfortunately closing this Phabricator task as no further information has been provided.

@Kwamikagami: After you have provided the information asked for and if this still happens, please set the status of this task back to "Open" via the Add Action... → Change Status dropdown. Thanks!

@Aklapper: Sorry, pinging me here doesn't seem to work. I didn't get a notice that this had been answered.

You said the syntax [[File:Kaktovik digit 3.svg | x32px | 𝋃]] will make the text searchable, but hadn't been used. Actually, both points (appear to be) wrong: that syntax has been used ever since the characters were added to Unicode, but doesn't make the character searchable. (It does allow copy and paste.)

Just to be sure, I copied [[File:Kaktovik digit 3.svg | x32px | 𝋃]] over to WP-en and did a search for 𝋃. No hits. To be clear, what you describe is the behaviour I was hoping for.

Sorry, pinging me here doesn't seem to work. I didn't get a notice that this had been answered.

It works but depends on your settings: https://www.mediawiki.org/wiki/Phabricator/Help#Receiving_updates_and_notifications

and did a search for 𝋃. No hits.

Right, web browsers do not page-search within alt parameters of images, so it's up to the author/editor of the content to make this content visible on the page (or to convince web browser authors to add a feature to index alt parameters of images on a page and include them in a text search in the web browser).
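The distinction can be sketched in Python with the standard library's HTML parser. This is a simplified model of a page search, not how any browser is actually implemented, and the HTML snippet is a hypothetical stand-in rather than MediaWiki's real output. It shows that the character lives in an attribute of the image element and never appears in the text nodes that a text search walks over.

```python
from html.parser import HTMLParser


class TextAndAltCollector(HTMLParser):
    """Collects text nodes and img alt attributes in separate lists."""

    def __init__(self):
        super().__init__()
        self.visible_text = []  # text nodes: what a page search can match
        self.alt_texts = []     # alt attributes: stored outside the text nodes

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "alt" and value:
                    self.alt_texts.append(value)

    def handle_data(self, data):
        self.visible_text.append(data)


# Hypothetical rendered HTML (an assumption for illustration only).
# "\U0001D2C3" is Kaktovik digit 3, the character from this task.
HTML_SNIPPET = (
    '<p>The digit three is '
    '<img src="Kaktovik_digit_3.svg" alt="\U0001D2C3" height="32">.</p>'
)

collector = TextAndAltCollector()
collector.feed(HTML_SNIPPET)
collector.close()

needle = "\U0001D2C3"
found_in_text = any(needle in chunk for chunk in collector.visible_text)
found_in_alt = any(needle in alt for alt in collector.alt_texts)

print(found_in_text, found_in_alt)
```

In this model the search over text nodes misses the character while the attribute list contains it, which matches the behaviour reported here: the character can be recovered from the markup but is not hit by 'find'.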

Quoting https://www.mediawiki.org/wiki/Help:Images : "Caption text shows below the image in thumb and frame formats, or as tooltip text in any other format." - That means you would need to use [[File:Kaktovik digit 3.svg|thumb|x32px|𝋃]] or [[File:Kaktovik digit 3.svg|frame|x32px|𝋃]] instead.
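The quoted rule can be written down as a small lookup. This is only a restatement of the help text in Python; the function name and return strings are invented for the example, and the note about searchability is an assumption based on the behaviour reported in this task.

```python
from typing import Optional

# Formats that, per the quoted help text, show the caption below the image.
CAPTION_BELOW_FORMATS = {"thumb", "frame"}


def caption_placement(image_format: Optional[str]) -> str:
    """Return where the caption ends up for a given image format.

    Hypothetical helper: None stands for an inline image with no
    format keyword, as in the markup from the task description.
    """
    if image_format in CAPTION_BELOW_FORMATS:
        # Displayed as page text, so a page search can match it.
        return "below image"
    # Assumption: tooltip text is not matched by a browser's page search.
    return "tooltip"


for fmt in ("thumb", "frame", None):
    print(fmt, "->", caption_placement(fmt))
```

An inline image with only a size parameter falls into the tooltip case, which is why the original markup does not produce a search hit.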

Aklapper renamed this task from Allow images to be searchable to Make alt captions of an image found by web browser text search. Mar 11 2023, 10:29 PM

Okay, that's exactly what I'm getting at. These are inline images, so 'thumb' and 'frame' are not appropriate solutions. Rather, they are used for Unicode characters that don't yet have good font support, and so need to be augmented with images. But *replacing* them with images means they no longer show up in a search. Meanwhile, leaving them as searchable text means they are not legible to the reader. I'd like both: they should be both legible and searchable. If I understand you correctly, you're saying this isn't something that can be fixed at Phabricator, and we'd have to get Chrome, Firefox and the rest to implement this at their end?

I have supplied a font and hope it will soon be incorporated on Wikimedia. If/when that happens I'll remove all the in-text images from the Kaktovik articles and leave the characters as plain text. But it would still be nice to know how to accomplish this for other articles.

Yes, this applies to any website, so there is nothing to fix in MediaWiki, which hosts only a tiny number of websites...

Okay, thanks! We'll just put up with our text not being searchable, I guess.