
QINU appears instead of math in search results
Open, Medium, Public

Description

In the past, some articles had the string UNIQ QINU because of various edge cases of Parsoid and Content Translation. I checked whether any of that is left in the Hebrew Wikipedia by searching for "QINU". I couldn't find anything in wiki syntax or rendered text, which is great, but it does appear in search results instead of math formulas. Here's a screenshot:

The first result is the article "Function", which has a section with the following wikitext:

==קבוצת הפונקציות <math>Y^X</math>==

The wikitext makes sense and the rendering is correct, but in the search results I see this instead of anything that looks like "Y^X":

'"`UNIQ--postMath-00000072-QINU`"')

It's not the most important bug in Wikipedia, but it definitely shouldn't happen :)

This may be related to T138453 and T127738, but it's about searching and not rendering.

I'm not sure whether it's related to Math, MediaWiki-Parser, or Discovery-Search, so tagging all.

Thanks!

Event Timeline

Amire80 created this task. Jan 9 2020, 9:32 AM
Restricted Application added a subscriber: Aklapper. Jan 9 2020, 9:32 AM

Looking at the HTML output for one example page we have:

<span style="display:none" class="sortkey">Durener Straße&#160;040 '"`UNIQ--nowiki-00000009-QINU`"' </span>

I'm not sure how advanced the CSS selectors are in our HTML-stripping utility, but perhaps we could add a rule that strips elements whose style attribute is exactly style="display:none". I read a little of the parent task, though, and I'm not sure this is an appropriate fix: it seems from the parent task that there might be other underlying reasons these pages contain the marker, and that those should be fixed?
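To make the proposal concrete, here is a minimal Python sketch of such a rule. It is not the actual stripping utility used by search; the class and function names are hypothetical, and the visible text after the hidden span in the usage example is invented for illustration. It drops any element whose style attribute is exactly "display:none" and keeps all other text.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextStripper(HTMLParser):
    """Collects text, skipping elements styled exactly style="display:none"."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.hidden_depth = 0  # > 0 while inside a hidden element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth:
            # Nested element inside a hidden one: track depth only.
            self.hidden_depth += 1
        elif dict(attrs).get("style") == "display:none":
            self.hidden_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth:
            self.chunks.append(data)


def visible_text(html):
    """Return the text of `html` without exactly-hidden elements."""
    parser = HiddenTextStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


if __name__ == "__main__":
    sample = ('<td><span style="display:none" class="sortkey">'
              'Durener Straße&#160;040 \'"`UNIQ--nowiki-00000009-QINU`"\' '
              '</span>Durener Straße 40</td>')
    print(visible_text(sample))
```

Because the match is an exact string comparison, a variant such as style="display: none" (with a space) would not be stripped, which is one reason a rule like this may be too brittle to rely on.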

TJones triaged this task as Low priority. Aug 27 2020, 8:40 PM

UNIQ/QINU tends to show up when there is either a broken or invalid template call on a page. It also sometimes shows up if an extension tag has a bug in its PHP code where it processes some of the wikitext only partially.
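For anyone who wants to scan rendered text or search snippets for leftover markers, here is a small Python sketch. The pattern is inferred only from the two samples quoted in this task (postMath and nowiki); treating the counter as eight hexadecimal digits is an assumption, and the function name is hypothetical.

```python
import re

# Literal prefix and suffix exactly as they appear in the samples in this task.
PREFIX = "'\"`UNIQ--"
SUFFIX = "-QINU`\"'"

# Assumption: a tag name made of letters, then an eight-digit hex counter.
STRIP_MARKER_RE = re.compile(
    re.escape(PREFIX) + r"([A-Za-z]+)-([0-9A-Fa-f]{8})" + re.escape(SUFFIX)
)


def find_strip_markers(text):
    """Return (tag_name, counter) pairs for each leftover marker in `text`."""
    return [(m.group(1), m.group(2)) for m in STRIP_MARKER_RE.finditer(text)]


if __name__ == "__main__":
    snippet = "Durener Straße 040 '\"`UNIQ--nowiki-00000009-QINU`\"' "
    print(find_strip_markers(snippet))
```

The tag name in each match gives a hint about which extension tag left the marker behind, for example math versus nowiki.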

In this case, the bug seems to be in the page itself, and the corruption can be seen on the page itself as well, not just through search, so it seems fair for search to include it. Text is sometimes intentionally hidden with CSS based on user interaction or other context. Hiding all "display:none" text would imho be a mistake.

There is supposed to be an image in row 35 generated through a template (Vorlage: is German for Template:), but it is broken.

Gehel raised the priority of this task from Low to Medium. Aug 28 2020, 12:27 PM
Gehel moved this task from elastic / cirrus to Bugs on the Discovery-Search board.