Wikisource parse difference for <pages /> tag
Closed, Invalid (Public)

Description

It looks like there's another issue with Wikisource HTML in Parsoid:

Wiki HTML (link; screenshot: Wiki आज भी खरे हैं तालाब पाल के किनारे रखा इतिहास - विकिस्रोत.png)
Parsoid HTML (link; screenshot: Parsoid आज भी खरे हैं तालाब पाल के किनारे रखा इतिहास.png)

The error translates as "error: no such index", which is the proofreadpage_nosuch_index message from ProofreadPage (PRP).

The source of the page is:

<pages
header=1
index="Aaj Bhi Khare Hain Talaab (Hindi).pdf"
from=8
to=12
prev="[[आज भी खरे हैं तालाब|मुखपृष्ठ]]"
next="[[आज भी खरे हैं तालाब/नींव से शिखर तक|नींव से शिखर तक]]"
/>

The index page definitely exists.

The PRP code that produces it is as follows, so perhaps this error is something to do with namespace IDs being handled differently in the two parsers?

		$indexTitle = Title::makeTitleSafe( $this->context->getIndexNamespaceId(), $index );
		if ( $indexTitle === null || !$indexTitle->exists() ) {
			return $this->formatError( 'proofreadpage_nosuch_index' );
		}

Event Timeline

I purged the page (?action=purge) and this time it rendered OK. Are there more cases than this one that merit further investigation?

No, it looks like the others were also caching issues! Sorry for the noise; thanks for looking at this. :)

I am getting the same error that translates as "error: no such index".

What page? If it is the same page, try reloading. If it is a different page, try purging the cache via https://<wikisource>/wiki/<page-title>?action=purge and then try the Parsoid URL again.
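
For anyone who wants to script the purge step, here is a minimal sketch in Python. It only builds the requests and does not send anything. The helper names `purge_url` and `api_purge_request` are made up for illustration, the host and title in the usage note are placeholders, and the `/w/api.php` path is assumed from the usual Wikimedia layout. The second helper targets the Action API's `action=purge` module, which expects a POST.

```python
from urllib.parse import quote, urlencode


def purge_url(host: str, title: str) -> str:
    """Build the ?action=purge URL for a page title (hypothetical helper)."""
    # MediaWiki page URLs use underscores in place of spaces.
    path_title = quote(title.replace(" ", "_"), safe="/:")
    return f"https://{host}/wiki/{path_title}?action=purge"


def api_purge_request(host: str, title: str) -> tuple[str, str]:
    """Return (endpoint, form body) for a POST to the Action API purge module."""
    # Assumption: the wiki serves its API at /w/api.php.
    body = urlencode({"action": "purge", "titles": title, "format": "json"})
    return f"https://{host}/w/api.php", body
```

For example, `purge_url("en.wikisource.org", "Some Page/Chapter 1")` gives `https://en.wikisource.org/wiki/Some_Page/Chapter_1?action=purge`. Opening that URL in a browser may show a confirmation prompt before the purge actually runs.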

Great, thanks for that information. But the end user may not know how to purge the cache. Most users won't even know what a cache is. :) Please make it the default behaviour if possible.

I'm not quite sure what is happening here. It seems that the pages linked above haven't been edited for some months, and have previously been rendered correctly (via WS Export for T285590, although I'm not 100% sure of the timeline), so it seems odd that they would subsequently not be rendered correctly. The index page was edited more recently, so I guess that could be related.

Is ProofreadPage not doing something that would make things work more reliably?

> The index page was edited more recently, so I guess that could be related.

The last two edits look like the page was moved away and then moved back:
https://hi.wikisource.org/w/index.php?title=%E0%A4%B5%E0%A4%BF%E0%A4%B7%E0%A4%AF%E0%A4%B8%E0%A5%82%E0%A4%9A%E0%A5%80:Aaj_Bhi_Khare_Hain_Talaab_(Hindi).pdf&action=history

At first glance, that sounds similar to T217540. But the translation of the edit summary says, "without leaving redirects", so it could just be that the page was rerendered in the window where the index page had been moved.
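
To make the suspected sequence concrete, here is a toy model in Python. It is not MediaWiki or ProofreadPage code: `ToyWiki`, its cache, and the page names are all invented, and it deliberately ignores the invalidation that real MediaWiki performs after moves and edits. It only shows how an error rendered while the index page was missing could keep being served until a purge.

```python
class ToyWiki:
    """Toy model of a render cache; an illustration, not MediaWiki code."""

    def __init__(self):
        self.existing_titles = set()
        self.parser_cache = {}

    def render(self, page, index_title):
        # Serve cached output when present, as a parser cache would.
        if page in self.parser_cache:
            return self.parser_cache[page]
        # Mirrors the existence check in the PRP snippet quoted above.
        if index_title in self.existing_titles:
            html = "<transcluded pages>"
        else:
            html = "error: no such index"
        self.parser_cache[page] = html
        return html

    def purge(self, page):
        # Drop the cached rendering so the next request re-renders the page.
        self.parser_cache.pop(page, None)


wiki = ToyWiki()
index = "Index:Example.pdf"  # hypothetical index title
wiki.existing_titles.add(index)

wiki.existing_titles.discard(index)          # index moved away, no redirect left
during_move = wiki.render("Chapter", index)  # rendered inside the window
wiki.existing_titles.add(index)              # index moved back
after_move = wiki.render("Chapter", index)   # cached error is still served
wiki.purge("Chapter")
after_purge = wiki.render("Chapter", index)  # re-rendered, index found again
```

In this model `during_move` and `after_move` both hold the error string, and only `after_purge` holds the transcluded output, which matches the observation that purging fixed the affected pages.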