
Subpages are not exported where AuxTOC template indexes chapters
Closed, ResolvedPublic

Description

Subpages are not exported in pdf or epub etc. where AuxTOC template indexes chapters of a transcluded text.

For example, exporting this book does not include its chapters.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Samwilson added a subscriber: ssastry.

This looks like an error with Parsoid, in that it's not returning the full HTML of the page.

Instead, there's a <span about="#mwt14" typeof="mw:Transclusion" data-mw='{"parts":[{"template":{"target":{"wt":"AuxTOC","href":"./টেমপ্লেট:AuxTOC"}," … element that contains references to the ToC's contents (it looks like this is aimed at VE and requires secondary parsing or something, but we probably don't want to get into doing that here).

Pinging @ssastry.

The transclusion in question is:

{{AuxTOC|
{{c|{{xx-larger|'''সূচীপত্র'''}}}}
{{block center|width=300px}}
{{Table| title='''পরিচ্ছেদ'''| page= '''পাতা'''}}
{{Table| title=[[/১/]]| page=[[পাতা:অবরোধ বাসিনী.pdf/৪|১]]}}
...
...
{{Table| title=[[/৪৭/]]| page=[[পাতা:অবরোধ বাসিনী.pdf/২৮|২৫]]}}
}}

If I go to the API sandbox at https://bn.wikisource.org/wiki/Special:ApiSandbox#action=expandtemplates&format=json&formatversion=2, enter that wikitext, and click submit, the output doesn't include the TOC. Parsoid effectively uses this API (via internal calls), so Parsoid is emitting the empty table. Either something is broken in how Parsoid is using that API, or we are tripping over some other bug. This is just a quick report without any additional investigation. We'll take a look.
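For anyone who wants to repeat the sandbox check from a script, here is a minimal sketch that builds the same `action=expandtemplates` request. The endpoint path `/w/api.php` is the usual Wikimedia location and is an assumption here, and `build_expandtemplates_url` is a hypothetical helper name, not part of any library. The sketch only constructs the URL; fetching it is left to the reader.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint: the standard api.php path on bn.wikisource.org.
API_ENDPOINT = "https://bn.wikisource.org/w/api.php"


def build_expandtemplates_url(wikitext, title=None):
    """Return a GET URL asking the action API to expand `wikitext`.

    Hypothetical helper for illustration. The parameters mirror the
    sandbox link in the comment above (format=json, formatversion=2).
    """
    params = {
        "action": "expandtemplates",
        "format": "json",
        "formatversion": "2",
        "prop": "wikitext",  # ask for the expanded wikitext only
        "text": wikitext,
    }
    if title is not None:
        # Optional page title used as context for the expansion.
        params["title"] = title
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_expandtemplates_url("{{AuxTOC|[[/1/]]}}")
    # Show the decoded query parameters that would be sent.
    print(parse_qs(urlsplit(url).query))
```

If the expanded wikitext in the response lacks the table rows passed to `{{AuxTOC}}`, that matches the behaviour described here.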

Thanks for looking at it.

The template expansion seems to be breaking on the use of <poem> in {{AuxTOC}}. Not sure why it's using <poem> (although I guess it also should break if it does).

Arlolra triaged this task as Medium priority. Apr 2 2021, 7:28 PM
Arlolra moved this task from Needs Triage to Missing Functionality on the Parsoid board.
Arlolra added a subscriber: Arlolra.

From the expansion of AuxTOC, the problem looks to be from here,

<poem>{{{text|{{{1|}}}}}}</poem>

The culprit is that {{{1}}} argument. When parsing with the legacy parser, the poem tag would have access to the frame information for the template argument to substitute in. However, when doing preprocessing API calls, the extension tag has higher precedence and the content doesn't get rendered. So, Parsoid loses the contextual information about the frame and just renders the extension content without any template argument info.

I'm positive we have an open ticket about that but I can't find it right now. It's definitely a known deficiency.

We'll run into this whenever we have native extensions. I ran into it when I was doing a quick prototype of the <indicator> implementation in Parsoid, and it looks like we need to link up call frames in the call stack appropriately in the ParsoidExtensionAPI code. Since I am going to be poking around in that area for <indicator>, I'll report back based on what I find there.

Tried {{#tag:poem}} ?

Yes, that is a workaround. The template argument will be expanded before being passed to the parser function, which will return an extension tag when preprocessing.

Expansion,
https://github.com/wikimedia/mediawiki/blob/e614e8653eeb72095fdb5198a7217bb5bc3b0176/includes/parser/Parser.php#L3938-L3945

Then returning the reconstituted tag,
https://github.com/wikimedia/mediawiki/blob/e614e8653eeb72095fdb5198a7217bb5bc3b0176/includes/parser/Parser.php#L3985-L4005
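The ordering difference discussed above can be illustrated with a small toy model. This is not MediaWiki or Parsoid code: `expand_args`, `legacy_style`, `preprocess_then_render`, and `tag_workaround` are hypothetical names, the regex handles only simple `{{{name|default}}}` arguments, and treating "no frame" as "fall back to the default" is an assumption made for the sketch.

```python
import re

# Matches an innermost argument such as {{{1|}}} or {{{text|fallback}}}.
ARG = re.compile(r"\{\{\{([^{}|]+)(?:\|([^{}]*))?\}\}\}")


def expand_args(wikitext, frame):
    """Toy substitution of template arguments using values from `frame`."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in frame:
            return frame[name]
        # No value supplied: use the default, or leave the text untouched.
        return default if default is not None else match.group(0)

    # Resolve nested arguments from the inside out, with a small pass limit.
    for _ in range(10):
        expanded = ARG.sub(repl, wikitext)
        if expanded == wikitext:
            break
        wikitext = expanded
    return wikitext


BODY = "<poem>{{{text|{{{1|}}}}}}</poem>"


def legacy_style(frame):
    # The tag content is expanded while the template frame is still known.
    return expand_args(BODY, frame)


def preprocess_then_render(frame):
    # Step 1: preprocessing leaves the extension tag content unexpanded.
    opaque = BODY
    # Step 2: the content is rendered later without the frame (assumed empty).
    return expand_args(opaque, {})


def tag_workaround(frame):
    # {{#tag:poem|...}}: the argument is expanded first, then wrapped.
    content = expand_args("{{{text|{{{1|}}}}}}", frame)
    return "<poem>" + content + "</poem>"


if __name__ == "__main__":
    frame = {"1": "TOC rows"}
    print(legacy_style(frame))
    print(preprocess_then_render(frame))
    print(tag_workaround(frame))
```

In this model the frameless path yields an empty <poem>, while the other two keep the argument, which is the shape of the problem and of the workaround described in this thread.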

I switched it to {{#tag:poem}} but it doesn't seem to have changed the Parsoid HTML output.

Are you sure? The ToC seems to be here https://bn.wikisource.org/api/rest_v1/page/html/%E0%A6%85%E0%A6%AC%E0%A6%B0%E0%A7%8B%E0%A6%A7_%E0%A6%AC%E0%A6%BE%E0%A6%B8%E0%A6%BF%E0%A6%A8%E0%A7%80#mwDA

There is a rendering difference because of T161278 but the HTML at least seems right.

Try adding to the head,

<link rel="stylesheet" href="/w/load.php?lang=bn&amp;modules=ext.gadget.Site%2CThreadedDiscussions&amp;only=styles&amp;skin=vector"/>

Oh you're right, it is updated now! I must have been looking at a cached copy (there was definitely no ToC). Oops.

So it looks like the immediate issue is solved (and the underlying issue is tracked in T107332).

@Bodhisattwa does the export look correct to you now?

@Samwilson the whole book is now getting exported for pdf, epub and mobi. Thanks a lot everyone. :-)

Samwilson claimed this task.

Great! Thanks for checking. Closing this.