
Autonumbering of section titles in TOC fails on printout
Closed, Declined (Public)

Description

Turning on autonumbered section titles works in ordinary page view, but fails on printout.

Given I have checked "Auto-number headings" under Appearance – "Advanced options"
When I print out the page or make a PDF
Then I expect the section titles and the TOC to be numbered

In the downloaded PDF (sidebar) there is no numbering at all, and in the printout only the section titles are numbered, not the TOC entries. This is on Firefox.

Event Timeline

jeblad renamed this task from Autonumbered section titles fails on printout to Autonumbering of section titles in TOC fails on printout. Jan 12 2019, 6:10 AM

Current print styles seem to intentionally hide the numbering in the TOC, regardless of the preference. Perhaps it should be shown if the preference is enabled.

I suspect the downloaded PDF is generated using Parsoid, which currently doesn't take your preferences into account at all. This also affects e.g. preferences about thumbnail size.

LGoto triaged this task as Lowest priority. Jul 15 2020, 3:55 PM
LGoto moved this task from Needs triage to Backlog on the Product-Infrastructure-Team-Backlog board.
LGoto added a project: Parsing-Team--ARCHIVED.

"I suspect the downloaded PDF is generated using Parsoid"

Can someone confirm whether this is true? It was true in the OCG days, but I don't know what Proton is backed by.

Proton essentially uses the output of /api/rest_v1/page/pdf/ which uses https://github.com/wikimedia/mediawiki-services-chromium-render under the hood.
It should be using an existing Parsoid-based API route, but it's been a while since I've looked at the code, so I can't tell you right now exactly which route is used. @polishdeveloper do you remember?

The preference numberheadings is tied to the user so I imagine that's why this is not working.

Proton uses the Puppeteer Node library to tell Chrome (running in headless mode) what to do. Currently we tell Chrome to open a Wikipedia page, call window.print(), and print to PDF. It behaves almost the same way as someone going to Wikipedia on their desktop, clicking "Print", and then picking "PDF" as the output. It doesn't use Parsoid output (at least at the time of implementation) because there were differences in styles for mobile pages.

It uses the REST API only to verify that the article exists (before adding it to the queue).

This is how it creates a queue item:
https://github.com/wikimedia/mediawiki-services-chromium-render/blob/19cfc57b45fd1aebad27b17209421cd799dbbb82/routes/html2pdf-v1.js#L144

And this is how it builds the link (the uri parameter in the queue item): https://github.com/wikimedia/mediawiki-services-chromium-render/blob/a4e120f779dd9011b5669321dc61c18782c01c64/lib/api-util.js#L36
Example: 'http://en.m.wikipedia.org/w/index.php?title=Book'
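
As a rough illustration of the URL construction described above, here is a minimal sketch in Python (Proton itself is a Node.js service, so this is not its code). The function name build_article_url and its parameters are hypothetical, and the way the mobile host is derived is an assumption; only the resulting URL shape comes from the example above.

```python
from urllib.parse import urlencode


def build_article_url(domain: str, title: str, mobile: bool = True) -> str:
    """Build the index.php URL that the headless browser is asked to open.

    Hypothetical helper for illustration; the real logic is in lib/api-util.js.
    """
    if mobile:
        # Assumption: the mobile site is reached by inserting an "m." label
        # after the language code, as in en.m.wikipedia.org.
        lang, _, rest = domain.partition(".")
        domain = f"{lang}.m.{rest}"
    return f"http://{domain}/w/index.php?{urlencode({'title': title})}"


print(build_article_url("en.wikipedia.org", "Book"))
# http://en.m.wikipedia.org/w/index.php?title=Book
```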

Most probably the print styles are missing something (see https://github.com/wikimedia/Vector/blob/a4a3c17a99815048d077c544cd14be62cfeafb14/resources/skins.vector.styles/common/print.less).

Got it. One less thing for Sabbu to worry about then :) Thanks @polishdeveloper !

So I think what's being requested here is an additional option in the service itself to disable User::getDefaultOption( 'numberheadings' ). Given the way Proton works, this would require various bits:

  1. Allowing this to be enabled by query string, e.g. http://en.m.wikipedia.org/w/index.php?title=Book&numberheadings=1
  2. Adding an option to the service layer
  3. Wiring it up to the Special:Proton page

That does seem like a lot of work to me, and it fragments the cache in multiple places. While this is just one option, it could lead to many... so be cautious!
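
To make the cache concern concrete, here is a back-of-the-envelope sketch in Python. It assumes every distinct combination of paper format, page type, and on/off option needs its own cached PDF. The three formats and two types are the ones the service accepts according to this discussion; cache_variants and the second option name are hypothetical.

```python
from itertools import product

FORMATS = ("letter", "a4", "legal")  # paper sizes the service accepts
TYPES = ("mobile", "desktop")        # page styles the service accepts


def cache_variants(boolean_options: list) -> int:
    """Count distinct PDFs per article if each option can be on or off."""
    toggles = [(False, True)] * len(boolean_options)
    return sum(1 for _ in product(FORMATS, TYPES, *toggles))


print(cache_variants([]))                  # 6
print(cache_variants(["numberheadings"]))  # 12
# "some_other_option" stands in for any further, hypothetical preference.
print(cache_variants(["numberheadings", "some_other_option"]))  # 24
```

Under these assumptions, each added on/off option doubles the number of PDFs that may need to be cached per article.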

I know this might be unpopular, but personally I would decline this: CTRL+P already provides a way to get a PDF that follows user preferences. Proton's audience is browsers that cannot print at all. It's not in our interest for Proton to gain more complexity.

The PDF print service is not aware of any user or user preferences. It prints all articles as an anonymous user. Therefore, when a logged-in user asks to print a page with some special option, it's exactly as @Jdlrobson says:

  • We need to pass any special options to the print service (Proton). Currently the URL to generate a PDF is https://en.wikipedia.org/api/rest_v1/page/pdf/Book, and Proton handles only /page/pdf/:title/:format[letter|a4|legal]/:type[mobile|desktop], for example /page/pdf/Book/legal/mobile to create a PDF of the Book article in mobile view using legal paper size.
  • Proton has to pass that option to Puppeteer/Chrome so that it renders the page correctly. That part is pretty straightforward: the queue item is an object, so we can assign some extra keys.
  • And then MediaWiki should allow overriding options based on _GET params.
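
The three steps above could fit together roughly as follows. This is a Python sketch for illustration only (Proton is a Node.js service). The name make_queue_item, the options key, the default format and type, and the fixed en.wikipedia.org hosts are all assumptions. According to this discussion, MediaWiki would still have to be changed before it honored a numberheadings query parameter.

```python
from typing import Optional
from urllib.parse import urlencode

FORMATS = ("letter", "a4", "legal")
TYPES = ("mobile", "desktop")


def make_queue_item(path: str, extra: Optional[dict] = None) -> dict:
    """Parse /page/pdf/:title/:format/:type into a hypothetical queue item."""
    parts = path.strip("/").split("/")
    if parts[:2] != ["page", "pdf"] or not 3 <= len(parts) <= 5:
        raise ValueError(f"unsupported route: {path}")
    title = parts[2]
    fmt = parts[3] if len(parts) > 3 else "letter"     # assumed default
    kind = parts[4] if len(parts) > 4 else "desktop"   # assumed default
    if fmt not in FORMATS or kind not in TYPES:
        raise ValueError(f"unsupported format or type: {path}")
    options = dict(extra or {})
    # Extra keys ride along on the queue item and are appended to the URL
    # as query parameters (step 3 would make MediaWiki read them).
    host = "en.m.wikipedia.org" if kind == "mobile" else "en.wikipedia.org"
    query = urlencode({"title": title, **options})
    return {
        "title": title,
        "format": fmt,
        "type": kind,
        "options": options,
        "uri": f"http://{host}/w/index.php?{query}",
    }


item = make_queue_item("/page/pdf/Book/legal/mobile", {"numberheadings": "1"})
print(item["uri"])
# http://en.m.wikipedia.org/w/index.php?title=Book&numberheadings=1
```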

LGoto closed this task as Declined. Oct 9 2020, 4:50 PM