Expose the PDF rendering service via RESTBase
Closed, Resolved · Public

Description

To use the PDF rendering service from T143129, we need to create an endpoint, preferably /{domain}/v1/page/pdf/{title}, that calls the service and returns the result to the caller (proxy-only mode for the time being).
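
As an illustration of the proposed route and of proxy-only behaviour, here is a minimal Python sketch. It is not the RESTBase implementation: the function names and the `render` callable are hypothetical, and only the path template comes from the description above.

```python
from urllib.parse import quote

# Path template taken from the task description.
PDF_ROUTE = "/{domain}/v1/page/pdf/{title}"


def pdf_path(domain: str, title: str) -> str:
    """Fill in the route template, percent-encoding the title."""
    # safe="" also encodes "/", so a title such as "AC/DC" stays one path segment.
    return PDF_ROUTE.format(domain=domain, title=quote(title, safe=""))


def proxy_pdf(domain: str, title: str, render) -> bytes:
    """Proxy-only mode: call the renderer and hand back its output unchanged.

    `render` is a hypothetical callable standing in for a client of the
    PDF rendering service; it takes (domain, title) and returns PDF bytes.
    """
    return render(domain, title)
```

In proxy-only mode nothing is stored or transformed on the way through, so the endpoint's behaviour is entirely determined by the rendering service behind it.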

Event Timeline

mobrovac created this task. Aug 16 2016, 5:47 PM
Restricted Application added a subscriber: Aklapper. Aug 16 2016, 5:47 PM
GWicke triaged this task as High priority. Oct 12 2016, 11:08 PM

Is there an ETA for this?

@Addshore, this depends on the electron service being deployed, which in turn depends on ops. We are shooting for all this to be resolved before November 30th.

Great! :)

Joe added a subscriber: Joe. Nov 29 2016, 11:34 AM

Please note that exposing the service via restbase doesn't mean it's a good idea to call it via restbase from a MediaWiki extension; actually, there are very, very good reasons why that is a very bad idea in the general case.

The service is now exposed via restbase so that the public rest api can access this service, but I don't think we should access it via that from mediawiki.

Also, the restbase configuration at the moment does cache the content (a PDF) on varnish, and the text cluster, for 5 minutes.

That should be discussed as well with the appropriate people (Traffic).

Joe added a comment. Edited · Nov 29 2016, 12:24 PM

Please note that exposing the service via restbase doesn't mean it's a good idea to call it via restbase from a MediaWiki extension; actually, there are very, very good reasons why that is a very bad idea in the general case.
The service is now exposed via restbase so that the public rest api can access this service, but I don't think we should access it via that from mediawiki.

This is being discussed in T150185 and it is mostly ok, but it means that all requests to the PDF service will go through the publicly exposed RESTBase URLs, and thus via Varnish, so the other point I made

Also, the restbase configuration at the moment does cache the content (a PDF) on varnish, and the text cluster, for 5 minutes.
That should be discussed as well with the appropriate people (Traffic).

is even more critical, as the expected via-varnish traffic is of course all the traffic generated from the service.

Joe added a comment. Nov 29 2016, 1:38 PM

For comparison, I just confirmed that when using OCG, MediaWiki issues Cache-control: no-cache; that's because OCG is caching content on disk.

@Joe: The traffic we are talking about here is very low. OCG currently sees about 2 req/s.

GWicke added a subscriber: BBlack. Edited · Nov 29 2016, 7:06 PM

The PR is now merged, and I also checked with @BBlack about object sizes and Varnish cache times. With the expected volume and sizes (< 100 MB), he does not see issues, but he recommended looking into indicating the size (via Content-Length or some other header) if we end up serving PDFs of 100 MB or larger. In that case, disabling caching for large responses would also be worth considering.

The electron render time limit is set to 60s in production. Based on experiments in labs, it is likely that this limits the returned PDF sizes to significantly less than 100 MB in practice.

The REST API endpoint is tentatively scheduled for deployment tomorrow.
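
The size-based caching idea discussed above could look roughly like the following Python sketch. It is an illustration only, not the deployed RESTBase or Varnish configuration: the function name is hypothetical, "100 MB" is interpreted as 100 MiB, and the choice of `s-maxage` and `no-cache` as the directives is an assumption.

```python
CACHE_SECONDS = 5 * 60           # 5-minute cache time mentioned earlier in the thread
LARGE_PDF_BYTES = 100 * 1024**2  # "100 MB" threshold, read here as 100 MiB (assumption)


def response_headers(size_bytes: int) -> dict:
    """Build response headers for a rendered PDF of the given size."""
    headers = {
        "Content-Type": "application/pdf",
        # Advertise the size so caches can see how large the object is.
        "Content-Length": str(size_bytes),
    }
    if size_bytes >= LARGE_PDF_BYTES:
        # Large responses: opt out of caching.
        headers["Cache-Control"] = "no-cache"
    else:
        # Normal responses: short-lived shared-cache entry.
        headers["Cache-Control"] = f"s-maxage={CACHE_SECONDS}"
    return headers
```

With a 60s render limit keeping most PDFs well under the threshold, the large-response branch would rarely be taken in practice.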

ema moved this task from Triage to Watching on the Traffic board. Dec 5 2016, 11:47 AM
TheDJ added a subscriber: TheDJ. Apr 28 2017, 10:24 AM

Could someone be so kind as to document on mediawiki.org how to configure this? Many people there are interested in running electron on their own, but it's totally confusing. People think they just have to install the ElectronPdfService extension, without realising that they also need their own RESTBase and electron service, and there are no configuration steps specific to this service linked from the extension page.

I added some hints, and linked to the upstream service repository. Functionally, the electron render service is all that is needed to render arbitrary web pages to PDFs. The extension, RESTBase, and Varnish caching are all just nice-to-haves.