[EPIC] Provide a way to download articles in PDF on the mobile website
Closed, Resolved (Public)

Authored By: ovasileva, Apr 20 2017, 5:46 PM
Referenced Files
F18855242: Screen Shot 2018-06-06 at 11.07.05 PM.png
Jun 6 2018, 10:09 PM
F9169720: download-floa.png
Aug 24 2017, 10:32 PM
F9169703: error-pdf.png
Aug 24 2017, 10:21 PM
F9169680: land.png
Aug 24 2017, 10:14 PM
F9169679: loading.png
Aug 24 2017, 10:14 PM
F9169678: modal.png
Aug 24 2017, 10:14 PM

Description
Based on the New Readers team's research on offline consumption, we have learned that users in our target countries are interested in saving Wikipedia articles as PDFs from their mobile devices [1].

Because we lazy-load images as an optimisation, the browser's default print mode is inferior: it prints the page without the images being visible.

User story
As a reader, I would like to read an article when I don't have an active internet connection.

Proposed Solution
Provide an article action to download the article. This action will trigger the download of the article in PDF format. We will use a backend service (Electron) to generate the PDF and trigger the download. Users will be able to read the PDF when there isn't an active internet connection.

Flow

download-floa.png (856×3 px, 130 KB)

Design

New article action on the article action toolbar. The icon indicates "Download".

land.png (1×720 px, 275 KB)

Loading while Electron prepares the PDF. The action turns into the standard spinner.

loading.png (1×720 px, 276 KB)

After Electron has prepared the PDF, we open the modal window. We show details of the PDF and allow readers to download it.

modal.png (1×720 px, 65 KB)

If Electron fails to create the PDF, show a toast message and don't show the modal window.

error-pdf.png (1×720 px, 271 KB)
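
The design above (action, spinner, then modal or toast) can be read as a small state machine. The following is a minimal sketch of that reading, not the team's implementation: the type names, event names, and UI labels are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical states of the download action, mirroring the four mock-ups:
// land.png (idle), loading.png (loading), modal.png (ready), error-pdf.png (error).
type DownloadState =
  | { kind: "idle" }
  | { kind: "loading" }
  | { kind: "ready"; sizeBytes: number }
  | { kind: "error"; message: string };

// Hypothetical events: the reader taps the action, the backend finishes or
// fails, and the reader dismisses the modal or toast.
type DownloadEvent =
  | { kind: "tap" }
  | { kind: "rendered"; sizeBytes: number }
  | { kind: "failed"; message: string }
  | { kind: "dismiss" };

// Pure transition function; events that make no sense in a state are ignored.
function nextState(state: DownloadState, event: DownloadEvent): DownloadState {
  switch (state.kind) {
    case "idle":
      return event.kind === "tap" ? { kind: "loading" } : state;
    case "loading":
      if (event.kind === "rendered") {
        return { kind: "ready", sizeBytes: event.sizeBytes };
      }
      if (event.kind === "failed") {
        return { kind: "error", message: event.message };
      }
      return state;
    case "ready":
    case "error":
      return event.kind === "dismiss" ? { kind: "idle" } : state;
  }
}

// Which piece of UI each state corresponds to in the design.
function describeUi(state: DownloadState): string {
  switch (state.kind) {
    case "idle":
      return "download icon";
    case "loading":
      return "spinner";
    case "ready":
      return "modal";
    case "error":
      return "toast";
  }
}
```

Keeping the transitions in one pure function makes it easy to check that a failed render can never open the modal, which is the constraint the error mock-up describes.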

Clickthrough prototype
https://wikimedia.invisionapp.com/share/CMD71Y5Z7#/250304641_land

Open questions

  • Can we show the file size in KB/MB in the modal window above?
  • Do we want to show the modal window or trigger the PDF download directly?
  • Should we still restrict it to Android now that we are going to use a backend service? (nope)

Functional Requirements

  • Should work on Android phones
  • PDF styles must be designed for reading on a mobile device
  • PDFs must contain all elements available in our current mobile print styles T154964: [EPIC] Improve mobile print styles
  • Must load all images in article
  • Users must be able to generate a PDF from the page for each article
  • PDF buttons will only be available for article namespace
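
The last requirement can be sketched as a single predicate. This is an illustration only: `PageContext` and `shouldShowPdfAction` are hypothetical names, and the only fact relied on is that MediaWiki's main (article) namespace has number 0. In MediaWiki front-end code the number would typically be read from `mw.config.get( 'wgNamespaceNumber' )`; here it is passed in so the example stays self-contained.

```typescript
// MediaWiki's main (article) namespace is number 0.
const MAIN_NAMESPACE = 0;

// Hypothetical shape: just the one page property this check needs.
interface PageContext {
  namespaceNumber: number;
}

// Show the PDF download action only on pages in the article namespace.
function shouldShowPdfAction(page: PageContext): boolean {
  return page.namespaceNumber === MAIN_NAMESPACE;
}
```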

[1] https://meta.wikimedia.org/wiki/File:Report_for_offline_concepts_in_India_2017.pdf

Event Timeline

Can we generate it, show the size, and then let the person download? That was meant to be the flow.

Right now, clicking "download as pdf" kicks off an asynchronous process to generate the PDF. We can't predict whether a PDF will be generated, so the alternative would be to show some intermediate screen for as long as PDF generation takes (maybe up to a minute). When it's done, we would have the size. What would the flow look like here?

You mean we would predict prior to the user selecting the button? Or once they select the button? If the latter, can't we just use the current workflow? (ignore me if we've updated this since this comment):

  1. user selects download
  2. we display spinner (render PDF in background)
  3. once rendering is done, we display size and download button
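
For step 3, the size has to be turned into something readable. A minimal sketch, assuming the rendered PDF's size is known in bytes once rendering finishes; `formatFileSize` is a hypothetical helper, and the choice of 1024-based units and rounding is an assumption rather than anything decided in this task.

```typescript
// Hypothetical helper: format a byte count for display next to the download button.
// Uses 1024-based units; KB values are rounded down, MB values keep one decimal.
function formatFileSize(bytes: number): string {
  if (!isFinite(bytes) || bytes < 0) {
    return "unknown size"; // e.g. the service did not report a size
  }
  if (bytes < 1024) {
    return `${bytes} B`;
  }
  const kilobytes = bytes / 1024;
  if (kilobytes < 1024) {
    return `${Math.floor(kilobytes)} KB`;
  }
  return `${(kilobytes / 1024).toFixed(1)} MB`;
}
```

The "unknown size" fallback matters if the size cannot be read reliably, in which case the download button can still be shown without a number.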

We skipped doing this on desktop, but it's a bit more important for mobile

@bmansurov, @Jdlrobson - we should keep an eye on this as we begin to consider headless Chrome. Namely:

  • Can we still set this up with Electron (given we're planning on building and deploying early next quarter)?
  • Would that affect any of our plans for headless Chrome?

Would it be worth setting up a spike to answer the above?

Given how unstable Electron is, I wouldn't recommend building anything on top of it. We're going to have to prioritize headless Chromium over this and use it instead of Electron.

Initial tasks:

  • set up task for download button plus instrumentation: number of downloads per day/per month in grafana T177215: Build download button for mobile PDF download
  • set up task for setting up error state plus instrumentation: number of errors per day/per month in grafana
  • spike for how we should store PDFs on the service in order to get the size of download:
    • is there storage space?
    • does the storage space have any limit?
    • what is the size of this storage space?
    • when can we throw a rendered PDF away - should we delete them at the end of the day? should we delete them when user closes overlay? what happens when user closes the overlay? what is the easiest thing we can do?

Note that since we'd only generate PDFs upon request (not beforehand), there's likely to be a delay between clicking the button, requesting the PDF's size, and allowing the download. Clicking the button would always generate the PDF on the server regardless of whether the user follows through. How will that work in the workflow?

During grooming, @pmiazga reminded us that there was an issue with Varnish not playing nicely with PDFs as they flowed through it (see T175868#3611679 and rECOL024f98e3ac68: Remove Content-Length from PDF response).

There are also some missing tasks and open questions:

  • [Spike] Design the interface (HTTP) for the new rendering service (that's being built in T176627).
  • Who's this feature for? Should we restrict it to a specific set of browsers?

The latter will help us get an estimate for the kind of additional load that the rendering service will be under.

The feature is targeting readers who want to have access to articles while offline. Here's info from the research reports that informed this feature:
https://meta.wikimedia.org/wiki/New_Readers/Offline#Concept_testing_for_mobile_web

We were initially considering restricting to Android. @atgo - do you think this is still reasonable? In terms of restricting to particular browsers, I would advise against it - I think many of our target users are using both modern and older browsers.

In response to T177215#3656782 - based on @bmansurov's comment in T163472#3639709, I was assuming that we would be using headless chromium for the rendering rather than electron.

Definitely Android. I'm not sure why we would want to restrict though - is there a reason we think iOS & other users wouldn't want this?

I will comment on the flows later, but just for platform consideration: we shouldn't restrict anymore because of the backend service. As of iOS 11, file management is easy on iOS, so it makes sense to have it on iOS too.

Definitely Android. I'm not sure why we would want to restrict though - is there a reason we think iOS & other users wouldn't want this?

@atgo: The existing service that renders PDFs is struggling under the current load. The RW engineers are currently working on a possible replacement. The suggestion that we restrict the audience of this feature wasn't because people don't want it but because, right now, we can't support a large audience.

I will comment on the flows later, but just for platform consideration: we shouldn't restrict anymore because of the backend service. As of iOS 11, file management is easy on iOS, so it makes sense to have it on iOS too.

In principle, I agree. However, if the backend service continues to struggle under current load or the replacement service can't handle the load, then adding more load isn't viable.

Both: The RW engineers are aware of this work, the intended audience (everyone, at the moment), and will be able to give better-informed feedback once they've implemented, tested, deployed, and monitored the service in T176627: Trial replacing Electron with headless Chromium in the render service.

Due to our optimisation to lazy load images, the default browser print mode is inferior in that it prints the page without images being visible.

Medium has now worked out a solution for this and we should do it too:

Screen Shot 2018-06-06 at 11.07.05 PM.png (216×532 px, 43 KB)

I think that looks good. @alexhollender - I think just a modal here would work. Any thoughts/concerns?

@Jdlrobson @ovasileva is there any way for a user to know when images have finished loading? It seems like a vague instruction. Also I was a bit confused regarding lazy load — I figured in order to get all the images to load the user would need to exit the print dialog, then scroll through the page (forcing the images to lazy load)?

@alexhollender most of the time closing and reprinting will resolve the problem, but on slow connections the banner will remain. The feedback to the user that images have finished loading is the banner disappearing, so maybe our copy could reflect this.

@Jdlrobson ok, so I wonder if the copy should be more like "there was an error with ..... please close this dialog and press the download button again"? If we mention waiting for images to finish downloading there's no real way for the user to know when enough time has passed, other than just trying again and seeing if the error is still there, so maybe we don't even mention images loading at all and just tell them to try again?
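
One way to picture the "banner disappears when images have loaded" behaviour discussed above is as a function of how many lazy-loaded images are still pending. This is only a sketch under assumptions: `LazyImage`, `pendingImages`, and `printBannerText` are hypothetical names, and the banner wording is a placeholder, not proposed copy. In a browser, the `loaded` flag could be derived from `HTMLImageElement.complete` when the `beforeprint` event fires.

```typescript
// Hypothetical record for one lazy-loaded image on the page.
interface LazyImage {
  src: string;
  loaded: boolean;
}

// Images whose real content has not been fetched yet.
function pendingImages(images: LazyImage[]): LazyImage[] {
  return images.filter((image) => !image.loaded);
}

// Returns placeholder banner text while images are pending, or null once
// everything has loaded (i.e. the banner should disappear).
function printBannerText(images: LazyImage[]): string | null {
  const pending = pendingImages(images).length;
  if (pending === 0) {
    return null;
  }
  return `${pending} of ${images.length} images have not loaded yet.`;
}
```

Exposing a count like this would give the reader a concrete signal of progress, which is the gap raised in the comments above; whether to surface it at all is a copy and design decision.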

Jdlrobson renamed this task from [EPIC] Provide a way to download articles in PDF on the mobile website to Provide a way to download articles in PDF on the mobile website. (Jun 19 2018, 9:35 PM)
Jdlrobson renamed this task from Provide a way to download articles in PDF on the mobile website to [EPIC] Provide a way to download articles in PDF on the mobile website.
Jdlrobson changed the task status from Stalled to Open.
Jdlrobson edited projects, added Web-Team-Backlog (Design); removed Web-Team-Backlog.
Jdlrobson edited projects, added Web-Team-Backlog; removed Web-Team-Backlog (Design).

Let's move conversation over to T162414. Having the conversation on the epic is confusing me.

Can this be resolved @ovasileva - the title of this task is "Provide a way to download articles in PDF on the mobile website" and we have a way (sadly Chrome only but a way).

We have a task T201954 for rolling it out to other places - I don't think an epic is necessarily useful here unless we plan to work on it this quarter.

ovasileva claimed this task.

Sounds good. Resolving.