
[EPIC] Provide a way to download articles in PDF on the mobile website
Open, HighPublic

Description

Based on the New Readers team research on offline consumption, we have learned that users in our target countries are interested in saving Wikipedia articles to PDF from their mobile devices [1].

Due to our optimisation to lazy load images, the default browser print mode is inferior in that it prints the page without images being visible.

User story
As a reader, I would like to read an article when I don't have an active internet connection.

Proposed Solution
Provide an article action to download the article. This action will trigger downloading of an article in PDF format. We will use a backend service (Electron) to generate this PDF and trigger the download. Users will be able to read this PDF when there isn't an active internet connection.

Flow

Design

New article action on the article action toolbar. The icon indicates "Download".

Loading while Electron prepares the PDF. The action will turn into the standard spinner.

After Electron has prepared the PDF, we open the modal window. We show details of the PDF and allow readers to download it.

If Electron fails to create the PDF, show a toast message. Don't show the modal window.
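
A minimal sketch of the four design states above, written as a TypeScript reducer. The state names, event names, and toast wording are assumptions made for illustration; none of them come from the design itself.

```typescript
// Hypothetical state and event names, chosen for this sketch only.
type DownloadState =
  | { kind: "idle" }                      // download icon shown in the toolbar
  | { kind: "loading" }                   // icon replaced by the standard spinner
  | { kind: "ready"; sizeBytes: number }  // modal with PDF details is open
  | { kind: "failed"; toast: string };    // toast shown, modal stays closed

type DownloadEvent =
  | { kind: "tap" }
  | { kind: "rendered"; sizeBytes: number }
  | { kind: "renderFailed" }
  | { kind: "dismiss" };

function nextState(state: DownloadState, event: DownloadEvent): DownloadState {
  switch (event.kind) {
    case "tap":
      // Ignore repeat taps while a render is already in flight.
      return state.kind === "loading" ? state : { kind: "loading" };
    case "rendered":
      // Only a pending render can open the modal.
      return state.kind === "loading"
        ? { kind: "ready", sizeBytes: event.sizeBytes }
        : state;
    case "renderFailed":
      // Placeholder copy; the real message is a design decision.
      return state.kind === "loading"
        ? { kind: "failed", toast: "The PDF could not be created." }
        : state;
    case "dismiss":
      // Closing the modal or the toast returns to the plain icon.
      return { kind: "idle" };
  }
}
```

Keeping the transitions in one pure function makes the "toast instead of modal" rule easy to check without a browser.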

Clickthrough prototype
https://wikimedia.invisionapp.com/share/CMD71Y5Z7#/250304641_land

Open question

  • Can we show the file size in kB/MB in the modal window above?
  • Do we want to show the modal window or trigger the PDF download directly?
  • Should we still restrict it to Android now that we are going to use a backend service? (nope)
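
For the first question, formatting the size is straightforward once the byte count is known. A minimal sketch, assuming decimal units (1 kB = 1000 bytes) and one decimal place; `formatFileSize` is a hypothetical helper name and the rounding rule is an assumption, not a design decision:

```typescript
// Hypothetical helper: turn a byte count into a short label for the modal.
function formatFileSize(bytes: number): string {
  if (!Number.isInteger(bytes) || bytes < 0) {
    throw new RangeError("bytes must be a non-negative integer");
  }
  const units = ["B", "kB", "MB"];
  let value = bytes;
  let unit = 0;
  // Step up a unit when the value would otherwise round to 1000.0.
  while (value >= 999.95 && unit < units.length - 1) {
    value /= 1000;
    unit++;
  }
  return unit === 0 ? `${value} B` : `${value.toFixed(1)} ${units[unit]}`;
}
```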

Functional Requirements

  • Should work on Android phones
  • PDF styles must be designed for reading on a mobile device
  • PDFs must contain all elements available in our current mobile print styles (T154964: [EPIC] Improve mobile print styles)
  • Must load all images in article
  • Users must be able to generate a PDF from the page for each article
  • PDF buttons will only be available for article namespace
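
The namespace requirement could be checked with something like the sketch below. In MediaWiki the article (main) namespace has ID 0; the function name is hypothetical, and on a live page the ID would presumably come from `mw.config.get( 'wgNamespaceNumber' )`.

```typescript
// MediaWiki's ID for the article (main) namespace.
const MAIN_NAMESPACE = 0;

// Hypothetical helper: decide whether to render the PDF download action.
function isPdfActionAvailable(namespaceId: number): boolean {
  return namespaceId === MAIN_NAMESPACE;
}
```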

[1] https://meta.wikimedia.org/wiki/File:Report_for_offline_concepts_in_India_2017.pdf

Related Objects

Status     Assigned
Stalled    None
Open       None
Resolved   ABorbaWMF
Resolved   ABorbaWMF
Declined   None
Declined   ovasileva
Declined   None
Resolved   ABorbaWMF
Resolved   phuedx
Invalid    None
Resolved   ABorbaWMF
Resolved   phuedx
Resolved   mobrovac
Resolved   mobrovac
Resolved   akosiaris
Resolved   mobrovac
Resolved   fgiunchedi
Resolved   pmiazga
Resolved   faidon
Resolved   mobrovac
Resolved   mobrovac
Resolved   pmiazga
Resolved   Jdrewniak
Resolved   mobrovac
Resolved   phuedx
Resolved   pmiazga
Resolved   pmiazga
Open       ovasileva
Open       alexhollender
Open       None
Resolved   Tbayer
Resolved   Tbayer
Resolved   ovasileva
Resolved   CKoerner_WMF
Resolved   Tbayer
Resolved   ovasileva

Event Timeline

There are a very large number of changes, so older changes are hidden.

I'd guess not, since we are generating the PDF in the background and don't know its final size until it's done.

can we generate it and show the size and then let the person download? that was meant to be the flow.

That's a design question right?

Yes, this is in design review with other designers right now. I will make a decision by next week.

I don't really understand this question. What is the benefit of restricting to Android? Are we expecting bugs in other browsers? From a developer point of view, limiting to Android is relatively straightforward if you want to, but I'd advise against it, as you may exclude browsers where the feature is useful. Restricting it, however, would reduce the scope of potential bugs.

Earlier we were going to do this on the client side, and the client-side flow for generating print > PDF on iOS is terrible, so we wanted to restrict it to Android. Now with Electron that doesn't make sense, so I guess the answer to this open question is "do not restrict platforms".

can we generate it and show the size and then let the person download? that was meant to be the flow.

Right now clicking "download as pdf" kicks off an asynchronous process to generate the PDF. We can't predict that a PDF will be generated, so the alternative would be to provide some intermediate screen for however long the PDF generation takes (maybe up to a minute). When it's done we would have the size. What would the flow look like here?

ovasileva moved this task from Triage to Backlog on the Proton board. Sep 15 2017, 2:28 PM
ovasileva added a subscriber: bmansurov. Edited Sep 27 2017, 12:23 PM

@bmansurov, @Jdlrobson - we should keep an eye on this as we begin to consider headless Chrome. Namely:

  • can we still set this up with Electron (given we're planning on building and deploying early next quarter)
  • would that affect any of our plans on headless Chrome

would it be worth setting up a spike to answer the above?

ovasileva updated the task description. Sep 27 2017, 12:25 PM
ovasileva updated the task description.
ovasileva updated the task description. Sep 27 2017, 12:27 PM

can we generate it and show the size and then let the person download? that was meant to be the flow.

Right now clicking "download as pdf" kicks off an asynchronous process to generate the PDF. We can't predict that a PDF will be generated, so the alternative would be to provide some intermediate screen for however long the PDF generation takes (maybe up to a minute). When it's done we would have the size. What would the flow look like here?

You mean we would predict prior to the user selecting the button? Or once they select the button? If the latter, can't we just use the current workflow? (ignore me if we've updated this since this comment):

  1. user selects download
  2. we display spinner (render PDF in background)
  3. once rendering is done, we display size and download button

We skipped doing this on desktop, but it's a bit more important for mobile

bmansurov added a comment. Edited Sep 27 2017, 1:55 PM

@bmansurov, @Jdlrobson - we should keep an eye on this as we begin to consider headless Chrome. Namely:

  • can we still set this up with Electron (given we're planning on building and deploying early next quarter)
  • would that affect any of our plans on headless Chrome

would it be worth setting up a spike to answer the above?

Given how unstable Electron is I wouldn't recommend building anything on top of it. We're going to have to prioritize headless Chromium over this and use it instead of Electron.

ovasileva added a comment. Edited Oct 2 2017, 10:14 AM

Initial tasks:

  • set up task for download button plus instrumentation: number of downloads per day/per month in grafana T177215: Build download button for mobile PDF download
  • set up task for setting up error state plus instrumentation: number of errors per day/per month in grafana
  • spike for how we should store PDFs on the service in order to get the size of download:
    • is there storage space?
    • does the storage space have any limit?
    • provide the size for this storage space?
    • when can we throw a rendered PDF away - should we delete them at the end of the day? should we delete them when user closes overlay? what happens when user closes the overlay? what is the easiest thing we can do?

Jdlrobson added a comment. Edited Oct 3 2017, 4:45 PM

Note: since we'd only generate PDFs upon request (not beforehand), there's likely to be a delay between clicking the button, requesting the PDF's size, and allowing the download. Clicking the button would always generate the PDF on the server regardless of whether the user follows through. How will that work in the workflow?

During grooming, @pmiazga reminded us that there was an issue with Varnish not playing nicely with PDFs as they flowed through it (see T175868#3611679 and rECOL024f98e3ac68: Remove Content-Length from PDF response).
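
If the Content-Length header is not present on the PDF response, a client that wants to show the size cannot rely on it. A minimal sketch of a fallback, assuming the client already has the full response body in memory; `pdfSizeInBytes` is a hypothetical helper, not part of any existing service:

```typescript
// Hypothetical helper: prefer the header when it is a plain byte count,
// otherwise fall back to measuring the downloaded body.
function pdfSizeInBytes(contentLength: string | null, body: Uint8Array): number {
  const header = contentLength === null ? "" : contentLength.trim();
  if (/^\d+$/.test(header)) {
    return Number(header);
  }
  return body.byteLength;
}
```

With the Fetch API the two inputs could be obtained from `response.headers.get("Content-Length")` and `new Uint8Array(await response.arrayBuffer())`. Note that the fallback only yields a size after the whole PDF has been transferred, which bears on the "show size, then download" flow discussed above.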

phuedx added a comment. Edited Oct 3 2017, 5:07 PM

There are also some missing tasks and open questions:

  • [Spike] Design the interface (HTTP) for the new rendering service (that's being built in T176627).
  • Who's this feature for? Should we restrict it to a specific set of browsers?

The latter will help us get an estimate for the kind of additional load that the rendering service will be under.

Feature is targeting readers who want to have access to articles while
offline. Here's info from the research reports that informed this feature:
https://meta.wikimedia.org/wiki/New_Readers/Offline#Concept_testing_for_mobile_web

There are also some missing tasks and open questions:

  • [Spike] Design the interface (HTTP) for the new rendering service (that's being built in T176627).
  • Who's this feature for? Should we restrict it to a specific set of browsers?

We were initially considering restricting to Android. @atgo - do you think this is still reasonable? In terms of restricting to particular browsers, I would advise against it - I think many of our target users are using both modern and older browsers.

In response to T177215#3656782 - based on @bmansurov's comment in T163472#3639709, I was assuming that we would be using headless chromium for the rendering rather than electron.

Definitely Android. I'm not sure why we would want to restrict though - is
there a reason we think iOS & other users wouldn't want this?

I will comment on the flows later, but just for platform consideration: we shouldn't restrict anymore because of the backend service. As of iOS 11, file management is easy on iOS, so it makes sense to have it on iOS too.

phuedx added a comment. Oct 5 2017, 1:26 PM

Definitely Android. I'm not sure why we would want to restrict though - is
there a reason we think iOS & other users wouldn't want this?

@atgo: The existing service that renders PDFs is struggling under the current load. The RW engineers are currently working on a possible replacement. The suggestion that we restrict the audience of this feature wasn't because people don't want it but because, right now, we can't support a large audience.

I will comment on the flows later, but just for platform consideration: we shouldn't restrict anymore because of the backend service. As of iOS 11, file management is easy on iOS, so it makes sense to have it on iOS too.

In principle, I agree. However, if the backend service continues to struggle under current load or the replacement service can't handle the load, then adding more load isn't viable.

Both: The RW engineers are aware of this work, the intended audience (everyone, at the moment), and will be able to give better-informed feedback once they've implemented, tested, deployed, and monitored the service in T176627: Trial replacing Electron with headless Chromium in the render service.

phuedx changed the task status from Open to Stalled. Mar 7 2018, 4:19 PM

Due to our optimisation to lazy load images, the default browser print mode is inferior in that it prints the page without images being visible.

Medium has now worked out a solution for this and we should do it too:

Due to our optimisation to lazy load images, the default browser print mode is inferior in that it prints the page without images being visible.

Medium has now worked out a solution for this and we should do it too:

I think that looks good. @alexhollender - I think just a modal here would work. Any thoughts/concerns?

@Jdlrobson @ovasileva is there any way for a user to know when images have finished loading? It seems like a vague instruction. Also I was a bit confused regarding lazy load — I figured in order to get all the images to load the user would need to exit the print dialog, then scroll through the page (forcing the images to lazy load)?

@alexhollender most of the time closing and reprinting will resolve the problem, but on slow connections the banner will remain. The feedback to the user that images have finished loading is the banner disappearing, so maybe our copy text could reflect this.

@Jdlrobson ok, so I wonder if the copy should be more like "there was an error with ..... please close this dialog and press the download button again"? If we mention waiting for images to finish downloading there's no real way for the user to know when enough time has passed, other than just trying again and seeing if the error is still there, so maybe we don't even mention images loading at all and just tell them to try again?

Jdlrobson renamed this task from [EPIC] Provide a way to download articles in PDF on the mobile website to Provide a way to download articles in PDF on the mobile website. Jun 19 2018, 9:35 PM
Jdlrobson renamed this task from Provide a way to download articles in PDF on the mobile website to [EPIC] Provide a way to download articles in PDF on the mobile website.
Jdlrobson changed the task status from Stalled to Open.

Let's move conversation over to T162414. Having the conversation on the epic is confusing me.