Storage for mobile-html endpoint
Closed, Duplicate | Public

Description

It seems like the mobile-html endpoint is coming soon, so we need to discuss storage for it. Judging by the name, it will store almost the full page content as HTML, which is pretty big. mobile-sections is currently the second-largest table in Cassandra, so we will not be able to accommodate mobile-html as well within our current storage capacity; we will need to be creative.

Some questions:

  1. What's the rollout plan?
  2. Do we even need storage? How quick would the proposed transforms be? What are the performance numbers?
  3. How many of the transforms are shared between the current mobile-sections endpoints and the new mobile-html endpoint? Is it possible to generate one from the other much more quickly than from full Parsoid HTML? This is the most important question. Given that clients don't upgrade right away, we're possibly looking at many months or years of supporting both simultaneously, and storing both mobile-sections and mobile-html is prohibitively expensive.

Event Timeline

Pchelolo created this task.
bearND renamed this task from Storage for content-html endpoint to Storage for mobile-html endpoint. Jul 26 2018, 3:11 PM
bearND updated the task description.

Renamed content-html -> mobile-html to reflect reality.

  1. Roll-out plan

I think storage needs would realistically start in September at the earliest. There is still a lot of work to be done on the Android side to adapt to this and test this out:

  • investigate changing the loading behavior in the Android app to use the new endpoint (direct HTML payload) instead of the current approach (a two-step process of JSON payloads shoved over the JS bridge)
  • look into caching for offline use in the app. When saving a page for offline reading, the app might now also want to download a few other endpoints (/page/metadata, /page/references, /page/media) for the native side. We're also considering adding versioning to the CSS and JS endpoint URLs, since those need to be stored in the app in a versioned fashion as well.

We might still adjust the mobile-html endpoint a bit during this phase as needed, according to feedback from app devs and designers. We'll also try to get the iOS app using the PCS endpoints.

  2. Do we even need storage?

I don't know yet.

  3. How much of the transforms are shared

Some of the transforms are shared, but a good portion of the transformations in the new mobile-html endpoint are new. They come from the wikimedia-page-library.
To me it would make more sense to try to generate mobile-sections from mobile-html. The problem right now, though, is that mobile-html also applies the LazyLoadingTransform to the important images of a page. To realistically reuse mobile-html for mobile-sections, we would have to stop doing that in PCS and push it back to the clients.

Given that clients don't upgrade right away, we're possibly looking at many months or years of supporting both simultaneously, and storing both mobile-sections and mobile-html is prohibitively expensive.

I think we probably don't need to keep storage around for mobile-sections that long. In the end we might also be able to change the Android app remote config to instruct older versions of the app to fall back to action=mobileview. A few months of overlap would be nice if possible. If not, we could probably switch over to storing mobile-html when the Android app releases the new loading mechanism to the production app.
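
To make the fallback idea concrete, here is a minimal sketch of how a client could pick its content source from a remote config. Everything in it is an assumption for illustration: the `RemoteConfig` fields, the integer version numbers, and the `choose_content_source` helper are hypothetical and are not taken from the Android app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteConfig:
    """Hypothetical remote-config payload; field names are illustrative only."""
    # First app build that ships the new mobile-html loading mechanism.
    min_version_for_mobile_html: int
    # Set to False once mobile-sections storage has been retired.
    mobile_sections_stored: bool


def choose_content_source(app_version: int, config: RemoteConfig) -> str:
    """Return the name of the API an app build should load page content from."""
    if app_version >= config.min_version_for_mobile_html:
        # Builds with the new loading mechanism use the direct HTML payload.
        return "mobile-html"
    if config.mobile_sections_stored:
        # Overlap period: older builds keep using the stored mobile-sections.
        return "mobile-sections"
    # After the switch-over, older builds are told to fall back.
    return "action=mobileview"
```

With this shape, ending the overlap period would only require flipping `mobile_sections_stored` in the config; older builds would then move to action=mobileview without an app update.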

Good summary.

I think we probably don't need to keep storage around for mobile-sections that long. In the end we might also be able to change the Android app remote config to instruct older versions of the app to fall back to action=mobileview. A few months of overlap would be nice if possible. If not, we could probably switch over to storing mobile-html when the Android app releases the new loading mechanism to the production app.

That sounds like something to investigate as it could be very interesting.

The blocker is Android, since it is the consumer of mobile-sections; iOS uses mobileview.

As such, Android will need to switch first: if iOS switches first, we will bloat the storage, as each platform will be using a different REST service.

Moved it back to the backlog, as we're not doing anything with this right now beyond having a discussion.