
Pre-generate mobile app content end points
Closed, ResolvedPublic

Description

The Android team is currently rolling out RESTBase content loading to the beta channel. To accurately gauge performance, we should pre-generate mobile content accessed by the app, so that clients don't need to wait for relatively slow on-demand content massaging.

The main concerns with doing so are

a) load created on the content service, and
b) storage needs in RESTBase while the conversion to the multi-instance setup is not finished yet (T95253).

With the performance gains & reduced storage usage from the upgrade to Cassandra 2.1.12 (T120803) I'm quite confident that we can manage b). Based on past dump sizes, I would expect at most 150G of extra storage usage per node once storage is completely filled. In the worst case, we could delete all stored mobile apps content to free up space quickly.

For a), my information is that we currently have two instances of the content service deployed. Two instances will likely be a bit tight for processing about 100 updates per second, so we should ramp up carefully and see how it does in practice. I would propose starting with pre-generating 10% of requests, and then gradually ramping up.

@mobrovac, @Pchelolo, @Eevans, @fgiunchedi: Does this sound like a good plan to you?

Event Timeline

GWicke raised the priority of this task from to High.
GWicke updated the task description. (Show Details)

I'm not sure how you plan to pre-generate 10% of requests short of a time machine. By the time a page is requested, those 10% should have already been pre-generated.

I would like to have some predictable portion of pages pre-generated. Ideally, the top 1000 pages of enwiki. Or maybe, if the former is too complicated, pre-generate everything on a small wiki. While this would not have an immediate positive impact, we could at least use it for performance testing.

@bearND: Pre-generation is about processing an event stream of edited pages. Pre-generating 10% of pages means processing 10% of those edits right after they happen, and leaves 90% to be generated on client request.
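To make the sampling idea concrete, here is a minimal sketch of how a fraction of an edit event stream could be selected for pre-generation. The function names, event fields, and hash-bucketing approach are illustrative assumptions, not RESTBase's actual implementation; the 10% figure comes from the proposal above.

```python
import hashlib

def should_pregenerate(title: str, fraction: float) -> bool:
    """Deterministically select a stable `fraction` (0.0-1.0) of titles."""
    digest = hashlib.md5(title.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash onto [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < fraction

def handle_edit_event(event: dict, fraction: float = 0.1) -> str:
    """Decide, per edit event, whether to render mobile sections now."""
    if should_pregenerate(event["title"], fraction):
        return "pre-generate"   # render and store right after the edit
    return "on-demand"          # leave for generation on client request
```

Hashing the title (rather than random sampling) keeps the decision stable across events, so the same 10% of pages stay warm while the fraction is ramped up.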

Thanks for the explanation. That makes more sense. It was a bit confusing earlier since you said 10% of requests, which implies incoming requests.

We are currently waiting for restbase1004 to finish decommissioning in order to gauge the disk space used by this. If that shows enough spare space, we can start to ramp up pre-generation.

The dashboard at https://grafana.wikimedia.org/dashboard/db/mobileapps shows a median response time of 200ms from the content service. Some of this will be spent waiting for API requests, so it isn't completely impossible that we could sustain 100 req/s with just the two instances currently deployed.
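A back-of-envelope check of that claim, using Little's law (concurrency = arrival rate × service time); the numbers are the median latency and request rate mentioned above, and everything else is a rough assumption:

```python
# Rough capacity estimate for the content service, assuming the 200 ms
# median from the dashboard and the proposed ~100 updates/second.
req_per_sec = 100
service_time_s = 0.2       # median response time from the dashboard
instances = 2

# Little's law: average number of requests in flight at steady state.
concurrent_requests = req_per_sec * service_time_s   # 20 in flight
per_instance = concurrent_requests / instances       # 10 per instance
```

Ten concurrent requests per instance is not outlandish for an I/O-bound service, since much of the 200 ms is spent waiting on upstream API calls rather than on CPU work, which is why sustaining the load with two instances "isn't completely impossible".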

PR #471 enables pre-generation. Scheduled for deploy today.

is 10% of requests still on the table? especially re: storage space needs

@fgiunchedi, we have delayed pre-generation for quite a while now, and the next step is switching the stable app channel to this end point. This means that client requests would trigger generating those articles anyway at a fairly rapid pace. We might as well give users the improved performance we are aiming for.

Decommissioning nodes is not an option until more SSDs are installed, but in normal operation there is sufficient headroom to accommodate the expected extra storage, likely less than 100G per node. Currently, all mobileapps content uses about 20G per node.

We have done a complete dump of en.wiki and pre-generation has been active for a week now. Calling it a win.

Here is a snapshot illustrating the effect of pre-generating mobile section content on lead section request latency, from the mobileapps API dashboard:

[Screenshot: lead section request latency, from the mobileapps API dashboard]

To clarify, the load time of the mobile-sections-remaining route has remained constant because of the way RESTBase gets the content from MCS: when a request for mobile-sections-lead arrives for a title not present in storage, RESTBase requests the whole document (i.e. it calls MCS' mobile-sections end point) and stores the two chunks separately. Since clients always request the lead section first, the subsequent request for the remainder of the document is served straight from storage.