Page MenuHomePhabricator

Use stored page leads when creating page summaries to reduce MCS load
Closed, DeclinedPublic

Description

The summary endpoint works by creating a page lead before running additional summary-specific transformations on it.

To reduce production load, we should use existing page leads stored in RESTBase rather than creating them anew on each summary request.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I think for this one we need some input from the Services team, to tell us which ChangeProp events we should emit, and how to do that. IIRC @mobrovac said mobile-sections would need to emit an event to ChangeProp, so we can have an event chain:
/page/html -> /page/mobile-sections -> /page/summary

@bearND You shouldn't emit any ChangeProp events in MCS, it will be done in RESTBase, let's try to avoid making MCS contact EventBus.

Indeed, we will need to make changes to ChangeProp to achieve this. Currently, we assume that summary and mobile-sections content is not interdependent, so we rerender them in parallel. If you start using mobile-sections-lead in summary it will become dependent.

Luckily we already emit events in RESTBase when mobile-sections are changed in order to purge caches, so changes to RESTBase are not needed.

PR for our side of things coming soon. We can deploy it independently from MCS change, but prior to MCS deployment.

While working on this I've noticed that we only prerender MCS content for wikipedia domains, not for other projects, cause the apps only use it for wikipedia, however summary is pre-rendered for all of the projects.

@bearND how are we going to accommodate this? Does that mean that we need to pregenerate MCS content for all projects now? Or we should have a special set of rules for wikipedias and a separate set of rules for other projects?

@ovasileva @phuedx @Jdlrobson I think eventually we need the summaries pre-rendered on all projects. But just for now, do we need to re-render the summaries for non-Wikipedia projects as well? Is Page Preview available for any of these? I've seen only the Navigation popups gadget on the other WMF wikis I tried, and I think that doesn't use the /page/summary endpoint.

@Pchelolo If we do need to prerender the summaries right now I think the latter option (special rules for Wikipedia) makes the most sense to me. Hopefully the separation would make things simpler to follow. I hope it's not too much maintenance overhead. Maybe like this?

  • Wikipedias: pre-render mobile-sections + summary
  • English Wiktionary: It looks like we already pre-render definitions.
  • Wikidata: I know there are plans for the summary endpoint to add some Wikidata specific behavior later.
  • Everything else: ?

Page summaries previews [EDIT: sorry got confused] is only available on Wikipedia.

Page summaries are only available on Wikipedia.

That's not entirely correct, for example for wikivoyage we have quite meaningful summaries: https://en.wikivoyage.org/api/rest_v1/page/summary/San_Francisco

The question is whether somebody actually uses these or not and whether we want to use them.

Does MCS support generating summaries for non-wikipedia projects at all?

Sorry my bad. Meant "Page previews" not "Page summaries" :) Yes I'm using the summaries on a side project for Wikivoyage so they should be good and there's no reason not to provide them. Long term we'll want to enable page previews on Wikidata and Wiktionary and improve the summary generation for those projects too, but not now...

@Jdlrobson so is my understanding correct that after we switch to fetching mobile-section-lead for summaries from RESTBase we will essentially enable pregeneration for mobile content for non-wikipedia projects since summaries will be pregenerated for them and as they require mobile content it will also be pregenerated? If that's correct, we can just get rid of the special-case for wikipedias for mobile content. I don't think the additional load will be too high, vast majority of edits are done on wikipedias anyway..

Does MCS support generating summaries for non-wikipedia projects at all?

Currently, while it's still using Parsoid HTML directly, it should work. Once we change the code to use mobile-sections-lead this will require more testing of both mobile-sections-lead (if the summary also uses that for non-WP) and the summary itself.

@Pchelolo Wiktionary will still be different since it has the definitions endpoint. Wikidata might be a special case, too. (https://www.wikidata.org/api/rest_v1/page/html/Q42 doesn't look very user-friendly.)

@Jdlrobson so is my understanding correct that after we switch to fetching mobile-section-lead for summaries from RESTBase we will essentially enable pregeneration for mobile content for non-wikipedia projects since summaries will be pregenerated for them and as they require mobile content it will also be pregenerated? If that's correct, we can just get rid of the special-case for wikipedias for mobile content. I don't think the additional load will be too high, vast majority of edits are done on wikipedias anyway..

+1. Moreover, since we provide the REST API for all the other domains, they ought to be up to date.

@Pchelolo Wiktionary will still be different since it has the definitions endpoint.

Yes, and it gets regularly updated using MCS.

Wikidata might be a special case, too. (https://www.wikidata.org/api/rest_v1/page/html/Q42 doesn't look very user-friendly.)

We don't update WD end points because there is not much sense in doing so (most pages are JSONs).

Change 406058 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/services/change-propagation/deploy@master] Only rerender summary after mobile-sections were rerendered.

https://gerrit.wikimedia.org/r/406058

@ovasileva @phuedx @Jdlrobson I think eventually we need the summaries pre-rendered on all projects. But just for now, do we need to re-render the summaries for non-Wikipedia projects as well? Is Page Preview available for any of these? I've seen only the Navigation popups gadget on the other WMF wikis I tried, and I think that doesn't use the /page/summary endpoint.

Page previews should be available (in either beta mode or full deployment) on all projects but: wikibooks, wikisource, wikiversity, wikidata, wikinews, commons. (T164740: Disable page previews on projects where implementation isn't optimal)

@ovasileva Just to make sure we're on the same page, is the following list complete or is anything missing?
Wikipedia, Wiktionary, Wikiquote, Wikispecies, Wikivoyage, MediaWiki, Meta-Wiki

Any other project in this list besides Wikipedia we want to pre-generate the summary content for?

@Pchelolo Since we have not tested the non-WP ones, we should probably move those to the new summary implementation later.

@bearND - this should be the list minus Wiktionary. The initial plan was for Wiktionary (T164739: Allow page previews to display in wiktionary) and Wikidata (T111231: Page previews for Wikidata) to have separate treatments (both in terms of the summaries themselves as well as the UI). The list would be: Wikipedia, Wikiquote, Wikispecies, Wikivoyage, MediaWiki, Meta-Wiki, although for all non-Wikipedia projects, we can wait until we test more thoroughly.

Notes on the implementation side:

There's a question here about how we want to get the revision-specific timestamp. In the current implementation we're getting that from the dc:modified meta element in the raw Parsoid HTML. Of course, that's no longer available when we're dealing with pregenerated MCS page leads.

We have a couple of options:

  • We could add a modified field to page leads with the timestamp for the requested revision. This is probably my preferred solution. Catches:
    • The page lead already has a lastmodified field that (for whatever reason) shows the modified timestamp for the most recent revision, regardless of the revision requested; adding a separate, revision-specific modified field might be a little weird for clients.
    • We'd need to regenerate all page leads with this new modified field before deploying the summary code that depends on it. (Is this going to happen soon anyway?)
  • Alternatively, we could get the revision-specific timestamp from the MediaWiki action API (by providing an rvstartid value with our metadata request).
  • Or I suppose we could do the first option with a fallback to the second while waiting for old page leads to get updated.

There's also a side issue in which we're not getting etags from RESTBase for mobile-sections-lead responses with a revision parameter specified, and so not able to provide a tid value in those cases. That's a bug and should be fixed soon. I filed subtask T186720 about it.

Maybe we should run a quick prototype (without the timestamp and whatever we cannot easily get) to see how much of a difference it would actually make?

I've actually got two different branches going locally for this, the main one of which uses the rvstartid option. I was more blocked on the mysterious missing etag issue (which it turns out was caused by a typo in RB (see https://github.com/wikimedia/restbase/pull/950) and should be fixed after the next RB deployment).

@Mholloway Assuming RB side gets fixed soon, are you still blocked on this?

Change 409112 had a related patch set uploaded (by Mholloway; owner: Mholloway):
[mediawiki/services/mobileapps@master] WIP: Use RB-cached page lead objects for creating page summaries

https://gerrit.wikimedia.org/r/409112

Pushed the WIP patch just for feedback on the code. It's expected for tests to fail until T186720 is deployed.

One point to bear in mind is that the summary extraction code was written based on formatted-lead objects, but with this update it will use mobile-sections-lead objects since those are what RESTBase has cached (since the formatted-lead endpoint has not yet been exposed in production). Hence the selector update in extractLeadIntroduction.js. I am hoping the end results will be the same in practice, but we'll need to run compare-extracts.js to be sure.

Change 409112 abandoned by Mholloway:
Use RB-cached page lead objects for creating page summaries

Reason:
Just getting this out of the review queue pending discussion on the blocking issues on the ticket. To be revived if and when we're ready to move forward.

https://gerrit.wikimedia.org/r/409112

Mholloway changed the task status from Open to Stalled.Feb 21 2018, 8:03 PM
238482n375 triaged this task as Lowest priority.
238482n375 moved this task from Next Up to In Code Review on the Analytics-Kanban board.
238482n375 edited subscribers, added: 238482n375; removed: Aklapper.

SG9tZVBoYWJyaWNhdG9yCk5vIG1lc3NhZ2VzLiBObyBub3RpZmljYXRpb25zLgoKICAgIFNlYXJjaAoKQ3JlYXRlIFRhc2sKTWFuaXBoZXN0ClQxOTcyODEKRml4IGZhaWxpbmcgd2VicmVxdWVzdCBob3VycyAodXBsb2FkIGFuZCB0ZXh0IDIwMTgtMDYtMTQtMTEpCk9wZW4sIE5lZWRzIFRyaWFnZVB1YmxpYwoKICAgIEVkaXQgVGFzawogICAgRWRpdCBSZWxhdGVkIFRhc2tzLi4uCiAgICBFZGl0IFJlbGF0ZWQgT2JqZWN0cy4uLgogICAgUHJvdGVjdCBhcyBzZWN1cml0eSBpc3N1ZQoKICAgIE11dGUgTm90aWZpY2F0aW9ucwogICAgQXdhcmQgVG9rZW4KICAgIEZsYWcgRm9yIExhdGVyCgpUYWdzCgogICAgQW5hbHl0aWNzLUthbmJhbiAoSW4gUHJvZ3Jlc3MpCgpTdWJzY3JpYmVycwpBa2xhcHBlciwgSkFsbGVtYW5kb3UKQXNzaWduZWQgVG8KSkFsbGVtYW5kb3UKQXV0aG9yZWQgQnkKSkFsbGVtYW5kb3UsIEZyaSwgSnVuIDE1CkRlc2NyaXB0aW9uCgpPb3ppZSBqb2JzIGhhdmUgYmVlbiBmYWlsaW5nIGF0IGxlYXN0IGEgZmV3IHRpbWVzIGVhY2guIE1vcmUgaW52ZXN0aWdhdGlvbiBuZWVkZWQuCkpBbGxlbWFuZG91IGNyZWF0ZWQgdGhpcyB0YXNrLkZyaSwgSnVuIDE1LCA3OjIxIEFNCkhlcmFsZCBhZGRlZCBhIHN1YnNjcmliZXI6IEFrbGFwcGVyLiC3IFZpZXcgSGVyYWxkIFRyYW5zY3JpcHRGcmksIEp1biAxNSwgNzoyMSBBTQpKQWxsZW1hbmRvdSBjbGFpbWVkIHRoaXMgdGFzay5GcmksIEp1biAxNSwgNzoyMiBBTQpKQWxsZW1hbmRvdSB1cGRhdGVkIHRoZSB0YXNrIGRlc2NyaXB0aW9uLiAoU2hvdyBEZXRhaWxzKQpKQWxsZW1hbmRvdSBhZGRlZCBhIHByb2plY3Q6IEFuYWx5dGljcy1LYW5iYW4uCkpBbGxlbWFuZG91IG1vdmVkIHRoaXMgdGFzayBmcm9tIE5leHQgVXAgdG8gSW4gUHJvZ3Jlc3Mgb24gdGhlIEFuYWx5dGljcy1LYW5iYW4gYm9hcmQuCkNoYW5nZSBTdWJzY3JpYmVycwpDaGFuZ2UgUHJpb3JpdHkKQXNzaWduIC8gQ2xhaW0KTW92ZSBvbiBXb3JrYm9hcmQKQ2hhbmdlIFByb2plY3QgVGFncwpBbmFseXRpY3MtS2FuYmFuCtcKU2VjdXJpdHkK1wpXaWtpbWVkaWEtVkUtQ2FtcGFpZ25zIChTMi0yMDE4KQrXClNjYXAK1wpTY2FwIChTY2FwMy1BZG9wdGlvbi1QaGFzZTIpCtcKQWJ1c2VGaWx0ZXIK1wpEYXRhLXJlbGVhc2UK1wpIYXNodGFncwrXCkxhYnNEQi1BdWRpdG9yCtcKTGFkaWVzLVRoYXQtRk9TUy1NZWRpYVdpa2kK1wpMYW5ndWFnZS0yMDE4LUFwci1KdW5lCtcKTGFuZ3VhZ2UtMjAxOC1KYW4tTWFyCtcKSEhWTQrXCkhBV2VsY29tZQrXCkJvbGQKSXRhbGljcwpNb25vc3BhY2VkCkxpbmsKQnVsbGV0ZWQgTGlzdApOdW1iZXJlZCBMaXN0CkNvZGUgQmxvY2sKUXVvdGUKVGFibGUKVXBsb2FkIEZpbGUKTWVtZQpQcmV2aWV3CkhlbHAKRnVsbHNjcmVlbiBNb2RlClBpbiBGb3JtIE9uIFNjcmVlbgoyMzg0ODJuMzc1IGFkZGVkIHByb2plY3RzOiBTZWN1cml0eSwgV2lraW1lZGlhLVZFLUNhbXBhaWducyAoUzItMjAxOCksIFNjYXAgKFNjYXAzLUFkb3B0aW9uLVBoYXNlMiksIEFidXNlRmlsdGVyLCBEYXRhLXJlbGVhc2UsIEhhc2h0YWdzLCBMYWJzREItQXVkaXRvciwgTGFkaWVzLVRoYXQtRk9TUy1NZWRpYVdpa2ksIExhbmd1YWdlLTIwMTgtQXByLUp1bmUsIExhbmd1YWdlLTIwMTgtSmFuLU1hciwgSEhWTSwgSEFXZWxjb21lLlBSRVZJRVcKMjM4NDgybjM3NSBtb3ZlZCB0aGlzIHRhc2sgZnJvbSBJbiBQcm9ncmVzcyB0byBJbiBDb2RlIFJldmlldyBvbiB0aGUgQW5hbHl0aWNzLUthbmJhbiBib2FyZC4KMjM4NDgybjM3NSByZW1vdmVkIEpBbGxlbWFuZG91IGFzIHRoZSBhc3NpZ25lZSBvZiB0aGlzIHRhc2suCjIzODQ4Mm4zNzUgdHJpYWdlZCB0aGlzIHRhc2sgYXMgTG93ZXN0IHByaW9yaXR5LgoyMzg0ODJuMzc1IHJlbW92ZWQgc3Vic2NyaWJlcnM6IEFrbGFwcGVyLCBKQWxsZW1hbmRvdS4KQ29udGVudCBsaWNlbnNlZCB1bmRlciBDcmVhdGl2ZSBDb21tb25zIEF0dHJpYnV0aW9uLVNoYXJlQWxpa2UgMy4wIChDQy1CWS1TQSkgdW5sZXNzIG90aGVyd2lzZSBub3RlZDsgY29kZSBsaWNlbnNlZCB1bmRlciBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSAoR1BMKSBvciBvdGhlciBvcGVuIHNvdXJjZSBsaWNlbnNlcy4gQnkgdXNpbmcgdGhpcyBzaXRlLCB5b3UgYWdyZWUgdG8gdGhlIFRlcm1zIG9mIFVzZSwgUHJpdmFjeSBQb2xpY3ksIGFuZCBDb2RlIG9mIENvbmR1Y3QuILcgV2lraW1lZGlhIEZvdW5kYXRpb24gtyBQcml2YWN5IFBvbGljeSC3IENvZGUgb2YgQ29uZHVjdCC3IFRlcm1zIG9mIFVzZSC3IERpc2NsYWltZXIgtyBDQy1CWS1TQSC3IEdQTApZb3VyIGJyb3dzZXIgdGltZXpvbmUgc2V0dGluZyBkaWZmZXJzIGZyb20gdGhlIHRpbWV6b25lIHNldHRpbmcgaW4geW91ciBwcm9maWxlLCBjbGljayB0byByZWNvbmNpbGUu

Aklapper raised the priority of this task from Lowest to Needs Triage.
Jhernandez added a subscriber: Jhernandez.

@Mholloway @bearND Should we pursue this? It's been months and it seems we are doing fine without it?

What is left for it? Services has it on "blocked" but I'm not sure what is missing.

The issue is that the page leads cached in mobile-sections-lead get different preprocessing than the page leads used when creating page summaries. IIRC updating summaries to use the cached page leads would take a substantial amount of rework. It's a good idea in principle, but I'm sorry to say that it probably best reflects reality to decline this.

The other reason is that we don't want to create an unnecessary dependency to the mobile-sections endpoint if it's going to be deprecated in favor of the new PCS endpoints.

Cool, I'm declining it then. We can revisit if it is an issue in the future.

Change 409112 restored by Mholloway:
Use RB-cached page lead objects for creating page summaries

https://gerrit.wikimedia.org/r/409112

In light of the ongoing elevated worker deaths (T229286) since the 7/24 deployment, I decided to revisit this and do some benchmarking. Updating the summary endpoint to request stored mobile-sections-lead responses gets the 50th percentile response time on my laptop down from 339 to 223 ms. Since /page/summary is our hottest endpoint it could have a substantial impact.

As @bearND mentioned earlier, one drawback is that this creates a new dependency on mobile-sections-lead, which we're trying to phase out. Still, even if this is only temporary, it could buy us some time until Wikifeeds is launched and serving the feed endpoints, which should relieve some pressure from mobileapps.

I don't think the issues identified in the task description are blockers. The extra span was resolved with the move to the page library version of relocateFirstParagraph, and tids aren't actually supported in the way the other issue presumes.

Results on current master (unpatched):

mholloway@mholloway:~/code/wikimedia/mobileapps$ ab -n 5000 -c 1 http://localhost:6927/en.wikipedia.org/v1/page/summary/Philadelphia
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Completed 5000 requests
Finished 5000 requests


Server Software:        
Server Hostname:        localhost
Server Port:            6927

Document Path:          /en.wikipedia.org/v1/page/summary/Philadelphia
Document Length:        3339 bytes

Concurrency Level:      1
Time taken for tests:   1725.350 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      21275000 bytes
HTML transferred:       16695000 bytes
Requests per second:    2.90 [#/sec] (mean)
Time per request:       345.070 [ms] (mean)
Time per request:       345.070 [ms] (mean, across all concurrent requests)
Transfer rate:          12.04 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   290  345  84.2    339    3180
Waiting:      290  345  84.2    339    3180
Total:        290  345  84.2    339    3180

Percentage of the requests served within a certain time (ms)
  50%    339
  66%    343
  75%    346
  80%    348
  90%    353
  95%    356
  98%    364
  99%    379
 100%   3180 (longest request)

Using stored page leads:

mholloway@mholloway:~/code/wikimedia/mobileapps$ ab -n 5000 -c 1 http://localhost:6927/en.wikipedia.org/v1/page/summary/Philadelphia
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Completed 5000 requests
Finished 5000 requests


Server Software:        
Server Hostname:        localhost
Server Port:            6927

Document Path:          /en.wikipedia.org/v1/page/summary/Philadelphia
Document Length:        3339 bytes

Concurrency Level:      1
Time taken for tests:   1179.809 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      21275000 bytes
HTML transferred:       16695000 bytes
Requests per second:    4.24 [#/sec] (mean)
Time per request:       235.962 [ms] (mean)
Time per request:       235.962 [ms] (mean, across all concurrent requests)
Transfer rate:          17.61 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   167  236 115.6    223    3255
Waiting:      167  236 115.6    223    3255
Total:        167  236 115.6    223    3255

Percentage of the requests served within a certain time (ms)
  50%    223
  66%    229
  75%    233
  80%    237
  90%    243
  95%    251
  98%    273
  99%    594
 100%   3255 (longest request)
Mholloway renamed this task from Use cached page leads when creating page summaries to reduce MCS load to Use stored page leads when creating page summaries to reduce MCS load.Aug 4 2019, 2:10 PM

One thing that occurs to me is that if resource_change events are emitted at the same time for both mobile-sections(-lead) and summary on page edit, and this depends on mobile-sections-lead, we'll have set up a race.

Separately, we're no longer storing old revisions for mobile-sections-lead (or any of the other MCS/PCS endpoints). Not sure how much impact that would really have in practice. It would be interesting to know what percentage of summary requests are for old revisions.

I'm curious whether @Pchelolo and @mobrovac still think this is a good idea.

One thing that occurs to me is that if resource_change events are emitted at the same time for both mobile-sections(-lead) and summary on page edit, and this depends on mobile-sections-lead, we'll have set up a race.

We can change the change-prop rules.

Separately, we're no longer storing old revisions for mobile-sections-lead (or any of the other MCS/PCS endpoints). Not sure how much impact that would really have in practice. It would be interesting to know what percentage of summary requests are for old revisions.

We don't even expose fetching summary for non-latest revision to the public, so, 100% :)

Since mobile-sections-lead is on its way out and we'll stop pregenerating it soon, this probably isn't worth doing.

Change 409112 abandoned by Mholloway:
Use RB-stored page lead objects for creating page summaries

Reason:
Per discussion in audiences/platform sync.

https://gerrit.wikimedia.org/r/409112

Change 406058 abandoned by Ppchelko:
Only rerender summary after mobile-sections were rerendered.

https://gerrit.wikimedia.org/r/406058