Disable Parsoid cache updates for wikis that are migrated to RESTBase
Closed, Resolved · Public

Description

We currently process cache updates for both RESTBase and Parsoid, but this is causing higher load on the Parsoid cluster than necessary. While VisualEditor is already using RESTBase for all wikipedias, there are still some users that rely on cached content from the Parsoid v1 API:

  • OCG
  • Content Translation

Kiwix is also still using the Parsoid API, but is not particularly latency-sensitive.

Details

Related Gerrit Patches:
operations/puppet : production · Remove Parsoid job runners
operations/mediawiki-config : master · Disable Parsoid cache updates

Event Timeline

GWicke raised the priority of this task from to Medium.
GWicke updated the task description.
GWicke added projects: RESTBase, Parsoid.
GWicke added a subscriber: GWicke.
Restricted Application added a subscriber: Aklapper. Mar 19 2015, 10:49 PM
GWicke moved this task from Backlog to Ready / next on the RESTBase board. Mar 21 2015, 12:01 AM

I still have a few dump processes running against the old API, in particular a WPEN dump that I really don't want stopped midway. I don't know exactly what impact this ticket might have on us, but if it is significant, I would appreciate it if this could be postponed by about 10 days.

GWicke added a comment. Edited Mar 21 2015, 7:31 PM

@Kelson, you might actually end up finishing your enwiki dump earlier if you switched to RB right now, even if you start that dump from scratch. Using a single instance of htmldumper with default settings (concurrency 50) the enwiki HTML should dump in well under a day, even with a remote connection.
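As a rough illustration of what a bounded-concurrency dump like the one described above involves, here is a minimal Python sketch. It is not htmldumper's actual implementation. The URL layout in `BASE_URL`, the helper names `html_url` and `dump_pages`, and the injected `fetch` coroutine are all assumptions made for the example; only the concurrency default of 50 comes from the comment above.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable
from urllib.parse import quote

# Assumed URL layout for RESTBase HTML; not confirmed anywhere in this task.
BASE_URL = "https://rest.wikimedia.org/en.wikipedia.org/v1/page/html/"


def html_url(title: str) -> str:
    """Build the (assumed) RESTBase HTML URL for a page title."""
    return BASE_URL + quote(title.replace(" ", "_"), safe="")


async def dump_pages(
    titles: Iterable[str],
    fetch: Callable[[str], Awaitable[str]],
    concurrency: int = 50,
) -> Dict[str, str]:
    """Fetch every title with at most `concurrency` requests in flight.

    `fetch` is a hypothetical coroutine supplied by the caller that takes a
    URL and returns the response body; any async HTTP client could back it.
    """
    semaphore = asyncio.Semaphore(concurrency)
    results: Dict[str, str] = {}

    async def worker(title: str) -> None:
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            results[title] = await fetch(html_url(title))

    await asyncio.gather(*(worker(title) for title in titles))
    return results
```

Passing the fetch function in keeps the sketch independent of any particular HTTP library and lets it be exercised without network access.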

In any case, the Parsoid V1 API will remain available for quite some time. We will, however, stop keeping its cache up to date soon, so dumps using it will become slower and will put more load on the Parsoid cluster as the share of cache misses rises.

I hear that mwoffliner is already converted to use RESTBase, so now it's only OCG and CX left to convert before we can stop updating the Parsoid caches.

Kelson added a comment. Apr 1 2015, 9:29 AM

Just a small reminder: rest.wikimedia.org is only available for Wikipedia. We still rely heavily on parsoid-lb.eqiad.wikimedia.org to create ZIM snapshots of all other WM projects.

Notice: I am aiming to switch off the old cache update jobs for wikipedias within the next two weeks.

Quick update: We are almost there. OCG is pending a deploy to the PHP extension, and cxserver might already be switched over.

Is there a schedule for migrating the other projects to rest.wikimedia.org?

The cxserver migration is pending merge and deployment of https://gerrit.wikimedia.org/r/#/c/207378/, but it should happen within the next two weeks at most.

@santhosh, thank you! That has now been merged, so it seems that only OCG is left at this point.

@Kelson, the remaining public / non-special projects will be supported once https://gerrit.wikimedia.org/r/#/c/198433/ gets merged, which I hope will happen tomorrow.

All remaining public & active wikis are added in https://gerrit.wikimedia.org/r/#/c/217431/.

Change 217434 had a related patch set uploaded (by GWicke):
Disable Parsoid cache updates

https://gerrit.wikimedia.org/r/217434

Change 217434 merged by jenkins-bot:
Disable Parsoid cache updates

https://gerrit.wikimedia.org/r/217434

GWicke added a comment. Edited Jun 11 2015, 4:29 PM

Now live:

Next steps:

Change 217550 had a related patch set uploaded (by GWicke):
Remove Parsoid job runners

https://gerrit.wikimedia.org/r/217550

Scheduled the removal of Parsoid job runners (https://gerrit.wikimedia.org/r/#/c/217550/) with @Joe for Monday, 2015/06/15.

Change 217550 merged by Ori.livneh:
Remove Parsoid job runners

https://gerrit.wikimedia.org/r/217550

GWicke closed this task as Resolved. Jun 16 2015, 9:38 PM
GWicke claimed this task.