
switchdc cache warmup should include URLs that warmup relevant Wikidata caches
Open, Medium · Public

Description

s8 was struggling today after the DC switchover, and it was suggested that we add some URLs to the cache warmup for Wikidata.

(log trimmed)

07:53:09 <_joe_> marostegui: do we have dbs suffering?
07:53:15 <@marostegui> yes, s8
07:54:08 <Amir1> (once this is all done): marostegui let's talk about if there is anything on wikibase side is needed
07:56:06 <jynus> is it Wikibase\Lib\Store\Sql\Terms\DatabaseTermInLangIdsResolver::selectTermsViaJoin ?
07:56:34 <Amir1> that's termstore replacement
07:56:42 <Amir1> I think because cache is cold
07:56:48 <Amir1> (memcached for term store)
07:56:52 <jynus> mine it is just a guess because it is the only query I am seeing
07:57:04 <jynus> on the heavy hit servers
07:57:34 <Amir1> maybe for the next dc switch we should "warm it up" beforehand
07:57:41 <_joe_> Amir1: yes definitely
07:57:50 <jynus> I see a few fetchterms too
07:57:51 <@legoktm> we can add some Wikidata URLs to the warmup
07:58:03 <@marostegui> legoktm: that'd be nice indeed
07:58:20 <Amir1> legoktm: it's a common misconception: most of reads on s8 are not coming from wikidata.org
07:58:30 <Amir1> this needs some parsing
07:59:17 <@legoktm> Amir1: "URLs that warmup the relevant Wikidata paths :)"
07:59:25 <Amir1> better :D

The URLs that are currently hit as part of the warmup process are located at https://gerrit.wikimedia.org/g/operations/puppet/+/545b517cea678d8fdeadeb6051e0f3757bd4ebff/modules/mediawiki/files/maintenance/mediawiki-cache-warmup/
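As a rough illustration of what a warmup pass does, here is a minimal sketch in Python. It is not the script in the puppet repository linked above: the names `fetch_status` and `warm_up` are hypothetical, and the URLs in the usage note are placeholders. The idea is simply to request every URL in a list, in parallel, so that the caches behind those pages are populated before real traffic arrives.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Union
from urllib.request import Request, urlopen


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Request one URL and return its HTTP status code."""
    req = Request(url, headers={"User-Agent": "cache-warmup-sketch/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        resp.read()  # read the full response body before returning
        return resp.status


def warm_up(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
    workers: int = 4,
) -> Dict[str, Union[int, Exception]]:
    """Request every URL once; map each URL to its status or to the error raised."""

    def attempt(url: str) -> Union[int, Exception]:
        try:
            return fetch(url)
        except Exception as exc:  # one failing URL must not abort the whole pass
            return exc

    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result
        return dict(zip(url_list, pool.map(attempt, url_list)))
```

Calling `warm_up(["https://example.org/wiki/Some_page"])` would issue the request and report the status per URL. Which URLs actually exercise the Wikidata term store is the open question of this task, so the list is left as a placeholder.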

Event Timeline


https://grafana.wikimedia.org/d/000000548/wikibase-sql-term-storage?orgId=1&from=1631626432679&to=1631638695040


It is a bit scary that the cache hit rate dropping to 94% could bring down s8, but I guess there is no way around it.
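A quick back-of-the-envelope calculation shows why a seemingly small dip in hit rate matters: database load scales with the miss rate, not the hit rate. The sketch below assumes a baseline hit rate of 99%, which is an illustrative figure and not a value read from the dashboard; `miss_amplification` is a hypothetical helper name.

```python
def miss_amplification(hit_before: float, hit_after: float) -> float:
    """Factor by which cache misses (and so backend reads) grow
    when the hit rate falls from hit_before to hit_after."""
    if not (0.0 <= hit_after <= 1.0 and 0.0 <= hit_before < 1.0):
        raise ValueError("hit rates must be in [0, 1], and hit_before below 1")
    return (1.0 - hit_after) / (1.0 - hit_before)


# Assumed baseline of 99% (hypothetical), dropping to the 94% mentioned above.
factor = miss_amplification(0.99, 0.94)
print(f"backend reads grow by roughly {factor:.1f}x")
```

Under that assumption, the same request volume sends about six times as many queries to the database, even though the hit rate only moved by five percentage points.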

Correct me if I am wrong, but this would also be solved by this cache using WANCache rather than just a BagOStuff?

Marostegui triaged this task as Medium priority. Wed, Sep 15, 1:51 PM
Marostegui moved this task from Triage to Blocked on the DBA board.

Just to mention that this is a nice-to-have indeed, but not something that brought us down or anything close to that: it just required some load switching on the load balancer to improve latency.
We are now in a much better position than we were a few switchovers ago, as we have added additional capacity to our s8 (Wikidata) section over the last couple of years.

I don't know how much work this requires to make it happen but it is certainly something that would be a benefit especially as wikidata keeps growing and growing (yay!).

> Correct me if I am wrong, but this would also be solved by this cache using WANCache rather than just a BagOStuff?

My understanding is that WANCache only sets values in the current DC, but invalidation is relayed to both DCs. Main stash would set the values across both DCs, see https://www.mediawiki.org/wiki/Manual:Object_cache#Main_stash

> My understanding is that WANCache only sets values in the current DC, but invalidation is relayed to both DCs. Main stash would set the values across both DCs, see https://www.mediawiki.org/wiki/Manual:Object_cache#Main_stash

Right, and reading that, we probably don't want to use Main stash for this.
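To make the WANCache behaviour described above concrete, here is a toy model in Python. It is not MediaWiki's actual API: `DataCenter` and `ToyWanCache` are hypothetical names, and the only semantics modelled are the ones stated in this thread (values are set in the current DC only, invalidations are relayed to every DC). It shows why such a cache would still be cold in the other DC right after a switchover.

```python
from typing import Callable, Dict, List


class DataCenter:
    """One DC with its own local cache and a counter of database reads."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, str] = {}
        self.db_reads = 0


class ToyWanCache:
    """Toy cache front end: sets are local to one DC, deletes go to all DCs."""

    def __init__(self, local: DataCenter, all_dcs: List[DataCenter],
                 load_from_db: Callable[[str], str]) -> None:
        self.local = local
        self.all_dcs = all_dcs
        self.load_from_db = load_from_db

    def get_with_set(self, key: str) -> str:
        if key in self.local.cache:
            return self.local.cache[key]
        self.local.db_reads += 1            # cache miss: go to the database
        value = self.load_from_db(key)
        self.local.cache[key] = value       # stored in the current DC only
        return value

    def delete(self, key: str) -> None:
        for dc in self.all_dcs:             # invalidation is relayed everywhere
            dc.cache.pop(key, None)


def load_term(key: str) -> str:
    """Stand-in for the expensive term store query."""
    return f"terms-for-{key}"


dc_a, dc_b = DataCenter("dc_a"), DataCenter("dc_b")
keys = ["Q1", "Q2", "Q3"]

# Traffic served from dc_a: the second pass is answered entirely from cache.
active = ToyWanCache(dc_a, [dc_a, dc_b], load_term)
for _ in range(2):
    for key in keys:
        active.get_with_set(key)

# After the switchover, dc_b serves traffic and starts with an empty cache.
active = ToyWanCache(dc_b, [dc_a, dc_b], load_term)
for key in keys:
    active.get_with_set(key)

print(dc_a.db_reads, dc_b.db_reads)
```

In this model every key has to be reloaded from the database once in the newly active DC, which is the situation a warmup pass is meant to absorb before real traffic arrives.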