
Make WDQS active / active
Closed, ResolvedPublic

Description

Traffic is ready for active / active applications and WDQS is ready to be active / active, so we should do it.

Before sending traffic to codfw, we need to reimport all data as it seems that codfw is lagging behind. As there is no traffic yet, we can reimport all servers in parallel.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Gehel triaged this task as High priority. Apr 4 2017, 5:45 AM

Change 346543 had a related patch set uploaded (by Gehel):
[operations/puppet@production] wdqs: active/active public interface

https://gerrit.wikimedia.org/r/346543

Mentioned in SAL (#wikimedia-operations) [2017-04-06T08:37:23Z] <gehel> shutting down wdqs codfw for data reimport - T162111

The initial import is complete; wdqs-updater has been restarted and is catching up on the changes made since the last export.

Change 346543 merged by Gehel:
[operations/puppet@production] wdqs: active/active public interface

https://gerrit.wikimedia.org/r/346543

Using the following curl to test, I don't see an entry in the nginx access log:

curl 'https://query.wikidata.org/bigdata/namespace/wdq/sparql?query=%23Streets%20without%20a%20city%0ASELECT%20%3Fstreet%20%3FstreetLabel%0AWHERE%0A%7B%0A%20%20%20%20%3Fstreet%20wdt%3AP31%2Fwdt%3AP279*%20wd%3AQ79007%20.%0A%20%20%20%20%3Fstreet%20wdt%3AP17%20wd%3AQ142%20.%0A%20%20%20%20MINUS%20%7B%20%3Fstreet%20wdt%3AP131%20%5B%5D%20%7D%20.%0A%09SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22fr%22%20%7D%0A%7D%0AORDER%20BY%20%3FstreetLabel' -H 'Accept: application/sparql-results+json' -H 'User-Agent: curl (testing/gehel)' --resolve query.wikidata.org:443:208.80.153.248
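
For readability, the URL-encoded `query` parameter in the curl command above can be decoded back into SPARQL. This is only a small sketch using Python's standard library; the variable names are illustrative and not part of any WDQS tooling.

```python
from urllib.parse import unquote

# The value of the `query` parameter from the curl command, split at the
# encoded newlines (%0A) purely for readability.
ENCODED_QUERY = (
    "%23Streets%20without%20a%20city%0A"
    "SELECT%20%3Fstreet%20%3FstreetLabel%0A"
    "WHERE%0A"
    "%7B%0A"
    "%20%20%20%20%3Fstreet%20wdt%3AP31%2Fwdt%3AP279*%20wd%3AQ79007%20.%0A"
    "%20%20%20%20%3Fstreet%20wdt%3AP17%20wd%3AQ142%20.%0A"
    "%20%20%20%20MINUS%20%7B%20%3Fstreet%20wdt%3AP131%20%5B%5D%20%7D%20.%0A"
    "%09SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20"
    "wikibase%3Alanguage%20%22fr%22%20%7D%0A"
    "%7D%0A"
    "ORDER%20BY%20%3FstreetLabel"
)

# Percent-decode the parameter to recover the SPARQL text.
decoded = unquote(ENCODED_QUERY)
print(decoded)
```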
Gehel added a subscriber: BBlack.

I'm not sure the change is effective. While I do see a few requests (other than those from pybal / icinga) in the nginx logs on the wdqs codfw servers, I don't see as many as I would expect. Also, Grafana does not show any requests for the codfw service.

@BBlack any idea on how to check this further?

@Gehel: you can check x-served-by headers in the responses - half of those should have codfw hosts there now.
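
A rough way to make that check repeatable is to collect the `x-served-by` values from a batch of responses and count them per site. The sketch below only does the counting. It assumes the header carries a backend hostname such as `wdqs2001`, and that the first digit of the host number identifies the site (1 for eqiad, 2 for codfw). The sample values are hypothetical, not captured output.

```python
import re
from collections import Counter

# Assumption: first digit of the host number encodes the site.
SITE_BY_DIGIT = {"1": "eqiad", "2": "codfw"}


def site_of(served_by):
    """Map an x-served-by value such as 'wdqs2001' to a site name."""
    match = re.search(r"[a-z](\d)\d{3}\b", served_by)
    if match is None:
        return "unknown"
    return SITE_BY_DIGIT.get(match.group(1), "unknown")


def tally_sites(header_values):
    """Count how many responses were served from each site."""
    return Counter(site_of(value) for value in header_values)


# Hypothetical header values collected from repeated requests.
sample = ["wdqs1001", "wdqs2002", "wdqs2001", "wdqs1003"]
counts = tally_sites(sample)
print(counts)
```

If traffic really is split between both sites, the eqiad and codfw counts should come out roughly equal over a large enough batch.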

The Grafana dashboard was wrongly filtering on eqiad only, which is why I did not see any traffic there. Further tests and a check of the x-cache and x-served-by headers show that traffic is indeed routed to codfw as well. All looks good!

Smalyshev claimed this task.

I think everything is fine; can we close this?