Maniphest T207676

Parallelize scap deployment of WDQS
Open, High · Public · 5 Estimated Story Points

Assigned To

None

Authored By

	• Mathew.onipe
	Oct 22 2018, 5:45 PM

Description

We noticed that we have recently been overshooting the WDQS deployment window, due both to testing and to the non-parallel deployment of WDQS via scap. We don't want to restart more than one server at a time within a single cluster, so that enough capacity remains to serve all the traffic. But we can restart one server from each cluster at the same time (public / internal and eqiad / codfw).

Related Objects

Mentioned In: T252124: Scap configuration for WDQS should get server groups from a known source or truth

Event Timeline

• Mathew.onipe created this task. Oct 22 2018, 5:45 PM

Restricted Application added a project: Wikidata. Oct 22 2018, 5:45 PM

Restricted Application added a subscriber: Aklapper.

Smalyshev moved this task from Incoming to Operations/SRE on the Wikidata-Query-Service board. Oct 25 2018, 12:01 AM

I think we can parallelize, but we should do it in a smart way, so that no more than one server out of the 3 in each cluster is restarted at the same time. But we can restart one in eqiad and one in codfw at the same time, and likewise one internal and one public. So we could do 4 servers at once instead of one.
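A minimal sketch of that grouping, assuming four clusters of three servers each. The cluster keys and host names below are hypothetical placeholders, not the real inventory, and this is not how scap itself groups servers; it only illustrates the "at most one server per cluster in flight" rule.

```python
from itertools import zip_longest

# Hypothetical inventory: 4 clusters (eqiad/codfw x public/internal), 3 hosts each.
CLUSTERS = {
    "eqiad-public": ["eqiad-pub-1", "eqiad-pub-2", "eqiad-pub-3"],
    "eqiad-internal": ["eqiad-int-1", "eqiad-int-2", "eqiad-int-3"],
    "codfw-public": ["codfw-pub-1", "codfw-pub-2", "codfw-pub-3"],
    "codfw-internal": ["codfw-int-1", "codfw-int-2", "codfw-int-3"],
}


def plan_waves(clusters):
    """Group hosts into waves that contain at most one host per cluster."""
    waves = []
    # zip_longest takes the n-th host of every cluster; shorter clusters yield None.
    for group in zip_longest(*clusters.values()):
        waves.append([host for host in group if host is not None])
    return waves


waves = plan_waves(CLUSTERS)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {wave}")
```

With this layout the 12 servers are restarted in 3 waves of 4 rather than in 12 sequential steps, and every cluster keeps 2 of its 3 servers in service during each wave.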

Smalyshev triaged this task as Medium priority. Jan 3 2019, 1:23 AM

Smalyshev edited projects, added Discovery-Wikidata-Query-Service-Sprint; removed Discovery-Search (Current work). Jan 29 2019, 10:40 PM

• Mathew.onipe claimed this task. May 28 2019, 5:26 PM

Gehel removed a project: Discovery-Wikidata-Query-Service-Sprint. Jun 25 2019, 5:26 PM

Removing assignee @Mathew.onipe as the user does not seem to be active anymore.

Gehel renamed this task from "Increase deployment window of wdqs or parallelize scap deployment" to "Parallelize scap deployment of WDQS". Aug 11 2020, 7:25 PM

Gehel updated the task description.

• Zbyszko mentioned this in T252124: Scap configuration for WDQS should get server groups from a known source or truth. Sep 28 2020, 8:55 AM

Gehel raised the priority of this task from Medium to High. Aug 26 2021, 1:14 PM

Gehel moved this task from Operations/SRE to Current work on the Wikidata-Query-Service board. Aug 26 2021, 1:24 PM

Gehel added a project: Discovery-Search (Current work).

This needs to be discussed with the rel-eng team before re-estimating and starting implementation.

Note that if adding support in Scap is too complex, it might make sense to implement deployment as cookbooks instead.

I'll talk to rel-eng to see what scap changes are needed to parallelize between groups (wdqs eqiad public vs wdqs eqiad internal, etc.).

There's a chance it might be worth relying on a cookbook for the rolling restart. Basically we'd use scap to get the new code in place and a cookbook to do the actual rolling restarts that pick up the changes. But for now I'd assume we'll just be changing it in scap-land and not introducing a cookbook.
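For illustration only, the two-phase idea could look roughly like the sketch below. It is neither scap nor cookbook code: `sync_code` and `restart` are hypothetical callables standing in for whatever actually ships the code and restarts a host, and `restart` is assumed to block until the host is serving again.

```python
from concurrent.futures import ThreadPoolExecutor


def rolling_restart(cluster, hosts, restart):
    """Restart the hosts of one cluster strictly one after another."""
    done = []
    for host in hosts:
        restart(host)  # assumed to return only once the host is back in service
        done.append(host)
    return cluster, done


def deploy(clusters, sync_code, restart):
    """Phase 1: sync code to every host. Phase 2: rolling restart per cluster."""
    all_hosts = [host for hosts in clusters.values() for host in hosts]

    # Phase 1: putting the new code in place restarts nothing, so all hosts go at once.
    with ThreadPoolExecutor(max_workers=len(all_hosts)) as pool:
        list(pool.map(sync_code, all_hosts))

    # Phase 2: one worker per cluster, so clusters proceed in parallel while
    # each cluster never has more than one host restarting.
    with ThreadPoolExecutor(max_workers=len(clusters)) as pool:
        futures = [
            pool.submit(rolling_restart, name, hosts, restart)
            for name, hosts in clusters.items()
        ]
        return dict(future.result() for future in futures)
```

Keeping the restart phase separate means the per-cluster serialization lives in one place, whichever tool ends up owning it.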

MPhamWMF set the point value for this task to 5. Sep 13 2021, 3:29 PM

MPhamWMF moved this task from Incoming to Ready for Dev -- SWE on the Discovery-Search (Current work) board.

Gehel removed a project: Discovery-Search (Current work). Feb 16 2022, 8:54 PM

Gehel moved this task from Current work to Operations/SRE on the Wikidata-Query-Service board.
