Finalise changeprop migration to k8s
Closed, Resolved, Public

Description

To complete the migration to k8s, we need to enable in Kubernetes all of the changeprop tasks that currently run in production, bringing the two instance types into parity.

While these features are being rolled out, we should develop and document test data for them.

Before we can roll tasks out, we should:

  • implement feature toggles in the helm chart to enable easier rollout and reduce the need for continuous repackaging
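
As a rough illustration of the feature-toggle idea, here is a minimal Python sketch of how per-rule toggles might be merged and resolved. It is not the helm chart's actual implementation: the `DEFAULT_TOGGLES` map, its default values, and the `enabled_rules` helper are all hypothetical; only the rule names are taken from the task list in this description.

```python
# Hypothetical defaults, in the spirit of a helm values file.
# Rule names come from the task list; the structure and values are assumptions.
DEFAULT_TOGGLES = {
    "null_edit": True,
    "page_edit": True,
    "purge_varnish": False,
    "ores_cache": True,
}


def enabled_rules(defaults, overrides=None):
    """Apply per-deployment overrides to the defaults and return the
    names of the enabled rules in sorted order."""
    merged = dict(defaults)
    for name, value in (overrides or {}).items():
        # Reject unknown names so a typo cannot silently leave a rule off.
        if name not in merged:
            raise KeyError(f"unknown rule toggle: {name}")
        merged[name] = bool(value)
    return sorted(name for name, on in merged.items() if on)


if __name__ == "__main__":
    # Flip two toggles without touching the defaults (no repackaging needed).
    print(enabled_rules(DEFAULT_TOGGLES, {"purge_varnish": True, "page_edit": False}))
```

The point of the design is that moving a rule between instances becomes a values change rather than a new package build.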

Tasks that need to be enabled:

  • mobile-html_rerender_transclude
  • mobile-html_rerender
  • media-list_rerender_transcludes
  • media-list_rerender
  • mobile-sections_rerender_transcludes
  • mobile-sections_rerender
  • summary_definition_rerender
  • summary_definition_rerender_transcludes
  • metadata_purge
  • metadata_purge_transcludes
  • mw_purge
  • purge_varnish - k8s in prod now produces events to another topic. This currently relies on the scb changeprop instance, but as far as the Kubernetes migration is concerned, this is complete
  • null_edit
  • page_edit
  • revision_visibility_change
  • page_delete
  • page_restore
  • page_move
  • on_transclusion_update
  • page_create
  • on_backlinks_update
  • ores_cache
  • wikidata_description_on_edit
  • wikidata_description_on_undelete
  • on_wikidata_description_change
  • page_images_summary
  • page_images_mobile

We also need to:

  • build monitoring dashboards for the new instances that complement or mimic the existing dashboards we have

Finally, when ready, we need to:

  • decommission existing changeprop scb instances (blocked on revision of the purge mechanism)

Details

Project                                        Branch      Lines +/-   Subject
operations/puppet                              production  +2 -12
operations/deployment-charts                   master      +271 -252
operations/deployment-charts                   master      +272 -310
operations/deployment-charts                   master      +3 -15
mediawiki/services/change-propagation/deploy   master      +39 -39
operations/deployment-charts                   master      +243 -225
operations/deployment-charts                   master      +3 -3
operations/deployment-charts                   master      +6 -6
mediawiki/services/change-propagation/deploy   master      +63 -64
operations/deployment-charts                   master      +2 -2
operations/deployment-charts                   master      +30 -0
mediawiki/services/change-propagation/deploy   master      +23 -34
operations/deployment-charts                   master      +20 -2
operations/deployment-charts                   master      +393 -498
operations/deployment-charts                   master      +12 -9
operations/deployment-charts                   master      +238 -214
operations/deployment-charts                   master      +242 -218
operations/deployment-charts                   master      +15 -0
mediawiki/services/change-propagation/deploy   master      +52 -48
operations/deployment-charts                   master      +223 -204
mediawiki/services/change-propagation/deploy   master      +3 -3
operations/deployment-charts                   master      +242 -203
operations/deployment-charts                   master      +4 -4
operations/deployment-charts                   master      +577 -532

Event Timeline

Change 585502 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: Bump log level

https://gerrit.wikimedia.org/r/585502

Change 585502 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: Bump log level

https://gerrit.wikimedia.org/r/585502

Change 586439 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/deployment-charts@master] ChangeProp: add more metrics and deploy the latest code

https://gerrit.wikimedia.org/r/586439

Change 586439 merged by jenkins-bot:
[operations/deployment-charts@master] ChangeProp: add more metrics and deploy the latest code

https://gerrit.wikimedia.org/r/586439

daniel triaged this task as Medium priority. Apr 7 2020, 12:52 PM

Change 587276 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[mediawiki/services/change-propagation/deploy@master] changeprop: Use correct tags for null_edits

https://gerrit.wikimedia.org/r/587276

Not sure where this will live long-term, but I've hacked together a WIP tool for generating test messages for changeprop kafka topics: https://phabricator.wikimedia.org/P10934
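
For readers who want a feel for what generating such a test message could involve, below is a minimal Python sketch. It is not the tool in the paste linked above. The event layout (a `meta` block with `stream`, `domain`, `id` and `dt`, plus a `page_title`) is an assumed, simplified shape, and `build_test_event` and `to_kafka_value` are hypothetical helper names. The sketch only builds and serialises the payload; actually producing it to a topic would need a Kafka client, which is left out.

```python
import json
import uuid
from datetime import datetime, timezone


def build_test_event(stream, domain, page_title, now=None):
    """Build a simplified test event (assumed shape, not an official schema)."""
    now = now or datetime.now(timezone.utc)
    return {
        "meta": {
            "stream": stream,            # topic/stream the event is meant for
            "domain": domain,            # wiki domain the event refers to
            "id": str(uuid.uuid4()),     # unique id so test events can be traced
            "dt": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
        "page_title": page_title,
    }


def to_kafka_value(event):
    """Serialise the event to UTF-8 JSON bytes, the usual Kafka value form."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    # Stream and domain values here are placeholders for illustration only.
    event = build_test_event("test.stream", "test.example.org", "Main_Page")
    print(to_kafka_value(event).decode("utf-8"))
```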

Change 587276 merged by Ppchelko:
[mediawiki/services/change-propagation/deploy@master] changeprop: Use correct tags for null_edits

https://gerrit.wikimedia.org/r/587276

Change 587562 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: Correct puppetca path.

https://gerrit.wikimedia.org/r/587562

Change 587562 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: Correct puppetca path.

https://gerrit.wikimedia.org/r/587562

Change 588700 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/services/change-propagation/deploy@master] Remove rules moved into k8s installation

https://gerrit.wikimedia.org/r/588700

Change 588700 merged by Ppchelko:
[mediawiki/services/change-propagation/deploy@master] Remove rules moved into k8s installation

https://gerrit.wikimedia.org/r/588700

Mentioned in SAL (#wikimedia-operations) [2020-04-14T14:42:08Z] <ppchelko@deploy1001> Started deploy [changeprop/deploy@354ae2d]: Remove rules enabled in k8s T248677

Mentioned in SAL (#wikimedia-operations) [2020-04-14T14:44:06Z] <ppchelko@deploy1001> Finished deploy [changeprop/deploy@354ae2d]: Remove rules enabled in k8s T248677 (duration: 01m 58s)

Change 588721 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: increase replicas and resources assigned to changeprop

https://gerrit.wikimedia.org/r/588721

Change 588721 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: increase replicas and resources assigned to changeprop

https://gerrit.wikimedia.org/r/588721

Change 588749 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: make kafka SSL configurable

https://gerrit.wikimedia.org/r/588749

Change 588749 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: make kafka SSL configurable

https://gerrit.wikimedia.org/r/588749

Mentioned in SAL (#wikimedia-operations) [2020-04-14T17:12:14Z] <ppchelko@deploy1001> Started deploy [changeprop/deploy@354ae2d]: Remove rules enabled in k8s T248677 attempt 2

Mentioned in SAL (#wikimedia-operations) [2020-04-14T17:12:40Z] <ppchelko@deploy1001> Finished deploy [changeprop/deploy@354ae2d]: Remove rules enabled in k8s T248677 attempt 2 (duration: 00m 25s)

Change 589059 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: allow setting arbitrary keys for kafka options

https://gerrit.wikimedia.org/r/589059

Change 589079 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/deployment-charts@master] Update change-prop container version to v0.9.5

https://gerrit.wikimedia.org/r/589079

Change 589079 merged by jenkins-bot:
[operations/deployment-charts@master] Update change-prop container version to v0.9.5

https://gerrit.wikimedia.org/r/589079

Change 589324 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: reenable SSL for kafka

https://gerrit.wikimedia.org/r/589324

Change 589324 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: reenable SSL for kafka

https://gerrit.wikimedia.org/r/589324

Change 589059 abandoned by Hnowlan:
changeprop: allow setting arbitrary keys for kafka options

Reason:
Not needed, we have discovered the reason for this issue.

https://gerrit.wikimedia.org/r/589059

Change 592700 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: enable more rules, increase number of replicas

https://gerrit.wikimedia.org/r/592700

Change 592701 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[mediawiki/services/change-propagation/deploy@master] changeprop: toggle more rules off in prod

https://gerrit.wikimedia.org/r/592701

Change 592700 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: enable more rules, increase number of replicas

https://gerrit.wikimedia.org/r/592700

Change 592701 merged by Ppchelko:
[mediawiki/services/change-propagation/deploy@master] changeprop: toggle more rules off in prod

https://gerrit.wikimedia.org/r/592701

Mentioned in SAL (#wikimedia-operations) [2020-04-27T19:21:58Z] <ppchelko@deploy1001> Started deploy [changeprop/deploy@ecca66b]: Switch off rules moved to k8s T248677

Mentioned in SAL (#wikimedia-operations) [2020-04-27T19:23:20Z] <ppchelko@deploy1001> Finished deploy [changeprop/deploy@ecca66b]: Switch off rules moved to k8s T248677 (duration: 01m 22s)

Change 592918 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: enable more rules in kubernetes

https://gerrit.wikimedia.org/r/592918

Change 592930 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[mediawiki/services/change-propagation/deploy@master] Move more prod rules to kubernetes

https://gerrit.wikimedia.org/r/592930

Change 592918 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: enable more rules in kubernetes

https://gerrit.wikimedia.org/r/592918

Change 592979 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: Increase number of replicas

https://gerrit.wikimedia.org/r/592979

Change 592979 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: Increase number of replicas

https://gerrit.wikimedia.org/r/592979

Change 592930 merged by Ppchelko:
[mediawiki/services/change-propagation/deploy@master] Move more prod rules to kubernetes

https://gerrit.wikimedia.org/r/592930

Mentioned in SAL (#wikimedia-operations) [2020-04-28T15:27:06Z] <ppchelko@deploy1001> Started deploy [changeprop/deploy@2b87a75]: Switch off rules moved to k8s T248677

Mentioned in SAL (#wikimedia-operations) [2020-04-28T15:28:26Z] <ppchelko@deploy1001> Finished deploy [changeprop/deploy@2b87a75]: Switch off rules moved to k8s T248677 (duration: 01m 20s)

Change 592987 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: enable all rules in k8s

https://gerrit.wikimedia.org/r/592987

Change 592987 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: enable all rules in k8s

https://gerrit.wikimedia.org/r/592987

Change 593226 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: disable purge_varnish

https://gerrit.wikimedia.org/r/593226

Change 593229 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[mediawiki/services/change-propagation/deploy@master] Enable one more rule in Kubernetes

https://gerrit.wikimedia.org/r/593229

Change 593226 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: disable purge_varnish

https://gerrit.wikimedia.org/r/593226

Change 593262 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: Fix bucketing for servicerunner gc metrics

https://gerrit.wikimedia.org/r/593262

Change 593262 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: Fix bucketing for servicerunner gc metrics

https://gerrit.wikimedia.org/r/593262

Change 593229 merged by Hnowlan:
[mediawiki/services/change-propagation/deploy@master] Enable one more rule in Kubernetes

https://gerrit.wikimedia.org/r/593229

Change 594975 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: enable kafka replay of purge messages

https://gerrit.wikimedia.org/r/594975

Change 594975 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: enable kafka replay of purge messages

https://gerrit.wikimedia.org/r/594975

Change 596250 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: default all feature toggles to on

https://gerrit.wikimedia.org/r/596250

Change 596250 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: default all feature toggles to on

https://gerrit.wikimedia.org/r/596250

Change 596253 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] changeprop: fix syntax issue with dt updating.

https://gerrit.wikimedia.org/r/596253

Change 596253 merged by jenkins-bot:
[operations/deployment-charts@master] changeprop: fix syntax issue with dt updating.

https://gerrit.wikimedia.org/r/596253

Change 597258 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/puppet@production] changeprop: remove changeprop configuration from scb

https://gerrit.wikimedia.org/r/597258

Change 597258 merged by Hnowlan:
[operations/puppet@production] changeprop: remove changeprop configuration from scb

https://gerrit.wikimedia.org/r/597258

hnowlan updated the task description.