Values tracked during the process (please fill them in as you make progress):
- LEXEME_DUMP=https://dumps.wikimedia.your.org/other/wikibase/wikidatawiki/20210924/wikidata-20210924-lexemes-BETA.ttl.bz2
- ENTITY_DUMP=https://dumps.wikimedia.your.org/other/wikibase/wikidatawiki/20210927/wikidata-20210927-all-BETA.ttl.bz2
- DUMP_START_DATE=2021-09-24T23:00:01Z
- FLINK_EQIAD_JOB_START=2021-10-01T10:22:25
- FLINK_CODFW_JOB_START=2021-10-01T10:17:02
- BOOTSTRAP_STATE_EQIAD=swift://rdf-streaming-updater-eqiad.thanos-swift/wikidata/savepoints/bootstrap_20210927
- BOOTSTRAP_STATE_CODFW=swift://rdf-streaming-updater-codfw.thanos-swift/wikidata/savepoints/bootstrap_20210927
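The job start values above are what the reload would need in order to position the Kafka consumer. A minimal sketch of turning them into epoch milliseconds, the unit that timestamp-based Kafka offset lookups typically expect, could look like the following. It assumes the FLINK_*_JOB_START values are in UTC (they carry no timezone suffix, unlike DUMP_START_DATE); the helper name `to_epoch_millis` is hypothetical and is not part of the cookbook.

```python
from datetime import datetime, timezone


def to_epoch_millis(ts: str) -> int:
    """Convert an ISO-8601 timestamp to epoch milliseconds.

    Assumption: a timestamp without an explicit offset is in UTC.
    """
    ts = ts.strip()
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit offset first.
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    dt = datetime.fromisoformat(ts)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


# Values copied from the tracked list above.
TRACKED = {
    "DUMP_START_DATE": "2021-09-24T23:00:01Z",
    "FLINK_EQIAD_JOB_START": "2021-10-01T10:22:25",
    "FLINK_CODFW_JOB_START": "2021-10-01T10:17:02",
}

if __name__ == "__main__":
    for name, value in TRACKED.items():
        print(f"{name}: {value} -> {to_epoch_millis(value)} ms")
```

If the job start times turn out to be in local time rather than UTC, the offset should be added to the tracked value itself so that no guessing is needed here.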
- Week of Sept. 20:
- Send a communication about the rollout
- Increase retention to 1 month on codfw.rdf-streaming-updater.mutation (topic name is about to change) in kafka-main@codfw
- Make sure retention is 1 month on eqiad.rdf-streaming-updater.mutation (topic name is about to change) in kafka-main@eqiad
- Week of Sept. 27:
- (Friday Oct. 1): bootstrap Flink, note DUMP_START_DATE and FLINK_(EQIAD|CODFW)_JOB_START
- (Friday Oct. 1): pre-fetch the dumps to wdqs1009 and wdqs2008 and note LEXEME_DUMP and ENTITY_DUMP
- (Friday Oct. 1): start the data-reload cookbook with --reload-data wikidata --skolemize [TODO: new options to manage kafka offsets with FLINK_EQIAD_JOB_START] on wdqs1009
- (Friday Oct. 1): start the data-reload cookbook with --reload-data wikidata --skolemize [TODO: new options to manage kafka offsets with FLINK_CODFW_JOB_START] on wdqs2008
- (Friday Oct. 1): merge the activation of the streaming updater profile on wdqs2008 while the reload is happening there
- Week of Oct. 4:
- Monitor that the reload is progressing properly
- Week of Oct. 11:
- Send a quick reminder to users
- Start data transfer (use the new option to activate kafka offsets propagation and always activate the streaming_updater profile via puppet on the target machine)
  - (internal cluster)
    - wdqs1009 -> wdqs1003
    - wdqs2008 -> wdqs2005
    - wdqs1003 -> wdqs1008
    - wdqs2008 -> wdqs2006
    - wdqs1003 -> wdqs1011
  - (external cluster)
    - wdqs1003 -> wdqs1004
    - wdqs2008 -> wdqs2001
    - wdqs1003 -> wdqs1005
    - wdqs2008 -> wdqs2002
    - wdqs1003 -> wdqs1006
    - wdqs2008 -> wdqs2003
    - wdqs1003 -> wdqs1007
    - wdqs2008 -> wdqs2004
    - wdqs1003 -> wdqs1012
    - wdqs2008 -> wdqs2007
    - wdqs1003 -> wdqs1013
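The transfer order above matters: a host can only serve as a source once it holds reloaded data. A small sanity check of the plan might look like the sketch below. It assumes the list is in execution order, that wdqs1009 and wdqs2008 are the only hosts populated by the reload, and that the first digit of the host number identifies the site (1 for eqiad, 2 for codfw); `check_plan` and `datacenter` are hypothetical helpers, not part of any cookbook.

```python
# Hosts reloaded from the dumps; they hold data before any transfer starts.
SEED_HOSTS = {"wdqs1009", "wdqs2008"}

# (source, target) pairs copied from the list above, assumed to be in execution order.
TRANSFERS = [
    # internal cluster
    ("wdqs1009", "wdqs1003"),
    ("wdqs2008", "wdqs2005"),
    ("wdqs1003", "wdqs1008"),
    ("wdqs2008", "wdqs2006"),
    ("wdqs1003", "wdqs1011"),
    # external cluster
    ("wdqs1003", "wdqs1004"),
    ("wdqs2008", "wdqs2001"),
    ("wdqs1003", "wdqs1005"),
    ("wdqs2008", "wdqs2002"),
    ("wdqs1003", "wdqs1006"),
    ("wdqs2008", "wdqs2003"),
    ("wdqs1003", "wdqs1007"),
    ("wdqs2008", "wdqs2004"),
    ("wdqs1003", "wdqs1012"),
    ("wdqs2008", "wdqs2007"),
    ("wdqs1003", "wdqs1013"),
]


def datacenter(host: str) -> str:
    # Assumption: the first digit after "wdqs" encodes the site.
    return {"1": "eqiad", "2": "codfw"}[host[len("wdqs")]]


def check_plan(transfers, seeds):
    """Return a list of human-readable problems; an empty list means the plan is consistent."""
    problems = []
    populated = set(seeds)
    for source, target in transfers:
        if source not in populated:
            problems.append(f"{source} -> {target}: source has no reloaded data yet")
        if target in populated:
            problems.append(f"{source} -> {target}: target already holds reloaded data")
        if datacenter(source) != datacenter(target):
            problems.append(f"{source} -> {target}: transfer crosses datacenters")
        populated.add(target)
    return problems


if __name__ == "__main__":
    issues = check_plan(TRANSFERS, SEED_HOSTS)
    print("\n".join(issues) if issues else "transfer plan is consistent")
```

The same-site check reflects that every transfer in the list stays within one datacenter, which in turn keeps each target on the Kafka offsets of its own site's Flink job.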
Note: migrations ended at 6 am on Oct. 19.
Note: wdqs1010 is kept with the old updater and the old journal.
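The 1-month retention on the mutation topics only helps if it covers the whole period between the Flink job start and the last data transfer. A rough check with the dates recorded above is sketched here. It assumes that "1 month" means 30 days, that the migration end time is 06:00 UTC on 2021-10-19 (the note gives neither year nor timezone), and that the job start times are in UTC; `retention_margin` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "1 month" retention is configured as 30 days.
RETENTION = timedelta(days=30)

# Job start times from the tracked values, assumed to be UTC.
JOB_STARTS = {
    "eqiad": datetime(2021, 10, 1, 10, 22, 25, tzinfo=timezone.utc),
    "codfw": datetime(2021, 10, 1, 10, 17, 2, tzinfo=timezone.utc),
}

# Assumption: "6am Oct 19" refers to 2021-10-19 06:00 UTC.
MIGRATIONS_END = datetime(2021, 10, 19, 6, 0, tzinfo=timezone.utc)


def retention_margin(job_start: datetime, end: datetime, retention: timedelta) -> timedelta:
    """Retention left over once the last host starts consuming.

    A negative result would mean the oldest needed messages had already expired.
    """
    return retention - (end - job_start)


if __name__ == "__main__":
    for site, start in JOB_STARTS.items():
        margin = retention_margin(start, MIGRATIONS_END, RETENTION)
        status = "ok" if margin > timedelta(0) else "EXPIRED"
        print(f"{site}: margin {margin} ({status})")
```

With these assumptions both sites keep a margin of roughly 12 days, so the retention window comfortably covered the rollout.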