
Switch `tmpPropertyTermsMigrationStage` to MIGRATION_WRITE_NEW
Closed, Resolved, Public

Description

IMPORTANT: this should be done only after T225052 is done. The rebuild script is expected to finish in ~30 minutes, so it can be run right before deployment.

After this switch, property terms will be written to both the old store (wb_terms table) and the new one (wbt_text, wbt_type, wbt_text_in_lang, wbt_term_in_lang and wbt_property_terms).

Order of SWAT patches:

  1. Switch property terms migration to WRITE_NEW on test wikidata (test on test wikidata, see next section) - https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/519211
  2. Switch property terms migration to WRITE_NEW on wikidata production (test on production wikidata, see next section) - https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/519212

How to test changes in SWAT

You need access to test and production (at least to run maintenance/sql.php to execute some sql)

Testing write logic (should be writing to both still)

  1. Add or update the labels, descriptions and aliases of a property in several languages (on production, use the sandbox property https://www.wikidata.org/wiki/Property:P2368)
  2. Check new terms store tables for the data. Here's a helper sql query to run to get all terms of that property from new store tables:
SELECT
	wbxl_language 		as term_language,
	wby_name 		as term_type,
	wbx_text 		as term_text
FROM wbt_property_terms
	INNER JOIN wbt_term_in_lang 		ON wbpt_term_in_lang_id = wbtl_id
	INNER JOIN wbt_type			ON wbtl_type_id = wby_id
	INNER JOIN wbt_text_in_lang 		ON wbtl_text_in_lang_id = wbxl_id
	INNER JOIN wbt_text 			ON wbxl_text_id = wbx_id
WHERE
	wbpt_property_id = 123 -- put here the numeric property id (without the P prefix)
;
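
To see how the five new-store tables chain together, here is a minimal Python sketch that runs the same join against an in-memory SQLite database. The column names are taken from the query above; the table definitions are simplified, the sample rows are made up for illustration, and the `ORDER BY` is added only to make the output deterministic. It is a toy model, not the production schema.

```python
import sqlite3

# Same join as the helper query above, with a bound parameter for the property id.
TERMS_QUERY = """
SELECT wbxl_language AS term_language, wby_name AS term_type, wbx_text AS term_text
FROM wbt_property_terms
    INNER JOIN wbt_term_in_lang ON wbpt_term_in_lang_id = wbtl_id
    INNER JOIN wbt_type ON wbtl_type_id = wby_id
    INNER JOIN wbt_text_in_lang ON wbtl_text_in_lang_id = wbxl_id
    INNER JOIN wbt_text ON wbxl_text_id = wbx_id
WHERE wbpt_property_id = ?
ORDER BY term_language, term_type
"""


def build_toy_store(conn):
    """Create simplified versions of the new-store tables with made-up rows."""
    conn.executescript("""
        CREATE TABLE wbt_type (wby_id INTEGER PRIMARY KEY, wby_name TEXT);
        CREATE TABLE wbt_text (wbx_id INTEGER PRIMARY KEY, wbx_text TEXT);
        CREATE TABLE wbt_text_in_lang (
            wbxl_id INTEGER PRIMARY KEY, wbxl_language TEXT, wbxl_text_id INTEGER);
        CREATE TABLE wbt_term_in_lang (
            wbtl_id INTEGER PRIMARY KEY, wbtl_type_id INTEGER, wbtl_text_in_lang_id INTEGER);
        CREATE TABLE wbt_property_terms (
            wbpt_property_id INTEGER, wbpt_term_in_lang_id INTEGER);
    """)
    conn.executemany("INSERT INTO wbt_type VALUES (?, ?)",
                     [(1, "label"), (2, "description")])
    conn.executemany("INSERT INTO wbt_text VALUES (?, ?)",
                     [(1, "sandbox"), (2, "Sandkasten"), (3, "a test property")])
    conn.executemany("INSERT INTO wbt_text_in_lang VALUES (?, ?, ?)",
                     [(1, "en", 1), (2, "de", 2), (3, "en", 3)])
    conn.executemany("INSERT INTO wbt_term_in_lang VALUES (?, ?, ?)",
                     [(1, 1, 1), (2, 1, 2), (3, 2, 3)])
    # Property 2368 gets three terms; property 17 reuses one of them.
    conn.executemany("INSERT INTO wbt_property_terms VALUES (?, ?)",
                     [(2368, 1), (2368, 2), (2368, 3), (17, 3)])


def property_terms(conn, property_id):
    """Return (language, type, text) rows for a numeric property id (no P prefix)."""
    return conn.execute(TERMS_QUERY, (property_id,)).fetchall()


conn = sqlite3.connect(":memory:")
build_toy_store(conn)
for row in property_terms(conn, 2368):
    print(row)
```

The sample data shows why the store is normalized this way: the same text-in-language row can be shared by several properties instead of being stored once per entity.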

Testing read logic (should be reading from new store first)

  1. Using sandbox property https://www.wikidata.org/wiki/Property:P2368
  2. Clear all cached terms of that property

TBD @Ladsgroup @Lucas_Werkmeister_WMDE @hoo is there a quick/easy way to do that?

  3. Hide all terms of that property from the wb_terms table (you can use sql wikidatawiki or mwscript sql.php wikidatawiki to execute SQL):
UPDATE wb_terms SET term_full_entity_id = "hidden_P2368" WHERE term_full_entity_id = "P2368";
  4. Go over the use cases below; properties should render properly:
Use Cases to test
  • Rendering property labels in statements blocks (uses PropertyLabelResolver).
  • Rendering property labels & descriptions in search results when searching for a property while adding a statement (uses PrefetchingTermLookup).
  • Rendering property labels on Special:AllPages?namespace=122 (or 120 depending on wiki configuration)
  • Rendering property labels on Special:RecentChanges
  • ...

Event Timeline


Can this wait a couple of days until we make sure there's no bug with WRITE_BOTH mode? We just turned it on today.

Effectively this does the same thing as WRITE_BOTH: it will keep writing to both stores, so it wouldn't really make a difference.

Which of course makes it redundant. It is just a transient state here, so that we follow the usual migration process.

I'm confused now. Setting it to WRITE_NEW makes it write to both? Can I see the code path?

alaa_wmde updated the task description. Jun 26 2019, 2:22 PM
alaa_wmde added subscribers: Lucas_Werkmeister_WMDE, hoo.

No, I actually let the confusing terms confuse me again. But yes, WRITE_NEW should write to both by definition as far as I understand, shouldn't it?

alaa_wmde added a comment. Edited Jun 26 2019, 2:31 PM

Reading the documentation of the constants in core's Defines.php, they don't match the migration process as it was previously explained to me.

/**@{
 * Schema change migration flags.
 *
 * Used as values of a feature flag for an orderly transition from an old
 * schema to a new schema. The numeric values of these constants are compatible with the
 * SCHEMA_COMPAT_XXX bitfield semantics. High bits are used to ensure that the numeric
 * ordering follows the order in which the migration stages should be used.
 *
 * - MIGRATION_OLD: Only read and write the old schema. The new schema need not
 *   even exist. This is used from when the patch is merged until the schema
 *   change is actually applied to the database.
 * - MIGRATION_WRITE_BOTH: Write both the old and new schema. Read the new
 *   schema preferentially, falling back to the old. This is used while the
 *   change is being tested, allowing easy roll-back to the old schema.
 * - MIGRATION_WRITE_NEW: Write only the new schema. Read the new schema
 *   preferentially, falling back to the old. This is used while running the
 *   maintenance script to migrate existing entries in the old schema to the
 *   new schema.
 * - MIGRATION_NEW: Only read and write the new schema. The old schema (and the
 *   feature flag) may now be removed.
 */
define( 'MIGRATION_OLD', 0x00000000 | SCHEMA_COMPAT_OLD );
define( 'MIGRATION_WRITE_BOTH', 0x10000000 | SCHEMA_COMPAT_READ_BOTH | SCHEMA_COMPAT_WRITE_BOTH );
define( 'MIGRATION_WRITE_NEW', 0x20000000 | SCHEMA_COMPAT_READ_BOTH | SCHEMA_COMPAT_WRITE_NEW );
define( 'MIGRATION_NEW', 0x30000000 | SCHEMA_COMPAT_NEW );
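
The remark in the docblock about high bits and numeric ordering can be checked with a small Python sketch. The four `define()` expressions are copied from the snippet above; the low-bit values of the `SCHEMA_COMPAT_*` flags are not shown in this thread and are assumed here for illustration.

```python
# Assumed low-bit flag values (not quoted in this thread).
SCHEMA_COMPAT_WRITE_OLD = 0x01
SCHEMA_COMPAT_READ_OLD = 0x02
SCHEMA_COMPAT_WRITE_NEW = 0x10
SCHEMA_COMPAT_READ_NEW = 0x20
SCHEMA_COMPAT_WRITE_BOTH = SCHEMA_COMPAT_WRITE_OLD | SCHEMA_COMPAT_WRITE_NEW
SCHEMA_COMPAT_READ_BOTH = SCHEMA_COMPAT_READ_OLD | SCHEMA_COMPAT_READ_NEW
SCHEMA_COMPAT_OLD = SCHEMA_COMPAT_WRITE_OLD | SCHEMA_COMPAT_READ_OLD
SCHEMA_COMPAT_NEW = SCHEMA_COMPAT_WRITE_NEW | SCHEMA_COMPAT_READ_NEW

# Same expressions as the define() calls above.
MIGRATION_OLD = 0x00000000 | SCHEMA_COMPAT_OLD
MIGRATION_WRITE_BOTH = 0x10000000 | SCHEMA_COMPAT_READ_BOTH | SCHEMA_COMPAT_WRITE_BOTH
MIGRATION_WRITE_NEW = 0x20000000 | SCHEMA_COMPAT_READ_BOTH | SCHEMA_COMPAT_WRITE_NEW
MIGRATION_NEW = 0x30000000 | SCHEMA_COMPAT_NEW

STAGES_IN_ORDER = [MIGRATION_OLD, MIGRATION_WRITE_BOTH, MIGRATION_WRITE_NEW, MIGRATION_NEW]


def low_bits(stage):
    """Strip the high ordering bits, leaving only the read/write flags."""
    return stage & 0xFF


# With the high bits, numeric order matches migration order.
print(STAGES_IN_ORDER == sorted(STAGES_IN_ORDER))
# Without them, the flag bits alone do not sort in migration order.
flags_only = [low_bits(s) for s in STAGES_IN_ORDER]
print(flags_only == sorted(flags_only))
```

Under these assumed flag values, comparisons such as `stage >= MIGRATION_WRITE_NEW` stay meaningful because of the high bits, while individual behaviours can still be tested with a bitwise AND against a single flag.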

As far as I have learned, we follow this process:
write/read old -> write both, read old -> write both, read new -> write/read new

So either we have different migration processes, and we are misusing the constants of a different one, that is:
write/read old -> write both, read old and fall back to new -> write both, read new and fall back to old -> write/read new.

Or I learned completely wrong.

So currently the implementation we have does the following:
on MIGRATION_OLD => write/read old
on MIGRATION_WRITE_BOTH => write both, read old
on MIGRATION_WRITE_NEW => write both, read new
on MIGRATION_NEW => write/read new
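
The two readings of the stages can be put side by side in a short Python sketch. `DOCUMENTED` restates the core docblock quoted above and `IMPLEMENTED` restates the four lines just listed; the `StageBehavior` type and the helper name are hypothetical and exist only for this comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageBehavior:
    writes: frozenset   # stores written to: "old" and/or "new"
    reads_first: str    # store that is read preferentially


def behavior(writes, reads_first):
    return StageBehavior(frozenset(writes), reads_first)


# As described in the core docblock.
DOCUMENTED = {
    "MIGRATION_OLD": behavior({"old"}, "old"),
    "MIGRATION_WRITE_BOTH": behavior({"old", "new"}, "new"),
    "MIGRATION_WRITE_NEW": behavior({"new"}, "new"),
    "MIGRATION_NEW": behavior({"new"}, "new"),
}

# As described for the property terms implementation in this comment.
IMPLEMENTED = {
    "MIGRATION_OLD": behavior({"old"}, "old"),
    "MIGRATION_WRITE_BOTH": behavior({"old", "new"}, "old"),
    "MIGRATION_WRITE_NEW": behavior({"old", "new"}, "new"),
    "MIGRATION_NEW": behavior({"new"}, "new"),
}


def differing_stages():
    """Stages whose implemented behaviour departs from the docblock."""
    return [name for name in DOCUMENTED if DOCUMENTED[name] != IMPLEMENTED[name]]


print(differing_stages())
```

In this model only the two middle stages differ: the implementation keeps writing to the old store for one stage longer and starts preferring the new store for reads one stage later.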

Yes, we’re interpreting the middle two stages a bit differently. We agreed in If5fb399c9d that this was acceptable.

Sure, so it means WRITE_NEW is actually write_both + read_new? If that's the case, and given that a huge share of the reads on the wb_terms table is actually for properties, we should still wait a little.

alaa_wmde updated the task description. Jun 27 2019, 7:56 AM

Why should we wait on reads? I'd actually prefer to see it fail, if it is going to, as soon as possible. Waiting will just delay delivery here, won't it?

We are changing the query plans of around 10k queries per second. If anything is wrong with the read path, it blows up logstash, elastic, the database and/or Wikipedia and all of its projects. I'm not saying we should wait and stare at walls: we should enable reading from the new store on test systems (first beta cluster, then test wikidata), stare at the logs, and then it's okay to move forward.

alaa_wmde added a comment. Edited Jul 2 2019, 10:52 AM

That was certainly not clear from "we should still wait a little" ;)

so let's schedule test wikidata https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/519211 then?

Beta cluster first, I'm running the migration for beta cluster now.

Change 520220 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/mediawiki-config@master] labs: Set wmgWikibaseTmpPropertyTermsMigrationStage to MIGRATION_WRITE_NEW

https://gerrit.wikimedia.org/r/520220

Change 520220 merged by jenkins-bot:
[operations/mediawiki-config@master] labs: Set wmgWikibaseTmpPropertyTermsMigrationStage to MIGRATION_WRITE_NEW

https://gerrit.wikimedia.org/r/520220

  1. Clear all cached terms of that property

They have a pretty short TTL, so I don't think that's needed.

  1. Hide all terms of that property from wb_terms table (you can use sql wikidatawiki or mwscript sql.php wikidatawiki to execute sql):

You can't run queries like that. It would break replication, corrupt the data, and practically bring down the wiki and all other wikis in the same shard (in the case of testwikidatawiki, 900 production wikis), because you would be writing to a replica, and it seems there's no way to keep users from doing that (it has happened before). IIRC you need to pass the --write flag to make sure you connect to the master. Please double-check.

hoo added a comment. Jul 2 2019, 6:11 PM

This broke beta commons: https://commons.wikimedia.beta.wmflabs.org/wiki/File:CLD_test.webm (Error: 1146 Table 'commonswiki.wbt_property_terms' doesn't exist (172.16.4.147:3306))

It should not query the term store at all. It's a bug that needs fixing; I can disable it on Commons for now.

Change 520292 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/mediawiki-config@master] labs: Temporary disable read new for wikibase term store of commons

https://gerrit.wikimedia.org/r/520292

Change 520292 merged by jenkins-bot:
[operations/mediawiki-config@master] labs: Temporary disable read new for wikibase term store of commons

https://gerrit.wikimedia.org/r/520292

This broke beta commons: https://commons.wikimedia.beta.wmflabs.org/wiki/File:CLD_test.webm (Error: 1146 Table 'commonswiki.wbt_property_terms' doesn't exist (172.16.4.147:3306))

So this sounds like federation wasn't properly taken care of in the read logic. Looking into it under T226008: Lookup obtained via SingleEntitySourceServices::getPrefetchingTermLookup must obey EntitySource

It's not just federation; it affects clients as well. For example, try to fetch the label of a property via Lua on English Wikipedia in the beta cluster (add {{#invoke:Wikidata test|hello}} to any page and try to save it). It fatals with this:

A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? 
Query: SELECT  wbpt_property_id,wbpt_term_in_lang_id  FROM `wbt_property_terms`    WHERE wbpt_property_id = '694'  
Function: Wikibase\Lib\Store\Sql\Terms\PrefetchingPropertyTermLookup::prefetchTerms
Error: 1146 Table 'enwiki.wbt_property_terms' doesn't exist (172.16.4.147:3306)

#0 /srv/mediawiki/php-master/includes/libs/rdbms/database/Database.php(1534): Wikimedia\Rdbms\Database->getQueryExceptionAndLog(string, integer, string, string)
#1 /srv/mediawiki/php-master/includes/libs/rdbms/database/Database.php(1130): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#2 /srv/mediawiki/php-master/includes/libs/rdbms/database/Database.php(1762): Wikimedia\Rdbms\Database->query(string, string)
#3 /srv/mediawiki/php-master/extensions/Wikibase/lib/includes/Store/Sql/Terms/PrefetchingPropertyTermLookup.php(79): Wikimedia\Rdbms\Database->select(string, array, array, string)
#4 /srv/mediawiki/php-master/extensions/Wikibase/data-access/src/ByTypeDispatchingPrefetchingTermLookup.php(53): Wikibase\Lib\Store\Sql\Terms\PrefetchingPropertyTermLookup->prefetchTerms(array, array, array)
#5 /srv/mediawiki/php-master/extensions/Wikibase/data-access/src/ByTypeDispatchingPrefetchingTermLookup.php(53): Wikibase\DataAccess\ByTypeDispatchingPrefetchingTermLookup->prefetchTerms(array, array, array)
#6 /srv/mediawiki/php-master/extensions/Wikibase/data-access/src/ByTypeDispatchingPrefetchingTermLookup.php(89): Wikibase\DataAccess\ByTypeDispatchingPrefetchingTermLookup->prefetchTerms(array, array, array)
#7 /srv/mediawiki/php-master/extensions/Wikibase/lib/includes/Store/EntityTermLookupBase.php(52): Wikibase\DataAccess\ByTypeDispatchingPrefetchingTermLookup->getTermsOfType(Wikibase\DataModel\Entity\PropertyId, string, array)
#8 /srv/mediawiki/php-master/extensions/Wikibase/lib/includes/Store/LanguageFallbackLabelDescriptionLookup.php(48): Wikibase\Lib\Store\EntityTermLookupBase->getLabels(Wikibase\DataModel\Entity\PropertyId, array)
#9 /srv/mediawiki/php-master/extensions/Wikibase/client/includes/Usage/UsageTrackingLanguageFallbackLabelDescriptionLookup.php(72): Wikibase\Lib\Store\LanguageFallbackLabelDescriptionLookup->getLabel(Wikibase\DataModel\Entity\PropertyId)
#10 /srv/mediawiki/php-master/extensions/Wikibase/client/includes/DataAccess/Scribunto/WikibaseLanguageDependentLuaBindings.php(60): Wikibase\Client\Usage\UsageTrackingLanguageFallbackLabelDescriptionLookup->getLabel(Wikibase\DataModel\Entity\PropertyId)
#11 /srv/mediawiki/php-master/extensions/Wikibase/client/includes/DataAccess/Scribunto/Scribunto_LuaWikibaseLibrary.php(586): Wikibase\Client\DataAccess\Scribunto\WikibaseLanguageDependentLuaBindings->getLabel(string)
#12 /srv/mediawiki/php-master/extensions/Scribunto/includes/engines/LuaSandbox/Engine.php(391): Wikibase\Client\DataAccess\Scribunto\Scribunto_LuaWikibaseLibrary->getLabel(string)
#13 [internal function]: Scribunto_LuaSandboxCallback->__call(string, array)
#14 /srv/mediawiki/php-master/extensions/Scribunto/includes/engines/LuaSandbox/Engine.php(314): LuaSandboxFunction->call(LuaSandboxFunction)
#15 /srv/mediawiki/php-master/extensions/Scribunto/includes/engines/LuaCommon/LuaCommon.php(296): Scribunto_LuaSandboxInterpreter->callFunction(LuaSandboxFunction, LuaSandboxFunction)
#16 /srv/mediawiki/php-master/extensions/Scribunto/includes/engines/LuaCommon/LuaCommon.php(982): Scribunto_LuaEngine->executeFunctionChunk(LuaSandboxFunction, PPTemplateFrame_Hash)
#17 /srv/mediawiki/php-master/extensions/Scribunto/includes/common/Hooks.php(128): Scribunto_LuaModule->invoke(string, PPTemplateFrame_Hash)
#18 /srv/mediawiki/php-master/includes/parser/Parser.php(3592): ScribuntoHooks::invokeHook(Parser, PPFrame_Hash, array)
#19 /srv/mediawiki/php-master/includes/parser/Parser.php(3299): Parser->callParserFunction(PPFrame_Hash, string, array)
#20 /srv/mediawiki/php-master/includes/parser/PPFrame_Hash.php(254): Parser->braceSubstitution(array, PPFrame_Hash)
#21 /srv/mediawiki/php-master/includes/parser/Parser.php(3113): PPFrame_Hash->expand(PPNode_Hash_Tree, integer)
#22 /srv/mediawiki/php-master/includes/parser/Parser.php(1422): Parser->replaceVariables(string)
#23 /srv/mediawiki/php-master/includes/parser/Parser.php(553): Parser->internalParse(string)
#24 /srv/mediawiki/php-master/includes/content/WikitextContent.php(365): Parser->parse(string, Title, ParserOptions, boolean, boolean, NULL)
#25 /srv/mediawiki/php-master/includes/content/AbstractContent.php(555): WikitextContent->fillParserOutput(Title, NULL, ParserOptions, boolean, ParserOutput)
#26 /srv/mediawiki/php-master/includes/Revision/RenderedRevision.php(266): AbstractContent->getParserOutput(Title, NULL, ParserOptions, boolean)
#27 /srv/mediawiki/php-master/includes/Revision/RenderedRevision.php(234): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached(WikitextContent, boolean)
#28 /srv/mediawiki/php-master/includes/Revision/RevisionRenderer.php(199): MediaWiki\Revision\RenderedRevision->getSlotParserOutput(string)
#29 /srv/mediawiki/php-master/includes/Revision/RevisionRenderer.php(148): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, array)
#30 [internal function]: Closure$MediaWiki\Revision\RevisionRenderer::getRenderedRevision#2(MediaWiki\Revision\RenderedRevision, array)
#31 /srv/mediawiki/php-master/includes/Revision/RenderedRevision.php(197): call_user_func(Closure$MediaWiki\Revision\RevisionRenderer::getRenderedRevision#2;8004, MediaWiki\Revision\RenderedRevision, array)
#32 /srv/mediawiki/php-master/includes/Storage/DerivedPageDataUpdater.php(1290): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#33 [internal function]: MediaWiki\Storage\DerivedPageDataUpdater->getCanonicalParserOutput()
#34 /srv/mediawiki/php-master/includes/edit/PreparedEdit.php(104): call_user_func(array)
#35 /srv/mediawiki/php-master/includes/edit/PreparedEdit.php(119): MediaWiki\Edit\PreparedEdit->getOutput()
#36 /srv/mediawiki/php-master/includes/Storage/DerivedPageDataUpdater.php(1268): MediaWiki\Edit\PreparedEdit->__get(string)
#37 /srv/mediawiki/php-master/includes/page/WikiPage.php(2019): MediaWiki\Storage\DerivedPageDataUpdater->getPreparedEdit()
#38 /srv/mediawiki/php-master/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php(31): WikiPage->prepareContentForEdit(WikitextContent)
#39 /srv/mediawiki/php-master/includes/Hooks.php(174): SpamBlacklistHooks::filterMergedContent(DerivativeContext, WikitextContent, Status, string, User, boolean)
#40 /srv/mediawiki/php-master/includes/Hooks.php(202): Hooks::callHook(string, array, array, NULL)
#41 /srv/mediawiki/php-master/includes/EditPage.php(1766): Hooks::run(string, array)
#42 /srv/mediawiki/php-master/includes/EditPage.php(2221): EditPage->runPostMergeFilters(WikitextContent, Status, User)
#43 /srv/mediawiki/php-master/includes/EditPage.php(1596): EditPage->internalAttemptSave(NULL, boolean)
#44 /srv/mediawiki/php-master/includes/api/ApiEditPage.php(378): EditPage->attemptSave(NULL)
#45 /srv/mediawiki/php-master/includes/api/ApiMain.php(1583): ApiEditPage->execute()
#46 /srv/mediawiki/php-master/includes/api/ApiMain.php(500): ApiMain->executeAction()
#47 /srv/mediawiki/php-master/extensions/VisualEditor/includes/ApiVisualEditorEdit.php(74): ApiMain->execute()
#48 /srv/mediawiki/php-master/extensions/VisualEditor/includes/ApiVisualEditorEdit.php(400): ApiVisualEditorEdit->saveWikitext(Title, string, array)
#49 /srv/mediawiki/php-master/includes/api/ApiMain.php(1583): ApiVisualEditorEdit->execute()
#50 /srv/mediawiki/php-master/includes/api/ApiMain.php(531): ApiMain->executeAction()
#51 /srv/mediawiki/php-master/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#52 /srv/mediawiki/php-master/api.php(87): ApiMain->execute()
#53 /srv/mediawiki/w/api.php(3): include(string)
#54 {main}

Are you connecting to the right database?

We are getting the DB and passing it the same way it was done for the old TermIndex implementation. Not sure where we are failing to achieve the same effect.

The only difference that I spotted from just reading the code is that maybe we should do something like:

$repoDbDomain = $dataAccessSettings->useEntitySourceBasedFederation() ? $entitySource->getDatabaseName() : false;

instead of:

$repoDbDomain = $this->entitySource->getDatabaseName();

but can't tell if this has anything to do with this error as I do not yet have a local setup to reproduce it.
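
One way to picture the failure mode in the error above (`Table 'enwiki.wbt_property_terms' doesn't exist`) is a lookup on a client wiki that resolves the table in the client's own database instead of the repo's. The Python sketch below is a toy model only: the `DATABASES` dict, the `use_repo_domain` switch and all function names are hypothetical and do not mirror Wikibase's actual classes or the suggested change.

```python
class TableNotFound(Exception):
    """Stands in for MySQL error 1146 in this toy model."""


# Hypothetical layout: only the repo database holds the new term store tables.
DATABASES = {
    "wikidatawiki": {"wbt_property_terms": [(694, 1)]},  # (property id, term_in_lang id)
    "enwiki": {},
}


def select_property_terms(db_domain, property_id):
    """Select rows for one property from wbt_property_terms in the given database."""
    tables = DATABASES[db_domain]
    if "wbt_property_terms" not in tables:
        raise TableNotFound(f"Table '{db_domain}.wbt_property_terms' doesn't exist")
    return [row for row in tables["wbt_property_terms"] if row[0] == property_id]


def lookup_on_client(local_db, repo_db, property_id, use_repo_domain):
    """Run the lookup from a client wiki, against either the repo or the local database."""
    domain = repo_db if use_repo_domain else local_db
    return select_property_terms(domain, property_id)


# Pointing the lookup at the repo database finds the terms ...
print(lookup_on_client("enwiki", "wikidatawiki", 694, use_repo_domain=True))
# ... while querying the client's own database fails like the trace above.
try:
    lookup_on_client("enwiki", "wikidatawiki", 694, use_repo_domain=False)
except TableNotFound as error:
    print(error)
```

In this model every component that issues queries has to be handed the repo's database domain explicitly; any one that falls back to the local wiki reproduces the missing-table error.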

After coordinating with our DBA, this will go live on Wednesday.

Change 526653 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/mediawiki-config@master] labs: Set tmpPropertyTermsMigrationStage to MIGRATION_WRITE_NEW in wikidata

https://gerrit.wikimedia.org/r/526653

Change 526653 merged by jenkins-bot:
[operations/mediawiki-config@master] labs: Set tmpPropertyTermsMigrationStage to MIGRATION_WRITE_NEW in wikidata

https://gerrit.wikimedia.org/r/526653

Change 519212 merged by jenkins-bot:
[operations/mediawiki-config@master] Switch property terms migration to WRITE_NEW on production wikidata

https://gerrit.wikimedia.org/r/519212

Mentioned in SAL (#wikimedia-operations) [2019-07-31T12:05:02Z] <ladsgroup@deploy1001> sync-file aborted: SWAT: [[gerrit:519212|Switch property terms migration to WRITE_NEW on production wikidata (T225053)]] (duration: 00m 03s)

Mentioned in SAL (#wikimedia-operations) [2019-07-31T12:06:11Z] <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:519212|Switch property terms migration to WRITE_NEW on production wikidata (T225053)]] (duration: 00m 47s)

Mentioned in SAL (#wikimedia-operations) [2019-07-31T12:19:36Z] <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:526657|Revert: Switch property terms migration to WRITE_NEW on production wikidata (T225053)]] (duration: 00m 47s)

This seems good to me:

wikiadmin@10.64.0.96(wikidatawiki)> EXPLAIN SELECT   wbxl_language     as term_language,   wby_name     as term_type,   wbx_text     as term_text FROM wbt_property_terms   INNER JOIN wbt_term_in_lang     ON wbpt_term_in_lang_id = wbtl_id   INNER JOIN wbt_type      ON wbtl_type_id = wby_id   INNER JOIN wbt_text_in_lang     ON wbtl_text_in_lang_id = wbxl_id   INNER JOIN wbt_text       ON wbxl_text_id = wbx_id WHERE   wbpt_property_id = 17\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: wbt_property_terms
         type: ref
possible_keys: wbt_property_terms_term_in_lang_id_property_id,wbt_property_terms_property_id
          key: wbt_property_terms_property_id
      key_len: 4
          ref: const
         rows: 374
        Extra: 
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: wbt_term_in_lang
         type: eq_ref
possible_keys: PRIMARY,wbt_term_in_lang_text_in_lang_id_lang_id,wbt_term_in_lang_type_id_text_in
          key: PRIMARY
      key_len: 4
          ref: wikidatawiki.wbt_property_terms.wbpt_term_in_lang_id
         rows: 1
        Extra: 
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: wbt_text_in_lang
         type: eq_ref
possible_keys: PRIMARY,wbt_text_in_lang_text_id_text_id
          key: PRIMARY
      key_len: 4
          ref: wikidatawiki.wbt_term_in_lang.wbtl_text_in_lang_id
         rows: 1
        Extra: 
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: wbt_text
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: wikidatawiki.wbt_text_in_lang.wbxl_text_id
         rows: 1
        Extra: 
*************************** 5. row ***************************
           id: 1
  select_type: SIMPLE
        table: wbt_type
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: wikidatawiki.wbt_term_in_lang.wbtl_type_id
         rows: 1
        Extra: 
5 rows in set (0.00 sec)

Mentioned in SAL (#wikimedia-operations) [2019-08-01T11:23:49Z] <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: c164132: Revert "Revert "Switch property terms migration to WRITE_NEW on production wikidata"" (T225053) (duration: 00m 55s)

Change 527087 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/mediawiki-config@master] Switch property terms migration to WRITE_NEW on client wikis

https://gerrit.wikimedia.org/r/527087

Restricted Application changed the subtype of this task from "Deadline" to "Task". Aug 2 2019, 8:44 AM
Ladsgroup added a subscriber: Marostegui. Edited Aug 2 2019, 9:11 AM

I reverted this at the request of @Marostegui, given that rows read, DB traffic, and connection errors spike every two hours and we don't want to leave it like that over the weekend.

Looking at the wb_terms grafana dashboard, it seems that the spikes are external, but we let them reach the database in the new store.

While having the read_key handler being used isn't a bad thing, I think there is a pattern there that needs some investigation as the spikes are huge and very consistent - there's something underlying that we probably have to understand.

Mentioned in SAL (#wikimedia-operations) [2019-08-02T09:17:05Z] <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:526657|Revert: Switch property terms migration to WRITE_NEW on production wikidata (T225053)]] (duration: 00m 48s)

Mentioned in SAL (#wikimedia-operations) [2019-08-05T12:13:24Z] <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:526657|Switch property terms migration to WRITE_NEW on production wikidata (T225053)]] (duration: 00m 48s)

Change 527087 merged by jenkins-bot:
[operations/mediawiki-config@master] Switch property terms migration to WRITE_NEW on client wikis

https://gerrit.wikimedia.org/r/527087

Mentioned in SAL (#wikimedia-operations) [2019-08-07T11:33:40Z] <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:527087|Switch property terms migration to WRITE_NEW on client wikis (T225053)]] (duration: 00m 56s)

Change 528909 had a related patch set uploaded (by Reedy; owner: Reedy):
[operations/mediawiki-config@master] Revert "Switch property terms migration to WRITE_NEW on client wikis"

https://gerrit.wikimedia.org/r/528909

Change 528909 merged by jenkins-bot:
[operations/mediawiki-config@master] Revert "Switch property terms migration to WRITE_NEW on client wikis"

https://gerrit.wikimedia.org/r/528909

Mentioned in SAL (#wikimedia-operations) [2019-08-07T19:16:03Z] <reedy@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Revert Switch property terms migration to WRITE_NEW on client wikis T225053 (duration: 00m 58s)

Reedy changed the task status from Open to Stalled. Aug 7 2019, 7:20 PM
Reedy added a subscriber: Reedy.

Marking as stalled because this cannot be put live again until some remediation is done, and an incident report written for issues caused on cawiki when pulling data from wikidata

Scheduled for Monday, 19th of August. The fix for T230119 (New term store connects to the wrong host in clients) has moved to testing on beta and is yet to be deployed.

Mentioned in SAL (#wikimedia-operations) [2019-08-19T11:39:06Z] <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: 483691c: Revert "Revert "Switch property terms migration to WRITE_NEW on client wikis"" (T225053) (duration: 00m 48s)

This has to be reverted asap. The fix(es) are not deployed yet

Mentioned in SAL (#wikimedia-operations) [2019-08-19T11:46:58Z] <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Revert 483691c (T225053) (duration: 00m 48s)

Rescheduled for deployment on the 20th of August, with a prior backport
https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20190820T1100

alaa_wmde changed the task status from Stalled to Open. Aug 19 2019, 12:41 PM

Change 530845 had a related patch set uploaded (by Alaa Sarhan; owner: Alaa Sarhan):
[mediawiki/extensions/Wikibase@wmf/1.34.0-wmf.17] Initialize DatabaseTermIdsResolver and DatabaseTypeIdsStore with repo database name in client.

https://gerrit.wikimedia.org/r/530845

Change 530845 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@wmf/1.34.0-wmf.17] Initialize DatabaseTermIdsResolver and DatabaseTypeIdsStore with repo database name in client.

https://gerrit.wikimedia.org/r/530845

Mentioned in SAL (#wikimedia-operations) [2019-08-20T12:05:55Z] <awight@deploy1001> Synchronized php-1.34.0-wmf.17/extensions/Wikibase: SWAT: [[gerrit:530845|Initialize DatabaseTermIdsResolver and DatabaseTypeIdsStore with repo database name in client. (T230119, T225053)]] (duration: 00m 52s)

Mentioned in SAL (#wikimedia-operations) [2019-08-26T11:10:54Z] <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:527087|Switch property terms migration to WRITE_NEW on client wikis (T225053)]] (duration: 00m 46s)

We are now reading property terms from the new store. We should probably see some drop in fetching terms from the old store, since property terms are read far more often than item terms. I will check it in a few days (https://grafana.wikimedia.org/d/000000548/wikibase-wb_terms 'select.TermSqlIndex_fetchTerms') and keep this under monitoring for a little while.

alaa_wmde closed this task as Resolved. Aug 26 2019, 7:20 PM

Looking at that graph for the last 7 days, today indeed has a much smaller peak around the same time of day (around 18:00):
https://grafana.wikimedia.org/d/000000548/wikibase-wb_terms?refresh=30s&orgId=1&from=1566242273570&to=1566847073570

Also, there are no fatals coming from this. Resolving!

Change 519211 abandoned by Alaa Sarhan:
Switch Property Terms migration to WRITE_NEW on test wikidata

https://gerrit.wikimedia.org/r/519211