Page MenuHomePhabricator

Research items from Drift tracker report (December 2021) and create specific tasks to fix the issues
Closed, ResolvedPublic

Description

https://drift-tracker.toolforge.org/report/core/ shows some drift changes related to the timestamp clean up (T42626) on similiar hosts

A possible issue is that the database was cloned or created newly around the time of the schema changes, similiar to the report under T276292#7586195

bgwiki - s2

  • db1122
  • db1105:3312
  • db1129
  • db1146:3312
  • db1156
  • db1162
  • db1170:3312
  • db1182
  • Mismatching field nullability of recentchanges.rc_params
  • Mismatching field size of page_restrictions.pr_level
  • Mismatching field size of page_restrictions.pr_type
  • Mismatching field type of oldimage.oi_timestamp
  • Mismatching field type of page.page_touched
  • Mismatching field type of protected_titles.pt_timestamp
  • Mismatching field type of querycache_info.qci_timestamp
  • Mismatching field type of revision.rev_timestamp (not on db1156)
  • Mismatching field type of user.user_email_authenticated
  • Mismatching field type of user.user_email_token
  • Mismatching field type of user.user_email_token_expires
  • Mismatching field type of user.user_newpass_time
  • Mismatching field type of user.user_registration
  • Mismatching field type of user.user_token
  • Mismatching field type of user.user_touched

commonswiki - s4

  • db1138
  • db1121
  • db1141
  • db1142
  • db1143
  • db1144:3314
  • db1146:3314
  • db1147
  • db1148
  • db1149
  • db1160
  • Mismatching field size of page_restrictions.pr_level
  • Mismatching field size of page_restrictions.pr_type
  • Mismatching field type of oldimage.oi_timestamp
  • Mismatching field type of page.page_touched
  • Mismatching field type of protected_titles.pt_timestamp
  • Mismatching field type of querycache_info.qci_timestamp
  • Mismatching field type of revision.rev_timestamp
  • Mismatching field type of user.user_email_authenticated
  • Mismatching field type of user.user_email_token
  • Mismatching field type of user.user_email_token_expires
  • Mismatching field type of user.user_newpass_time
  • Mismatching field type of user.user_registration
  • Mismatching field type of user.user_token
  • Mismatching field type of user.user_touched

enwiki - s1

  • db1118
  • db1099:3311
  • db1105:3311
  • db1106
  • db1119
  • db1134
  • db1135
  • db1163
  • db1164
  • db1169
  • db1184
  • Mismatching field nullability of recentchanges.rc_params
  • Mismatching field size of page_restrictions.pr_level
  • Mismatching field size of page_restrictions.pr_type
  • Mismatching field type of oldimage.oi_timestamp
  • Mismatching field type of page.page_touched
  • Mismatching field type of protected_titles.pt_timestamp
  • Mismatching field type of querycache_info.qci_timestamp
  • Mismatching field type of revision.rev_timestamp
  • Mismatching field type of user.user_email_authenticated
  • Mismatching field type of user.user_email_token
  • Mismatching field type of user.user_email_token_expires
  • Mismatching field type of user.user_newpass_time
  • Mismatching field type of user.user_registration
  • Mismatching field type of user.user_token
  • Mismatching field type of user.user_touched

labswiki - s6

  • db1173
  • db1096:3316
  • db1098:3316
  • db1113:3316
  • db1131
  • db1165
  • db1168
  • db1180
  • Mismatching field type of user.user_email

thwikiquote - s3

  • db1157
  • db1112
  • db1123
  • db1166
  • db1175
  • db1179
  • Mismatching field nullability of user.user_email
  • Mismatching field nullability of user.user_newpassword
  • Mismatching field nullability of user.user_password
  • Mismatching field size of page_restrictions.pr_level
  • Mismatching field size of page_restrictions.pr_type
  • Mismatching field type of oldimage.oi_timestamp
  • Mismatching field type of page.page_restrictions
  • Mismatching field type of page.page_touched
  • Mismatching field type of protected_titles.pt_timestamp
  • Mismatching field type of querycache_info.qci_timestamp
  • Mismatching field type of revision.rev_timestamp
  • Mismatching field type of text.old_flags
  • Mismatching field type of user.user_email
  • Mismatching field type of user.user_email_authenticated
  • Mismatching field type of user.user_email_token
  • Mismatching field type of user.user_email_token_expires
  • Mismatching field type of user.user_newpass_time
  • Mismatching field type of user.user_newpassword
  • Mismatching field type of user.user_password
  • Mismatching field type of user.user_registration
  • Mismatching field type of user.user_token
  • Mismatching field type of user.user_touched

viwiki - s7

  • db1136
  • db1098:3317
  • db1101:3317
  • db1127
  • db1158
  • db1170:3317
  • db1174
  • db1181
  • Mismatching field nullability of user.user_email
  • Mismatching field nullability of user.user_newpassword
  • Mismatching field nullability of user.user_password
  • Mismatching field size of page_restrictions.pr_level (not on db1136)
  • Mismatching field size of page_restrictions.pr_type (not on db1136)
  • Mismatching field type of oldimage.oi_timestamp
  • Mismatching field type of page.page_restrictions (not on db1181)
  • Mismatching field type of page.page_touched (not on db1181)
  • Mismatching field type of protected_titles.pt_timestamp
  • Mismatching field type of querycache_info.qci_timestamp
  • Mismatching field type of revision.rev_timestamp
  • Mismatching field type of text.old_flags
  • Mismatching field type of user.user_email
  • Mismatching field type of user.user_email_authenticated
  • Mismatching field type of user.user_email_token
  • Mismatching field type of user.user_email_token_expires
  • Mismatching field type of user.user_newpass_time
  • Mismatching field type of user.user_newpassword
  • Mismatching field type of user.user_password
  • Mismatching field type of user.user_registration
  • Mismatching field type of user.user_token
  • Mismatching field type of user.user_touched

Related Objects

Event Timeline

All hosts are on eqiad. Did I missed a task where that is done after a switch over?

For example T282372 for page_touched is closed and the dbs are listed on the task

I run drift tracker per dc. I ran it several times on codfw (usually before switchovers) but it's a different issue altogether.
The thing is that this current list is too big to be tackled in one task (it used to be around 15 times bigger, I have been cleaning these up since 2017). Each set of drifts need a separate investigation. Sometimes, you need to change production, sometimes you need to change the code. Sometimes it's partial schema changes, sometimes people actually forgot to change production at all and new needs to be checked slowly to make sure nothing breaks. It's quite a mess. I suggest grouping those on drift type and create a ticket for each one to do the investigation. e.g. one for "Mismatching field size of page_restrictions.pr_type"

e.g. one for "Mismatching field size of page_restrictions.pr_type"

For that the last change to the fields is f38344aba87edbdb5dba23d6433bf3c7cbbb0b51, but I cannot see the old length.

Mismatching field nullability of recentchanges.rc_params

Possible 79eb0fe88d35aac3dce2b6fbefc652b123414066, but that's also removes DEFAULT, which is not reported here.

Mismatching field type of page.page_restrictions

Thats a tinyblob since begin 64ac6b8e775dd8582ea9cc5f24235488f076c5b3, not sure which types differ here

Most of the other mismatch are related to timestamps, which gets some fixed on the abstract schema changes.
It does not looks like current schema changes missing here. Only 5 wikis, not sure about the age of them.
I do not thing there are code changes needed here. If there are code changes it means all the other wikis needs updating. Only performance could be a reason for that.

It does not looks like current schema changes missing here. Only 5 wikis, not sure about the age of them.

The current drift tracker picks one wiki randomly from each section. That makes it less great but otherwise, any drift on s3 would overwhelm the report (and it would be really slow). I do run it on all of s3 from time to time separately: T260111: All sorts of random drifts in wikis in s3

I do not thing there are code changes needed here. If there are code changes it means all the other wikis needs updating. Only performance could be a reason for that.

It really depends on the case. We should do git log -S on each one, create a ticket separately, run a full check on the system for each drift. It's a long and tedious work but we are getting there. It is important to take our time because in 2017, an index named tmp1 was removed as a drift (and it was marked as unused index in database reports) caused a full outage of all wikis.

It really depends on the case. We should do git log -S on each one, create a ticket separately, run a full check on the system for each drift. It's a long and tedious work but we are getting there. It is important to take our time because in 2017, an index named tmp1 was removed as a drift (and it was marked as unused index in database reports) caused a full outage of all wikis.

There are no index reported this time, but I will try to find some reason for the field differences and create tasks for it. But some of them reading possible hard to do.

Umherirrender renamed this task from Drift tracker shows many mismatches for the same hosts/wikis to Research items from Drift tracker report (December 2021) and create specific tasks to fix the issues.Dec 25 2021, 7:02 PM
  • Need help for the drifts related to user table. It is possible to get the definition of a user table from s3 (thwikiquote)? The reported columns are all binary since the begin (only char() binary rewrite to binary() in f38344aba87edbdb5dba23d6433bf3c7cbbb0b51 is shown in the history). Not sure about the current type of the fields and what the difference is.
  • Should "Mismatching field type of page.page_restrictions" fixed or waiting for drop? T218446 (Just to keep my result from research: The type was not changed - added in 64ac6b8e775dd8582ea9cc5f24235488f076c5b3, not sure why it is a varbinary(255) on some wikis and tinyblob on others)
  • While looking through old commits it seems necessary that the drift tracker also looks for DEFAULTs, because some columns gets defaults removed and not sure if that all was done in production as well. Sounds that necessary?
  • Need help for the drifts related to user table. It is possible to get the definition of a user table from s3 (thwikiquote)? The reported columns are all binary since the begin (only char() binary rewrite to binary() in f38344aba87edbdb5dba23d6433bf3c7cbbb0b51 is shown in the history). Not sure about the current type of the fields and what the difference is.
root@db1112.eqiad.wmnet[thwikiquote]> show create table user\G
*************************** 1. row ***************************
       Table: user
Create Table: CREATE TABLE `user` (
  `user_id` int(5) unsigned NOT NULL AUTO_INCREMENT,
  `user_name` varbinary(255) NOT NULL DEFAULT '',
  `user_real_name` varbinary(255) NOT NULL DEFAULT '',
  `user_password` varbinary(255) DEFAULT NULL,
  `user_newpassword` varbinary(255) DEFAULT NULL,
  `user_email` varbinary(255) DEFAULT NULL,
  `user_touched` varbinary(14) NOT NULL,
  `user_token` varbinary(32) NOT NULL DEFAULT '',
  `user_email_authenticated` varbinary(14) DEFAULT NULL,
  `user_email_token` varbinary(32) DEFAULT NULL,
  `user_email_token_expires` varbinary(14) DEFAULT NULL,
  `user_registration` varbinary(14) DEFAULT NULL,
  `user_newpass_time` varbinary(14) DEFAULT NULL,
  `user_editcount` int(11) DEFAULT NULL,
  `user_password_expires` varbinary(14) DEFAULT NULL,
  PRIMARY KEY (`user_id`),
  UNIQUE KEY `user_name` (`user_name`),
  KEY `user_email_token` (`user_email_token`),
  KEY `user_email` (`user_email`(50))
) ENGINE=InnoDB AUTO_INCREMENT=5135 DEFAULT CHARSET=binary
1 row in set (0.001 sec)
root@db1112.eqiad.wmnet[thwikiquote]> show create table text\G
*************************** 1. row ***************************
       Table: text
Create Table: CREATE TABLE `text` (
  `old_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
  `old_text` mediumblob NOT NULL,
  `old_flags` varbinary(255) NOT NULL,
  PRIMARY KEY (`old_id`)
) ENGINE=InnoDB AUTO_INCREMENT=42791 DEFAULT CHARSET=binary PACK_KEYS=1
1 row in set (0.001 sec)
  • Should "Mismatching field type of page.page_restrictions" fixed or waiting for drop? T218446 (Just to keep my result from research: The type was not changed - added in 64ac6b8e775dd8582ea9cc5f24235488f076c5b3, not sure why it is a varbinary(255) on some wikis and tinyblob on others)

If it is going to be eventually dropped, I would rather ignore this. Doing a schema change that changes datatype is costly and requires master switchovers.

Sure!

  • While looking through old commits it seems necessary that the drift tracker also looks for DEFAULTs, because some columns gets defaults removed and not sure if that all was done in production as well. Sounds that necessary?

@Ladsgroup ^ thoughts?

  • While looking through old commits it seems necessary that the drift tracker also looks for DEFAULTs, because some columns gets defaults removed and not sure if that all was done in production as well. Sounds that necessary?

@Ladsgroup ^ thoughts?

I think fixing defaults and unifying them is important but not in any way high priority. It should be tackled eventually but that can be during the switchover time.

  • Should "Mismatching field type of page.page_restrictions" fixed or waiting for drop? T218446 (Just to keep my result from research: The type was not changed - added in 64ac6b8e775dd8582ea9cc5f24235488f076c5b3, not sure why it is a varbinary(255) on some wikis and tinyblob on others)

If it is going to be eventually dropped, I would rather ignore this. Doing a schema change that changes datatype is costly and requires master switchovers.

If T298557: Fix mismatching field type of page.page_touched on wmf wikis needs a master switchover maybe make it as well, at least some of the section of that task also affected with that schema drift. But I am linking the schema drift to the drop task instead and treat it as declined from this task view.

Sure!

Created (linked as subtasks here)

  • While looking through old commits it seems necessary that the drift tracker also looks for DEFAULTs, because some columns gets defaults removed and not sure if that all was done in production as well. Sounds that necessary?

@Ladsgroup ^ thoughts?

T298562: Change drift tracker to report DEFAULT differences as well

Umherirrender claimed this task.

I have picked all issues into separated tasks for better processing on wmf side. it also makes discussion about the tables easier.
I have also linked the tasks from the tool and this task is no longer linked there.

Thanks for fixing the drifts afterwards. All all lowest prio and it would be nice if there are fixed somedays.

Thank you for creating all those tasks!