Page MenuHomePhabricator

Missing updater for rc_ip field: RecentChange::save: 1406 Data too long for column 'rc_ip'
Closed, ResolvedPublic

Description

Actually, this issue seems much wider when looking at this more closely. In 2007 Tim changed many fields without providing updaters:

commit f38344aba87edbdb5dba23d6433bf3c7cbbb0b51
Author: Tim Starling <tstarling@users.mediawiki.org>
Date:   Fri Jun 22 18:31:24 2007 +0000

    * Fixed complete breakage for MySQL instances with a maximum key length of 1000 bytes. Tested with MyISAM 5 with UTF-8 default character set.
    * Replaced char/varchar with binary/varbinary in cases where the data is pure ASCII. This is more efficient for various reasons in MySQL 4.1+, and equivalent to char() binary in 4.0. Sometimes varbinary is needed where char was used before, to avoid zero-padding on the right. As a side-effect, this means the MySQL 5 binary schema is no longer broken.
    * Removed width specifiers from integer columns. These do virtually nothing, the MySQL command line client ignores them and they have no impact on storage. The only thing they do is cause arguments between developers over whether the most appropriate width of a 32-bit integer is 8, 9 or 10 columns.
    * No updater provided for the moment, but everyone should migrate to the MySQL 5 binary schema eventually since it should now work correctly and efficiently.

On of these changes is changing rc_ip from 15 to 40:

-  rc_ip char(15) NOT NULL default '',
+  rc_ip varbinary(40) NOT NULL default '',

The missing updaters can be confirmed at translatewiki.net:

MariaDB [translatewiki_net]> show create table bw_recentchanges\G
*************************** 1. row ***************************
       Table: bw_recentchanges
Create Table: CREATE TABLE `bw_recentchanges` (
  `rc_id` int(8) NOT NULL AUTO_INCREMENT,
  `rc_timestamp` varchar(14) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '',
  `rc_user` int(10) unsigned NOT NULL DEFAULT '0',
  `rc_user_text` varchar(255) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '',
  `rc_namespace` int(11) NOT NULL DEFAULT '0',
  `rc_title` varchar(255) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '',
  `rc_comment` varbinary(767) NOT NULL DEFAULT '',
  `rc_minor` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `rc_bot` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `rc_new` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `rc_cur_id` int(10) unsigned NOT NULL DEFAULT '0',
  `rc_this_oldid` int(10) unsigned NOT NULL DEFAULT '0',
  `rc_last_oldid` int(10) unsigned NOT NULL DEFAULT '0',
  `rc_type` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `rc_patrolled` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `rc_ip` varchar(15) NOT NULL DEFAULT '',
  `rc_old_len` int(10) DEFAULT NULL,
  `rc_new_len` int(10) DEFAULT NULL,
  `rc_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `rc_logid` int(10) unsigned NOT NULL DEFAULT '0',
  `rc_log_type` varchar(255) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `rc_log_action` varchar(255) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `rc_params` blob NOT NULL,
  `rc_source` varbinary(16) NOT NULL DEFAULT '',
  PRIMARY KEY (`rc_id`),
  KEY `rc_timestamp` (`rc_timestamp`),
  KEY `rc_namespace_title` (`rc_namespace`,`rc_title`),
  KEY `rc_cur_id` (`rc_cur_id`),
  KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`),
  KEY `rc_ip` (`rc_ip`),
  KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`),
  KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`),
  KEY `rc_patrolling` (`rc_this_oldid`,`rc_last_oldid`,`rc_patrolled`),
  KEY `rc_ns_bot_ts` (`rc_namespace`,`rc_bot`,`rc_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=10228341 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

And the data loss this causes can be seen in translatewiki.net production:

MariaDB [translatewiki_net]> select rc_id, rc_ip from bw_recentchanges order by rc_id desc limit 1\G
*************************** 1. row ***************************
rc_id: 10228340
rc_ip: 2001:14ba:3fb:b
1 row in set (0.01 sec)

I only noticed this issue because I have $wgSQLMode = 'STRICT_TRANS_TABLES'; set on the development server and because I only recently got IPv6.

I find it unlikely that the migration to binary schema was ever announced or explained how to do it. I have been pretty actively following MediaWiki developments, so I think it is likely that many old wikis are having the same issue, but gone unnoticed.

Event Timeline

Nikerabbit raised the priority of this task from to Needs Triage.
Nikerabbit updated the task description. (Show Details)
Nikerabbit added a project: Wikimedia-Rdbms.
Nikerabbit added a subscriber: Nikerabbit.
Restricted Application added subscribers: StudiesWorld, Aklapper. · View Herald TranscriptNov 5 2015, 9:57 AM
Nemo_bis triaged this task as Normal priority.Dec 5 2015, 11:19 AM
Nemo_bis set Security to None.
Nemo_bis added a subscriber: Nemo_bis.

Change 315515 had a related patch set uploaded (by Paladox):
Correctly migrate rc_ip field to varbinary(40) on wiki's that used an old mw version

https://gerrit.wikimedia.org/r/315515

bd808 removed a subscriber: bd808.Oct 12 2016, 5:16 PM

Change 315515 merged by jenkins-bot:
Add patch to migrate rc_ip field to varbinary(40)

https://gerrit.wikimedia.org/r/315515

@Nikerabbit hi, does the above patch fix your problem?

Is this task for the other fields too or just this specific one?

@Nikerabbit hi, does the above patch fix your problem?

I don't know. I could not keep production site knowingly broken for almost a year, so I migrated rc_ip and some other fields manually long ago.

Feel free to create new tasks for other fields and close this.