Page MenuHomePhabricator

Checkuser does not truncate data (cuc_comment, cuc_agent, cuc_actiontext) and that causes exceptions with strict mode
Closed, ResolvedPublic

Description

cuc_comment is 255 characters. rc_comment is 767

One doesn't fit into the other

Function: PopulateCheckUserTable::doDBUpdates
Error: 1406 Data too long for column 'cuc_comment' at row 36 (127.0.0.1:3306)
Function: CheckUserHooks::updateCheckUserData
Error: 1406 Data too long for column 'cuc_agent' at row 1 (127.0.0.1:3306)

Event Timeline

rc_comment is also in the process of being replaced by rc_comment_id referencing the comment table.

The 767-byte length on rc_comment was a first step towards an earlier attempt to allow longer comments, particularly for non-ASCII languages. That first step was never deployed to WMF sites and subsequent steps were never taken.

Is it worth just putting a trim to 255 characters in CheckUser in the meantime? And then revert it when CheckUser is updated to work with longer comments?

If you want a quick fix instead of going ahead and updating CheckUser to use the comment table, sure.

This happens for cuc_agent too. Someone should look into which all fields need truncating.

Nikerabbit renamed this task from cuc_comment is incompatible with Comments to Checkuser does not truncate data (cuc_comment, cuc_agent) and that causes exceptions.Aug 1 2018, 3:06 PM
Nikerabbit renamed this task from Checkuser does not truncate data (cuc_comment, cuc_agent) and that causes exceptions to Checkuser does not truncate data (cuc_comment, cuc_agent) and that causes exceptions with strict mode.
Nikerabbit triaged this task as High priority.
Nikerabbit updated the task description. (Show Details)

@Nikerabbit how did you find that out? Did you try to migrate something and got an error? Or did you look at WMF server logs and noticed a warning about a UA longer than 255 chars? Just trying to understand the issue better.

I installed CheckUser at translatewiki.net, which runs MariaDB in strict mode ($wgSQLMode = 'STRICT_TRANS_TABLES';) unlike WMF production. I found the cuc_agent issue in our logs.

@Nikerabbit how did you find that out? Did you try to migrate something and got an error? Or did you look at WMF server logs and noticed a warning about a UA longer than 255 chars? Just trying to understand the issue better.

The extension needs migrating to use CommentStore

Thank you both.

I think a temporary fix of applying a trim() would be reasonable. Long term, we might want to migrate to CommentStore and only store the IDs instead of the cu_agent. But that would obviously require us to make sure those rows in the comments table don't get replicated into Cloud-Services servers, DB dumps, etc. It won't be as easy as "do not replicate any of the CU tables" anymore.

I assume you mean mb_strcut or Language::truncateForDb rather than trim. Also cuc_agent can't be put into CommentStore as it is not a comment.

Now I am also seeing:

Function: CheckUserHooks::updateCheckUserData
Error: 1406 Data too long for column 'cuc_actiontext' at row 1
Nikerabbit renamed this task from Checkuser does not truncate data (cuc_comment, cuc_agent) and that causes exceptions with strict mode to Checkuser does not truncate data (cuc_comment, cuc_agent, cuc_actiontext) and that causes exceptions with strict mode.Sep 14 2019, 1:03 PM

Change 560444 had a related patch set uploaded (by Paladox; owner: Paladox):
[mediawiki/extensions/CheckUser@master] Convert cuc_actiontext to 'text'

https://gerrit.wikimedia.org/r/560444

Now I am also seeing:

Function: CheckUserHooks::updateCheckUserData
Error: 1406 Data too long for column 'cuc_actiontext' at row 1

This one is curious

	"checkuser-autocreate-action": "was automatically created",
	"checkuser-create-action": "was created",
	"checkuser-email-action": "sent an email to user \"$1\"",
	"checkuser-reset-action": "reset password for user \"$1\"",

Are any of the translations really over 255?

Actually, nevermind

		if ( isset( $attribs['rc_log_type'] ) && $attribs['rc_type'] == RC_LOG ) {
			$target = Title::makeTitle( $attribs['rc_namespace'], $attribs['rc_title'] );
			$context = RequestContext::newExtraneousContext( $target );

			$formatter = LogFormatter::newFromRow( $rc->getAttributes() );
			$formatter->setContext( $context );
			$actionText = $formatter->getPlainActionText();
		} else {
			$actionText = '';
		}

Change 598196 had a related patch set uploaded (by TK-999; owner: TK-999):
[mediawiki/extensions/CheckUser@master] Trim comment field content prior to insertion

https://gerrit.wikimedia.org/r/598196

Change 560444 abandoned by Paladox:
Convert cuc_actiontext and cuc_agent to 'text'

Reason:
https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/CheckUser/ /598196/

https://gerrit.wikimedia.org/r/560444

Change 598196 merged by jenkins-bot:
[mediawiki/extensions/CheckUser@master] Trim comment field content prior to insertion

https://gerrit.wikimedia.org/r/598196

Change 602878 had a related patch set uploaded (by Paladox; owner: TK-999):
[mediawiki/extensions/CheckUser@REL1_34] Trim comment field content prior to insertion

https://gerrit.wikimedia.org/r/602878

Change 602878 merged by jenkins-bot:
[mediawiki/extensions/CheckUser@REL1_34] Trim comment field content prior to insertion

https://gerrit.wikimedia.org/r/602878

@Paladox I noticed that truncation logic was removed from the above patch (https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/598196/6..7) for several code paths, such as account creation logging. Can you clarify why this was done?

Change 605242 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/CheckUser@master] Revert "Trim comment field content prior to insertion"

https://gerrit.wikimedia.org/r/605242

Change 605243 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/CheckUser@REL1_34] Revert "Trim comment field content prior to insertion"

https://gerrit.wikimedia.org/r/605243

Change 605243 merged by jenkins-bot:
[mediawiki/extensions/CheckUser@REL1_34] Revert "Trim comment field content prior to insertion"

https://gerrit.wikimedia.org/r/605243

Change 605242 merged by jenkins-bot:
[mediawiki/extensions/CheckUser@master] Revert "Trim comment field content prior to insertion"

https://gerrit.wikimedia.org/r/605242

Change 605266 had a related patch set uploaded (by Paladox; owner: TK-999):
[mediawiki/extensions/CheckUser@master] Trim comment field content prior to insertion

https://gerrit.wikimedia.org/r/605266

@TK-999 Ah, sorry. I re did the file (since there was merge conflicts). But seems i removed bits. That was not my intention.

@Paladox No worries. I will create a fresh patchset on top of master with all relevant changes restored.

Change 605266 merged by jenkins-bot:
[mediawiki/extensions/CheckUser@master] Trim comment field content prior to insertion

https://gerrit.wikimedia.org/r/605266

Change 607798 had a related patch set uploaded (by Paladox; owner: TK-999):
[mediawiki/extensions/CheckUser@REL1_34] Trim comment field content prior to insertion

https://gerrit.wikimedia.org/r/607798

Change 607798 merged by jenkins-bot:
[mediawiki/extensions/CheckUser@REL1_34] Trim comment field content prior to insertion

https://gerrit.wikimedia.org/r/607798

Reedy assigned this task to TK-999.

I think this is all done now, right?