
Client recentchanges entries sometimes don't have their wb_changes.change_id reference set
Open, Low, Public

Description

Apparently some (but not all) recentchanges rows which we inject into wikibase clients don't have their "id" (referring to wb_changes.change_id on the repo) set:

wikiadmin@10.64.32.115(enwiki)> SELECT rc_params FROM recentchanges WHERE rc_id = 1245837094 LIMIT 1\G
*************************** 1. row ***************************
rc_params: a:1:{s:20:"wikibase-repo-change";a:14:{s:2:"id";N;s:9:"object_id";s:8:"Q5538841";s:4:"type";s:20:"wikibase-item~update";s:11:"revision_id";s:10:"1147155130";s:7:"user_id";s:5:"95338";s:4:"time";s:14:"20200331142533";s:11:"entity_type";s:4:"item";s:7:"page_id";i:5299020;s:9:"parent_id";i:1034014476;s:7:"comment";s:120:"/* wbsetreference-add:2| */ [[Property:P7859]]: lccn-n87895055, [[:toollabs:quickstatements/#/batch/25615|batch #25615]]";s:6:"rev_id";i:1147155130;s:9:"user_text";s:16:"Vladimir Alexiev";s:15:"central_user_id";i:15868969;s:3:"bot";i:0;}}
1 row in set (0.00 sec)
php > print_r(unserialize('a:1:{s:20:"wikibase-repo-change";a:14:{s:2:"id";N;s:9:"object_id";s:8:"Q5538841";s:4:"type";s:20:"wikibase-item~update";s:11:"revision_id";s:10:"1147155130";s:7:"user_id";s:5:"95338";s:4:"time";s:14:"20200331142533";s:11:"entity_type";s:4:"item";s:7:"page_id";i:5299020;s:9:"parent_id";i:1034014476;s:7:"comment";s:120:"/* wbsetreference-add:2| */ [[Property:P7859]]: lccn-n87895055, [[:toollabs:quickstatements/#/batch/25615|batch #25615]]";s:6:"rev_id";i:1147155130;s:9:"user_text";s:16:"Vladimir Alexiev";s:15:"central_user_id";i:15868969;s:3:"bot";i:0;}}'));
Array
(
    [wikibase-repo-change] => Array
        (
            [id] => 
            [object_id] => Q5538841
            [type] => wikibase-item~update
            [revision_id] => 1147155130
            [user_id] => 95338
            [time] => 20200331142533
            [entity_type] => item
            [page_id] => 5299020
            [parent_id] => 1034014476
            [comment] => /* wbsetreference-add:2| */ [[Property:P7859]]: lccn-n87895055, [[:toollabs:quickstatements/#/batch/25615|batch #25615]]
            [rev_id] => 1147155130
            [user_text] => Vladimir Alexiev
            [central_user_id] => 15868969
            [bot] => 0
        )

)

AFAICT we don't use this information anywhere, so one course of action could be to simply remove the reference.
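A quick way to flag affected rows without a full unserialize is a substring check on rc_params. The Python sketch below is only illustrative. `has_null_change_id` and the two shortened sample strings are hypothetical: the samples are hand-abbreviated from the row above, and the id value 123 is made up. A plain substring match could in principle also hit the same byte sequence inside an edit comment, so treat it as a heuristic.

```python
# Hypothetical helper: a substring heuristic, not a real PHP unserialize().
# The marker is how PHP serializes the pair "id" => null.
NULL_ID_MARKER = 's:2:"id";N;'


def has_null_change_id(rc_params: str) -> bool:
    """Return True if the serialized rc_params contains a null "id" entry."""
    return NULL_ID_MARKER in rc_params


# Hand-abbreviated samples (illustrative only, not real rows).
null_row = (
    'a:1:{s:20:"wikibase-repo-change";'
    'a:2:{s:2:"id";N;s:9:"object_id";s:8:"Q5538841";}}'
)
set_row = (
    'a:1:{s:20:"wikibase-repo-change";'
    'a:2:{s:2:"id";i:123;s:9:"object_id";s:8:"Q5538841";}}'
)

print(has_null_change_id(null_row))  # True
print(has_null_change_id(set_row))   # False
```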

Event Timeline

This could happen when an insert to wb_changes fails?
Not sure how much this is worth digging into or even trying to remove?
How hard do you think it would be to remove? I guess this would only be removal of the id from the data stored in the recentchanges row?

This could happen when an insert to wb_changes fails?

No idea, but that might be the cause. No idea if this is even happening still.

Not sure how much this is worth digging into or even trying to remove?

Not much IMO… but then again, having this around in a broken state could cause trouble later on (e.g. if something starts using that field).

How hard do you think it would be to remove? I guess this would only be removal of the id from the data stored in the recentchanges row?

Trivial, but we need to make sure that nothing reads from this again (the ticket is ~1y old by now).

Quickly looking through https://codesearch.wmcloud.org/search/?q=change_id&i=nope&files=&excludeFiles=&repos= I don't see any usages of this from the recent changes side of things.

And seemingly this does still happen quite a bit


mysql:research@dbstore1003.eqiad.wmnet [enwiki]> select (rc_params LIKE "%s:2:\"id\";N%") as nullId, COUNT(*) from recentchanges where rc_source = "wb" GROUP BY nullId limit 10;
+--------+----------+
| nullId | COUNT(*) |
+--------+----------+
|      0 |  1080833 |
|      1 |   177750 |
+--------+----------+
2 rows in set (6.029 sec)

So about 14% of the Wikibase-sourced recent changes rows (rc_source = "wb") on enwiki have a null id.
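For reference, the share can be recomputed from the two counts in the query result. This is a minimal Python sketch; the variable names are just for illustration.

```python
# Counts copied from the query result above.
with_id = 1_080_833  # rows where nullId = 0
null_id = 177_750    # rows where nullId = 1

# Share of rc_source = "wb" rows whose serialized id is null.
share = null_id / (with_id + null_id)
print(f"{share:.1%}")  # 14.1%
```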

@hoo could you give the codesearch a once over to see if you see any usages?
If not, I'd say let's remove it to avoid any accidental usage and trim the data a bit.