
[C-DIS][SW] Client recentchanges entries sometimes don't have their wb_changes.change_id reference set
Open, Low, Public

Description

Apparently some (but not all) recentchanges rows which we inject into wikibase clients don't have their "id" (referring to wb_changes.change_id on the repo) set:

wikiadmin@10.64.32.115(enwiki)> SELECT rc_params FROM recentchanges WHERE rc_id = 1245837094 LIMIT 1\G
*************************** 1. row ***************************
rc_params: a:1:{s:20:"wikibase-repo-change";a:14:{s:2:"id";N;s:9:"object_id";s:8:"Q5538841";s:4:"type";s:20:"wikibase-item~update";s:11:"revision_id";s:10:"1147155130";s:7:"user_id";s:5:"95338";s:4:"time";s:14:"20200331142533";s:11:"entity_type";s:4:"item";s:7:"page_id";i:5299020;s:9:"parent_id";i:1034014476;s:7:"comment";s:120:"/* wbsetreference-add:2| */ [[Property:P7859]]: lccn-n87895055, [[:toollabs:quickstatements/#/batch/25615|batch #25615]]";s:6:"rev_id";i:1147155130;s:9:"user_text";s:16:"Vladimir Alexiev";s:15:"central_user_id";i:15868969;s:3:"bot";i:0;}}
1 row in set (0.00 sec)
php > print_r(unserialize('a:1:{s:20:"wikibase-repo-change";a:14:{s:2:"id";N;s:9:"object_id";s:8:"Q5538841";s:4:"type";s:20:"wikibase-item~update";s:11:"revision_id";s:10:"1147155130";s:7:"user_id";s:5:"95338";s:4:"time";s:14:"20200331142533";s:11:"entity_type";s:4:"item";s:7:"page_id";i:5299020;s:9:"parent_id";i:1034014476;s:7:"comment";s:120:"/* wbsetreference-add:2| */ [[Property:P7859]]: lccn-n87895055, [[:toollabs:quickstatements/#/batch/25615|batch #25615]]";s:6:"rev_id";i:1147155130;s:9:"user_text";s:16:"Vladimir Alexiev";s:15:"central_user_id";i:15868969;s:3:"bot";i:0;}}'));
Array
(
    [wikibase-repo-change] => Array
        (
            [id] => 
            [object_id] => Q5538841
            [type] => wikibase-item~update
            [revision_id] => 1147155130
            [user_id] => 95338
            [time] => 20200331142533
            [entity_type] => item
            [page_id] => 5299020
            [parent_id] => 1034014476
            [comment] => /* wbsetreference-add:2| */ [[Property:P7859]]: lccn-n87895055, [[:toollabs:quickstatements/#/batch/25615|batch #25615]]
            [rev_id] => 1147155130
            [user_text] => Vladimir Alexiev
            [central_user_id] => 15868969
            [bot] => 0
        )

)
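For anyone who wants to inspect such rows without a PHP shell, here is a minimal sketch of the same check in Python. It is only an illustration: the `parse_php` and `change_id` helpers are hypothetical names, the parser handles just the value types that show up in this dump (arrays, strings, integers, booleans, floats, null), and `SAMPLE` is a shortened version of the row above, not the full value.

```python
def parse_php(data: bytes, pos: int = 0):
    """Parse one PHP-serialized value at pos; return (value, next_pos)."""
    tag = data[pos:pos + 1]
    if tag == b"N":                          # N;
        return None, pos + 2
    if tag in (b"i", b"b", b"d"):            # i:123;  b:1;  d:0.5;
        end = data.index(b";", pos)
        raw = data[pos + 2:end]
        if tag == b"i":
            return int(raw), end + 1
        if tag == b"b":
            return raw == b"1", end + 1
        return float(raw), end + 1
    if tag == b"s":                          # s:<byte length>:"<bytes>";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                    # skip the :" after the length
        end = start + length
        return data[start:end].decode("utf-8"), end + 2  # skip the closing ";
    if tag == b"a":                          # a:<count>:{<key><value>...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                      # skip the :{ after the count
        result = {}
        for _ in range(count):
            key, pos = parse_php(data, pos)
            value, pos = parse_php(data, pos)
            result[key] = value
        return result, pos + 1               # skip the closing }
    raise ValueError(f"unsupported type tag {tag!r} at offset {pos}")


def change_id(rc_params: str):
    """Return the 'id' stored under 'wikibase-repo-change' (None if null)."""
    value, _ = parse_php(rc_params.encode("utf-8"))
    return value["wikibase-repo-change"]["id"]


# Shortened sample based on the row above (only three of the 14 keys kept).
SAMPLE = (
    'a:1:{s:20:"wikibase-repo-change";a:3:{s:2:"id";N;'
    's:9:"object_id";s:8:"Q5538841";s:7:"page_id";i:5299020;}}'
)

print(change_id(SAMPLE))
```

String lengths in the serialized format are byte counts, which is why the sketch works on bytes and only decodes each string after slicing it.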

AFAICT we don't use this information anywhere, so one course of action could be to simply remove the reference.

Event Timeline

This could happen when an insert to wb_changes fails?
Not sure how much this is worth digging into or even trying to remove?
How hard do you think it would be to remove? I guess this would only be removal of the id from the data stored in the recentchanges row?

This could happen when an insert to wb_changes fails?

No idea, but that might be the cause. No idea if this is even happening still.

Not sure how much this is worth digging into or even trying to remove?

Not much IMO… but then again, having this around in a broken state could cause trouble later on (e.g. if something starts using that field).

How hard do you think it would be to remove? I guess this would only be removal of the id from the data stored in the recentchanges row?

Trivial, but we need to make sure that nothing reads from this again (the ticket is ~1y old by now).

Quickly looking through https://codesearch.wmcloud.org/search/?q=change_id&i=nope&files=&excludeFiles=&repos= I don't see any usages of this from the recent changes side of things.

And seemingly this does still happen quite a bit:


mysql:research@dbstore1003.eqiad.wmnet [enwiki]> select (rc_params LIKE "%s:2:\"id\";N%") as nullId, COUNT(*) from recentchanges where rc_source = "wb" GROUP BY nullId limit 10;
+--------+----------+
| nullId | COUNT(*) |
+--------+----------+
|      0 |  1080833 |
|      1 |   177750 |
+--------+----------+
2 rows in set (6.029 sec)

So about 14% of the Wikibase-sourced (rc_source = "wb") recent changes rows on enwiki have a null id.
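To make explicit what the LIKE condition in the query above classifies as a null id, here is a rough Python equivalent. This is a sketch only: `has_null_id` and `NULL_ID_MARKER` are hypothetical names, and the two sample strings are shortened, made-up `rc_params` values rather than real rows.

```python
# Substring that the LIKE pattern %s:2:\"id\";N% looks for:
# a 2-byte string key "id" immediately followed by a serialized null.
NULL_ID_MARKER = 's:2:"id";N'


def has_null_id(rc_params: str) -> bool:
    """Mimic the SQL check: true if the serialized params contain a null id."""
    return NULL_ID_MARKER in rc_params


# Shortened, illustrative values (not real rows).
null_row = 'a:1:{s:20:"wikibase-repo-change";a:1:{s:2:"id";N;}}'
set_row = 'a:1:{s:20:"wikibase-repo-change";a:1:{s:2:"id";i:12345;}}'

print(has_null_id(null_row), has_null_id(set_row))
```

Like the SQL version, this is a plain text match, so in principle it would also flag a row whose edit summary happened to contain that exact byte sequence. For a rough count that seems acceptable.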

@hoo could you give the codesearch a once over to see if you see any usages?
If not, I'd say let's remove it to avoid any accidental usages and trim the data a bit.

Michael edited projects, added TestMe, wmde-wikidata-tech; removed [DEPRECATED] wdwb-tech.
Michael subscribed.

Check if this still happens after good parts of the system have been rewritten in Q3 of 2021.

ArthurTaylor renamed this task from Client recentchanges entries sometimes don't have their wb_changes.change_id reference set to [C-DIS][SW] Client recentchanges entries sometimes don't have their wb_changes.change_id reference set. Apr 2 2024, 2:56 PM
ArthurTaylor moved this task from Incoming to [DOT] By Project on the wmde-wikidata-tech board.
Lydia_Pintscher subscribed.

Removing it from WD dev team board as this will need to be handled by WIT.

Still happening, though not quite as frequently as the 14% observed in T248984#6999200 (closer to 1% now, at least on euwiki and enwiki).

mysql:research@dbstore1007.eqiad.wmnet [euwiki]> SELECT rc_params LIKE '%:2:"id";N;%' AS nullid, COUNT(*) FROM recentchanges WHERE rc_source = "wb" GROUP BY nullid;
+--------+----------+
| nullid | COUNT(*) |
+--------+----------+
|      0 |   348514 |
|      1 |     3927 |
+--------+----------+
2 rows in set (0.565 sec)

mysql:research@dbstore1008.eqiad.wmnet [enwiki]> SELECT rc_params LIKE '%:2:"id";N;%' AS nullid, COUNT(*) FROM recentchanges WHERE rc_source = "wb" GROUP BY nullid;
+--------+----------+
| nullid | COUNT(*) |
+--------+----------+
|      0 |  1798027 |
|      1 |    22299 |
+--------+----------+
2 rows in set (33.250 sec)
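For reference, the percentages quoted in this task follow directly from the counts in the query outputs. A small sketch of the arithmetic (the `null_share` helper and the dictionary labels are just illustrative names; the numbers are copied from the outputs in this task):

```python
def null_share(with_id: int, null_id: int) -> float:
    """Fraction of rc_source = "wb" rows whose id is null."""
    return null_id / (with_id + null_id)


# (rows with id set, rows with null id), taken from the query outputs here.
counts = {
    "enwiki, earlier query": (1080833, 177750),
    "euwiki, later query": (348514, 3927),
    "enwiki, later query": (1798027, 22299),
}

shares = {label: null_share(*pair) for label, pair in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

This gives roughly 14.1% for the earlier enwiki query and about 1.1% (euwiki) and 1.2% (enwiki) for the later ones, consistent with the figures mentioned above.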

If nobody else has noticed or complained about this, I’m inclined to agree that we could just drop this field.