Page MenuHomePhabricator

flaggedrevs_promote has entries for non existent users
Closed, InvalidPublic

Description

For enwikibooks, there are 62 rows in the flaggedrevs_promote table with frp_user_id not corresponding to a user_id

MariaDB [enwikibooks_p]> SELECT frp_user_id FROM flaggedrevs_promote WHERE NOT EXISTS (SELECT 1 FROM user WHERE user_id = frp_user_id);
+-------------+
| frp_user_id |
+-------------+
|           0 |
|       88200 |
|       96685 |
|      102733 |
|      104520 |
|      107208 |
|      107242 |
|      107472 |
|      108150 |
|      108962 |
|      109258 |
|      109849 |
|      110692 |
|      110748 |
|      111301 |
|      111316 |
|      111795 |
|      113482 |
|      113483 |
|      118718 |
|      120131 |
|      120476 |
|      121384 |
|      122175 |
|      126728 |
|      126838 |
|      127694 |
|      128142 |
|      129596 |
|      131394 |
|      131644 |
|      131646 |
|      133780 |
|      136941 |
|      138262 |
|      138355 |
|      140028 |
|      144573 |
|      144905 |
|      147418 |
|      148579 |
|      149211 |
|      149611 |
|      149638 |
|      150395 |
|      150433 |
|      157351 |
|      158209 |
|      158213 |
|      158946 |
|      159503 |
|      375939 |
|     3031471 |
|     3032881 |
|     3038270 |
|     3135571 |
|     3158043 |
|     3186866 |
|     3201800 |
|     3216990 |
|     3253559 |
|     3306387 |
+-------------+
62 rows in set (8.91 sec)

Iterating over all wikis with flagged revs shows this problem exists on other wikis as well

dannys712@tools-sgebastion-07:~$ while read wiki; do echo "SELECT COUNT(frp_user_id) AS 'count' FROM flaggedrevs_promote WHERE NOT EXISTS (SELECT 1 FROM user WHERE user_id = frp_user_id);" | sql $wiki | sed "s/^/$wiki\t/g"; done < <(curl https://noc.wikimedia.org/conf/dblists/flaggedrevs.dblist)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   595  100   595    0     0   2012      0 --:--:-- --:--:-- --:--:--  2500
Could not find requested database
Make sure to ask for a db in format of <wiki>_p
alswiki count
alswiki 23
arwiki  count
arwiki  263
bewiki  count
bewiki  10
bnwiki  count
bnwiki  8
bswiki  count
bswiki  2
cawikinews      count
cawikinews      0
cewiki  count
cewiki  0
ckbwiki count
ckbwiki 0
dewiki  count
dewiki  783
dewikiquote     count
dewikiquote     1
dewiktionary    count
dewiktionary    11
elwikinews      count
elwikinews      0
enwiki  count
enwiki  516
enwikibooks     count
enwikibooks     62
enwikinews      count
enwikinews      61
eowiki  count
eowiki  20
eswikinews      count
eswikinews      17
fawiki  count
fawiki  5
fawikinews      count
fawikinews      0
fiwiki  count
fiwiki  2
frwikinews      count
frwikinews      7
hewikisource    count
hewikisource    4
hiwiki  count
hiwiki  1
huwiki  count
huwiki  34
iawiki  count
iawiki  4
idwiki  count
idwiki  1
iswiktionary    count
iswiktionary    4
kawiki  count
kawiki  12
lawikisource    count
lawikisource    0
mkwiki  count
mkwiki  0
plwiki  count
plwiki  63
plwikisource    count
plwikisource    0
plwiktionary    count
plwiktionary    1
ptwikibooks     count
ptwikibooks     0
ptwikinews      count
ptwikinews      6
ptwikisource    count
ptwikisource    7
ruwiki  count
ruwiki  114
ruwikinews      count
ruwikinews      0
ruwikiquote     count
ruwikiquote     4
ruwikisource    count
ruwikisource    2
ruwiktionary    count
ruwiktionary    0
sqwiki  count
sqwiki  0
tawikinews      count
tawikinews      0
test2wiki       count
test2wiki       6
trwiki  count
trwiki  95
trwikiquote     count
trwikiquote     0
ukwiki  count
ukwiki  0
ukwiktionary    count
ukwiktionary    3
vecwiki count
vecwiki 0
zh_classicalwiki        count
zh_classicalwiki        13

Standing out: 783 rows on dewiki, 516 on enwiki, 263 on arwiki, 114 on ruwiki.

Should these rows be deleted? How did they come to exist in the first place (I guess 0 comes from an anonymous users, but the others?) and how can it be prevented?

Event Timeline

Restricted Application added subscribers: Huji, Base, Aklapper. · View Herald Transcript
Marostegui subscribed.

Removing the  DBA project for now, as there is nothing for us here so far. We don't delete things directly from the DB, we prefer those to be done via a script from MW maintenance servers.
I am staying subscribed to the task in case I can be of any help

Investigated little bit. I suspect that there has not been any edits for those users, but i cant confirm this edits could be deleted with oversight too. However, user_id 310454 was referenced from abuser_filter_log table and second user_id 318872 from logging table.

MariaDB [fiwiki_p]> SELECT frp_user_id FROM flaggedrevs_promote WHERE NOT EXISTS (SELECT 1 FROM user WHERE user_id = frp_user_id);
+-------------+
| frp_user_id |
+-------------+
|      310454 |
|      318872 |
+-------------+

MariaDB [fiwiki_p]>  SELECT actor_id, actor_name, actor_user, user_id, user_name  FROM actor LEFT JOIN user ON user_id=actor_user  WHERE actor_user IN (310454, 318872) ;
+----------+---------------------------+------------+---------+-----------+
| actor_id | actor_name                | actor_user | user_id | user_name |
+----------+---------------------------+------------+---------+-----------+
|   334736 | Britta.Parbel-Antisemitin |     310454 |    NULL | NULL      |
|   343104 | Shoot all Muslims         |     318872 |    NULL | NULL      |
+----------+---------------------------+------------+---------+-----------+

And log entries for those users

https://fi.wikipedia.org/wiki/Toiminnot:Poistetut_muokkaukset/Britta.Parbel-Antisemitin?uselang=en

This user is currently blocked. The latest block log entry is provided below for reference:
* (change visibility) 12:58, 3 March 2016 Henswick talk contribs block (log details removed) (Sopimaton käyttäjätunnus: + levittää eri wikeihin kunnianloukkauksia eräästä de-wikin käyttäjästä)

https://fi.wikipedia.org/wiki/Toiminnot:Loki/Shoot_all_Muslims?uselang=en

20:54, 12 June 2016 (username removed) (log details removed)

@Marostegui can you check if problem exists in the production db too or just that we cant see those users in labs database because of sanitization?

I looked at this while trying to explain autopromotion bugs reported in T237191.

@Marostegui can you check if problem exists in the production db too or just that we cant see those users in labs database because of sanitization?

I tried running the queries on analytics replicas, for enwikibooks and fiwiki:

  • on enwikibooks it only returns the row with frp_user_id=0
  • on fiwiki it returns nothing

So I think your suspicion about oversighted data is correct. There are no entries for non existent users (except 0), those users are not visible on labs because of sanitization.

It is a bug, but it doesn't seem worth the effort to look into it. Most likely it was a bug and it has been fixed years ago (hence why the row only appears on some wikis).

For example, commit rEFLR6742fd341e30: Added 'totalCheckedEdits' param (from 2009) added a check for anon users in a function called autoPromoteUser(). I don't know if that was actually the fix, it's just what I was able to find quickly with some blaming.