Page MenuHomePhabricator

log_user_text is missing for some logs in fawiki
Open, Needs TriagePublic

Description

Please look at https://quarry.wmflabs.org/query/20617 for some examples of logs where log_user is there but log_user_text is null (in this case, it should have been 'ماني').

I am not sure if this is caused by the Labs views, or if the actual data in the DB is also having log_user_text as NULL for these rows. Some (but not all) of these are logs that are later revdel'ed so that may have to do with this. For now, I am tagging it both for Clouds Services and for WMF Site Requests, but it should be narrowed down to only one of those two by someone who has access to the DB tables.

Event Timeline

Huji created this task.Mar 10 2019, 2:57 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 10 2019, 2:57 AM

That same query in production returns empty for log_user_text:

root@db1094.eqiad.wmnet[fawiki]> select log_user_text from logging where log_type = 'move' and log_user = 173 and log_user_text <> 'ماني' order by log_id desc limit 100\G
*************************** 1. row ***************************
log_user_text:
*************************** 2. row ***************************
log_user_text:

<snip>

And this is the view for the table:

MariaDB [fawiki_p]> show create table logging\G
*************************** 1. row ***************************
                View: logging
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`viewmaster`@`%` SQL SECURITY DEFINER VIEW `logging` AS select `fawiki`.`logging`.`log_id` AS `log_id`,`fawiki`.`logging`.`log_type` AS `log_type`,if((`fawiki`.`logging`.`log_deleted` & 1),NULL,`fawiki`.`logging`.`log_action`) AS `log_action`,`fawiki`.`logging`.`log_timestamp` AS `log_timestamp`,if((`fawiki`.`logging`.`log_deleted` & 4),NULL,`fawiki`.`logging`.`log_user`) AS `log_user`,if((`fawiki`.`logging`.`log_deleted` & 4),0,`fawiki`.`logging`.`log_actor`) AS `log_actor`,if((`fawiki`.`logging`.`log_deleted` & 1),NULL,`fawiki`.`logging`.`log_namespace`) AS `log_namespace`,if((`fawiki`.`logging`.`log_deleted` & 1),NULL,`fawiki`.`logging`.`log_title`) AS `log_title`,if((`fawiki`.`logging`.`log_deleted` & 2),0,`fawiki`.`logging`.`log_comment_id`) AS `log_comment_id`,if((`fawiki`.`logging`.`log_deleted` & 2),NULL,`fawiki`.`logging`.`log_comment`) AS `log_comment`,if(`fawiki`.`logging`.`log_deleted`,NULL,`fawiki`.`logging`.`log_params`) AS `log_params`,`fawiki`.`logging`.`log_deleted` AS `log_deleted`,if((`fawiki`.`logging`.`log_deleted` & 4),NULL,`fawiki`.`logging`.`log_user_text`) AS `log_user_text`,if((`fawiki`.`logging`.`log_deleted` & 1),NULL,`fawiki`.`logging`.`log_page`) AS `log_page` from `fawiki`.`logging` where (`fawiki`.`logging`.`log_type` in ('abusefilter','articlefeedbackv5','block','campus','close','contentmodel','course','create','delete','eparticle','gather','gblblock','gblrename','gblrights','globalauth','gwtoolset','import','institution','instructor','interwiki','liquidthreads','lock','managetags','massmessage','merge','moodbar','move','mwoauthconsumer','newsletter','newusers','notifytranslators','online','pagelang','pagetranslation','pagetriage-curation','pagetriage-deletion','patrol','protect','renameuser','review','rights','spamblacklist','stable','student','tag','thanks','timedmediahandler','translationreview','upload','usermerge'))
character_set_client: utf8
collation_connection: utf8_general_ci

Great. That means the view is not to blame.

So what do we do now? Should I create a maintenance script to update the logging table and fill in the blanks? Should we investigate if other wikis are affected too?

Marostegui added a subscriber: Umherirrender.

I don't know what is the logic behind that field and how it works so I cannot help further

On revision deletion that field and the user id is set the null in the views for toollabs

The actor is filled, maybe the row was written while the migration was in process

T167246

The actor is filled, maybe the row was written while the migration was in process

Not likely. All the rows in question are timestamped 2009-09-15 or earlier.

MariaDB [fawiki_p]> select max(log_timestamp) from logging where log_type = 'move' and log_user = 173 and log_user_text <> 'ماني'\G
*************************** 1. row ***************************
max(log_timestamp): 20090915081709

MariaDB [fawiki_p]> select max(log_timestamp) from logging where log_user_text = ''\G
*************************** 1. row ***************************
max(log_timestamp): 20090916231541

MariaDB [fawiki_p]> select min(log_timestamp) from logging where log_user_text != ''\G
*************************** 1. row ***************************
min(log_timestamp): 20090917000112

The log_actor was filled in based on log_user.

The log_user_text column was added well after the rest of the logging table, and seems to never have been backfilled. Since it's going away in the near future in favor of log_actor, there's probably little point in it being backfilled now.

Then the maintenance script populateLogUsertext.php maybe missed some rows - T29711 - T189043

Why do you think that? Both of the tasks you linked are closed as "Declined" and there's no indication the maintenance script was ever run.

You are right, so it could be by design that these fields are not filled (because the maintenance script was not running and mediawiki itself is not using the column and not missing data)

Urbanecm added a subscriber: Urbanecm.

I don't see the relevance to site requests.