Page MenuHomePhabricator

afl_log_id is never written to
Closed, ResolvedPublic

Description

Investigation of T204126 reveals that nothing has ever been inserted into afl_log_id field.

rEABFd19ced4cef8d: Filter AbuseLog by the "impact" of the change didn't assume it was unused, so the filter is quite useless.

Details

Related Gerrit Patches:
mediawiki/extensions/AbuseFilter : masterReally drop afl_log_id from update.php
mediawiki/extensions/AbuseFilter : masterDrop afl_log_id

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 24 2019, 3:47 PM
Daimona added a subscriber: Daimona.

Cool! Introduced with rEABF5e4289ce4e240f5a91e7c6fd68bab32b8fe7074c, I see, but never used. Its counterpart, afl_rev_id, is used to show the diff link to successfully saved revisions. I guess afl_log_id was meant to be a foreign key to logging.log_id, but I cannot get why. Was it meant to show "a diff to a log entry"? Lol. Some copypasta from core tables which hold bot *_rev_id and *_log_id?

Daimona moved this task from Backlog to Future on the User-Daimona board.Jan 24 2019, 3:54 PM

Was it meant to show "a diff to a log entry"?

This is how I imagined it. But this remark in the linked commit is interesting:

Store the revision ID associated with a log entry if the action is successful

This is how I imagined it. But this remark in the linked commit is interesting:

Store the revision ID associated with a log entry if the action is successful

True. That comment leads in fact to think that it was meant to show a sort of "log diff". Probably that was the idea, then Andrew forgot to use the column, or changed his mind but forgot to remove it. Another possibility is that the column is there to make the table more similar to other core tables. See for instance rEABF1ee749a966714a1c82d45520e7a025c847a3ed86: afl_patrolled_by is unused as well (although in this case we're sure that it was meant to be unused).

Huji added a subscriber: Huji.Jan 24 2019, 7:48 PM

Entering this discussion abruptly, so I'm not sure how useful my comment is, but: please note that unlike edits (which are all in one place), logs are stored in different places, and it is wrong to assume that this field is meant to point to a logging.log_id as, to the extent I understand it, logs that are in other tables can also trigger filters. At least in theory. Unless I am totally misunderstanding the code.

I'm unsure whether logs from other tables could be related to AbuseLog entries, but I highly doubt that we could do something useful with values from this column.

Alright, so I think this afl_log_id field should be removed altogether. Not only because we don't use it, but it makes queries slower. For instance, I see the following logged as slow on tendril for commonswiki:

SELECT * FROM `abuse_filter_log` LEFT JOIN `abuse_filter` ON ((af_id=afl_filter)) WHERE (NOT ( afl_rev_id IS NULL AND afl_log_id IS NULL )) AND afl_deleted = '0' ORDER BY afl_timestamp DESC LIMIT 51

Running it with an explain on quarry says

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	abuse_filter_log	index_merge	afl_rev_id,afl_log_id	afl_rev_id,afl_log_id	5,5		363569	Using sort_union(afl_rev_id,afl_log_id); Using where; Using filesort
1	SIMPLE	abuse_filter	eq_ref	PRIMARY	PRIMARY	8	commonswiki.abuse_filter_log.afl_filter	1	Using where

So it's scanning 363569 rows + a filesort. Removing the afl_log_id part, which is unused anyway, is much faster. I don't have an explain from quarry because the query is too fast, but this one is from my local wiki (where the query above scans 3055 rows):

1	SIMPLE	abuse_filter_log	index	afl_rev_id	afl_timestamp	14	NULL	91	Using where	
1	SIMPLE	abuse_filter	eq_ref	PRIMARY	PRIMARY	8	my_wiki.abuse_filter_log.afl_filter	1	Using where

And this is instantaneous.

The reason which makes it happen is also ovious: afl_log_id is not indexed. So, given that it's been unused for years, we can safely go ahead and free some DB space. We can always re-add the field in the future if we find the need for it, and possibly with the right indexes.

Change 499770 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Drop afl_log_id

https://gerrit.wikimedia.org/r/499770

Daimona claimed this task.Mar 28 2019, 1:56 PM
Daimona moved this task from Future to Under review on the User-Daimona board.

Change 499770 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Drop afl_log_id

https://gerrit.wikimedia.org/r/499770

Change 521474 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Really drop afl_log_id from update.php

https://gerrit.wikimedia.org/r/521474

Change 521474 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Really drop afl_log_id from update.php

https://gerrit.wikimedia.org/r/521474

Daimona moved this task from Under review to Done on the User-Daimona board.Aug 10 2019, 2:36 PM