
Linter seems to be not cleaning up after page deletion
Closed, Resolved, Public

Description

I wanted to clean up lint errors a little, so I downloaded the dump of the table and wrote a script to find the most common templates being used. The second and third most common templates are Template:Db-afc and Template:Db-g13, which are used for speedy deletion. That doesn't make sense: we don't have 60k pages waiting for deletion.
Here is a list:

79871849	55000918	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79871869	44893415	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79871921	55003369	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79871927	55007572	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79871935	55010772	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79871952	55002791	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79908005	55010089	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79908269	55002334	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79908306	54360240	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79931595	33613825	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79943777	52887941	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79984485	53484509	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
79984489	55012278	4	0	43	{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}
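For what it's worth, a tally like that can be sketched in a few lines of Python over the dumped rows; the linter_params JSON (the last field in the rows above) carries the template name. The rows below are a made-up sample in that format, not real dump data.

```python
import json
from collections import Counter

# Made-up sample: just the linter_params JSON field from each dump row.
params = [
    '{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}',
    '{"inTable":true,"name":"p","templateInfo":{"name":"Template:Db-afc"}}',
    '{"name":"div","templateInfo":{"name":"Template:Infobox"}}',
]
counts = Counter(
    json.loads(p).get("templateInfo", {}).get("name", "(no template)")
    for p in params
)
print(counts.most_common(2))  # [('Template:Db-afc', 2), ('Template:Infobox', 1)]
```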

Then I tried to look up the page IDs to find the pages and fix them: none of them exist. Nada.

I did a left-join query: select * from linter left join page on linter_page = page_id where page_namespace is null; and it seems that out of the 17M rows in enwiki, about 300K belong to pages that don't exist anymore. These need to be cleaned up, at least to reduce the load on the database. We should also find the root cause and check whether there is a hook that removes the lint errors after deletion.
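That left join can be reproduced in miniature with SQLite, with the schemas reduced to just the columns the query touches (IDs borrowed from the sample rows above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER);
CREATE TABLE linter (linter_id INTEGER PRIMARY KEY,
                     linter_page INTEGER, linter_cat INTEGER);
INSERT INTO page VALUES (55000918, 4);               -- page still exists
INSERT INTO linter VALUES (79871849, 55000918, 4);
INSERT INTO linter VALUES (79871869, 44893415, 4);   -- page was deleted
""")
# For a deleted page the join produces no page row, so page_namespace is NULL.
orphans = conn.execute("""
    SELECT linter_id FROM linter
    LEFT JOIN page ON linter_page = page_id
    WHERE page_namespace IS NULL
""").fetchall()
print(orphans)  # [(79871869,)]
```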

Event Timeline

There is code in the Linter extension to handle page deletions, and clearly there is a bug there. I did a quick skim of the code just now and nothing obvious popped out. So this will need investigation.

Oh, and this explains the difference between the count and the actual list. E.g. currently there are no errors in self-closed-tag (id=6), but the special page says "Self-closed tags (72 errors)". When you restrict the query to pages that don't exist ("select * from linter left join page on linter_page = page_id where linter_cat = 6 and page_namespace is null;"), exactly 72 results show up :)
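That mismatch can be modeled in miniature (made-up rows, cut-down schemas): the headline count reads the linter table alone, while the listed results join against page and so silently drop the orphans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER);
CREATE TABLE linter (linter_id INTEGER PRIMARY KEY,
                     linter_page INTEGER, linter_cat INTEGER);
INSERT INTO page VALUES (1, 0);          -- one live page
INSERT INTO linter VALUES (10, 1, 6);    -- error on the live page
INSERT INTO linter VALUES (11, 2, 6);    -- pages 2 and 3 were deleted,
INSERT INTO linter VALUES (12, 3, 6);    -- but their errors remain
""")
# Headline count: the linter table alone.
total = conn.execute(
    "SELECT COUNT(*) FROM linter WHERE linter_cat = 6").fetchone()[0]
# Listed results: joined against page, so orphans disappear.
listed = conn.execute("""
    SELECT COUNT(*) FROM linter JOIN page ON linter_page = page_id
    WHERE linter_cat = 6""").fetchone()[0]
print(total, listed)  # 3 1
```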

I will look into this, as I have already attempted to improve the error count results, and this continues that effort in:
https://phabricator.wikimedia.org/T194872

This should probably be merged to the task identified by Arlolra.

Change 752743 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/extensions/Linter@master] Fix lint error updating

https://gerrit.wikimedia.org/r/752743

Change 752743 merged by jenkins-bot:

[mediawiki/extensions/Linter@master] Fix lint error updating

https://gerrit.wikimedia.org/r/752743

The patch should fix the issue going forward but, after it's deployed, we need to clean up the db.

We won't be introducing new cases once this is merged, but there are a lot left that are still there. How do you want to clean those up? Is there anything you want me to do?

Can you run a script to purge all the stale entries from the dbs?

> Can you run a script to purge all the stale entries from the dbs?

But, of course, only after Arlo's linter patch rides this week's train.

Did it reach wmf.17? If not, I suggest backporting it.

It did. I merged it early enough, and I see "commit fc8c39baa50b664a9ed23f8826e17a1a62bb8414 (origin/wmf/1.38.0-wmf.17)" in the repo.

I wrote a script to do the job and I'm running it, so things are a bit cleaner now. I will re-run it once wmf.17 is stable.

Should we revert https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Linter/+/677673 ?

Mentioned in SAL (#wikimedia-operations) [2022-01-12T12:48:30Z] <Amir1> removing orphan lint error reports in all wikis (T298782)

Maybe I'm missing something obvious, but from my queries it looks like all of the discrepancies between the count and the actual report were due to this bug, e.g. T298782#7605462. Usually the difference between the count and the report is just pages that don't exist (as far as I can see).

I see. It certainly sounds like this was the right fix and that was never needed. But how accurate is estimateRowCount? The difference between 0 and MAX_ACCURATE_COUNT doesn't seem so great; it's probably fine to just leave it for the time being.

Another script to run in T298343#7618503. We need to clear old lints from pages with contentmodels that will never be run again.

> I see. It certainly sounds like this was the right fix and that was never needed. But how accurate is estimateRowCount? The difference between 0 and MAX_ACCURATE_COUNT doesn't seem so great; it's probably fine to just leave it for the time being.

Sure. SGTM.

> Another script to run in T298343#7618503. We need to clear old lints from pages with contentmodels that will never be run again.

I wrote the query to run; the problem is that the current script is sort of stuck on Commons. It has been working on it for a full day now and there are still 2M more rows that need to be deleted. I think the issue was bigger on Commons (around 10% of pages). Once that's over, I'll run it.

The cleaner on Commons couldn't move forward because it got super slow after the rollout of inline media added 20M more rows there. I'll skip Commons for now and find a different way to clean that up.

In case it helps, once this week's train rolls out, you should be able to run DELETE FROM linter WHERE linter_cat = 19 on the Commons db to get rid of all the inline media lints.

> In case it helps, once this week's train rolls out, you should be able to run DELETE FROM linter WHERE linter_cat = 19 on the Commons db to get rid of all the inline media lints.

Running DELETE FROM linter WHERE linter_cat = 19 in one statement would make Commons go read-only for half an hour. I need to split it into batches of 10k rows and sleep between batches, so I suggest backporting this patch, if that's okay, so it won't keep adding more and more rows to the table.
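A batched purge along those lines might look like this. This is a hypothetical sketch using SQLite for illustration; the real table lives on MariaDB, and a production run would wait on replication lag rather than a fixed sleep.

```python
import sqlite3
import time

def purge_category(conn, cat, batch_size=10_000, pause=1.0):
    """Delete all linter rows in category `cat` in small batches,
    pausing between batches so replicas can catch up."""
    while True:
        # Delete by rowid so each batch is a bounded, indexed operation.
        cur = conn.execute(
            "DELETE FROM linter WHERE rowid IN "
            "(SELECT rowid FROM linter WHERE linter_cat = ? LIMIT ?)",
            (cat, batch_size),
        )
        conn.commit()
        if cur.rowcount < batch_size:
            break  # last (partial) batch done
        time.sleep(pause)
```

Deleting in bounded batches keeps each transaction short, so the table is never locked long enough to push the wiki into read-only mode.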

Mentioned in SAL (#wikimedia-operations) [2022-01-18T08:20:35Z] <Amir1> cleaning up commons linter errors T298782

This just finished and removed around 30M rows.

Would it be worth optimizing the table? I haven't checked its current size.

> Would it be worth optimizing the table? I haven't checked its current size.

Nah, it's not worth the work:

root@db1143:/srv/sqldata/commonswiki# ls -Ssh | grep -i lint
 4.1G linter.ibd
 4.0K linter.frm

It has been cleaned up everywhere now. I spot-checked some wikis and can confirm that no new orphaned lint errors have shown up.