Remove invalid emails from the database
Closed, ResolvedPublic

Description

There are a bunch of emails in the user table (user_email) that are totally invalid. For example, two users on aawiki have a user_email of "qwerty". These should just be deleted since they would never pass validation today.
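To illustrate why a value like "qwerty" would be rejected, here is a minimal sketch of a syntactic check. The pattern and the function name `looks_valid` are assumptions made for this example; this is a simplified stand-in, not MediaWiki's actual validation code.

```python
import re

# Simplified pattern, assumed for illustration only:
# a local part, an "@", then one or more dot-separated domain labels.
_LOCAL = r"[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+"
_DOMAIN = r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*"
EMAIL_RE = re.compile(_LOCAL + "@" + _DOMAIN)


def looks_valid(address: str) -> bool:
    """Return True if the whole string matches the simplified pattern."""
    return EMAIL_RE.fullmatch(address) is not None


if __name__ == "__main__":
    for candidate in ["qwerty", "user@example.org", " user@example.org"]:
        print(repr(candidate), looks_valid(candidate))
```

Because the check uses `fullmatch`, an address with leading whitespace also fails, even though it looks valid once trimmed.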

Event Timeline

Legoktm created this task. Dec 2 2014, 8:06 PM
Legoktm set the priority of this task to Needs Triage.
Legoktm updated the task description.
Legoktm changed Security from none to None.
Legoktm added a subscriber: Legoktm.
Nemo_bis added a subscriber: Nemo_bis. Dec 2 2014, 8:36 PM (edited)

Can this be done for all wikis (e.g. in update.php), or only on Wikimedia wikis? Are those users also marked emailconfirmed?

Change 177021 had a related patch set uploaded (by Legoktm):
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177021

Patch-For-Review

> Can this be done for all wikis (e.g. in update.php), or only on Wikimedia wikis? Are those users also marked emailconfirmed?

I've written the maintenance script so that it is not Wikimedia-specific; any wiki should be able to run it. I don't think adding it to update.php is a good idea, since it has to scan the entire user table, which can be slow.

The users were not confirmed, but the script I was using to collect the email addresses only looks at non-confirmed addresses.

I also found valid-looking emails with leading whitespace. I'm not sure whether it's worth cleaning that up...
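For readers who want to see the general shape of such a cleanup, the following is a rough sketch of a batched scan that walks the table by primary key, so no single query touches every row at once. It is not the code in removeInvalidEmails.php: it uses an in-memory SQLite table as a stand-in for the user table, and the function names, the batch size, and the deliberately loose pattern are all assumptions. Addresses that become valid once whitespace is stripped are only reported, not changed, since whether to clean those up is left open above.

```python
import re
import sqlite3

# Deliberately loose pattern, assumed for illustration only.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+")


def clear_invalid_emails(conn, batch_size=2):
    """Walk the table in primary-key order and blank out invalid addresses.

    Returns (number of addresses cleared, ids of whitespace-padded addresses).
    """
    last_id = 0
    cleared = 0
    padded = []  # valid only after stripping whitespace; left untouched
    while True:
        rows = conn.execute(
            "SELECT user_id, user_email FROM user "
            "WHERE user_id > ? AND user_email != '' "
            "ORDER BY user_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for user_id, email in rows:
            if EMAIL_RE.fullmatch(email):
                continue
            if EMAIL_RE.fullmatch(email.strip()):
                padded.append(user_id)
                continue
            conn.execute(
                "UPDATE user SET user_email = '' WHERE user_id = ?", (user_id,)
            )
            cleared += 1
        conn.commit()  # commit once per batch to keep transactions short
        last_id = rows[-1][0]
    return cleared, padded


def demo():
    """Build a tiny stand-in table and run the cleanup against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_email TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO user (user_email) VALUES (?)",
        [("qwerty",), ("a@example.org",), (" b@example.org",), ("",), ("qwerty",)],
    )
    result = clear_invalid_emails(conn)
    remaining = [
        row[0]
        for row in conn.execute("SELECT user_email FROM user ORDER BY user_id")
    ]
    return result, remaining


if __name__ == "__main__":
    print(demo())
```

Paging on `user_id > last_id` rather than using an offset keeps each batch query cheap regardless of how far into the table the scan has progressed.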

aaron reassigned this task from Kunalgrover05 to Legoktm.

Change 177021 merged by jenkins-bot:
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177021

Change 177386 had a related patch set uploaded (by Legoktm):
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177386

Patch-For-Review

Change 177387 had a related patch set uploaded (by Legoktm):
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177387

Patch-For-Review

Change 177387 merged by jenkins-bot:
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177387

Change 177386 merged by jenkins-bot:
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177386

Legoktm closed this task as Resolved. Dec 4 2014, 9:20 PM

Ran the script; it removed a total of 218,598 emails across all wikis.

bd808 moved this task from Done to Archive on the MediaWiki-Core-Team board. Dec 8 2014, 10:54 PM