Remove invalid emails from the database
Closed, ResolvedPublic

Description

There are a bunch of emails in the user table (user_email) that are totally invalid. For example, two users on aawiki have a user_email of "qwerty". These should just be deleted since they would never pass validation today.
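As a rough illustration of what "would never pass validation" means here, the sketch below checks a string against a simplified pattern loosely modelled on the HTML5 "valid e-mail address" grammar. The pattern and the function name looks_like_email are assumptions made for this example; this is not MediaWiki's actual validator.

```python
import re

# Simplified pattern: a local part, an "@", then one or more dot-separated
# host labels. This is an assumption for illustration only.
_EMAIL_RE = re.compile(
    r"[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+"                  # local part
    r"@"
    r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"          # first host label
    r"(?:\.[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)*"   # further labels
)


def looks_like_email(value: str) -> bool:
    """Return True only if the whole string matches the simplified pattern."""
    return _EMAIL_RE.fullmatch(value) is not None
```

Under this sketch a value such as "qwerty" is rejected because it has no "@" and no host part.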

Event Timeline

Legoktm set the priority of this task to Needs Triage.
Legoktm updated the task description.
Legoktm changed Security from none to None.
Legoktm subscribed.

Can this be done for all wikis (e.g. in update.php), or only on Wikimedia wikis? Are those users also marked emailconfirmed?

Change 177021 had a related patch set uploaded (by Legoktm):
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177021

Can this be done for all wikis (e.g. in update.php), or only on Wikimedia wikis? Are those users also marked emailconfirmed?

I've written the maintenance script so that it is not Wikimedia-specific; any wiki should be able to run it. I don't think adding it to update.php is a good idea, since it has to scan the entire user table, which can be slow.

The users were not confirmed, but the script I was using to collect the email addresses only looks at non-confirmed addresses.

I also found valid-looking emails with leading whitespace. I'm not sure whether it's worth cleaning that up...
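To show why a full scan of the user table is better run as a standalone batched job than inside update.php, here is a minimal sketch of the general approach: walk the table in user_id order, a fixed number of rows at a time, and blank any address that fails a validity check. It uses Python's sqlite3 module rather than MediaWiki's database layer. The function name remove_invalid_emails, the is_valid callback and the batch_size parameter are hypothetical and are not taken from removeInvalidEmails.php.

```python
import sqlite3


def remove_invalid_emails(conn, is_valid, batch_size=500):
    """Blank user_email for rows failing is_valid; return how many were blanked.

    conn is a sqlite3 connection whose "user" table has user_id and
    user_email columns (an assumption made for this sketch).
    """
    last_id = 0
    removed = 0
    while True:
        # Fetch the next batch of non-empty addresses, ordered by primary key.
        rows = conn.execute(
            "SELECT user_id, user_email FROM user "
            "WHERE user_id > ? AND user_email != '' "
            "ORDER BY user_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        last_id = rows[-1][0]
        bad_ids = [(uid,) for uid, email in rows if not is_valid(email)]
        conn.executemany(
            "UPDATE user SET user_email = '' WHERE user_id = ?", bad_ids
        )
        conn.commit()  # commit per batch so each transaction stays small
        removed += len(bad_ids)
    return removed
```

Whether addresses with leading whitespace are removed depends entirely on the callback: a strict is_valid rejects them, while one that strips the value first would keep them.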

Change 177021 merged by jenkins-bot:
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177021

Change 177386 had a related patch set uploaded (by Legoktm):
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177386

Change 177387 had a related patch set uploaded (by Legoktm):
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177387

Change 177387 merged by jenkins-bot:
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177387

Change 177386 merged by jenkins-bot:
Add removeInvalidEmails.php maintenance script

https://gerrit.wikimedia.org/r/177386

Ran the script; it removed a total of 218,598 emails across all wikis.