Purge rights or update contact info for Cloud VPS and Toolforge members with invalid email addresses
Open, Low, Public

Description

The Toolforge Trusty deprecation emails and a recent round of Puppet failure emails, both sent to a large percentage of developer accounts, have heightened awareness of a long-standing issue: we do not have a standard procedure or mechanism to purge rights from accounts held by people we can no longer contact via email. There are at least two different classes of accounts affected by this problem:

  • Former Foundation and affiliate staff who used their work email as the contact address for their developer account
  • Community members who have, for whatever reason, closed or abandoned the email account that they used as the contact address for their developer account

The idea of email address re-validation has been brought up before (T148792). I am not trying to fork or reopen that discussion. I am, however, interested in at least a one-time cleanup and, ideally, in leaving behind some processes and possibly scripts to assist in repeating the cleanup in the future. This could become a periodic cleanup task for the cloud-services-team, similar to their current annual cleanup of orphaned Cloud VPS projects.

Event Timeline

Some former WMF staff with outdated Developer account contact information seen in recent bounce messages:

Long list of LDAP accounts with *@wikimedia.org email addresses that failed validation by the MX server. The list was generated by the script included below.

check_wmf_email.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Copyright (c) 2019 Wikimedia Foundation and contributors
#
# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation, either version 3 of the License, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
# more details.
#
# You should have received a copy of the GNU General Public License along
# with this program.  If not, see <https://www.gnu.org/licenses/>.
import operator
import sys
import time

import ldap3
import validate_email


def ldap_conn():
    """Get an ldap connection

    Return value can be used as a context manager
    """
    servers = ldap3.ServerPool([
        ldap3.Server('ldap-labs.eqiad.wikimedia.org'),
        ldap3.Server('ldap-labs.codfw.wikimedia.org'),
    ], ldap3.ROUND_ROBIN, active=True, exhaust=True)
    return ldap3.Connection(
        servers, read_only=True, auto_bind=True)


def get_wmf_users():
    """Get uid, cn, and mail for each posixAccount with a wikimedia.org email"""
    data = []
    with ldap_conn() as conn:
        results = conn.extend.standard.paged_search(
            'ou=people,dc=wikimedia,dc=org',
            '(&(objectClass=posixAccount)(mail=*@wikimedia.org))',
            ldap3.SUBTREE,
            attributes=['uid', 'cn', 'mail'],
            paged_size=100,
            time_limit=5,
            generator=True,
        )
        for resp in results:
            attribs = resp.get('attributes')
            # LDAP attributes come back as a dict of lists. We know that
            # there is only one value for each list, so unwrap it
            data.append({
                'uid': attribs['uid'][0],
                'cn': attribs['cn'][0],
                'mail': attribs['mail'][0],
            })
    return data


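def is_valid(email):
    """Ask the MX server for the email's domain whether the address validates"""
    if '+' in email:
        # Strip out sorting tags, these don't seem to validate well
        # rsplit so that only the last '@' separates local part from host
        local, host = email.rsplit('@', 1)
        user = local.split('+', 1)[0]
        email = "{}@{}".format(user, host)
    return validate_email.validate_email(email, verify=True)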


def main():
    for r in sorted(get_wmf_users(), key=operator.itemgetter('uid')):
        if not is_valid(r['mail']):
            print(
                "* [[{osb}/{uid}|{uid}]] ([[{wt}:{cn}|{cn}]])".format(
                    osb="https://tools.wmflabs.org/openstack-browser/user",
                    wt="https://wikitech.wikimedia.org/wiki/User",
                    **r))
        else:
            print("OK: {}".format(r['mail']), file=sys.stderr)
        time.sleep(0.5)


if __name__ == "__main__":
    main()
requirements.txt
ldap3
py3dns
validate_email
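
The tag-stripping step in is_valid() can be checked on its own, without LDAP or SMTP access. The following is a minimal sketch of that step only; the helper name strip_sorting_tag and the example.org addresses are made up for illustration and are not part of the script above.

```python
def strip_sorting_tag(email):
    """Return email with any '+tag' suffix removed from the local part.

    Hypothetical helper: mirrors the normalization done inside is_valid()
    before the address is handed to the validator.
    """
    # Split on the last '@' so only the host part is separated off
    local, host = email.rsplit('@', 1)
    # Keep everything before the first '+'; untagged addresses pass through
    user = local.split('+', 1)[0]
    return "{}@{}".format(user, host)


if __name__ == "__main__":
    # Illustrative addresses only
    for addr in ("someone+phab@example.org", "someone@example.org"):
        print("{} -> {}".format(addr, strip_sorting_tag(addr)))
```

Keeping the normalization in a separate function like this would make it possible to test the string handling offline, while the network-dependent validation stays in is_valid().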

WMCS meeting discussion: users can re-request access if they want and this can be undone, so let's do it!

Mentioned in SAL (#wikimedia-cloud) [2024-01-10T17:43:52Z] <bd808> Blocking Developer accounts connected to invalid/legacy wikimedia.org email addresses (T218239)

The block list used here is now documented in P54718