Send "are you there?" email to Toolforge members every 3 months to revalidate email address
Open, Low · Public

Description

While sending invitations to the 2016 Tool Labs survey, @bd808 captured quite a few bounces for non-deliverable emails. It is important that Toolforge administrators be able to contact Toolforge maintainers, so having bad email addresses on file is not great.

We could pretty easily set up a process that would send an email to each Toolforge maintainer periodically (every 3 months seems reasonable) and then flag or even disable their accounts if the ping was not responded to in a reasonable amount of time (3 weeks?). Bonus points for also leaving a talk page message at the same time in case the user is watching that but has lost access to their registered email address. Once we have more users using Striker we could potentially also leave talk page messages on their SUL home wiki.
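As a rough illustration of the schedule described above, the decision for a single maintainer could look like the following sketch. The `Maintainer` record, its field names, and the action labels are hypothetical and not taken from Striker or any existing Toolforge code; only the 3 month interval and 3 week grace period come from the proposal.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

PING_INTERVAL = timedelta(days=90)  # "every 3 months" from the proposal
GRACE_PERIOD = timedelta(days=21)   # "3 weeks?" from the proposal


@dataclass
class Maintainer:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    last_confirmed: date              # last time the address was confirmed
    ping_sent: Optional[date] = None  # date of the most recent ping, if any


def next_action(m: Maintainer, today: date) -> str:
    """Return 'ok', 'send_ping', 'waiting', or 'flag' for one maintainer."""
    if m.ping_sent is not None and m.ping_sent > m.last_confirmed:
        # A ping is outstanding and has not been answered yet.
        if today - m.ping_sent > GRACE_PERIOD:
            return "flag"
        return "waiting"
    if today - m.last_confirmed >= PING_INTERVAL:
        return "send_ping"
    return "ok"
```

Whether "flag" should mean a note on the account, a talk page message, or a block is left open here, as it is in the proposal.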

Is this the worst/most corporate idea ever or something that would be ok if only somebody found time to do it?

Event Timeline

Looks like yet another annoying thing to me…

Looks like yet another annoying thing to me…

Do you have other suggestions for keeping contact information current?

I would rather not just let things go until there is some event like a downed service that leads to trying to contact the maintainers, only to find out that we can't. I understand that many/most/all tool maintainers are busy people, but a couple of mouse clicks every 3 months doesn't seem like an outrageous burden. I think that toolserver actually made you ssh in once a month to keep your account active, didn't it?

I'm quite sure it didn't, that it was more like once a year (and yet, every time some people failed).

It was every six months and happened via a login/script. Going to 4 times a year would be kind of obnoxious for most. I have things configured so it is mostly hands off for a reason.

I (and probably others) would probably create an email filter to flag it all as spam... This might impact the normal notification emails, as email servers may auto-flag them as spam as well, because they come from the same origin as the "are you there?" email.

The approach used by npmjs package email-existence looks interesting, although I have no idea how well it works.

I (and probably others) would probably create an email filter to flag it all as spam... This might impact the normal notification emails, as email servers may auto-flag them as spam as well, because they come from the same origin as the "are you there?" email.

The mere thought of getting 4 emails per year asking you to click a link is worthy of plans to set up special message filters, and of a forecast that so many people would do so that emails sent from the wikis would generally be degraded? The latter suggestion sounds like a FUD tactic to me.

The approach used by npmjs package email-existence looks interesting, although I have no idea how well it works.

It has been several years since I researched the topic of email address validity probing in depth. The last time I looked deeply at it I found that it was not a universally effective tactic. Unfortunately that research was done as work for hire for a private company, so I don't have a link to point to giving a detailed analysis of the challenges. Fundamentally, however, I found it to be highly dependent on the particular mail service being checked. I just did a quick check and found that gmail does seem to give a negative acknowledgment while yahoo does not.

# gmail
250 2.1.0 OK t187si3813593wmb.66 - gsmtp
rcpt to: <pretty.sure.invalid.bd808@gmail.com>
550-5.1.1 The email account that you tried to reach does not exist. Please try
550-5.1.1 double-checking the recipient's email address for typos or
550-5.1.1 unnecessary spaces. Learn more at
550 5.1.1  https://support.google.com/mail/?p=NoSuchUser t187si3813593wmb.66 - gsmtp

# yahoo
rcpt to: <invalidbd808@yahoo.com>
250 recipient <invalidbd808@yahoo.com> ok
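The two transcripts above suggest why probing is only partly useful: a rejection is informative, an acceptance is not. A minimal sketch of how a probe might interpret the reply to `RCPT TO` follows. The function name and its return labels are made up for illustration. It only interprets a reply code and text and does not open any connection; the `5.1.1` check relies on the enhanced status code for "bad destination mailbox address" that appears in the gmail reply.

```python
import re


def classify_rcpt_reply(code: int, text: str = "") -> str:
    """Interpret an SMTP server's reply to RCPT TO.

    'no_such_mailbox' -> 5xx reply carrying enhanced status 5.1.1
    'rejected_other'  -> 5xx reply for some other reason (policy, block, ...)
    'deferred'        -> 4xx temporary failure, worth retrying later
    'inconclusive'    -> anything else, including 2xx: the server accepted
                         the recipient, which proves nothing because some
                         services accept every address at this stage
    """
    if 500 <= code < 600:
        if re.search(r"\b5\.1\.1\b", text):
            return "no_such_mailbox"
        return "rejected_other"
    if 400 <= code < 500:
        return "deferred"
    return "inconclusive"
```

With the replies shown above, the gmail `550 5.1.1` response would classify as `no_such_mailbox`, while the yahoo `250` response would classify as `inconclusive`.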

-1. I think it would be an unnecessary burden for tool maintainers while gaining no real value. Just because someone clicked on a link in a robot mail two months ago does not mean they will happily reply to or act upon a specific request by administrators. Especially with the abandoned tool policy and the right to fork policy in place I don't see why membership in Toolforge should require regular "attendance".

I wouldn't mind if there was a trimonthly newsletter by mail that provided useful information ("you can now do X!", "did you know that you can do Y?", "here is an example of a beautifully engineered tool") for users not subscribed to labs-announce/labs-l and kept score of bounces, but what would a "registered bounce" achieve? If subsequently there was need for (important) individual communication, would the administrators not even try again?

scfc triaged this task as Low priority. Feb 16 2017, 10:17 PM
scfc moved this task from Backlog to Ready to be worked on on the Toolforge board.

If I had to frame this in a vague and philosophical-looking statement I would say the benefit to the commons comes before individual convenience. In other words, clicking an email every 3 months gives us, the platform maintainers, useful information to try and keep cruft out of a shared platform with limited resources. It's not responsible to leave unused allocated resources (both computational and human) in a public shared platform when others could be benefiting from them. Each tool adds a small overhead to maintenance, and decreasing that overhead allows us to do more.

I support this proposal and suggest that it be adopted on a per tool/project basis instead of just validating users individually. If someone confirms their tool/project is active, their account is automatically validated for the period.

Yes, there are dozens of different ways to collect signals indicating a tool is active or not. However, back to the maintainers of a public platform, development time is not free and sending an email every X months is way less costly than developing a semi-AI platform for such purposes.

It is not uncommon for community members to disappear for, say, a few months. Say one tool has only one or two maintainers listed (btw, let's face it: even for tools with lots of maintainers listed, usually only a few know all the internals), and you receive no response after 3 months. What do you do? Stop all the tool's jobs to free resources? What if the tool is one the community depends on heavily, like flickr2commons or gpsexif? (Come on, even these two are single-maintainer tools.) And then nobody can invoke the abandoned tool policy until the tool has been down for 2 weeks, so you get at least 2 weeks of downtime, assuming there is someone who is willing to take over the maintenance, all for some arbitrary maintainer-activity criteria we imposed.

Honestly, I don't see any significant cost in keeping a well-behaved tool alive. If we actually have some backwards-incompatible changes (e.g. a while back, in the db replicas rebuild, the replicas stopped accepting user databases and joins between user databases and replicas broke), then indeed we can sacrifice those tools that are unmaintained, because we have to. If some tool is abusing resources then we sacrifice it because we have to. If the tool goes down as a result then yes, that is exactly what the abandoned tools policy is designed to take care of, by providing the community a means to take over and make the appropriate changes so the community can rely on the tool again. But asking community members to click a link in a spam mail at some arbitrary frequency we impose, to keep their tool alive, is ridiculous.

I would like to second what @scfc said:

-1. I think it would be an unnecessary burden for tool maintainers while gaining no real value. Just because someone clicked on a link in a robot mail two months ago does not mean they will happily reply to or act upon a specific request by administrators. [...]
I don't see why membership in Toolforge should require regular "attendance". [...]
what would a "registered bounce" achieve? If subsequently there was need for (important) individual communication, would the administrators not even try again?

Spam is unsolicited email from a business entity which you have no relationship with. If you are maintaining a tool, you have a relationship and I wouldn't consider it spam.

Say, someone doesn't reply to 2 requests in a row (so 6 months), add this project to a queue and try to engage with them in a different way.
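A sketch of that escalation rule, assuming one ping per period per project; the dictionary of counters, the queue, and the function name are hypothetical and only illustrate the "2 requests in a row" idea.

```python
from collections import deque

MISSED_LIMIT = 2  # "2 requests in a row (so 6 months)"


def record_ping_result(missed: dict, queue: deque, tool: str, replied: bool) -> None:
    """Update the missed-ping counter for a tool and queue it for follow-up.

    `missed` maps tool name -> consecutive unanswered pings.
    `queue` holds tools that need a different kind of outreach.
    """
    if replied:
        missed[tool] = 0  # any reply resets the streak
        return
    missed[tool] = missed.get(tool, 0) + 1
    if missed[tool] >= MISSED_LIMIT and tool not in queue:
        queue.append(tool)
```

Nothing is stopped or deleted here; landing in the queue only means a human should try another way of reaching the maintainers.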

My point is, we can't have the platform accumulating cruft endlessly.

If the tool goes down as a result then yes that is exactly what abandoned tools policy is designed to take care of.

Except it doesn't. Someone has to be interested in maintaining it. If nobody is interested, it stays there broken, taking up management resources.

We're talking about rational management of resources in a public platform. We're not asking for anything but a simple "hey, this project is alive".

I don't think any of what is being discussed here is ridiculous. That is dismissing other people's opinions and unnecessarily escalating the situation.

I just want to clarify that I understand the need to keep good/working tools up & running and that the community may depend heavily on them. I don't want to take tools down needlessly.

The situations I'm focusing on are tools that simply crash, don't recover and nobody cares. Or tools that become broken and can't be unbroken with operational actions alone (maybe the ecosystem evolved and someone needs to fix the code). Additionally, there are security concerns with tools that go unmaintained, because it's rare that a piece of code is "finished" and doesn't need to be touched ever again (it's more like a garden in constant need of periodic attention).

So I understand the points being made and would like to reiterate I would be very surprised if we took an automatic/robotic approach of taking tools down without proper warning and sufficient attempts to engage with maintainers. What's more likely is that this policy would catch lots of broken/unused tools and fewer (but not zero) tools that are widely used but lack maintainers.

If the tool is broken (no jobs, or all jobs are in a crash loop, infinite sleep, infinite loop, or otherwise provide no service) for a while (say, a year), nobody is willing to take over the maintenance, and nobody from the communities is in any way still using the tool, then yes it's cruft. As you said, they are "tools that simply crash, don't recover and nobody cares". It has nothing to do with whether the maintainer is reachable via email.

If there are security issues with tools, we go with T182341 route: contact the maintainers. If they cannot be reached in any way, we either fix it ourselves or kill it.

I understand that there are tools that are pretty much useless and should be deleted; but the single "epic" that would have removed much of this cruft is, in my opinion, self-service tool deletion. Most tool authors aren't badly behaved and would very much like to clean up after themselves, but having to create a task for it, especially when such tasks almost never get resolved, is a significant discouragement that would drive many away.

If, in what I consider the worst case, emailing tool maintainers must be done, can I ask that any user who did any of the following during the course of the period (3 months?) not be emailed:

  • talked on IRC in a public wikimedia channel
  • edited on any public wikimedia project
  • sent any messages to any public wikimedia mailing list
  • made a publicly-visible comment / task on phabricator

If not, can I ask that the period between consecutive mail pings be multiplied each time a user responds?
What I am saying is that constantly pinging users every 3 months is, as @Platonides said, //yet another annoying thing//.
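Both requests, the exclusion list and the growing interval, can be sketched in a few lines. The doubling factor, the cap, the reset to the base interval on a missed ping, and the function names are all assumptions for illustration; the comment only asks that the period "multiply" after a response.

```python
from datetime import date, timedelta
from typing import Iterable

BASE_INTERVAL = timedelta(days=90)  # the proposed 3 months
MAX_INTERVAL = timedelta(days=720)  # hypothetical cap, not from the task


def should_ping(public_activity: Iterable[date], today: date,
                interval: timedelta) -> bool:
    """Ping only users with no public activity inside the current interval.

    `public_activity` would hold dates of IRC messages, wiki edits,
    mailing list posts, or Phabricator comments, however they are collected.
    """
    cutoff = today - interval
    return not any(d >= cutoff for d in public_activity)


def next_interval(current: timedelta, responded: bool) -> timedelta:
    """Back off after a response; fall back to the base interval otherwise."""
    if responded:
        return min(current * 2, MAX_INTERVAL)
    return BASE_INTERVAL
```

The hard part, as the replies below point out, is not this logic but mapping public activity back to the right Developer account.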

Reading the task description again, yeah this is the most corporate idea ever :)

can I ask that any user who did any of the following during the course of the period (3 months?) not be emailed:

  • talked on IRC in a public wikimedia channel

no, not all channels are centrally logged and even if they were how do you properly map back to the relevant LDAP account?

  • edited on any public wikimedia project

do we have a good LDAP <-> SUL mapping that covers everyone now?
pretty sure we don't have a mapping against the fishbowls.

  • sent any messages to any public wikimedia mailing list

programmatically digging through mailman archives might be doable.

  • made a publicly-visible comment / task on phabricator

that's easy if they link their phab account to LDAP, but some people log in with SUL instead and so the SUL question above applies

I think we should keep a log of people SSHing into the tools boxes (I assume we already do this to some extent) and use that. We could probably have the user's email address shown as a login banner somehow.

no, not all channels are centrally logged and even if they were how do you properly map back to the relevant LDAP account?

IRC Cloak is usually good enough. Most active IRC community members are informed of https://meta.wikimedia.org/wiki/IRC/Cloaks. If not, well, on-wiki edits and phab actions should get most of the rest covered :)

do we have a good LDAP <-> SUL mapping that covers everyone now?

Phab + striker does the majority. New registrations on striker require OAuth attachment AFAICT.

pretty sure we don't have a mapping against the fishbowls.

By public I exclude fishbowls.

The point of my 'exclusion list' is to exclude those who are obviously still in the community. The rest, whose status cannot be 'determined', are therefore asked 'are you there'.

This list is primarily community-focused. For WMF-ers checking the email address might be enough already.

5+ years later we continue to have the same problem of unreachable tool maintainers from the Toolforge admin side. The process of contacting folks for https://wikitech.wikimedia.org/wiki/News/Toolforge_Grid_Engine_deprecation has reminded us that there really are quite a few of these maintainers to whom emails always bounce or no response is received.

I don't think the problem has become particularly worse, but it also has not become in any way better. We do today have nicer mechanisms for both reversibly blocking a Developer account and for archiving an abandoned tool when one is identified.

bd808 renamed this task from Send "are you there?" email to tool labs members every 3 months to revalidate email address to Send "are you there?" email to Toolforge members every 3 months to revalidate email address. Jan 10 2024, 5:54 PM
bd808 updated the task description.
bd808 removed a subscriber: MZMcBride.