Ease the bottleneck in wiki account creation at in-person events
Open, Medium, Public

Description

The Problem(s)

Newbies often need to register for a wiki account on the day of editathons and other similar events. (Often such users just show up, without registering for the event in advance. Also, most organizers are hesitant to simply require wiki account creation as part of advance registration, fearing it will deter possible participants.) Meanwhile, for security reasons, the wikis allow only 6 accounts to be created from one IP during a given timeframe. This creates a number of problems:

  • Account creation wastes time during the event: Organizers can apply for the Event Organizer right, which lets them create more than the allowed number of accounts from one IP. But the person with the Event Organizer right has to personally register the participants one by one, which can cause significant delays at the start of events.
  • Restricted choice of event leaders: The Event Organizer right can’t be delegated, so the person with that authority needs to be on hand at events personally to register users.
  • There is a process for requesting an exemption to the ceiling, but it a) is not well known, b) relies on volunteers to make the patch correctly and in time, and c) requires that the organizer know, at the time of the request, the IP address of the room where the event will be (which can be hard to get).
  • Participants still get blocked at events, even when organizers play by all the rules. This is a huge disruption for people who've booked a room, advertised on social media and are often partnering with prestigious GLAMs or other institutions.

These problems are described on the project page in Step 2, Step 3, and Step 4.

What the users say

You'll find very instructive talk page discussions on:

Your mission

A variety of ideas have been proposed to help organizers out; see below. However each seems to involve challenges or unknowns, and no consensus has emerged through our casual discussions on the best approach. Your goal is to investigate these problems and the solutions discussed, bring back what you find and, we hope, make some recommendations—especially as to feasibility and level of effort.

The discussions below are a recap of an email chain among CommTech team members. In addition, I've added a few solutions suggested by community members on the project talk page.

Event Timeline

@Samwilson wrote:

On 17/08/18 01:22, Ryan Kaldari wrote:

Besides, if I were an event participant I would not trust some random Tool Forge tool to handle my account creation. I would expect to do it on Wikipedia itself. Doing it elsewhere would be worrisome (to me at least). Like "Who else is going to have access to my login credentials"? "Is it secure?" "Is it being stored by the tool?" Even if we do it 100% correctly (from a security POV) the user doesn't know that and has no reason to inherently trust our tool.

I think how it would work is that they'd only have to put in their email address and desired username, and then they'd receive an email with a temp password. So it could possibly be made to feel okay from a security point of view.

@Mooeypoo wrote:
We have this at our disposal at the moment:

  • Ops has a way to pre-whitelist locations for events so that they don't run into the limit
  • But the process to do this involves creating a ticket that isn't clear
  • And ops require a long time in advance (because they don't get to it?)

We are part of the Foundation, so we have access to Ops. We might be able to talk to them and see if we can figure out an alternative method of doing this.

Ops has a way to do it? I doubt most of ops knows about it, this has traditionally been the realm of deployers and in recent years Wikimedia-Site-requests volunteers.
What does "We are part of the Foundation, so we have access to Ops" mean?

@kaldari wrote:

Building off an idea proposed by Kerry on the talk page (see below): What if we gave Event Coordinators (who have already been vetted by the community) the ability to exempt an IP address from account creation throttling for a limited amount of time (6 hours?) on demand via an on-wiki form or API? (And the exemption just happened automatically, not via Ops.) I have no idea if the community would accept the security risk, but basically any solution to make account creation easier is going to involve some degree of security risk.

...But, why can't I pre-register the event and be given a password and then, on the day within the time frames pre-registered for the event, I invoke something (let's call it GameOn), supply the eventname and password and the IP address I am using at that point in time becomes free of the limit for the duration of the pre-registered event. Why can't that work? The only difference to the existing mechanism is that the IP address is only known at event time not in advance. @Kerry_Raymond
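To make the on-demand exemption idea concrete, here is a minimal sketch of a per-IP creation throttle with a time-limited exemption. It is not how MediaWiki implements throttling; the class name, the one-day window, and the in-memory storage are all assumptions for illustration. Only the limit of 6 accounts and the 6-hour exemption come from the discussion above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class AccountCreationThrottle:
    """Hypothetical per-IP account creation throttle with timed exemptions."""
    limit: int = 6                          # accounts allowed per IP per window
    window: timedelta = timedelta(days=1)   # assumed window length
    _creations: dict = field(default_factory=dict)     # ip -> creation times
    _exempt_until: dict = field(default_factory=dict)  # ip -> exemption expiry

    def exempt(self, ip, now, duration=timedelta(hours=6)):
        """Lift the limit for one IP until now + duration."""
        self._exempt_until[ip] = now + duration

    def try_create(self, ip, now):
        """Record a creation attempt; return False if it is throttled."""
        expiry = self._exempt_until.get(ip)
        is_exempt = expiry is not None and now < expiry
        # Keep only the creations that still fall inside the window.
        recent = [t for t in self._creations.get(ip, []) if now - t < self.window]
        if not is_exempt and len(recent) >= self.limit:
            self._creations[ip] = recent
            return False
        recent.append(now)
        self._creations[ip] = recent
        return True
```

In this sketch the organizer would call `exempt()` from the venue at event time, so the IP does not need to be known in advance. Who is allowed to call it is the open security question.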

@MusikAnimal wrote:
I think if we use Sam's suggestion of only asking for a username and email, then submitting it in a queue for the organizer to mass-create the accounts (similar to what Alex was saying), it will make things a lot easier than they are now. Niharika was also exploring the prospect of putting our app on production (subdomain of wikimedia.org). That should further show the site can be trusted, and probably attract more users given it looks so official.

Some participants will still occasionally get blocked on-wiki. This is a social problem and I don't think there's much we can do beyond advertising on the user/user talk page that the user is editing as part of an event, and providing a link to contact the organizer. We discussed this some and I do think it can be semi-automated under the organizer's account, which will be better than Community Tech bot (can't trust that bot, oh no! At least not when I'm writing it). Overall I think these ideas are doable and would be really fun to build.

Ryan is also right that Grant Metrics has no user-facing interface at the moment. I envision the sign up form being an attractive outlying page from the rest of the application. Organizers have a permalink to it that they will share in their emails to everyone, on social media, etc. The user can't navigate to anywhere else, just submit details and get a "thank you!" notice.

jmatazzoni added a subscriber: Bluerasberry.

On the project talk page, @Bluerasberry wrote:

......The takeaway is that instead of granting advanced userrights for a trusted user to be "event coordinator", then instead there could be a bot on an event page which creates accounts for new users. The current process is that new users at events get their wiki accounts from someone with a userright. The new proposed process is that no human gets an advanced userright, but instead the new users who register on an event page get publicly logged as part of a cohort and are easy to observe if only the event page publicly reports who they are. Also, their user account registration should log them as being part of a program or event, and the program itself should have a record of the organizer in charge who can take blame.

There is a process for requesting an exemption to the limit, but it [...] relies on volunteers to make the patch correctly and in time—which apparently doesn't always happen.

In my experience the main problem with this process is that event organisers neglect to give enough information (e.g., IP address) before the next eligible deployment window.

Yeah, we might want to look into making the submission of this information a lot better, and help organizers figure out the IP address and all necessary information to get the request done... that's one option we can look into.

jmatazzoni renamed this task from Investigate how to ease the bottleneck for wiki account creation at in-person events to Investigate ideas for easing the bottleneck in wiki account creation at in-person events. Aug 24 2018, 7:15 PM

A variety of ideas have been proposed to help organizers out (see below).

See T27000: Deploy ThrottleOverride extension to Wikimedia wikis

The ideas discussed above are for getting around the IP limit at the actual event. Some other ideas for impacting the issue in different ways have also been suggested. See below. I wonder if these should be part of this investigation or part of a separate ticket?

Idea for reducing the # of people who need accounts

  • Promote and facilitate wiki registration during event signup: We may be able to integrate wiki registration directly into the event signup page, creating a seamless process and ensuring that fewer attendees show up on the day of without usernames. If we can’t embed registration, we can create a signup flow that makes registration as attractive as possible.

Idea for making the registration process go faster

  • Bulk account creation tool: No matter how attractive we make advance registration, there will always be drop-in attendees on the day of. To help Event Creators move more quickly, we could make a tool that would let them register multiple participants at one time.

Idea for protecting newbies from being blocked

  • Post messages to newbies’ user pages: To shield newly registered participants from patrollers, we could post notices upon signup to their user pages announcing that they are part of an event and directing questions to the event organizer. The Dashboard does something like this, though the process it uses (bot?) requires community approval, so it isn’t implemented in many places. We’d want to avoid that.

Ideas for spreading info about current solutions more widely

  • Encourage organizers to apply for the Coordinator right: In Event Setup Phase, provide information about the Event Coordinator right (if available for that wiki) and a link to how to apply.
  • Assist with Phabricator request for easing of registration limit: During event setup, provide link to a Phabricator form where organizers can request easing of the limit during the event.

jmatazzoni renamed this task from Investigate ideas for easing the bottleneck in wiki account creation at in-person events to Investigate ways to ease the bottleneck in wiki account creation at in-person events. Aug 25 2018, 12:40 AM
jmatazzoni updated the task description.

Thanks for putting the ideas together in this Phabricator ticket. I would like to highlight 2 points that I can see having a significant impact on the community's event organizing work:

  1. I am happy to see thought is being given to the IP limit, within the framework of event organizing.
  2. I am also happy to see the idea of writing to the user's page an acknowledgement/statement that she/he is participating in an event. It will help populate those pages, can be used by that user as a way to track their own participation at community events (they can always delete/move those contents as they see fit), and can facilitate patrollers' work. I would suggest using "badges" as a metaphor for this process: my guess is that most new users would want to show off those badges (and at a later time this could turn into an "open badges" initiative within the movement).

Bulk account creation tool: No matter how attractive we make advance registration, there will always be drop-in attendees on the day of. To help Event Creators move more quickly, we could make a tool that would let them register multiple participants at one time.

@jmatazzoni - Looks like this might already exist: https://accounts.wmflabs.org/acc.php

Hmmm. Are you positive that is a bulk creator? Isn't this the Account Creation interface?

It is not a bulk creator. This is the issue tracking the work to handle events. Currently it is only for WP:ACC.

I don't think we want to send organizers to yet another tool. Also, they need to be granted access to that tool by ACC admins. We can use the MediaWiki API to create accounts, so as long as the organizer has "account creator" or "event coordinator" privileges, we can allow them to mass-create accounts from within Grant Metrics.
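As a rough illustration of what mass-creating accounts through the MediaWiki API could involve, the sketch below only builds the POST parameters for a batch of `action=createaccount` requests and sends nothing. The helper names and the example return URL are hypothetical. The field names (`username`, `email`, `mailpassword`, `reason`) are the ones a typical wiki accepts, but this is an assumption: the fields a given wiki actually requires should be checked via `action=query&meta=authmanagerinfo&amirequestsfor=create`.

```python
def build_createaccount_params(username, email, create_token, reason,
                               return_url="https://example.org/done"):
    """Build POST parameters for one action=createaccount request.

    Assumes the organizer is logged in and already holds a token from
    action=query&meta=tokens&type=createaccount.
    """
    return {
        "action": "createaccount",
        "format": "json",
        "createtoken": create_token,
        "createreturnurl": return_url,  # placeholder; the API wants an absolute URL
        "username": username,
        "email": email,
        "mailpassword": "1",            # ask the wiki to email a temporary password
        "reason": reason,               # shown in the account creation log
    }


def build_batch(attendees, create_token, reason):
    """Build one parameter set per (username, email) pair in the sign-up queue."""
    return [
        build_createaccount_params(username, email, create_token, reason)
        for username, email in attendees
    ]
```

Each parameter set would be POSTed to the wiki's `api.php` within the organizer's authenticated session. Whether the throttle is bypassed depends on that user's rights, as noted above.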

Niharika renamed this task from Investigate ways to ease the bottleneck in wiki account creation at in-person events to [4 hours] Investigate ways to ease the bottleneck in wiki account creation at in-person events. Aug 28 2018, 10:02 PM
Niharika added a project: Spike.
Niharika moved this task from Needs Discussion to Up Next (June 3-21) on the Community-Tech board.

In my experience the main problem with this process is that event organisers neglect to give enough information (e.g., IP address) before the next eligible deployment window.

First problem is that people need to be aware of the throttle and how to request the exemption. Then, they themselves need to know the IP address (the venue is often provided by a third party).

(...) asks the Event Coordinator to come input their credentials so that the form can be submitted as the EC and therefore bypass the IP restrictions. Sort of like a student doing something and having a teacher sign off on it.

No way. Having a user with higher permissions input his own credentials on an untrusted machine?? (not to mention that the EC would still be a bottleneck here)

If you were to do something similar, I would make the application produce a token that can be given to the student to sign up. If the tokens can be generated beforehand (even if they are not valid until the date of the event), they can be printed and distributed on people's arrival. Plus, the tool could use that to automatically mark the user as present in the session.
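A minimal sketch of the per-attendee token idea, assuming single-use random tokens that are generated ahead of time but only redeemable during the event. The class and method names are hypothetical, and storage is in memory purely for illustration.

```python
import secrets
from datetime import datetime


class SignupTokens:
    """Hypothetical single-use sign-up tokens tied to an event's time window."""

    def __init__(self, valid_from, valid_until):
        self.valid_from = valid_from
        self.valid_until = valid_until
        self._unused = set()
        self.attended = []  # usernames marked as present

    def issue(self, count):
        """Generate tokens in advance, e.g. to print and hand out on arrival."""
        tokens = [secrets.token_urlsafe(8) for _ in range(count)]
        self._unused.update(tokens)
        return tokens

    def redeem(self, token, username, now):
        """Accept a token once, and only while the event is running."""
        if not (self.valid_from <= now <= self.valid_until):
            return False
        if token not in self._unused:
            return False
        self._unused.discard(token)
        self.attended.append(username)  # doubles as an attendance record
        return True
```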

I think the proper way is to have the user sign in/up during the event signup. It will need the account that will be participating in the event, so integrating it seems the logical choice, and that also avoids problems like people providing non-existent usernames or inadvertently enrolling someone else (there is still the problem of too many people signing up there, though).

+1 for the ideas of how to mark that the user is part of an event. Also, it should be relatively straightforward for a patroller to find out not only that user:Foo is in an event but also who is coordinating the event and warn them so that he can scold that user in person. We may also want to flag those temporary exemptions on IP blocklist page.

I think a token per student would be almost as cumbersome as account creators manually creating every account; it just wouldn't scale well. The most scalable way would be for event organizers to create a single token used in registrations. The token would have time limits set and would be used in short URLs. The organizers would therefore just publish the URL, e.g. "To create an account, go to enwp.org/register/1234abcd".

@MaxSem Might it also work if we gave the organizer that token and they could print it out at the event or write it on a whiteboard? Then, we put a form field on account creation where the attendee inserts this token. That would then allow that sign up to bypass some of the restrictions.

That might work better than in a URL because it would lessen the likelihood of bad guys using the URL to generate a bunch of nonsense or problematic accounts. It would force the signups to be during the event as long as an organizer didn't broadcast the token through some broad media.

MaxSem moved this task from Ready to Needs Review/Feedback on the Community-Tech-Sprint board.

So, I propose this workflow:

A user with appropriate permissions creates an event. The event has properties of name, start time, duration and max number of accounts. Also, a secret code is generated, to be used for registrations. To simplify registration instructions, instead of "go to Special:Foo, enter this code", create a short URL with this code. The URL can be published IRL at the event venue. Visiting it during the event time would allow users to have a different creation throttle value. We might even log in the account creation log that registration was made at an event (can this be considered private information, though?)
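The workflow above can be sketched as a small data model. This is only an illustration of the proposal, not an existing extension or tool: the `Event` class, the `example.org` base URL, and the in-memory list of created accounts are assumptions. The properties (name, start time, duration, max number of accounts, secret code) are the ones listed in the proposal.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Event:
    """Hypothetical event record carrying a relaxed account creation limit."""
    name: str
    start: datetime
    duration: timedelta
    max_accounts: int
    code: str = field(default_factory=lambda: secrets.token_hex(4))
    created: list = field(default_factory=list)  # usernames, for the log

    def registration_url(self, base="https://example.org/register/"):
        """Short URL to publish at the venue (base URL is a placeholder)."""
        return base + self.code

    def register(self, code, username, now):
        """Allow a creation only with the right code, in time, under the cap."""
        code_ok = secrets.compare_digest(code.encode(), self.code.encode())
        in_window = self.start <= now < self.start + self.duration
        has_room = len(self.created) < self.max_accounts
        if not (code_ok and in_window and has_room):
            return False
        self.created.append(username)
        return True
```

Because both the time window and the account cap are enforced, a leaked URL can produce at most `max_accounts` registrations, and only while the event is running.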

I like this idea. It combines what you and I were discussing above.

I don't see the information about how the account was created as private but I'm not a good person to know the right answer.

If you're indicating the exact event I would expect it to be private because people would know the geographical location.

@Krenair I hadn't considered that. If participants' usernames are posted on-wiki for the event, is this log different than that?

In T202759#4572870, @MaxSem wrote:

So, I propose this workflow:

A user with appropriate permissions creates an event. The event has properties of name, start time, duration and max number of accounts. Also, a secret code is generated, to be used for registrations. To simplify registration instructions, instead of "go to Special:Foo, enter this code", create a short URL with this code. The URL can be published IRL at the event venue. Visiting it during the event time would allow users to have a different creation throttle value. We might even log in the account creation log that registration was made at an event (can this be considered private information, though?)

@MaxSem's idea sounds workable and effective to me. I'm going to play devil's advocate on the security side for the moment, to test it. The danger of this idea, someone might charge, is that if an event creator is careless with the secret URL, then a secret URL could get into circulation, allowing a bad actor to make multiple bogus accounts during the event timeframe. If that is a valid concern at all, what are the safeguards against this possibility?

  • During event creation in Grant Metrics, we can presumably check the event creator's credentials in order to ensure that he or she has Event Coordinator or Account Creator rights, correct?
  • Given that, might it be possible going forward from the event to always know that a given user account was created under the authority of a particular Event Coordinator or Account Creator? I ask because such traceability of account lineage would be a pretty effective safeguard. I.e., if numerous bogus accounts were making trouble, and if admins could trace them to a particular Event Coordinator, then they could revoke that person's credentials or warn him or whatever. If the bad accounts were traceable to a particular event, it would further the ability to pinpoint what had gone wrong.
  • If that permanent traceability is not inherent in the plan, would it be hard to add it as a feature?
  • Aside from such traceability of the account creator, what other safeguards are there that would allay fears about this plan? Or is this fear not valid to begin with?
  • (One small thing we could do, it strikes me, would be to ask if an event is an in-person or online event and then generate a secret code only for in-person events. This on the theory that only in-person events need the code and that they typically take place over a shorter duration than online events, so have a shorter window of vulnerability.)

Another problem some people might find with this system is that in a large organization, one person with the Event Coordinator right might set up all the events and never go to any of them himself.

  • That's precisely what enables this to scale. But will that be a problem for the people who created the Event Coordinator right? Was the intention that the Coordinator should be directly involved in the event?
  • If that was the intention, is there a way to mitigate? E.g., we could wait to email out the secret code until just before the event. Any other ideas?

@jmatazzoni The security question is a valid one by my thinking. The traceability you describe is a nice way to mitigate after the fact.

In most situations like this, a second verification step would be enough to limit the exposure. I'm not sure if it makes sense in the Wiki context because of privacy and such though. That is, if a user was created via this specific URL, we could notify (on-wiki or via email) the Event Coordinator and force them to click a link to verify the user should be created.

I imagine that is a significant chunk of work but I think it does allay many of the concerns you raise.

As for the second question, we could make the secret URL only available inside the Event Tool and only visible to the Event Coordinator during the specified start and finish times of the event. Would that mitigate some of this issue?

So, I propose this workflow:

A user with appropriate permissions creates an event. The event has properties of name, start time, duration and max number of accounts. Also, a secret code is generated, to be used for registrations. To simplify registration instructions, instead of "go to Special:Foo, enter this code", create a short URL with this code. The URL can be published IRL at the event venue. Visiting it during the event time would allow users to have a different creation throttle value. We might even log in the account creation log that registration was made at an event (can this be considered private information, though?)

Are you proposing modifying MW core to support this workflow? That's not clear from the discussion, AFAICT.

That is, if a user was created via this specific URL, we could notify (on-wiki or via email) the Event Coordinator and force them to click a link to verify the user should be created.

That would still make them a bottleneck, though it will probably result in a smaller overhead per user. I don't think that leaking a time and user count constrained URL can result in significant disruption.

Are you proposing modifying MW core to support this workflow? That's not clear from the discussion, AFAICT.

That is an implementation detail, though obviously this code shouldn't be in the core.

@jmatazzoni The security question is a valid one by my thinking. The traceability you describe is a nice way to mitigate after the fact.

As for the second question, we could make the secret URL only available inside the Event Tool and only visible to the Event Coordinator during the specified start and finish times of the event. Would that mitigate some of this issue?

@Niharika, given that the "organizer" can add various other organizers to the Program, would we be able to make sure that the secret URL appeared only on the page of the "prime" organizer?

If we could do that, yes, to my mind it would limit the ability of people to just mass produce events from one organizer's user right.

@jmatazzoni There is no logic to identify a "prime" organizer in the app yet. To do this we would first have to build a way to designate primary organizers for events/programs.

You could default that the first organizer is the "primary" one but that would be rife with false positives.

I think we have to assume some level of trust amongst the group of organizers. So, maybe we consider the organizer to be a role instead of an individual. Thus, we could describe this as being shown only to users with the organizer role for that event. I'm not suggesting the code need know about anything like an organizer role but in practice, that's how this would appear.

@Mooeypoo is concerned that we need to hedge this around with safeguards pretty securely in order to get approval. She suggests something along the following lines: What if the organizer was instructed to wait until she was on-site on the day of to generate the code? When activated, the system would determine the IP address of the event location, and the code would work only from that IP address (and, as above, only for a limited period of time). This would answer the fear that someone could get the code and publish it.

There is no need, of course, for the secret code for online events. Its whole purpose is for in-person events. Would that be workable? Would it work?
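A minimal sketch of the IP-bound variant described above, assuming the system can see the organizer's IP address at activation time and each attendee's IP at sign-up. The class name and the 6-hour default lifetime are hypothetical.

```python
import ipaddress
from datetime import datetime, timedelta


class IpBoundCode:
    """Hypothetical event code that only works from the activating IP."""

    def __init__(self, code, lifetime=timedelta(hours=6)):
        self.code = code
        self.lifetime = lifetime  # assumed validity period after activation
        self.ip = None
        self.expires = None

    def activate(self, organizer_ip, now):
        """Called by the organizer on-site; pins the code to the venue's IP."""
        self.ip = ipaddress.ip_address(organizer_ip)
        self.expires = now + self.lifetime

    def accepts(self, code, client_ip, now):
        """True only for the right code, from the pinned IP, before expiry."""
        if self.ip is None:
            return False  # not activated yet
        same_ip = ipaddress.ip_address(client_ip) == self.ip
        return code == self.code and same_ip and now < self.expires
```

This only helps when everyone at the venue shares one public IP address. A venue with several outbound addresses would need a range rather than a single address.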

I'm personally opposed to turning this feature into Ft. Knox: this ticket is about making account creation easier, lots of safeguards being discussed here go against this. Even basic time constraints would prevent most attempts to abuse this, making it everything-proof will make it too inconvenient to use.

the system would determine the IP address of the event location, and the code would work only from that IP address

Toolforge and VPS both scrub the client's IP. I'm assuming we'd be able to do it in a production environment.

Very related aspect (not sure if in scope here): Even if you allow folks (on the same IP) to create many accounts folks will still face the autoconfirmed rate limiting.
See T204583 about this (coming from T203909#4588691).

jmatazzoni updated the task description.

I tend to agree with @MaxSem on this one. I don't think we need to care about IP address. If we only show the code to the organizer and only on the date that the event occurs (or starts), then it seems fine to me. We can take that proposal to whatever security we need. If they push back, then we can change.

I don't want to overengineer something out of fear of what some group might do.

Thanks to @Aklapper for pointing out this other task which does feel related. Maybe we could build some consensus on this more broadly across the community?

An alternative solution: https://www.mediawiki.org/wiki/Extension:ThrottleOverride - I was aware about it when making my proposal and decided against it because I wanted to go away from deeply technical details, however with T204583 I'm tempted to say maybe relaxing restrictions by IP is not such a bad idea.

I'm personally opposed to turning this feature into Ft. Knox: this ticket is about making account creation easier, lots of safeguards being discussed here go against this. Even basic time constraints would prevent most attempts to abuse this, making it everything-proof will make it too inconvenient to use.

I don't think we should turn it into Ft. Knox, but I also think it's reasonable for us to try and make sure we put some sensible limits on this, especially since the point is that this will be used by users who are not technically savvy, and that it will extend the definition of the event creator role that the community entrusted a single person with. We are taking that and making it into an automated system, so we should make sure we are putting in sensible protections around it.

I do agree we don't have to go all-out, but I don't think that it's that bad to get the location-specific registration. Otherwise, we might get issues with a person unrelated to the event registering lots of spam accounts, which will not only go against the entire purpose of this ticket, but will probably result in the person whose role we used losing that role/permission.

@MaxSem That ThrottleOverride extension looks like it would solve the technical problems. It seems like there would be some process issues that still remain with making sure the extension is installed on the relevant wikis for the event, getting the wiki admin to set up the override for the event, etc.

Or, were you thinking that the Event Tool could hook into that extension directly and set the override itself?

The extension could provide an API indeed.

@jmatazzoni My understanding is that you are not planning to include this in the Event Tool work in the near term. If so, can we close this ticket?
There are a lot of good ideas in here that we should remember to refer back to if and when we decide to take on this work.

@Niharika I think we should close this and get it off the board.

jmatazzoni renamed this task from [4 hours] Investigate ways to ease the bottleneck in wiki account creation at in-person events to Ease the bottleneck in wiki account creation at in-person events. Jan 2 2019, 6:40 PM
jmatazzoni reopened this task as Stalled.

I'm changing the status of this task to Stalled, because while we investigated it, we did not solve it. Also, we did a lot of good work compiling and discussing the issues, and I'd like that information to remain accessible.

For what it's worth, we have recently put in a process for account creation via outreachdashboard.wmflabs.org: the throttle is bypassed if the creation is made via an event on the dashboard. This certainly doesn't solve the core complaint, but may help anyone running an event.

Unless an exemption has already been introduced, this helps only English Wikipedia-based events, since IIRC this is the only wiki the Dashboard can create accounts on. Because of that, even though I have noratelimit on cswiki, I can't use the Dashboard process. If I'm wrong, feel free to correct me!

@Urbanecm the outreach dashboard creates all accounts via enwiki (which then, due to SUL, get autocreated anywhere when logged in), and this exemption is allowed via a service account between the dashboard and enwiki (https://en.wikipedia.org/wiki/User:OutreachDashboardBot).

@Urbanecm - also, it is only used when the only reason for the account creation failure is the rate limit (see examples here: https://en.wikipedia.org/wiki/Special:Log/OutreachDashboardBot)

Aklapper changed the task status from Stalled to Open. May 20 2020, 8:39 PM

The previous comments don't explain what/who exactly this task is stalled on ("If a report is waiting for further input (e.g. from its reporter or a third party) and can currently not be acted on"). T202759#4849931 seems to be about now-removed team tags only.
Hence reopening.
This task could also be resolved again as per T202759#4652501 or declined in favor of T27000, but this task is not "stalled" per definition.