Tools web interface for tool authors (Brainstorming ticket)
Closed, Resolved, Public

Description

So right now, to get started on tool labs, you need to go through the following:

  1. Sign up on wikitech
  2. Be confused about Labs vs Tool Labs and figure out which documentation you need to follow
  3. Request access (via semantic forms)
  4. Be granted access
  5. Figure out SSH setup (God forbid you're on Windows, or not particularly familiar with the commandline)
  6. Read lots of documentation

Managing tools (adding more people, adding a description, etc.) is also scattered. You manage tools via confusingly named 'service groups' on wikitech, but setting a description and similar metadata requires the command line.

This was probably ok back in the day when shared hosting meant Dreamhost, but IMO not anymore. This ticket is for thinking about and possibly spawning other tickets about making the UX for tool authors better.

I propose that we have a web interface for people to sign up for tool labs, create/manage tools, and possibly over the long term even more (webterminal / upload stuff, PAWS integration, etc). We can also figure out some way to bridge the gap between wikitech accounts (LDAP) and MW SUL accounts.

Event Timeline

bd808 triaged this task as Medium priority. Feb 26 2016, 2:05 AM

I'm kind of thinking this interface should start at the landing page of tools.wmflabs.org. Here are some crazy brainstorming ideas:

  • OAuth integration with SUL wikis for authn
  • LDAP integration for authz
  • Create/manage LDAP accounts tied to a SUL account
  • Unified account creation form that gets you a SUL account if you don't have one, an LDAP account, and membership in the Tool Labs project.
    • Done as a nice wizard that skips steps that are already done and provides real information about what's needed at each step and why.
  • Once all of that works, switch wikitech from LDAP auth to SUL (Hey! A new SUL migration!)
  • Nice landing page with teaser content for various tool labs audiences:
    • people looking to join as a tool labs user
    • people looking for tools to use
    • tool maintainers
    • tool labs admins
  • Manage Tools (let's never mention "service groups" in the UI mkay?)
    • Create new Tool with a nice interface that actually collects some information that can be used to fill in a toolinfo.json for the tool
    • Update toolinfo.json data
    • Click a button to create associated [[:wikitech:Tool:*]] page
    • Add/remove maintainers
    • View running jobs
    • View logs for jobs
    • Manage crontab
    • Manage webserver
    • Manage k8s containers
  • Search for Tools
    • like Hay's directory but with a few more options (license, source published, wikis that it works with, "rating", ...)
  • Send feedback to a Tool maintainer (posts message to Tool's flow page?)
  • Rate a tool? ("★★★½ Helped me find categories that were missing parents")
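
To make the "wizard that skips steps that are already done" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `AccountState` fields, the step labels, and the idea that the state could be looked up in one place are assumptions for illustration, not an existing API.

```python
from dataclasses import dataclass


@dataclass
class AccountState:
    """Hypothetical snapshot of what a visitor already has set up."""
    has_sul_account: bool = False
    has_ldap_account: bool = False
    has_ssh_key: bool = False
    is_tools_member: bool = False


# Ordered wizard steps: (AccountState attribute, label shown to the user).
WIZARD_STEPS = [
    ("has_sul_account", "Create or log in to a SUL account"),
    ("has_ldap_account", "Create an LDAP (shell) account"),
    ("has_ssh_key", "Upload an SSH public key"),
    ("is_tools_member", "Join the Tool Labs project"),
]


def remaining_steps(state):
    """Return the labels of the steps still to do, in order, skipping finished ones."""
    return [label for attr, label in WIZARD_STEPS if not getattr(state, attr)]
```

Someone who arrives with only a SUL account would be shown the last three steps; someone with everything in place would skip the wizard entirely.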

Open questions:

  • Should this be a single monolith app or a family of tools that handle specific parts and share a SSO system? I can think of pros/cons for both designs.
  • The PHP vs Python implementation language question. Again pros/cons for both.

I guess another question I'm opening up here is whether or not this should be limited to Tool Labs or more broadly useful for all Labs projects. My "glorious vision" of killing off LDAP for wikitech and making it a normal SUL wiki as well as the Horizon project might make a more general purpose solution attractive. That is also scope creep however. It could be done in stages certainly.

<3

I personally think this should be limited to Tool Labs - Service Accounts were a mistake because they were overgeneralized despite being used only on tools (and toolsbeta)

My $ 0.02:

  1. Tool authors sign up for Toolforge once, and they create one tool every x, with the median user maintaining y tools in total. x and y could be calculated from actual data, but I'm betting they are very low. So this feels like a case of YAGNI to me.
  2. The Labs account system relies on user accounts existing in the LDAP backend, so I have no idea how wikitech could be SUL. Would there be a separate account system? Would there be a PAM plug-in that authenticates a Linux user login against a MediaWiki user database?
  3. I am fine with tool vs. service group either way, but what I would like to avoid is a situation where in the tools.wmflabs.org UI they are called "tools", in the wikitech UI "service groups", in the LDAP records this, in Puppet that. If a user can live happily without ever encountering the term "service group", then using it in the UI is okay; if he will stumble upon it in the first week elsewhere without any explanation in the UI shown to him, it makes the learning experience a bit unpleasant.

My $ 0.02:

  1. Tool authors sign up for Toolforge once, and they create one tool every x, with the median user maintaining y tools in total. x and y could be calculated from actual data, but I'm betting they are very low. So this feels like a case of YAGNI to me.

I agree that creating a user account to access Labs and Tools is not done often by a single human. To me this is actually a strong reason to make the process smoother. People get used to clumsy processes that they have to repeat. The first time, however, it's all shock and awe. Using Labs and Tool Labs should not require hazing the n00bs to test if they are worthy.

The vision I subscribe to is that contributing to MediaWiki and other Wikimedia related software projects should be as easy and common as editing Wikipedia is. We should not be putting up artificial walls of cumbersome process and confusing procedure unnecessarily. The low numbers you assume may even be kept artificially low by cumbersome process and bad information flow about the benefits of isolating Tools to allow for ease of maintenance and long term continuity. Large suites of tools that are largely unrelated except by original developer like https://tools.wmflabs.org/hay/ are an anti-pattern in the modern (or at least my imagined) Tools system.

  1. The Labs account system relies on user accounts existing in the LDAP backend, so I have no idea how wikitech could be SUL. Would there be a separate account system? Would there be a PAM plug-in that authenticates a Linux user login against a MediaWiki user database?

The need for LDAP account data on Wikitech is entirely a construct of OpenStackManager usage. As long as OSM is the system that is used to manage OpenStack in Labs we will need to keep a strong association between the on-wiki account and the LDAP account. Once OpenStack management transitions to Horizon however, there is no compelling need for directly using LDAP auth on Wikitech. LDAP accounts could easily be managed completely separately with associations between SUL accounts and LDAP accounts being stored in a separate database or LDAP itself.
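
As a rough illustration of the "separate database" option, the sketch below keeps a one-to-one link table between SUL names and LDAP uids. It uses Python's built-in `sqlite3` purely as a stand-in; the table name `account_link` and the helper names are made up for this example and are not part of any existing system.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the hypothetical link store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS account_link ("
        " sul_name TEXT PRIMARY KEY,"      # one LDAP account per SUL account
        " ldap_uid TEXT NOT NULL UNIQUE)"  # and one SUL account per LDAP uid
    )
    return conn


def link(conn, sul_name, ldap_uid):
    """Record that a SUL account owns an LDAP account."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO account_link (sul_name, ldap_uid) VALUES (?, ?)",
            (sul_name, ldap_uid),
        )


def ldap_uid_for(conn, sul_name):
    """Return the linked LDAP uid, or None if the SUL account has none."""
    row = conn.execute(
        "SELECT ldap_uid FROM account_link WHERE sul_name = ?", (sul_name,)
    ).fetchone()
    return row[0] if row else None
```

Storing the same association as an attribute in LDAP itself would work just as well; the point is only that the wiki login and the shell account no longer need to be the same record.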

  1. I am fine with tool vs. service group either way, but what I would like to avoid is a situation where in the tools.wmflabs.org UI they are called "tools", in the wikitech UI "service groups", in the LDAP records this, in Puppet that. If a user can live happily without ever encountering the term "service group", then using it in the UI is okay; if he will stumble upon it in the first week elsewhere without any explanation in the UI shown to him, it makes the learning experience a bit unpleasant.

Reasonable. If nothing else we can at least provide nicer explanation that "service group" == a unix user and group that is created for the purpose of owning files for a Tool and providing shared access to a restricted set of additional unix users. I've heard enough hatred for service groups from @yuvipanda and others to think their days are numbered, but they certainly aren't gone yet.

[…]

  1. Tool authors sign up for Toolforge once, and they create one tool every x, with the median user maintaining y tools in total. x and y could be calculated from actual data, but I'm betting they are very low. So this feels like a case of YAGNI to me.

I agree that creating a user account to access Labs and Tools is not done often by a single human. To me this is actually a strong reason to make the process smoother. People get used to clumsy processes that they have to repeat. The first time, however, it's all shock and awe. Using Labs and Tool Labs should not require hazing the n00bs to test if they are worthy.

I don't think "hazing" describes the process accurately. You create a wiki account like someone interested in Wikipedia & Co. will have done already elsewhere, you upload the ssh key and fill out one form (actually, one field), and a few hours later you can log in. You click on one link, type in the tool's name, and immediately the tool exists.

Which one of those processes feels clumsy is hugely a matter of your own experience. I do remember @yuvipanda modelling service.manifest after Heroku's concepts. I don't know those, I think many tool developers do not either, but probably some do. What I do know, though, is that I never encountered any environment where I wanted to do something non-trivial and could do so without reading some documentation (or going through some trial-and-error cycles), because there is no universal environment (and I don't say that just as a Fedora user :-)).

The vision I subscribe to is that contributing to MediaWiki and other Wikimedia related software projects should be as easy and common as editing Wikipedia is. We should not be putting up artificial walls of cumbersome process and confusing procedure unnecessarily. The low numbers you assume may even be kept artificially low by cumbersome process and bad information flow about the benefits of isolating Tools to allow for ease of maintenance and long term continuity. Large suites of tools that are largely unrelated except by original developer like https://tools.wmflabs.org/hay/ are an anti-pattern in the modern (or at least my imagined) Tools system.

I very much concur with the vision, but I think that this inseparably entails the effects you describe. Writing software that shares libraries between several deployed tools, etc. to avoid those suites requires skills and energy, and if someone is thwarted by the existing processes, IMHO they are unlikely to all of a sudden "properly" design a system of modules and microservices. In the end, that is the whole point of Wikimedia Labs: Your contributions don't have to pass the thresholds for MediaWiki & Co.

  1. The Labs account system relies on user accounts existing in the LDAP backend, so I have no idea how wikitech could be SUL. Would there be a separate account system? Would there be a PAM plug-in that authenticates a Linux user login against a MediaWiki user database?

The need for LDAP account data on Wikitech is entirely a construct of OpenStackManager usage. As long as OSM is the system that is used to manage OpenStack in Labs we will need to keep a strong association between the on-wiki account and the LDAP account. Once OpenStack management transitions to Horizon however, there is no compelling need for directly using LDAP auth on Wikitech. LDAP accounts could easily be managed completely separately with associations between SUL accounts and LDAP accounts being stored in a separate database or LDAP itself.

I meant the LDAP accounts as the backend for Linux authentication. So you mean wikitech would be a SUL wiki, and (for example) in the preferences there would be a button "Create Labs account" that asks for your ssh key and then creates a/the corresponding LDAP account? That'd make sense.

  1. I am fine with tool vs. service group either way, but what I would like to avoid is a situation where in the tools.wmflabs.org UI they are called "tools", in the wikitech UI "service groups", in the LDAP records this, in Puppet that. If a user can live happily without ever encountering the term "service group", then using it in the UI is okay; if he will stumble upon it in the first week elsewhere without any explanation in the UI shown to him, it makes the learning experience a bit unpleasant.

Reasonable. If nothing else we can at least provide nicer explanation that "service group" == a unix user and group that is created for the purpose of owning files for a Tool and providing shared access to a restricted set of additional unix users. I've heard enough hatred for service groups from @yuvipanda and others to think their days are numbered, but they certainly aren't gone yet.

I usually try to ignore only the hatred, but I don't remember any concept that does not rely on some notion of "tool account". Do you have a pointer?

Thanks for your replies @scfc. I think we are closer to agreement than either of us probably thought initially.

I don't think "hazing" describes the process accurately. You create a wiki account like someone interested in Wikipedia & Co. will have done already elsewhere, you upload the ssh key and fill out one form (actually, one field), and a few hours later you can log in. You click on one link, type in the tool's name, and immediately the tool exists.

Here are some of the things I find confusing or cumbersome with the initial process:

  • You need to pick two different usernames at account creation time (wikitech and LDAP). The relationship between these names and when each one will be shown in interfaces, etc is not well described.
  • The fact that the wikitech account name is also the display name in gerrit and is commonly a full name including whitespace is not mentioned anywhere.
  • The shell account name selection mentions svn accounts still which is a relic from a time now long gone.
  • The shell account name is not checked in real time for conflicts.
  • The error message returned when a shell name conflict exists is inscrutable.
  • An email address is required, but that is not mentioned in the instructions.
  • Once your wiki and LDAP accounts are created there is no instruction given on how to continue to upload your ssh key or to request access to Tool Labs.
  • When you do find the Preferences → OpenStack → Add public SSH key area, you end up on a page that again has no indication of what, why and how.
  • The waiting period for manual approval of joining the Tool Labs project is highly artificial. There is no good technical or procedural reason I can see for it to not simply be a matter of clicking a button or checkbox.
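
Two of the items above (no real-time conflict check, inscrutable error message) could be handled by a small validation helper that a signup form calls as the user types. A minimal sketch, with loud assumptions: the naming rule in `SHELL_NAME_RE` is invented for illustration and is not the actual Labs policy, and the `taken` set stands in for whatever LDAP lookup would really be used.

```python
import re

# Assumed rule, for illustration only: starts with a lowercase letter, then
# lowercase letters, digits or hyphens, 32 characters at most.
SHELL_NAME_RE = re.compile(r"[a-z][a-z0-9-]{0,31}")


def check_shell_name(name, taken):
    """Return None if the name is usable, else a human-readable explanation."""
    if not SHELL_NAME_RE.fullmatch(name):
        return (
            f"'{name}' is not a valid shell name: use lowercase letters, "
            "digits and hyphens, starting with a letter."
        )
    if name in taken:
        return f"The shell name '{name}' is already in use; please pick another."
    return None
```

Returning a sentence the user can act on, rather than a raw backend error, is the whole point of the helper.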

Which one of those processes feels clumsy is hugely a matter of your own experience. I do remember @yuvipanda modelling service.manifest after Heroku's concepts. I don't know those, I think many tool developers do not either, but probably some do. What I do know, though, is that I never encountered any environment where I wanted to do something non-trivial and could do so without reading some documentation (or going through some trial-and-error cycles), because there is no universal environment (and I don't say that just as a Fedora user :-)).

I agree that interface changes alone will not get rid of all problems nor remove the need for documentation and user effort. Many of the things I mentioned above could be made a bit better in the existing workflow by creating local message overrides (or adjusting existing ones) on wikitech. This ticket however is about brainstorming on what might be better, not apologizing for or rationalizing the existing workflow.

For ease of introduction, Heroku actually tries pretty hard. Granted they are leading you down a funnel that for them hopefully ends in getting your credit card number, but such is life. The steps to getting up and running on Heroku are:

  1. Visit heroku.com
  2. Click the prominent "sign up for free" button
  3. Fill in First name, Last name, email and click "create free account"
  4. Click link in confirmation email
  5. Choose a password
  6. Click one of 8 language specific (ruby, PHP, node, python, ...) getting started links
  7. Follow a step by step tutorial that ends up with you running a live webservice with version control managed source.

The tutorial does involve several more steps, but you are guided through each one with simple clear instructions.

I very much concur with the vision, but I think that this inseparably entails the effects you describe. Writing software that shares libraries between several deployed tools, etc. to avoid those suites requires skills and energy, and if someone is thwarted by the existing processes, IMHO they are unlikely to all of a sudden "properly" design a system of modules and microservices. In the end, that is the whole point of Wikimedia Labs: Your contributions don't have to pass the thresholds for MediaWiki & Co.

Again I will agree that a bit of tooling and a nicer workflow won't magically change things. I think that it is a step in a larger process however. Guided instruction can be pretty powerful for knowledge retention.

I usually try to ignore only the hatred, but I don't remember any concept that does not rely on some notion of "tool account". Do you have a pointer?

I think that groups for Tools are fundamentally sound and necessary for collaborative management. I think that T125002#1971966 is positing however that the general concept of "service groups" should be removed from Labs generally and only used in Tool Labs. That would largely remove the problem of the naming of the feature not matching what it is used to represent in Tool Labs.

I just signed up for Heroku, but haven't tested it any further. Now I find the idea that you must put your application in an SCM and automate the deployment process simply brilliant (and I suggested something to that effect with the Kubernetes setup – if you start a whole new paradigm, nobody expects backwards compatibility, so away with it). But real existing users (including me and the tools I maintain that don't warrant more complex setups) want to transfer files to a "web directory" and maybe log in and shuffle them around. I remember the outcry in the MediaWiki community when the SCM switched from Subversion – at that time already a fossil – to Git, and tools are often written by people who are less savvy with those things than somebody who mastered the MediaWiki code.

Regarding the tutorial: The first step is to install the "Heroku Toolbelt", and for my Fedora system it offers no option. I would be hesitant to install random software from websites anyway, and unpackaged software hurts very badly, so I looked at the Fedora repository and found a reviewed and maintained package that provides the heroku CLI application. Now if I install that package with the description "Client library and command-line tool to deploy and manage apps on Heroku", does that provide the "Heroku Toolbelt" or just the CLI tool? Is there a difference? (I reported this, so maybe someone'll update that page.)

Which brings me to the next point: Heroku (and Google and OpenShift and …) makes money with paying customers. So they can afford not to be frugal about giving some CPU & Co. to bad people because it's probably less than 1 ppm of their available resources. Wikimedia Labs is brought down to a standstill regularly by one (!) user reading or writing large files or daring to run some database queries. So I see a huge DOS potential if somebody does not even have to face a single human being before being handed the key to the city, but can just click a button to join the Toolforge project.

I know that you currently handle most of the Tool Labs membership requests, so maybe you have screening criteria that I don't know of. Can you tell me what percentage of requests you deny and/or what criteria you use to do that?

<tangent>I think most of our DOS problem comes from people who we would have a hard time just kicking out. See issues like people who run a very much needed bot who are also developing a new bot under the same tool and don't bother to check that it is spewing gigs of error log data bringing down the NFS server.</tangent>

But again I think that the point of this ticket is discussion of how things could be better, not apologizing for or rationalizing the existing workflow. If we make it much easier and faster to create and use a new account we should also make sure that our tools and policies for stopping people who are actively causing harm (either from malice or more likely negligence) are improved as well. It would be nice to know that there is a big red stop button we can hit for a tool or user who is causing problems.

I don't have data for "denied" requests (that could be estimated by all requests in https://wikitech.wikimedia.org/wiki/Category:Tools_Access_Requests that have been edited by me where the corresponding user is not in the Toolforge project), but I roughly decide based on:

  1. If the user name "looks" like a bot or someone else who could not consent to the Labs rules → "deny" (4.).
  2. If the stated purpose is "tangible" ("I want to move my bot x to Labs", "I want to build a web app that does y", etc.) → approve. (If I know that someone else has been working on the same problem, I usually add a paragraph to the welcome message who the user should contact or where he might find more information.)
  3. If the stated purpose is "abstract" ("research", "experimentation", etc.) and there is a hackathon ongoing or planned, the user has a non-throw-away mail address, the user has created a user page with coherent information about himself or linked his main wiki page of good standing, etc. → approve.
  4. Otherwise I leave a message on the user's talk page asking for clarification and link to that section from the access request (so the request is not really "denied", but more (indefinitely) "delayed").
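
Written down as code, the four rules above come out roughly as follows. This is only a sketch of the description, not an actual tool: the `AccessRequest` fields are hypothetical, and rule 3 is read here as "abstract purpose plus at least one supporting signal", which is one interpretation of the list.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    ASK_FOR_CLARIFICATION = "ask for clarification"  # "delayed", not denied


@dataclass
class AccessRequest:
    """Hypothetical summary of one Tools access request."""
    looks_like_bot_name: bool    # rule 1: name suggests no human who can consent
    purpose_is_tangible: bool    # rule 2: "move my bot x", "build web app y"
    has_supporting_signal: bool  # rule 3: hackathon, real mail address, user page


def triage(req):
    """Apply rules 1-4 in order and return the resulting decision."""
    if req.looks_like_bot_name:
        return Decision.ASK_FOR_CLARIFICATION  # rule 1 falls through to rule 4
    if req.purpose_is_tangible:
        return Decision.APPROVE                # rule 2
    if req.has_supporting_signal:
        return Decision.APPROVE                # rule 3
    return Decision.ASK_FOR_CLARIFICATION      # rule 4
```

Note that no path ends in an outright denial; the worst case is a talk page message and an indefinitely delayed request.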

During the hackathon I helped two people get new tools started from scratch (new account, ssh key, create service group, upload source code, run job/web server).

One thing that was reinforced for me by helping get both those tools up and running is that our Tool Labs onboarding process is complicated and difficult. A few task and language oriented tutorials on what to do once you have managed to get to the point where you can ssh into login.tools.wmflabs.org would go a long way towards making things nicer I think.

Renaming a few things in the navigation for Tool Labs help might be useful as well. One person I worked with knew that they wanted to "run a job", but did not connect that with the label "Grid" in the navigation. This is an example of how the documentation is organized for use by people with existing knowledge of the services offered by Tool Labs and the local terminology rather than around tasks and goals which are more likely to be useful to new wikitech/labs/tools users.

@bd808 what would you think about using https://phabricator.wikimedia.org/ponder/ for some of this as it's more stackoverflow question and answer oriented?

It's always the tradeoff for docs between maximizing usefulness past the initial charge and from day 1. Usually there is alternate documentation, i.e. long form suited for those who speak the language or are trying to, and a 'quickstart' which is roughly task / objective oriented with more handholding. Honestly I'm not sure what makes sense here; I am mostly thinking out loud.

quickstart?
question / answer oriented? "How can I [start a job|monitor a job|look for x|get y installed]?"

It's at least a reasonable question to ask. I haven't used ponder, but it looks to be a basic Q & A / FAQ tool. There are obviously pros and cons to using purpose built tools for storing such information. Pros would be that we don't need to invent a taxonomy for storing, displaying and searching small facts as we would need to do on wiki. A couple of cons from me would be fragmentation of content (some things answered in phab, some on wiki, inevitable duplication and drift) and the horrible usability of search in our phabricator instance.

Looking at the bigger picture, wikitech in general and Tool Labs content on wikitech specifically suffers from a lack of community. There are documentation pages which have been made primarily by subject matter experts on various topics. These pages tend to have good information but poor organization and discoverability. If you think in the same keywords as the original author you will probably find and be able to use the docs. If however you are not familiar with the domain then you will probably struggle to find the content you need. Correcting this requires curation by a group of authors with varied backgrounds and skill sets or paid technical writers who can put themselves in the shoes of the various target users. As I have written elsewhere, I don't believe that paid editing can scale for us. The Wikimedia Foundation will never be able to afford the army of tech writers it would take to organize and maintain documentation in a readable and discoverable form. What we should do instead is invite and encourage our existing users to collaborate on improving and maintaining the documentation as a part of a community.

Here are some of the things I find confusing or cumbersome with the initial process:

  • You need to pick two different usernames at account creation time (wikitech and LDAP). The relationship between these names and when each one will be shown in interfaces, etc is not well described.
  • The fact that the wikitech account name is also the display name in gerrit and is commonly a full name including whitespace is not mentioned anywhere.
  • The shell account name selection mentions svn accounts still which is a relic from a time now long gone.
  • The shell account name is not checked in real time for conflicts.
  • The error message returned when a shell name conflict exists is inscrutable.
  • An email address is required, but that is not mentioned in the instructions.
  • Once your wiki and LDAP accounts are created there is no instruction given on how to continue to upload your ssh key or to request access to Tool Labs.
  • When you do find the Preferences → OpenStack → Add public SSH key area, you end up on a page that again has no indication of what, why and how.
  • The waiting period for manual approval of joining the Tool Labs project is highly artificial. There is no good technical or procedural reason I can see for it to not simply be a matter of clicking a button or checkbox.

I think you should file tickets in this Phabricator Maniphest installation about these issues. I agree that the getting started process can be made easier.

T144710: Create Wikitech/LDAP accounts via a new user friendly guided workflow

Does Striker adequately satisfy the requirements of this task?

Are we done yet? No. Are we in a hopefully better place than we were in 2016? I think so. There seem to be only 2 direct subtasks still open and both of them are a bit off track from the original discussion here, so we could probably close this task. There are still a lot of things touched on here that are not done (and won't be for a while). In some ways this ticket is like T2001: [DO NOT USE] Documentation is out of date, incomplete (tracking) [superseded by #Documentation] and will never really be finished, but eventually will be closed because someone doesn't like seeing it open anymore. :)

taavi added a subscriber: taavi.

Boldly closing this task without any meaningful activity since 2018 and with most subtasks resolved.