Site/Location: eqiad or codfw ? both
Number of systems: 1 should be more than enough
Service: Wikimedia Stewards automations that are ineligible to run under Wikimedia Cloud VPS's terms of use.
Networking Requirements: internal IP
Processor Requirements: 1 or 2 should be sufficient.
Memory: 2 GB
Disks: 20 GB
Other Requirements: Selected Wikimedia Stewards (and/or steward-approved non-stewards) to have root-level access (new access group needed, for now, probably me / @Urbanecm).
Detailed reasoning
Wikimedia Stewards have several workflows with the following characteristics (an example workflow is described below):
- are easy to automate,
- are used frequently enough for automation to have a visible impact,
- allow direct access to (or operate with) Nonpublic personal information and Personal information as defined by relevant WMF policies (Privacy policy or Confidentiality agreement) without explicit consent of the user(s) the data is about, and as such, are bound by the restrictions set by the Privacy policy and/or ANPDP.
Because of the third point, it is currently impossible to experiment with possible automation within WMF premises: there is no suitable production machine and, as far as I know, processing Privacy policy-protected data is prohibited in Wikimedia Cloud by its ToU (in particular, the ToU make it explicit that there are no guarantees in terms of WMCS security, which seems to be incompatible with the expectations set by the Privacy policy).
This can be illustrated with automating (on/off)boarding for community functionaries (described in more detail below), which is the first project I'd like to use the machine for. For a system to be able to automatically provision the required accesses for functionaries, it necessarily needs credentials that allow it to grant/revoke those accesses. This also means that any such system would have direct access to virtually all private data that the WMF exposes to trusted functionaries, beginning with user IP data and ending with security reports. Restricting such access would be impractical or impossible, because the system's purpose is to perform the permission adjustments, and it needs the rights to do so.
A production VM seems to be a reasonable place for such on/offboarding scripts to live. I'm opening this request to start an initial conversation with SRE and stewards about whether having a production machine would even be an option, or whether there are other solutions that are more suitable for the problem I'm proposing to solve here.
Please let me know if there is a better place to run a discussion like this. I'm also happy to discuss the needs we (Stewards) have synchronously, if that would be beneficial.
Example Steward Workflow
The most important workflow that could be automated without significant effort (assuming an environment where private data can be accessed safely) is (on/off)boarding community functionaries. Community functionaries tend to have access to several resources that need to be enabled/disabled individually, in addition to the on-wiki permission group. Many of those resources include access to privileged data, which (as explained above) cannot be maintained from Cloud. Examples include:
- Private wikis, such as checkuser.wikimedia.org (contains user IP data), steward.wikimedia.org (contains miscellaneous WMF confidential data) or vrt-wiki.wikimedia.org (contains excerpts from VRTS and other WMF confidential data)
- Private Mailman lists (such as stewards-l, checkuser-l, global-sysops, global-renamers, ...); some of them are frequently used for deliberations involving WMF confidential data
- Private IRC channels (#wikimedia-checkuser, #wikimedia-privacy, ...); some of them are frequently used for deliberations involving WMF confidential data
- Phabricator ACLs, such as acl*security_steward or acl*stewards, which provide access to sensitive Phabricator tasks.
- Secondary on-wiki user groups (for example, the steward permission is composed of the steward Meta-Wiki group and the steward global CentralAuth-provided group; both need to be granted to make a user an actual steward).
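
To make the shape of such a script more concrete, below is a minimal Python sketch of the planning step only: given the roles a user currently holds and the roles they should hold, it lists which accesses would need to be granted or revoked. Everything in it is an assumption made for illustration. The resource names are taken from the list above, but the role-to-resource mapping, the `Action` type and the `plan_changes` function are hypothetical and do not describe an existing tool. The sketch performs no real changes and touches no credentials.

```python
from dataclasses import dataclass

# Hypothetical mapping: which role implies which resources. The resource
# names come from the examples above; the assignment to roles is assumed.
ROLE_RESOURCES = {
    "steward": [
        ("onwiki_group", "steward (Meta-Wiki)"),
        ("global_group", "steward (CentralAuth)"),
        ("private_wiki", "steward.wikimedia.org"),
        ("mailing_list", "stewards-l"),
        ("phabricator_acl", "acl*stewards"),
    ],
    "checkuser": [
        ("private_wiki", "checkuser.wikimedia.org"),
        ("mailing_list", "checkuser-l"),
        ("irc_channel", "#wikimedia-checkuser"),
    ],
}


@dataclass(frozen=True)
class Action:
    """One planned access change (hypothetical type, dry-run only)."""
    verb: str      # "grant" or "revoke"
    kind: str      # e.g. "mailing_list"
    target: str    # e.g. "stewards-l"
    username: str


def resources_for(roles):
    """Return the set of (kind, target) pairs implied by a set of roles."""
    return {res for role in roles for res in ROLE_RESOURCES[role]}


def plan_changes(username, current_roles, desired_roles):
    """List the grants and revocations needed to move between role sets.

    A resource still required by a remaining role is never revoked.
    """
    current = resources_for(current_roles)
    desired = resources_for(desired_roles)
    grants = [Action("grant", k, t, username) for k, t in sorted(desired - current)]
    revokes = [Action("revoke", k, t, username) for k, t in sorted(current - desired)]
    return grants + revokes


if __name__ == "__main__":
    # Onboarding a new steward: every steward resource is granted.
    for action in plan_changes("ExampleUser", set(), {"steward"}):
        print(action.verb, action.kind, action.target)
```

Separating the planning from the execution is a design choice of the sketch: the plan can be reviewed by a human before anything with real credentials acts on it.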