
Function infrastructure for Cloud/Toolforge ("Wikimedia Cloud Functions")
Open, Medium, Public

Description

Background: T379526#10315154, but I have many more use cases for it.
Here I propose that Wikimedia Cloud host an infrastructure similar to AWS Lambda, where users can write and deploy their own functions, including:

  • "Readonly" functions - these do not edit Wikimedia wikis, but have access (read, and also write if the owner of the function has permission) to Cloud infrastructure such as NFS, ToolsDB and the database replicas.
  • "Read/Write" functions - these require (end) users to log in via OAuth before use, and can then perform actions (such as edits) on the user's behalf. Users can thus build maintenance tools with such functions.

Cloud functions will have public APIs (SameSite=None must be set if they do editing), so they can be called from outside of Cloud (with customizable CORS settings).
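To make the idea more concrete, here is a minimal sketch of what a "Readonly" function with per-function CORS settings might look like. Everything in it is an assumption for illustration: the `handler(event)` entry point, the shape of the request and response dicts, and the `ALLOWED_ORIGINS` setting are hypothetical and do not describe any existing Toolforge API.

```python
import json

# Hypothetical per-function CORS setting; not an existing Toolforge option.
ALLOWED_ORIGINS = {"https://www.wikidata.org", "https://en.wikipedia.org"}


def cors_headers(origin):
    """Return CORS headers only when the calling origin is allowed."""
    if origin in ALLOWED_ORIGINS:
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}


def handler(event):
    """Assumed entry point: takes a request-like dict, returns a response dict."""
    origin = event.get("headers", {}).get("Origin")
    name = event.get("query", {}).get("name", "world")
    return {
        "status": 200,
        "headers": {"Content-Type": "application/json", **cors_headers(origin)},
        "body": json.dumps({"greeting": f"Hello, {name}"}),
    }
```

A request from an origin on the list gets the `Access-Control-Allow-Origin` header echoed back, while any other origin gets a response without it, so browsers on other sites cannot read the result.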

How is this different from webservices running in Kubernetes:

Kubernetes web services are not serverless or stateless, and they may suffer from issues such as OOM (out-of-memory) failures. Many large tools, such as Mix'n'Match, are frequently broken. They are usually fixed automatically once the webservice is restarted, but ordinary users cannot do this.

Also, a Toolforge web service must comprise both a frontend and a backend part. With the backend logic moved to another component, users will be able to create frontend-only tools, which may be hosted without a per-tool Kubernetes container.

It will also shorten the development cycle. For example, a user can write a function that downloads a webpage from website A and creates a new Wikidata item based on its content. In the current Toolforge infrastructure, this must be deployed as a webservice, and redeployed if we need to add support for another website (B). In the proposed infrastructure, this can be accomplished with a new function, with no effect on the existing one.
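The website A / website B scenario could be sketched roughly as follows. This is only an illustration under assumptions: the `deploy` decorator, the `FUNCTIONS` registry and the `invoke` helper are hypothetical stand-ins for whatever deployment mechanism the platform would provide, and the parsing logic is a placeholder rather than real Wikidata item creation.

```python
FUNCTIONS = {}  # hypothetical registry: function name -> callable


def deploy(name):
    """Register a function under a name, standing in for a deploy step."""
    def register(func):
        FUNCTIONS[name] = func
        return func
    return register


@deploy("import-from-site-a")
def import_from_site_a(page_html):
    # Placeholder logic: a real function would parse site A's markup
    # and build a Wikidata item from it.
    return {"source": "A", "label": page_html.strip()}


# Adding support for site B is a separate deployment; the function for
# site A is neither modified nor redeployed.
@deploy("import-from-site-b")
def import_from_site_b(page_html):
    return {"source": "B", "label": page_html.strip().title()}


def invoke(name, payload):
    """Dispatch a call to the named function."""
    return FUNCTIONS[name](payload)
```

Calling `invoke("import-from-site-a", ...)` behaves the same before and after the second function is registered, which is the property the paragraph above relies on.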

Event Timeline

Restricted Application added a subscriber: Aklapper.
fnegri triaged this task as Medium priority. Nov 13 2024, 10:34 AM
fnegri added a project: Epic.
fnegri subscribed.

Thanks for this proposal! This is something that has been in the back of my mind since I joined the foundation in 2022. I would love to see this happen.

There are many ways this could be implemented, and I think the first step should be some user research into what users need from such a service.

The only technical note I have at the moment is that this could (possibly) be implemented on top of the current Toolforge Kubernetes, using something like https://knative.dev

How is this different from webservices running in Kubernetes:

Kubernetes web services are not serverless or stateless, and they may suffer from issues such as OOM (out-of-memory) failures. Many large tools, such as Mix'n'Match, are frequently broken. They are usually fixed automatically once the webservice is restarted, but ordinary users cannot do this.

Serverless does not eliminate capacity planning, as far as I understand the concept. As @fnegri has mentioned with https://knative.dev, tooling for the server side of this space is generally a layer on top of a Kubernetes cluster that takes care of autoscaling and often provides simple abstractions for responding to incoming events.

I do think there are some neat possibilities for adding low code/no code solutions to the WMCS product suite that let folks focus on unique business logic and reduce boilerplate. I just don't see how that would magically make the implementation of Mix'n'Match more stable implicitly.

Somewhat relevant to this discussion, I just found this annotated document explaining how and why Amazon released AWS Lambda 10 years ago: https://www.allthingsdistributed.com/2024/11/aws-lambda-turns-10-a-rare-look-at-the-doc-that-started-it.html