This document is an attempt to formalize the output of the "Architecting Core: Standalone Services" session from the 2018 Wikimedia Technical Conference (see https://www.mediawiki.org/wiki/Wikimedia_Technical_Conference/2018/Session_notes/Architecting_Core:_stand-alone_services and T206082).
The RFC is [[ https://www.mediawiki.org/wiki/Requests_for_comment/ExternalServicesStandards | on wiki ]], and discussion about it, at least until it is finalized, should move to the [[ https://www.mediawiki.org/wiki/Talk:Requests_for_comment/ExternalServicesStandards| talk page ]].
The session was designed to work on a series of questions:
- What characteristics of a given feature make it a candidate to be provided by an external service, as opposed to being integrated into MediaWiki directly or provided by an extension?
- What characteristics of a given feature disqualify it from being provided by an external service, and instead require that it be integrated into MediaWiki?
- What technical requirements apply to standalone services generally; to standalone services that will be distributed to third parties; and to standalone services that are intended for operation in the Wikimedia production environment?
I am proposing that TechCom adopt the definition, selection criteria, and architecture guidelines below as requirements that will guide the decision to extend MediaWiki via an external service, and the architecture by which that extension is accomplished.
//Standalone services// (or //external services//; the two terms are used interchangeably here and are not intended to be distinguished) are applications that extend MediaWiki in some way, but that are operationally distinct from it in some meaningful way. The mechanism of extension is not specified, beyond saying that it's necessarily interprocess as opposed to intraprocess; queues, API calls and other types of RPC mechanisms, XHR from a client, or shell interaction are all valid examples. It is very likely that one or more MediaWiki extensions will be involved in integration; the important point is that business logic is implemented outside of the MediaWiki extension.
External services that integrate with MediaWiki in a number of different ways are already present within the Wikimedia environment. These include:
- Parsoid (provides parsing for the VisualEditor extension), which is called directly by clients and then consumes the MediaWiki Action API for its data;
- ORES (provides revision scoring for some wikis), a WMF-created service that is called on the server side via an HTTP API, and consumes from MediaWiki via HTTP;
- CirrusSearch (provides search functionality, and is built on Elasticsearch), which consumes changes from a queue, and is called by MediaWiki using an HTTP API.
These services represent three different ways of integrating an external service into MediaWiki, but all fall within the definition in use here.
**Selection Criteria: Deciding whether an external service is appropriate**
The properties listed here are intended to be a guide as to whether a given feature can be provided externally to MediaWiki or not. They are not intended to be exhaustive, and they are provided as guidance rather than requirements. If a proposed feature has one or more properties that appear in the left column, but no properties that appear in the right column, then that feature could be implemented as a standalone service. If the proposed feature has one or more properties that appear in the right column, then significant care should be taken before implementing as a standalone service.
| **Properties that make a feature suitable for implementation as a standalone service**|**Properties that make a feature //unsuitable// for implementation as a standalone service**|
|Feature is either async to servicing requests, or if sync, provides optional features.|Feature is synchronous to the request, and the request cannot complete until the feature is successfully provided.|
|State is independent - the functionality provided does not require view of MediaWiki state that is guaranteed to be current.|Feature only works correctly with a consistent and current view of MediaWiki state.|
|3rd party library or service exists that can provide the needed functionality with minimal integration|Feature requires direct access to the MediaWiki database, and cannot use an API to retrieve or update data.|
|A non-PHP language or framework exists that significantly simplifies implementation|Requires features/functionality provided by other MediaWiki extensions that are implemented internally|
|The feature is independently useful, and is likely to have non-MediaWiki use cases| |
|Organizational factors apply: for example, there is a desire for the feature to be owned and maintained autonomously, or its pace of development is unlikely to correspond to the pace of MediaWiki development| |
If any of the following are true, then the feature **absolutely should be implemented as an external service**, with appropriate architectural changes made elsewhere to eliminate disqualifying properties.
|**Properties that //require// that a service is provided externally to MediaWiki**|
|Elevated security need: Due to data isolation or other operational requirements, a given feature cannot be provided in the same operational environment as MediaWiki itself.|
|Excessive or potentially unbounded resource needs: Image thumbnailing, video transcoding, and machine translation are all examples of features where unpredictable properties such as request rates and input size have a significant impact on the resources required. Because the operator cannot control these factors, they may result in resource contention and denial of service.|
|Long-running processes involved|
|The feature in question is used to triage or fix MediaWiki in the case of failures|
|The application is going to be run in a separate environment from MediaWiki itself|
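The way the three tables above combine can be sketched in code. This is only an illustration of the guidance as written, not a proposed tool: the names `FeatureAssessment`, `Recommendation`, and `recommend` are hypothetical, and the property strings are free-form labels chosen by whoever fills in the assessment.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Recommendation(Enum):
    """Possible outcomes of applying the selection criteria (hypothetical names)."""
    REQUIRED_EXTERNAL = "should be implemented as an external service"
    SUITABLE = "could be implemented as a standalone service"
    CAUTION = "take significant care before implementing as a standalone service"
    NO_GUIDANCE = "the criteria do not point either way"


@dataclass
class FeatureAssessment:
    """Properties of a proposed feature, grouped by the table they appear in."""
    suitable: set[str] = field(default_factory=set)    # left column of the first table
    unsuitable: set[str] = field(default_factory=set)  # right column of the first table
    required: set[str] = field(default_factory=set)    # the "require" table


def recommend(assessment: FeatureAssessment) -> Recommendation:
    # Any "require" property takes precedence; disqualifying properties are
    # then expected to be removed by architectural changes elsewhere.
    if assessment.required:
        return Recommendation.REQUIRED_EXTERNAL
    # Any right-column property calls for significant care.
    if assessment.unsuitable:
        return Recommendation.CAUTION
    # One or more left-column properties and no right-column properties.
    if assessment.suitable:
        return Recommendation.SUITABLE
    return Recommendation.NO_GUIDANCE
```

For example, a feature assessed with `suitable={"async to servicing requests"}` and nothing else yields `Recommendation.SUITABLE`, while adding `required={"long-running processes"}` yields `Recommendation.REQUIRED_EXTERNAL` regardless of the other columns. Since the criteria are guidance rather than requirements, the result is a starting point for discussion, not a verdict.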
**Architectural and Implementation Guidelines for external services**
All features implemented as standalone services must have the following properties:
- Actually do something
- Minimize data collection
- Provide for compliance with GDPR or other data privacy/data ownership frameworks
- Implement privacy controls that are //at least// equivalent to those of any calling service. For example, if the privacy controls of the calling service specify that IP addresses will not be stored for more than 90 days, the external service may not store IP addresses for longer than that time.
- Be licensed under an OSI-approved license
- Avoid needlessly duplicating features or functionality provided in other services
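The "at least equivalent" privacy rule can be read as: the retention period an external service applies must not exceed the shortest period promised by any service that calls it. A minimal sketch of that reading follows; the function names are hypothetical, and real privacy controls cover more than retention periods.

```python
from datetime import datetime, timedelta


def effective_retention(service_limit: timedelta, caller_limits: list[timedelta]) -> timedelta:
    """Longest retention the external service may apply: the minimum of its
    own limit and the limits of every calling service."""
    return min([service_limit, *caller_limits])


def must_purge(stored_at: datetime, now: datetime, retention: timedelta) -> bool:
    """True once a stored record (for example an IP address) is older than allowed."""
    return now - stored_at > retention
```

With the example from the list above, a service that would otherwise keep IP addresses for 180 days but is called by a service with a 90-day policy gets `effective_retention(timedelta(days=180), [timedelta(days=90)])`, which is 90 days.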
If the standalone service is intended to be distributed for general use, it must have the properties above, and in addition must:
- Have a documented installation process
- Have a documented uninstallation process
- Have a documented upgrade process
- Be versioned using semver
- Indicate versions of MediaWiki with which it's compatible
- Provide a configuration mechanism that does not involve changing the distributed code
- Provide a mechanism by which support (community or otherwise) can be requested
- Provide a mechanism by which patches can be proposed
- Provide a mechanism by which public security advisories are issued
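As an illustration of the versioning and compatibility items, a service could declare the range of MediaWiki versions it supports and compare versions numerically rather than as strings. The sketch below assumes plain `MAJOR.MINOR.PATCH` version strings and ignores semver pre-release and build tags; the function names and the version numbers used in the example are hypothetical.

```python
import re

_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)", re.ASCII)


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    match = _VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_compatible(mediawiki: str, minimum: str, below: str) -> bool:
    """True if minimum <= mediawiki < below, compared component by component."""
    return parse_version(minimum) <= parse_version(mediawiki) < parse_version(below)
```

A service declaring support for `1.31.0` up to but excluding `1.33.0` would accept `is_compatible("1.32.1", "1.31.0", "1.33.0")` and reject `"1.9.0"`, which a naive string comparison would get wrong.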
If the standalone service is intended to be used in the Wikimedia production environment, it should comply with the guidelines above, and in addition must:
- Have SLIs and SLOs
- Have WMF-compatible monitoring
- Use Wikimedia deployment tooling
- Have passed review by WMF Security
- Use a language and toolset that have been approved by TechCom
- Have an owner, and a plan for ongoing maintenance
- Have a runbook
- Have WMF-compatible structured logging
- Be architected such that it can run in multiple data centers simultaneously
- Be tolerant to faults, including network partitions or other datacenter level issues
- Have pinned/pinnable dependencies
- Have trusted upstream asset chains
- Have backups if the service stores any data
- Perform sufficiently for very high request or data volumes
- Have users, or a plan to acquire users
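To make the SLI/SLO item concrete, one common pattern is to define an availability SLI as the fraction of successful requests and to track how much of the error budget implied by the SLO has been spent. The sketch below assumes that pattern; this document does not specify which indicators or targets WMF would require, and the function names and numbers are illustrative.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests served successfully in the measurement window."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return good_requests / total_requests


def error_budget_remaining(good_requests: int, total_requests: int, slo_target: float) -> float:
    """Share of the error budget still unspent (negative once the SLO is missed)."""
    allowed_failures = (1.0 - slo_target) * total_requests
    actual_failures = total_requests - good_requests
    if allowed_failures == 0:
        return 1.0 if actual_failures == 0 else 0.0
    return 1.0 - actual_failures / allowed_failures
```

For instance, with a hypothetical SLO target of 0.999 and 99,950 successful requests out of 100,000, the SLI is 0.9995 and roughly half of the error budget remains.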
An example of a service that has properties that make it unsuitable, but which was nonetheless implemented as a standalone service, is CirrusSearch. Though search functionality is synchronous and a search request cannot succeed until the external service returns, operational demands are enough to suggest an external service.