Once [[ https://phabricator.wikimedia.org/T206010 | questions about the external interface have been answered ]], we'll need to plan the remainder of the implementation, including the choice of technologies and the operational semantics.
## Some random notes
### Integrating w/ MediaWiki
MediaWiki supports plugging in session persistence with a `BagOStuff` implementation, and a [[ https://doc.wikimedia.org/mediawiki-core/master/php/classRESTBagOStuff.html | `RESTBagOStuff` ]] already exists (apparently [[ https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/293554 | created for this purpose ]]), so we should be able to make use of it on the MediaWiki side. The only issue is that it uses PHP's [[ https://secure.php.net/manual/en/function.serialize.php | serialize() ]] and [[ https://secure.php.net/manual/en/function.unserialize.php | unserialize() ]] for the body of `PUT` and the response of `GET` respectively, which isn't ideal, since we have specified JSON. Given that it seems to have been created for this purpose, perhaps we can make a case for changing it.
If it is not possible to make backward-incompatible changes to `RESTBagOStuff`, we could always provide compatibility using `(Content-Type|Accept): text/plain`.
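
The `text/plain` compatibility path could look roughly like the following on the service side. This is only a sketch under assumptions: the function name `decode_put_body` and the use of `application/json` as the JSON media type are hypothetical and not part of any agreed interface. The idea is that JSON bodies are validated, while `text/plain` bodies (for example, PHP `serialize()` output) are stored as opaque bytes.

```python
import json

# Assumed media types: the notes only name text/plain for the compatibility
# path; application/json is a guess at how the JSON interface is labelled.
JSON_TYPE = "application/json"
PLAIN_TYPE = "text/plain"


def decode_put_body(content_type: str, body: bytes) -> bytes:
    """Return the bytes to store for a PUT, chosen by Content-Type.

    Hypothetical helper, not an existing API.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == JSON_TYPE:
        # Validate only; json.loads raises a ValueError subclass on bad input.
        json.loads(body.decode("utf-8"))
        return body
    if media_type == PLAIN_TYPE:
        # Opaque payload (e.g. PHP serialize() output), stored untouched.
        return body
    raise ValueError(f"unsupported media type: {media_type!r}")
```

A `GET` handler would mirror this by inspecting `Accept`, returning the stored bytes with the matching `Content-Type`.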
### Replication semantics
Based on the requirements for session storage, we should be able to assume in all cases that //get// and //set// use `ConsistencyLevel.LOCAL_QUORUM` and //delete// uses `ConsistencyLevel.EACH_QUORUM`. However, the service itself will be a very straightforward key-value storage implementation, likely applicable for many future use-cases, not all of which will necessarily be satisfied by these semantics. In the interest of finding balance between future-proofing and YAGNI, we could make the consistency level configurable on a per-method basis. For example, something like:
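
A minimal sketch of such a per-method configuration, written in Python purely for illustration. The `consistency` config key, the `Consistency` enum, and `load_consistency` are all hypothetical names; only the two level names and the per-method defaults come from the text above.

```python
from enum import Enum


class Consistency(Enum):
    # Names mirror the Cassandra consistency levels mentioned above.
    LOCAL_QUORUM = "LOCAL_QUORUM"
    EACH_QUORUM = "EACH_QUORUM"


# Defaults taken from the session-storage requirements.
DEFAULTS = {
    "get": Consistency.LOCAL_QUORUM,
    "set": Consistency.LOCAL_QUORUM,
    "delete": Consistency.EACH_QUORUM,
}


def load_consistency(config: dict) -> dict:
    """Merge per-method overrides from `config` over the defaults."""
    levels = dict(DEFAULTS)
    for method, name in config.get("consistency", {}).items():
        if method not in DEFAULTS:
            raise ValueError(f"unknown method: {method!r}")
        try:
            levels[method] = Consistency[name.upper()]
        except KeyError:
            raise ValueError(f"unknown consistency level: {name!r}") from None
    return levels


# Example: a deployment that wants stronger reads than the default.
levels = load_consistency({"consistency": {"get": "each_quorum"}})
```

Methods that are not overridden keep their defaults, so an empty configuration reproduces the session-storage semantics.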
If this failed to provide enough flexibility for some future use-case, it would be straightforward to extend the API, in a backward-compatible way, to accept a per-operation consistency parameter.
With NodeJS, the only practical source of dependencies is http://npmjs.org. Dependencies, both those explicitly declared and those that are transitive, are fetched whenever `npm install` is invoked, and there is no chain of trust. These dependencies -- the entire contents of `node_modules/` -- are as much a part of our production applications as any code that we write, yet despite the time, care, and effort we put into reviewing even the smallest of changes to our code bases, the contents of `node_modules/` remain opaque to us.
A saner approach for something so security-critical would be to prioritize a manageable list of dependencies that can be sourced entirely from within our current version of Debian.
During a meeting on session storage, the following criteria were suggested as a way of evaluating technologies:
- availability of Debian-sourced dependencies
- ease of coding
- complexity of front-end set-up
- compatibility with Cassandra and stability of the driver
| | Debian-sourced dependencies | Tooling | Ease of coding | Operational complexity | Cassandra driver |
| ---- | ---- | ---- | ---- | ---- | ---- |
| PHP | Possibly, sort of (if package in Sid is fixed and uploaded to backports) | ??? | Very | FastCGI integrated w/ Nginx/Apache(?) | Good; //!!Only present in unstable ([[ https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=865271 | and it is broken ]])!!// |
| Python | Yes | Django, Falcon, Flask, etc. | Very | | Good; v3.7.1 in Stretch (v3.14.0 in Sid) |