https://www.mediawiki.org/wiki/MediaWiki_Developer_Summit_2015#Schedule
Tuesday 27, 9:45am
Etherpad: http://etherpad.wikimedia.org/p/MDS_2015_SOA_plenary
Agenda
- Introduction: ~15 minutes
- Discussion on open questions: ~30 minutes
Why SOA?
- well-defined interfaces / APIs:
- org / team scaling: don't need to understand every detail to get started
- parallelize development behind stable interfaces
- testing: clear & narrow interface to test against, can mock dependencies
- horizontal layers / interfaces can reduce vertical silo tendency & help to identify common concerns / patterns
- performance: parallelism using distribution
- security & robustness: least privilege, fault isolation & monitoring
What we have done so far
- Parsoid
- feature services: mathoid, citoid, hieroglyphs?
- PHP API improvements
- HHVM perf -> graph
- features for mobile
What we learned
- Stable APIs are great:
- PHP API powers apps, Parsoid, OCG, ..
- Parsoid API powers VE, Flow, Kiwix, content translation, ..
- Isolation is great
- example: had issue in OCG; service isolation meant that potential damage was limited
- Got to use third-party code, new contributors: MathJax, Zotero
- Both use client-side code on server
- No standard / convenient solutions for:
- monitoring
- caching
- some security aspects
- Infrastructure & deployment
- shortage of manpower in ops for puppet, reviews
- mostly using trebuchet, which has improved but still has plenty of issues (mostly in salt land)
- can we find something simpler / more reliable & better integrated with config management?
- Unclear responsibilities: what happens when stuff breaks?
What we are currently working on
- Parsoid: Perfecting rendering, media support in preparation for Parsoid-powered views
- Wikidata Query Service: see later session
- API improvements
- lots of work on PHP API
- Performance: HHVM, ongoing per-request overhead optimizations
- Features in support of mobile
- RESTBase close to first deploy
- light-weight & high performance REST API driven by Swagger specs
- optimized for storage / caching, backed by internal services like the PHP API, Parsoid, Mathoid etc.
- aims to provide standard monitoring, security headers, CSRF validation, sanitization and authorization facilities, to avoid duplicating that effort in each service
- initial focus on HTML content and revision metadata, including services like revscore
- looking into HTML section editing and retrieval
- Coming up:
- Auth service (Wikia: Helios)
- Image scaling (Wikia: Vignette)
- Small feature services like svg to png renderer or hieroglyphs
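To make the "REST API driven by Swagger specs" idea from the RESTBase item above more concrete, here is a minimal Python sketch of spec-driven routing. It is not RESTBase's implementation: the `x-backend` key is an invented vendor extension, the paths are made up for the example, and the backends are plain functions standing in for internal services such as Parsoid or the PHP API.

```python
from __future__ import annotations

import re
from typing import Callable

# Hypothetical spec fragment, shaped like a Swagger "paths" object.
# "x-backend" is an invented extension used only in this sketch.
SPEC = {
    "paths": {
        "/page/html/{title}": {"get": {"x-backend": "parsoid"}},
        "/page/revision/{rev}": {"get": {"x-backend": "php_api"}},
    }
}

# Stand-ins for internal services; real ones would be network calls.
BACKENDS: dict[str, Callable[[dict[str, str]], str]] = {
    "parsoid": lambda params: f"html for {params['title']}",
    "php_api": lambda params: f"metadata for revision {params['rev']}",
}


def compile_routes(spec: dict) -> list[tuple[str, re.Pattern[str], str]]:
    """Turn each path template in the spec into (method, regex, backend)."""
    routes = []
    for template, methods in spec["paths"].items():
        # "{name}" becomes a named group matching one path segment.
        pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
        for method, operation in methods.items():
            routes.append(
                (method.upper(), re.compile(f"^{pattern}$"), operation["x-backend"])
            )
    return routes


def dispatch(routes, method: str, path: str) -> tuple[int, str]:
    """Route a request to the backend named in the spec, or return 404."""
    for route_method, regex, backend in routes:
        match = regex.match(path)
        if route_method == method and match:
            return 200, BACKENDS[backend](match.groupdict())
    return 404, "not found"


routes = compile_routes(SPEC)
html_response = dispatch(routes, "GET", "/page/html/Main_Page")
rev_response = dispatch(routes, "GET", "/page/revision/42")
missing_response = dispatch(routes, "POST", "/page/html/Main_Page")
```

The point of the design is that the spec is the single source of truth: adding an endpoint means adding a spec entry, and shared concerns (monitoring, security headers, validation) could be applied once inside `dispatch` rather than in every backend.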
Possible discussion topics
- scaling down for testing and third-party use (covered in https://phabricator.wikimedia.org/T86559?)
- should we start targeting cheap VMs with packages, images, vagrant, puppet, docker files or [insert here]?
- Keeping complexity in check
- granularity
- communication patterns
- stateless services
- limited number of platforms: PHP, Node, some Java & Python
- Can we move faster without breaking?
- test coverage
- ability to restart & roll back: stateless vs. stateful
- security: XSS, CSRF
- responsibility: who gets paged when something breaks?
- isolation, CI & deployment: next session
- Front-ends as API consumers
- Should we gradually rework the desktop and mobile skins as API consumers?
- Can we achieve fully cached page views for logged-in readers?
- ESI vs. client-side
- Using Parsoid HTML for views
- Parsoid eventually becoming the default parser?
This session should ideally be followed by T86138 and T86372.