MediaWiki and the ecosystem of services that support it in Wikimedia's production environment emit a large volume of performance timing data, such as the time it took to process a request or the speed of a network link. Much of this data is aggregated in two log-aggregation systems with graphing capabilities, but it is not well curated: important metrics are mixed in with unimportant ones. The data needs a curator!

We have [http://performance.wikimedia.org/](http://performance.wikimedia.org/) provisioned, and we'd like that space to feature some key performance metrics about the Wikimedia cluster, perhaps accompanied by short glosses that help readers interpret the data. (See [gdash.wikimedia.org](http://gdash.wikimedia.org/) for an approximation of such a system.) The mentor will be happy to provide an overview of the data that is available, the means of accessing it, and the tooling available for plotting it.

This task is suitable for anyone with an interest in data analysis and performance analysis. Some facility with a language that has good data-analysis libraries, such as Python or R, is desirable but not required.
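As a rough sketch of the kind of work involved, gdash-style dashboards typically read time series from Graphite's render API, which can return JSON datapoints for a metric. The host and metric name below are purely illustrative, not the actual Wikimedia endpoints:

```python
import json
from statistics import mean
from urllib.parse import urlencode

# Hypothetical Graphite host -- the real endpoint for Wikimedia's
# aggregated metrics may differ.
GRAPHITE_RENDER = "http://graphite.example.org/render"

def render_url(target, hours=24):
    """Build a Graphite render-API URL that returns JSON datapoints
    for the given metric over the last `hours` hours."""
    return GRAPHITE_RENDER + "?" + urlencode({
        "target": target,
        "from": "-%dh" % hours,
        "format": "json",
    })

def series_mean(payload):
    """Average the non-null datapoints of the first series in a
    Graphite JSON response ([value, timestamp] pairs)."""
    points = json.loads(payload)[0]["datapoints"]
    return mean(v for v, _ in points if v is not None)

# Example with a canned response, so no network access is needed;
# the metric name is made up for illustration.
sample = ('[{"target": "MediaWiki.requestTime.p95",'
          ' "datapoints": [[120, 0], [null, 60], [180, 120]]}]')
print(series_mean(sample))
```

From a summary statistic like this, a curated dashboard could highlight a handful of key metrics with plots (e.g. via matplotlib or ggplot2) and a sentence of interpretation each.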
- Possible mentor: Ori Livneh