[MediaWikiAnalysis](https://github.com/MetricsGrimoire/MediaWikiAnalysis) is a tool to collect statistics from MediaWiki sites, via the MediaWiki API. It is a part of the [MetricsGrimoire toolset](https://github.com/MetricsGrimoire), and it is currently [used for getting information from the MediaWiki.org site](http://korma.wmflabs.org/browser/mediawiki.html), among others.
The statistics currently collected by MediaWikiAnalysis are only part of what could be collected, and the tool itself could be improved. Some possible directions:
# Explore the MediaWiki API in detail and extract as much information from it as possible.
# Improve the efficiency of data retrieval, and make retrieval incremental.
# Propose (and if possible, implement) changes to the MediaWiki API if needed, to support advanced collection of data.
# Use SQLAlchemy instead of MySQLdb for managing the MediaWikiAnalysis database.
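
As an illustration of the incremental retrieval direction, the following is a minimal sketch and not how MediaWikiAnalysis currently works. It assumes the tool stores the timestamp of the newest change it has already seen, and resumes from there using the `list=recentchanges` module of the MediaWiki API, following the `continue` block returned by the API to page through results. The function names (`fetch_json`, `iter_recent_changes`) and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(api_url, params):
    """Perform one GET request against a MediaWiki api.php endpoint."""
    query = urllib.parse.urlencode(params)
    request = urllib.request.Request(
        api_url + "?" + query,
        # Hypothetical client name; a real tool should identify itself.
        headers={"User-Agent": "incremental-retrieval-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_recent_changes(fetch, since):
    """Yield changes from `since` (ISO 8601 timestamp) onwards, oldest first.

    `fetch` is any callable mapping a params dict to a decoded JSON reply,
    which keeps the paging logic testable without network access.
    """
    base = {
        "action": "query",
        "format": "json",
        "list": "recentchanges",
        "rcdir": "newer",   # enumerate from oldest to newest
        "rcstart": since,   # resume from the last stored timestamp
        "rcprop": "ids|title|user|timestamp",
        "rclimit": "max",
    }
    cont = {}
    while True:
        # Merge the continuation values from the previous reply, if any.
        reply = fetch({**base, **cont})
        for change in reply.get("query", {}).get("recentchanges", []):
            yield change
        if "continue" not in reply:
            break
        cont = reply["continue"]


# Possible usage (performs network requests, so it is left commented out):
# fetch = lambda params: fetch_json("https://www.mediawiki.org/w/api.php", params)
# for change in iter_recent_changes(fetch, "2015-01-01T00:00:00Z"):
#     print(change["timestamp"], change["title"])
```

Because the start timestamp is inclusive, a change recorded exactly at `since` will be returned again, so a real implementation would probably need to deduplicate on `rcid` before storing. Note also that wikis keep recent changes only for a limited period, so a first full import would likely need other API modules.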
Optionally, candidates can also develop a Python/Pandas library for analyzing the resulting database and computing the most relevant metrics. The current GrimoireLib can serve as inspiration for this line of development.
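
To give an idea of what such an analysis library could compute, here is a small sketch using Pandas. It is only an example under stated assumptions: the column names `timestamp` and `user` and the function name `monthly_activity` are hypothetical and do not reflect the actual MediaWikiAnalysis schema. In practice the input frame would probably be loaded from the database, for instance with `pandas.read_sql`.

```python
import pandas as pd


def monthly_activity(revisions):
    """Return the number of edits and of distinct authors per month.

    `revisions` is assumed to have a `timestamp` column (anything that
    pd.to_datetime accepts) and a `user` column; both names are
    hypothetical.
    """
    frame = revisions.copy()  # leave the caller's frame untouched
    # Normalize to UTC and reduce each timestamp to a "YYYY-MM" label.
    stamps = pd.to_datetime(frame["timestamp"], utc=True)
    frame["month"] = stamps.dt.strftime("%Y-%m")
    return frame.groupby("month").agg(
        edits=("timestamp", "count"),
        authors=("user", "nunique"),
    )
```

The result is indexed by month, with one column per metric, so further metrics can be added as extra named aggregations.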
When preparing their proposals, candidates are urged to analyze the problems that may arise in each proposed line of work and to specify how they will deal with them. Proposals should also describe the general approach for improving efficiency and incremental collection, and for identifying which modifications to the API would be useful.
- Mentors: Alvaro del Castillo, Daniel Izquierdo, Jesus M. Gonzalez-Barahona
* Primary mentor: @jgbarah
* Co-mentor: @dicortazar
* Other mentors: //(optional, Phabricator username)//
* Skills: Python, SQL, PHP (only in case of proposing changes to MediaWiki API)
* Estimated project time for a senior contributor: 2-3 weeks
* Microtasks: T114437 T114439 T114440