Over the last few years I've seen many bot requests where a relatively simple find-and-replace type of edit had to be applied to a large number of wiki pages. Some common examples:
- wikilink changes when redirects are not an option
- template changes where existing transclusions have to be corrected
- URL fixes when the URL scheme of a widely used website changes (a prime example for regex search and replace with substitution)
- spelling fixes (highly contextual, so they have to be semi-automatic; not really the focus of this proposal, as there are better ways)
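To make the URL case concrete, here is a minimal sketch of a regex substitution on wikitext. The old and new URL schemes (old.example.org and example.org) and the function name fix_urls are made up for illustration; only Python's standard re module is used.

```python
import re

# Hypothetical URL change: http://old.example.org/item.php?id=N
# becomes https://example.org/items/N
OLD_URL = re.compile(r"http://old\.example\.org/item\.php\?id=(\d+)")
NEW_URL = r"https://example.org/items/\1"


def fix_urls(wikitext):
    """Return the rewritten wikitext and the number of replacements made."""
    return OLD_URL.subn(NEW_URL, wikitext)


sample = "See [http://old.example.org/item.php?id=42 the entry] for details."
new_text, count = fix_urls(sample)
print(new_text)
print(count)
```

The captured group carries the item ID over into the new scheme, which is what a plain string replace could not do.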
As a Google Summer of Code project, I would like to build a modern, simple-to-use web app that allows users to "find and replace" over a large number of wiki pages in a semi-automatic fashion. It would use websockets (sockjs) for asynchronous communication (search results and edit requests) and OAuth for authentication.
I've made a prototype (currently nothing more than the form), which probably explains the idea best: tools.wmflabs.org/find-and-replace.
I'll add a more detailed proposal later (e.g. how local projects should be able to configure find-and-replace, since AWB users have to be whitelisted on enwiki but not on other wikis).
Existing alternatives
Existing tools that provide such functionality are Extension:Replace Text and the AutoWikiBrowser (AWB). The former is unlikely to be activated on WMF projects due to performance constraints; the latter has a bigger scope and requires users to install software.
A web app version of AWB has long been wished for. The goal of this proposal is not to replace AWB, but rather to take a first step and provide a simple-to-use find-and-replace tool for users who shy away from installing tools like AWB.
Implementation
I've spent some time researching suitable libraries. One main objective is that the stack fits well into the tools environment and community. I'm now at a point where I would say the question of which libraries to choose is mostly settled. The major ones include:
- angularjs for the frontend/client-side rendering
- angular-translate for i18n support (should integrate well with Translatewiki)
- Python's Tornado web server with sockjs support, flask-mwoauth for OAuth negotiation (a test setup for this combination works on tools)
- celery as task manager with dedicated worker processes, Redis as message broker
- pywikibot for the MediaWiki API (currently no OAuth support, see T74065: Pywikibot: Implement support for OAuth. It is probably best to hack support in using mwoauth and then replace it once OAuth is officially supported by pywikibot)
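As a rough illustration of the semi-automatic step, a worker could compute a proposed edit for one page and hand it to the client as a message, and the page would only be saved after the user accepts the diff. The sketch below uses only the Python standard library; the function name propose_edit and the message fields are assumptions made for this example, and the Celery, SockJS and pywikibot wiring is left out on purpose.

```python
import difflib
import json
import re


def propose_edit(title, text, pattern, replacement):
    """Build a JSON message for a proposed edit, or None if nothing matches.

    The message layout (title, replacements, diff, new_text) is hypothetical.
    """
    new_text, count = re.subn(pattern, replacement, text)
    if count == 0:
        return None
    # A unified diff lets the user review the change before it is saved.
    diff = "\n".join(difflib.unified_diff(
        text.splitlines(),
        new_text.splitlines(),
        fromfile=title + " (current)",
        tofile=title + " (proposed)",
        lineterm="",
    ))
    return json.dumps({
        "title": title,
        "replacements": count,
        "diff": diff,
        "new_text": new_text,
    })


message = propose_edit(
    "Example page", "colour and colour\nunchanged line", r"colour", "color"
)
print(json.loads(message)["diff"])
```

Keeping the proposal step free of side effects means nothing is written to the wiki until an explicit, separate edit request arrives from the user.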
About me
I'm a physics student at the University of Göttingen and have been active mostly in the German Wikipedia since 2010. I run a pywikibot-based bot there named AsuraBot. While I've programmed in various languages, writing a web app is something new for me.
I'm looking for possible mentors. If you're interested, please ping me.