[x] Detailed plan agreed with mentors: T93498 (will be revised and moved to T90238)
[x] Phabricator project created: #Mediawiki-extensions-SmiteSpam
[x] Meetings with mentors started: communication is generally over email; Google Hangouts is used for quick queries and progress checks.
[x] Bonding period report published (see below)
= Bonding Period Report =
- **Work done**:
- Development environment was already set up
- Gerrit repository created at [[https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/SmiteSpam | mediawiki/extensions/SmiteSpam ]]
- Extension's skeleton merged as [[ https://gerrit.wikimedia.org/r/#/c/211364/ | first commit ]]
- Spam honeypot wiki set up on a Labs instance at http://honeypot-wiki-alpha.wmflabs.org (it has not yet started collecting spam)
- Was pointed to http://markmail.org/search/?q=spam%20list%3Aorg.wikimedia.lists.mediawiki-l to make it easier to study the experiences of third-party wikis that have been targeted by spammers
- **Minimum Viable Product**
- As described in T93498: a crude prototype of the extension that can perform the basic function of searching for pages matching a single rule and listing the matching pages.
- **Changes from original plan**
- Spam **pages** are a much bigger problem than spammy **edits**. Therefore, the extension will focus only on identifying spam pages. Identifying spam edits may be added later if time permits.
- **Communication plan**
- Quick queries and progress checks will happen on Google Hangouts. Questions with larger scope will be asked by email.
- **Issues identified/lessons learnt**
- The number of pages that need to be processed could be very large. The current plan is simply to load them all into PHP memory and process them one by one, but a procedure that is more memory-efficient (at the cost of some speed), such as fetching and checking pages in fixed-size batches, will probably need to be implemented instead. This will be done shortly after the MVP.
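The batched alternative mentioned above can be sketched as follows. This is a language-agnostic illustration in Python, not the extension's actual code (which would use MediaWiki's PHP database layer); `fetch_batch`, `process_in_batches`, and the sample data are all hypothetical names invented for this sketch. The key idea is to keep only one batch of pages in memory at a time, continuing from the last seen page ID:

```python
def fetch_batch(pages, after_id, batch_size):
    """Return up to batch_size pages with id > after_id.

    Stand-in for a database query such as
    SELECT ... WHERE page_id > :after_id ORDER BY page_id LIMIT :batch_size.
    """
    return [p for p in pages if p["id"] > after_id][:batch_size]

def process_in_batches(pages, check, batch_size=500):
    """Yield pages matching `check`, holding at most one batch in memory."""
    last_id = 0
    while True:
        batch = fetch_batch(pages, last_id, batch_size)
        if not batch:
            break
        for page in batch:
            if check(page):
                yield page
        # Continue from the highest ID seen so far.
        last_id = batch[-1]["id"]

# Toy example: flag pages whose text contains an external link.
pages = [
    {"id": 1, "text": "Welcome to the wiki"},
    {"id": 2, "text": "Buy cheap pills at http://spam.example"},
    {"id": 3, "text": "Plain article"},
]
spam = list(process_in_batches(pages, lambda p: "http://" in p["text"], batch_size=2))
# spam now holds only the page with id 2
```

Continuing on the primary key rather than using an increasing OFFSET keeps each query cheap even on large tables, which is why memory use stays bounded at roughly one batch regardless of wiki size.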