The old bot (EranBot) that powered CopyPatrol was written in Python 2, which is now EOL. Our gracious rockstar of a volunteer @JJMC89 has rewritten the bot from scratch in Python 3 (T293688), and it is now ready to be deployed.
Repository for the bot: https://github.com/JJMC89/copypatrol-backend
Staging checklist
- First deploy the bot to staging to ensure everything works smoothly
- For now, use the test db s52615__copypatrol_migrate_test_02_p
- Rework CopyPatrol to interface with the new bot (T340600)
- Deploy the new CopyPatrol code to a VPS test instance
- Seek approval from Turnitin. We're currently only using the sandbox version of TCA (Turnitin Core API). We could also use this conversation to negotiate a long-term supply of credits (T305318)
Production checklist
- Create our new production database with the new schema
- Stop all writes to the toolsdb database (frontend and backend) [downtime begins]
- Run the migration script on copypatrol-migrate-01 to backfill historical data
- Deploy the bot on copypatrol-backend-prod-01 [backend downtime ends]
- Deploy frontend to https://copypatrol.wmcloud.org and ask people to start using it [frontend downtime ends]
- Redirect https://copypatrol.toolforge.org to https://copypatrol.wmcloud.org
- Update PageTriage with the new URL (T362124)
- Have the new bot replace EranBot as enwiki's copyvio tagging bot (discussed at T334265) – see BRFA
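
The backfill step in the production checklist could look roughly like the sketch below. It is not the actual migration script: the table names (`old_reports`, `new_reports`), the columns, and the helper `make_demo_db` are all hypothetical, and it uses an in-memory SQLite database only so the example is self-contained (the real databases are MariaDB, where the equivalent of `INSERT OR IGNORE` is `INSERT IGNORE`). The idea it illustrates is copying rows in id-ordered batches and skipping rows that already exist, so the script can be re-run safely if it is interrupted during the downtime window.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with hypothetical old/new tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE old_reports (id INTEGER PRIMARY KEY, page_title TEXT, status TEXT)"
    )
    conn.execute(
        "CREATE TABLE new_reports (id INTEGER PRIMARY KEY, page_title TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO old_reports VALUES (?, ?, ?)",
        [(i, f"Page {i}", "open") for i in range(1, 6)],
    )
    # Pretend row 3 was already migrated (and updated) by an earlier, interrupted run.
    conn.execute("INSERT INTO new_reports VALUES (3, 'Page 3', 'fixed')")
    conn.commit()
    return conn


def backfill(conn, batch_size=2):
    """Copy old_reports into new_reports in id order; return the number of rows read.

    Rows whose id already exists in new_reports are left untouched,
    which makes the function safe to run more than once.
    """
    last_id = 0
    rows_read = 0
    while True:
        rows = conn.execute(
            "SELECT id, page_title, status FROM old_reports "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        conn.executemany(
            "INSERT OR IGNORE INTO new_reports (id, page_title, status) "
            "VALUES (?, ?, ?)",
            rows,
        )
        conn.commit()  # commit per batch so an interruption loses little work
        rows_read += len(rows)
        last_id = rows[-1][0]  # resume after the highest id seen so far
    return rows_read
```

Paging on `id > last_id` rather than `OFFSET` keeps each batch query cheap on a large table, and stopping writes to the old database first (as the checklist does) means the source rows cannot change underneath the copy.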