The Support and Safety team has requested improvements to several self-written tools that support important tasks in their workflows. This ticket is an investigation into the DMCA Takedown Form, while T159467 covers the Child Protection Takedown Form.
The tools are currently found at http://lcatools.corp.wikimedia.org — @Jalexander will be able to provide login credentials to pertinent developers on request for investigation and/or development. Existing code is here: https://github.com/jamesryanalexander/lca-tools
What the tool does:
This tool is reachable from either the main page or the left rail of the Trust and Safety tools wiki as "DMCA Takedown Form".
This is a form for recording information about a DMCA takedown (data about the takedown itself as well as the file or page being taken down) and for attaching a file with additional information as needed. The form sends the data (and the file) to Lumen Database (formerly Chilling Effects) via their API and receives the Lumen ID/URL in return, which is used for the log and the Sugar case.
The tool formats posts for:
- WMFwiki -- provides link to post on WMFWiki and allows copy and paste of the main post
- the User Talk page of the uploader -- also allows you to make the edit directly via MW OAuth in addition to copy/paste
- Commons Village pump and the Commons DMCA noticeboard, when the takedown is on Commons -- also allows one-click posting via OAuth in addition to copy/paste
- The tool also creates a SugarCRM case with basic info about the takedown.
There is a separate sub-tool that only reports to Lumen, without formatting the posts or creating the Sugar case. It is mostly used after the API has been down or other issues have occurred.
Requested changes:
- Right now, the tool only tracks the files that are taken down. We also want to track the requests that don't lead to a takedown, so there should be a full log. There should likely be an option added for request granted (image removed) or denied (image kept.)
- The 'Project' dropdown is currently hardcoded. Can it be expanded, or given an 'Other' option that opens a textbox (with validation)?
- Some of the fields should not be mandatory:
  - Anything required by the Lumen API should be required.
  - If the request was not complied with, then nothing should be required.
- This should write to SalesForce instead of Sugar. The exact same data should be stored.
- Potentially add multi-file support.
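As a rough illustration of the mandatory-field rules above, here is a minimal Python sketch. The field names in `LUMEN_REQUIRED_FIELDS` are placeholders, not the actual Lumen API requirements, and the sketch assumes that a denied request is only logged and not sent to Lumen.

```python
from typing import Dict, List

# Placeholder field names (assumption); the real list must be taken
# from the Lumen API documentation.
LUMEN_REQUIRED_FIELDS = ("title", "sender_name", "work_url")


def missing_required_fields(form: Dict[str, str], complied: bool) -> List[str]:
    """Return the names of required fields that are empty or absent.

    If the request was not complied with, nothing is required, so the
    result is always an empty list.
    """
    if not complied:
        return []
    return [
        name
        for name in LUMEN_REQUIRED_FIELDS
        if not form.get(name, "").strip()
    ]
```

The form handler would reject the submission whenever this returns a non-empty list, and could show the returned names next to the matching inputs.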
Open questions:
- Will these requested changes require a complete rewrite, or small-scale fixes?
- Where should these tools live — on the existing private server, or ToolLabs?
- Should we merge the sub-tool with the main tool, or keep them separate?
Deliverables:
- Written answers to open questions in this ticket
- Written proposal for how to implement requested changes to the two tools
- Documented knowledge learned to help with further development of this project
- Any needed additional cards created
In a CommTech estimation meeting on March 7, 2017, this card was sized as a '5'.
Investigation results:
The tool is currently a mixture of several different tools that don't necessarily have much in common:
- DMCA takedown tool - A form for submitting information about the requested DMCA takedown.
- Child protection takedown tool - A form for submitting information about the requested child-protection takedown.
- Global search, Global link search, Global text search - Standalone tools to search across all of WMF wikis.
- Strategy tools - Also standalone tools for strategic decision making. Very infrequently used.
The first two tools are essentially forms with some mandatory and some optional fields. There is an option to fetch data from CentralAuth for the file author(s) as well as for the person filling out the form, to auto-populate some of the fields. After the form is submitted, the tool runs a series of checks to make sure the data won't be rejected at the database end, and then sends it to the database. There is a smaller tool to retrieve a particular entry but, from my understanding, it is also infrequently used.
The global search tools let you search for a string/link across all wikis deployed on the Wikimedia cluster. They should be made public for anybody to use. James mentioned that there was no particular reason that they were a part of the same tool besides lack of developer resources at the time of writing them.
The strategy tools contain private data and hence shouldn't be made public. James expressed interest in keeping them bundled with the DMCA and CP takedown tools for now.
To answer the open questions...
Will these requested changes require a complete rewrite, or small-scale fixes?
Judging from the codebase, the requested changes are not small-scale. The code is more complex than it needs to be and could use a fair bit more structure (classes etc.). There is a lot of code duplication in places, and the way it handles data is unnecessarily complicated: it gathers data from the form, sends it to the database, retrieves it back, and then uses it to structure wikitext for posting on Village pumps etc. It doesn't use Composer or any external packages as far as I can see. I feel that if we attempt to patch up the current code, it will only lead to more complexity and more time spent trying to understand the current code. I propose doing a rewrite, borrowing code from the current codebase as and when needed.
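To illustrate the simpler data flow a rewrite could aim for, here is a minimal sketch in Python (used only for illustration; it is not the language of the existing tool). The submitted form is turned into one in-memory record, and the wikitext is derived directly from it rather than from a database round trip. The `Takedown` class, its fields, and the notice wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Takedown:
    """Hypothetical in-memory record built once from the submitted form."""
    file_title: str
    uploader: str
    lumen_url: str


def talk_page_title(t: Takedown) -> str:
    """Title of the uploader's user talk page."""
    return f"User talk:{t.uploader}"


def user_talk_notice(t: Takedown) -> str:
    """Format a talk-page notice straight from the record.

    The wording is placeholder text, not the tool's real template.
    """
    return (
        f"== Notice of removal: [[:{t.file_title}]] ==\n"
        f"The file was removed following a DMCA takedown request. "
        f"The notice is archived at {t.lumen_url}. ~~~~"
    )
```

The same record could feed the Lumen submission, the CRM case and every on-wiki post, so each formatter stays a small function that is easy to test in isolation.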
Where should these tools live — on the existing private server, or ToolLabs?
Due to the sensitive nature of some of the data being handled, we will probably need to leave it on the existing private server.
Should we merge the sub-tool with the main tool, or keep them separate?
I don't see a reason to have two separate tools for posting on wiki and not posting. I would suggest merging the two into one tool, with a checkbox at the end of the form.
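A minimal sketch of what the merged flow could look like, in Python for illustration only. The `post_on_wiki` form key and the two callables are assumptions that stand in for the real Lumen client and the on-wiki posting code.

```python
from typing import Callable, List


def handle_submission(
    form: dict,
    report_to_lumen: Callable[[dict], str],
    post_on_wiki: Callable[[dict, str], None],
) -> List[str]:
    """Run one submission and return the list of steps performed.

    Lumen is always notified; the on-wiki posts only happen when the
    (hypothetical) "post_on_wiki" checkbox was ticked.
    """
    steps = []
    lumen_url = report_to_lumen(form)
    steps.append("lumen")
    if form.get("post_on_wiki", False):
        post_on_wiki(form, lumen_url)
        steps.append("wiki")
    return steps
```

With the checkbox unticked this behaves like the current sub-tool; with it ticked, like the main tool. Whether the CRM case should follow the same checkbox is left open here.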
Next steps:
Make a new tool for the DMCA and CP takedown forms:
- The forms should replicate the functionality of the existing tool.
- Merge the two tools (one which posts on wiki and one which doesn't) into one (additional form checkbox).
- Pull in data from wikis wherever possible (such as information about a Commons file or its uploader)
- Validate data wherever possible
- Do away with local accounts and use OAuth (make sure to only allow access to specific staff accounts etc.)
- Log actions by users
- Retrieval function for a given case ID (from the database)
- Copy over functionality for strategy tools (if needed?)
- Host on private server
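For the OAuth and logging items above, a minimal Python sketch of an allowlist check combined with an audit log entry. It assumes the username has already been verified through the MediaWiki OAuth handshake; the logger name, the allowlist contents and the action label are hypothetical.

```python
import logging
from typing import Set

# Hypothetical logger name; handlers and formatting are configured elsewhere.
audit_log = logging.getLogger("takedown.audit")

# Hypothetical allowlist; in practice this would live in private configuration.
ALLOWED_USERS: Set[str] = {"ExampleStaff (WMF)"}


def authorize(username: str, action: str) -> bool:
    """Check an OAuth-verified username against the allowlist.

    Every attempt is written to the audit log, whether allowed or not.
    """
    allowed = username in ALLOWED_USERS
    audit_log.info("user=%r action=%r allowed=%s", username, action, allowed)
    return allowed
```

Logging denied attempts as well as successful ones gives the full record of who tried to do what, which local accounts currently do not provide.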
And also, everything in Requested changes in the task description:
- Right now, the tool only tracks the files that are taken down. We also want to track the requests that don't lead to a takedown, so there should be a full log. There should likely be an option added for request granted (image removed) or denied (image kept.)
- The 'Project' dropdown is currently hardcoded. Can it be expanded, or given an 'Other' option that opens a textbox (with validation)?
- Some of the fields should not be mandatory:
  - Anything required by the Lumen API should be required.
  - If the request was not complied with, then nothing should be required.
- This should write to SalesForce instead of Sugar. The exact same data should be stored.
- Potentially add multi-file support. (CP takedown already has this, DMCA very rarely needs it)
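One possible shape for the 'Other' project option with validation, as a Python sketch. The contents of `KNOWN_PROJECTS` and the rule that a free-text project must be a hostname under one of a few Wikimedia domains are both assumptions; the real list and rule would need to be agreed with Support and Safety.

```python
import re

# Hypothetical stand-in for the existing hardcoded dropdown values.
KNOWN_PROJECTS = {"commons.wikimedia.org", "en.wikipedia.org"}

# Assumed rule: free text must be a hostname under one of these domains.
OTHER_PROJECT_RE = re.compile(
    r"[a-z0-9-]+(\.[a-z0-9-]+)*\.(wikipedia|wikimedia|wiktionary|wikisource)\.org"
)


def resolve_project(choice: str, other_text: str = "") -> str:
    """Return the project hostname for a dropdown choice.

    Raises ValueError for an unknown choice or an invalid 'Other' value.
    """
    if choice != "Other":
        if choice not in KNOWN_PROJECTS:
            raise ValueError(f"unknown project: {choice!r}")
        return choice
    candidate = other_text.strip().lower()
    if not OTHER_PROJECT_RE.fullmatch(candidate):
        raise ValueError(f"not a recognised project hostname: {other_text!r}")
    return candidate
```

For example, `resolve_project("Other", " FR.Wikisource.org ")` returns `"fr.wikisource.org"`, while `resolve_project("Other", "example.com")` raises `ValueError`.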