Plagiabot provides an API for testing articles with the Turnitin engine (see https://en.wikipedia.org/wiki/Wikipedia:Turnitin). Here is an example of the API output for a specific article: http://tools.wmflabs.org/eranbot/plagiabot/api.py?action=suspected_diffs&page_title=Rajesh_Khanna&report=1. (It returns an array of 1 or more potential violations.)
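For reference, here is a minimal Python sketch of how a client might build and call that query. The endpoint and the `action`, `page_title` and `report` parameters are taken from the example URL above; the helper names are hypothetical, and the assumption that the response body is JSON is not confirmed by this task.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example URL in this task.
API_BASE = "http://tools.wmflabs.org/eranbot/plagiabot/api.py"


def build_suspected_diffs_url(page_title, report=True):
    """Build the suspected_diffs query URL for one article title."""
    params = {"action": "suspected_diffs", "page_title": page_title}
    if report:
        params["report"] = 1
    return API_BASE + "?" + urlencode(params)


def fetch_suspected_diffs(page_title, timeout=10):
    """Fetch the potential violations for an article.

    Assumption: the API returns a JSON document; adjust the parsing
    if the real output format differs.
    """
    url = build_suspected_diffs_url(page_title)
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `build_suspected_diffs_url("Rajesh_Khanna")` reproduces the example URL above.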
It would be great if the Copyvio Detector tool (https://tools.wmflabs.org/copyvios/) had the option of using Turnitin as well as Yahoo BOSS for detecting possible copyright violations.
Acceptance criteria:
* In the "Copyvio search" options, add a new option for "Use Turnitin" (off by default for now)
* If "Use Turnitin" is checked, add an extra box to the output (between the generation-time div and the cv-result div) that shows the results from the Plagiabot query.
* If there are no matches from the Plagiabot query, the div should use class=green-box and say something like "Turnitin found no matching sources."
* If there are matches from the Plagiabot query, the div should use class=red-box and say something like "Turnitin found sources that may have been plagiarized. Please review them." It should then include output similar to the Source column at https://en.wikipedia.org/wiki/User:EranBot/Copyright/2#Added (but in a nicer format).
* Do not feed the results from Plagiabot into the Copyvio Detector's list of sources to check.
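The Turnitin results box from the criteria above could be rendered roughly as follows. This is only a sketch: `render_turnitin_box` is a hypothetical helper, it assumes the Plagiabot matches have already been reduced to a plain list of source URLs, and the markup inside the div is a placeholder rather than the Copyvio Detector's actual template.

```python
from html import escape

NO_MATCH_TEXT = "Turnitin found no matching sources."
MATCH_TEXT = (
    "Turnitin found sources that may have been plagiarized. "
    "Please review them."
)


def render_turnitin_box(source_urls):
    """Return the HTML for the Turnitin results box.

    source_urls: list of source URL strings from the Plagiabot query
    (assumed shape; the real API output may need unpacking first).
    """
    if not source_urls:
        # No matches: green box with the "nothing found" message.
        return '<div class="green-box">' + NO_MATCH_TEXT + "</div>"

    # Matches: red box with the warning and one link per source.
    items = "".join(
        '<li><a href="{0}">{0}</a></li>'.format(escape(url, quote=True))
        for url in source_urls
    )
    return '<div class="red-box">' + MATCH_TEXT + "<ul>" + items + "</ul></div>"
```

The URLs are only displayed here; nothing is passed on to the Copyvio Detector's own list of sources to check.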