Create an output API for Earwig's Copyvio Detector Tool
Closed, Resolved · Public · 8 Estimated Story Points

Assigned To

Authored By

	kaldari
	Apr 18 2016, 6:18 PM

Description

We want to be able to dynamically retrieve the comparison from Earwig's Copyvio Detector Tool from the new Tool Labs interface for EranBot, but this requires an output API on Earwig's side. The API should accept a page title and a single URL and return all the info needed to construct a comparison showing the matching text between the WP article and the possible plagiarism source.

For now, the API should probably just return HTML for the diff, rather than trying to come up with some sort of JSON abstraction.

Related Objects

Status	Assigned	Task
Resolved	None	T116957 Plagiarism detection tools for text (tracking)
Resolved	• TBolliger	T120435 Improve the plagiarism detection bot
Resolved	• TBolliger	T131583 Epic: Make a tool labs interface for Plagiabot aka Eranbot
Resolved	Niharika	T132832 Show the comparison from Earwig's detector on the CopyPatrol interface
Resolved	Earwig	T132949 Create an output API for Earwig's Copyvio Detector Tool

Event Timeline

kaldari created this task. Apr 18 2016, 6:18 PM

• DannyH updated the task description. Apr 18 2016, 6:21 PM

• DannyH moved this task from New & TBD Tickets to Up Next (June 3-21) on the Community-Tech board.

• DannyH set the point value for this task to 8.

So essentially http://tools.wmflabs.org/copyvios/api, but with a solution for caveat #1?

Caveat #1: "There is currently no way to get the contents of the article or suspected source, nor can you get the data behind the visual comparison available from the main tool. This may be changed in a future version if there is sufficient demand for it."

Niharika edited projects, added Community-Tech-Sprint; removed Community-Tech. Apr 29 2016, 10:53 AM

@Earwig, do you have any thoughts on how to go about implementing this?

Probably not too crazy, but it depends on the way you want the results presented.

Either way, it's kind of hard to think about this until T125459 is dealt with...

In our sprint meeting today, Niharika asked whether we should try using a third-party library to generate the comparisons, instead of building an API on Earwig's tool.

She's investigating:
https://packagist.org/packages/adaptive/php-text-difference

• DannyH added a project: CopyPatrol. May 18 2016, 8:54 PM

Restricted Application added a subscriber: JEumerus. May 18 2016, 8:54 PM

• DannyH edited projects, added Community-Tech; removed Community-Tech-Sprint. May 24 2016, 5:33 PM

Niharika mentioned this in T136259: Onboard new hire for Community Tech. Jun 1 2016, 3:50 AM

• DannyH edited projects, added Community-Tech-Sprint; removed Community-Tech. Jun 7 2016, 5:36 PM

@Earwig: Now that the API stuff is resolved, any more thoughts on this? Is this something that you might be interested in working on or would it be better for us to work on it (with your input)?

kaldari updated the task description. Jun 7 2016, 5:42 PM

I can do the implementation, but it would be helpful to get some suggestions for the output format.

Basically we want the API to return all the HTML that is currently in the 2 cv-chain-detail divs. One should be marked as the article (in the API data scheme) and the other should be marked as the source. We can then reproduce the CSS on our end to style the HTML. You don't need to worry about abstracting the output content (other than splitting it into article and source). Let's keep it simple.

This should work now. Simply pass detail=true when using action=compare.

Example:

https://tools.wmflabs.org/copyvios/api.json?version=1&action=compare&detail=true&project=wikipedia&lang=en&title=User:EarwigBot/Copyvios/Tests/2&url=https://www.whitehouse.gov/administration/president-obama&format=jsonfm
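A minimal sketch of how a client might assemble the same request, assuming only the query parameters visible in the example URL above. The function name `build_compare_url` and its defaults are hypothetical, and the shape of the response is not described in this thread, so the sketch stops at building the URL:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the example URL in this thread.
API_ENDPOINT = "https://tools.wmflabs.org/copyvios/api.json"


def build_compare_url(title, source_url, project="wikipedia", lang="en", fmt="jsonfm"):
    """Build an action=compare request URL with detail=true (hypothetical helper)."""
    params = {
        "version": 1,
        "action": "compare",
        "detail": "true",   # the flag described in the comment above
        "project": project,
        "lang": lang,
        "title": title,
        "url": source_url,
        "format": fmt,      # value copied from the example URL
    }
    # urlencode percent-encodes ':' and '/', so the string differs from the
    # hand-written example but decodes to the same parameters.
    return API_ENDPOINT + "?" + urlencode(params)


example = build_compare_url(
    "User:EarwigBot/Copyvios/Tests/2",
    "https://www.whitehouse.gov/administration/president-obama",
)
# Round-trip check: decode the query string back into its parameters.
query = parse_qs(urlsplit(example).query)
print(example)
```

Percent-encoding the `title` and `url` values keeps the source URL from being confused with the API's own query string if it ever contains `&` or `?`.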

kaldari moved this task from Ready to Q1 2018-19 on the Community-Tech-Sprint board. Jun 9 2016, 10:09 PM

In T132949#2369968, @Earwig wrote:

This should work now. Simply pass detail=true when using action=compare.

Example:

https://tools.wmflabs.org/copyvios/api.json?version=1&action=compare&detail=true&project=wikipedia&lang=en&title=User:EarwigBot/Copyvios/Tests/2&url=https://www.whitehouse.gov/administration/president-obama&format=jsonfm

Wow. Thank you @Earwig! That was swift.

@Earwig do you think it's worthwhile to add CORS support? We can get by using a different browser than our normal one and disabling web security, but obviously not ideal :)
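For illustration, CORS support on the server side usually amounts to sending an `Access-Control-Allow-Origin` header with each response. The sketch below assumes the tool is served as a Python WSGI application, which this thread does not confirm; `add_cors` and `demo_app` are hypothetical names, not part of the Copyvio Detector:

```python
def add_cors(app, allowed_origin="*"):
    """Wrap a WSGI app so every response carries Access-Control-Allow-Origin."""
    def wrapped(environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Append the CORS header to whatever the wrapped app already set.
            headers = list(headers) + [("Access-Control-Allow-Origin", allowed_origin)]
            return start_response(status, headers, exc_info)
        return app(environ, cors_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in application that returns an empty JSON object."""
    start_response("200 OK", [("Content-Type", "application/json")])
    return [b"{}"]


# Call the wrapped app with a fake start_response to inspect the headers.
captured = {}

def fake_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = headers

body = add_cors(demo_app)({}, fake_start_response)
print(captured["headers"])
```

With such a header in place, a browser-based client on another origin could read the API response without disabling web security.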

• DannyH edited projects, added Community-Tech; removed Community-Tech-Sprint. Jun 20 2016, 8:14 PM

• DannyH moved this task from Up Next (June 3-21) to Archive on the Community-Tech board. Jun 20 2016, 8:19 PM

MusikAnimal moved this task from Backlog to Done on the CopyPatrol board. Dec 6 2016, 5:25 AM
