CopyPatrol not showing the highlighted text
Closed, Invalid · Public

Description

@Niharika or @eranroz -- I'm not sure which of you this would be for. I saw the following issue on a CopyPatrol report, but I don't see it consistently. It might be something to look into.

At this link, I open up a page that is flagged as a violation, but none of the text in the right-hand box is highlighted as matching the left-hand box. None of the text is highlighted in either box.

When you look at the whole page, though, violating text is definitely present. See the Earwig report, for instance.

Event Timeline

Restricted Application added a project: Community-Tech. · Oct 17 2018, 10:31 PM
Restricted Application added a subscriber: Aklapper.
MMiller_WMF updated the task description. · Oct 17 2018, 10:32 PM

@MMiller_WMF The Earwig report you linked to uses Google to search for violations. However, doing the same search with Turnitin doesn't find anything.

What I think happened with CopyPatrol there is that Turnitin reported a copyright violation against http://www.balseracommunications.com/david-duckenfield.php, but Turnitin has an old version of that page in its database and the content has changed in the meantime. The live page no longer contains the matching text, so there's nothing to highlight.
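To illustrate why a stale copy leads to no highlights, here is a minimal sketch, assuming highlighting works by finding word sequences shared between the article text and the fetched source page. This is not CopyPatrol's or Turnitin's actual matching code; the function name `shared_ngrams` and the sample texts are hypothetical.

```python
def shared_ngrams(left, right, n=3):
    """Return the set of word n-grams (as tuples) present in both texts."""
    def ngrams(text):
        words = text.lower().split()
        # Empty set when the text has fewer than n words.
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    return ngrams(left) & ngrams(right)


# Hypothetical texts for illustration only.
article = "the quick brown fox jumps over the lazy dog"
old_source = "a quick brown fox jumps over a sleeping cat"  # version the checker indexed
new_source = "this page has been rewritten with entirely different wording"  # live page

print(len(shared_ngrams(article, old_source)))  # overlaps exist, so a match is reported
print(len(shared_ngrams(article, new_source)))  # no overlaps, so nothing to highlight
```

In this sketch the violation is flagged against the indexed version, while highlighting runs against the live page, which is how a flagged report can show no highlighted text at all.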

I'm actually not quite sure what's going on with Turnitin there because if you go to the iThenticate report from the CopyPatrol interface, it does give you some more links which contain that text.

This is quite common and probably due to processing on our side:

@eranroz Got it. Do you know why Earwig isn't able to find anything using Turnitin though?

Niharika closed this task as Invalid. · Oct 23 2018, 9:37 PM

This is a Turnitin/Earwig problem per my first comment.

Niharika moved this task from Backlog to Done on the CopyPatrol board. · Nov 21 2018, 11:41 PM