Jan 21 2016

Earwig added a comment to T124225: PageImages should never return non-free images.

If a file has been uploaded to Commons, it's free (otherwise it would have been deleted already); if a file has been uploaded to a Wikipedia, it's non-free (otherwise it would have been moved to Commons already).
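
As a rough illustration of the rule of thumb in this comment, here is a minimal sketch. The function name is hypothetical, and the "shared"/"local" strings are an assumption about how the caller learns where a file is hosted (they mirror the repository values the MediaWiki API is understood to report); this is not the PageImages implementation.

```python
def is_presumed_free(image_repository):
    """Apply the heuristic: Commons-hosted files are presumed free,
    locally hosted files are presumed non-free.

    image_repository is assumed to be "shared" for a file hosted on
    Commons and "local" for a file uploaded to the wiki itself.
    """
    if image_repository == "shared":
        return True
    if image_repository == "local":
        return False
    # Anything else (e.g. a missing file) cannot be classified by this rule.
    raise ValueError("unknown repository: %r" % (image_repository,))
```

The heuristic only encodes a presumption; a locally hosted file may still be free but not yet moved.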

Jan 21 2016, 6:46 AM · MW-1.27-release (WMF-deploy-2016-03-08_(1.27.0-wmf.16)), Reading-Web-Sprint-68-"Java and JavaScript are basically the same", Reading-Admin, Readers-Community-Engagement, Wikipedia-iOS-App-Backlog, Wikipedia-Android-App-Backlog, Patch-For-Review, WMF-Legal, PageImages

Jan 20 2016

Earwig added a comment to T110144: Integrate Turnitin (as used in Plagiabot) into Copyvio Detector tool [AOI].

This is... done, I think. I want to hack on the visual output further, but it works.

Jan 20 2016, 8:57 AM · Community-Tech, CopyPatrol

Dec 19 2015

Earwig updated the title for P2442 {{cite pmid/*}} from untitled to {{cite pmid/*}}.
Dec 19 2015, 7:32 AM

Nov 11 2015

Earwig added a comment to T110144: Integrate Turnitin (as used in Plagiabot) into Copyvio Detector tool [AOI].

Okay, so Coren's been the point of contact in the past between me and the WMF with regard to managing the Yahoo! BOSS API keys that are necessary to use that service. As far as I know, he still has that role. I was suggesting that he could create a new key for Fhocutt for developing/testing this new feature (since sharing keys doesn't sound like a good idea, although we could do that too, I guess).

Nov 11 2015, 12:30 AM · Community-Tech, CopyPatrol

Nov 5 2015

Earwig added a comment to T110144: Integrate Turnitin (as used in Plagiabot) into Copyvio Detector tool [AOI].

@kaldari Still useful to test how the results look when combined with the regular BOSS hits, I guess?

Nov 5 2015, 1:40 AM · Community-Tech, CopyPatrol

Nov 4 2015

Earwig added a comment to T110144: Integrate Turnitin (as used in Plagiabot) into Copyvio Detector tool [AOI].

Sorry Coren, I didn't really mean to add you as a subscriber...!

Nov 4 2015, 6:43 AM · Community-Tech, CopyPatrol
Earwig updated subscribers of T110144: Integrate Turnitin (as used in Plagiabot) into Copyvio Detector tool [AOI].

Hmm... I guess you can ask @coren for a BOSS key for testing? Alternatively, disable part of EarwigBot: in earwigbot/wiki/copyvios/__init__.py, comment out line 116 and change 133 to if True:. That should make it just report "no match" for everything. I might add a more graceful fallback in the future.

Nov 4 2015, 6:42 AM · Community-Tech, CopyPatrol

Nov 3 2015

Earwig added a comment to T110144: Integrate Turnitin (as used in Plagiabot) into Copyvio Detector tool [AOI].

You probably didn't put it in the "wiki" section.

Nov 3 2015, 6:52 AM · Community-Tech, CopyPatrol
Earwig added a comment to T110144: Integrate Turnitin (as used in Plagiabot) into Copyvio Detector tool [AOI].

To be honest, I'm struggling with free time right now. I'm not sure of the best way for you to approach this.

Nov 3 2015, 3:49 AM · Community-Tech, CopyPatrol

Sep 27 2015

Earwig closed T110778: [AOI] Create a test suite for Copyvio Detector as Resolved.
Sep 27 2015, 9:02 AM · Community-Tech
Earwig added a comment to T110778: [AOI] Create a test suite for Copyvio Detector.

All done now.

Sep 27 2015, 9:02 AM · Community-Tech
Earwig added a comment to T110144: Integrate Turnitin (as used in Plagiabot) into Copyvio Detector tool [AOI].

Sounds fine. I'm not sure about putting the Turnitin results above the main result summary, but that's a nitpick.

Sep 27 2015, 5:22 AM · Community-Tech, CopyPatrol

Sep 26 2015

Earwig added a comment to T110778: [AOI] Create a test suite for Copyvio Detector.

Oh, good point on that last one. I can definitely use posts from my own blog. Will try that.

Sep 26 2015, 6:59 PM · Community-Tech
Earwig added a comment to T110778: [AOI] Create a test suite for Copyvio Detector.

Now at https://en.wikipedia.org/wiki/User:EarwigBot/Copyvios/Tests. Did some cleanup and added a few new tests.

Sep 26 2015, 6:56 AM · Community-Tech

Sep 22 2015

Earwig added a comment to T110778: [AOI] Create a test suite for Copyvio Detector.

This is very useful. Thanks!

Sep 22 2015, 6:39 AM · Community-Tech

Sep 21 2015

Earwig added a comment to T110778: [AOI] Create a test suite for Copyvio Detector.

I will likely work on this on my own over the next couple of weeks. It'll be useful for other improvements that I plan to make to the comparison engine.

Sep 21 2015, 5:20 PM · Community-Tech

Sep 17 2015

Earwig added a comment to T112881: Create Cyberbot Project on Labs.

I don't understand. What kind of work are you doing that requires so much memory?

Sep 17 2015, 11:20 PM · The-Wikipedia-Library, Cloud-Services, VPS-Projects

Sep 11 2015

Earwig added a comment to T112227: Incorrect {{DEFAULTSORT}} additions.

For https://en.wikipedia.org/w/index.php?title=Clinoch_of_Alt_Clut&diff=prev&oldid=680314774, I believe the correct parsing is "Clinoch of Alt Clut" rather than "Clut, Clinoch of Alt", per WP:PEER. This is strange to me because WP:AWB/GF indicates it should be doing this already. For the second page, I think "Byzantine Master of the Crucifix of Pisa" without any modification is correct.

Sep 11 2015, 6:44 AM · AutoWikiBrowser, Essential-Work

Aug 25 2015

Earwig added a comment to T108422: [AOI] Investigation: Can we improve Copyvio Detector?.

Yes, this is a good idea. I already use https://en.wikipedia.org/wiki/User:The_Earwig/Sandbox/CopyvioExample and https://en.wikipedia.org/wiki/User:The_Earwig/Sandbox/CopyvioPDFExample as basic sanity checks, but a more comprehensive suite would be much better.

Aug 25 2015, 3:49 AM · Community-Tech

Aug 22 2015

Earwig added a comment to T108422: [AOI] Investigation: Can we improve Copyvio Detector?.

It is custom-written. You are right that the particular result there is poor; my first thought is to work on the confidence algorithm a bit to value large contiguous blocks more than lots of disjoint trigrams. For quotes, I'm not so sure; if that issue was fixed I think it might not be so important. I can look into that.
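
The idea of valuing large contiguous blocks over scattered trigram matches could be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool's actual confidence algorithm: the function names, the word-level tokenisation, and the `alpha` exponent are all hypothetical, and a "run" here means consecutive article trigrams that each appear anywhere in the source.

```python
import re


def trigrams(text):
    """Return the ordered list of word trigrams in text."""
    words = re.findall(r"\w+", text.lower())
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def matched_runs(article, source):
    """Return lengths of runs of consecutive article trigrams found in source."""
    source_set = set(trigrams(source))
    runs, current = [], 0
    for gram in trigrams(article):
        if gram in source_set:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def confidence(article, source, alpha=1.5):
    """Score in [0, 1]; alpha > 1 rewards long runs over many short ones."""
    total = len(trigrams(article))
    if total == 0:
        return 0.0
    runs = matched_runs(article, source)
    # One run covering every trigram scores 1.0; the same number of
    # matches split into separate runs scores strictly less.
    return sum(r ** alpha for r in runs) / total ** alpha
```

With `alpha` set to 1 this reduces to the plain fraction of matched trigrams, so the exponent is the only knob that shifts weight toward contiguous blocks.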

Aug 22 2015, 12:29 AM · Community-Tech

Aug 19 2015

Earwig added a comment to T108422: [AOI] Investigation: Can we improve Copyvio Detector?.

Regarding l10n, the tool works fine for non-English content from a technical perspective (logs show many successful requests involving Korean and other wikis; people have added German and Russian mirrors...).

Aug 19 2015, 12:33 PM · Community-Tech

Aug 15 2015

Earwig added a comment to T108422: [AOI] Investigation: Can we improve Copyvio Detector?.

There is only one outstanding bug with the tool that comes to mind. I have a memory leak that I've been unable to get to the bottom of for about a year now. It happens so slowly and unpredictably that progress on it is difficult, especially given the lack of urgency and questions about why Python's internal memory management isn't working. I could probably fix it if I devoted enough time to extra debugging.

Aug 15 2015, 2:54 AM · Community-Tech

Jul 30 2015

Earwig added a comment to T106763: Mandatory dependency on mwparserfromhell.

Okay! I released mwparserfromhell 0.4.1 (and 0.4.2, because I made a mistake...) just an hour ago, which fixes the Python 3.5 issue. I also have Windows binary releases working properly thanks to Appveyor.

Jul 30 2015, 7:27 AM · Pywikibot-textlib.py, Pywikibot

Jul 24 2015

Earwig added a comment to T106763: Mandatory dependency on mwparserfromhell.

Sorry, I forgot I had an unreleased fix for the Python 3.5 issue that you've been waiting on. I'm back to working on the parser after a little break so it should come soon.

Jul 24 2015, 2:38 PM · Pywikibot-textlib.py, Pywikibot

Jun 4 2015

Earwig created T101437: Raise memory limit for copyvios web tool.
Jun 4 2015, 8:30 PM · Toolforge

May 24 2015

Earwig added a comment to T68010: Windows users get error: Unable to find vcvarsall.bat when using setup.py due to mwparserfromhell.

mwparserfromhell 0.4.1 should come soon (in the next few days; I just need to work out the Windows binary situation). I'll add 3.5 support along with it, so we shouldn't have any problems with this.

May 24 2015, 8:41 AM · Pywikibot

Mar 30 2015

Earwig merged task T94501: bigbrother doesn't stop into T94500: bigbrother doesn't stop.
Mar 30 2015, 10:58 PM · Toolforge
Earwig merged T94501: bigbrother doesn't stop into T94500: bigbrother doesn't stop.
Mar 30 2015, 10:58 PM · Patch-For-Review, Cloud-Services, Toolforge
Earwig created T94501: bigbrother doesn't stop.
Mar 30 2015, 10:56 PM · Toolforge
Earwig created T94500: bigbrother doesn't stop.
Mar 30 2015, 10:56 PM · Patch-For-Review, Cloud-Services, Toolforge
Earwig triaged T94496: bigbrother doesn't know how to manage uwsgi-python webservers and other new webservice2 functionality as Medium priority.
Mar 30 2015, 10:55 PM · Cloud-Services, Patch-For-Review, Toolforge
Earwig created T94496: bigbrother doesn't know how to manage uwsgi-python webservers and other new webservice2 functionality.
Mar 30 2015, 10:28 PM · Cloud-Services, Patch-For-Review, Toolforge

Mar 9 2015

Earwig closed T74226: Missing page revisions on enwiki, a subtask of T50930: Database replication problems - production and labs (tracking), as Resolved.
Mar 9 2015, 11:50 PM · SRE, DBA, Cloud-Services, Tracking-Neverending
Earwig closed T74226: Missing page revisions on enwiki as Resolved.

Interesting. It seems like the problem has been resolved, although 16 pages still result from the above query. However, that looks like accurate replication of a corrupted database rather than the other way around, so I'm deferring to T92046.

Mar 9 2015, 11:50 PM · Cloud-Services, Toolforge
Earwig added a comment to T92046: English Wikipedia user talk page shown as blue link but displays: revision #0 does not exist.

There appear to be sixteen affected pages in total, given by the following query. Note that NS 3 is User_talk and NS 4 is Wikipedia.

Mar 9 2015, 11:37 PM · WMF-General-or-Unknown

Jan 9 2015

Earwig added a comment to T78378: Depend on mwparserfromhell consistently across operating systems.

I thought that issue was caused by an incorrectly set-up Windows build environment, which @valhallasw and I resolved by distributing binaries with releases? I'm not clear on this since I haven't looked in a while, but it seems the underlying problem is https://github.com/earwig/mwparserfromhell/issues/78 blocking py3.3 and py3.4 builds. Would fixing that and releasing v0.4 along with the necessary binaries be sufficient to fix this?

Jan 9 2015, 9:10 AM · Pywikibot