
Compare HTTP responses for NO RESPONSE FROM SERVER with external API
Closed, Resolved (Public)

Description

Some websites block Labs servers. Use an external API to verify "NO RESPONSE FROM SERVER" is truly a "Response timed out" and not "Request blocked".
Concept:
"Suppose isup.me reports the site as up and IABot reports it as down (server blockage). You see the responses don't match, so you don't do anything on this page."
In this concept, isup.me can be replaced by any other external (non-Labs) API.
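The comparison described above can be sketched as a small decision rule. This is only an illustration, not IABot's actual code: the names `Verdict` and `reconcile` are hypothetical, and the two boolean inputs stand in for the results of the bot's own check and of whichever external checker is used.

```python
from enum import Enum


class Verdict(Enum):
    ALIVE = "alive"
    DEAD = "dead"
    UNDECIDED = "undecided"  # responses don't match: take no action


def reconcile(local_up: bool, external_up: bool) -> Verdict:
    """Combine the bot's own result with an external checker's result."""
    if local_up:
        # The bot reached the site itself, so no second opinion is needed.
        return Verdict.ALIVE
    if external_up:
        # The bot got no response but the external checker did:
        # probably a block on the bot's servers, so leave the page alone.
        return Verdict.UNDECIDED
    # Both checks failed, so the timeout is likely genuine.
    return Verdict.DEAD
```

With this rule, a link is only treated as dead when both checks agree that the site does not respond.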

Event Timeline

Implementing this would require additional servers, and they must be outside of the Labs infrastructure.

The advantage, however, would be instant automatic detection of false positives; as a result, the bot's false positive rate would drop to virtually 0.

Cyberpower678 triaged this task as Medium priority. Aug 8 2017, 3:35 PM
jeblad added a subscriber: jeblad. Aug 10 2017, 2:36 PM

The service IsUp can't be used for national services that block foreign requests.

Also, some national services will only give metadata about an object for foreign requests.
See, for example, http://nb.no

Cyberpower678 added a comment. Edited Aug 10 2017, 2:38 PM

> The service IsUp can't be used for national services that block foreign requests.
> Also, some national services will only give metadata about an object for foreign requests.
> See, for example, http://nb.no

I'm not planning on using isUP. I'm planning on using external servers, with IABot's checking code, located in different locations. If possible, I will try to make use of a VPN in the process.

BTW, your URL loads just fine in America.
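As a rough sketch of the multi-location idea in this comment: a URL would only count as dead if no checking location can reach it. The function name, the location labels, and the stub probes below are all hypothetical; real probes would run IABot's checking code on the external servers, which is not shown here.

```python
from typing import Callable, Dict

# A probe takes a URL and returns True if the site answered from that location.
Probe = Callable[[str], bool]


def dead_from_all_locations(url: str, probes: Dict[str, Probe]) -> bool:
    """Return True only if no location could reach the URL."""
    if not probes:
        raise ValueError("at least one probe location is required")
    return not any(probe(url) for probe in probes.values())


# Stub probes standing in for real network checks (hypothetical locations).
probes = {
    "labs": lambda url: False,     # blocked from the Labs servers
    "us-east": lambda url: True,   # loads fine from elsewhere
}
print(dead_from_all_locations("http://nb.no", probes))  # prints False
```

Requiring every location to fail is the most conservative choice; it trades some missed dead links for fewer false positives.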

jeblad added a comment. Edited Aug 10 2017, 7:28 PM

It is a site much like Google Books covering published Norwegian books for the last four centuries. This one should be a link to a book where you only get metadata; Ibsen. If you do get the scanned book, then something is broken…

This link is the permanent link. During initial lookup the domain "nb.no" is used, and due to sloppy editing it has made its way into Wikipedia.

> It is a site much like Google Books covering published Norwegian books for the last four centuries. This one should be a link to a book where you only get metadata; Ibsen. If you do get the scanned book, then something is broken…
> This link is the permanent link. During initial lookup the domain "nb.no" is used, and due to sloppy editing it has made its way into Wikipedia.

Well it returned a 200 OK response with my algorithm.

This implementation is ready to be tested in v1.5beta2. If successful, false positives should drop off significantly.

I cannot wait to test this.

I am shutting the bot down. A maintenance script must be run on the existing DB values for this to work. The shutdown will also provide an opportunity to test the new implementation.

Bot is shut down on all wikis.

Pinging @Green_Cardamom for this information.

Good timing, as I just completed IMP yesterday (verified 4 million URLs in 2 months) and am now back to running WaybackMedic on enwiki for the moment.

Cyberpower678 closed this task as Resolved. Aug 18 2017, 9:44 AM

Implemented in v1.5beta2