Support anonymous FTP access in CheckIfDead class (Cyberbot)
Closed, Resolved (Public)

Description

Per T136728, sometimes FTP sites require you to log in as a guest in order to access the content (See RFC 1635: How to Use Anonymous FTP). Currently, CheckIfDead::checkDeadlinks() reports these pages as dead even though they aren't. For example, ftp://ftp.aip.org/epaps/phys_rev_lett/E-PRLTAO-98-047705/.

In order to access these pages, you have to append something like "--user anonymous:anonymous@domain.com" to the curl request, where the user is "anonymous" and the password is any email address.

Example:

$ curl ftp://ftp.aip.org/epaps/phys_rev_lett/E-PRLTAO-98-047705/
curl: (67) Access denied: 530
$ curl ftp://ftp.aip.org/epaps/phys_rev_lett/E-PRLTAO-98-047705/ --user anonymous:anonymous@domain.com
total 20760
-rw-r--r-- 1 20    1065 Jan 29  2007 README.TXT
-rw-r--r-- 1 20 8361184 Jan  9  2007 Supplementary_Video_1.avi
-rw-r--r-- 1 20 1819942 Jan  9  2007 Supplementary_Video_2.avi
-rw-r--r-- 1 20 3436606 Jan  9  2007 Supplementary_Video_3.avi
-rw-r--r-- 1 20 2239436 Jan  9  2007 Supplementary_Video_4_.avi
-rw-r--r-- 1 20 2037788 Jan  9  2007 Supplementary_Video_5.avi
-rw-r--r-- 1 20 3312198 Jan  9  2007 Supplementary_Video_6.avi

To properly support these FTP sites, when CheckIfDead::checkDeadlinks() identifies a URL as an FTP URL (which it already does) and the request fails, it should retry with "--user anonymous:anonymous@domain.com" added to the curl request. Note that some FTP sites return an error if you attempt to log in anonymously, so the anonymous login should only be tried after the first, unauthenticated attempt fails.
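The retry order described above can be sketched as follows. This is a minimal illustration in Python, not the actual CheckIfDead code: the names `is_dead`, `curl_fetch`, and `ANONYMOUS_USERPWD` are hypothetical, and the fetcher is passed in as a parameter so the fallback logic can be exercised without network access. It assumes a request counts as successful when curl exits with status 0.

```python
import os
import subprocess
from typing import Callable, Optional
from urllib.parse import urlsplit

# Credentials taken from the task description; the password can be any email address.
ANONYMOUS_USERPWD = "anonymous:anonymous@domain.com"

# A fetcher takes (url, userpwd or None) and returns True if the request succeeded.
Fetcher = Callable[[str, Optional[str]], bool]


def curl_fetch(url: str, userpwd: Optional[str] = None, timeout: int = 30) -> bool:
    """Hypothetical fetcher: run curl and treat exit status 0 as success."""
    cmd = ["curl", "--silent", "--output", os.devnull, "--max-time", str(timeout), url]
    if userpwd is not None:
        cmd += ["--user", userpwd]
    return subprocess.run(cmd).returncode == 0


def is_ftp_url(url: str) -> bool:
    """Return True if the URL uses the ftp scheme."""
    return urlsplit(url).scheme.lower() == "ftp"


def is_dead(url: str, fetch: Fetcher = curl_fetch) -> bool:
    """Return False ("alive") if the URL can be fetched, True ("dead") otherwise."""
    # First attempt: no credentials, since some FTP sites reject anonymous logins.
    if fetch(url, None):
        return False
    # Second attempt, FTP only: retry with an anonymous login.
    if is_ftp_url(url) and fetch(url, ANONYMOUS_USERPWD):
        return False
    return True
```

Calling `is_dead("ftp://ftp.aip.org/epaps/phys_rev_lett/E-PRLTAO-98-047705/")` would issue the unauthenticated request first and fall back to the anonymous login only if that fails; non-FTP URLs are never retried with credentials.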

Please test with the following URLs to make sure they all return false ("alive"):

  • ftp://ftp.aip.org/epaps/phys_rev_lett/E-PRLTAO-98-047705/ (requires anonymous login)
  • ftp://ftp.rsa.com/pub/pkcs/ascii/layman.asc
  • ftp://ftp.funet.fi/pub/standards/RFC/rfc959.txt

Also add the first URL above to checkIfDeadTest.php.

kaldari created this task. (Jul 6 2016, 1:31 AM)
Restricted Application added subscribers: Zppix, Aklapper. (Jul 6 2016, 1:31 AM)
kaldari triaged this task as High priority.
kaldari updated the task description. (Jul 6 2016, 1:38 AM)
Cyberpower678 closed this task as Resolved. (Jul 7 2016, 9:08 PM)

Committed my update, and tested the given URLs. They return false now.