Whitelist search.ebscohost.com
Closed, Resolved · Public

Description

Similar to T164553: EBSCOhost is a major paywalled database, and its archived pages always lead to a login screen. Some of the links (those with db=a9h, for instance, which were 15% of the results) do redirect in-browser to an abstract of the cited article, but their archived copies do not.

Example:
http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=5331234&site=ehost-live will resolve but
https://web.archive.org/web/20180429204700/http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=5331234&site=ehost-live won't

So the search.ebscohost.com domain should be whitelisted ("treats the URL as alive and locks it in that state so it cannot be modified by the bot").
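
For illustration only, here is a rough sketch of what a host-level whitelist check could look like, alongside the archive URL pattern from the example above. This is not the bot's actual code or configuration: `WHITELISTED_HOSTS`, `is_whitelisted`, and `wayback_url` are hypothetical names, and the exact-host match is an assumption about how such a rule might be scoped.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist; the bot's real configuration format is not shown in this task.
WHITELISTED_HOSTS = {"search.ebscohost.com"}


def is_whitelisted(url, hosts=WHITELISTED_HOSTS):
    """Return True if the URL's host exactly matches a whitelisted host."""
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if there is no host
    return host is not None and host in hosts


def wayback_url(timestamp, url):
    """Build a Wayback Machine snapshot URL in the form used in the example above."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


original = (
    "http://search.ebscohost.com/login.aspx"
    "?direct=true&db=a9h&AN=5331234&site=ehost-live"
)
archived = wayback_url("20180429204700", original)

print(is_whitelisted(original))  # the live link is on the whitelisted host
print(is_whitelisted(archived))  # the archived link is hosted on web.archive.org
```

Under a rule like this, the live link would be treated as alive and left alone, rather than swapped for the archived copy that only reaches the login screen.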

Happy to assist with tasks like this, especially as the whole ebscohost.com domain needs some configuration, but I think whitelisting/blacklisting full domains now requires a user right, even if it didn't before.

Event Timeline

czar created this task. Apr 29 2018, 8:52 PM
Restricted Application added a project: Internet-Archive. Apr 29 2018, 8:52 PM
czar updated the task description. Apr 29 2018, 9:05 PM
Vvjjkkii renamed this task from Whitelist search.ebscohost.com to b0daaaaaaa. Jul 1 2018, 1:13 AM
Vvjjkkii removed Cyberpower678 as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii added a subscriber: Cyberpower678.
Green_Cardamom renamed this task from b0daaaaaaa to Whitelist search.ebscohost.com. Jul 1 2018, 4:57 AM
Green_Cardamom assigned this task to Cyberpower678.
Green_Cardamom raised the priority of this task from High to Needs Triage.
Green_Cardamom updated the task description.
Green_Cardamom removed a subscriber: Cyberpower678.
Cyberpower678 closed this task as Resolved. Jul 18 2018, 2:57 AM

It already appears to be locked behind the paywall state. Nothing to do here.