
Foreign Affairs PDF downloads aren't proxied correctly
Closed, Resolved (Public)

Description

When on a Foreign Affairs magazine page (https://www-foreignaffairs-com.wikipedialibrary.idm.oclc.org/issues/2020/99/4), the PDF download link isn't proxied correctly. Clicking it takes the user to https://www.foreignaffairs.com/system/files/pdf/2020/99400_1.pdf, which presents an Access Denied screen.

The URL can be crafted manually (https://www-foreignaffairs-com.wikipedialibrary.idm.oclc.org/system/files/pdf/2020/99400_1.pdf) and functions as expected.
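For reference, the manual rewrite can be sketched in a few lines of Python. This is only an illustration inferred from the two URLs in this task: the helper name `to_proxied_url` is hypothetical, and the rule (dots in the hostname become hyphens, then the proxy suffix is appended) is an assumption that may not hold for every hostname the proxy handles.

```python
from urllib.parse import urlsplit, urlunsplit

# Proxy suffix as it appears in the URLs quoted in this task.
PROXY_SUFFIX = "wikipedialibrary.idm.oclc.org"


def to_proxied_url(url: str, proxy_suffix: str = PROXY_SUFFIX) -> str:
    """Rewrite a publisher URL into the proxied form seen in this task.

    Assumption: dots in the original hostname are replaced by hyphens and
    the proxy suffix is appended. Path, query and fragment are unchanged.
    """
    parts = urlsplit(url)
    proxied_host = parts.hostname.replace(".", "-") + "." + proxy_suffix
    return urlunsplit(
        (parts.scheme, proxied_host, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(to_proxied_url(
        "https://www.foreignaffairs.com/system/files/pdf/2020/99400_1.pdf"
    ))
```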

Event Timeline

Samwalton9-WMF created this task.

Some of the page is generated via JavaScript, and I'm seeing some elements being blocked in the Chrome Inspect view. Looks like there might just be something additional we need to let through in the proxy configuration.

This one took a while to track down. These links are assembled in a pretty unusual way.
Some JavaScript makes an XHR request and gets back JSON in which the URL's slashes are escaped, so it isn't quite a plain URL:

{
  "audio": null,
  "pdf": {
    "download_allowed": true,
    "download_url": "https:\/\/www.foreignaffairs.com\/system\/files\/pdf\/2020\/99400_1.pdf"
  }
}

which it then parses into a URL that gets injected into the page.
I'll have to write a find/replace pattern into the database stanza, which will slow the page loads.
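To illustrate why a find/replace pattern is needed, here is a minimal Python sketch. It is not the actual proxy configuration. It only shows that a rewriter searching the raw response for the plain origin finds nothing, because the slashes are escaped as `\/`, while a replace on the escaped form does work. The names `RAW_BODY` and `rewrite_body` are made up for this example, and the response body is the one quoted above, collapsed onto one line.

```python
import json

# Raw XHR response body as quoted above; the slashes are escaped as "\/".
RAW_BODY = (
    r'{"audio": null, "pdf": {"download_allowed": true, '
    r'"download_url": "https:\/\/www.foreignaffairs.com\/system\/files\/pdf\/2020\/99400_1.pdf"}}'
)

PLAIN_ORIGIN = "https://www.foreignaffairs.com"
PROXIED_ORIGIN = "https://www-foreignaffairs-com.wikipedialibrary.idm.oclc.org"

# The same origins, written the way they appear in the raw body.
ESCAPED_ORIGIN = PLAIN_ORIGIN.replace("/", r"\/")
ESCAPED_PROXIED_ORIGIN = PROXIED_ORIGIN.replace("/", r"\/")


def rewrite_body(raw_body: str) -> str:
    """Replace the escaped publisher origin with the escaped proxied origin."""
    return raw_body.replace(ESCAPED_ORIGIN, ESCAPED_PROXIED_ORIGIN)


if __name__ == "__main__":
    # A search for the plain origin misses the escaped URL entirely.
    print(PLAIN_ORIGIN in RAW_BODY)    # False
    print(ESCAPED_ORIGIN in RAW_BODY)  # True

    # After the rewrite, the page's JSON parser yields a proxied URL.
    data = json.loads(rewrite_body(RAW_BODY))
    print(data["pdf"]["download_url"])
```

The extra pass over every response body is the cost mentioned above: each one has to be scanned for the escaped pattern before it is handed back to the browser.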

Since this should be an issue for any Foreign Affairs users, would it be worth me emailing them to see if there's another solution, or potentially a fix they can implement?

It looks like they are using Drupal with a bunch of modules to implement their platform. My guess is that changing their data format would be problematic. Let me try to hack together a working config and we'll see if it performs acceptably.

I imagine other folks must be using EZproxy or similar and thus hitting the same issue?

I've usually found in situations like this that it's most helpful to just share the working config once you have it.
It can also be helpful to poke the vendor to fill out the EZproxy vendor form so the config can get published to EZproxy's support site:
https://help.oclc.org/Library_Management/EZproxy/EZproxy_for_content_providers

I have this almost working, by the way. Currently getting proxied PDFs with a cert error. I have the fix wending its way through the OCLC deploy pipeline.

Okay, flagged for production pickup. I'm calling it a day, but this should be working 30 minutes from now. I'll check in tomorrow.