Facilitate the manual check of proposed PDF/HTML
Closed, Resolved (Public)

Description

Because we care about users checking the links as thoroughly as possible, we need to provide them with the best tools to do so.

After T166592: Simplify access to Sherpa/Romeo information (e.g. via Dissemin), the next step is to make it easier to check the proposed PDF/HTML:

  • increase the chances that the URL preview is above the fold, rather than hidden (e.g. by floating the title and search bar to the right of the logo and the "copyright guidance" section to the right of the citation);
  • change the iframe to increase chances that it will actually load the PDF in the most common browser configurations.

The current layout is clearly suboptimal:

2018-04-21_OAbot_screen.png (900×1 px, 105 KB)

Event Timeline

I think this is our highest priority currently, based on demand.

The issues with the preview I've identified so far are:

  1. links over HTTP, which browsers will not load in the iframe (mixed content) when you're using oabot over HTTPS: there is no real solution; the workaround is to use oabot over HTTP (I thought a proxy existed at some point to serve such resources, but I can't find anything now);
  2. URLs which return X-Frame-Options: SAMEORIGIN, like http://www.dtic.mil/get-tr-doc/pdf?AD=ADA065558 : again, little can be done other than downloading the file and serving it locally.
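
The two failure modes listed above can be detected before the iframe is rendered. The following is a minimal sketch, not the oabot code: the function name `iframe_blockers` and the reason labels are hypothetical, and it assumes the response headers of the target URL have already been fetched into a dict.

```python
from urllib.parse import urlsplit


def iframe_blockers(url, response_headers, page_scheme="https"):
    """Return the reasons (possibly none) why `url` is unlikely to render in an iframe.

    `response_headers` is a plain dict of the headers returned for `url`;
    `page_scheme` is the scheme of the page embedding the iframe.
    Both names are assumptions made for this sketch.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    reasons = []

    # Issue 1: an HTTP resource embedded in an HTTPS page is mixed content.
    if page_scheme == "https" and urlsplit(url).scheme == "http":
        reasons.append("mixed-content")

    # Issue 2: the remote server forbids framing by other origins.
    frame_options = headers.get("x-frame-options", "").strip().upper()
    if frame_options in ("DENY", "SAMEORIGIN"):
        reasons.append("x-frame-options")

    return reasons
```

An empty list means neither of the two known blockers applies; it does not guarantee that the preview will load.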

I've added a route to stream the URLs which are not HTML served over HTTPS: https://github.com/dissemin/oabot/commit/11ae96694548e728362d85c787546cdb2c9d5b44
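
For readers unfamiliar with the approach, streaming a remote file through the local server roughly means relaying it chunk by chunk instead of loading it fully into memory. The sketch below uses only the standard library and is not the code from the commit; `iter_chunks`, `stream_remote`, the chunk size and the User-Agent string are all hypothetical.

```python
from urllib.request import Request, urlopen

CHUNK_SIZE = 64 * 1024  # assumed chunk size, not taken from oabot


def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive chunks from a binary file-like object until EOF."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stream_remote(url, timeout=30):
    """Open `url` and yield its body chunk by chunk (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "oabot-preview-sketch"})
    with urlopen(request, timeout=timeout) as response:
        yield from iter_chunks(response)
```

In a Flask-style application, a generator like `stream_remote(url)` could be passed as the body of the response returned by the route, so that the iframe loads the file from the same origin as the tool. A real route would also need to restrict which URLs it accepts, so that it cannot be used as an open proxy.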

The local streaming works pretty well. I tested many domains yesterday and there are only a couple where it just fails, for reasons I haven't identified.