Providing both HTTPS as protocol and 443 as port makes citoid fail
Closed, Resolved · Public · 0 Estimated Story Points

Description

Given that I had opened VisualEditor
And started "Cite"
When I filled in a link to an external site (book at Oria.no)
Then I expect a properly formatted reference

Instead I got a failure

Screendum citoid 2016-04-24.png (279×436 px, 25 KB)

If I remove the port number the reference is properly formatted

Screendum citoid 2016-04-24-correct.png (242×439 px, 17 KB)

Event Timeline

There is a note at w:no:Wikipedia:Torget#HTTPS_og_port_443 about this bug as the site Oria.no is the major provider of bibliographic information in Norway.

Mvolz subscribed.

I believe that ports are disallowed as a security feature to prevent citoid from being used for port scanning but I could be wrong; @dpatrick is this true? Is there a way to allow these kinds of links?

I have notified Bibsys about the problem with the explicit port address, hopefully they can fix this.

@jeblad, just to clarify, the original citation URL which caused the error is https://bibsys-almaprimo.hosted.exlibrisgroup.com:443/BIBSYS:default_scope:BIBSYS_ILS71491487680002201, correct?

@Mvolz, can you tell me which versions of node.js and the url module are currently deployed? My hypothesis is that url.parse is parsing the URL incorrectly.
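One way to test that hypothesis is to look at how a URL parser splits the reported URL. The following is only a sketch: it uses the WHATWG `URL` class that recent Node.js versions expose as a global, not necessarily the parser Citoid calls, and `describeUrl` is a hypothetical helper name.

```javascript
// Hypothetical helper: return the URL components relevant to this bug.
// Assumes a Node.js version that provides the global WHATWG URL class.
function describeUrl(input) {
  const u = new URL(input);
  return {
    protocol: u.protocol, // includes the trailing colon, e.g. 'https:'
    hostname: u.hostname, // host name without any port
    port: u.port,         // empty string when the port is the scheme default
    pathname: u.pathname,
  };
}

const parts = describeUrl(
  'https://bibsys-almaprimo.hosted.exlibrisgroup.com:443/BIBSYS:default_scope:BIBSYS_ILS71491487680002201'
);
console.log(parts);
```

The WHATWG parser drops a port that matches the scheme default, so `port` comes back empty for `https` with `:443`. If the deployed parser reports something else for `protocol` or `hostname`, that would support the hypothesis.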

Oh, and to answer your question, @Mvolz, explicit port specification is allowed. It is access to localhost and private IP address ranges that is disallowed.
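The rule described above (ports allowed, localhost and private ranges refused) could look roughly like this. It is a simplified sketch and not Citoid's actual check: `isDisallowedHost` and `isAllowedTarget` are hypothetical names, only IPv4 literals and the name `localhost` are handled, and a real check would also need to resolve host names and cover IPv6.

```javascript
// Hypothetical sketch: refuse localhost and private IPv4 ranges, ignore the port.
function isDisallowedHost(hostname) {
  const host = hostname.toLowerCase();
  if (host === 'localhost') return true;
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false; // not an IPv4 literal; real code would resolve DNS first
  const a = Number(m[1]);
  const b = Number(m[2]);
  if (a === 127) return true;                        // loopback 127.0.0.0/8
  if (a === 10) return true;                         // private 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true;  // private 172.16.0.0/12
  if (a === 192 && b === 168) return true;           // private 192.168.0.0/16
  return false;
}

// The decision depends on the host only; an explicit port does not matter.
function isAllowedTarget(input) {
  return !isDisallowedHost(new URL(input).hostname);
}

console.log(isAllowedTarget('https://bibsys-almaprimo.hosted.exlibrisgroup.com:443/'));
console.log(isAllowedTarget('http://127.0.0.1:8080/'));
```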

Yes, I got the correct page.
Attached the source of the page.

There are two problems here for the URL https://bibsys-almaprimo.hosted.exlibrisgroup.com:443/BIBSYS:default_scope:BIBSYS_ILS71491487680002201:

  1. Citoid seems to disregard the protocol in the URL and puts http instead, which results in it querying http://bibsys-almaprimo.hosted.exlibrisgroup.com:443/BIBSYS:default_scope:BIBSYS_ILS71491487680002201
  2. The underlying library Citoid is using for issuing external requests does not strip the port off of the hostname, which results in it setting the header host: 'bibsys-almaprimo.hosted.exlibrisgroup.com:443'. As such, it is rejected by the server (likely because of the SSL certificate name mismatch).
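A fix for both points might take roughly the following shape: keep the scheme from the original URL, and leave the port out of the Host header when it is the default for that scheme. This is a sketch under assumptions, not the actual Citoid or request-library code. `hostHeader` and `buildRequest` are hypothetical names, and the regular expression only covers plain http/https URLs without credentials or IPv6 literals.

```javascript
// Default ports per scheme; a port equal to the default is redundant.
const DEFAULT_PORTS = { 'http:': '80', 'https:': '443' };

// Hypothetical helper: compute the Host header value for a request.
function hostHeader(protocol, hostname, port) {
  if (port === '' || port === DEFAULT_PORTS[protocol]) return hostname;
  return `${hostname}:${port}`;
}

// Hypothetical helper: build the request URL and headers from the user's input,
// preserving the original scheme instead of falling back to http.
function buildRequest(input) {
  const m = /^(https?:)\/\/([^\/:]+)(?::(\d+))?(\/.*)?$/i.exec(input);
  if (!m) throw new Error(`Unsupported URL: ${input}`);
  const protocol = m[1].toLowerCase();
  const hostname = m[2].toLowerCase();
  const port = m[3] || '';
  const path = m[4] || '/';
  const host = hostHeader(protocol, hostname, port);
  return { url: `${protocol}//${host}${path}`, headers: { host } };
}

console.log(buildRequest(
  'https://bibsys-almaprimo.hosted.exlibrisgroup.com:443/BIBSYS:default_scope:BIBSYS_ILS71491487680002201'
));
```

With this approach a non-default port such as `:8080` stays in both the URL and the Host header, so only the redundant `:443` on `https` (or `:80` on `http`) is dropped.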

As you've found yourself, @jeblad, the workaround is to strip it manually from the URL being fed to Citoid, but the problem surely needs a proper fix.

It can be reproduced with curl as well:

curl -I -L https://bibsys-almaprimo.hosted.exlibrisgroup.com:443/BIBSYS:default_scope:BIBSYS_ILS71521654120002201 \
   -H 'Host: bibsys-almaprimo.hosted.exlibrisgroup.com:443'

It's very odd indeed to have an HTTPS URL redirect to HTTP and then back to HTTPS. I will also complain to Bibsys about this.

The example https://bibsys-almaprimo.hosted.exlibrisgroup.com:443/BIBSYS:default_scope:BIBSYS_ILS71491487680002201 seems to work now? Can anyone else confirm that this is in fact solved?

Seems like this works now?

First reference on the newly created wikipage is a random example from Oria.no while the second reference is the initially posted failing reference.

As the failing reference used as an example in the initial post now works, I will close this task.