Decide on HTTP vs HTTPS for concept URIs on Commons
Open, High, Public

Description

As a Commons user, I want a decision to be made on using HTTP or HTTPS for concept URIs, so that usage is consistent across SPARQL, dumps and concept URI links and conforms with accepted standards.

Since Commons is starting from scratch with Structured Data, we have the opportunity to make the right decision between HTTP and HTTPS for concept URIs.

For context: https://www.w3.org/DesignIssues/Security-NotTheS.html

AC:

  • Decision on HTTP vs HTTPS for concept URIs (Update: Decision is to use HTTPS)
  • Dump, SPARQL and ConceptURI links on wiki should be coherent with this decision

Event Timeline

Gehel created this task. Jul 22 2020, 1:54 PM
Restricted Application added a subscriber: Aklapper. Jul 22 2020, 1:54 PM
Gehel triaged this task as High priority. Jul 22 2020, 1:54 PM
Gehel moved this task from All WDQS-related tasks to RDF Model on the Wikidata-Query-Service board.

Wait, what? Didn't we have this discussion quite some time ago for Commons and decide it would be https from the start? How did the http slip in again? If I look at a not-so-random item, https://commons.wikimedia.org/wiki/Special:EntityData/M90544172.rdf , it says:
<schema:about rdf:resource="https://commons.wikimedia.org/entity/M90544172"/>

That's also the URL you get on the left as the concept URI. The introduction of http is a mistake and a bug.
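The check described above can be sketched in a few lines of Python. This is only an illustration, not the tooling anyone on the task used: the sample document is a trimmed stand-in in which only the schema:about line comes from the real RDF output, the rdf:about subject is a placeholder, and the function name concept_uri_schemes is made up for this example. It also assumes the schema prefix maps to http://schema.org/.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SCHEMA_NS = "http://schema.org/"  # assumed binding of the schema: prefix

# Trimmed stand-in for the Special:EntityData output. Only the schema:about
# line is copied from the comment above; the rdf:about subject is a placeholder.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:schema="{SCHEMA_NS}">
  <rdf:Description rdf:about="https://commons.wikimedia.org/wiki/Special:EntityData/M90544172">
    <schema:about rdf:resource="https://commons.wikimedia.org/entity/M90544172"/>
  </rdf:Description>
</rdf:RDF>"""


def concept_uri_schemes(rdf_xml):
    """Map each schema:about target URI to its URI scheme (http or https)."""
    root = ET.fromstring(rdf_xml)
    uris = [
        el.get(f"{{{RDF_NS}}}resource")
        for el in root.iter(f"{{{SCHEMA_NS}}}about")
    ]
    return {uri: urlsplit(uri).scheme for uri in uris if uri}


print(concept_uri_schemes(SAMPLE))
```

Running the same function over the full .rdf document fetched from the URL above should work unchanged, as long as the concept URI is exposed through schema:about.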

CBogen added a subscriber: CBogen. Jul 23 2020, 8:47 PM

@Multichill it's very possible this discussion was had in the past but that those of us working on it now weren't around at that point - do you happen to have a link to the discussion that we can review? If the decision was already made, then going forward with the change will be a more straightforward next step to take :)

Been looking around and can't find it. I recall a discussion about Wikidata not using https, but that for Commons, because it's all new, we might as well use https from the start. That's what got implemented in the RDF and the concept URIs. So it's a bit confusing to see http popping up again now.

I guess that Wikidata concept URIs are using http:// because it is what is usually done by RDF datasets (DBpedia...), mostly for backward compatibility reasons.
I would be slightly in favor of using http:// URIs for Commons entities in order to have all Wikibase entities and relations using http:// instead of having some with http:// and some with https://.

I have not found the original tickets about Wikidata concept URIs.
I believe that the original Wikibase mapping has been designed by @DVrandecic.

@DVrandecic sorry for notifying you, do you have any opinion about it?

Removed reference to T226453. This is about concept URIs on Commons (https) and not about concept URIs on Wikidata (http).

Akuckartz added a subscriber: Akuckartz.

Use only https please.

Thanks @Tpt for pulling me in.

I checked in my own PhD thesis, as I had a short section on that, and ten years ago, when I wrote this, basically everything on the Linked Open Data Web was http: http://simia.net/download/ontology_evaluation.pdf , p.67

Fortunately, that has changed.

More background can be found here: https://wiki.dnb.de/display/DINIAGKIM/HTTP+vs.+HTTPS+in+resource+identification from the SWiB18 discussion on the topic.

In my opinion, since we are building a new resource here, we should take the chance to use https from the start.

It was a mistake that we didn't do that for Wikidata, which was due to the fact that Wikipedia only started offering https in 2011, and switched to it only in 2015. We are not in that situation today, and can make the right decision.

I would love for Wikidata to switch, but that's a different task.

So my recommendation, which is not informed at all by possible technical constraints, is to use https wherever possible, and in particular to use https from the start for any new namespaces, such as those introduced for WCQS and the Commons data dumps, even though this means mixing some http and https identifiers.

I would not care about consistency with the Wikidata namespaces. As said, I consider that, retroactively, a mistake, and would love to change it.

mxn added a subscriber: mxn. Jul 24 2020, 8:24 PM
CBogen added a comment. Aug 3 2020, 5:17 PM

When the dump was reloaded last week, WCQS was changed to HTTPS.

The Commons URIs and the dumps are still HTTP so this work is on the Structured Data team.

The Concept URI link on Commons is still HTTP (the RDF dumps already use https for Commons-related URIs).
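As a rough illustration of the coherence requirement in the acceptance criteria, the state described here can be written down and checked in Python. This is a sketch under stated assumptions: the source labels (wcqs, rdf_dump, concept_uri_link), the helper names, and the sample URIs are hypothetical and were chosen only to mirror the situation reported in this comment.

```python
from urllib.parse import urlsplit


def scheme_report(samples):
    """Group source names by the scheme of their sample concept URI."""
    by_scheme = {}
    for source, uri in samples.items():
        by_scheme.setdefault(urlsplit(uri).scheme, []).append(source)
    return by_scheme


def is_coherent(samples, expected="https"):
    """True only if every source uses the expected scheme."""
    return all(urlsplit(uri).scheme == expected for uri in samples.values())


# Hypothetical sample values mirroring the state reported in this comment.
samples = {
    "wcqs": "https://commons.wikimedia.org/entity/M90544172",
    "rdf_dump": "https://commons.wikimedia.org/entity/M90544172",
    "concept_uri_link": "http://commons.wikimedia.org/entity/M90544172",
}

print(scheme_report(samples))
print(is_coherent(samples))
```

Once the Concept URI link is switched, the same check should report a single https group.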

CBogen updated the task description.
CBogen added a comment. Aug 5 2020, 4:21 PM

Note that the SD team's work to change the Concept URIs in Commons is estimated to be small.