Making WDQS onscreen output more Commons friendly
Closed, Resolved · Public

Description

I recently created a Commons template, "Category contains", to store a SPARQL-based specification for the content of the category, and present a WDQS query to see what is on Wikidata that appears to match. (See, e.g., the example use on a category, and the post to the Commons Village Pump introducing and motivating it.)

I hope the template gets taken up widely, because I think it could be very useful for the Structured Data for Commons project to be able to analyse at scale the kind of combinations that typically-used categories represent; and to present people with an idea of what (Wikidata + Structured Topic Data) might currently be able to achieve; how close (or not) it might be able to get to the contents of present-day categories; and what sort of properties currently need better data coverage on Wikidata if WD+SD is to get nearer to what categories can currently show for a particular combination and refinement of topics.

But it made me realise that there are a couple of tweaks to the WDQS output that could really make a big difference in the usability of the onscreen output for this kind of purpose, i.e. people comparing available content on Commons with the output of queries:

  • It would be good if the values of property P:373 ("Commons category") could be presented as links rather than plain text. (The same also goes for the values of external identifiers). Yes, it's easy enough to CONCAT the domain and cast the string to a URL, but it makes the output look a lot messier and harder to read. It would be nice if internally the column of values were being retrieved as (URL + text)-valued objects, and could then be presented as such.
  • It's an issue that the image thumbnails link directly to the full size images (often huge). I accept this may be exactly what one wants for batch queries, but usually for interactive work one doesn't want that huge image. Often what one wants to get to, especially for investigatory work on the image, or to see how it compares with other possible images that could have been chosen to be the P18 value instead, would be the Commons file page, which one can't get back to from the full big image. A really good compromise perhaps might be if it were possible for the thumbnail to link to the MediaViewer for the image. A user could then either click onwards to the full-size image, or follow the link from MV to the Commons file page instead. And a real bonus would be if the links for the rest of the image results from the query could be sent as the rest of the "carousel" of images that MV displays, so that one could click through an MV slideshow of them if one wanted.

I hope this doesn't sound as if I am carping, because I am totally blown away by what you guys have achieved with WDQS and its amazing range of outputs in the last few months. But this would be the last 2% that would just be the icing on the cake.

Event Timeline

Restricted Application added a subscriber: Aklapper.

It would be good if the values of property P:373 ("Commons category") could be presented as links rather than plain text

The problem is, when displaying query results, the GUI code has no idea a specific value came from P373. You can use projection like uri(concat('https://commons.wikimedia.org/wiki/Category:', ?commonscat)) but otherwise I don't see a way to know.
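For illustration only, the string-to-URL step of that projection can be sketched outside SPARQL. This is a minimal Python sketch, not WDQS GUI code: the function name `commons_category_url` is hypothetical, and replacing spaces with underscores is an assumption about the usual MediaWiki URL form. It does not solve the underlying problem that the GUI cannot tell which values came from P373.

```python
from urllib.parse import quote

COMMONS_WIKI = "https://commons.wikimedia.org/wiki/"


def commons_category_url(category_name: str) -> str:
    """Build a Commons category page URL from a plain P373 string value.

    Hypothetical helper: mirrors uri(concat(<prefix>, ?commonscat)) in SPARQL.
    """
    # Assumption: page URLs conventionally use underscores instead of spaces.
    title = "Category:" + category_name.strip().replace(" ", "_")
    # Percent-encode the remaining characters, keeping ':' readable.
    return COMMONS_WIKI + quote(title, safe=":")


print(commons_category_url("Douglas Adams"))
```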

(The same also goes for the values of external identifiers)

External IDs are trickier, because the required information is not readily available. See T121274 for more work on this.

It would be nice if internally the column of values were being retrieved as (URL + text)-valued objects

Not sure what you mean here. Can you give an example? SPARQL doesn't really allow any objects... Maybe I am missing something here.

A really good compromise perhaps might be if it were possible for the thumbnail to link to the MediaViewer for the image

This may be possible to do in the GUI, so which URL should it be using?

Finally, it would be good if there could be a way to switch between different views when a query has been run in "kiosk" mode

I think it may be possible to add a mode-switch menu to the embed page. @Jonas?

The standard URLs for the MediaViewer appear to be the wiki page it was launched from, then #/media/ and then the name of the file to display, so

with the other files to include in the slideshow (if the user wants to click through to them) being deduced from the base URL. The base URL also tells MediaViewer the page to return to if the user closes the viewer.

One could therefore link to MediaViewer like this:

i.e. https://commons.wikimedia.org/wiki/ + Filename + #/media/ + Filename
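The pattern above can be sketched as a small Python helper. This is only a sketch under stated assumptions: the name `media_viewer_url` is hypothetical, "Filename" is taken to be the full page title including the "File:" prefix, and spaces are written as underscores.

```python
from urllib.parse import quote

COMMONS_WIKI = "https://commons.wikimedia.org/wiki/"


def media_viewer_url(page_title: str) -> str:
    """Return base page + '#/media/' + file title, as described in the comment.

    Assumption: page_title already carries the 'File:' namespace prefix.
    """
    encoded = quote(page_title.replace(" ", "_"), safe=":")
    # The part before '#' is the page MediaViewer returns to when closed;
    # the part after '#/media/' names the file to display.
    return COMMONS_WIKI + encoded + "#/media/" + encoded


print(media_viewer_url("File:Universe Photo.svg"))
```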

I don't know if there are any other ways to link & invoke MediaViewer (but there might be), or to pass it the names of the other images to include in the slideshow.

The other (simplest) approach would just be to link to the Commons file page,

but personally I do quite like the MediaViewer (though some people don't)

@Jheald right now, if you click on the "picture" icon before the picture name, you already get a gallery view. Is that one not good? As for file link, I don't remember why we chose Special:FilePath, @Jonas may know.

The gallery view is nice!! I don't know how I missed it - I thought I had tried it, and it just went to the same place as the file link.

As for the file link, presumably it was chosen because something like <http://commons.wikimedia.org/wiki/Special:FilePath/Universe%20Photo.svg> is the IRI used to represent the image in RDF exports, according to the information on the Wikibase RDF Dump Format page; and so presumably also the IRI that one should get from the direct-access and download versions of the query, to provide the full image as linked data.

On the other hand, for interactive use, it's often very helpful to be able to see what's on the File page at Commons; but there's no way to click back from the full-resolution image to the Commons page without manually rewriting the URL every time. Also, as mentioned before, the full-resolution image is often more than one wants -- often very big, so very heavy to download.
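The manual URL rewrite mentioned above could be automated along these lines. This is a rough Python sketch, assuming the IRI has the Special:FilePath form quoted earlier; the helper name `file_page_url` is hypothetical and is not part of WDQS.

```python
from urllib.parse import quote, unquote, urlsplit

FILEPATH_PREFIX = "/wiki/Special:FilePath/"
COMMONS_WIKI = "https://commons.wikimedia.org/wiki/"


def file_page_url(filepath_iri: str) -> str:
    """Map a Special:FilePath IRI to the corresponding Commons file page URL."""
    path = urlsplit(filepath_iri).path
    if not path.startswith(FILEPATH_PREFIX):
        raise ValueError("not a Special:FilePath IRI: " + filepath_iri)
    # Decode the file name, then re-encode it as a 'File:' page title.
    file_name = unquote(path[len(FILEPATH_PREFIX):])
    title = "File:" + file_name.replace(" ", "_")
    return COMMONS_WIKI + quote(title, safe=":")


print(file_page_url(
    "http://commons.wikimedia.org/wiki/Special:FilePath/Universe%20Photo.svg"
))
```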

(It may indeed even be a legal requirement to be able to easily get to the licensing and attribution information on the file page, under the terms of the various CC licenses that demand that images are only made available with attribution. So strictly speaking at the very least the linked data service should probably be making available some attribution URL + some license URL if it is providing the bare image URL. But that may be something that will have to wait for the full roll-out of Structured Data on Commons).

I don't know if there are any other ways to link & invoke MediaViewer (but there might be), or to pass it the names of the other images to include in the slideshow.

No.

With regard to external identifiers and to the P:373 Commons category values, it strikes me that what we really need in the triplestore is a special datatype -- in a similar way to the way we have a special datatype for Commons media.

It would seem that what we need is a "link" datatype, that can carry both a URL and a string, and can therefore be recognised as such and formatted appropriately by the GUI. Presumably routines could be provided to easily cast it to string or to IRI; and/or one could just have two RDF connector properties to connect it to its string and to its IRI respectively.
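To make the idea concrete, here is a purely hypothetical Python sketch of such a value as a consumer like the GUI might model it. No such datatype exists in the triplestore according to this thread; the class name `Link` and its methods are invented for illustration.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Link:
    """Hypothetical 'link' value carrying both an IRI and a display string."""

    iri: str
    label: str

    def as_string(self) -> str:
        # Cast to plain string: just the human-readable text.
        return self.label

    def as_iri(self) -> str:
        # Cast to IRI: just the target address.
        return self.iri

    def as_html(self) -> str:
        # How a GUI could render the value: short linked text.
        return '<a href="{}">{}</a>'.format(
            escape(self.iri, quote=True), escape(self.label)
        )


cat = Link("https://commons.wikimedia.org/wiki/Category:Douglas_Adams",
           "Douglas Adams")
print(cat.as_html())
```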

It would also be useful for the GUI to show Wikipedia sitelinks as links rather than full URLs.

The shorter form is especially valuable when one is trying to include a lot of columns in a table report, without losing the grid structure (which happens if there are too many columns with unbreakably long text).

Of course one can more or less extract the page name with something like
BIND(REPLACE(REPLACE(str(?article), 'https://en.wikipedia.org/wiki/', ''),'%20',' ') AS ?en_wiki)
but this very basic workaround still leaves e.g. brackets and international characters in percent encoding; and of course it loses the clickability of the link.
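For comparison, full percent-decoding is straightforward outside SPARQL. This is a minimal Python sketch of the kind of post-processing a client could apply to the ?article value; the function name `sitelink_title` is hypothetical, and treating underscores as spaces is an assumption.

```python
from urllib.parse import unquote, urlsplit

WIKI_PATH_PREFIX = "/wiki/"


def sitelink_title(article_url: str) -> str:
    """Extract a readable page title from a Wikipedia sitelink URL."""
    path = urlsplit(article_url).path
    if not path.startswith(WIKI_PATH_PREFIX):
        raise ValueError("unexpected sitelink URL: " + article_url)
    # unquote decodes every percent escape as UTF-8, so brackets and
    # international characters come out readable, not only '%20'.
    return unquote(path[len(WIKI_PATH_PREFIX):]).replace("_", " ")


print(sitelink_title("https://en.wikipedia.org/wiki/S%C3%A3o%20Paulo%20(state)"))
```

Pairing the decoded title with the original URL would keep the value clickable, which the REPLACE workaround cannot do.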

It would be nice to be able to have a wiki article name by default show up in the GUI simply as linked text.

Finally, it would be good if there could be a way to switch between different views when a query has been run in "kiosk" mode

Please see T151057: [Story] Toolbars for result views

@Jheald right now, if you click on the "picture" icon before the picture name, you already get a gallery view. Is that one not good? As for file link, I don't remember why we chose Special:FilePath, @Jonas may know.

AFAIK Special:FilePath is from the RDF export. Maybe the reason is that it is a direct link to the actual image resource.

It would be good if the values of property P:373 ("Commons category") could be presented as links rather than plain text. (The same also goes for the values of external identifiers). Yes, it's easy enough to CONCAT the domain and cast the string to a URL, but it makes the output look a lot messier and harder to read. It would be nice if internally the column of values were being retrieved as (URL + text)-valued objects, and could then be presented as such.

@Smalyshev we may want to change this in the RDF export so it is a URI instead of text.

Change 340503 had a related patch set uploaded (by Jonas Kress (WMDE)):
[wikidata/query/gui] Change image link media viewer

https://gerrit.wikimedia.org/r/340503

Thanks for this, Jonas, but given there is already the nice viewer on the page (which I didn't know about when I wrote the bug, because until then I'd only right-clicked on it), perhaps on consideration a better link is just the simple Commons file page, i.e. 'https://commons.wikimedia.org/wiki/File:{FILENAME}'

Change 340503 merged by Jonas Kress (WMDE):
[wikidata/query/gui] Change image link media viewer

https://gerrit.wikimedia.org/r/340503

I've lost track a bit here. What still needs to be done?

Jonas claimed this task.

I am closing this, even though I think proper linking to Commons categories is still missing.
If the demand is still there, please create a new ticket, thanks!