Provide short URL to file description page in imageinfo API
Closed, Resolved, Public

Description

Normal file page links can be very long (up to ~300 characters), so there is a need for a shorter URL to be exposed via the imageinfo API (see T119686 for a use case). One possibility would be to expose the page ID; a URL could then be constructed as scriptUrl + '?curid=' + pageId (the script URL is already exposed by the filerepoinfo API).

This will be a bit hacky since shared-DB file repositories are not technically required to have file pages, but we already make the same assumptions with the description page URL.
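A minimal sketch of the construction proposed in the description, assuming the script URL and page ID have already been fetched. The function name and the example.org URL are made up for illustration and are not part of any MediaWiki API.

```python
from urllib.parse import urlencode


def short_description_url(script_url: str, page_id: int) -> str:
    """Build a curid-based link: scriptUrl + '?curid=' + pageId."""
    if page_id <= 0:
        # Assumption: a non-positive ID means there is no page to link to.
        raise ValueError("page_id must be a positive integer")
    return f"{script_url}?{urlencode({'curid': page_id})}"


# Hypothetical script URL and page ID, for illustration only.
print(short_description_url("https://example.org/w/index.php", 12345))
```

The result stays the same length regardless of how long the file name is, which is the whole point of the proposal.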

Event Timeline

Tgr raised the priority of this task to Medium.
Tgr updated the task description.

This is really a core MediaWiki bug, right? The change needs to be made to core API files.

Anomie claimed this task.
Anomie added a subscriber: Anomie.

The page_id is already available from the API. Construct the URL client-side as proposed in the task description.

Page ID is not available for images from foreign repositories (when queried by the imageinfo API). For ForeignAPIRepo files one could fetch it with an extra API request, although it's a bit cumbersome. For ForeignDBRepo, there is no way to get the page ID short of trying to guess what the remote API endpoint is.

Example beta enwiki query which contains all three repo types: http://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&prop=imageinfo&titles=File:Screen_Shot_2015-10-07_at_1.45.55_PM.png|File:Multipage_tif_example.tif|File:Example.png (filerepoinfo query here). The core API can't really be expected to know the page ID for the nonlocal files (the associated pages don't exist locally, or might even exist but have a completely different ID), but there could be an imageinfo subproperty that shows the page ID at the wiki of origin. (Or the short URL, which would default to a curid link but could be overridden by a hook; that would be handy for wikis which have a URL shortener extension installed.)
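For reference, a query like the one above can be assembled programmatically. This is only a sketch: the helper name is made up, and only the parameters visible in the example URL (action, prop, titles) are used.

```python
from typing import Iterable
from urllib.parse import urlencode

# Endpoint taken from the example query in this thread.
API_ENDPOINT = "http://en.wikipedia.beta.wmflabs.org/w/api.php"


def imageinfo_query_url(titles: Iterable[str], endpoint: str = API_ENDPOINT) -> str:
    """Build an imageinfo query URL for several file titles at once."""
    params = {
        "action": "query",
        "prop": "imageinfo",
        # Multiple titles go into one parameter, separated by "|".
        "titles": "|".join(titles),
    }
    return f"{endpoint}?{urlencode(params)}"


print(imageinfo_query_url(["File:Example.png", "File:Multipage_tif_example.tif"]))
```

Note that urlencode percent-encodes the "|" and ":" characters, so the printed URL looks different from the hand-written one but carries the same parameters.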

This is really a core MediaWiki bug, right? The change needs to be made to core API files.

Some of it, but the imageinfo API mostly just fetches data from File objects, so the substantial part of the change would have to happen there.

Okay cool, I plan to claim this on GCI after I finish one or two others (assuming someone doesn't claim it before I get to it).

I wonder how generally useful curid-based short URLs would really be, considering that such a short URL pointing to an enwiki image will break completely opaquely if the image is moved to Commons (the normal URL will generally still work, but the curid-based URL will just give a page saying "The requested page title is empty or contains only the name of a namespace"). Or if the image page is deleted and then undeleted.

Yes, that sucks (then again a stable but long URL that people are not actually using due to its size is not that useful either). In the long term, on Wikimedia sites, we should rely on T108557: Review and deploy UrlShortener extension to Wikimedia wikis for short URLs, but this sounds too broken even for a temporary solution.

The related bugs are T28123: Special:Undelete doesn't use ar_page_id (seems like an easy fix) and T73578: Deletion log excerpt (mw-warning-with-logexcerpt) not shown when only curid given and page has been deleted (seems also easy-ish to fix by making MediaWiki::parseTitle() fall back to a delete log record lookup for the page title, but the logging table is not indexed quite right for that and is huge so maybe that wouldn't perform well?). I wonder if those could be spun into further GCI tasks :)

Alternatively, we could use revision ID instead of page ID for the poor man's short URL. That survives delete/restore; still behaves poorly while in a deleted state, but making MediaWiki::parseTitle() check for deleted revisions seems straightforward.

Adding curid key to imageinfo should do it, right? Or should the url key be modified?

Adding curid key to imageinfo should do it, right?

No, since people want it to work with shared image repos too.

Or should the url key be modified?

No, that key gives a URL to the image itself rather than its description page.

@Anomie, comments starting with "no" are most often not helpful. This doesn't help me understand how you would like it to be.

No, since people want it to work with shared image repos too.

Why would adding curid to imageinfo prevent this from working with shared repos?

Could you please tell me what you would like this to be and, if possible, suggest an approach? Thank you 😊

Adding curid could work (although preferably under a name like pageid); with additional information from the filerepoinfo API, a URL could be pieced together. (Of course, for a remote file the remote page ID would have to be involved.) But it's not great: curid is not a great way of providing stable URLs, and it would be nice to support arbitrary URLs (such as those generated by UrlShortener). So I would go for a URL (call it descriptionShortUrl or something) which defaults to a curid URL, and later add a hook that can be used to replace it.
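To illustrate the "default to curid, let a hook replace it" idea, here is a rough sketch. It is not MediaWiki's hook system; the hook signature, function name, and URLs are all hypothetical.

```python
from typing import Callable, Iterable, Optional

# Hypothetical hook signature: receives the file title and the default URL,
# returns a replacement URL, or None to keep the default.
ShortUrlHook = Callable[[str, str], Optional[str]]


def description_short_url(
    title: str,
    script_url: str,
    page_id: int,
    hooks: Iterable[ShortUrlHook] = (),
) -> str:
    """Return a curid URL unless a registered hook supplies another one."""
    default = f"{script_url}?curid={page_id}"
    for hook in hooks:
        override = hook(title, default)
        if override is not None:
            # The first hook that returns a URL wins.
            return override
    return default


# Made-up shortener standing in for something like UrlShortener.
def fake_shortener(title: str, default_url: str) -> Optional[str]:
    return "https://short.example/abc" if title == "File:Example.png" else None


print(description_short_url("File:Example.png", "https://example.org/w/index.php", 42))
print(description_short_url("File:Example.png", "https://example.org/w/index.php", 42, [fake_shortener]))
```

With this shape, API clients always read one key and do not need to know whether the value came from curid or from a shortener.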

@Tgr, here is the model I am trying to implement:

  • for local files, get the pageid and then add a key descriptionShortUrl (as you proposed) like https://localwiki.com/w/?curid=<some-number>
  • for foreign API files, get the URL of the remote Wiki and the pageid and create a URL, in the same key descriptionShortUrl, like https://wikipedia.org/w/?curid=<some-number>

I am not really sure what to implement for foreign DB files, since I am not sure how they actually work.

For foreign API files, just grab descriptionShortUrl through the API.

Foreign DB files are not too different from local files; you just need to use a different DB. (They inherit from LocalFile, so you might not need to modify ForeignDBFile at all.)
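Putting the three cases together, the approach discussed above could look roughly like this. It is a sketch, not the merged patch: the repo type strings, function name, and parameters are made up, and the descriptionShortUrl key is the name proposed in this thread, which may not match the final API output.

```python
from typing import Mapping, Optional


def short_url_for_file(
    repo_type: str,
    *,
    script_url: Optional[str] = None,
    page_id: Optional[int] = None,
    remote_imageinfo: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick a short description URL depending on where the file lives."""
    if repo_type == "foreign-api":
        # Foreign API files: reuse the value reported by the remote wiki's API.
        if remote_imageinfo is None:
            return None
        return remote_imageinfo.get("descriptionShortUrl")
    if repo_type in ("local", "foreign-db"):
        # Local and foreign DB files are handled the same way; for foreign DB
        # files the script URL and page ID come from the other wiki's database.
        if script_url is None or not page_id:
            return None
        return f"{script_url}?curid={page_id}"
    raise ValueError(f"unknown repo type: {repo_type!r}")


print(short_url_for_file("local", script_url="https://localwiki.example/w/index.php", page_id=10))
print(short_url_for_file(
    "foreign-api",
    remote_imageinfo={"descriptionShortUrl": "https://remote.example/w/index.php?curid=99"},
))
```

Returning None when the page ID is missing reflects the caveat from the task description that shared-DB repositories are not strictly required to have file pages.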

Change 262415 had a related patch set uploaded (by Victorbarbu):
[WIP]Provide short URL to file description page in imageinfo API

https://gerrit.wikimedia.org/r/262415

Change 262415 merged by jenkins-bot:
Provide short URL to file description page in imageinfo API

https://gerrit.wikimedia.org/r/262415

Krinkle added a subscriber: Krinkle.

Marking as resolved. T119686 was fixed as well.