Page MenuHomePhabricator

ApiQueryImageInfo is crufty, needs rewrite
Open, MediumPublic

Description

The code is a mess, the limit semantics make no sense, and we have several other options that don't really fit non-images.

The best thing to do here is probably to just write a prop=fileinfo module from scratch so we don't have to worry about backwards compatibility, and then deprecate prop=imageinfo.

Current plans:

  • Right now, iilimit specifies the max number of revisions to return per file, which is inconsistent with the rest of the API and isn't particularly sane. For fileinfo, filimits will limit the number of file-info-objects returned per result, and a separate "fioldversions" property (default 0, values integers or 'all') will specify the max number of revisions to be returned per file.
  • fistart/fiend may result in the info for the current revision not being returned.
  • iiprops has three different metadata properties. There really should be only one, and if possible it should be key-value pairs rather than a list of objects with key and value properties.
    • Metadata needs to be separately continuable, see T86611.
  • Figure out something sane to replace iiurlwidth/iiurlheight/iiurlparam. Maybe multi-valued fiparams?
  • prop=stashimageinfo is very odd, it's a prop module but doesn't use any titles. It would make sense to me for prop=fileinfo to have a fifilekeys parameter instead of having a whole separate module for this.
  • prop=videoinfo really isn't needed either. Instead we should make it possible for extensions to add additional info to the fileinfo response.

Related Objects

StatusSubtypeAssignedTask
OpenNone
OpenNone
ResolvedJdlrobson
OpenFeatureNone
OpenFeatureNone
OpenNone
Resolvedcscott
Duplicatecscott
DeclinedFeatureNone
OpenFeatureNone
OpenNone
OpenNone
OpenNone
ResolvedUmherirrender
DuplicateNone
OpenNone
DeclinedNone
OpenBUG REPORTNone
OpenNone
OpenFeatureNone
OpenFeatureNone
OpenNone
OpenNone
DeclinedNone
OpenNone
OpenNone
OpenNone

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

iiprops has three different metadata properties. There really should be only one, and if possible it should be key-value pairs rather than a list of objects with key and value properties.

metadata and commonsmetadata are based on file contents and thus a good fit for the imageinfo API. They are not very different and should be merged (maybe with a separate array similar to the one shown for title normalizations to tell how they relate to the original metadata keys?).

extemtadata is based on the file page (well it's based on a hook and the only actual implementation of that hook is based on the file page) and I'm not sure it makes sense to put it into imageinfo which is about file revisions. It is also more widely applicable than the other two (metadata contains relatively obscure things like camera type or exposure time which are only relevant when displaying detailed information about the image; CommonsMetadata contains stuff like author and description which is needed by most interfaces which show or list images) and much smaller so it definitely should not be merged.

Note that metadata can be multivalued, and extmetadata should be multivalued in non-trivial ways (groups of keys should be able to have multiple values), although that's not currently implemented.

prop=stashimageinfo is very odd, it's a prop module but doesn't use any titles. It would make sense to me for prop=fileinfo to have a fifilekeys parameter instead of having a whole separate module for this.

Seems a bit hairy given that lots of information that exists for all other files might be missing for stash images (the title, for one). Also fileinfo is public while stashimageinfo is private and should return different results per user.

In T89971#1055618, @Tgr wrote:

extemtadata is based on the file page (well it's based on a hook and the only actual implementation of that hook is based on the file page) and I'm not sure it makes sense to put it into imageinfo which is about file revisions.

I could certainly see moving it out of imageinfo/fileinfo entirely.

Note that metadata can be multivalued,

Multi-valued metadata seems easy enough:

{
    "key": "value",
    "key2": [ "value1", "value2", "value3" ],
}

prop=stashimageinfo is very odd, it's a prop module but doesn't use any titles. It would make sense to me for prop=fileinfo to have a fifilekeys parameter instead of having a whole separate module for this.

Seems a bit hairy given that lots of information that exists for all other files might be missing for stash images (the title, for one). Also fileinfo is public while stashimageinfo is private and should return different results per user.

A list=stashfileinfo would make more sense than the current prop=stashimageinfo that isn't really a prop module, if we go that route. That may, in fact, be a better idea. But we will need to figure a sane way to structure the code.

Parsoid really needs an abstracted, structured-data version of Linker::makeImageLink() rather than a view of File properties or image table rows. It needs all the information that goes into an image link in one API request:

  • If the width and height are not user-specified, the parameters to File::transform() depend on the source dimensions. If only the height is user-specified, the width parameter to File::transform() depends on File::isVectorized(). Neither of these things is supported by the existing imageinfo interface.
  • The "bug 23834" suppression of client-side upscaling is undesirable. Parsoid needs to do client-side scaling as specified by the MediaHandler and doesn't care what the size of the generated output file is.
  • $wgResponsiveImages should be supported, which requires multiple transform operations per image.
  • Media types are extensible and have extensible transform parameters, via MediaHandler::getParamMap(). So the list of transform parameters (width etc.) cannot be a fixed set. We have iiurlparam but it would be better if it were structured rather than bundled into a handler-specific string intended for naming files.
  • MediaTransformOutput is currently imagined as an HTML fragment generator, but Parsoid really needs its whole contents to be serialized. In ParsoidBatchAPI I proposed MediaTransformOutput::getAPIData() which would return structured data about the transform output. Media types would opt in to this interface since the call would be wrapped in is_callable().
  • Batch requests with title lists are obviously useless unless the transform parameters can be specified per-file.

I imagine a solution for this will be prototyped in ParsoidBatchAPI and then some variation of it will be considered for inclusion in core.

I was wondering if a cacheable imageinfo API would be feasible and make sense. A lot of these images and their metadata are rather static, so we might be able to achieve a decent speed-up if we found a way to leave more of the scaling calculations to clients.

Do you see any obvious flaws in this idea?

You are thinking of caching source size, metadata etc. without scaling parameters? Or would the URL include the thumbnail width and height?

An alternative approach more geared towards caching and client-side URL construction is being discussed in T66214.

An alternative approach more geared towards caching and client-side URL construction is being discussed in T66214.

A RESTy method of fetching thumbnails is good, but doesn't have anything to do with the action API's imageinfo beyond that we might be able to punt on at least some aspects of thumbnails.

@Anomie: I agree that much would stay the same, but we might want to revisit bits directly relevant to the construction of thumb urls to take advantage of a more predictable URL structure and scaling behavior. We actually started with the intention of making imageinfo as-is cacheable in T116840, but then realized that it's still relatively hard for clients to construct thumb urls without passing desired dimensions to the API, which of course clashes with caching.

T66214 doesn't really block this, but if the sort of thing being discussed there happens then one of the more nebulous bits of this (fixing iiurlwidth/iiurlheight/iiurlparam into something multi-valued but still sane) would just go away since we could just point to the "/api/rest_v1/media/img/{original-sha1}" bit and let the client fill in the width/height/page/seek-time/language/etc. when it talks to that endpoint.

But that, of course, depends on (1) someone actually working on T66214 any time soon and (2) the end result actually existing for all MediaWiki installations and not just as an optional bespoke service like too many things seem to be planned as these days.

MER-C subscribed.

Proposing for 2017 Developer Wishlist.

This proposal is selected for the Developer-Wishlist voting round and will be added to a MediaWiki page very soon. To the subscribers, or proposer of this task: please help modify the task description: add a brief summary (10-12 lines) of the problem that this proposal raises, topics discussed in the comments, and a proposed solution (if there is any yet). Remember to add a header with a title "Description," to your content. Please do so before February 5th, 12:00 pm UTC.