The [imageinfo](https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bimageinfo) end point is one of the most popular end points in the PHP API, with around 1100 requests per second. While the information it returns is fairly static and cacheable, there is currently no support for caching. With a cacheable imageinfo end point we should be able to significantly lower response latencies for imageinfo requests, while also reducing the load on the PHP API cluster.
The basic requirements for a cacheable end point are:
- deterministic URL to enable active (Varnish) purging
- little or no parametrization of the information contained in the response, to reduce cache fragmentation
There might be others, and we are soliciting input from the main consumers of the imageinfo end point to discover these.
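
To illustrate the first requirement, here is a minimal Python sketch of a deterministic URL scheme. The base path `https://example.org/api/imageinfo/` and the helper name `canonical_imageinfo_url` are hypothetical and not part of any existing API. The point is only that every spelling of a title maps to a single URL, so one purge request can invalidate it.

```python
from urllib.parse import quote

# Hypothetical base path, used for illustration only.
BASE = "https://example.org/api/imageinfo/"


def canonical_imageinfo_url(title):
    """Map a file title to exactly one cache URL."""
    # Treat underscores and runs of whitespace as equivalent, then join
    # the words with single underscores.
    normalized = "_".join(title.replace("_", " ").split())
    # Percent-encode everything outside the unreserved set so the URL
    # has only one possible spelling.
    return BASE + quote(normalized, safe="")


print(canonical_imageinfo_url("File:Albert Einstein Head.jpg"))
```

Because the URL carries no query parameters, there is a single cache entry per file rather than one per parameter combination.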
One major consumer is #parsoid. As Parsoid has fast local-DC connectivity, minor bandwidth savings are probably not critical. As a result, returning more information than strictly required might be okay.
### Possible schema
A [request for all properties](https://en.wikipedia.org/w/api.php?action=query&titles=File:Albert%20Einstein%20Head.jpg&prop=imageinfo&iiprop=timestamp|user|userid|comment|parsedcomment|canonicaltitle|url|size|dimensions|sha1|mime|thumbmime|mediatype|metadata|commonmetadata|extmetadata|archivename|bitdepth|uploadwarning) results in about 13k of uncompressed JSON. The main bulky properties are:
- `extmetadata`, especially the `Credit` and `Permission` sub-fields. This metadata is HTML formatted, which greatly increases its size.
- `html`, the HTML returned by the `uploadwarning` property. This is documented as internal-only, so it should probably be excluded in any case.
With these two properties removed, [the response shrinks to a very reasonable size](https://en.wikipedia.org/w/api.php?action=query&titles=File:Albert%20Einstein%20Head.jpg&prop=imageinfo&iiprop=timestamp|user|userid|comment|parsedcomment|url|size|dimensions|sha1|mime|thumbmime|mediatype|metadata|commonmetadata|archivename|bitdepth|canonicaltitle). To make a decision on `extmetadata`, we should look into how this is currently used by consumers. Depending on the actual use cases it might be desirable to either expose this HTML-formatted metadata in a separate API end point, or include it as plain data.
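
As a rough sketch of the reduced property set, the following Python snippet builds the existing Action API request with `extmetadata` and `uploadwarning` filtered out. The property names and query parameters come from the requests linked above. The names `ALL_PROPS`, `BULKY`, `fixed_props`, and `imageinfo_url` are made up for this example.

```python
from urllib.parse import urlencode

# Properties from the "all properties" request linked above.
ALL_PROPS = [
    "timestamp", "user", "userid", "comment", "parsedcomment",
    "canonicaltitle", "url", "size", "dimensions", "sha1", "mime",
    "thumbmime", "mediatype", "metadata", "commonmetadata",
    "extmetadata", "archivename", "bitdepth", "uploadwarning",
]

# The two bulky properties discussed above.
BULKY = {"extmetadata", "uploadwarning"}


def fixed_props():
    """Return the property list with the bulky properties dropped."""
    return [p for p in ALL_PROPS if p not in BULKY]


def imageinfo_url(title):
    """Build an Action API URL requesting the reduced property set."""
    params = {
        "action": "query",
        "format": "json",
        "titles": title,
        "prop": "imageinfo",
        "iiprop": "|".join(fixed_props()),
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)


print(imageinfo_url("File:Albert Einstein Head.jpg"))
```

A cacheable end point would presumably always return this fixed set instead of accepting `iiprop` from the client, which is what keeps cache fragmentation low.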
### Cache invalidation
The caches need to be invalidated on:
- file upload / deletion / rename
- any changes to the file description page
We cover much of this in the RESTBase extension, but going forward we should make sure that the events defined in T116247 support this use case well.
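
The invalidation triggers above could be mapped to purge targets roughly as follows. This is a sketch under stated assumptions: the event type names, the event dictionary shape (`type`, `title`, `new_title`), and the base URL are all hypothetical and do not reflect the event schema in T116247 or the RESTBase extension's actual behavior.

```python
from urllib.parse import quote

# Hypothetical base path for the cacheable end point.
BASE = "https://example.org/api/imageinfo/"

# Hypothetical event type names covering the triggers listed above.
INVALIDATING = {"file_upload", "file_delete", "file_rename", "description_edit"}


def cache_url(title):
    """Deterministic cache URL for a file title (illustrative scheme)."""
    normalized = "_".join(title.replace("_", " ").split())
    return BASE + quote(normalized, safe="")


def urls_to_purge(event):
    """Return the cache URLs that an event should invalidate."""
    if event.get("type") not in INVALIDATING:
        return []
    urls = [cache_url(event["title"])]
    # A rename affects both the old and the new title.
    if event["type"] == "file_rename":
        urls.append(cache_url(event["new_title"]))
    return urls


print(urls_to_purge({"type": "file_rename",
                     "title": "File:Old name.jpg",
                     "new_title": "File:New name.jpg"}))
```

The returned URLs would then be handed to whatever mechanism issues the Varnish purges. That step is left out here because it depends on the deployment.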