[RFC] Treat "virtual" MediaInfo entities as existing pages (or not)
Closed, Invalid (Public)

Description

We currently display a virtual ("dummy") MediaInfo entity when trying to access a non-existing MediaInfo page associated with an existing File page (see T137534).

There are several open questions about how far these "virtual" entities should go. Should we consider such entities to exist (so that only the wiki page is missing)? In particular:

  • should we send status 404 when displaying a virtual entity?
  • should EntityStore resolve valid IDs to virtual entities, even if there is no corresponding wiki page to load data from?
  • should virtual entities be supported by Special:EntityData?
  • should virtual entities be included in dumps?

Note that "virtual" MediaInfo entities are not necessarily empty. Their RDF mapping (and probably also their JSON representation) would contain at least the URI of the corresponding media file (and perhaps URLs of the file page, the file itself, a thumbnail, etc.). Additional information could optionally be included: the media file's MIME type, size, resolution, duration, and other metadata.
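To make the "not necessarily empty" point concrete, here is a minimal sketch of how a virtual entity could be assembled from file information alone. Everything in it is an assumption for illustration: the `FileInfo` class, the `build_virtual_entity` function, the `virtual` flag, and the JSON field names are hypothetical and do not reflect the actual Wikibase or MediaInfo serialization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileInfo:
    """Hypothetical stand-in for what the wiki knows about an uploaded file."""
    title: str
    file_url: str
    mime_type: Optional[str] = None
    size_bytes: Optional[int] = None


def build_virtual_entity(entity_id: str, info: FileInfo) -> dict:
    """Build an illustrative JSON-like dict for an entity that has no stored page.

    The mandatory part is the link to the media file; other metadata is
    included only when it is known.
    """
    entity = {
        "id": entity_id,
        "type": "mediainfo",
        "virtual": True,  # assumed marker, not part of any real format
        "file": {"title": info.title, "url": info.file_url},
    }
    optional = {"mimeType": info.mime_type, "size": info.size_bytes}
    # Add only the optional fields that actually have a value.
    entity["file"].update({k: v for k, v in optional.items() if v is not None})
    return entity
```

For example, calling `build_virtual_entity("M123", FileInfo("File:Example.jpg", "https://example.org/Example.jpg"))` yields a dict that carries the ID-to-file association even though nothing was ever saved for that entity. The ID "M123" and the URL are placeholders.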

Also keep in mind that in the final product, the MediaInfo content would not live on a separate page, but in a "slot" of the file page itself, see T107595.

Event Timeline

Repeating what I wrote in https://gerrit.wikimedia.org/r/295213:

  • I think sending a "200 OK" status code is wrong. These pages should be indexed by search engines when they exist and contain something. With a 200 the number of pages indexed on Commons will double, but half of these pages will not contain anything useful. Let's please stay with the 404 or use a more suitable status code.
  • I think Special:EntityData should return an empty JSON blob, as if an empty entity exists. Main reason: It must be possible to edit such a non-existing page. We agreed that the UI should allow editing non-existing entities. External tools and bots should behave identically, which means all our getter APIs as well as EntityData should return something you can edit.
  • But not in dumps, for, well, I think obvious reasons. The stuff in dumps is usually not used for editing, but purely for consumption. Having an empty JSON blob in there is not very different from omitting it.
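The three positions above (404 for the page view, an editable result from EntityData, nothing in dumps) can be sketched as one small policy object. This is only an illustration under stated assumptions: `SketchEntityStore` and its methods are hypothetical names, not the Wikibase `EntityStore` interface, and the set of "valid IDs" stands in for whatever check decides that a File page exists.

```python
from typing import Dict, Iterator, Optional, Set, Tuple


class SketchEntityStore:
    """Hypothetical store separating stored entities from virtual ones."""

    def __init__(self, stored: Dict[str, dict], valid_ids: Set[str]):
        self.stored = stored        # entities that have saved content
        self.valid_ids = valid_ids  # IDs whose File page exists (assumed check)

    def lookup(self, entity_id: str) -> Optional[Tuple[dict, bool]]:
        """Return (entity, is_virtual), or None for an invalid ID."""
        if entity_id in self.stored:
            return self.stored[entity_id], False
        if entity_id in self.valid_ids:
            # No saved content: hand back an empty but editable entity.
            return {"id": entity_id, "type": "mediainfo"}, True
        return None

    def page_view_status(self, entity_id: str) -> int:
        """HTTP status for the HTML view: 200 only for stored entities."""
        found = self.lookup(entity_id)
        if found is None or found[1]:
            return 404
        return 200

    def entity_data(self, entity_id: str) -> Optional[dict]:
        """Data endpoint: virtual entities are served so tools can edit them."""
        found = self.lookup(entity_id)
        return None if found is None else found[0]

    def dump(self) -> Iterator[dict]:
        """Dumps contain stored entities only; virtual ones are skipped."""
        for entity_id in sorted(self.stored):
            yield self.stored[entity_id]
```

With one stored entity "M1" and valid IDs {"M1", "M2"}, the page view for "M2" gets a 404 while `entity_data("M2")` still returns an entity that a bot could edit and save, and `dump()` lists only "M1".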

@thiemowmde The JSON would not be empty. It would contain basic information about the file (at least the URI). The association between MediaInfo-Id and File-page name is not available elsewhere. I think that mapping needs to be accessible via SPARQL.

For the question "should virtual entities be supported by Special:EntityData?" @Lydia_Pintscher said YES during our sprint planning meeting today. :)