(ContentHandler) use symbolic names for content models and formats in the database
Closed, Resolved, Public

Description

Instead of numeric IDs, symbolic names should be used to represent the content model and format in the database and throughout the code.

For performance reasons, the respective database fields should be set to NULL when the default model or format is used.
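
The NULL-for-default idea can be sketched roughly as follows. This is an illustration in Python rather than MediaWiki's PHP, and the default names (`wikitext`, `text/x-wiki`) and both helper functions are assumptions made for the example, not identifiers taken from the Wikidata branch.

```python
from typing import Optional

# Hypothetical defaults; this task does not state the actual default names.
DEFAULT_MODEL = "wikitext"
DEFAULT_FORMAT = "text/x-wiki"


def to_db_value(name: str, default: str) -> Optional[str]:
    """Return None (stored as SQL NULL) when the name equals the default."""
    return None if name == default else name


def from_db_value(value: Optional[str], default: str) -> str:
    """Map a stored NULL back to the default symbolic name."""
    return default if value is None else value
```

With this shape, rows that use the default model and format carry no string at all, and only non-default rows pay for storing a symbolic name.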


Version: master
Severity: normal

Details

Reference
bz37746

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 12:24 AM
bzimport set Reference to bz37746.

Is that a WikidataRepo or a Core bug?

Perhaps an enum type could be used as a compromise? I know it's not popular, but this seems like the place for it.

(In reply to comment #1)

Is that a WikidataRepo or a Core bug?

It's a Wikidata branch bug. I didn't want to file it as a core bug, and for practical purposes it only affects WikibaseRepo at the moment, so I put it here.

(In reply to comment #2)

Perhaps an enum type could be used as a compromise? I know it's not popular, but
this seems like the place for it.

Since extensions can introduce their own content models and formats, this would become very tricky. Also, afaik enum types are not supported by all databases and would have to be emulated. Painful.

An ENUM would also need to be changed every time something is added.
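
To make the two objections concrete, here is a small sketch using Python's `sqlite3` module. SQLite has no ENUM type, so the enum is emulated with a CHECK constraint. The table names, the listed models, and the `wikibase-item` name standing in for an extension's model are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite has no ENUM type, so it is emulated here with a CHECK constraint.
conn.execute(
    "CREATE TABLE rev_enum ("
    " model TEXT CHECK (model IN ('wikitext', 'javascript', 'css')))"
)
# A plain string column accepts any symbolic name.
conn.execute("CREATE TABLE rev_plain (model TEXT)")


def try_insert(table: str, model: str) -> bool:
    """Insert a model name; return False if the schema rejects it."""
    try:
        conn.execute(f"INSERT INTO {table} (model) VALUES (?)", (model,))
        return True
    except sqlite3.IntegrityError:
        return False


# "wikibase-item" stands in for a model added later by an extension.
enum_ok = try_insert("rev_enum", "wikibase-item")
plain_ok = try_insert("rev_plain", "wikibase-item")
```

The emulated enum rejects the extension's model until the schema itself is altered, while the plain string column needs no change.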

A few choices need to be made:

  1. How large is the space (tens, hundreds, thousands)? I don't know.
  1. Should an existing scheme such as MIME or OIDs be used? What are the options here?
  1. Is there any kind of hierarchy? What are the relationships between types?
  1. What should the registration model be?

For example, registering OIDs can be free (under 1.3.6.1.4.1) and is hierarchical. IANA registers simple integer values for some standards. Using UUIDs gives anybody the opportunity to generate a value themselves, and there is still a pretty good chance it will be unique (I just generated 470b7557-bb8a-11e1-be32-001b77bca544 for fun).

MIME is used by email.
OIDs are used by protocols like LDAP, SNMP and everything using ASN.1 (like many security protocols).
UUIDs are used to specify interfaces in some protocols.
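
As a small illustration of the UUID option, the value quoted above can be inspected with Python's standard `uuid` module, and a new one can be generated locally without any registry. The variable names are only for this example.

```python
import uuid

# The UUID quoted in the comment above.
quoted = uuid.UUID("470b7557-bb8a-11e1-be32-001b77bca544")

# The version field says how the value was generated (1 means time-based).
quoted_version = quoted.version

# Anyone can mint a new random identifier locally, with no registry involved.
fresh = uuid.uuid4()
```

The trade-off is the one raised in this thread: such identifiers are easy to create but carry no human-readable meaning.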

(In reply to comment #5)

  1. Using some existing scheme like MIME? OIDs? What are the options here?

I definitely want something human readable. For the format, I'll go for MIME types, because they can also be used in HTTP responses, etc.

For the content model, I'm not sure. Maybe I'll just use the name of the class that handles that kind of data.
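
A rough sketch of what these two choices could look like, written in Python rather than PHP. The `WikitextHandler` class, the `StoredContent` record, the `text/x-wiki` format string, and both helper functions are hypothetical names for illustration, not the committed design.

```python
from dataclasses import dataclass


class WikitextHandler:
    """Hypothetical handler class for one kind of content."""


def model_name(handler: object) -> str:
    """Use the handling class's name as the symbolic content model."""
    return type(handler).__name__


@dataclass(frozen=True)
class StoredContent:
    model: str   # symbolic model name, e.g. a handler class name
    format: str  # MIME type describing the serialization
    text: str


def http_headers(content: StoredContent) -> dict:
    """Reuse the stored MIME type directly as the HTTP Content-Type."""
    return {"Content-Type": content.format}


page = StoredContent(model_name(WikitextHandler()), "text/x-wiki", "''hello''")
headers = http_headers(page)
```

Because the format is already a MIME type, no translation table is needed when serving the content over HTTP.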

Committed to the Wikidata branch as 7673587974bb4ef2354443e0de410a28192c0e05, push pending.

Verified in Wikidata demo time for sprint 7