
[Story] Create LilyPond datatype
Closed, Resolved | Public | 13 Estimated Story Points

Description

As an editor interested in music scores I want to enter them in Wikidata using the LilyPond notation in order to share them and display them nicely in Items.

Problem:
Right now we do not have a datatype to store music scores. Editors currently store them in a property with the string datatype: P5482. We should have a new datatype that renders the value as if it were enclosed within the wikitext <score></score>.
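As a rough illustration of the intended rendering, the sketch below wraps a stored value in the <score> tag before it would be handed to the wikitext parser. The helper name `to_score_wikitext` and the sample value are hypothetical and not part of Wikibase; this is a sketch of the idea, not the actual formatter.

```python
def to_score_wikitext(lilypond: str) -> str:
    """Wrap a stored LilyPond string in <score> tags (hypothetical helper)."""
    return f"<score>{lilypond}</score>"


# Sample value chosen for illustration only.
print(to_score_wikitext(r"\relative c' { c d e f }"))
```

A real formatter would probably also need to decide what to do with values that themselves contain a closing </score> tag; that is left out here.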

Example:
list of Items that currently use the existing property

Acceptance criteria:

  • I can make statements with the new datatype. The input is LilyPond notation and the values are rendered as sheet music.
  • Input for the datatype is just a regular old string textbox (nothing fancy)
  • The values in diffs are just shown as the raw text string (same as what currently happens)
  • The new datatype shows up on https://www.wikidata.org/wiki/Special:ListDatatypes
  • Invalid input can be saved but when rendered should show a short version of the full error output, such as "Unable to compile LilyPond input file"
  • The maximum length of values for the datatype should be configurable via a config setting
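
The last two criteria could be sketched roughly as follows. This is only an illustration in Python, not Wikibase code: the names `NotationSettings`, `max_length`, `validate_length` and `short_error` are hypothetical, and the default of 400 is an assumption based on the string limit mentioned in the discussion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NotationSettings:
    # Hypothetical setting; 400 is an assumed default, not a confirmed value.
    max_length: int = 400


def validate_length(value: str, settings: NotationSettings) -> List[str]:
    """Return a list of problems; an empty list means the value may be saved."""
    problems = []
    if len(value) > settings.max_length:
        problems.append(
            f"Value is {len(value)} characters long; the limit is {settings.max_length}."
        )
    return problems


def short_error(
    full_output: str,
    fallback: str = "Unable to compile LilyPond input file",
) -> str:
    """Reduce the full compiler output to its first non-empty line."""
    for line in full_output.splitlines():
        if line.strip():
            return line.strip()
    return fallback
```

Note that only the length check blocks saving here. Whether the LilyPond input compiles is not checked at save time, which matches the criterion that invalid input can be saved and only shows a short error when rendered.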

Open questions:

  • Can we migrate the existing property to the new datatype?
    • Probably, should be separate ticket?
  • What should the datatype be called?
    • Musical notation
  • Should multi-line values be allowed?
    • Probably not initially
  • Is LilyPond a good choice for this datatype, given the existence of other music notation formats?
    • Yes, because it is already supported in Wikimedia projects
  • Would it be beneficial to have more than one music notation datatype?
    • Possibly, but the benefit does not seem large enough at this point to justify the effort

Useful information:

Event Timeline

Ideas regarding the open questions in the task description:

  • Yes. As a matter of fact, since the datatype opens up the possibility for more specific properties to be defined, values of this property could be migrated to those specific properties and the original property could be deleted afterwards.
  • We could call it 'LilyPond musical notation', to make it clear that it's not Abc or MusicXML or something else to someone who might assume so for just 'musical notation'.
  • If the 400-character limit for strings is still in place for this property, then I think multi-line values may not be necessary. (My initial thoughts on the datatype have this limit in mind.) Otherwise, if the limit can be increased to something larger, then it may be useful to have multi-line values.
Addshore set the point value for this task to 13.
Addshore updated the task description.

Considering the Commons RfC which occurred after this task was opened (see T208494: Allow music scores to be uploaded to Wikimedia Commons):

  • Is there a good reason to favour LilyPond over other music notation formats? LilyPond is presently the only notation software supported by MediaWiki, but it may be worth considering other formats. The last few comments in T208494 might be helpful. (For what it's worth, almost all of the LilyPond notation statements in Wikidata were added by me through my semi-automated creation of items for musical chords.)
  • Would it be useful to support more than one musical notation format in Wikibase/Wikidata?
  • If/when it becomes possible to store musical notation files on Commons, would the property still be preferable to linking to Commons files?
  • In my use of the property (e.g. Q58233594), I've used a lot of formatting, in part to match existing Commons files (e.g. File:Chord D♭7.svg). Would it be desirable to encourage this? Would the property need coding conventions?
Addshore renamed this task from Create LilyPond datatype to [Story] Create LilyPond datatype. Feb 5 2019, 4:11 PM

Considering the Commons RfC which occurred after this task was opened (see T208494: Allow music scores to be uploaded to Wikimedia Commons):

  • Is there a good reason to favour LilyPond over other music notation formats? LilyPond is presently the only notation software supported by MediaWiki, but it may be worth considering other formats. The last few comments in T208494 might be helpful. (For what it's worth, almost all of the LilyPond notation statements in Wikidata were added by me through my semi-automated creation of items for musical chords.)

This is the one that can be reused inside the Wikimedia projects and that I have seen outside it as well. Any new format would need a lot of additional development time to support, and I'm currently not really convinced that is worth it.

  • Would it be useful to support more than one musical notation format in Wikibase/Wikidata?

Same as above. Supporting anything in addition to what MediaWiki already supports costs a lot more resources than I think is currently warranted. It would also be a significant effort to support more than one format in the same datatype, and it would make things harder for re-users of the data because they'd need to be able to deal with different notations.

  • If/when it becomes possible to store musical notation files on Commons, would the property still be preferable to linking to Commons files?

There are several considerations to take into account when deciding where to put what, including at least licensing, length, and ease of access. So I think it depends on which is better for a particular case.

  • In my use of the property (e.g. Q58233594), I've used a lot of formatting, in part to match existing Commons files (e.g. File:Chord D♭7.svg). Would it be desirable to encourage this? Would the property need coding conventions?

I'll have a look. Thanks for the link.

Ideas regarding the open questions in the task description:

  • Yes. As a matter of fact, since the datatype opens up the possibility for more specific properties to be defined, values of this property could be migrated to those specific properties and the original property could be deleted afterwards.

I was wondering whether we can/want to do an automatic conversion in the database, like when we converted some string properties to external identifiers in the past. Apparently we can do that conversion.

  • We could call it 'LilyPond musical notation', to make it clear that it's not Abc or MusicXML or something else to someone who might assume so for just 'musical notation'.

Hmmm for mathematical formulas we also have a specific notation but call it mathematical formula. https://www.wikidata.org/wiki/Special:ListDatatypes

  • If the 400-character limit for strings is still in place for this property, then I think multi-line values may not be necessary. (My initial thoughts on the datatype have this limit in mind.) Otherwise, if the limit can be increased to something larger, then it may be useful to have multi-line values.

Ok. We will look at multiline support later then. So far we don't have any datatypes that take multiline values, so that might be more difficult.

  • Is there a good reason to favour LilyPond over other music notation formats? LilyPond is presently the only notation software supported by MediaWiki, but it may be worth considering other formats. The last few comments in T208494 might be helpful. (For what it's worth, almost all of the LilyPond notation statements in Wikidata were added by me through my semi-automated creation of items for musical chords.)

This is the one that can be reused inside the Wikimedia projects and that I have seen outside it as well. Any new format would need a lot of additional development time to support, and I'm currently not really convinced that is worth it.

Most of the (few) concerns raised about using LilyPond are largely related to the file format specification being based on the code, rather than the code being based on a specification. I think this wouldn't be a practical concern for Wikidata, since breaking changes don't occur very often (the last stable version of LilyPond was released almost five years ago); and given the length limits, LilyPond, Braille and ABC are the only formats which could realistically be used in statements anyway (MEI, MuseScore and MusicXML are all XML-based). Concerns were also raised about the lack of an exporter from LilyPond to MusicXML.

On the other hand, if it's exceedingly unlikely that anyone will put in the work to create new extensions, then the scope of T208494 should probably be considerably reduced to only allowing Commons uploads of the formats, since thumbnail generation would effectively require four new extensions (as well as work to enable thumbnails for LilyPond files). If that occurs, it would probably affect this task, since it would become possible to upload music notation files to Commons much sooner. It would be possible to instead indefinitely postpone enabling all formats other than LilyPond on Commons, but this would go against the consensus of the Commons RfC.

Ideas regarding the open questions in the task description:

  • Yes. As a matter of fact, since the datatype opens up the possibility for more specific properties to be defined, values of this property could be migrated to those specific properties and the original property could be deleted afterwards.

I was wondering whether we can/want to do an automatic conversion in the database, like when we converted some string properties to external identifiers in the past. Apparently we can do that conversion.

On the other hand, if we’re already thinking about migrating to several more specific properties (as opposed to just one, a drop-in replacement), then I’d be tempted to keep it simple and not do the in-place conversion (though I agree that, technically, it’s likely possible).

All current sub-tasks of this story are in peer review now. Not sure if we will keep the story for later sub-tasks regarding the remaining questions on this. Each one of them sounds like a separate story to me.

We can't test this yet because it is not enabled on beta yet. Will close it once we can test it there.