
PDF export should render music scores natively, i.e. not as images
Open, Low, Public

Description

Currently, PDF exports of pages using <score>s (for example <https://de.wikipedia.org/w/index.php?title=Benutzer:Reiner_Stoppok/Tanzlied_aus_Poniky&oldid=165757690>, from another task) embed them as PNGs.

This makes no sense. If one were to transcribe a PDF sheet music file (e.g. File:Mozart. Zerlina`s aria from Don Giovanni in Ukrainian.pdf), the result would be worse than the original: the lyrics and other text would not be copyable or searchable, and the well-known shortcomings of printing raster graphics would apply. Transcribing good-quality scans of well-preserved sources is not very productive for end users either (a raster image versus a raster image), even though there are benefits for advanced users.

Basically, this makes our pages with scores poorly reusable for readers, whom we cannot expect to be able to extract the Lilypond (or ABC) code and generate a PDF externally.

Furthermore, generating PDFs is what Lilypond does out of the box; it is basically its main functionality. (I do not know about ABC.)

I don't know how difficult it is to combine Lilypond's rendering with our PDF export (be it OCG or Electron), but I think it should be doable, at least for scores that are not used inline. Small examples of a couple of measures or less, which might be used inline in sentences, are a smaller concern.

Perhaps one way to improve the situation is related to T49578, but this is basically a separate issue, as a PDF-specific approach would be better.

Event Timeline

Base created this task. Nov 25 2017, 8:21 AM
Restricted Application added a subscriber: Aklapper. Nov 25 2017, 8:21 AM
Base renamed this task from "PDF export should not render music scores natively, i.e. not as images" to "PDF export should render music scores natively, i.e. not as images". Nov 25 2017, 8:22 AM
Ebe123 triaged this task as Low priority. Nov 28 2017, 11:33 PM
Ebe123 added a subscriber: Ebe123.

There seem to be a few points in this task:

  1. Higher quality prints
  2. Embed the scores in exported PDFs
  3. PDF export of sheet music
  4. Text search support

For the first point, T49578: Score should output SVG is the solution. There is no native format for sheet music; rather, the format should be appropriate for the medium. Here, it's the web, where SVG is more suitable than PDF. This will be resolved soon enough.

When embedding the sheet music in a PDF, the file type it was taken from should not have an effect, i.e. if text was selectable in an SVG beforehand, it would remain selectable in the PDF. Is there evidence to the contrary?

For PDF rendering to work, the code has to be re-typeset using a different backend, effectively doubling the render. If we were to do so, how the file is accessed would need some discussion, but with T114757: Remove "midi" option, it could be included in a hypothetical tooltip along with MIDI.
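To illustrate what re-typesetting with a different backend would mean in practice, here is a minimal sketch in Python. It only builds the command lines and runs nothing. It assumes the stock `lilypond` command-line tool with its `-dbackend=svg`, `--pdf` and `-o` options; the function name `lilypond_commands` and the file names are hypothetical, and this is not how the Score extension actually invokes LilyPond.

```python
from pathlib import Path
from typing import Dict, List


def lilypond_commands(source: Path, out_dir: Path) -> Dict[str, List[str]]:
    """Build one LilyPond invocation per target format (nothing is executed)."""
    # LilyPond appends the file extension to the output base name itself.
    stem = str(out_dir / source.stem)
    return {
        # Web view: SVG backend, as proposed in T49578.
        "svg": ["lilypond", "-dbackend=svg", "-o", stem, str(source)],
        # PDF export: LilyPond's PDF output.
        "pdf": ["lilypond", "--pdf", "-o", stem, str(source)],
    }


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for fmt, argv in lilypond_commands(Path("score.ly"), Path("out")).items():
        print(fmt, " ".join(argv))
```

The same source is typeset once per format, which is the "doubling" referred to above.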

Text in SVGs cannot be selected on any page, irrespective of whether it is music. This is because they are embedded with the <img> tag. Even if it were selectable, the selected text and objects would be a mess (just look at your PDF as an example).

Base added a comment. Nov 30 2017, 11:14 AM

First of all, I want to make sure we are on the same page. This task is about the "Download as PDF" sidebar option: currently Special:ElectronPdf; before a recent update it was a different special page. Is this what you are talking about as well? It feels like you are talking about a different thing, which makes it difficult to address your points.

I will try a little bit though.

> For the first point, T49578: Score should output SVG is the solution. There is no native format for sheet music; rather, the format should be appropriate for the medium. Here, it's the web, where SVG is more suitable than PDF. This will be resolved soon enough.

For the online web version this is of course true, as it is for the print view of that version.

> When embedding the sheet music in a PDF, the file type it was taken from should not have an effect, i.e. if text was selectable in an SVG beforehand, it would remain selectable in the PDF. Is there evidence to the contrary?

I cannot say I agree with this. PDF is another medium. As per your first point, we should use formats appropriate for the medium. In PDF we lose some features; for example, there is no way to follow links to Commons to get the attribution (at least it cannot be taken for granted that there is a way), so it must be provided in another form. Not to mention that interactive Graphs and Maps cannot work there. On the other hand, I do not see why we cannot provide some benefits of the PDF medium that are inaccessible in the browser medium because of the latter's technical limitations.

> For PDF rendering to work, the code has to be re-typeset using a different backend, effectively doubling the render.

Basically yes to the first part. As far as I understand it, there should be two backends in play: the normal PDF renderer for most of the content, and, for the score content, Lilypond's native renderer (and ABC's for ABC code), which should generate its PDF from the source code rather than from the PNG. Then both must be blended together. I do not know much about PDF, but this sounds like the trickiest part, so I am proposing that this perhaps be done only where a <score> takes up its own paragraph: it sounds easier to insert that pre-generated piece of the page between two vertical boundaries of the file generated by the main renderer than to try to fit a Lilypond/ABC-generated block into 2D space.

It is not doubling the render though, as each renderer will take care of a different content type, just as we have (or can have) different renderers for TIFF and SVG thumbnailing in the normal web view, each taking care of its assigned content. Doubling would occur if both renderers were to render the same things. It seems that the main PDF renderer uses rendered HTML as its input. That is fine, but it should recognise which parts of it are, in this case, of score type and send them to be processed by the respective renderer, as I mentioned. Well, more or less; I am not a software architect.
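A rough sketch of the dispatch idea described above, in Python, just to make it concrete. Everything here is hypothetical: the `Block` type, the two stand-in renderer functions and `export_page` are illustrative names, and the renderers only return placeholder strings instead of real PDF fragments. It assumes the page has already been split into a vertical sequence of blocks, with each standalone <score> being its own block.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Block:
    kind: str    # "html" for ordinary content, "score" for a standalone <score>
    source: str  # rendered HTML, or the Lilypond/ABC source for a score


def render_html(block: Block) -> str:
    # Stand-in for the main PDF renderer (e.g. Electron).
    return f"[main renderer: {len(block.source)} chars of HTML]"


def render_score(block: Block) -> str:
    # Stand-in for Lilypond (or an ABC tool) producing a vector fragment.
    return f"[score renderer: {len(block.source)} chars of notation source]"


# Each content type is assigned to exactly one renderer.
RENDERERS: Dict[str, Callable[[Block], str]] = {
    "html": render_html,
    "score": render_score,
}


def export_page(blocks: List[Block]) -> List[str]:
    """Render each block with its own renderer, keeping the vertical order."""
    return [RENDERERS[block.kind](block) for block in blocks]


if __name__ == "__main__":
    page = [
        Block("html", "<p>Intro</p>"),
        Block("score", "\\relative c' { c d e f }"),
        Block("html", "<p>Outro</p>"),
    ]
    for fragment in export_page(page):
        print(fragment)
```

Each block is rendered exactly once, by the renderer assigned to its type, which is why this would not amount to rendering the same content twice. How the resulting fragments would actually be merged into one PDF is left open here.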

> If we were to do so, how the file is accessed would need some discussion, but with T114757: Remove "midi" option, it could be included in a hypothetical tooltip along with MIDI.

This sounds like it leaves room for another feature, such as requesting a PDF render of a specific piece of music. I believe this would be good to have too, but it is basically a separate task.

> Even if selectable, the text / objects selected would be a mess (just look at your PDF as an example).

Where is it a mess there? Selecting objects like the notes themselves surely is a mess; they use some characters from a private Unicode block, which of course make little sense. All the textual content there though, like the title, lyrics and dynamics, is perfectly copyable for me, including the text in Ukrainian.
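To show the distinction between the notation glyphs and the real text, here is a small Python sketch that drops private-use characters from copied text. It assumes the notation glyphs fall under the Unicode general category "Co" (private use). The function name `strip_private_use` is hypothetical, and the sample string is made up: its code points are arbitrary private-use examples, not the glyphs Lilypond's font actually uses.

```python
import unicodedata


def strip_private_use(text: str) -> str:
    """Drop characters whose Unicode general category is 'Co' (private use)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Co")


if __name__ == "__main__":
    # Made-up sample: arbitrary private-use code points mixed with lyrics.
    copied = "\ue0a4 \ue0a4 Ой \ue0a4 не ходи"
    print(strip_private_use(copied))
```

The ordinary text, Cyrillic included, passes through unchanged; only the private-use characters are removed.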

Ebe123 added a comment. Dec 1 2017, 7:44 PM

Yes, we are talking about the "Export to PDF" (Electron) feature found on pages.

> I cannot say I agree with this. PDF is another medium. As per your first point, we should use formats appropriate for the medium.

Export to PDF takes HTML content and reproduces it (not very well) in another format. As the SVG standard is integrated into HTML, SVG content could also be converted with the same (bad) accuracy. Embedding another PDF would only complicate matters. Your example of a limitation is flawed, as hyperlinks are possible in PDF. However, hyperlinks for attribution are not used with this extension, as the MIDI option co-opts the linking.

> Where is it a mess there? [...] All the textual content there though like title, lyrics, dynamics is perfectly copyable for me, including the text in Ukrainian.

Is it well copyable though? Trying to select text in either PDF or SVG, the lyrics are quite disorganized and useless to copy, even if possible.

> Basically, this makes our pages with scores poorly reusable for readers, whom we cannot expect to be able to extract the Lilypond (or ABC) code and generate a PDF externally.

So you want the music content to be able to stand alone from the rest of the content? That is what I am presuming. This is why I split this task into 4 points, going from the quality of the prints to the PDF export. It is true that Lilypond natively exports to PDF, and so adding another step to the conversion (Electron) would be pointless.

I do see what you mean though, and it would be nice to achieve. The problem is the very limited use case and the technical difficulty of implementing it. Its use for PDF export is even more limited. This task seems to be more about text selection and exporting music (not necessarily the whole page) to PDF, using our "Export to PDF" as an implementation for that. I have shown why text selection is quite pointless in music, and why Export to PDF is the wrong approach for the use I perceive you as having, based on your examples.

@Base Also, I have no idea how playing sheet music is possible in random PDF files...

PDFs actually do support audio files, like what we do with HTML. However, this is not widely supported. It is also way beside the point that I see here. The point of sheet music PDFs is to perform from them, so audio is not pertinent to consider. Let's see about the effect of T49578: Score should output SVG on PDFs and then take it from there. The main benefit here seems kinda useless though, and @Base is disabled.