
Extension:Score / Lilypond is disabled on all wikis
Open, High, Public

Description

Due to an ongoing security issue, Score/Lilypond have been disabled on Wikimedia wikis for the time being.

This task serves as the public tracking task for this issue.


Multiple security issues were found in Lilypond, the software used to render musical notations in <score> tags. Some have been fixed, but others are still outstanding. The current plan is to move lilypond to a more secure, isolated environment called "Shellbox": T260330: RFC: PHP microservice for containerized shell execution. Once that's done, we plan to re-enable Score.

Event Timeline

Joe reopened subtask Restricted Task as Open. Aug 13 2020, 1:01 PM
wkandek closed subtask Restricted Task as Resolved. Aug 18 2020, 1:55 PM
kaldari added a subscriber: kaldari.

Reopening per T257091

Alejabo lowered the priority of this task from High to Medium. Oct 6 2020, 11:40 PM
Ladsgroup raised the priority of this task from Medium to High. Oct 6 2020, 11:46 PM
Ladsgroup added a subscriber: Ladsgroup.

Do not change priorities without any context.

It has been 3 months since the extension was disabled. When will there be any news?

The disabling of these score tags appears to have some negative interaction with the rest of the page. See this revision:

https://en.wikipedia.org/w/index.php?title=Circle_of_fifths&oldid=981481037#Chromatic_circle

In the section following the (disabled) score tags, there is some math rendering that shows as:

'"UNIQ--postMath-00000001-QINU"'

instead of the proper fancy Z with a 12 after it. See this revision, where the only change is commenting out the score tags entirely:

https://en.wikipedia.org/w/index.php?title=Circle_of_fifths&type=revision&diff=984700416&oldid=981481037

Please change the method of disabling the score tags such that it does not affect the rest of the page.

@patilise - Because of the nature of the problem, we unfortunately can't share many details right now. We consider this issue high priority and are actively working to resolve it, but the problem has turned out to be more complicated than initially expected. We are still hoping to be able to re-enable the Score extension in safe mode once a couple more bugs are fixed and we have completed a security audit of Score and Lilypond. Unfortunately, some features will not work under safe mode, so even that will only be a partial solution. (Some of the features that are disabled under safe mode are listed in the description of T174413.)

[My personal take, which is only an educated guess, is that it may be a while before Score is back up and running. Communities that rely on it may want to consider replacing score notations with images and/or audio files in the meantime. Tim Starling or other developers may have more specific information.]

For the past couple of days, saving any article that calls the score extension has led to the score becoming invisible (without the score itself having been changed). So the workaround of reading pre-rendered files from the cache no longer seems to be working. It is really annoying that the problem is still not solved after such a long wait. There are currently 1,241 pages calling score on en:wikipedia and 748 on de:wikipedia that are in danger of losing the proper display of the scores they contain.

And 2,161 on en:Wikisource. And 1,254 pages marked as waiting for a score to be added. In addition, we've got several books that are stalled while waiting for this update.

And 4275 in frwikisource and 1891 in plwikisource. Wikisources are hit hard by this problem.
In plwikisource we have a group of users focused on music transcription and a few music transcription projects that started a year or two ago. Do we have to lose them?
Any hint about a workaround? A userspace workaround?

Just a few examples of such projects:
https://pl.wikisource.org/wiki/Wiki%C5%BAr%C3%B3d%C5%82a:Wikiprojekt_Moniuszko_2019
https://pl.wikisource.org/wiki/Autor:Ignacy_Komorowski
https://pl.wikisource.org/wiki/Pie%C5%9Bni_Ludu_Polskiego_w_G%C3%B3rnym_Szl%C4%85sku
https://pl.wikisource.org/wiki/%C5%9Apiewnik_dla_dzieci
https://pl.wikisource.org/wiki/Kantyczki_(Miarka)
https://pl.wikisource.org/wiki/Indeks:%C5%9Apiewnik_ko%C5%9Bcielny_katolicki_(T._Flasza,_1930).djvu

Hi,

I'm one among a presumably significant group of people around the world trying to learn music during isolation, and I've come across this issue in Wikipedia. I wonder what the status is, since there don't seem to have been many updates lately.

From my limited technical knowledge about this, I am thinking that if the LilyPond security problem is very difficult to overcome, anyone with access to all the Wikipedia source (understood as the source anyone can read on the edit page of each article) could run lilypond on an offline machine with no sensitive info and generate an image for each <score> tag, and another simple script could then replace each <score> tag with an appropriate <img> tag. A special attribute could be added so all these cases can easily be reverted to something else if the <score> issue is solved in the future.
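For illustration, the substitution step of this proposal could look roughly like the sketch below. It is only a sketch under assumptions, not anything deployed on WMF wikis: the `SCORE-FALLBACK` marker, the `Score-<hash>.png` file naming, and the function names are all hypothetical, and a wikitext `[[File:...]]` link stands in for the `<img>` tag mentioned above. The original tag is kept inside an HTML comment so the change can be reverted later. Rendering the collected sources with lilypond and uploading the images is left out.

```python
import hashlib
import re

# Matches <score ...>...</score>, including bodies that span several lines.
SCORE_RE = re.compile(r"<score\b[^>]*>(.*?)</score>", re.DOTALL | re.IGNORECASE)

# Hypothetical marker used to find the replaced tags again later.
MARKER = "SCORE-FALLBACK"


def image_name(body: str) -> str:
    """Derive a stable (hypothetical) file name from the LilyPond source."""
    digest = hashlib.sha1(body.encode("utf-8")).hexdigest()[:12]
    return f"Score-{digest}.png"


def replace_scores(wikitext: str):
    """Swap each <score> tag for a file link, keeping the tag in a comment.

    Returns the new wikitext and a dict mapping file names to the LilyPond
    source that would have to be rendered offline.
    """
    to_render = {}

    def swap(match):
        original = match.group(0)
        if "-->" in original:
            # Would terminate the HTML comment early; leave this tag alone.
            return original
        name = image_name(match.group(1))
        to_render[name] = match.group(1)
        return f"<!-- {MARKER} {original} -->[[File:{name}]]"

    return SCORE_RE.sub(swap, wikitext), to_render


def restore_scores(wikitext: str) -> str:
    """Undo replace_scores by putting the commented-out tags back."""
    pattern = re.compile(
        r"<!-- " + re.escape(MARKER) + r" (.*?) -->\[\[File:[^\]]*\]\]",
        re.DOTALL,
    )
    return pattern.sub(lambda m: m.group(1), wikitext)
```

Note that a plain-text pass like this only sees literal `<score>` tags in the page source; scores produced through templates or parser functions would not be found this way.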

This is not so simple. In Wikisource we convert music scores from images to make them editable and proofread the music scores. So users need to be able to edit them in some way. How are they expected to edit the resulting images if their role is just converting images into an editable form?

See e.g. this page:
https://en.wikisource.org/wiki/Page%3AChopin_Nocturnes_Op_9_Kistner_First_Edition_1833.djvu/6
transcluded here:
https://en.wikisource.org/wiki/Chopin_Nocturnes_Opus_9/Number_2

The users' tools were taken away from them and I wonder how long they will have to wait to get them back.

So the editable format is lilypond? The second link is an ogg file. Maybe you could move to midi or musicxml as editable format?

Anyway, my proposed solution was just meant as a quick and dirty hack so Wikipedia end users for the time being can at least see the scores that are currently in lilypond. These scores are usually small examples anyway, not full pieces, but they are important for end users trying to learn from Wikipedia. Once the lilypond security issue is solved, it could be reverted back to <score> (the score source could actually be kept as a comment or deactivated somehow). Some scores in Wikipedia are already in image format, so I guess it wouldn't be that bad (for end users).

So the editable format is lilypond?

Yes. And the image should not replace it. The lilypond code should still be available in some way in the newest page version.
So, if it is updated, the image may also be updated.

Making a dirty hack is better than doing nothing. If lilypond cannot be executed on WMF servers, it may be executed on an external server, in a specially prepared, secured, chrooted environment, and the image uploaded/updated by a bot.

But at the moment, I see no easy way to access the wiki-parsed lilypond code. It is not accessible. Consider {{#tag:score|...}} usage.

If lilypond cannot be executed on WMF servers, it may be executed on an external server, in a specially prepared, secured, chrooted environment

This is what T260330: RFC: PHP microservice for containerized shell execution is designed to do. But these things take time.

Assuming the problem is the CVE, Debian believes it was fixed in a more recent version, but the reply on the mailing list doesn't exactly instill me with confidence. I think we do need to decide in relatively short order whether the issues have been addressed and we're going to stick with lilypond, or whether it will never be secure enough for our purposes and we need to start transitioning to something else like MusicXML files.

The current proposed (at least for WMF purposes) solution is:

If lilypond cannot be executed on WMF servers, it may be executed on an external server, in a specially prepared, secured, chrooted environment

This is what T260330: RFC: PHP microservice for containerized shell execution is designed to do. But these things take time.

tstarling, do you know why it currently works so long as the vorbis=1 parameter is removed? Is this intentional or not?

Also, I need to know what "time" means here, because the above fact means that it's possible to restore functionality to the wiki, but at the cost of possibly changing a large number of pages. If it's going to take months for a fix to be in place, that would be reasonable, but not if it's going to be patched next week. Can I have a three-point estimate at least?

You'll have to ask the people working on it. You probably won't get an answer till next week.

It's certainly possible to restore functionality in various ways, but that doesn't mean it's necessarily going to happen just because it might be quicker. I doubt SRE are likely to want to deploy other packages (that are completely unknown at this point) for similar functionality, never mind the code changes needed to the Score extension.

Assuming the problem is the CVE, Debian believes it was fixed in a more recent version, but the reply on the mailing list doesn't exactly instill me with confidence.

From the same thread:

I discussed a plan for rectifying it with Han-Wen, and suggested that we
could contribute funding towards fixing it. However, I was not able to get
approval for funding it. So the task remains open for volunteers to
address. Of course, it is difficult to recruit volunteers when it is a
private security issue.

Pity.

Change 609467 merged by jenkins-bot:
[mediawiki/extensions/Score@master] Add description to config variables

https://gerrit.wikimedia.org/r/609467

So… we're currently waiting for a suitable volunteer to materialize out of thin air to address an issue whose details are not public for security reasons? And in the meantime we have many thousands of broken pages across multiple projects and all we can do is bleed contributors in those areas?

There is some progress being made on various protected tasks, though a number of issues with Lilypond have surfaced that require far more thought and attention before it can safely run within Wikimedia production again, including sandboxing it within the new Shellbox service (T260330). I'm also fairly certain that if any volunteer wished to help out with these efforts, they could comment on this public-facing task and we could likely subscribe them to any relevant protected tasks on a case-by-case basis, after some initial vetting.

There is some progress being made on various protected tasks, …

Oh, good. There's a world of difference between "We're working on it, but it's a lot of work so it'll take time" and "There's nothing we can do, we're just keeping the task open in case a volunteer magically appears at some point." :)

Since the tasks where progress happens are protected it would be very useful to periodically update this (public) task with some kind of status. And some expectation management regarding when we could realistically expect the functionality to return (or any intermediary milestones if relevant) wouldn't go amiss either.

We are building a new dedicated service with special security considerations so we can make this work. It's not trivial and requires lots of engineering time and resources, but lots of progress is being made. Feel free to help with the work related to Shellbox to make it go faster, but keep in mind that it's a huge amount of work (and basically restructures some parts of our current system).

I updated the task description with the current plan. Progress is being made regularly on various subtasks including:

Hey, thanks for y'all's work on this serious issue 💟 . Some of the disclosed vulnerabilities refer to including SVG and PS files with embedded-ps and embedded-svg commands. I doubt any uses of Extension:Score on WMF wikis use this feature, so could Lilypond gain compile settings to disallow, and not even compile the code for, all such insecure commands?

Would it be possible to implement the temporary solutions described at enwiki here and here automatically (have the software ignore the "vorbis=1" and "score=1" attributes)? This would at least solve the issues with content in articles across all wikis while the actual problem gets resolved.
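As a rough illustration of the userspace version of this workaround (a bot or script editing page source, not a change to the extension itself), attributes could be dropped from `<score>` opening tags as sketched below. The function name and the default attribute list are assumptions made for the example; which attributes actually need removing is whatever the enwiki instructions say.

```python
import re

# Opening <score ...> tag: group 1 is the tag name, group 2 the attributes.
OPEN_TAG_RE = re.compile(r"(<score\b)([^>]*)>", re.IGNORECASE)


def drop_attributes(wikitext: str, names=("vorbis",)) -> str:
    """Remove the named attributes from every <score> opening tag."""
    alternation = "|".join(re.escape(name) for name in names)
    # Matches e.g. vorbis="1", vorbis='1' or vorbis=1, with leading whitespace.
    attr_re = re.compile(
        r"\s+(?:" + alternation + r")\s*=\s*(?:\"[^\"]*\"|'[^']*'|[^\s>]+)",
        re.IGNORECASE,
    )

    def clean(match):
        return match.group(1) + attr_re.sub("", match.group(2)) + ">"

    return OPEN_TAG_RE.sub(clean, wikitext)
```

Other attributes and the LilyPond source between the tags are left untouched, so the edit is easy to review in a diff.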

While one would expect such a crucial broken feature to be fixed within days, in the Wikimedia environment even 9 months is not enough, and I am quite sure that it will not be too difficult for the WM tech team to pass even the 1-year mark. The reason can be found at https://lists.gnu.org/archive/html/lilypond-devel/2020-10/msg00092.html , quoting Tim Starling:

I discussed a plan for rectifying it with Han-Wen, and suggested that we could contribute funding towards fixing it. However, I was not able to get approval for funding it.

That said, Wikimedia Italia has some relevant partnerships with music archives under Wikipedia-compatible licenses, and it seems they are interested in trying to help resuscitate Score by funding the development of Shellbox to make it happen.

Is this request for funding still valid?

We have a budget for it now, but nobody to actually do it. But the plan is to re-enable LilyPond execution anyway once Shellbox setup is complete.