Wed, Oct 21
Tue, Oct 20
My idea for detection/prevention of opcache corruption is to use a memory protection key to do essentially what opcache.protect_memory=1 does, but fast enough for it to be always enabled in production.
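A sketch of the mechanism in Python via ctypes (a hypothetical stand-in: the real thing would be C inside opcache, and would use pkey_mprotect()/WRPKRU rather than plain mprotect(), which is the part that makes it fast enough to leave on):

```python
import ctypes
import mmap

PROT_READ, PROT_WRITE = 1, 2  # values from <sys/mman.h> on Linux
libc = ctypes.CDLL(None, use_errno=True)

size = mmap.PAGESIZE
shm = mmap.mmap(-1, size)  # stand-in for the opcache shared memory segment
shm[:6] = b"opcode"
addr = ctypes.addressof(ctypes.c_char.from_buffer(shm))

# What opcache.protect_memory=1 does: write-protect the segment while a
# request runs, so a stray write faults instead of silently corrupting it.
# The pkey idea replaces these syscalls with a userspace register write
# (WRPKRU, via pkey_mprotect/pkey_set), cheap enough to be always enabled.
assert libc.mprotect(ctypes.c_void_p(addr), size, PROT_READ) == 0
assert shm[:6] == b"opcode"  # reads are still allowed
assert libc.mprotect(ctypes.c_void_p(addr), size, PROT_READ | PROT_WRITE) == 0
shm[:6] = b"OPCODE"  # unprotected again for legitimate compilation
```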
We could use Shellbox RPC (plus a cache) to provide the sort key from a different version of PHP/ICU. That would make the T37378 approach more feasible. MediaWiki would write both collations on edit/refreshlinks, and a maintenance script would also be running, writing the same rows.
The way it worked in T37378 was to include cl_collation in the primary key. So the data in the table was duplicated, but with cl_sortkey depending on cl_collation. I think it would have worked, it's just that the ball was dropped during code review. The catch with this is that LinksUpdate will drop rows for collations it doesn't know about. The work of the script would be partially undone by edits between the start of the script execution and the PHP version switch. We would either have to put up with that, or have a special-case hack in LinksUpdate::doIncrementalUpdate(), or PHP would have to be linked against multiple versions of ICU so that MediaWiki can insert rows for all collation versions simultaneously.
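In schema terms, the T37378 approach amounts to something like this (a hypothetical sketch against the categorylinks table, not the actual patch):

```sql
-- Make the collation part of the key, so rows for two collation
-- versions can coexist during a migration: each category membership
-- gets one row per collation, with cl_sortkey computed per collation.
ALTER TABLE categorylinks
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (cl_from, cl_to, cl_collation);
```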
Thu, Oct 15
I should note that I haven't actually tested and confirmed correct streaming behaviour in Guzzle, but I'm pretty sure it will work, based on the code and docs.
Mon, Oct 12
I responded "I tried it but it never worked well", which is not exactly right. I know that different people have different needs and goals for a development environment and I don't mean to criticize Vagrant. I just prefer things to be lightweight and simple. I have a single systemd-nspawn container running all relevant services, which I administer directly. If I want to configure Apache, I just edit the Apache configuration files. That works for me.
Tue, Oct 6
The reason it needs a GuzzleHttp\Client and not a GuzzleHttpRequest is, as I said in the task description, that it needs response streaming. The WIP patch is currently using a GuzzleHttpRequest, but GuzzleHttpRequest finishes reading the response before it returns. It can take a write callback, but that inverts control compared to Guzzle, which lets the caller pull from the response stream. The multipart reader I wrote assumes this pull direction, which is convenient and consistent with PSR-7.
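The two control directions in miniature (hypothetical Python stand-ins, not the real Guzzle or MWHttpRequest APIs):

```python
# Push style, like a write callback: the HTTP client drives the loop
# and hands chunks to the caller as they arrive.
def fetch_push(chunks, on_data):
    for chunk in chunks:
        on_data(chunk)

# Pull style, like a PSR-7 StreamInterface: the caller drives, reading
# exactly as much as it wants, whenever it wants.
class PullStream:
    def __init__(self, chunks):
        self._buf = b"".join(chunks)
        self._pos = 0

    def read(self, n):
        data = self._buf[self._pos:self._pos + n]
        self._pos += n
        return data

# A multipart-style reader is natural over a pull stream: it can read a
# header, decide what to do, then read the body, without turning its own
# logic inside out to fit a callback.
stream = PullStream([b"part1\n", b"part2\n"])
header = stream.read(6)
body = stream.read(6)
```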
Sat, Sep 26
MessageBlobStore was @Catrope's work; he added this feature to rebuildLocalisationCache in 2010, so he can probably explain it better than I can.
Fri, Sep 25
I uploaded the Score changes to give you an idea of what a moderately complex caller looks like in practice. It's lightly tested but seems to work for me.
Thu, Sep 24
Wed, Sep 23
I reviewed it. I'll untag Contributors-Team since I don't think there's anyone there who can help with this. The maintainer column for Scribunto is closer to the actual list of people who can review this kind of change. Once it is merged, I'll tag some people for help with the Debian release. I'll also need to release it to PECL, I'm the only maintainer there.
Sep 23 2020
@Joe says that application logs can go to logstash while php-fpm error logs will go to k8s. I want to say that the simplest way to send application logs to logstash is to use the logToClient option, which serializes log entries and sends them back to MediaWiki. This way, the configuration stays the same, and we don't need to port MediaWiki's custom logstash formatter. You lose log messages generated after the response is sent, but currently that's just temporary directory teardown failures, which probably don't matter. For php-fpm error logs, we can start it with "php-fpm -F 2>&1": logs then go to php-fpm's stderr, which the 2>&1 redirects to stdout for k8s to collect.
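As a pod spec fragment, that assumption looks something like this (container name and context are illustrative):

```yaml
containers:
  - name: mediawiki
    # -F keeps php-fpm in the foreground; 2>&1 folds its stderr into
    # stdout, which the container runtime collects for kubectl logs.
    command: ["sh", "-c", "exec php-fpm -F 2>&1"]
```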
https://pracucci.com/php-on-kubernetes-application-logging-via-unix-pipe.html provides two options for how php-fpm could be set up to get the logs from worker processes.
Sep 18 2020
Sep 17 2020
One possible solution is for Score to drop TMH support. Instead we can just play an MP3 directly with <audio controls>.
The Score extension is broken by this change in multiple ways. It shows a fake player off to the right of the score (T226910), which gives a JS fatal error when clicked. If I set the file option to a fake file and wrap the output in a div, it kind of works, but a 500x350 popup obscuring the document is not appropriate for this use case. Obscuring the score while playing the audio defeats the purpose.
I would accept a contributed review if it looks rigorous. I'm just looking for someone to read the source and comment on potential vulnerabilities. For example:
If the point is just to enable the coroutine module then the proposed change seems unnecessarily complex. As far as I'm concerned, coroutine can be enabled if it passes a security review. If it can't pass a security review, allowing users to enable it via php.ini seems imprudent. If it can pass a security review, then it can be enabled everywhere.
Sep 16 2020
Reproduced locally and confirmed the fix by faking part of the call stack:
This is very similar to T231866. The fix for that bug was committed in December 2019 (https://gerrit.wikimedia.org/r/c/mediawiki/core/+/553170), but it fixed the code in ServiceWiring.php without fixing the duplicated code in rebuildLocalisationCache.php. So I will apply the same fix to rebuildLocalisationCache.php.
I was able to reproduce it by first erasing the gadgets-definition cache:
I tried to reproduce this on mwdeploy1001 but it didn't work. I changed /srv/mediawiki/wikiversions.php and then ran
Sep 14 2020
An open question is what to do about shell pipelines. Currently if you do Shell::command()->unsafeParams('foo|bar') then foo will run under firejail and bar will run unsandboxed. Maybe that's accidental? We could do firejail sh -c "foo|bar" but that doesn't work with NO_EXEC. We could parse the command line and disallow pipelines and lists in BoxedCommand, but that would probably lead to non-portable code as people set up their own shell wrappers. Or we could parse the command line and run each component separately under firejail, but that's the most work, and I'm not sure if anything really needs it. Or we could leave it as it is.
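The "parse and disallow" option is cheap to implement; a sketch in Python (MediaWiki would do the equivalent in PHP) using shlex's punctuation_chars mode to spot shell control operators:

```python
import shlex

SHELL_OPERATOR_CHARS = set("|&;<>()")

def has_shell_operators(cmdline):
    """True if the command line contains pipes, lists or redirections.

    With punctuation_chars=True, shlex emits runs of operator characters
    as their own tokens, so 'foo|bar' tokenizes as foo, |, bar, while a
    quoted 'a|b' stays inside one (quoted) word token.
    """
    lex = shlex.shlex(cmdline, punctuation_chars=True)
    return any(tok and set(tok) <= SHELL_OPERATOR_CHARS for tok in lex)
```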
Sep 10 2020
Regarding restrictions again. I'm not a big fan of Firejail after my recent code review and bug reports, so I'm looking at the restriction options with a view to making them independent of the sandboxing system. Shell::SECCOMP is awkward in this respect since it disables a firejail-specific set of syscalls. It also implies no_new_privs, disabling setuid-root executables, which seems like it should be a separate option.
Note that Shellbox depends on monolog and core will depend on shellbox, so this will make core indirectly depend on monolog. In core at the moment, monolog is optional (although it's a dev dependency). I think loose integration between core's LoggerFactory and monolog is the source of a lot of nuisances, so I'm happy to add it as a core dependency, as part of a plan to more tightly integrate core's logging with monolog in future. Monolog is fairly large (6 kloc) but doesn't have any secondary dependencies.
Sep 9 2020
I figured out a workaround, patch will be up shortly.
One outstanding question is what to do about the restrictions bitfield. In production, firejail will be disabled and restrictions will be ignored. But in other situations, presumably some effort should be made to respect them. Obviously NO_LOCALSETTINGS doesn't make sense, so the new wrapper in MediaWiki will respond to that by adding LocalSettings.php to a blacklist array.
Sep 2 2020
Aug 28 2020
Aug 27 2020
Task description edit:
- Changed the file API again as discussed
- Stopped describing BoxedCommand as a subclass of Command and explained why that doesn't quite work.
- Proposed passing $this to Executor since passing each variable as a separate formal parameter is tedious and harder to maintain. We can just have accessors in BoxedCommand.
- Mentioned the PHP call endpoint.
Aug 26 2020
The solution used in Phabricator, which I found via https://secure.phabricator.com/T11105 , is to redirect stdout and stderr to temporary files.
I can reproduce this in eval.php using echo_333333_stars.php from https://gerrit.wikimedia.org/r/c/mediawiki/core/+/471885/ . The problem is that stream_select() returns 3, with all pipes supposedly ready, but they're not really ready. Reading from stdout will give you as many bytes as you ask for, but reading from stderr blocks forever. If you fully drain stdout, then reading from stderr gives you an empty string instead of blocking.
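For reference, the deadlock-free drain pattern looks like this in Python terms (a sketch of the general technique, not the PHP fix itself): bounded reads on whichever pipes select() reports ready, and never a read that can block waiting for a full buffer.

```python
import os
import select
import subprocess
import sys

# Child mimics echo_333333_stars.php: a large stdout payload, then stderr.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.write('*' * 333333); sys.stderr.write('err\\n')"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)

out, err = b"", b""
streams = {child.stdout: "out", child.stderr: "err"}
while streams:
    ready, _, _ = select.select(list(streams), [], [])
    for pipe in ready:
        # os.read() returns whatever is available, possibly less than
        # requested, so it cannot block on a pipe select() called ready.
        data = os.read(pipe.fileno(), 65536)
        if data:
            if streams[pipe] == "out":
                out += data
            else:
                err += data
        else:
            del streams[pipe]  # EOF on this pipe
child.wait()
```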
Aug 25 2020
Has anyone got an idea for giving the HMAC key to the server without allowing the command to have access to it? Otherwise an attacker can use a command to exfiltrate the key and then spoof requests. If it's not possible, maybe we should think about asymmetric encryption.
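For context, the signing scheme under discussion is the usual request-MAC pattern (a sketch; Shellbox's actual wire format and header names may differ):

```python
import hashlib
import hmac

def sign(key: bytes, body: bytes) -> str:
    # The client attaches this MAC to the request; the server recomputes
    # it over the received body and compares.
    return hmac.new(key, body, hashlib.sha256).hexdigest()

def verify(key: bytes, body: bytes, mac: str) -> bool:
    return hmac.compare_digest(sign(key, body), mac)

# The weakness described above: the scheme is only as strong as the
# key's secrecy. If the sandboxed command can read the key (environment,
# config file, /proc), it can compute valid MACs for arbitrary requests.
```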
Aug 24 2020
Aug 23 2020
Aug 21 2020
I'm reconsidering the layout abstraction I had planned, with inputs/ and outputs/ subdirectories under the working directory. It's simpler to allow slashes in filenames and to allow files to go anywhere under the working directory. So you could have
OK, I'm adding PHP execution to the service.
Aug 20 2020
From what I understand, the service will run inside the sandbox, and eval() can be done in this service with relative safety. I planned on having eval() in there for another reason: I'm considering allowing callers to subclass BoxedCommand. However, @Joe was not a fan of this idea. We won't be providing a full MediaWiki install inside the sandbox, with autoloading, so there is a risk of development and production instances behaving differently if the default in development is local in-process execution.
I was able to reproduce this. It stops with SIGTTOU when it calls tcsetattr() on stdin. gdb identifies the caller:
Aug 19 2020
Aug 18 2020
Assigning to myself since implementation work is underway.
Aug 17 2020
Aug 16 2020
Download of .ly files should not be allowed unless there is a change in security policy from the LilyPond developers. LilyPond files trivially allow arbitrary code execution when they are rendered by end users. Safe mode is supposed to prevent that, but it is not the default and is not recommended in the manual. In conversations relating to the recently discovered safe mode escape vulnerabilities (T258547, T259210, T260225), the developers stated that safe mode escape vulnerabilities are unsurprising, and that they consider only OS-level containerization to be sufficient. But this level of isolation is beyond the abilities of ordinary end users.
Aug 14 2020
MemoryLimit is now a deprecated property. The documentation in stretch and buster recommends using MemoryHigh, which causes processes to be "heavily slowed down and memory is taken away aggressively", and MemoryMax, which invokes the oom-killer.
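The recommended replacement pair in a unit file looks like this (values illustrative):

```ini
[Service]
# MemoryLimit= is deprecated.
MemoryHigh=256M   # soft cap: throttle and reclaim aggressively above this
MemoryMax=300M    # hard cap: the oom-killer is invoked above this
```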
Aug 13 2020
It's simpler to require the table to be in $table. The table name in $join_conds is really just the alias, i.e. the key in the $table array. There's no other place to put the actual table expression besides $table. If everyone just used SelectQueryBuilder, there would be no problem: it doesn't have this ambiguity.
Aug 12 2020
I think it was an idea for solving a fundraising-related problem. They were using $wgRawHtml for delivering PayPal donation forms at the time. The first three commits show that it was a single day of work for me in 2007, but glaring bugs prove that it was never deployed. It can be archived.
Aug 11 2020
Aug 10 2020
Of course, I would also support T258852. I just don't like the idea of a hook that only works with the old hooks system.
How about an option to HookContainer::run() that prevents a hook handler from having service dependencies? I'll submit a WIP patch along these lines.
Aug 6 2020
I was able to confirm the expected behaviour using a script that waits until a specified time before beginning a request.
Aug 5 2020
We still can't announce anything since we're waiting for vendor security releases. Third party sites should leave lilypond execution disabled.
Aug 4 2020
Jul 31 2020
It's disabled again, since I found a new vulnerability.
Jul 30 2020
Jul 29 2020
Ideally I think it would be called restrictions(), following the convention that chaining mutators take the name of the thing they are mutating. But the fact that there are already callers expecting this behaviour makes the proposed patch good enough for me.
I submitted the merge request https://gitlab.com/lilypond/lilypond/-/merge_requests/285 for this.
Jul 28 2020
Since there was no response from netblue30, I sent an email to Reiner Herrmann, as suggested by @Legoktm.
Jul 27 2020
We can leave this open for now to track the upstream issue.
Jul 24 2020
It will probably be re-enabled in safe mode early next week. Hopefully Monday my time (i.e. Sunday night US time). I don't know how long it will take to restore it in unsafe mode, probably another couple of weeks.
Jul 2 2020
Excessive number of returns is really just an example. On https://gerrit.wikimedia.org/r/c/mediawiki/core/+/608993 SonarQube is posting 8 comments per patchset, of which 3 are about excessive returns. None of them is an actual issue requiring action.
Thanks @kostajh. I think I'd rather not see any non-voting comment as an inline comment. For example, PHPCS failures as inline comments might be a useful feature. But if SonarQube is not voting then I don't want to see inline comments from it. I especially don't want to see them duplicated on every patchset. The problem is that inline comments need to be resolved before the patch can be accepted. Sorting through a mass of bot-generated comments to try to find the human ones that still need resolution is not a task I want to add to my workflow.
I'm not kidding about this: it is very annoying, and the number of useful comments I have seen from it is exactly zero. I'll submit a revert of rLTSQcfa9ad30f4d9 ("Remove inline comment safelist") if that is the cause of this.
Jul 1 2020
There is the question of how much we really care about the incompatible browsers, since the recommended solution for them (sending two cookies) is quite intrusive. I ran some queries in Turnilo. As a percentage of non-bot page views over the most recent day:
Testing on Firefox and Chromium shows that it is necessary to set SameSite=None on local session cookies in order for cross-site autologin to work. When a wiki is visited by a user who is logged in centrally but not locally, there is a four step process driven by redirects: central/checkLoggedIn -> local/createSession -> central/validateSession -> local/setCookies. Both browsers agree that a redirect from the central wiki to the local wiki is a cross-site request even though the top-level origin is the local wiki, so Lax cookies are not sent. Patching MW to make the local (and central) session cookies have SameSite=None causes autologin to work in both browsers.
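In header terms, the patched cookies look like this (cookie name illustrative; note that browsers require Secure alongside SameSite=None):

```
Set-Cookie: my_wiki_session=...; Path=/; Secure; HttpOnly; SameSite=None
```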