Page MenuHomePhabricator

Clarify Handler::getValidatedBody error states
Open, MediumPublic

Description

The documentation of Handler::getValidatedBody states:

/**
 * Fetch the validated body
 * @return mixed|null Value returned by the body validator, or null if validate() was
 *  not called yet, validation failed, there was no body, or the body was form data.
 */

To me, it looks like none of these is supposed to happen during normal operations, either because of a request error that is caught soon enough, or because the caller should know whether it makes sense to call getValidatedBody (e.g., based on the content type). That is, the most common use case of accessing the body in the handler's execute() method should always be successful (again, as long as the preconditions on content type etc are met). In turn, this leads to handlers having to account for a "ghost" null in awkward ways. Some examples from a quick codesearch:

  • Wikibase and other places use $jsonBody = $this->getValidatedBody() ?? [];, and then accesses the array elements without further checks (hence, it's clear that the null coalesce is only there to satisfy static analysis)
  • Core uses '@phan-var array $body'; to suppress phan issues
  • Other places, such as CheckUser, have 20 lines of code dedicated to handling the null case
  • And most other places have no null checks at all. Presumably this is because until r935079, merged a few days ago, phan would not treat the return type of mixed as nullable, and hence it would not complain about code assuming that the return value is an array. But it does complain now, which means we're going to see the boilerplate above propagating to more and more places.

IMO, it would be really nice to avoid this, by making getValidatedBody throw an (unchecked) exception when the validated body is null (akin to getValidatedParams). Note, this implies making sure that getValidatedBody won't return null under normal circumstances. While to me it looks like this should be the case, T344901 seems to tell an entirely different story, where the content type validation is skipped altogether when using a form-like content type. This should be addressed in the REST framework itself, as we obviously can't copy-paste the same 20 lines of boilerplate to every handler in existence.

At any rate, it's clear that the current situation, where the method's return value is documented to be nullable but everyone just pretends it isn't, is far from being ideal.

Event Timeline

Wikibase switched to @phan-var because it turns out the null case is actually unreachable in the context of that codebase: https://gerrit.wikimedia.org/r/1004666. A few more comments that explain the why can be found in https://gerrit.wikimedia.org/r/1004634.

Body validation is a bit of a mess (@daniel has been working on how a v2 could look). In the short term maybe just revert the change? I think the proper fix would be to add a return type with @method to each handler, but with body validation going to be rewritten I don't think it's worth all the effort it would take.

You mention @method. Is this a mistake? The correct @return types (esp. that JsonBodyValidator always returns an array) have been added in https://gerrit.wikimedia.org/r/935079. But because of the way these classes call each other purely via their interfaces there is no way Phan can understand that in its entirety.

As said in a comment in https://gerrit.wikimedia.org/r/935079 the issue existed long before that patch. The patch just made it visible. Personally I'm really not a fan of hiding known issues. On the other hand, removing the |null is arguably not making the code worse (null is still part of mixed, as well as mentioned in the comment). All it does is silencing that extra-suspicious Phan rule, which might be ok.

Once T358560 is done, we can probably make getValidatedBody always return an array when called on a request that may have a body. When called on a GET or HEAD request it would throw.