Page MenuHomePhabricator

Discourage use of is_null() with PHPCS
Closed, ResolvedPublic

Description

It seems not currently explicitly covered by https://www.mediawiki.org/wiki/Manual:Coding_conventions/PHP, but implicitly by example, at https://www.mediawiki.org/wiki/Manual:Coding_conventions/PHP#isset()

Do not use isset() to test for null. Using isset() in this situation could introduce errors by hiding misspelled variable names. Instead, use $var === null.

However, I've seen people point out in code review about prefering === null over is_null(). Assuming that believe is shared, we should consider enforcing and auto-fixing with PHCPS.

Event Timeline

Krinkle created this task.Jul 17 2018, 7:00 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 17 2018, 7:00 AM

Change 446271 had a related patch set uploaded (by Prtksxna; owner: Prtksxna):
[mediawiki/tools/codesniffer@master] Forbid usage of is_null()

https://gerrit.wikimedia.org/r/446271

How do we do this? Make fixes everywhere first?

Krinkle added a comment.EditedJul 18 2018, 6:43 AM

@Prtksxna The mediawiki-codesniffer rule can be changed without worry of breaking stuff or needing to update everywhere first.

We use project-local composer.json and package.json. This means that when a new version of the sniffer package is released, nothing happens at first (which is good). Individual projects can update their codesniffer version at a time that is convenient for them.

When individual projects decide to update their version and find issues, they can update the code during that commit as well, or they can disable the rule in phpcs.xml as part of the commit that increments the codesniffer version (and maybe fix it at a later time, if it is not a high priority).

This process is semi-automated by https://www.mediawiki.org/wiki/Libraryupgrader.

Guys, I honestly don't think this is worth anybodies time. Yes, isset is problematic because of it's hidden semantics. But what is problematic about is_null? Is it not readable? Does it not do what it says it does? What is the gain win by actively forbidding it?

I can even recall situations where is_null makes code actually better readable than === null.

Please don't disallow is_null.

@thiemowmde I haven't portrayed is_null() itself as being problematic. The problem is with tolerating two virtually identical ways to do the same thing.

I think there are many cases where style cannot be programmatically enforced, such as the grouping of related statements and separating of related blocks with an empty line. That is a matter of opinion and sense based on the current context. The same for things like using an inline expression versus assigning everything to a variable and evaluating only variables. I don't see any particular value in trying to standardise that.

I can even recall situations where is_null makes code actually better readable than === null.

Coding style is in my opinion about reducing cognitive overhead required to read and understand code, by eliminating suspicious patterns and by avoiding multiple ways to perform the same operation that don't look the same.

I struggle to understand how either of is_null() or === null could be clearer than the other, but please provide examples. We should strive to eliminate one of them.

I struggle to understand how either of is_null() or === null could be clearer than the other […]

Exactly that is the reason why I believe it's pointless and even counterproductive to make our CI punch people for using one of them, but not the other.

hashar added a subscriber: hashar.Jul 18 2018, 9:24 AM

Surely empty() and isset() are problematic, since they have edge cases which are counter intuitive. We have been banning the use of empty() entirely and isset() is still some time the cause of bugs.

is_null() and === null are strictly identical code wise (or at least I can't find a different use case). For a human, I don't think they carry any ambiguity. I would rather not enforce one or the other, at this point that sounds too pedantic to me and causing developers to whine about the style being way too strict for no benefit code wise.

But that is really just my 2 cents.

is_null can make code more readable in cases where an extra pair of lose parenthesis would be needed anyway to make sure the operator precedence does not conflict with the intent, or when it simply helps making the intent more clear.

I know, the following example is bad coding style. Please bear with me. It's just to illustrate the point:

$result = is_null( $condition ? $a : $b ) ? 'yes' : 'no';
$result = ( $condition ? $a : $b ) === null ? 'yes' : 'no';

If I would need to pick one of these, I would pick the first.

[…] eliminating suspicious patterns […]

Sure, but if I understood correctly you don't think is_null is "suspicious".

Krinkle added a comment.EditedJul 19 2018, 3:27 AM

[…] eliminating suspicious patterns […]

Sure, but if I understood correctly you don't think is_null is "suspicious".

Indeed, the part that sentence applies:

[..] reducing cognitive overhead [..], by eliminating suspicious patterns and by avoiding multiple ways to perform the same operation.

For a human, I don't think they carry any ambiguity.

I agree. Both patterns are equally fine.

Krinkle added a comment.EditedJul 19 2018, 3:31 AM

I would rather not enforce one or the other, at this point that sounds too pedantic to me and causing developers to whine about the style being way too strict for no benefit code wise.

This is not about code quality in isolation, but about the code together being consistent and easy to parse mentally by being able to consistently map the same concept to the same syntax (visually). I think we both agree that allowing two patterns for the same thing is often bad. For example, we have previously decided over the following as well. For each of these, I'd say both are equally good.

  • Allowing both array() and [].
  • Allowing both tabs and space for indentation.
  • Allowing both __DIR__ and dirname( __FILE__ ).
  • Allowing both @param bool and @param boolean.
  • Allowing both (boolean) and (bool) for type casting.
  • Allowing both 0.52 and .52 for float notation.

The above is currently enforced against by mediawiki-codesniffer. We could also think about the following:

  • Allowing both is_string() and gettype() === 'string'.
  • Allowing both (int) $var and intval($var).

And for this task:

  • Allowing both is_null() and === null.

A talk from Douglas Crockford some years ago, about coding style, stated (paraphrasing) that as writers of code, we have the power to effectively improve the programming language we work in, a little bit. Specifically, we can pretend some of its features don't exist.

Let's ask: Why use is_null()? What if it wasn't part of the language, would we create it? Should we create and prefer an is_false() or is_emptystr()?

I know, the following example is bad coding style. [..] It's just to illustrate the point:

$result = is_null( $condition ? $a : $b ) ? 'yes' : 'no';
$result = ( $condition ? $a : $b ) === null ? 'yes' : 'no';

If I would need to pick one of these, I would pick the first.

Thank you! I thought about this, and if given two bad choices, I would agree.

But as you said, this is bad coding style. I can somewhat tolerate a simple nested ternary if only using boolean evaluation, eg. $x = ( $isA ? $aa : ( $isB ? $bb : $cc ) ), but I do not think we should allow nesting ternaries inside comparisons inside another ternary.

Can you think of an example where there is at least one good choice? I believe if code is written in a way that allows at least one good choice, probably the other choice is also good.

From the Zen of Python:

There should be one-- and preferably only one --obvious way to do it.

I can't speak for is_null, but I've seen enough confusion over int type casting vs intval() on IRC and in code review. I'd be OK having codesniffer bless one way (for is_null/===null) as the preferred way, but given that people are very likely to use is_null if they aren't already familiar with our conventions, I think it needs to have an autofix before it can be merged:

is_null( $foo ) -> $foo === null
!is_null( $foo ) -> $foo !== null

I gave this a bit more time and agree that what we are discussing here is not different from, for example, (boolean) vs. (bool). However, what I don't want us to do is to disallow is_null without a proper auto-fix. is_null appears about 1600 times in 900 files: https://codesearch.wmflabs.org/search/?q=is_null\(&i=1. Having to manually deal with these would be tedious and mostly pointless, given we all agreed there is not really anything wrong with is_null. I believe such a sniff must have an auto-fix that's clever enough to understand edge-cases like the one I listed above, as well as being able to replace !is_null( $foo ) with $foo !== null.

That said, I'm still questioning if this is worth the effort.

@thiemowmde Thanks. I also agree we should have an auto-fix, and it should be able to convert the vast majority of the affected code.

Umherirrender added a subscriber: Umherirrender.EditedJan 4 2019, 10:43 PM

The sniff for review under https://gerrit.wikimedia.org/r/#/c/mediawiki/tools/codesniffer/+/446271/ has a autofix and works pretty well from my point of view.

How to go further? This is not a sniff which affects a small part of code. When merging the sniff and libraryupdater is running for the new release, than "composer fix" is also run to autofix all extensions/skins/core. The check for null is often done in php code, so this would affect many repos and therefor many developer. The impact is similar to the short array syntax (That means, many patch sets needs rebasing or fixing, if the is_null is in new code).

Should this be pre-announced before merging to get more feedback, if the way is approved by a larger amout of developer?

hashar removed a subscriber: hashar.Jan 12 2019, 2:14 PM
Ricordisamoa added a subscriber: Ricordisamoa.
Krinkle removed a subscriber: Krinkle.Sep 2 2019, 3:30 PM

Change 446271 had a related patch set uploaded (by Jforrester; owner: Prtksxna):
[mediawiki/tools/codesniffer@master] Forbid usage of is_null()

https://gerrit.wikimedia.org/r/446271

Change 446271 merged by jenkins-bot:
[mediawiki/tools/codesniffer@master] Forbid usage of is_null()

https://gerrit.wikimedia.org/r/446271

Umherirrender closed this task as Resolved.Nov 23 2019, 9:20 PM
Umherirrender assigned this task to Prtksxna.
Umherirrender triaged this task as Medium priority.