Page MenuHomePhabricator

phpunit-patch-coverage logic on when coverage goes down in one file and up for others needs improving
Open, Needs TriagePublic

Description

On https://gerrit.wikimedia.org/r/#/c/415249/ Tim wrote:

+-------------------------------------+--------+--------+
| Filename                            | Old %  | New %  |
+-------------------------------------+--------+--------+
| includes/parser/LinkHolderArray.php |  76.19 |  74.29 |
| includes/parser/Parser.php          |  81.66 |  82.01 |
| includes/parser/StripState.php      |  48.48 |  95.89 |
+-------------------------------------+--------+--------+

Seems a bit unfair to call that a test coverage decrease.

Probably we should compare overall lines covered before and after instead of going by file?

Event Timeline

Probably we should compare overall lines covered before and after instead of going by file?

Yes, though the per-file numbers are also useful. Maybe do the pass/fail on overall coverage, and the output on per-file coverage with overall as a summary?

Even comparing lines rather than percentages could be unfair: if you remove covered lines from a method, you're not really decreasing coverage even though you are decreasing the number of covered lines.

I suppose what we really want to test for is "are any lines added in this patch not covered?", although actually determining that may not be easy.

Even comparing lines rather than percentages could be unfair: if you remove covered lines from a method, you're not really decreasing coverage even though you are decreasing the number of covered lines.

Yep, also mentioned in T188695.

Indeed: The relevant measure is the number of untested lines. If this goes up per file, that could cause a "failure".

This will not catch the introduction of new files with no coverage, however. For new files, some minimum percentage should be required, e.g. 66% or 80%.

It seems correct to complain in this case. The phrasing of the error message is just misleading -- it makes it sound like you removed existing tests or made previously covered lines no longer covered, when it really means you didn't cover newly-added lines. But this way of measuring things will give wrong results in some cases, like if you remove some existing test coverage but also add more, which should warrant a complaint but might not.

The proper analysis would be to complain about a line that used to be covered and no longer is, or a newly-added line that isn't covered. This is probably too difficult to be worth it, given that we don't require 100% test coverage for new code anyway.