
Tests whose evaluation results in an error showing as Passed
Open, Medium, Public, BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

What happens?:

  • The test shows as passed for all implementations.
  • The test is in error in all cases.

What should have happened instead?:
All tests should show as failed.

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

IMG_0997.png (2×960 px, 289 KB): all tests show as passed
IMG_0998.png (2×960 px, 285 KB): but they are all in error (argument type mismatch)

The error is correct, because the first element is a Day of the week rather than a Gregorian calendar month. It is indeed False that the element is in the list, but that cannot validly be inferred from the error. In fact, if the Day is prepended to the list of Months (making it True, though not valid), the tests that expect True all show as Failed, while the tests that expect False show as Passed (though still, rightly, in error).

Event Timeline

Perhaps a more obvious example is https://www.wikifunctions.org/w/index.php?Z14K1=Z15801&title=Special:CreateObject&uselang=en&zid=Z14

Currently, two tests (Z15809 and Z16310) appear to pass with a new (undefined) implementation, but all tests actually result in an error (Z507). The cases that appear to pass are the ones expected to evaluate to Z42/false.

Ran into this today. See e.g. the test cases for Z12427/is prime - for several of the "is not prime" tests (for a specific example, test "2147483647^2 is not prime", implementation "is prime (Python)"), the test displays as "Passed" for a number of implementations, but clicking on the details reveals that the test had an error: "Reached time limit in evaluator".

I believe it shows as passed because it passes Z24/void to Z844/Boolean equality, and the built-in implementation of Z844 considers Z24/void to be equivalent to Z42/false (see test case Z24790). I would consider either changing this behavior in the built-in implementation or changing the definition of "passed" for a test so that it excludes evaluations with errors.
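To make the hypothesis above concrete, here is a toy model in Python. It is not the actual built-in implementation of Z844; the function names and the string constants standing in for Z24, Z41 and Z42 are assumptions made purely for illustration. It shows how an equality check that treats anything other than Z41/true as false would make an errored (void) result "pass" every test that expects false and "fail" every test that expects true.

```python
# Stand-ins for the ZIDs mentioned in this thread (illustrative only).
VOID = "Z24"
TRUE = "Z41"
FALSE = "Z42"


def lenient_boolean_equality(actual: str, expected: str) -> bool:
    """Hypothetical lenient check: anything that is not Z41 counts as false."""
    return (actual == TRUE) == (expected == TRUE)


def strict_boolean_equality(actual: str, expected: str) -> bool:
    """Hypothetical strict check: a non-Boolean operand is never equal."""
    if actual not in (TRUE, FALSE) or expected not in (TRUE, FALSE):
        return False
    return actual == expected


# An evaluation that errored out returns void instead of a Boolean.
errored_result = VOID

# Lenient: the test expecting false "passes", the one expecting true "fails".
print(lenient_boolean_equality(errored_result, FALSE))
print(lenient_boolean_equality(errored_result, TRUE))

# Strict: an errored result never matches, whatever was expected.
print(strict_boolean_equality(errored_result, FALSE))
```

If the real behavior matches this model, it would also explain the earlier observations in this task, where only the tests expecting Z42/false appear as Passed.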

I confirm this is a bug. It is probably an issue in the perform test API (wikilambda_perform_test).
The validation result shows as Z41/true while the metadata contains an "errors" key.

Here's a particular example from is prime (Z12427), running test Z14095 against implementation Z12428:

response:

{
    "query": {
        "wikilambda_perform_test": [
            {
                "zFunctionId": "Z12427",
                "zImplementationId": "Z12428",
                "zTesterId": "Z14095",
                "validateStatus": "{\n    \"Z1K1\": \"Z40\",\n    \"Z40K1\": \"Z41\"\n}",
                "testMetadata": "..." // see below for readability
            }
        ]
    }
}

test metadata:

{
    "Z1K1": {
        "Z1K1": "Z7",
        "Z7K1": "Z883",
        "Z883K1": "Z6",
        "Z883K2": "Z1"
    },
    "K1": [
        {
            "Z1K1": "Z7",
            "Z7K1": "Z882",
            "Z882K1": "Z6",
            "Z882K2": "Z1"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "errors",
            "K2": {
                "Z1K1": "Z5",
                "Z5K1": "Z575",
                "Z5K2": {
                    "Z1K1": {
                        "Z1K1": "Z7",
                        "Z7K1": "Z885",
                        "Z885K1": "Z575"
                    },
                    "Z575K1": "9000 ms"
                }
            }
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "evaluationMemoryUsage",
            "K2": "110.93 MiB"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "evaluationCpuUsage",
            "K2": "75.011 ms"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "evaluationStartTime",
            "K2": "2025-05-26T02:33:02.694Z"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "evaluationEndTime",
            "K2": "2025-05-26T02:33:11.728Z"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "evaluationDuration",
            "K2": "9034 ms"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "evaluationHostname",
            "K2": "function-evaluator-python-evaluator-5964578b4-vqwr7"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "implementationId",
            "K2": {
                "Z1K1": "Z6",
                "Z6K1": "Z14090"
            }
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "implementationType",
            "K2": "Z14K3"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "orchestrationMemoryUsage",
            "K2": "113.96 MiB"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "orchestrationCpuUsage",
            "K2": "215.302 ms"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "orchestrationStartTime",
            "K2": "2025-05-26T02:33:02.290Z"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "orchestrationEndTime",
            "K2": "2025-05-26T02:33:11.734Z"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "orchestrationDuration",
            "K2": "9444 ms"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "orchestrationHostname",
            "K2": "function-orchestrator-main-orchestrator-5f787fc4d-xz8tv"
        },
        {
            "Z1K1": {
                "Z1K1": "Z7",
                "Z7K1": "Z882",
                "Z882K1": "Z6",
                "Z882K2": "Z1"
            },
            "K1": "loadedFromMediaWikiCache",
            "K2": "2025-05-26T15:55:55Z"
        }
    ]
}
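
The mismatch in this example (a validateStatus of Z41/true alongside an "errors" entry in the test metadata) could be detected with a stricter pass check. Below is a minimal sketch in Python, assuming the response and metadata shapes pasted above. The helper names metadata_entries and strictly_passed are hypothetical and not part of WikiLambda, and the sketch assumes the metadata has already been decoded from its JSON string into a dict.

```python
import json


def metadata_entries(test_metadata: dict) -> dict:
    """Flatten the Z883 map into a plain dict of key -> value.

    The first item of the "K1" list is the type declaration rather than a
    key/value pair, so items without a string "K1" key are skipped.
    """
    pairs = {}
    for item in test_metadata.get("K1", []):
        key = item.get("K1") if isinstance(item, dict) else None
        if isinstance(key, str):
            pairs[key] = item.get("K2")
    return pairs


def strictly_passed(validate_status_json: str, test_metadata: dict) -> bool:
    """Count a test as passed only if validation is Z41 and no error was recorded."""
    status = json.loads(validate_status_json)  # validateStatus is a JSON string
    validated_true = status.get("Z40K1") == "Z41"
    has_error = "errors" in metadata_entries(test_metadata)
    return validated_true and not has_error


# Trimmed-down version of the metadata above: type declaration plus the error.
example_metadata = {
    "K1": [
        {"Z1K1": "Z7", "Z7K1": "Z882", "Z882K1": "Z6", "Z882K2": "Z1"},
        {"K1": "errors", "K2": {"Z1K1": "Z5", "Z5K1": "Z575"}},
    ]
}
example_status = '{\n    "Z1K1": "Z40",\n    "Z40K1": "Z41"\n}'

print(strictly_passed(example_status, example_metadata))
```

With this check, the example above would be reported as not passed, because the Z575 error is present even though the validator returned Z41/true. Whether the fix belongs in the API, in the front end, or in Z844 itself is a separate question.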