Page MenuHomePhabricator

Decide how validation should work
Closed, ResolvedPublic

Description

Validation is currently very underspecified and it is hard to guess how it should work. This doc goes into details about the problem and potential solutions: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Semantics_of_validation_in_Wikifunctions

Event Timeline

When I click on https://docs.google.com/document/d/1G6lmh0LcxFtT3RwiwBenEOqFc2mRE_41XKTr4YkWuS4/, I get a message that authorization is necessary. Please could you copy the document to a place which is public?

Hi Strobilomyces, the contents of the document have been ported to metawiki: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Semantics_of_validation_in_Wikifunctions. I'll update the OP.

Here's a proposal on how to let validation work:

Validation should only run in a small number of circumstances:

  1. When storing an object, the object being stored should be validated
  2. When submitting a function call for evaluation, that function call should be validated
  3. When a user requests validation of an object, e.g. while editing an object to check whether it is still valid
  4. When the validation function changes, to re-run it on the persisted objects of that type
  5. Possibly on the end result of an evaluation before giving it back to the user, but I am not sure about this

Situations 1 and 3 are particularly interesting for composition writing.

In particular

  1. We should not repeatedly run validation during evaluation, but trust that once the evaluation of a valid object starts, the intermediate steps and the end result are valid.
  2. We should trust a function call to return the type it promises to return, and not do a validation on the result of an embedded function call during evaluation
  3. We should trust that a persisted object is a valid object of their claimed type and not run validation on a persisted object

This means, that

  1. a Z9/Reference is always assumed to be a valid object of their type.
  2. a Z7/Function call will always return a valid object of their return type.

We believe the former without again validating the referenced object, but we need to look up the type of the object. We believe the latter without evaluating the function call, but we need to look up the return type of the function.

Example

Assume the following function call (all IDs per Wikifunctions Beta):

Z10118/add(
  Z10015/Positive integer<2>,
  Z10228/increase by one( Z10301/Five ) )

This requires the following checks:

  • add takes two arguments and there are two arguments (check)
  • The first argument expects a Positive Integer.The first argument is given as a literal of type Positive Integer (check).
  • Since it is a literal, we validate that literal (check).
  • The second argument of add expects a Positive Integer. The second argument is a function call. We need to do two things: first, check whether the function increase by one has a compatible return type (check)
  • Then check that function call for validity (but only the form, we do not evaluate it!). So we check the call to increase by one:
    • we see it takes one argument and there is one argument (check)
    • The argument expects the type Positive Integer. We check that the reference to five is compatible to type Positive Integer (check).

All tests pass. Only now we start the evaluation.

What does it mean for two types to be compatible?

Expected typeGiven typeCompatible?
Z1/ObjectAnythingYes
A specific type, e.g. Z6/StringA literal with the same specific type, i.e. Z6/StringYes
A specific type, e.g. Z6/StringA function with the return type Z1/ObjectSometimes, but we accept it
A specific type, e.g. Z6/StringA function with the same return type, i.e. Z6/StringYes
A specific type, e.g. Z6/StringA reference to an object of that specific type, i.e. Z6/StringYes
A specific type, e.g. Z6/StringA literal with some other type, or a function that returns some other type, or a reference to an object of some other typeNo
A specific type, e.g. Z6/StringSome other type that has exactly the same structure, i.e. the same number of keys with the same expected typesNo (but that is implicit in the previous box)
A generic type, i.e. a function call that returns a type. The function call has no Z1/Object as one of the arguments, recursively.A literal of a type T or a reference to an object of type T or a function with return type TIf and only if T is exactly the same as the function call representing the function call. Yes, that means that Z801/Echo(Z6/String) is not compatible with Z6/String.
A generic type, i.e. a function call that returns a type. The function call has a Z1/Object as one of the arguments, either directly or deeper in the call tree.A literal of a type T or a reference to an object of type T or a function with return type TIf and only if T is the same as the expected function call, module the place where there is a Z1/Object in the expected type, since that is a wildcard that allows for everything. So if we expect Z801/Echo(Z1/Object), Z801/Echo(Z6/String) is compatible. More relevant, if we expect Z881/Typed List(Z1/Object) that is compatible with Z881/Typed List(Z6/String)

For starters, we can just assume that whenever we have a generic type, we accept any generic type, and we can make validation tighter later.

Reminder, a generic type is a function with return type Z4/Type. A generic function is a function with return type Z8/Function.

What about generic functions?

Many of them are solved above. But here’s a generic function which is not.

Assume I have a function Typed Head which takes a type T as an argument, and returns a function Typed Head(T) that takes a Typed List(T) as the input and returns a T as the output.

In that case, ideally, the following function call would be valid:

Z10118/Add(
  Z10015/Positive Integer<3>,
  Typed Head(Z10015/Positive Integer)(
    [Z10015/Positive Integer,
     Positive Integer<2>,
     Z10015/Positive Integer<4>]))

should be valid (note that lines 4-6 are a Benjamin Array), but

Z10118/Add(
  Z10015/Positive Integer<3>,
  Typed Head(Z6/String)(
    [Z6/String, “2”, “4”]))

should not.

But we would only figure that out if we actually evaluate the generic functions Typed Head(Z6/String) and Typed Head(Z10015/Positive Integer), which is not always feasible.

The suggestion is that every time we notice a generic function we just allow it for now. We can later think about smarter approaches. The assumption is that anyone who works with generic functions knows what they are doing.

Implementation suggestion for compatibility

Since the above proposal for type compatibility has a few rather intricate examples, one way to implement that easier is to have an approach that, by default, thinks that any two types are compatible, and then adds the few cases that are not explicitly, instead of the other way around.

This would literally mean that from the table above, we need only implement a single box: if the expected type is a specific type, and the given type is a either a literal of a different type, or a reference to an object of a different type, or a function call for a function where the return type is a different type.

Anything else passes as compatible for now.