Page MenuHomePhabricator

Pair a function with its inverse function for testing
Open, Needs TriagePublic

Description

Where there is an inverse function for a function, the two functions should be paired to ensure that they are entirely consistent, and remain so. A function's inverse could also be used to generate test inputs from expected results.

Some functions are their own inverse, and pair with themselves (for example, a function which reverses the order of its input).

Some functions are naturally "lossy" and have no inverse (like a sort). Some of these can be constructed as a lossless function that has an inverse, followed by a lossy function that does not. (For example, you can sequentially number the elements to be sorted before you begin the sort function; the "unsort" inverse would then sort by the sequence number, returning the elements to their original order.)

"And if a function is paired with its inverse, you can automatically check that the final output is the same as the initial input. As under#Distillation of existing content, this is an important consideration for a rendering sub-function, whose inverse is a parser sub-function.--GrounderUK (talk) 23:26, 29 July 2020 (UTC)

I really like the idea of using the inverse to auto-generate tests! That should be possible, agreed. Will you create a phab-ticket? --DVrandecic (WMF) (talk) 02:11, 5 August 2020 (UTC)

https://meta.wikimedia.org/wiki/Talk:Abstract_Wikipedia#Some_questions_and_ideas

Event Timeline

One question is whether we want to have a community-defined ontology between Functions and other Wikilambda objects, or whether these relations would make more sense to live in Wikidata.

(This does not necessarily apply to this particular request, just using this place to keep that idea)

Q265623.

Okay, so here's a link...

I oppose original content in Wikidata anyway, but the community-led evolution of an ontology, one that is mainly a specialization of somewhat notable (encyclopedic) concepts...? I think the correct term for that is "unworkable", if you'll pardon my language.

Once it has evolved and largely stabilized onwiki, there's no problem formalizing any or all of it onto Wikidata, if the community feels that's worth while.

Do we need a separate name for inverse functions?
After all a function is just a relation associating some ordered items from several sets, stating that, when the items are properly ordered, they form a tuple that is part of the set of tuples represented by the relation (independantly of their inner side-effect, but if there are side-effects, they should apply to two other items added to the tuple: the "input" state and the "output" state (including errors of executions or exceptions thrown and their own properties).
So we don't really need to make a distinction between inputs and outputs, they are just items in the ordered tuple; the subset of what is input or what is output does not matter if we want to be purely function. What really matters during the evaluation is to check that the tuple is a member or not of the relation, and an evaluator can as well resolve any function as a relation, i.e. as if it was an enumerator of tuples it contains: so we can bind any value to any input or output, the code of the "function" will see what is bound and will return an enumerator whose result will be either:

  • an empty sequence (this is not a solution, the tuple is not member of the relation)
  • a singleton sequence (this is the unique solution, what was an unbound variable is resolved to a single value; if all variables were bound, the enumerator will return the same tuple in the sequence to assert that the tuple is a valid member, but if it returns an empty sequence, we kwno that the proposed tuple was not a solution); it may also return a singleton containing unbound variables: this is exactly like an object data type, but can be reused later for further inferences.

You could design multiple implementations of the function, actually a relation, they can be tried and run in parallel to see which one reduces the number of unbound variables, so you have several enumerators returned whose state (empty, singleton or sequence, and number of unbound variables) can be used to make inferences even if not all variables are bound (e.g. when we don't care aboutr the value of the unbound variables but just to see if the relations offers solutions or not, and for which bound values of any variable)

Such representation offers several advantages:

  • everything (every ZObject) can be represented this way: "constant" values, functions,
  • backward inference is possible
  • we can be purely functional, including for "non-functional functions" (i.e. those that have two additional state variables for their "input" and "output" environment)
  • a traditional mathematical function (like sinus) taking one input and returning one output in its definition set would become a 4-ary relation: "sin(in,out,x,y)" where the values bound to "in" and "out" don't matter, and "y=sin(x)". That function is usually not inversible in the reals but in a reduced set, it is so that "x=arcsin(y)" when "x" is constrained, so "sin(in,out,x,y)" can represent this constraint when "x" is unbound but "in" is bound to an object representing the restriction on "x" to its reduced set (e.g. in [-pi/2, +pi/2]).
  • curification of functions is also possible and very natural.
  • all functions become inversible (the inverse may not return a single value but an enumerator of tuples (e.g. the "inverse" of sinus).

So the same "function" name can be used for the function and its inverse. But we can optionally define identities, e.g. stating that "plus(a,b,c)" = "minus(c,a,b)" = "minus(c,b,a)" for the arthmetic operation "c=a+b" (the identities state that "c-a=b" and "c-b=a"). Such identities are just renaming (sort of aliases, but with parameters in the tuple just reordered).
May be we can design every "function" (or any other object, including constants and types, and validators) just as these relations of ordered tuples, taking always the two additional implicit parameters for the input and ouput state (for tracking non-functional functions).

Prolog-like Inferences becomes possible. Just like in Prolog, the evaluator just tries to bind variables with missing values by reducing the cardinality of their enumerators; the evaluator can retrigger the returned iterators (result sets) just like when we use a "SELECT" in SQL (where enumerators are represented by the concept of "cursors"), and it can do that as many times it wishes and in any order, it just tries to minimize the number of open iterators, focusing first on those that return empty sets, then singletons; the iterators may also indicate an hint about the estimated number of subsequent members (plus may be a cost estimate) to help the evaluator select the best evaluation path (this will be helpful when there will be several implementations of the same "function": for the "sum" relation, one implementation could provide a fast addition, another a fast substration, the evaluator will then kwow which iterator, one for each invokable implementation, will return the best solution).

Do we need a separate name for inverse functions?

I don't know. If we don't, is the function necessarily "self inverse"? In the general case, "function" should be understood as a composition of functions. This task arises from a discussion that is now archived. The point there was that compositions of functions that should achieve the same result can be used to test one another. Here we extend that principle to function/compositions that deliver the "opposite" result; given the output from one function/composition, a second (possibly identical) function/composition will deliver output equal to the first's input. In practice, we either need two separate function calls or some logging of the intermediate result of the first function.

I think we remembered to do this. See https://www.wikifunctions.org/view/en/Z14195.

However, I said “… In practice, we either need two separate function calls or some logging of the intermediate result of the first function.” Because we do not have separate function calls but a composition we should have visibility of intermediate results. I think that would be useful for all compositions, so is that something we are looking at?

Much as I shall miss her when she’s gone, I’m feeling my way towards being ready to recommend closing this, my first ticket.

We should make sure we cover “using an inverse to auto-generate tests” under T358216, I believe. (As in the screenshot, we would get the expected result but, lacking the intermediate result, we would not know the required input for a simple test of the decode function.)

IMG_0909.png (2×960 px, 302 KB)