To get a translation score, we compare the translation output against the expected output for each translation field.
In some cases we need to compare arrays vs arrays (for example, in the author fields). In these cases, we calculate two scores and average them:
- we compare arrays item-wise in the order they were given (i.e., 1st item of one array vs 1st item of the other array, etc); and
- we compare one array vs all possible permutations of the other array, and keep the highest score found among them.
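
For reference, the two-score strategy described above could be sketched roughly as follows. This is only an illustration in Python, not the actual Web2Cit code: `item_score`, `ordered_score`, `permuted_score` and `array_score` are hypothetical names, the exact-match item score is a placeholder, and dividing by the longer array's length is an assumption about how unequal lengths are handled.

```python
from itertools import permutations
from typing import Sequence


def item_score(a: str, b: str) -> float:
    """Hypothetical per-item score: 1.0 on exact match, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


def ordered_score(output: Sequence[str], expected: Sequence[str]) -> float:
    """Item-wise comparison in the given order (1st vs 1st, 2nd vs 2nd, ...)."""
    n = max(len(output), len(expected))
    if n == 0:
        return 1.0  # assumption: two empty arrays count as a perfect match
    # zip stops at the shorter array; unmatched items contribute 0 (assumption)
    return sum(item_score(a, b) for a, b in zip(output, expected)) / n


def permuted_score(output: Sequence[str], expected: Sequence[str]) -> float:
    """Best ordered score over every permutation of `expected` (n! candidates)."""
    return max(ordered_score(output, perm) for perm in permutations(expected))


def array_score(output: Sequence[str], expected: Sequence[str]) -> float:
    """Average of the ordered score and the best-permutation score."""
    return (ordered_score(output, expected) + permuted_score(output, expected)) / 2
```

With these placeholder definitions, `array_score(["a", "b"], ["b", "a"])` averages an ordered score of 0 with a best-permutation score of 1.
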
However, as noted by @Nidiah (who also uses this strategy in Web2Cit-Research), the number of possible permutations grows factorially with array length. For example, the items of an array only 10 items long can be ordered in 10! = 3,628,800 different ways!
As a result, Web2Cit-Server crashes with an out-of-memory error when trying to process test cases that include such "long" arrays, which in turn would cause Web2Cit-Monitor and Web2Cit-Gadget to fail as well.
Is there a more efficient array vs array comparison method we may use? In the meantime, consider running the ordered comparison of arrays only.
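
One possible direction, offered only as a sketch: if the score of a permutation is simply the sum (or mean) of independent per-item scores, then finding the best permutation is an instance of the assignment problem, and we don't need to enumerate permutations at all. The Python example below uses dynamic programming over subsets, which takes on the order of n·2^n steps instead of n!. It assumes equal-length arrays, and `item_score` and `best_permutation_score` are hypothetical names with a placeholder exact-match score; none of this is taken from the Web2Cit code.

```python
from typing import Callable, Sequence


def item_score(a: str, b: str) -> float:
    """Hypothetical per-item score: 1.0 on exact match, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


def best_permutation_score(
    output: Sequence[str],
    expected: Sequence[str],
    score: Callable[[str, str], float] = item_score,
) -> float:
    """Highest mean item score over all orderings of `expected`,
    computed without enumerating the permutations."""
    n = len(output)
    if n != len(expected):
        raise ValueError("this sketch assumes arrays of equal length")
    if n == 0:
        return 1.0  # assumption: two empty arrays count as a perfect match

    # Pairwise scores are computed once: n * n calls to `score`.
    pair = [[score(a, b) for b in expected] for a in output]

    # best[mask] = highest total score when the first popcount(mask) items of
    # `output` are matched to the `expected` items whose bits are set in mask.
    best = [float("-inf")] * (1 << n)
    best[0] = 0.0
    for mask in range(1 << n):
        i = bin(mask).count("1")  # index of the next `output` item to match
        if i == n:
            continue
        for j in range(n):
            if not mask & (1 << j):  # expected[j] is still unmatched
                new_mask = mask | (1 << j)
                candidate = best[mask] + pair[i][j]
                if candidate > best[new_mask]:
                    best[new_mask] = candidate

    return best[(1 << n) - 1] / n
```

For a 10-item array this visits 1,024 subsets rather than 3,628,800 permutations. For much longer arrays, a polynomial-time assignment solver such as the Hungarian algorithm would probably be a better fit; SciPy's `linear_sum_assignment` is one implementation, and there may be equivalent packages for Node.
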