Description
zobjectUtils.js convertTableToJson has quite a lot of complexity, and it's used extensively. Particularly:
- Every time we need to get the type of an object by rowId, we call convertTableToJson (for the sub-zobject): this is called many times
- When we use function call widget, we convert the metadata to table, and then convert back to Json: this is called once but it's an unnecessary step
We should audit the usage of this and limit it to the minimum
Additionally, we should improve the performance of the convertTableToJson so that it's not so costly on the UI
Root cause
Why are these methods so costly?
Building a JSON representation of the flattened zobject table representation is costly when the object is very large.
I can identify two main reasons:
- Row ids are different to the index that the row takes in the array.
- As a result, getRowById requires walking all the table, row by row, searching for an id match.
- getRowById is a nuclear operation performed again and again
- Children rows point at their parent, but there's no reverse index
- As a result, getChildrenByParentRowId requires walking all the table, row by row, searching for all the id=parent matches
- getChildrenByParentRowId is a nuclear operation performed again and again
Ideas:
- Don't fall back to convertTableToJson so often when we do getZObjectTypeByRowId:
- getZObjectTypeByRowId is a widely used method, we should be simplifying it as much as possible and only falling back to returning a json in necessary cases
- ✅ done in patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikiLambda/+/1083205/3
- Don't transform Z22K2/metadata into flat table, because for the Metadata dialog we transform it back to JSON again
- not only the two-way transformation takes a lot of computation time, but it doubles up the size in memory and worsens performance considerably
- ✅ done in patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikiLambda/+/1085375/5 (very significant improvement in the function evaluator)
- Transform flat zobject table array into object indexed by the row id
- This should improve object access, we would access rows directly without having to do zobject.find()
- This should also improve operations like deletion
- ❌ tried this. Observed no improvement. Javascript arrays are much more efficient than objects and the application becomes heavier in memory and slower.
- Create a reverse index from parent -> children rows
- This should improve performance when finding the children of a row (for example, when deleting a sub-tree)
- ❌ tried this. Application becomes very heavy in memory.
- Keep row Ids and array indexes in sync
- This should allow row access by simply doing zobject[ rowId ] without having to do zobject.find()
- But this means that rows cannot be deleted, they just need to be nullified or marked as destroyed
- Namespace zobject tables
- Instead of having all rows in the same array, we can namespace arrays and have them host their tables isolated from each other
- This means that destroying detached objects (for example, when re-running a function in the function evaluator) is as simple, quick and secure, as destroying the namespace
- This would improve speed and could facilitate keeping indexes in sync with row IDs
- This can also be very beneficial when using default view in other contexts (e.g. Visual Editor)
Completion checklist
- Before closing this task, review one by one the checklist available here: https://www.mediawiki.org/wiki/Abstract_Wikipedia_team/Definition_of_Done#Front-end_Task/Bug_Completion_Checklist