Optimize term updating
Closed, Declined (Public)

Description

Currently the new service (DoctrinePropertyTermStore) uses naive updating. It deletes everything and then inserts everything. The old wb_terms code (TermSqlIndex) first does a select and then a diff to only delete and insert those terms that changed. The new service should also do this.
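The select-then-diff approach described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Wikibase code: the `Term` tuple and the `diff_terms` function are hypothetical names, and it assumes a term is fully identified by its type, language, and text.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Term(NamedTuple):
    # Hypothetical stand-in for a stored term row.
    term_type: str  # e.g. "label", "description", "alias"
    language: str
    text: str


def diff_terms(
    stored: Iterable[Term], new: Iterable[Term]
) -> Tuple[Set[Term], Set[Term]]:
    """Return (to_delete, to_insert) so that only changed terms are written."""
    stored_set, new_set = set(stored), set(new)
    to_delete = stored_set - new_set  # stored but no longer wanted
    to_insert = new_set - stored_set  # wanted but not yet stored
    return to_delete, to_insert


# Terms as they would come back from the initial select.
stored_terms = [
    Term("label", "en", "Berlin"),
    Term("description", "en", "capital of Germany"),
    Term("alias", "en", "Berlin, Germany"),
]
# Terms of the entity after an edit that changed only the description.
new_terms = [
    Term("label", "en", "Berlin"),
    Term("description", "en", "capital and largest city of Germany"),
    Term("alias", "en", "Berlin, Germany"),
]

to_delete, to_insert = diff_terms(stored_terms, new_terms)
```

In this sketch an edit to a single term yields one delete and one insert, where the naive strategy would delete and reinsert all three rows. The cost of the initial select is not modelled here.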

Event Timeline

I looked into this for a bit.

I'm not sure that doing a select and diff is better than the current implementation. It requires joining all the tables to get to the text nodes, and a lot of wasted processing power in cases where we update one term out of hundreds (is that actually the common case?).

A significant improvement (both architecturally and for this purpose) I can think of is to pass what has actually changed to the store writer: instead of passing an EntityDocument, I would pass an EntityId and a Term[] of those terms that have actually been added, updated, or deleted. Never mind this, it won't work that way.

I came to a similar conclusion after trying to write some code without looking at this ticket first :) It might still be worth it to do the diff because it helps https://phabricator.wikimedia.org/T220150. I'll comment more there.