Given the performance problems caused by having to touch entity usage rows (T111769, T122429), I looked into Wikibase to see whether we could drop eu_touched altogether.
I hope the analysis below is complete enough to support the conclusion I draw from it (did I miss any major use case for the field?).
The field is (directly) only being used in `EntityUsageTable` right now (except for tests). In there we have four usages of that field:
# `EntityUsageTable::touchUsages` (thus updating the field)
# `EntityUsageTable::makeUsageRows` which is only called in `EntityUsageTable::addUsages` (for adding new rows)
# `EntityUsageTable::queryUsages` (which can optionally be filtered by the touched time).
# `EntityUsageTable::pruneStaleUsages` (which deletes usages whose eu_touched is older than a given value)
1 and 2 are only used in `SqlUsageTracker::trackUsedEntities`, which in turn is only used in `UsageUpdater::addUsagesForPage`. That function is used in two places:
* `DataUpdateHookHandlers::doLinksUpdateComplete`, which calls it immediately before pruning old values (it passes the current timestamp, so all rows with an earlier eu_touched are pruned).
* `AddUsagesForPageJob`, which is only fired when a new ParserCache entry is saved.
3 has only one caller for which eu_touched actually matters: `EntityUsageTable::pruneStaleUsages` (which is 4), where it is used to find the stale rows so that they can be deleted by PK. Thus it is not interesting on its own.
4 is only used in `SqlUsageTracker::pruneStaleUsages`, which in turn is only used in `UsageUpdater::pruneUsagesForPage`. That function is only used in `DataUpdateHookHandlers`, in two places:
# To prune entries for deleted pages (where the timestamp obviously doesn't matter)
# In `DataUpdateHookHandlers::doLinksUpdateComplete` immediately after touching the batch of relevant usages (with the current timestamp, thus all values in the table before that will be pruned).
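
The touch-then-prune pattern described above can be modelled roughly as follows. This is only an in-memory sketch in Python, not the actual Wikibase code: the `UsageTable` class, its method names, and the aspect letters are assumptions made for illustration, and timestamps are assumed to be sortable 14-digit strings.

```python
from dataclasses import dataclass, field


@dataclass
class UsageTable:
    """Hypothetical stand-in for the entity usage table.

    Maps (page_id, entity_id, aspect) to an eu_touched-like timestamp.
    """
    rows: dict = field(default_factory=dict)

    def track_used_entities(self, page_id, usages, touched):
        # Inserts new rows and "touches" existing ones in one step.
        for entity_id, aspect in usages:
            self.rows[(page_id, entity_id, aspect)] = touched

    def prune_stale_usages(self, page_id, last_updated_before):
        # Deletes every row of the page that was not touched recently enough.
        stale = [
            key for key, touched in self.rows.items()
            if key[0] == page_id and touched < last_updated_before
        ]
        for key in stale:
            del self.rows[key]
        return stale


table = UsageTable()
# Usages recorded at some earlier point in time.
table.track_used_entities(1, [("Q1", "L"), ("Q2", "S")], "20160101000000")
# LinksUpdate: touch the current usages, then prune everything older.
now = "20160102000000"
table.track_used_entities(1, [("Q1", "L"), ("Q3", "T")], now)
removed = table.prune_stale_usages(1, now)
```

In this model the timestamp only serves to tell the rows written by the current update apart from all others, which is why a plain delete-and-insert could replace it.
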
In conclusion, it should be enough to look at `DataUpdateHookHandlers` to get the big picture.
Behaviour on edit:
After a user edits a page, we (immediately) run `DataUpdateHookHandlers::doParserCacheSaveComplete`, which adds the new usage entries to the table (without touching any of the old rows yet). Some time later a LinksUpdate job runs (asynchronously) and triggers `DataUpdateHookHandlers::doLinksUpdateComplete`, which deletes all usage entries except those in the ParserOutput of the edit that triggered the LinksUpdate.
Usages recorded by page views that happen after the page save but before the LinksUpdate runs are lost (we initially insert them via `DataUpdateHookHandlers::doParserCacheSaveComplete`, but purge them in our LinksUpdate hook handler later on). That is a problem with the current implementation and will remain one in a new implementation without `eu_touched`.
As far as I can see, we can avoid using eu_touched entirely by making two changes:
# In `DataUpdateHookHandlers::doLinksUpdateComplete`, simply delete all old usages and then insert the new ones (in a real implementation you would obviously compute a diff and only touch the rows you need to).
# Keep letting `AddUsagesForPageJob` insert all new usages it has. To avoid race conditions with `doLinksUpdateComplete`, the job should know the page_touched of the page in question and only insert its rows if that timestamp hasn't changed.
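
The two changes could look roughly like this. Again, this is only an in-memory Python sketch under stated assumptions, not a proposed patch: `replace_usages` and `run_add_usages_job` are hypothetical names, the table is modelled as a set of `(page_id, entity_id, aspect)` tuples, and how the job learns the current page_touched is left open.

```python
def replace_usages(rows, page_id, new_usages):
    """Change 1: diff old against new usages, touching only rows that differ."""
    old = {row for row in rows if row[0] == page_id}
    new = {(page_id, entity_id, aspect) for entity_id, aspect in new_usages}
    to_delete = old - new
    to_insert = new - old
    rows -= to_delete  # in-place update of the shared set
    rows |= to_insert
    return to_delete, to_insert


def run_add_usages_job(rows, page_id, usages, touched_at_enqueue,
                       current_page_touched):
    """Change 2: only insert if the page was not touched since enqueueing."""
    if current_page_touched != touched_at_enqueue:
        # The page changed in the meantime, so these usages may be outdated.
        return False
    rows |= {(page_id, entity_id, aspect) for entity_id, aspect in usages}
    return True


rows = {(1, "Q1", "L"), (1, "Q2", "S"), (2, "Q9", "T")}
deleted, inserted = replace_usages(rows, 1, [("Q1", "L"), ("Q3", "T")])

# Job enqueued and executed while page_touched is unchanged: rows are added.
ok = run_add_usages_job(rows, 1, [("Q4", "L")],
                        "20160102000000", "20160102000000")
# Job whose page was touched again before it ran: nothing is inserted.
skipped = run_add_usages_job(rows, 1, [("Q5", "L")],
                             "20160102000000", "20160103000000")
```

Neither function needs a per-row timestamp: the first relies on the set difference, the second on a single comparison against the page's own touched value.
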