As part of T95501, it was decided to run LinksUpdates in a loop of commitAndWaitForReplication() calls in order to avoid replication lag. This means that if any server in the cluster is lagged, even one that carries only 0.01% of the load, waitForReplication() will time out and throw an exception after a hard-coded 10 seconds. Unlike when the same problem occurs in jobs, there is no possibility of the commit being retried, so the links will just be wrong forever.
A possible fix: after a short timeout of, say, 1 second, ignore the timeout and continue with the next transaction. This will still throttle the update, and thus limit the impact of update jobs on replication lag, but it will allow the update to complete.
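The control flow of that proposal can be sketched roughly as follows. This is an illustrative Python model only, not MediaWiki code: `run_batches`, `ReplicationTimeout`, and the `wait_for_replication` callback are hypothetical names invented for the sketch, and the 1-second default simply mirrors the figure suggested above.

```python
from typing import Callable, Iterable


class ReplicationTimeout(Exception):
    """Hypothetical error raised when replicas do not catch up in time."""


def run_batches(
    batches: Iterable[Callable[[], None]],
    wait_for_replication: Callable[[float], None],
    timeout_s: float = 1.0,
) -> int:
    """Commit every batch in order and return how many waits timed out."""
    timeouts = 0
    for commit_batch in batches:
        # The write is committed before we wait, so it is never discarded.
        commit_batch()
        try:
            wait_for_replication(timeout_s)
        except ReplicationTimeout:
            # Replicas are still lagged. The wait itself already throttled
            # us by up to timeout_s, so continue instead of aborting.
            timeouts += 1
    return timeouts


if __name__ == "__main__":
    committed = []

    def always_lagged(timeout_s: float) -> None:
        # Stand-in for a cluster whose replicas never catch up.
        raise ReplicationTimeout()

    work = [lambda i=i: committed.append(i) for i in range(3)]
    timed_out = run_batches(work, always_lagged, timeout_s=0.01)
    print(committed, timed_out)
```

The point of the sketch is that a timeout only costs time: every batch is still committed, and the number of timed-out waits could be logged rather than surfaced as an exception.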
I don't think it's acceptable to throw an exception and discard the update, even if all slaves are lagged. Read-only mode is a better mechanism for offloading the DB cluster in this case. By dumping outstanding work into the binlog at whatever rate and then switching to read-only mode for a few minutes, we would at least preserve the consistency of the links tables.
I noticed these exceptions while investigating T198049.