
Repeated deletes from objectcache table for expiry
Closed, ResolvedPublic

Description

Author: zigger

Description:
When the objectcache table is used, there are redundant deletions (for expiry)
within db connection sessions, as well as across them.

The following patch avoids repeating the deletion within a db connection session.
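As a rough illustration of the idea (not the attached ObjectCache.php patch itself), the sketch below remembers per connection whether the expiry DELETE has already run and skips it on later calls. The class name, the `expired_purged` flag, the `purge_count` counter and the table schema are all assumptions made for this example, and SQLite stands in for the real database.

```python
import sqlite3
import time


class TableCache:
    """Toy cache over an objectcache-style table (hypothetical schema)."""

    def __init__(self, conn):
        self.conn = conn
        self.expired_purged = False  # True once this connection has purged
        self.purge_count = 0         # DELETE statements issued, for illustration
        conn.execute(
            "CREATE TABLE IF NOT EXISTS objectcache "
            "(keyname TEXT PRIMARY KEY, value TEXT, exptime REAL)"
        )

    def _purge_expired_once(self, now):
        # Skip the DELETE if this connection session already ran it.
        if self.expired_purged:
            return
        self.conn.execute("DELETE FROM objectcache WHERE exptime < ?", (now,))
        self.expired_purged = True
        self.purge_count += 1

    def set(self, key, value, ttl, now=None):
        now = time.time() if now is None else now
        self._purge_expired_once(now)
        self.conn.execute(
            "INSERT OR REPLACE INTO objectcache (keyname, value, exptime) "
            "VALUES (?, ?, ?)",
            (key, value, now + ttl),
        )

    def get(self, key, now=None):
        now = time.time() if now is None else now
        self._purge_expired_once(now)
        row = self.conn.execute(
            "SELECT value FROM objectcache WHERE keyname = ?", (key,)
        ).fetchone()
        return row[0] if row else None
```

In this sketch, rows that expire after the first purge stay readable until some other session deletes them, which is acceptable only if sessions are short-lived.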


Version: 1.4.x
Severity: enhancement

Details

Reference
bz1431

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 8:12 PM
bzimport set Reference to bz1431.
bzimport added a subscriber: Unknown Object (MLST).

zigger wrote:

REL1_4 patch for ObjectCache.php

Attached:

zigger wrote:

The previous patch also applies to HEAD.

I wonder if it might be better to delete only as an occasional garbage collection step, and do an expiration check explicitly on load.
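One way to read this suggestion, sketched in Python with SQLite as a stand-in database: the lookup filters on the expiry time itself, so expired rows are never returned even if they are still in the table, and deletion becomes a separate step that can run rarely. The function names and the table schema are assumptions for illustration, not MediaWiki code.

```python
import sqlite3
import time


def make_table(conn):
    # Hypothetical objectcache-style schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objectcache "
        "(keyname TEXT PRIMARY KEY, value TEXT, exptime REAL)"
    )


def cache_set(conn, key, value, ttl, now=None):
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO objectcache (keyname, value, exptime) "
        "VALUES (?, ?, ?)",
        (key, value, now + ttl),
    )


def cache_get(conn, key, now=None):
    # Expiry is checked on load; no DELETE is issued here.
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT value FROM objectcache WHERE keyname = ? AND exptime >= ?",
        (key, now),
    ).fetchone()
    return row[0] if row else None


def garbage_collect(conn, now=None):
    # Occasional cleanup step; returns the number of rows removed.
    now = time.time() if now is None else now
    cur = conn.execute("DELETE FROM objectcache WHERE exptime < ?", (now,))
    return cur.rowcount
```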

zigger wrote:

(In reply to comment #3)
I had wondered the same thing, but I could not think of a way to schedule
garbage collection across all requests without adding to the database I/O or
adding another process. Adding an expiry check to the lookups may help future
optimisations, though probably at the cost of another index.

There are other inefficiencies in the objectcache table mode, but as the first
patch immediately halves the deletions for expiry in local tests, it seems
useful to include in v1.4 before the next release. AFAIK, the objectcache table
is not used on the Wikimedia sites.

The simple way of scheduling garbage collection in this environment is to
use a probability trigger with a random number. The recentchanges old rows
purge works this way: on roughly 1 in 1000 edits it'll do the purge.
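A minimal Python sketch of such a probability trigger, assuming a caller-supplied cleanup function. The name `maybe_garbage_collect` and its parameters are hypothetical; only the 1-in-1000 rate comes from the comment above.

```python
import random


def maybe_garbage_collect(collect, probability=0.001, rng=random):
    """Call collect() on roughly `probability` of invocations.

    Returns True if collect() ran on this call, False otherwise.
    """
    # rng.random() is uniform on [0.0, 1.0), so this branch is taken
    # with the given probability, with no shared state between requests.
    if rng.random() < probability:
        collect()
        return True
    return False
```

Each request decides independently, so no extra table, lock or scheduler process is needed; the cost is that the interval between purges is only approximately controlled.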

And yes, objectcache is pretty suck; I wrote it as a quick hack for when
memcached is not available so it's not super. :)

Applied a slightly altered version of the patch (separated garbageCollect() logic from expireall() action).