We have too many different cache objects, configuration variables, methods and factories for various types of cache interfaces.
### Entry points
A few of the things we use right now:
* `wfGetCache( "hash" )`, `new HashBagOStuff();`.
* `wfGetCache( CACHE_ANYTHING )`, `ObjectCache::newAnything()`.
* `ObjectCache::newAccelerator( fallback )`.
* `wfGetMainCache()`, `ObjectCache::getMainClusterInstance()`.
* `ObjectCache::getMainWANInstance()`.
* `ObjectCache::getMainStashInstance()`.
* `ObjectCache::getInstance()`.
* `ObjectCache::getWANInstance()`.
* and more.
* variations on the above due to back-compat aliases.
We should reduce this to a handful that we use everywhere.
With the multi-datacenter work (T88445) we're also changing which cache developers should use by default: one should now default to the WAN cache instead of the local-DC cache. It may make sense to also redefine "main cache" to mean the WAN cache, to make this shift in convention clear.
I propose we standardise on the following four main entry points:
* Server cache or Process cache (e.g. APC, fallback to none, or hash.)
* Local cluster cache (e.g. Memcached).
* WAN cache (e.g. Memcached, with relayed purges and other improvements).
* Main stash (e.g. db-replicated).
With two extra entry points for special cases:
* Identified cache group (e.g. `ObjectCache::getGroup( "language-converter" )`).
* Identified cache backend (should be unused, but kept for back-compat, e.g. `ObjectCache::getInstance( CACHE_DB );`).
Reduced interface:
```lang=php
class ObjectCache {
    // Takes a key from $wgObjectCaches.
    static function getInstance( string $type );
    // Takes a key from $wgObjectCacheGroups; groups without an entry default to the main cache.
    static function getGroup( string $groupName );
    // Get cache object for in-memory values on the current server (APC, fallback to in-process hash).
    static function newAccelerator( $fallback = "hash" );
    // Get cache object for basic caching in the local DC (aka wgMemc, or "main cache").
    static function getMainClusterInstance();
    // Get cache object for the main cache in the local DC that is multi-DC aware (relayed purges, and other improvements).
    static function getMainWANInstance();
    // Get the replicated store.
    static function getMainStashInstance();
}
```
Refactored interface (stretch goal, methods renamed from the above):
```lang=php
class ObjectCache {
    static function getInstance( string $type );
    static function getGroup( string $groupName );
    static function newAccelerator();
    static function getLocalClusterCache();
    static function getMainCache();
    static function getMainStash();
}
```
Action items:
* {icon check color=green} Don't require custom accelerator fallback. Set a default one (empty or hash) and deprecate the fallback parameter.
* {icon check color=green} Give CACHE_ACCEL sensible default so it doesn't fail on plain installs.
* {icon check color=green} Reduce ObjectCache entry points to just WAN, LocalServer (APC), LocalCluster (Main) and stash.
* [ ] Deprecate wfGetCache(), wfGetMainCache().
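A deprecation shim for the last item could look roughly like the sketch below. It is a minimal standalone illustration: `ObjectCacheStub` and `wfGetCacheShim()` are hypothetical names standing in for `ObjectCache` and `wfGetCache()`, and PHP's built-in `trigger_error()` with `E_USER_DEPRECATED` stands in for whatever deprecation helper core would actually use.

```lang=php
<?php
// Stand-in for ObjectCache so the sketch runs outside MediaWiki (hypothetical).
final class ObjectCacheStub {
    public static function getInstance( string $type ): string {
        // The real method would return a cache object; a label is enough here.
        return "cache:$type";
    }
}

// Hypothetical shim: warn the caller, then delegate to the remaining entry point.
function wfGetCacheShim( string $type ): string {
    trigger_error(
        'wfGetCache() is deprecated, use ObjectCache::getInstance()',
        E_USER_DEPRECATED
    );
    return ObjectCacheStub::getInstance( $type );
}
```

Existing callers keep working while the warning points them at the entry point that remains.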
### Configuration
Current configuration variables:
* `$wgObjectCaches = array( .. )`
* `$wgWANObjectCaches = array( .. )`
* `$wgMainCacheType`
* `$wgMainWANCache`
* `$wgMainStash`
* `$wgMessageCacheType`, `$wgParserCacheType`, `$wgSessionCacheType`, `$wgLanguageConverterCacheType` (maybe introduce some kind of cache grouping to allow overriding backends for individual cache groups, defaulting to main cache).
Action items:
* [ ] Deprecate cache group config vars (`$wgMessageCacheType`, `$wgParserCacheType` etc.) in favour of e.g. `$wgObjectCacheGroups`.
* [ ] Implement `ObjectCache::getGroup( $name )`, defaulting to `$wgMainCacheType`.
* [ ] Deprecate `wfGetMessageCacheStorage()`, `wfGetParserCacheStorage()`.
Once we have `$wgObjectCacheGroups`, it'll be less complicated to introduce new cache groups. It would also be easier to move things around in wmf-production and re-use caches, without having to add more `wg*CacheType` variables and ad-hoc instantiation of BagOStuff objects.
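To make the idea concrete, here is a rough sketch of how a group lookup might resolve a backend. `$wgObjectCacheGroups` and `getGroup()` are only proposed above and do not exist yet; the helper name `resolveCacheGroupType()`, the group names, and the string backend labels are assumptions for illustration.

```lang=php
<?php
/**
 * Hypothetical resolver: map a cache group name to a backend type,
 * falling back to the main cache type when the group has no override.
 */
function resolveCacheGroupType( string $groupName, array $cacheGroups, string $mainCacheType ): string {
    return $cacheGroups[$groupName] ?? $mainCacheType;
}

// Illustrative configuration; only groups that differ from the main cache are listed.
$exampleGroups = [
    'parser' => 'db',
    'session' => 'memcached',
];

$parserType = resolveCacheGroupType( 'parser', $exampleGroups, 'memcached' );
$converterType = resolveCacheGroupType( 'language-converter', $exampleGroups, 'memcached' );
```

Here `parser` resolves to its override, while `language-converter` has no entry and falls through to the main cache type, so a new group would need no new `wg*CacheType` variable.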
### Deprecate "Anything"
The concept of `CACHE_ANYTHING` is primarily used to cache something somewhere even if the "main" cache has not been configured (e.g. is still CACHE_NONE per the default settings). Typically this means that something is sufficiently expensive to compute that we don't mind using the database to store it, since all cache groups default to CACHE_NONE and `ObjectCache::newAnything` ultimately falls back to CACHE_DB.
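The fallback described above can be sketched as follows. This is a simplified standalone model, not the actual `ObjectCache::newAnything()` code: the function name `pickAnythingType()`, the string labels, and the list of candidate settings passed in are assumptions.

```lang=php
<?php
/**
 * Simplified model of the "anything" fallback: use the first configured
 * cache type that is not "none", otherwise fall back to the database.
 */
function pickAnythingType( array $configuredTypes ): string {
    foreach ( $configuredTypes as $type ) {
        if ( $type !== 'none' ) {
            return $type;
        }
    }
    return 'db';
}

// Stock install: everything defaults to "none", so values end up in the database.
$stock = pickAnythingType( [ 'none', 'none', 'none' ] );
// Configured install: the first usable cache wins.
$configured = pickAnythingType( [ 'none', 'memcached', 'none' ] );
```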
This status quo is based on the very old assumption that most keys are "as expensive or less expensive" to generate than a database query, which is why the default is CACHE_NONE, not CACHE_DB.
I'd like to challenge this assumption and instead recommend that anything cheap enough that it shouldn't be stored in the database use APC instead, falling back to nothing.
Additionally, CACHE_ANYTHING predates WANObjectCache, which means it isn't multi-DC aware.
| | Expensive | Not expensive | Really not expensive
|--|--|--|--
| **Current** | CACHE_ANYTHING | Main cache (default: None, typically Memcached) | Local server (APC if available, fallback: None)
| **Proposed** | Main cache | Local server or Main cache (case by case) | Local server
Action items:
* [ ] Deprecate the concept of CACHE_ANYTHING. Convert uses to WANObjectCache or local server cache.
* [ ] Change the default main cache in stock MediaWiki from CACHE_NONE to CACHE_DB.
* [ ] (Later) Remove `ObjectCache::newAnything()`, `CACHE_ANYTHING` etc.