On the main Wikimedia app cluster we log occasional read and write failures in CDB, causing bug 31576 and other issues.
Analysis of the CDB code and limited testing indicates that a disk full condition may cause a truncated CDB file to be moved into the destination. Most of our app servers have /tmp mounted on a tiny (2GB) partition which is commonly 60-70% full. A spike in disk space usage could cause a transient disk-full condition.
More defensive handling of CDB writes might fix this issue. The return values of the underlying write() and close() syscalls should be checked, and if an error occurs, the temporary file should be deleted without the rename being done, and an exception thrown.
There are problems in many places:
- PHP's dba_close() does not check whether the write of the hashtable or the close() succeeded. This is not easy to fix, since dba_close() just calls the resource destructor, and the resource destructor interface (rsrc_dtor_func_t) does not provide a return value.
- Using CdbWriter_PHP would at least allow us to check for failures during the hashtable write. However fclose() suffers from the same problem as dba_close(): it calls the resource destructor and so cannot provide a meaningful error return. It doesn't even check for failures during the final flush of the internal write buffer, see https://bugs.php.net/bug.php?id=60110
- CdbWriter_DBA::set() just passes through the return value of dba_insert(), it does not throw an exception. Thus most callers will continue despite a disk-full error.
- CdbWriter_PHP::set() throws an exception on a write error, however this will lead to __destruct() being called and thus a truncated CDB file being moved into the destination.
Also, the fact that CdbWriter objects are finalised and closed on __destruct() may lead to an incomplete file being moved into place if some unrelated exception is thrown in the caller during rebuild. Fixing this will mean requiring that callers call CdbWriter::close() before letting the object go out of scope, breaking backwards compatibility for callers like dumpInterwiki.php and extensions/Babel/txt2cdb.php
My recommendations are:
- Add a configuration variable to configure which CdbWriter subclass is used, and make it CdbWriter_PHP by default. It looks like it makes sense to use the C reader but the PHP writer.
- Fix the exception and error handling in both the DBA and PHP versions of CdbWriter.
- Encourage the PHP devs to do something about resource destructor error propagation.
- Increase the size of the /tmp partition on the app servers.