
Create batch access interface for BlobStore
Closed, Resolved | Public

Description

BlobStore does not provide a way to access data in bulk. Code that relied on the 'text' option of getQueryInfo() for bulk access to data blobs no longer works, because the new (MCR) schema has no rev_text_id to go by.

As a replacement, we need a way to retrieve a set of data blobs without querying the text table once per blob.

NOTE: at the moment, ExternalStore doesn't offer a bulk interface. That could be added as well, but that would be outside the scope of this ticket. This ticket is about avoiding queries against individual rows of the text table. A later iteration could further optimize to also batch queries to the external store databases.

Draft:
Introduce BlobStore::getBlobBatch( $blobAddresses, $queryFlags = 0 ): string[] as the batch analog to BlobStore::getBlob. Note that the return value should be associative, indexed by blob address.
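
To make the proposed return shape concrete, here is a minimal sketch in Python rather than MediaWiki's PHP. The names `get_blob`, `get_blob_batch`, and the in-memory `BLOBS` dict are hypothetical stand-ins, the `tt:<id>` address form is assumed, and `$queryFlags` is left out. This naive version simply loops over the single-blob call, so it illustrates only the contract (a mapping indexed by blob address), not the query optimization this ticket asks for.

```python
from typing import Dict, Iterable

# Hypothetical in-memory stand-in for stored blobs, keyed by blob address.
BLOBS: Dict[str, str] = {"tt:1": "first", "tt:2": "second"}


def get_blob(address: str) -> str:
    """Stand-in for the single-blob accessor (BlobStore::getBlob in the ticket)."""
    return BLOBS[address]


def get_blob_batch(addresses: Iterable[str]) -> Dict[str, str]:
    """Return blob contents as a mapping keyed by blob address."""
    return {address: get_blob(address) for address in addresses}
```

Callers look results up by address instead of relying on the order in which addresses were passed in.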

Implementation ideas:

  • BlobStore::getBlobBatch will have to first see which blobs are already cached, and then only fetch (and then cache) the uncached ones.
  • BlobStore::fetchBlob should probably be rewritten to support fetching multiple blobs at once.
  • BlobStore::expandBlob will have to be called for each blob individually for now.
  • In the future, when we have multiple different BlobStore implementations, the top-level "dispatching" BlobStore would have to divide the batch by address schema, and forward each sub-batch to the appropriate BlobStore for that schema. For now though, we can just fail if any of the addresses has a schema different from "tt".
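
The ideas above could fit together roughly as follows. This is a sketch in Python under stated assumptions, not the merged MediaWiki implementation: `BatchBlobStore`, `fetch_many`, and `_expand` are hypothetical names, `fetch_many` stands in for a single bulk query against the text table, addresses are assumed to look like `tt:<id>`, and omitting blobs that the backend does not return is a choice made here for illustration.

```python
from typing import Callable, Dict, Iterable, List


class BatchBlobStore:
    """Hypothetical sketch of a batch-capable blob store; not MediaWiki code."""

    def __init__(self, fetch_many: Callable[[List[str]], Dict[str, str]]):
        # fetch_many stands in for one bulk query against the text table.
        self._fetch_many = fetch_many
        self._cache: Dict[str, str] = {}

    @staticmethod
    def _expand(raw: str) -> str:
        # Placeholder for per-blob expansion (BlobStore::expandBlob in the ticket).
        return raw

    def get_blob_batch(self, addresses: Iterable[str]) -> Dict[str, str]:
        addresses = list(addresses)
        # For now, fail on any address whose schema is not "tt".
        for address in addresses:
            schema, sep, _ = address.partition(":")
            if not sep or schema != "tt":
                raise ValueError(f"unsupported blob address: {address}")

        # Only addresses missing from the cache go to the backend, once each.
        missing = [a for a in dict.fromkeys(addresses) if a not in self._cache]
        if missing:
            for address, raw in self._fetch_many(missing).items():
                # Expand each blob individually, then cache it.
                self._cache[address] = self._expand(raw)

        # Associative result indexed by blob address; blobs the backend did
        # not return are omitted (an assumption of this sketch).
        return {a: self._cache[a] for a in addresses if a in self._cache}
```

A second call that overlaps with an earlier one sends only the uncached addresses to `fetch_many`, which is the behavior the first bullet describes.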

Related Objects

Status    Assigned
Resolved  None
Resolved  Zabe
Resolved  daniel
Resolved  CCicalese_WMF
Resolved  daniel
Resolved  None
Resolved  None
Resolved  CCicalese_WMF
Resolved  CCicalese_WMF
Resolved  daniel
Resolved  Pchelolo
Resolved  daniel
Resolved  BPirkle
Resolved  Pchelolo
Resolved  Pchelolo
Resolved  Pchelolo
Resolved  daniel
Resolved  daniel
Resolved  None

Event Timeline

Change 532449 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/core@master] Introduce BlobStore::getBlobBatch method.

https://gerrit.wikimedia.org/r/532449

Change 532449 merged by Daniel Kinzler:
[mediawiki/core@master] Introduce BlobStore::getBlobBatch method

https://gerrit.wikimedia.org/r/532449