When a DjVu file contains a text layer, we currently extract it and store it in the file's metadata blob, so that it is available to extensions such as ProofreadPage.
Unfortunately this *massively* increases the size of the file object, which holds the uncompressed serialized metadata blob in memory. That leads to errors like T32751, where an API request runs out of memory when loading a bunch of file objects at once.
In addition, it's a bit awkward to access the text from other places. Things like search indexing (T8421) would benefit from a more standard place to get at extracted text, and the same mechanism could also be used for other file formats.
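
As a rough illustration of the idea only, the sketch below keeps the extracted text out of the per-file metadata and fetches it on demand. It is written in Python for brevity rather than as MediaWiki code, and `TextStore`, `FileRecord` and their methods are hypothetical names, not existing classes or APIs.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional


class TextStore:
    """Hypothetical side store mapping a file key to its extracted text."""

    def __init__(self) -> None:
        self._texts: Dict[str, str] = {}
        self.reads = 0  # number of lookups, to show that loading is lazy

    def put(self, key: str, text: str) -> None:
        self._texts[key] = text

    def get(self, key: str) -> Optional[str]:
        self.reads += 1
        return self._texts.get(key)


@dataclass
class FileRecord:
    """Hypothetical file object: small metadata inline, text fetched on demand."""

    key: str
    metadata: Dict[str, int]
    store: TextStore

    def metadata_size(self) -> int:
        # Size of the serialized metadata, which no longer includes the text.
        return len(json.dumps(self.metadata))

    def extracted_text(self) -> Optional[str]:
        # Only callers that actually need the text pay for loading it.
        return self.store.get(self.key)


store = TextStore()
store.put("Book.djvu", "lorem ipsum " * 10_000)

# Loading many file objects at once touches only the small metadata.
records = [
    FileRecord("Book.djvu", {"width": 2550, "height": 3300, "pages": 400}, store)
    for _ in range(50)
]
```

With this split, a consumer such as a proofreading extension or a search indexer would call `extracted_text()` explicitly, while bulk listings would never load the text at all.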
Version: 1.20.x
Severity: normal