Ensure that thumbor uses output functions of tools rather than capturing stdout
Open, Needs Triage, Public

Description

In T334725 we encountered an issue where the output of a tool changed between versions (we also saw a different but very similar problem in T327887). This is unfortunate, but it is entirely to be expected when we rely on tools writing binary output to stdout even though -o or other output-to-file options are available. We currently use a mix of writing to temporary files and capturing tool output.

Wherever possible we should avoid capturing stdout from tools and instead use their canonical output-to-file options. Of course, wherever we do this it is imperative that we clean up any temp files when work is complete, and also when an exception occurs.
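As a rough illustration of the pattern described above, here is a minimal sketch in Python. It is not taken from the Thumbor codebase: the helper name `run_tool_to_file` and the `build_argv` callback are hypothetical, and the demo uses the Python interpreter as a stand-in for a real tool. The point is only that the temp file is removed in a `finally` block, so cleanup happens on success and on every failure path.

```python
import os
import subprocess
import sys
import tempfile


def run_tool_to_file(build_argv, suffix="", timeout=60):
    """Run a tool that writes its result to a file and return the bytes.

    build_argv is a hypothetical callback: it receives the output path and
    returns the full argument list, e.g. lambda out: ["tool", "-o", out, src].
    """
    # mkstemp creates the file securely; close our descriptor right away
    # so the tool can open the path itself.
    fd, out_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        subprocess.run(
            build_argv(out_path),
            check=True,                 # raise on a non-zero exit status
            stdout=subprocess.DEVNULL,  # stdout is deliberately ignored
            stderr=subprocess.PIPE,     # kept for error reporting
            timeout=timeout,
        )
        with open(out_path, "rb") as f:
            return f.read()
    finally:
        # Runs on success, non-zero exit, timeout, or any other exception.
        try:
            os.unlink(out_path)
        except FileNotFoundError:
            pass


if __name__ == "__main__":
    # Stand-in for a real tool: a child Python process that writes to the
    # path it is given instead of to stdout.
    child = "import sys; open(sys.argv[1], 'wb').write(b'ok')"
    data = run_tool_to_file(lambda out: [sys.executable, "-c", child, out])
    print(len(data))
```

Because the tool's stdout is discarded, a change in what a tool prints between versions cannot leak into the result; only the file it was asked to write is read back.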

Event Timeline

Writing to files, even in an in-memory /tmp, is usually slower than writing directly to stdout. This is why upstream Thumbor, and our customizations, avoid it wherever possible.

I'd be interested to see how much slower it would make our operations in general, though. Relying on captured stdout seems brittle (as the linked issues have shown). We already make the tradeoff of storing files on disk rather than in memory, as one of our base differences from upstream Thumbor for scalability purposes. We already write to tempfiles for (at least) vips, svg, some exif operations, stl, thumbnail validity and a few other things.
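One way to start answering the "how much slower" question is a small timing harness. The sketch below is an assumption-laden illustration, not a measurement of Thumbor: both "tools" are stand-in Python child processes, the payload size is arbitrary, and process startup will dominate the numbers, so a real comparison would need to swap in the actual tool invocations.

```python
import os
import subprocess
import sys
import tempfile
import time

PAYLOAD_SIZE = 1_000_000  # bytes the stand-in tool emits (arbitrary)

# Stand-in "tools": the Python interpreter itself, NOT a real Thumbor tool.
TO_STDOUT = "import sys; sys.stdout.buffer.write(b'x' * int(sys.argv[1]))"
TO_FILE = "import sys; open(sys.argv[2], 'wb').write(b'x' * int(sys.argv[1]))"


def via_stdout():
    """Capture the tool's binary output from stdout."""
    result = subprocess.run(
        [sys.executable, "-c", TO_STDOUT, str(PAYLOAD_SIZE)],
        check=True,
        stdout=subprocess.PIPE,
    )
    return result.stdout


def via_tempfile():
    """Have the tool write to a temp file, read it back, then delete it."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        subprocess.run(
            [sys.executable, "-c", TO_FILE, str(PAYLOAD_SIZE), path],
            check=True,
        )
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)


def mean_seconds(func, runs=5):
    """Average wall-clock time of func over a few runs."""
    start = time.perf_counter()
    for _ in range(runs):
        func()
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print(f"stdout capture: {mean_seconds(via_stdout):.4f}s per run")
    print(f"temp file:      {mean_seconds(via_tempfile):.4f}s per run")
```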