Service-based thumbnailing re-architecture on Vagrant
Closed, Resolved (Public)

Description

Given that several of the thumbnailing re-architecture goals are intertwined, it seems like implementing a working solution on a VM will help us progress towards a real solution for production. Not having to make each piece production-ready will help us move faster and check whether all the complicated moving parts work well together. I expect that once a full VM-based solution is built, it will be the basis for one or more RfCs to solve these issues the same way in production.

The likely core service of this architecture would be Thumbor.

Event Timeline

Gilles closed subtask T120207: Thumbor PDF support as Resolved.
Gilles closed subtask T120202: Thumbor DJVU support as Resolved.
Gilles closed subtask T120203: Thumbor TIFF support as Resolved.

I have two questions:

  • Will I be able to use one API to get derivative images (i.e. images flipped/cropped/rotated/scaled) regardless of whether the wiki is using the new service-based architecture or not?
  • Can I use that API now in any way?

And I guess, the implied question:

  • Will the API use ImageMagick if the service is not available, and should that API get built now rather than later (and should I go ahead and do that)?
Gilles added a comment. Feb 4 2016, 7:50 PM

Ultimately the service will indeed be able to be queried directly, without any wiki relationship. It only works on the VM at the moment, though, so I can't point you to anything you can use right away.

Flipping, cropping, rotating, scaling are all supported out of the box (and more!).

You can of course write a MediaWiki API that will then become a shallow proxy to Thumbor when that's available. If you're eager to start working on this approach right away, I think it makes sense to do that, as writing a public transformation API isn't in the scope of my project. I'm just trying to hook it up to our thumbnail generation, even if it can do a lot more than that.

You could indeed use ImageMagick as a fallback; in fact, for JPGs it's highly likely that Thumbor will be using the ImageMagick libraries anyway. If you have an immediate need, I think it would be good long-term value for you to write an API that uses IM, which will then be updated to rely on Thumbor when available. If the API is written after Thumbor is available on beta and production, the extra work to write an IM fallback would become extra-curricular, imho, as the WMF won't have any need for it.

Am I making any sense?

@Gilles only as much as usual. I'll probably wade into writing this API (actually, it's already Kind Of Finished), though it will be encapsulated in ImageTweaks for now. We'll use that API now and update it to support Thumbor in the future. Thanks!

I'm playing around with the Thumbor API locally, and I have a few gripes:

  • All of the operations are included in the path of the URL, not in query string parameters or POSTed data, so URL length limits might apply.
  • Order matters, but Thumbor has an order of operations that it maintains - I can crop and rotate, but I cannot rotate and crop.
  • As far as I can tell, you can only perform each operation once - i.e. I could not do a crop, a rotation, then another crop. (note: we would be limiting the number of operations allowed per request in the API module on the MediaWiki side)
  • In order to flip an image, you must trim or crop it. You can do a null crop, but that requires you to know the dimensions of the image. Pretty nasty.

I don't know how many of these issues will be addressed, but I'm aware that my use-case is not the primary one for the first rollout of Thumbor on our cluster.
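To make the path-based scheme from the list above concrete, here is a minimal sketch of how such a URL path is assembled. It assumes Thumbor's documented unsigned layout as I understand it (`unsafe`, then an optional crop, then the size, then filters, then the image), with a flip expressed as a negative width or height. The helper name `build_thumbor_path` and the file name `example.jpg` are made up for illustration and are not part of Thumbor.

```python
from typing import Optional, Sequence, Tuple


def build_thumbor_path(
    image: str,
    width: int,
    height: int,
    crop: Optional[Tuple[int, int, int, int]] = None,
    flip_horizontal: bool = False,
    flip_vertical: bool = False,
    filters: Sequence[str] = (),
) -> str:
    """Build an unsigned Thumbor-style path (hypothetical helper).

    The segment order is fixed by the URL scheme, which is why the caller
    cannot choose to, say, rotate before cropping.
    """
    parts = ["unsafe"]
    if crop is not None:
        left, top, right, bottom = crop
        parts.append(f"{left}x{top}:{right}x{bottom}")
    # A flip is encoded as a minus sign on the size, so a size is required.
    w = f"-{width}" if flip_horizontal else str(width)
    h = f"-{height}" if flip_vertical else str(height)
    parts.append(f"{w}x{h}")
    if filters:
        parts.append("filters:" + ":".join(filters))
    parts.append(image)
    return "/" + "/".join(parts)


flipped = build_thumbor_path("example.jpg", 800, 600, flip_horizontal=True)
cropped = build_thumbor_path(
    "example.jpg", 200, 150, crop=(10, 20, 410, 320), filters=["rotate(90)"]
)
```

Because every operation becomes another path segment, a long chain of operations makes the URL longer, which is where the length-limit concern comes from.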

I think that's fair feedback for what it can do. The upside is that Thumbor is very easy to extend and the upstream is very welcoming of feedback and PRs. You could report some of those as issues at https://github.com/thumbor/thumbor/issues and, if you missed something, they'll point you to what you can do. If it's a legitimate issue, they or we can address it down the line.

Decoupling the flip is really easy to do: you could just create a flip filter that calls the existing flip operation. This way it could be called without specifying a size, thus maintaining the original size without knowing it. That can be done as an extension without touching Thumbor itself.
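As a rough illustration of that idea, the sketch below shows a filter-style class that only delegates to flip methods on the engine, so no size is ever passed in. `StubEngine` and `FlipFilter` are stand-ins invented here so the example runs without Thumbor installed. In a real extension I would expect the class to subclass `thumbor.filters.BaseFilter` and mark the method with `filter_method`, but treat that wiring as an assumption to check against the Thumbor version in use.

```python
class StubEngine:
    """Stand-in for an imaging engine; pixels are stored as rows of values."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    @property
    def size(self):
        # (width, height), mirroring how imaging engines usually report size
        return (len(self.rows[0]), len(self.rows))

    def flip_horizontally(self):
        self.rows = [row[::-1] for row in self.rows]

    def flip_vertically(self):
        self.rows = self.rows[::-1]


class FlipFilter:
    """Hypothetical filter that reuses the engine's existing flip operations."""

    def __init__(self, engine):
        self.engine = engine

    def flip(self, direction="horizontal"):
        # No width or height argument: the image keeps its original size.
        if direction == "horizontal":
            self.engine.flip_horizontally()
        elif direction == "vertical":
            self.engine.flip_vertically()
        else:
            raise ValueError(f"unknown flip direction: {direction!r}")


engine = StubEngine([[1, 2, 3], [4, 5, 6]])
FlipFilter(engine).flip("horizontal")
```

After the call, each row is mirrored and `engine.size` is unchanged, which is the property the null-crop workaround was trying to achieve.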

It's straightforward to write a POST-based handler that does the same things as the GET-based one. For instance, I'm working on a handler that handles our thumbnail URL scheme (https://phabricator.wikimedia.org/D107). I've defined the get and head Tornado handlers, but post is available as well. I think that might be achievable without touching Thumbor itself.

Regarding the issue of ordering, I think it's because cropping is done as a special operation that's not a filter. I don't know if filters are deduplicated/ordered, so it's worth checking if they are. If they aren't, it might be the same answer as the flip, i.e. creating a cropping filter that reuses the existing cropping code. And if the filters aren't run in order or are deduplicated, then that part is definitely PR material.

This laundry list would probably be 1-2 weeks of work for me (depending on whether we have to submit a PR for a core change). If you need that support at some point, let me know and I'll run that request by Ori.

In fact I think that doing everything might take less time than the sum of its parts, because the ordering/deduping issue could be taken care of in the POST handler and not in Thumbor's default handler, for instance. That means we might be able to take care of everything as extensions, without core changes.
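One way the POST handler could take care of ordering and repetition is to accept an explicit, ordered list of operations in the request body. The sketch below only covers parsing and validating such a body. The JSON shape, the operation names, and the limit of five operations are all assumptions made for illustration, and nothing here is part of Thumbor or Tornado.

```python
import json

# Assumed operation names and limit; a real handler would define its own.
ALLOWED_OPERATIONS = {"crop", "rotate", "flip", "scale"}
MAX_OPERATIONS = 5


def parse_operations(body: bytes, max_operations: int = MAX_OPERATIONS):
    """Return [(name, params), ...] in exactly the order the client sent them."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    operations = payload.get("operations")
    if not isinstance(operations, list):
        raise ValueError("'operations' must be a list")
    if len(operations) > max_operations:
        raise ValueError(f"at most {max_operations} operations are allowed")
    parsed = []
    for entry in operations:
        if not isinstance(entry, dict):
            raise ValueError("each operation must be a JSON object")
        name = entry.get("op")
        if name not in ALLOWED_OPERATIONS:
            raise ValueError(f"unsupported operation: {name!r}")
        params = {key: value for key, value in entry.items() if key != "op"}
        # Repeated operations are kept on purpose (crop, rotate, crop again).
        parsed.append((name, params))
    return parsed


example_body = json.dumps({
    "operations": [
        {"op": "crop", "left": 0, "top": 0, "right": 400, "bottom": 300},
        {"op": "rotate", "angle": 90},
        {"op": "crop", "left": 10, "top": 10, "right": 200, "bottom": 200},
    ]
}).encode("utf-8")
steps = parse_operations(example_body)
```

Since the list is posted rather than encoded in the path, URL length limits no longer apply, and the handler can apply the steps one by one in the order given.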

@Gilles, I'll probably start with the POST handler soon, and I'll let you know if/when I need help with extensions.