ArchCom RFC Meeting W45: Image Thumbnail API (2016-11-09, #wikimedia-office)

Hosted by daniel on Nov 9 2016, 10:00 PM - 10:00 PM.



  • Location: #wikimedia-office IRC channel
  • Meeting type: TBD
  • Time: Weekly, Wednesday 21:00 UTC (2pm PDT, 23:00 CEST)
    • This time is the standard as of this writing (in March 2016), but may change as we make adjustments to accommodate daylight savings/summer time adjustments for the participants. By default, we intend for this meeting to be held Wednesday 2pm San Francisco time.
  • Topic: The first half of T66214: Define an official thumb API, without the bits about hashing. RFC to be revised or split before the meeting.

Recurring Event

Event Series
This event is an instance of E66: ArchCom RFC Meeting Wxx: <topic TBD> (<see "Starts" field>, #wikimedia-office), and repeats every week.

Event Timeline

daniel renamed this event from ArchCom RFC Meeting Wxx: <topic TBD> (<see "Starts" field>, #wikimedia-office) to ArchCom RFC Meeting W45: Image Thumbnail API (2016-11-09, #wikimedia-office). Nov 2 2016, 9:11 PM
daniel updated the event description.
daniel changed the host of this event from RobLa-WMF to daniel.

Meeting log:

09:01:25 <TimStarling> #startmeeting RFC meeting: Define an official thumb API
09:01:42 <brion> \o/
09:02:26 <TimStarling> no meetbot
09:02:33 <brion> poop
09:03:31 <TimStarling> and the instructions at https://wikitech.wikimedia.org/wiki/Tool:Meetbot don't work for me, it just closes the ssh connection immediately after login
09:05:24 <gilles> become meetbot > "You are not a member of the group tools.meetbot."
09:05:35 <TimStarling> who are the key participants in this meeting? AaronSchulz?
09:06:13 <tgr> gwicke I suppose?
09:06:16 <brion> i had some interest but starting with wanting to see what updates came along
09:06:41 <gwicke> hey
09:06:58 <TimStarling> anomie previously commented but is marked away
09:07:04 <tgr> someone from the mobile apps or MCS
09:07:21 <gwicke> so, we repeatedly run into the issue of having no API for requesting thumbnails in a client-selected size or quality
09:07:50 <gwicke> this need comes up in the context of apps, lazy image loading, and API design
09:08:06 <gwicke> for example, lots of references to page thumbs currently provide a set of fixed thumb sizes
09:08:28 * bearND is lurking
09:08:31 <gwicke> and every so often another use case needs a different size, which then triggers API changes
09:08:54 <gwicke> the RFC describes requirements and some options for implementing this
09:09:29 <gilles> the source of the problem is that you can't request arbitrary sizes, otherwise it would just be a matter of changing the size in the url you already have
09:09:34 <gilles> that's not solved in the current proposal
09:09:44 <gwicke> an important requirement is to preserve cache efficiency, which basically means that we don't want to fragment caches significantly
09:09:52 <brion> this is https://phabricator.wikimedia.org/T66214 for anyone without the link handy
09:10:04 <TimStarling> why can't you request arbitrary sizes?
09:10:10 <tgr> I think there are two scenarios worth differentiating: 1 you have a file name and want to get a thumbnail with given parameters 2 you have a thumbnail URL and want to get a thumbnail of the same image but with different parameters
09:10:16 <gilles> because if it's bigger than the original size, you get an error back
09:10:24 <tgr> as I understand no one is really interested in 1
09:10:35 <gilles> and you'd have to keep requesting smaller ones until you get a thumbnail that actually exists
09:10:35 <TimStarling> I mean, I probably wrote the code which caused that limitation, but surely it's just a couple of lines of code to fix if that's desirable
09:10:43 <brion> well 1 is a case of 2 where you get the filename by extracting it from the url
09:10:44 <gwicke> the status quo is that thumb APIs are private information, and explicitly not an API
09:11:01 <brion> returning an image on requested-larger is an actionable change which simplifies some of these interactions, yes
09:11:02 <gwicke> so you can't construct random URLs unless you are prepared for your app to break any time
09:11:06 <DanielK_WMDE_> tgr: would be nice for the cache if a request by name would redirect to a request by hash
09:11:13 <gilles> it's just not documented, I don't know where you're getting the information that it doesn't try to be used as an API
09:11:27 <gilles> everybody can request it and we've always allowed hotlinking and the like
09:11:45 <TimStarling> I've generally encouraged people to use it as an API
09:11:49 <gwicke> de facto it works, but there is no guarantee that it will continue to work
09:11:58 <tgr> I think our current thumbnail URL schema is a de facto API, ugly as it is
09:12:00 <gwicke> and it is not designed as an API
09:12:01 <brion> yes, practically we've avoided making major changes to the url structure because they *do* get used ad-hoc as an api :)
09:12:14 <brion> it just is poorly documented and has weird edge cases
09:12:17 <tgr> it's used in the iOS app, MCS, MediaViewer...
09:12:38 <gwicke> MCS is not actually making up random urls
09:12:40 <DanielK_WMDE_> the old schema could redir to the new schema
09:12:47 <tgr> we could as well as make it official, document it and get rid of the size limitations
09:12:47 <gilles> having recently written a parser for all its cases, it's not as bad as it first appears
09:12:56 * DanielK_WMDE_ likes redirects
09:12:58 <TimStarling> what do you want it to do when scaling up is requested?
09:13:16 <gilles> redirects cause an extra round trip, they're terrible for people with high latency
09:13:20 <tgr> just return the original
09:13:25 <brion> may be better to return the original yeah
09:13:29 <gilles> we're talking about several requests per page on average
09:13:32 <TimStarling> ok, so let's make that change
09:13:34 <brion> i like the idea of a redirect but the latency sucks
09:13:41 <tgr> or a thumbnail with the dimensions of the original rather
09:13:44 <TimStarling> will it break anything to make it return the original?
09:13:46 <gwicke> so one option is to re-define the current format as an api
09:13:49 <gilles> so no, permanent redirect isn't a viable migration option unless temporary
09:14:03 <tgr> TimStarling: it currently returns an error so not likely
09:14:28 <DanielK_WMDE_> gilles: sure, when we generate html, we should use a url that doesn't need a redirect. but redirects are still useful for discoverability, and to keep compat, or provide aliases (file names instead of hashes, for example)
09:14:32 <gilles> make that change, as in redirect to the original in case of requesting a size too large?
09:14:34 <gwicke> parsing and serialization would require custom code
09:14:58 <gwicke> would this work outside the WMF environment?
09:15:16 <TimStarling> no, make the change as in stream out the original file with a 200 response code
09:15:18 <tgr> (people use far more horrible hacks to get thumbnail URLs, like Special: pages)
09:15:40 <TimStarling> yeah if you want a redirect we do have a special page for that
09:15:47 <gilles> TimStarling: that's creating inbounded cache fragmentation, where currently it was at least limited by how wide the image is
09:15:48 <gwicke> the other main option is to consider this as an API design problem, and see if we can do better than the current format
09:15:52 <TimStarling> thumb_handler.php is more for streaming
09:15:53 <gilles> *unbounded
09:16:08 <gwicke> and standardize on an official API that is a bit more designed, and easier to use
09:16:27 * hashar has quit (Remote host closed the connection)
09:16:33 <brion> for instance, would it be beneficial to standardize how parameters are passed?
09:16:45 <brion> language for svg
09:16:50 <brion> lossless/lossy for tiff
09:16:54 <brion> page for pdf, tiff, djvu
09:16:59 <brion> time for video thumbnail
09:17:13 <gwicke> I personally think that the gains from a decent API are large enough to warrant cleaning this up once
09:17:36 <gilles> what gains?
09:17:42 <brion> and what about adding more parameters for future rich media types? coordinates for prerendered maps?
09:18:00 <tgr> I think the current API is ugly but usable (with a few small fixes)
09:18:04 <DanielK_WMDE_> i like hash based urls for files. and we can have the old urls redirect to the new ones for compat.
09:18:05 <gwicke> any API should be extensible
09:18:08 <brion> ability to set camera position for 3d model renderings or panoramas?
09:18:09 <gilles> so far the main version proposed has qualities that are shared with the existing url scheme (low cache fragmentation, strict format, etc.)
09:18:39 <gwicke> gilles: ease of parsing and construction is definitely different
09:18:52 <gwicke> the current URL format is ad-hoc, and not used anywhere else
09:18:57 <gilles> gwicke: as I've mentioned before, show me code that proves that what you propose is easier to parse. it's not
09:19:12 <gwicke> compared to query strings for example, it's a lot more code to parse it
09:19:17 <quiddity> (Not sure if relevant) Possibly this relates to https://www.mediawiki.org/wiki/Extension:PageImages ? (IIUC that's used to determine the page_image property which is then used by hovercards, action=info, and mobile search)
09:19:34 <gilles> 4 different characters for separation, logic about some parameters needing to be in some place, etc. it's just as complicated to parse. it just looks nicer at a glance when you don't think about the code involved to parse it
09:19:35 <brion> i kinda like query strings for params, they're standard already :)
09:19:37 <tgr> current de facto API format is 1) break up by '/', get the last component 2) get the first match for <number>px 3) break up the part before that by '-', those are the extra parameters
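[Editor's sketch] The three steps tgr lists can be written out as below. This is a minimal sketch, not MediaWiki's actual parser: the function name and the example host are hypothetical, and it assumes the last path component really does contain the <number>px marker, which per the exchange that follows may not hold for very long file names.

```python
import re
from urllib.parse import urlsplit, unquote

_WIDTH = re.compile(r"(\d+)px")


def parse_thumb_url(url):
    """Return (extra_params, width, remainder) for a thumbnail URL."""
    # 1) break up the path by '/' and take the last component
    last = unquote(urlsplit(url).path).split("/")[-1]
    # 2) take the first match for <number>px
    match = _WIDTH.search(last)
    if match is None:
        raise ValueError("no <number>px component in %r" % last)
    # 3) break up the part before the width by '-': those are the extras
    extras = [part for part in last[:match.start()].split("-") if part]
    remainder = last[match.end():]
    if remainder.startswith("-"):
        remainder = remainder[1:]
    return extras, int(match.group(1)), remainder


if __name__ == "__main__":
    # Hypothetical host; the parameter prefix is the one quoted in the log.
    print(parse_thumb_url(
        "https://upload.example.org/thumb/a/a9/Example.pdf/"
        "qlow-page1-123px-Example.pdf.jpg"
    ))
```

Note that the extras come back as opaque strings such as "qlow" or "page1"; turning them into named parameters is the part gwicke asks about next.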
09:19:40 <gwicke> gilles: you can easily try it yourself
09:19:49 <TimStarling> the existing URL format is obviously extensible, considering that we have extended it to support arbitrary parameters
09:19:52 <brion> tgr: last component fails if filename is very long
09:19:59 <DanielK_WMDE_> gwicke: why no .png etc file extension on the urls?
09:20:10 <gwicke> tgr: how do you parse the extra parameters?
09:20:12 <gwicke> how are they constructed?
09:20:16 <gilles> gwicke: you're the one making a case, do the work. if you're not going to bother doing that I have absolutely no faith that you will be ready for the huge undertaking that the migration involves
09:20:17 <gwicke> what is their order?
09:20:18 <tgr> brion: don't think so, it just won't contain the file name
09:20:34 <DanielK_WMDE_> gwicke: alphabetical
09:20:46 <gwicke> gilles: I trust that you have parsed a query string before
09:21:06 <tgr> gwicke: it's unsorted, we could sort with minimal B/C break
09:21:32 <gilles> gwicke: there have been libraries for that in any decent languages for decades
09:21:38 <tgr> and just have prefix-based ownership which is also the current de facto standard
09:21:56 <gilles> built-in functions, even
09:22:16 <tgr> pageXXX, qXXXX etc
09:22:18 <gwicke> gilles: for mediawiki thumbnail syntax? I don't think so.
09:22:28 <gwicke> and that's my point
09:22:40 <brion> are query string parameters an acceptable thing for 404-handler setup? what are the practical issues of getting something like that into production versus encoding settings into the filename?
09:22:43 <gilles> you just said "query string" please be more accurate in your statement
09:22:52 <DanielK_WMDE_> to me, the biggest argument for changing the thumbnail api is to move to hash based urls.
09:22:58 <gilles> that's not the same thing as mediawiki thumbnail url
09:23:00 <DanielK_WMDE_> the rest is cosmetics
09:23:06 <gwicke> gilles: as you say, one format has built-in support, the other does not
09:23:07 <TimStarling> the existing parameter order is defined by makeParamString() in the relevant MediaHandler subclass
09:23:18 <gilles> if the issue is lack of libraries for the existing format, that's very easy to resolve
09:23:30 <tgr> more importantly, most clients don't care about the part before the size, they just want to change sizes, so we could alter that with limited B/C breakage
09:23:44 <brion> DanielK_WMDE_: benefit of hash-based urls being.... versioning? cache sharing for duplicates? fixed length? all good bits i agree :)
09:24:01 <tgr> DanielK_WMDE_: that's kind of an orthogonal subject
09:24:12 <tgr> just replace filename with hash and you are done
09:24:14 <TimStarling> the event page on phabricator said that we weren't going to talk about hashing
09:24:18 <DanielK_WMDE_> brion: yes :)
09:24:26 <gwicke> DanielK_WMDE_: I decoupled the two proposals, as the move to hash-based image names will take a bit longer, and can be done in a second phase
09:24:26 <TimStarling> unless I misunderstood it
09:24:33 <gilles> DanielK_WMDE_: including the hash in the thumbnail url is already implemented in core, I wrote that while investigating hash-based cache busting a year ago. the vagrant thumbor role uses that option
09:24:34 <brion> ah yes there was talk of separating the hash issue
09:24:52 <gilles> the second instance of the filename in the thumbnail url is simply replaced by the original's sha1
09:24:57 <gwicke> the RFC is explicit about this
09:25:01 <DanielK_WMDE_> tgr: true. but if we have a compelling reason to change the url format, it makes more sense to discuss niceties of parameter passing. Without the move to hash based urls, there is no pressing need, imho
09:25:06 <gilles> imho, for that reason and many others, it's a feature need decouple from a url scheme overhaul
09:25:10 <gilles> decoupled
09:26:00 <brion> the most immediately pressing need is the 'given a thumbnail, request same thumbnail in different size' case i think
09:26:04 <gwicke> so, to be clear, is anybody offering to develop the current thumb syntax in a stable & documented API?
09:26:20 <brion> which is improved if we make the single change of letting original files through for too-large requests
09:26:39 <DanielK_WMDE_> my point is: if we *don't* want to go to hashes, messing with the current thumb "api" is probably not worth the pain. but if we change the format anyway, we should make it nice.
09:26:41 <gilles> brion: but that's possible in the current URI scheme. and the new scheme doesn't solve the width > original, which as we've mentioned earlier can be solved by a redirect that doesn't require touching the URI scheme
09:26:47 <gwicke> it sounds like some here favor that solution, but I'm not sure if anybody would be willing to take it on
09:27:02 <brion> an additional need, i think important, is to create image urls 'from whole cloth' during editing, parsing, plugin magic, etc
09:27:11 <brion> gilles: agreed
09:27:27 <gilles> DanielK_WMDE_: moving to hash-based doesn't require a URI scheme overhaul
09:27:33 <tgr> gwicke: I can come up with an RfC if we decide to prefer that route, I really don't think it's all that complex
09:27:47 <brion> this add'l need will get worse when we add new media types, assuming we do (panoramas, finishing the 3d support, etc)
09:27:55 <TimStarling> tgr: doesn't really need an RFC if you are describing the current situation, not proposing a change
09:28:10 <gwicke> I primarily care about having a sane API some time soon
09:28:17 <TimStarling> just write about it on mediawiki.org
09:28:25 <brion> will adding more parameters on the existing schema re-complicate things that need to deal with media and images?
09:28:39 <gwicke> and I care less about the exact syntax
09:28:40 <gilles> brion: the question is, do we go through a painful migration before those needs materialize, or do we leave things as-is with the potential implication that new ideas get shot down because it's too hard to do just for the sake of panoramas, for instance
09:28:43 <DanielK_WMDE_> gilles: no, but doing both at the same time may be easier than doing one without the other. and more useful, too
09:29:02 <tgr> the cost of radical changes to the API is that 1) all media handler extensions need to be rewritten (which would be about time, the way they work is terrible) 2) all clients need to be rewritten, and I don't know if we have a grasp on the size of that
09:29:07 <brion> gilles: how painful a migration do we expect?
09:29:18 <DanielK_WMDE_> gwicke: what's insane about the current api? besides "ugly"
09:29:38 <tgr> maybe the mobile apps and the media viewer are the only ones that actually try thumbnail URL guessing currently, in which case no big deal
09:29:56 <brion> tgr: 2) may or may not come with caveat "existing clients may or may not be correct under existing schema" :)
09:29:58 <brion> yeah
09:30:05 <gilles> brion: code to adapt in mediawiki, extensions, VCL, Swift's rewrite.py, probably miscellaneous puppet, thumbor, apps, restbase. you name it
09:30:06 <DanielK_WMDE_> tgr: i have written toolserver/labs tools that do that
09:30:16 <brion> *nod*
09:30:22 <gwicke> key-value maps can encode a lot of things, so I'm not too worried about future parameter passing needs
09:30:45 <brion> and i'm still concerned about parameters mapping onto low-level backing files
09:30:48 <gwicke> it also seems that the current options map pretty directly to key-value maps
09:30:52 <tgr> DanielK_WMDE_: main problem with the current API beyond ugliness is that parameters are completely ad hoc
09:32:07 <TimStarling> the current API is meant to be compact and human-readable
09:32:13 <DanielK_WMDE_> ok, we can sort the parameters, and slowly standardize them. use redirects for compat
09:32:19 <tgr> the way it works internally is that the pre-filename-part of the URL (e.g. qlow-page1-123px) is passed through the MediaHandler inheritance chain and any handler is free to do any kind of processing
09:32:28 <tgr> append, regex-parse, whatever
09:32:30 <TimStarling> which I think maps to insane and ugly from a programmer's perspective
09:32:33 <gilles> I'm in favor of a key-value format, preferably one that already exists for the sake of available tooling (which is why I brought up the classic ?&= URL format convention). the issues to solve at the Varnish level and client best practices don't add much work to the migration that needs to happen anyway for a URI overhaul to happen
09:32:44 <gwicke> I personally think that cleaning this up will only get more expensive over time, especially once we encourage users to rely on this format
09:32:59 <brion> yes, the arbitrary filename adjustments are painful and mean we have to duplicate knowledge of special structures in multiple places
09:33:07 <tgr> so yeah, a key-value format would be a significant increase in sanity
09:33:23 <DanielK_WMDE_> +1
09:34:10 <TimStarling> with plain old URL query strings you can't really mandate a sort order
09:34:13 <brion> caching: we can prefer a canonical order for key-value maps, but will it be honored consistently? what about human usability?
09:34:19 <TimStarling> although I guess you can rewrite it in varnish
09:34:21 <tgr> OTOH we could go with /thumb/Filename/key:val-key:val-width:123px-Filename for example which is similar enough of the current scheme that most tools would not notice the difference
09:34:21 <DanielK_WMDE_> ...should we support the same key/value format in the file link syntax, then?...
09:34:59 <brion> DanielK_WMDE_: that does raise the related question of how to specify available keywords
09:35:06 <gwicke> tgr: downside is that it's still custom
09:35:10 <gilles> TimStarling: you can encourage it. it's only problematic if it's inconsistent from the client, for the client's own cache's sake. at the varnish level we would normalize it. so that a random order of the sake parameter values would hit the same cache entry
09:35:14 <brion> the file link syntax doesn't distinguish between parameter options and caption text at the syntax level
09:35:21 <gilles> *of the same
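[Editor's sketch] The normalization gilles describes, where any ordering of the same parameters maps to one cache entry, amounts to sorting the query string before it is used as a cache key. In the discussion this would happen at the Varnish level; the Python below only illustrates the idea. The function name, the host, and the parameter names "page" and "width" are assumptions, not part of any proposal in the log.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_cache_key(url):
    """Rewrite a URL so its query parameters appear in sorted order."""
    parts = urlsplit(url)
    # Sort the (key, value) pairs; keep blank values so nothing is dropped.
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


if __name__ == "__main__":
    a = canonical_cache_key("https://example.org/thumb/Foo.pdf?width=320&page=2")
    b = canonical_cache_key("https://example.org/thumb/Foo.pdf?page=2&width=320")
    print(a == b, a)
```

A client that always emits the sorted form also benefits its own local cache, which is the "encourage it" half of gilles' point.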
09:35:30 <gwicke> I have personally warmed up more & more to just using query strings
09:35:47 <brion> query strings _strike me_ as the right thing, i just am cautious :D
09:36:04 <TimStarling> we do already have a key/value API, that's what imageinfo uses
09:36:18 <gilles> I'm happy with that idea because the cache fragmentation seems solvable at the varnish level. and we need to fix it for our other API calls which are also query string-based
09:36:19 <tgr> gwicke: yes. I think it would be easier to evaluate trade-offs if we had more of an idea of what clients have built-in knowledge about the current URL schema and to what extent
09:36:38 <TimStarling> and we already have thumb.php which streams out files with parameters specified in the query string
09:37:21 <bearND> The apps and MCS try to change thumbnail widths downwards via regex
09:37:22 <gilles> thumb.php's scheme is not extensible, that's the main problem
09:37:26 <gwicke> tgr: yeah, I think that's a good point that we should record as a follow-up
09:37:32 <brion> indeed, we can use the existing param names used internally...
09:37:34 <gilles> not without making it even more awkward anyway
09:38:00 <bearND> we try to stick to certain bucket sizes: 320px, 640px, 800px, 1024px
09:38:02 <brion> extensibility is going to be important though, in a way that's as transparent as possible
09:38:14 <brion> to the code parsing through things
09:38:51 <bearND> but that's really for width since that's the only thing that can be changed through URL manipulation
09:39:00 <tgr> bucket sizes (thumbnail rendering speed) is a whole different can of worms, let's keep that separate IMO
09:39:22 <bearND> sure, was just trying to answer the question what the apps and MCS use
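[Editor's sketch] What bearND describes, rewriting the width in an existing thumbnail URL downwards to one of a few bucket sizes, might look roughly like this. The bucket list is the one quoted in the log; the selection policy (smallest bucket that covers the wanted width, never above the current width), the function names, and the example host are assumptions rather than the apps' actual code.

```python
import re

BUCKETS = (320, 640, 800, 1024)  # sizes quoted in the log
_WIDTH_RE = re.compile(r"(\d+)px-")


def pick_bucket(wanted, current):
    """Smallest bucket >= wanted that does not exceed the current width."""
    candidates = [b for b in BUCKETS if b <= current]
    for bucket in candidates:
        if bucket >= wanted:
            return bucket
    # Nothing covers the wanted width: use the largest allowed bucket,
    # or keep the current width if even the smallest bucket is too big.
    return candidates[-1] if candidates else current


def resize_down(url, wanted):
    """Rewrite the <number>px- marker in the last path component."""
    head, sep, last = url.rpartition("/")
    match = _WIDTH_RE.search(last)
    if match is None:
        return url  # no width marker, nothing we can safely change
    new_width = pick_bucket(wanted, int(match.group(1)))
    return head + sep + last[:match.start(1)] + str(new_width) + last[match.end(1):]


if __name__ == "__main__":
    print(resize_down(
        "https://upload.example.org/thumb/a/a9/Example.jpg/1024px-Example.jpg", 300
    ))
```

Because the rewrite only ever goes downwards from a URL the server already handed out, it sidesteps the error for widths larger than the original that gilles raised earlier.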
09:39:27 <gwicke> okay, so it sounds like there is some amount of support for considering moving to query strings, with the main caveat being that we need to gauge the cost by figuring out how many clients rely on the current syntax
09:40:13 <gilles> I will say one thing on that topic, though, which is that we intend to study the distribution of sizes again with filippo, to determine whether we can move away from storing all thumbnails forever in swift to storing only the most requested formats (de facto buckets based on actual use). if we find that the long tail being cut that way is significant in terms of storage size
09:40:15 <brion> ok so just summarizing a couple things. 1) broad agreement(?) on letting orig file through on requesting oversized thumb. 2) jury still out on whether to use query strings for params, but lots of interest. 3) extensible parameters are important, but need to know more about other params that might be used
09:40:24 <gwicke> we also need to estimate how much change would be needed at the media handler layer
09:40:32 <brion> yes
09:40:52 <brion> media handlers mostly take the key-value pairs
09:41:09 <brion> so i think not huge
09:41:14 <tgr> brion, for 1) the people who would disagree are probably in the editor community, not the developer one
09:41:16 <gilles> the implication going forward with the issue of fragmentation is that we'll have a class of thumbnails that are more likely to be misses when they go out of varnish, so on average lower performance, when requesting exotic parameters
09:41:17 <TimStarling> also tgr is going to write a spec of the existing situation
09:41:37 <gilles> but the investigation to see if that's worth doing hasn't happened yet
09:41:52 <tgr> media handlers use regex parsing, not key-value pairs
09:42:03 <tgr> but fixing that would be time well spent IMO
09:42:27 <brion> tgr: when generating a thumbnail from parameters they use key-value pairs.
09:42:40 <brion> tgr: when extracting those parameters from URLs they use regexes
09:42:42 <TimStarling> yeah, maybe we can pull out a centralised way of parsing URLs and feeding structured data to media handlers
09:43:10 <brion> and then extracting those parameters from [[File:foo=bar]] they use magic word regex chunks
09:43:13 <brion> *when
09:43:20 <tgr> yeah, they transform between a key-value hash and a string
09:44:02 <tgr> so just replacing that with putting key-value in the query is indeed easy
09:44:14 <brion> so i'm very interested in making some stuff happen on this :) anything i can help with on the media handlers end?
09:44:28 <TimStarling> maybe thumb.php should give a Content-Disposition header with the human-readable filename
09:44:47 <gilles> making this work on vagrant without varnish (a.k.a. "small wikis") would be a nice first step
09:45:03 <brion> ah yes, the no-varnish question :D
09:45:14 <brion> would this require a thumb.php-like intermediary for them?
09:45:15 <TimStarling> you know we do that already to work around filename length limits in swift
09:45:26 <gilles> varnish works on vagrant, if you feel adventurous in the vmod side of the issue
09:45:46 <tgr> brion, re: replacing original with thumbnail, there is some discussion on the dedicated task, people use originals embedded in articles to demonstrate technical concepts
09:45:52 <tgr> color spaces and whatnot
09:46:01 <tgr> which get removed when thumbnailing
09:46:26 <brion> tgr: that bears investigation, reminds me of occasional requests to run videos at a specific resolution etc. need to think about a solution there.
09:46:58 <gilles> as for the redirect to the original, I just remembered that it's a bad idea for EXIF rotation, which we apply and strip on thumbnails. what we really need is an original-sized thumbnail
09:46:59 <brion> an option to force fixed-original size would sometimes be useful
09:47:09 <brion> hehe yeah
09:47:12 <brion> good point :D
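[Editor's sketch] The behaviour the discussion converges on for oversized requests is a thumbnail rendered at the original's dimensions rather than an error or the untouched original, so that thumbnail-time processing such as EXIF rotation still applies. The width decision itself reduces to a clamp, shown here with a hypothetical function name.

```python
def effective_thumb_width(requested, original_width):
    """Clamp a requested width so oversized requests do not fail."""
    if requested <= 0:
        raise ValueError("width must be positive")
    # Never scale up: an oversized request gets an original-sized thumbnail.
    return min(requested, original_width)


if __name__ == "__main__":
    print(effective_thumb_width(5000, 1200))
    print(effective_thumb_width(320, 1200))
```

Clamping also bounds cache fragmentation only if the clamped width, not the requested one, is what the cache keys on; otherwise every oversized width becomes its own entry, which is the concern gilles raised earlier.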
09:47:15 <gwicke> brion, tgr, timstarling: in the first step, would you be primarily interested in moving the key-value parsing of the existing syntax into a single step & then pass key-value pairs to the individual handlers?
09:47:38 <brion> gwicke: sounds about right
09:48:04 <brion> how would we handle migration? varnish magic?
09:48:14 <gwicke> so this would be a prep step that would make it easy to support query strings
09:48:34 <TimStarling> gwicke: I think that would be desirable, I don't really have a good sense of how feasible it is
09:48:39 <tgr> gwicke: isn't that how it works now with MediaHandler::parseParamString ?
09:49:07 <gilles> brion: varnish magic is needed to avoid doubling the thumbnails stored, yes. rewriting both concurrent schemes to the same URI varnish bases its caching on
09:49:11 <TimStarling> currently parseParamString() is spread out in lots of little bits, in extensions and core
09:49:33 <TimStarling> in theory we could instead have a single grammar for it, which is parsed in core
09:49:34 <brion> we could have a single back-compat param string parser that handles all known existing options
09:49:35 <gwicke> yeah, there are lots of individual implementations
09:49:41 <gilles> presumably rewriting the new scheme to the old one, since that's what existing entries are stored by
09:50:06 <TimStarling> the output of parsing that grammar might not be the actual b/c key/value pairs, that is the bit that might not be feasible
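[Editor's sketch] A single, centrally parsed grammar for the existing parameter string could be as small as a prefix table plus the width marker. The table below is hypothetical: it borrows the "page" and "q" prefixes mentioned in the log and adds "lang" for illustration, and it is not the set of parameters real media handlers accept. As TimStarling cautions, the key/value pairs such a parser yields may not match what each handler produces today.

```python
import re

# Hypothetical prefix table: token prefix -> key name.
PREFIXES = {"page": "page", "q": "quality", "lang": "lang"}
_WIDTH_TOKEN = re.compile(r"(\d+)px")


def parse_param_string(param_string):
    """Turn e.g. 'qlow-page1-123px' into a key/value dict."""
    params = {}
    for token in param_string.split("-"):
        width = _WIDTH_TOKEN.fullmatch(token)
        if width:
            params["width"] = int(width.group(1))
            continue
        for prefix, key in PREFIXES.items():
            if token.startswith(prefix) and len(token) > len(prefix):
                params[key] = token[len(prefix):]
                break
        else:
            raise ValueError("unknown thumbnail parameter: %r" % token)
    return params


def make_param_string(params):
    """Serialize with extras sorted by key and the width last."""
    inverse = {key: prefix for prefix, key in PREFIXES.items()}
    extras = sorted(key for key in params if key != "width")
    tokens = [inverse[key] + str(params[key]) for key in extras]
    tokens.append("%dpx" % params["width"])
    return "-".join(tokens)


if __name__ == "__main__":
    parsed = parse_param_string("qlow-page1-123px")
    print(parsed, make_param_string(parsed))
```

One known limit of this sketch: values that themselves contain '-' (a language code with a script subtag, say) would be split apart, which is one reason a standard key/value encoding such as a query string is attractive.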
09:50:12 <tgr> what would be the point of rewriting how a soon-to-be-deprecated syntax gets parsed?
09:50:56 <brion> well, would we still need it for back-compat / migration in updater?
09:50:57 <gwicke> re varnish migration: embedding complex parameter rewriting in varnish sounds fairly ugly
09:51:22 <gilles> what do you propose? doubling the hardware for the varnish caches we have?
09:51:30 <tgr> brion: yes, but no need to touch the code for that
09:51:40 <gwicke> we really don't want to duplicate the thumbs themselves, but maybe we could afford some duplication of metadata (headers)
09:51:59 <brion> tgr: well, depends whether we want to keep the old piecemeal extension bits or consolidate it into one bit of code that lives in the updater
09:52:18 <tgr> gwicke: as in, cache redirects and resolve them internally in varnish?
09:52:29 <gwicke> tgr: yeah, something like that
09:52:39 <gwicke> would have to talk to bblack on that
09:52:47 <brion> could VCL distinguish between the 99% of thumbs with simple syntax and the "other cases"?
09:52:52 <tgr> that would probably be useful on a more general level as well
09:52:53 <brion> then only double storage on the "others"?
09:53:09 <gwicke> brion: yeah, that sounds like a good idea as well
09:53:19 <brion> in case full k/v pairs are too awkward
09:53:29 <gwicke> most thumbs don't have any other parameters
09:54:55 <gilles> I think that the discussions about varnish need to involve the traffic team, we have to guess too much about what's possible or practical
09:54:59 <gwicke> so, for next steps..
09:55:18 <gwicke> 1) look into which users rely on the current thumb format
09:56:14 <gwicke> 2) investigate effort needed to clean up the mediahandler parameter parsing
09:56:39 <gwicke> 3) discuss possible migration strategies with the traffic team
09:56:56 <gwicke> and then reconvene with the results?
09:57:00 <bearND> The apps/MCS get their initial thumbnail URLs from mobileview, Parsoid, or the RB /page/summary endpoint.
09:57:02 <bearND> We can't control when old versions of the Android app get updated. Can we use some sort of flag in the APIs which include thumbnail URLs for a transition period? If we don't find the /(\d+)px- regex then the older Android app versions won't be able to request a different size. Maybe something akin to formatversion=2 for action=mobileview? Content-type
09:57:02 <bearND> versioning for RB
09:57:02 * tgr is now known as tgr|away
09:58:08 <brion> bearND: good point! needs to be considered in migration
09:58:38 <brion> (it's also conceivable we could keep the XYZpx- prefix as the lone exception to k/v pairs for other options)
09:58:49 <TimStarling> ok, I guess we're pretty much done?
09:59:01 <brion> sounds like!
09:59:12 <bearND> brion: that sounds like an interesting option
09:59:20 <gwicke> TimStarling, tgr, brion, gilles, DanielK_WMDE_: does the summary of next steps make sense to you?
09:59:21 <gilles> keeping some of the ugliness for nostalgia's sake, right?
09:59:27 <gilles> it does
09:59:28 <brion> gilles: hehehe
10:00:04 <gwicke> cool, thank you all for the discussion!
10:00:20 <TimStarling> I will copy the IRC log to somewhere
10:00:23 <DanielK_WMDE_> sounds good to me
10:00:34 <TimStarling> do you want it on the task or the event?
10:00:37 <brion> gwicke: i'll do some general looking over media handlers this week, will include the parameter handling in my logs
10:00:44 <bearND> Thank you guys! This is a really pressing topic for the apps and MCS
10:00:46 <TimStarling> looks like it has previously been on the event
10:00:56 <brion> TimStarling: probably the event yeah
