Most use cases for the generic POST storage developed in T105975 are basically about shortening URLs to fit comfortably within the ~2083-character limit supported by IE. Many of those use cases would actually exceed that limit only rarely.
While POST storage works, it has some significant disadvantages:
- Each re-render needs to POST the same data to RESTBase.
- POST data can't be freed, short of a very elaborate dependency tracking scheme.
- Using it adds significant complexity to end points.
It would thus be worthwhile to avoid POST storage for applications where we can bring down URL sizes through other means.
One such technique is to use compression. Naive compression on short strings does not achieve significant savings, as the amount of repetition is relatively small, and the compressor dictionary is fully embedded in the output. However, many URLs contain very predictable data, which can be leveraged to improve compression by pre-initializing the compressor dictionary with a string containing the most common request data tokens for this end point.
Simple example, using nodejs:
var zlib = require('zlib');

var options = {
    level: 9,
    // In real applications, this would be a bigger sample of typical
    // request data for a specific end point.
    dictionary: new Buffer('https://en.wikipedia.org/w/api.php?action=query&format=json&prop=extracts|pageimages|revisions&formatversion=2&redirects=true&exintro=true&exsentences=5&explaintext=true&piprop=thumbnail&pithumbsize=600&rvprop=timestamp&titles=')
};

// Note: URL differs from dictionary string (size and title)
var urlToCompress = 'https://en.wikipedia.org/w/api.php?action=query&format=json&prop=extracts|pageimages|revisions&formatversion=2&redirects=true&exintro=true&exsentences=5&explaintext=true&piprop=thumbnail&pithumbsize=640&rvprop=timestamp&titles=Master_System';

zlib.deflateSync(urlToCompress, options).toString('base64')
// 'ePlS/1mMyxgmvjTB7UvfxOKS1KL44EoglQsA6g9dvA=='

// Same, but using modified base64 to avoid percent encoding:
// https://en.wikipedia.org/wiki/Base64#URL_applications
zlib.deflateSync(urlToCompress, options).toString('base64')
    .replace(/[/=+]/g, function(x) { return {'/': '_', '+': '-', '=': '.'}[x]; })
// 'ePlS_1mMyxgmvjTB7UvfxOKS1KL44EoglQsA6g9dvA..'
In this (admittedly somewhat optimistic) example, we achieved a compression from 240 bytes to 44 bytes, or a ratio of 18.3%. While other use cases won't match substrings as long as those in this example, many will still see significant savings simply by avoiding percent encoding overhead.
Zlib is widely available on many platforms. The main consideration is that producers and consumers will need to use the same dictionary string to compress URLs for a given entry point. This introduces a tight coupling of configurations, but arguably this coupling isn't any tighter than that induced by POST storage.
Figuring out whether this works for a given entry point
To evaluate whether the compression ratios achievable are good enough for a given entry point, we'll need to construct a dictionary string for that entry point. Given some representative example data, tools like dictator can be used to compute a near-optimal dictionary string (Example usage: go run dictator.go 32768 /path/to/testdata-dir /tmp/test.dict). Alternatively, a manual approximation will work, too. Then, we'll need to test compression ratios for a representative sample of requests, focusing especially on the largest expected requests.
@Yurik, @Physikerwelt: Does this sound like an interesting experiment to you?