I have been working on a new application, Langviews Analysis, that returns pageviews data for a given article in all languages. For some articles this means 200+ individual queries to the pageviews API. According to the RESTBase documentation, high-volume access is capped at 500 requests per second, yet every once in a while a whole batch of requests still fails with the error "Error in Cassandra table storage backend". I am not making more than, say, 210 requests per second.
My first guess was that this was some kind of automated throttling, so I implemented throttling of my own on the frontend. This tactic lessens the issue but does not prevent it entirely, even if I limit the rate to, say, 5 requests per second (a 200 ms delay).
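For reference, the throttling I describe amounts to roughly the following. This is a minimal sketch in Python, not the actual frontend code: `throttled_map`, `fetch`, and `delay_s` are names made up for illustration, and `fetch` stands in for whatever function performs one pageviews request for one language. It runs the requests one after another, so the rate never exceeds one request per `delay_s` seconds, and it collects failures instead of aborting.

```python
import time
from typing import Callable, Iterable, List, Tuple


def throttled_map(
    items: Iterable[str],
    fetch: Callable[[str], object],
    delay_s: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[Tuple[str, object]], List[Tuple[str, Exception]]]:
    """Call fetch(item) for each item, pausing delay_s between calls.

    Returns (successes, failures) so that failed items can be inspected
    or retried later. `fetch` is a hypothetical per-language request
    function; `sleep` is injectable so the pause can be stubbed out.
    """
    successes: List[Tuple[str, object]] = []
    failures: List[Tuple[str, Exception]] = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_s)  # fixed gap between consecutive requests
        try:
            successes.append((item, fetch(item)))
        except Exception as exc:  # e.g. the Cassandra backend error
            failures.append((item, exc))
    return successes, failures
```

With `delay_s=0.2` this corresponds to the 5 requests per second case; the list of failures is where the Cassandra errors would show up.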
At the time of writing, the main tool at http://tools.wmflabs.org/langviews is throttled with a 100 ms delay between requests, and http://tools.wmflabs.org/langviews-test has no throttling. Try making several queries on the latter for articles with lots of languages, and eventually you will run into the Cassandra error. If the data loads very quickly, that means it has been cached, so try a different article.
Is this error actually caused by throttling, or is it perhaps a general performance issue that could be fixed? If the errors are expected, what can I do to make sure I don't run into them?