Messages should be cached in process.
As a starting point I would suggest a TTL of 1 minute. Assuming 1 request/second and only 1 instance, each 60-second window has 1 miss and 59 hits, which gives us a hit rate of around 59/60 ≈ 98% and reduces the number of requests to the API by a factor of 60. At the same time, 1 minute is an acceptable delay for new messages to show up.
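To make the idea concrete, here is a minimal sketch of an in-process cache with a 60-second TTL. It is only an illustration under assumptions: the class name `TTLCache`, the `fetch` callback standing in for the API call, and the injectable `clock` are all hypothetical and not taken from any existing code. It is also not thread-safe as written.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Hold one cached value in process memory and refresh it after a TTL.

    Hypothetical helper: `fetch` stands in for the real API call.
    """

    def __init__(
        self,
        fetch: Callable[[], T],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None  # None means nothing cached yet

    def get(self) -> T:
        now = self._clock()
        # Miss: nothing cached yet, or the cached value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

With one request per second against a single instance, only the first request in each 60-second window reaches `fetch`; the other 59 are served from memory.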
After talking with the serviceops people on IRC, it sounds like having only 1 instance is likely, but we should be prepared for it to scale up slightly under load if needed. We would expect ca. 1-3 instances (probably 1), not 10-50.
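Because the cache is per process, every instance refreshes on its own, so the instance count directly affects the hit rate. The sketch below works through that arithmetic. It assumes total traffic stays at 1 request/second and is spread evenly across instances; the function name `expected_hit_rate` is made up for illustration.

```python
def expected_hit_rate(
    requests_per_second: float, ttl_seconds: float, instances: int
) -> float:
    """Estimate the overall cache hit rate with one in-process cache per instance.

    Assumption: traffic is spread evenly, so each instance misses once per
    TTL window and serves the rest of its share from memory.
    """
    requests_per_window = requests_per_second * ttl_seconds
    # There cannot be more misses than requests in the window.
    misses_per_window = min(instances, requests_per_window)
    return 1 - misses_per_window / requests_per_window


if __name__ == "__main__":
    for n in (1, 3, 10, 50):
        print(f"{n} instance(s): {expected_hit_rate(1, 60, n):.1%}")
```

Under these assumptions, 1 instance gives about 98% and 3 instances still give 95%, while 10 instances would drop to about 83% and 50 to about 17%. That is why a 1 minute TTL looks fine for the expected 1-3 instances but would need revisiting at 10-50.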