I'm working on the WikiFactMine project. We do some of our fact extraction with a tool called canary, which loads papers into an Elasticsearch index; we then issue a range of queries against that index to extract 'facts' and store those facts in another index. At the moment we probably put our ES server under quite a lot of load, but if the cluster doesn't already have some kind of throttling built in, we could temper this on our side so that we don't slow down whatever else is indexed there. We currently use three indices: one for paper bodies, one for facts and one for paper metadata.
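To give an idea of what tempering the load on our side could look like, here is a minimal sketch of client-side rate limiting around the query-then-store loop. It is not how canary currently works: `run_throttled`, the `search` and `store` callables and the `max_per_second` limit are all hypothetical names for illustration, and the real Elasticsearch calls are left out so that the pacing logic stands on its own.

```python
import time
from typing import Callable, Dict, Iterable, List


def run_throttled(
    queries: Iterable[str],
    search: Callable[[str], List[str]],
    store: Callable[[Dict[str, str]], None],
    max_per_second: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run each query and store every hit as a fact, starting at most
    `max_per_second` queries per second.

    `search` stands in for a query against the paper-body index and
    `store` for a write to the facts index (both are assumptions, not
    canary's real interface).
    """
    min_interval = 1.0 / max_per_second
    last_start = None
    stored = 0
    for query in queries:
        if last_start is not None:
            # Wait until at least min_interval has passed since the
            # previous query was started.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        for hit in search(query):
            store({"query": query, "match": hit})
            stored += 1
    return stored
```

In practice `search` and `store` would wrap whatever client the tool uses, and the limit could be set to whatever the cluster admins are comfortable with.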
We have been running this on a server elsewhere, but it's currently having some hiccups; it would also be nice to have the open access material running on labs or tool-labs to help with the longevity of the project. I thought about requesting a labs project to run both the tool and Elasticsearch on, but wanted to check first whether it could be done on tool-labs. It looks like it may be possible to run on tool-labs alone, although canary might have to be altered to run on the grid; if that turns out to be too difficult, we may still need to request a labs project.