Context: Phabricator is still getting hammered to some extent, and that is putting stress on the database (see T109279: Phabricator creates MySQL connection spikes: Attempt to connect to firstname.lastname@example.org failed with error #1040: Too many connections.).
I watched the access logs for a while and noticed quite a few URLs that have no value to search engines. I also suspect that some of the spikes are coming from the Sprint extension (T107197: Sprint extension doesn't scale to thousands of tasks in a single sprint: burndown page exceeds max execution timeout on visual editor project). @chasemp also mentioned to me that Phragile does similarly crazy things to get the data it needs to compute burndown charts.
So I've made a paste (below) and will compile a list of URLs that should be excluded from search spiders. Once the list looks good, we will implement the change in Phabricator's robots.txt and hopefully lessen the impact of search engines on our poor overworked MySQL servers.
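
Before the change is deployed, a draft rule set could be sanity-checked against URLs sampled from the access logs. The following is only a rough sketch using Python's standard `urllib.robotparser`. The `Disallow` paths and the `is_crawlable` helper are hypothetical placeholders for illustration; they are not the actual list, which is still being compiled.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules for illustration only; the real exclusion
# list is still being compiled from the access logs.
DRAFT_ROBOTS_TXT = """\
User-agent: *
Disallow: /project/sprint/
Disallow: /search/
"""


def is_crawlable(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Sample paths are placeholders; in practice they would come from the logs.
    for path in ["/T109279", "/project/sprint/view/1/", "/search/query/abc/"]:
        verdict = "allowed" if is_crawlable(DRAFT_ROBOTS_TXT, path) else "blocked"
        print(f"{path}: {verdict}")
```

Note that the standard-library parser matches rules by simple path prefix, so this only approximates how individual search engines interpret the file.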