I'm running a site with a lot of content, but little traffic, on a middle-of-the-road dedicated server.
Occasionally, Googlebot will stampede us, resulting in Apache maxing out its memory, and causing the server to crash.
How can I avoid this?
Solution
Register your site with Google Webmaster Tools, verify it, and throttle Googlebot's crawl rate down.
Submit a sitemap (a minimal sitemap generator sketch follows this list).
Read the Google webmaster guidelines, in particular the part about the If-Modified-Since HTTP header (see the conditional-response sketch after this list).
Use robots.txt to restrict the bot's access to some parts of the website.
Write a script that changes robots.txt every $[period of time], so the bot is never able to crawl too many pages at the same time while still being able to crawl all of the content overall (a rotation sketch is shown below).
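For the sitemap point, here is a minimal sketch of generating a basic sitemap.xml. The `content/` directory and the `https://example.com` domain are placeholders, not anything from your setup; adapt the walk to however your pages are actually stored.

```python
# Minimal sitemap.xml generator (sketch). The content directory and
# base URL below are hypothetical placeholders.
import os
from xml.sax.saxutils import escape

CONTENT_DIR = "content"            # hypothetical directory of HTML pages
BASE_URL = "https://example.com"   # placeholder domain

def build_sitemap(content_dir=CONTENT_DIR, base_url=BASE_URL):
    entries = []
    for root, _dirs, files in os.walk(content_dir):
        for name in files:
            if not name.endswith(".html"):
                continue
            rel = os.path.relpath(os.path.join(root, name), content_dir)
            url = f"{base_url}/{rel.replace(os.sep, '/')}"
            entries.append(f"  <url><loc>{escape(url)}</loc></url>")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>\n"
    )

if __name__ == "__main__":
    with open("sitemap.xml", "w") as f:
        f.write(build_sitemap())
```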
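For If-Modified-Since, the point is that unchanged pages can be answered with a tiny 304 response instead of a full render. A sketch of the idea as a WSGI app, assuming static files under a hypothetical document root (in practice you might get the same effect from Apache's own handling of static files or from your CMS):

```python
# Sketch: honor If-Modified-Since so the bot gets cheap 304 responses
# for unchanged pages. DOC_ROOT is a placeholder.
import os
from email.utils import formatdate, parsedate_to_datetime

DOC_ROOT = "/var/www/html"  # placeholder document root

def app(environ, start_response):
    rel = environ.get("PATH_INFO", "/").lstrip("/") or "index.html"
    path = os.path.join(DOC_ROOT, rel)
    try:
        mtime = int(os.path.getmtime(path))
    except OSError:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

    ims = environ.get("HTTP_IF_MODIFIED_SINCE")
    if ims:
        try:
            if mtime <= parsedate_to_datetime(ims).timestamp():
                # Unchanged since the bot's last visit: no body, almost
                # no memory or CPU spent on this request.
                start_response("304 Not Modified", [])
                return [b""]
        except (TypeError, ValueError):
            pass  # unparsable header: fall through to a full response

    with open(path, "rb") as f:
        body = f.read()
    start_response("200 OK", [
        ("Content-Type", "text/html"),
        ("Last-Modified", formatdate(mtime, usegmt=True)),
    ])
    return [body]
```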
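And for the rotating robots.txt idea, a sketch of a script you could run from cron at whatever interval you pick for $[period of time]. Each run allows crawling of only one slice of the site and disallows the rest, so the bot never hits everything at once but covers all content over successive runs. The section directories and the robots.txt path are hypothetical.

```python
# Sketch of rotating robots.txt: only one section is crawlable per run.
import time

SECTIONS = ["archive", "articles", "forum", "gallery"]  # hypothetical site areas
ROBOTS_PATH = "/var/www/html/robots.txt"                # placeholder path

def write_robots(path=ROBOTS_PATH, sections=SECTIONS):
    # Derive the "open" section from the current hour so rotation needs no state.
    open_section = sections[int(time.time() // 3600) % len(sections)]
    lines = ["User-agent: *", "Crawl-delay: 10"]  # Googlebot ignores Crawl-delay, but other bots honor it
    for section in sections:
        if section != open_section:
            lines.append(f"Disallow: /{section}/")
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    write_robots()
```

Keep in mind that Googlebot caches robots.txt for a while, so a very short rotation period will not be respected precisely; the Webmaster Tools crawl-rate setting from the first step remains the more direct throttle.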