I was checking a client's logs and realized that the Yahoo Slurp spider was eating more than a gigabyte of bandwidth per day. That's far too much. I saw 25 Slurp spiders, all coming from different IP addresses, and they all crawled the same pages, sometimes one right after another, which is baffling. Yahoo must be trying to crawl and reindex the entire internet. I added the following to my robots.txt file, and after a couple of days it finally seems to be slowing down.
User-agent: Slurp
Crawl-delay: 60
I set the number relatively high because the content doesn't change often.
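If you want to check how a well-behaved crawler would interpret that rule, Python's standard-library `urllib.robotparser` can parse it directly. This is just a sketch with the robots.txt content inlined as a string for illustration; a real crawler would fetch the file from the site.

```python
import urllib.robotparser

# The robots.txt rules above, inlined for illustration
rules = """\
User-agent: Slurp
Crawl-delay: 60
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A compliant crawler should wait this many seconds between requests
delay = rp.crawl_delay("Slurp")
print(delay)  # 60
```

Note that Crawl-delay is honored by Slurp but was never part of the original robots.txt standard, so not every bot respects it.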
Another way to cut the bandwidth is to block the robots from crawling directories that normal users don't see or need, such as your admin, includes, and other private directories.
User-agent: Slurp
Disallow: /cgi-bin/
Disallow: /ask
Disallow: /admin
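You can verify the Disallow rules the same way with `urllib.robotparser`. A minimal sketch, again with the rules inlined and with hypothetical example paths:

```python
import urllib.robotparser

# The Disallow rules above, inlined for illustration
rules = """\
User-agent: Slurp
Disallow: /cgi-bin/
Disallow: /ask
Disallow: /admin
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Slurp should skip the blocked directories but may still fetch public pages
print(rp.can_fetch("Slurp", "/admin/users"))   # False
print(rp.can_fetch("Slurp", "/articles/1"))    # True
```

Disallow matches by prefix, so `/admin` also blocks everything under it.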
Yahoo's own recommendation is: "If you do feel that a crawl-delay is necessary, use small values to avoid blocking Slurp discovery and refresh of your key content."