Re: comments in robots.txt - bug in RobotRules.pm??
David L. Sifry (david@sifry.com)
Tue, 28 Jan 1997 23:51:19 -0800
Andrew Daviel wrote:
>
> What are people's thoughts on the robot wait time, using HEAD vs. GET,
> etc. ??
> It seems to me that the robot rules were written back in the days
> of single-threaded httpd like NCSA 1.1, and that agents now might not
> unreasonably send a small flurry of requests (like browsers do).
> As I recall, I do HEADs to check timestamps and MIME type and GETs to
> update, and wait longer between GETs than HEADs.
I agree with you. I do a HEAD request before any GET to check MIME type
and timestamp, but it is a little annoying to have to wait after fetching
/robots.txt, and again after the HEAD, before finally doing the GET. I've
been meaning to write a minor patch for LWP that exempts robots.txt and
HEAD requests from the delay, but I've been too busy and got off track.
If I recall correctly, it's just a couple of lines of code.
The key is making sure that you wait after any GET, since those are the
most server-intensive requests.
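
A minimal sketch of that policy, in Python rather than Perl, and not the
LWP patch itself. The class name PoliteDelay, the 60-second default, and
the injectable clock are all assumptions made for illustration. HEAD
requests and the /robots.txt fetch are treated as free; every other
request starts the per-host wait.

```python
import time


class PoliteDelay:
    """Hypothetical per-host delay tracker; not part of LWP."""

    def __init__(self, delay_seconds=60.0, clock=time.monotonic):
        self.delay_seconds = delay_seconds  # assumed default, tune per site
        self.clock = clock                  # injectable so it can be tested
        self.last_counted = {}              # host -> time of last counted request

    def is_free(self, method, path):
        # Assumption: HEAD requests and the robots.txt fetch cost no wait.
        return method.upper() == "HEAD" or path == "/robots.txt"

    def seconds_to_wait(self, host, method, path):
        """How long to sleep before sending this request to this host."""
        if self.is_free(method, path) or host not in self.last_counted:
            return 0.0
        elapsed = self.clock() - self.last_counted[host]
        return max(0.0, self.delay_seconds - elapsed)

    def record(self, host, method, path):
        """Call after a request completes; only counted requests reset the timer."""
        if not self.is_free(method, path):
            self.last_counted[host] = self.clock()
```

A robot would call seconds_to_wait() before each request, sleep for that
long, send the request, and then call record(). With this arrangement the
robots.txt fetch, the HEAD, and the first GET go out back to back, and
the wait only applies before the next GET to the same host.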
Dave
--
Dave Sifry http://www.sifry.com
President, Sifry Consulting (408) 471-0667 (voice)
david@sifry.com (408) 471-0666 (fax)