Re: comments in robots.txt - bug in RobotRules.pm??

David L. Sifry (david@sifry.com)
Tue, 28 Jan 1997 23:51:19 -0800


Andrew Daviel wrote:
> 
> What are people's thoughts on the robot wait time, using HEAD vs. GET,
> etc.?
> It seems to me that the robot rules were written back in the days
> of single-threaded httpd like NCSA 1.1, and that agents now might not
> unreasonably send a small flurry of requests (like browsers do).
> As I recall, I do HEADs to check timestamps and MIME type and GETs to
> update, and wait longer between GETs than HEADs.

I agree with you.  I do a HEAD request before any GET to check the MIME
type and timestamp, but it is a little annoying to have to wait after
fetching /robots.txt, and again after the HEAD, before finally doing the
GET.  I've been meaning to write a minor patch to LWP that makes
/robots.txt and HEAD requests freebies (no wait afterwards), but I've
been too busy and got off track.  If I recall correctly, it's just a
couple of lines of code.

The key is making sure that you wait after any GET - those are the
most server-intensive requests.
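
To make the idea concrete, here is a rough sketch of that policy.  It is
in Python rather than Perl and it is not the LWP patch itself; the class
name PoliteDelay, its methods, and the 60-second default are all made up
for illustration.  It also assumes one particular reading of "freebie":
a HEAD or a /robots.txt fetch never starts a new wait, but every request
still honours a wait left pending by an earlier GET to the same host.

```python
import time
from urllib.parse import urlsplit


class PoliteDelay:
    """Hypothetical per-host wait tracker: only real GETs start a wait."""

    def __init__(self, delay_seconds=60.0, clock=time.monotonic,
                 sleep=time.sleep):
        self.delay_seconds = delay_seconds
        self._clock = clock      # injectable so the policy can be tested
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time of next request

    @staticmethod
    def is_freebie(method, url):
        """HEAD requests and /robots.txt fetches do not start a wait."""
        return (method.upper() == "HEAD"
                or urlsplit(url).path == "/robots.txt")

    def wait_time(self, url):
        """Seconds still to wait before contacting this URL's host."""
        host = urlsplit(url).netloc
        return max(0.0, self._next_allowed.get(host, 0.0) - self._clock())

    def before_request(self, url):
        """Sleep off any wait left pending by an earlier GET."""
        remaining = self.wait_time(url)
        if remaining > 0:
            self._sleep(remaining)

    def after_request(self, method, url):
        """Start a new wait only after a non-freebie request."""
        if not self.is_freebie(method, url):
            host = urlsplit(url).netloc
            self._next_allowed[host] = self._clock() + self.delay_seconds
```

A robot would call before_request() ahead of every request and
after_request() once the response is in, so /robots.txt, the HEAD, and
the first GET go out back to back, and only the next request to that
host is held up.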

Dave
-- 
Dave Sifry 				http://www.sifry.com
President, Sifry Consulting		(408) 471-0667 (voice)
david@sifry.com				(408) 471-0666 (fax)