wwwbot.pl problem

Andrew Daviel (andrew@andrew.triumf.ca)
Wed, 22 Nov 1995 18:16:07 -0800 (PST)


(from mail to bcutter, after I'd read a bit further in the documentation :)= )


I'm trying to write a robot using libwww-perl-0.40, which I picked up a 
while ago to run MOMspider. I just tried Archie: liege.ics.uci.edu 
won't talk to me, and the copy I found on anubis.ac.hmc.edu is the same 
version that I already have. Is there an updated version?

I'm having trouble with wwwbot. What I think is happening is this: once I 
try a site that disallows everything (http://www.riddler.com/), then when 
I try to get files from sites I visited before I went to riddler, sites 
that don't even have a robots.txt, I get a 0 back from wwwbot'allowed. 
So it looks as if the cache is corrupted somehow.
I tried calling dont_cache in testbot, but get the same results.
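
To make the expected behaviour concrete, here is a minimal sketch of a 
per-host robots.txt cache. It is written in Python for illustration and is 
not the libwww-perl wwwbot code; RobotsCache, fake_fetch, CANNED and the 
.example hosts are all made-up names, and the canned robots.txt text is an 
assumption standing in for a site that disallows everything. The point is 
only that rules are stored under a key derived from the host, so a 
disallow-all answer from one site should not change the answer for another.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class RobotsCache:
    """Hypothetical cache: one parsed robots.txt per (scheme, host)."""

    def __init__(self, fetch_robots_txt):
        # fetch_robots_txt(url) returns the robots.txt text, or None if the
        # site has no robots.txt. It is injected so the sketch needs no network.
        self._fetch = fetch_robots_txt
        self._rules = {}

    def allowed(self, agent, url):
        parts = urlsplit(url)
        key = (parts.scheme, parts.netloc.lower())
        parser = self._rules.get(key)
        if parser is None:
            text = self._fetch(f"{parts.scheme}://{parts.netloc}/robots.txt")
            parser = RobotFileParser()
            # A missing robots.txt is parsed as an empty rule set (allow all).
            parser.parse((text or "").splitlines())
            self._rules[key] = parser
        return parser.can_fetch(agent, url)


# Canned responses (assumed for illustration): one site disallows everything,
# the other has no robots.txt at all.
CANNED = {"http://closed.example/robots.txt": "User-agent: *\nDisallow: /\n"}


def fake_fetch(url):
    return CANNED.get(url)


cache = RobotsCache(fake_fetch)
print(cache.allowed("testbot", "http://open.example/a.html"))  # True
print(cache.allowed("testbot", "http://closed.example/"))      # False
print(cache.allowed("testbot", "http://open.example/b.html"))  # True
```

With what I am seeing from wwwbot, the third call would be the one that 
comes back as 0.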

My robot goes round and round a list of URLs at different sites whenever 
it has to wait, so it keeps coming back to sites it visited earlier. That 
means I hit this all the time and end up killing a lot of perfectly good 
links.
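
For anyone unsure what I mean by going round the list, the scheduling is 
roughly as sketched below. Again this is illustrative Python, not my actual 
robot: visit_order, the delay value and the .example URLs are assumptions, 
and the clock is simulated rather than real. A URL whose host was visited 
too recently is moved to the back of the queue, and the clock only advances 
when every remaining URL is still waiting.

```python
from collections import deque
from urllib.parse import urlsplit


def visit_order(urls, delay):
    """Return the order in which URLs get visited under a per-host delay."""
    queue = deque(urls)
    next_ok = {}   # host -> earliest simulated time of the next visit
    now = 0.0      # simulated clock, in seconds
    order = []
    skipped = 0    # URLs deferred in a row since the last visit
    while queue:
        url = queue.popleft()
        host = urlsplit(url).netloc.lower()
        if next_ok.get(host, 0.0) <= now:
            order.append(url)
            next_ok[host] = now + delay
            skipped = 0
        else:
            queue.append(url)  # not ready yet: go round to the next URL
            skipped += 1
            if skipped == len(queue):
                # Every remaining host is waiting, so jump the clock forward
                # to the earliest time at which one of them becomes ready.
                now = min(next_ok[urlsplit(u).netloc.lower()] for u in queue)
                skipped = 0
    return order


urls = ["http://a.example/1", "http://a.example/2", "http://b.example/1"]
print(visit_order(urls, delay=60))
# ['http://a.example/1', 'http://b.example/1', 'http://a.example/2']
```

So sites visited early in the run get revisited after I have been to other 
sites in between, which is exactly when the bad answers show up.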

Andrew Daviel         email: advax@triumf.ca 
TRIUMF                voice: 604-222-7376 
4004 Wesbrook Mall    fax:   604-222-7307 
Vancouver BC          http://andrew.triumf.ca/~andrew 
Canada   V6T 2A3      49D14.7N 123D13.6W