RE: How to 'solve' 401 HTTP-errors?

Tim.Meadowcroft@westmerchant.co.uk
Tue, 9 Feb 1999 15:18:00 +0000


> > My first thought was that it might be _intentional_ - some people put
> > a lot of work into keeping web crawlers off their system...
>
> This was my first thought as well, and if that is the reason that I
> cannot request pages on these sites then I'll respect that.
> However, that still keeps me wondering how the server 'knows' that the
> page is NOT being requested by a Web browser, but rather by a robot...
> Does anybody have the answer to that question?

I wrote an HttpSniffer in Perl (but not with LWP) that's on my web page at
http://www.compansr.demon.co.uk - I have only used it with Perl on Win32, but it
should work on other platforms.

Like xmon for debugging X, it makes a fake server and satisfies requests by
talking to a real server, logging all the request and reply headers at the same
time.
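
To give a rough idea of the technique (this is not the HttpSniffer code itself,
which is Perl; it is a minimal Python sketch of the same idea), the fake server
below accepts one request, logs its headers, passes it to the real server, logs
the reply headers and hands the reply back. The names read_head, force_close,
log_head, relay_once and serve are all made up for this illustration. It
assumes body-less requests such as GET, and it swaps any Connection header for
"Connection: close" so that the real server ends the reply by closing the
socket, which means the forwarded request is not byte-for-byte what the client
sent.

```python
import socket


def read_head(sock):
    """Read from sock until the blank line that ends an HTTP header block."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(4096)
        if not chunk:
            break
        data += chunk
    return data


def force_close(request):
    """Return the request with any Connection header replaced by 'close'."""
    head, _, rest = request.partition(b"\r\n\r\n")
    lines = [line for line in head.split(b"\r\n")
             if not line.lower().startswith(b"connection:")]
    lines.append(b"Connection: close")
    return b"\r\n".join(lines) + b"\r\n\r\n" + rest


def log_head(label, message, log=print):
    """Log the start line and headers of a raw HTTP message, one per line."""
    head = message.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n"):
        log(f"{label} {line}")


def relay_once(client, real_host, real_port, log=print):
    """Satisfy one client request by asking the real server, logging both."""
    request = read_head(client)
    log_head(">>", request, log)
    with socket.create_connection((real_host, real_port)) as server:
        server.sendall(force_close(request))
        reply = b""
        while True:  # the real server closes the socket when it is done
            chunk = server.recv(65536)
            if not chunk:
                break
            reply += chunk
    log_head("<<", reply, log)
    client.sendall(reply)


def serve(listen_port, real_host, real_port=80):
    """Run the fake server on localhost, handling one request per connection."""
    with socket.socket() as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("127.0.0.1", listen_port))
        listener.listen()
        while True:
            client, _ = listener.accept()
            with client:
                relay_once(client, real_host, real_port, print)
```

Calling serve(8080, "www.example.com") and pointing a browser or script at
http://127.0.0.1:8080/ would print the headers in both directions. Note that
the Host header the client sends will then name the fake server rather than
the real one, which some servers may object to; a fuller tool would have to
deal with that.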

It's handy for watching the real negotiation that goes on between browsers and
servers, as well as for debugging cookies, ASP/CGI pages, LWP scripts, etc.,
since you can see exactly what is going on with redirects, authentication,
HTTP/1.1 to 1.0 fallbacks, and so on.

There's a bit of simple documentation as POD; mail me if you have problems.
Try it and you'll probably be able to see what your browser is sending that
your script is not.
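
Once you have both sets of logged request headers, the comparison itself can be
mechanical. The following is only a sketch in Python, with hypothetical helper
names (parse_headers, header_diff) and made-up sample headers; it reports which
headers the browser sent that the script did not, and which ones both sent with
different values.

```python
def parse_headers(raw_head):
    """Turn a logged request head into {lower-cased name: value}."""
    headers = {}
    for line in raw_head.split("\r\n")[1:]:  # skip the request line
        if not line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def header_diff(browser_head, script_head):
    """Return (missing, changed) for the script relative to the browser."""
    browser = parse_headers(browser_head)
    script = parse_headers(script_head)
    # Headers the browser sent that the script left out entirely.
    missing = {n: v for n, v in browser.items() if n not in script}
    # Headers both sent, but with different values: (browser, script).
    changed = {n: (browser[n], script[n])
               for n in browser if n in script and browser[n] != script[n]}
    return missing, changed


# Made-up sample data, purely for illustration.
browser_head = ("GET / HTTP/1.0\r\n"
                "User-Agent: Mozilla/4.5\r\n"
                "Accept: */*\r\n"
                "Referer: http://example.com/\r\n")
script_head = ("GET / HTTP/1.0\r\n"
               "User-Agent: libwww-perl/5.41\r\n"
               "Accept: */*\r\n")

missing, changed = header_diff(browser_head, script_head)
print("missing from script:", missing)
print("different values:", changed)
```

With this sample data the script turns out to lack the Referer header and to
send a different User-Agent, which are the sort of differences a server could
use to tell a robot from a browser.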

Cheers

Tim Meadowcroft   // Consultant - IT Development
                  // West Merchant Bank - London
