Re: / and DirectoryIndex
Reinier Post (reinpost@win.tue.nl)
Wed, 21 Feb 2001 13:09:50 +0100
On Wed, Feb 21, 2001 at 04:42:20PM +0700, John Indra wrote:
> Hi all...
>
> How do I tell my user-agent (an LWP::UserAgent object) NOT to download both
> / and index.html, or whatever the remote site's DirectoryIndex is set to?
> Example: my user-agent sees 2 links:
> - http:://www.domain.com/
This :: notation is contagious :-)
> - http:://www.domain.com/index.html
> If in this situation both links point to the same document, my user-agent
> would be foolish to download both files. How do I make a "smarter" user-agent
> that knows those 2 links are the same and performs only one GET
> request, either to http:://www.domain.com/ OR
> http:://www.domain.com/index.html?
The server won't tell you whether or not they're the same document.
You have the same problem with server aliases or symlinks: the whole tree
http://www.domain.com/a/butreally/b/*
may be identical to
http://www.domain.com/b/*
Depending on what you find on the server, you may be able to adopt some
heuristics, for instance '*/index.html always has the same content
as */', but exceptions are always possible. The only way to be really sure
is to compare the document content, or at least the response headers.
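
A rough sketch of both ideas together, written in Python rather than LWP
purely for illustration. The list of index file names, the function names
and the choice of SHA-256 are all assumptions, not anything the server or
LWP gives you. The heuristic saves a GET when two URLs differ only by a
trailing index file; the digest catches aliases after the fact, once the
body has been fetched.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen

# Assumed index file names; the real DirectoryIndex is server-specific.
INDEX_NAMES = ("index.html", "index.htm")


def guess_canonical(url):
    """Heuristic only: treat */index.html as the same resource as */."""
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last in INDEX_NAMES:
        parts = parts._replace(path=head + "/")
    return urlunsplit(parts)


def http_fetch(url):
    """Plain GET returning the raw response body."""
    with urlopen(url) as resp:
        return resp.read()


def fetch_unique(urls, fetch=http_fetch):
    """Return {body digest: first URL seen with that body}."""
    seen_urls = set()
    seen_bodies = {}
    for url in urls:
        key = guess_canonical(url)
        if key in seen_urls:
            continue  # heuristic says we already have it: skip the GET
        seen_urls.add(key)
        digest = hashlib.sha256(fetch(url)).hexdigest()
        # Identical content under a different URL (alias, symlink) is
        # only detected here, after it has been downloaded.
        seen_bodies.setdefault(digest, url)
    return seen_bodies
```

The fetch argument is there so the logic can be tried without touching the
network. With LWP the same shape should carry over: hash $response->content
and keep the digests in a hash.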
--
Reinier