Re: lrequest() routine.. (Retrieves redirected documents)

Brooks Cutter (bcutter@pdn.paradyne.com)
Sun, 24 Jul 1994 20:28:32 -0400 (EDT)


> Try this (I haven't tested it):
> sub www'lrequest
> {
>     local($method, $url, *headers, *content, $timeout) = @_;
>     local($hd, $response);
> 
>     for (;;) 
>     {
>         $response = &www'request($method, $url, *headers, *content, $timeout);
>         last unless ($response =~ /^30[12]$/);
> 
>         if ($url = $headers{'location'})
>         {
>               $url =~ s/, .*//;         # Get rid of multiple Location: entries
>         }
>         elsif ($url = $headers{'uri'})
>         {
>               $url =~ s/\s*;.*//;
>               $url =~ s/, .*//;         # Get rid of any multiple URI: entries
>         }
>         else { last; }
> 
>         foreach $hd (keys(%headers))
>         {
>             next if ($hd =~ m#^[A-Z]#);
>             delete $headers{$hd};
>         }
>     }
>     return($response);
> }

This worked great.

Later I realized the interface should look like this:

    local($method, *url, *headers, *content, $timeout) = @_;

(passing $url by reference, as a typeglob, rather than by value).

I didn't want to deviate from the &request interface, but if a URL is
redirected to a new URL, the calling program should learn the new URL.
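
A minimal sketch of that interface, not tested against the real library:
stub_request is a hypothetical stand-in for &www'request that pretends
http://example.com/old redirects to http://example.com/new, and
lrequest_ref is an illustrative name only. Only the Location: case is
handled. The variable names differ between caller and callee on purpose,
so that a localized typeglob never has the same name as the variable
being passed in.

```perl
# Hypothetical stand-in for &www'request: /old redirects once to /new.
sub stub_request {
    local($m, $u, *h, *c, $t) = @_;
    if ($u eq 'http://example.com/old') {
        $h{'location'} = 'http://example.com/new';
        return 302;
    }
    $c = "body of $u";
    return 200;
}

# Like the quoted lrequest, but *url aliases the caller's variable,
# so following a redirect updates the caller's copy of the URL.
sub lrequest_ref {
    local($method, *url, *headers, *content, $timeout) = @_;
    local($response);
    for (;;) {
        $response = &stub_request($method, $url, *headers, *content, $timeout);
        last unless ($response =~ /^30[12]$/);
        last unless ($headers{'location'});   # leave $url alone if no target
        $url = $headers{'location'};
        delete $headers{'location'};          # clear before the next request
    }
    return($response);
}

$where    = 'http://example.com/old';
$original = $where;
%hdrs     = ();
$body     = '';

$code = &lrequest_ref('GET', *where, *hdrs, *body, 30);
print "$code $where\n";
print "redirected from $original\n" if ($where ne $original);
```

The caller keeps its own copy of the original URL and compares it with
$where after the call to see whether a redirect was followed.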

Consider Mosaic: when it gets a Location redirect, it prints the new
URL above the document. This is why I need the new URL.

Robots should also remember the redirected URL if they are going to access
a resource multiple times. (MOMspider could use &lrequest: if the value of
$url changes after the call, it could notify the author.)

There are two other options for returning the redirected URL
without changing the subroutine argument interface:

Either return it after $response, or don't treat it as a special case
and require the application to check $headers{'location'} or
$headers{'uri'} (either could be returned).
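
A rough sketch of the first option, assuming the same hypothetical
stub_request stand-in for &www'request (here /old redirects once to /new)
and an illustrative name, lrequest_list. It uses wantarray so that
existing callers that assign the result to a scalar still get only the
response code.

```perl
# Hypothetical stand-in for &www'request: /old redirects once to /new.
sub stub_request {
    local($m, $u, *h, *c, $t) = @_;
    if ($u eq 'http://example.com/old') {
        $h{'location'} = 'http://example.com/new';
        return 302;
    }
    $c = "body of $u";
    return 200;
}

# Same arguments as &request; the final URL is an extra return value.
sub lrequest_list {
    local($method, $url, *headers, *content, $timeout) = @_;
    local($response);
    for (;;) {
        $response = &stub_request($method, $url, *headers, *content, $timeout);
        last unless ($response =~ /^30[12]$/);
        last unless ($headers{'location'});
        $url = $headers{'location'};
        delete $headers{'location'};
    }
    # List context gets both values; scalar context gets only the code.
    return wantarray ? ($response, $url) : $response;
}

%hdrs = ();
$body = '';

($code, $final) = &lrequest_list('GET', 'http://example.com/old', *hdrs, *body, 30);
print "$code $final\n";

$only = &lrequest_list('GET', 'http://example.com/old', *hdrs, *body, 30);
print "$only\n";
```

Without the wantarray test, a plain return($response, $url) called in
scalar context would hand back the URL instead of the response code.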

Thoughts?

-Brooks
bcutter@paradyne.com