Re: lrequest() routine.. (Retrieves redirected documents)
Brooks Cutter (bcutter@pdn.paradyne.com)
Sun, 24 Jul 1994 20:28:32 -0400 (EDT)
> Try this (I haven't tested it):
> sub www'lrequest
> {
>     local($method, $url, *headers, *content, $timeout) = @_;
>     local($hd, $response);
>
>     for (;;)
>     {
>         $response = &www'request($method, $url, *headers, *content, $timeout);
>         last unless ($response =~ /^30[12]$/);
>
>         if ($url = $headers{'location'})
>         {
>             $url =~ s/, .*//; # Get rid of multiple Location: entries
>         }
>         elsif ($url = $headers{'uri'})
>         {
>             $url =~ s/\s*;.*//;
>             $url =~ s/, .*//; # Get rid of any multiple URI: entries
>         }
>         else { last; }
>
>         foreach $hd (keys(%headers))
>         {
>             next if ($hd =~ m#^[A-Z]#);
>             delete $headers{$hd};
>         }
>     }
>     return($response);
> }
This worked great. Later I realized the interface should look like
the following:
local($method, *url, *headers, *content, $timeout) = @_;
(passing url as a typeglob, i.e. by reference, rather than by value).
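
A minimal sketch of the typeglob variant, for illustration only. The
package name Sketch, the stub request routine, its canned redirect table,
and the host.example URLs are all made up here; the stub stands in for
&www'request so the example runs without a network. The five-hop cap and
the temporary $next are additions for this sketch, not part of the quoted
code. The temporary matters once url is passed as a typeglob: assigning
$headers{'location'} straight to $url would wipe out the caller's URL
whenever a redirect carries neither a Location: nor a URI: header.

```perl
use strict;
use warnings;

package Sketch;

# Package variables that local() binds in the subs below.
our ($method, $url, %headers, $content, $timeout, $u, %h, $c);

# Canned redirect table for the stub (made-up URLs).
my %canned = (
    'http://host.example/old' => [302, 'http://host.example/mid'],
    'http://host.example/mid' => [301, 'http://host.example/new'],
    'http://host.example/new' => [200, ''],
);

# Hypothetical stand-in for &www'request: no network, just a table lookup.
sub request {
    local($method, $u, *h, *c, $timeout) = @_;
    my ($code, $location) = @{ $canned{$u} || [404, ''] };
    $h{'location'} = $location if $location;
    $c = "body of $u";
    return $code;
}

sub lrequest {
    # *url is a typeglob, so assignments to $url reach the caller's variable.
    local($method, *url, *headers, *content, $timeout) = @_;
    my $response;

    for my $hop (1 .. 5) {    # hop cap is an assumption of this sketch
        $response = &request($method, $url, *headers, *content, $timeout);
        last unless $response =~ /^30[12]$/;

        # Work on a temporary so a redirect without Location:/URI:
        # leaves the caller's URL untouched.
        my $next = $headers{'location'};
        if (!$next && ($next = $headers{'uri'})) {
            $next =~ s/\s*;.*//;
        }
        last unless $next;
        $next =~ s/, .*//;    # keep only the first of multiple entries
        $url = $next;         # visible to the caller through the typeglob

        # Drop response headers (lower-case keys), keep request headers.
        delete @headers{ grep { !/^[A-Z]/ } keys %headers };
    }
    return $response;
}

package main;

our ($page, %hdr, $body);
$page = 'http://host.example/old';
%hdr  = ('Accept' => '*/*');

my $status = &Sketch::lrequest('GET', *page, *hdr, *body, 30);
print "$status $page\n";    # prints "200 http://host.example/new"
```

The caller's variables deliberately use different names from the ones
inside the package, so no typeglob is ever aliased onto itself.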
I didn't want to deviate from the &request interface, but if a URL is
redirected to a new URL, the program should know the new URL.
Consider Mosaic: when it gets a Location redirect, it prints the new
URL above the document (this is why I need the new URL).
Robots should also remember the redirected URL if they are going to access
a resource multiple times. (MOMspider could use &lrequest and, if the value
of $url changes after the call, notify the author.)
There are two other options for returning the redirected URL
without changing the subroutine argument interface:
either return it after $response, or don't treat it as a special case
and require the application to check $headers{'location'} or
$headers{'uri'} (either could be returned).
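
A rough sketch of the first option, returning the final URL after the
response code. Everything here is hypothetical: lrequest2 and stub_request
are invented names, the stub replaces &www'request with a table lookup,
and it takes a hash reference for brevity where the real routine takes
typeglobs. The hop limit is likewise an assumption. Using wantarray lets
callers that only want the status keep calling it in scalar context.

```perl
use strict;
use warnings;

# Hypothetical stub standing in for &www'request (no network access).
my %moved = ('http://host.example/old' => 'http://host.example/new');

sub stub_request {
    my ($method, $url, $headers) = @_;
    delete $headers->{'location'};
    return 200 unless $moved{$url};
    $headers->{'location'} = $moved{$url};
    return 302;
}

# Follows redirects, then returns the status alone in scalar context
# or (status, final URL) in list context.
sub lrequest2 {
    my ($method, $url, $headers) = @_;
    my $response;
    for (1 .. 5) {    # hop limit is an assumption of this sketch
        $response = stub_request($method, $url, $headers);
        last unless $response =~ /^30[12]$/ && $headers->{'location'};
        $url = $headers->{'location'};
    }
    return wantarray ? ($response, $url) : $response;
}

my %hdr;
my ($status, $final) = lrequest2('GET', 'http://host.example/old', \%hdr);
my $only_status      = lrequest2('GET', 'http://host.example/new', \%hdr);
print "$status $final $only_status\n";
# prints "200 http://host.example/new 200"
```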
Thoughts?
-Brooks
bcutter@paradyne.com