Re: how can I protect big big pages ?

Marc Langheinrich (marclang@cs.washington.edu)
Thu, 15 Apr 1999 17:41:01 +0900


On Wed, Apr 14, 1999 at 03:23:57PM -0400, Hyun-ju Seo wrote:
> my $req = HTTP::Request->new('GET', $url);
> $req->header(Accept=>'text/html');
> $req->header(Content-Length=>2000) ;
> my $res = $ua->request($req) ;
> 
> I don't want to get any big big page (maybe 100 M bytes).
> So, I tried  not to get that kind of page like the above method.
That won't work: "Content-Length" states the size of the body of the
message that carries it (the response body, or the request body if you
send one). It does not set an upper limit on what the server returns.
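
If the server does report a "Content-Length" for the response, you could
look at it before deciding to download the body. The following is only a
rough sketch: the helper name declared_too_big is made up for illustration,
and the response is built by hand so the snippet runs offline. With a real
server the object would typically come from a HEAD request. Servers are not
required to send "Content-Length", so treat this as a hint, not a guarantee.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: true when the size declared by the server exceeds
# $limit. A missing Content-Length yields false, because the size is then
# unknown until the body is actually read.
sub declared_too_big {
    my ($response, $limit) = @_;
    my $length = $response->header('Content-Length');
    return (defined($length) && $length > $limit) ? 1 : 0;
}

# Offline demonstration with a hand-built response; against a live server
# this object would usually come from something like $ua->head($url).
my $res = HTTP::Response->new(200, 'OK', ['Content-Length' => 1315945]);
print declared_too_big($res, 2000) ? "skip\n" : "fetch\n";
```

Here the declared size is well above the 2000-byte limit, so the sketch
reports that the page should be skipped.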

> And also I tried to give the max_size like "$ua->max_size(2000);"
That is the correct method to use. I don't know why it doesn't work for
you; the following code snippet works for me:


use strict;
use warnings;
use LWP::UserAgent;

# Stop reading once more than 100 bytes of content have arrived.
my $ua = LWP::UserAgent->new;
$ua->max_size(100);

my $request  = HTTP::Request->new('GET', 'http://localhost/very_large.tar.gz');
my $response = $ua->request($request);
print $response->as_string;
__END__

This prints the following:

HTTP/1.1 200 OK
Connection: close
Date: Thu, 15 Apr 1999 08:33:20 GMT
Accept-Ranges: bytes
Server: Apache/1.2.6 Red Hat
Content-Encoding: x-gzip
Content-Length: 1315945
Content-Type: application/x-gunzip
Last-Modified: Thu, 15 Apr 1999 00:56:28 GMT
Client-Date: Thu, 15 Apr 1999 08:33:20 GMT
Client-Peer: 127.0.0.1:80
X-Content-Range: bytes 0-3802/1315945

[Content follows]

Notice that the "X-Content-Range" header indicates how many bytes were
actually read (bytes 0-3802 here, i.e. 3803 bytes). This can be larger
than the size you set with "max_size", because the response arrives in
chunks of variable size and all LWP does is check whether the content
received so far has exceeded the "max_size" limit. If it has, LWP simply
closes the connection. I'm not sure whether you expected it to also trim
the content received so far down to that limit, but you can always do
that yourself.
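
To make the overshoot and the trimming concrete, here is a small offline
sketch. It is a simplified model of the behaviour described above, not
LWP's actual code: the function collect_with_limit and the chunk sizes are
invented for illustration.

```perl
use strict;
use warnings;

# Simplified model, not LWP's implementation: append each chunk, then stop
# as soon as the accumulated content exceeds the limit.
sub collect_with_limit {
    my ($max_size, @chunks) = @_;
    my $content = '';
    for my $chunk (@chunks) {
        $content .= $chunk;
        last if length($content) > $max_size;   # stands in for closing the connection
    }
    return $content;
}

# Hypothetical chunk sizes; real ones depend on the network and the server.
my @chunks   = ('a' x 60, 'b' x 70, 'c' x 50);
my $max_size = 100;

my $received = collect_with_limit($max_size, @chunks);
my $trimmed  = substr($received, 0, $max_size);   # cut down to the limit yourself

printf "received %d bytes, kept %d\n", length($received), length($trimmed);
```

With a real response the trimming step would look something like
substr($response->content, 0, $ua->max_size).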

marc
-- 
Marc Langheinrich
marclang@cs.washington.edu