RE: LWP::UserAgent redirect with POST
Nathaniel Good (good@cs.umn.edu)
Sun, 3 Oct 1999 06:16:14 -0500
Hi. Actually, reel.com now uses ASP instead of CGI and therefore uses GET a
lot, so the redirect problem I had before is gone. The original POSTing
problem was solved by the wonderful Maurice Aubrey, one of the nicest and
most helpful people on this list, and a really great programmer too. I've
appended his solution below (in case you're curious). So to GET (pardon the
accidental humor) stuff from reel, all you need to do is build a query string
and request it. The trick is that you need a cookie before searching, so you
have to request the reel home page first, then do all your GETing (there I go
again, sorry). The code below goes to reel, gets a cookie, and then searches
for movies with "revenge" in the title. In this example it writes the result
to an HTML file. To do other queries you need to look at the source of the
reel.com/reel.asp page, but I'm sure you get the idea.
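For building the query string itself, here is a minimal sketch, assuming the URI module (which LWP already depends on). The parameter names START, SFor and STRING are copied from the example query in this message; search_url is a hypothetical helper name, and what other values the site accepts is an assumption.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: build a reel.com search URL for a title keyword.
# The parameter names come from the example query in this message.
sub search_url {
    my ($keyword) = @_;
    my $uri = URI->new('http://www.reel.com/search.asp');
    # query_form escapes each value and keeps the pairs in the order given
    $uri->query_form(START => 0, SFor => 1, STRING => $keyword);
    return $uri->as_string;
}
```

Calling search_url('revenge') yields http://www.reel.com/search.asp?START=0&SFor=1&STRING=revenge, which can be passed straight to HTTP::Request. Letting URI do the escaping avoids broken queries when a title contains spaces or an ampersand.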
When things get really hairy with lots of Redirects, cookies, POSTs and so
forth, I usually just use lynx to get the URL and write it out to a file or
object where I then go through and parse it. It works well without any
hangups.
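The lynx approach can be sketched roughly like this. It assumes lynx is installed and on the PATH; -source and -accept_all_cookies are lynx command-line options, while lynx_command and fetch_with_lynx are hypothetical helper names. This is an illustration, not the exact script I use.

```perl
use strict;
use warnings;

# Hypothetical helper: the lynx command line for fetching raw HTML.
# -source prints the page source instead of rendered text;
# -accept_all_cookies avoids the interactive cookie prompt.
sub lynx_command {
    my ($url) = @_;
    return ('lynx', '-source', '-accept_all_cookies', $url);
}

# Hypothetical helper: run lynx and return the page as one string.
sub fetch_with_lynx {
    my ($url) = @_;
    my @cmd = lynx_command($url);
    # List-form pipe open bypasses the shell, so '&' and '?' in the URL are safe
    open(my $lynx, '-|', @cmd) or die "cannot run lynx: $!";
    local $/;                  # slurp mode: read everything at once
    my $html = <$lynx>;
    close($lynx) or warn "lynx exited with status $?\n";
    return $html;
}
```

The string returned by fetch_with_lynx($url) can then be written to a file or parsed directly. Lynx follows the redirects itself, so none of the redirect_ok handling is needed on the Perl side.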
The cookie redirect problem seems to generate a lot of discussion on this
list, and it would be nice if someone could eventually come up with a good
solution. For reel though, LWP works just fine.
Good Luck,
Nathan Good
University of Minnesota
www.movielens.umn.edu
#!/usr/bin/perl
use LWP::UserAgent;
use HTTP::Cookies;
#for testing: save the result page to a file
open(STUFF, ">mine.html") or die "Can't open mine.html: $!";
$cookie = new HTTP::Cookies( ignore_discard => 1 );
$ua = new LWP::UserAgent;
$ua->cookie_jar( $cookie );
$ua->agent("Mozilla/8.0");
#get the cookie
$request = new HTTP::Request('GET', 'http://www.reel.com/');
$response = $ua->request( $request );
#now go and get stuff
$request = new HTTP::Request('GET',
'http://www.reel.com/search.asp?START=0&SFor=1&STRING=revenge');
$response = $ua->request( $request );
if ($response->is_success) {
print STUFF $response->content;
print "success\n";
} else {
print "FAILED: ", $response->code, " ", $response->message, "\n";
}
close(STUFF);
Maurice Aubrey's solution to the POSTing problem:
#!/usr/bin/perl
$SEARCH_URL = 'http://www.reel.com/cgi-bin/search/nph-movie.exe';
use LWP::UserAgent;
use HTTP::Cookies;
#need to manually handle redirects
BEGIN {
package LWP::UserAgent::NoRedirect;
use LWP::UserAgent;
@ISA = qw( LWP::UserAgent );
sub redirect_ok { 0 }
}
$cookie = new HTTP::Cookies( ignore_discard => 1 );
$ua = new LWP::UserAgent::NoRedirect;
$ua->cookie_jar( $cookie );
$request = new HTTP::Request('POST', $SEARCH_URL);
$request->content_type('application/x-www-form-urlencoded');
$request->content("STRING=Revenge&TYPE=TITLE&Go=Go");
$response = $ua->request( $request );
if ($response->code == 302) {
$request = new HTTP::Request('GET', $response->header('Location'));
$response = $ua->request( $request );
}
if ($response->is_success) {
print $response->content;
} else {
print "FAILED: ", $response->code, " ", $response->message, "\n";
}
> -----Original Message-----
> From: libwww-perl-request@ics.uci.edu
> [mailto:libwww-perl-request@ics.uci.edu]On Behalf Of Kit
> Sent: Tuesday, August 10, 1999 3:42 AM
> To: libwww-perl@ics.uci.edu
> Subject: LWP::UserAgent redirect with POST
>
>
> Hi Nathan Good,
>
> I saw Nathan Good's post on the libwww-perl mailing list archive regarding
> problems with LWP::UserAgent handling POST redirects and cookies. I am
> wondering if anyone has found a solution for it yet, because I am running
> into the same problems. I use LWP::UserAgent to send a POST request to
> www.reel.com and didn't get the results I wanted. Any help would be
> appreciated.
>
>
> Here is Nathan Good's post just to remind you:
>
> Hi. I have found myself in a situation where I need cookies and am getting
> the redirect error, and it doesn't look like the UserAgent is sending the
> cookies back to the server. The story is that I would like to suck
> (responsibly of course) a bunch of movie information from
> http://www.reel.com for my research project. Here is what I know:
> 1)You must have cookies enabled to visit Reel.com and do anything useful
> (search movies etc)
> 2)When I go to reel.com in my browser and enter in a title, say "revenge" in
> the quick search column on the left, the following string shows up in the
> location window of my browser:
> http://www.reel.com/cgi-bin/nph-reel.exe?OBJECT=searchresults&STRING=revenge&TYPE=TITLE&STORE=ALL&MODE=BEGINS&GENRE=0&PERIOD=0&RATING=0&AVAIL=BOTH&PRICERANGE=0&PRODUCTTYPE=0
> 3)If I go to this string w/o going to http://www.reel.com first (or if I
> disable cookies) then I get a page that says "Sorry nothing found" where
> there should be results.
> 4)If I have cookies enabled in the browser and I go to reel and perform a
> search for "revenge", I get back the same string in the location as above and
> I get a list of movies with the title "revenge".
>
> Here's what I've tried in LWP:
> 1)simulate a browser session; get "Reel.com", then POST.
> 2)same as above with a GET after the POST
> Result is 302 Redirect error from POST
> So then I subclassed UserAgent and changed the redirect_ok method to
> always return true:
> package MyUserAgent;
> require LWP::UserAgent;
> @ISA =qw(LWP::UserAgent);
> sub redirect_ok{
> return TRUE;
> }
>
> 3)After this I repeated the above tests.
> Result: No more redirect error. kept getting the page saying that there were
> no results found, the same as when cookies were disabled in the browser.
> 4)Tried a single POST Test.
> Result: kept getting the page saying that there were no results found.
> 5)tried various combinations of post strings in
> $req->content();
> exp:
> $req->content('OBJECT=searchresults&STRING=revenge&TYPE=TITLE&STORE=ALL&MODE=BEGINS&GENRE=0&PERIOD=0&RATING=0&AVAIL=BOTH&PRICERANGE=0&PRODUCTTYPE=0');
> and
> $req->content('STRING=revenge&TYPE=TITLE&STORE=ALL&MODE=BEGINS&GENRE=0&PERIOD=0&RATING=0&AVAIL=BOTH&PRICERANGE=0&PRODUCTTYPE=0');
> Result: Same 'results not found' page.
>
> I am new to the user agent interface (I was just using LWP::Simple before) so
> I might have made some errors there. I think the problem has something to do
> with my mishandling of the cookie-jar, as it seems that reel.com is really
> not getting my cookies. Anyway, I've included the errors I get and the code
> I wrote.
>
> Code I wrote:
> #!/project/gl/install/bin/perl
>
> use lib "/project/gl2/nathan/lib";
> #after subclassing I also included this
> #require "/project/gl2/nathan/lib/LWP/MyUserAgent.pm";
>
> use LWP::UserAgent;
>
> $ua = new LWP::UserAgent;
> #after subclassing I used this:
> #$ua= new MyUserAgent;
>
> $ua->agent("Mozilla/4.0");
>
> #for cookies
> $ua->cookie_jar($cookies);
>
> #Go to Reel.com and get a cookie
> my $req = new HTTP::Request 'GET','http://www.reel.com';
> $req->content_type('application/x-www-form-urlencoded');
> my $res = $ua->request($req);
> if ($res->is_success) {
> print "Was able to go to reel.com\n";
> } else {
> print "Failed. Didn't make it\n";
> }
>
> $r = $res;
>
> print "Code: ", $r->code, "\n";
> print "Message: ", $r->message, "\n";
> print "Request: ", $r->request, "\n";
> print "Base: ", $r->base, "\n";
>
> print "###################Basic URL\n";
> #print "As_string: ", $r->as_string, "\n";
>
>
> #Now try to post to Reel.com
> #When using MyUserAgent I got back the 'No search results' page
>
> my $req = new HTTP::Request
> 'POST','http://www.reel.com/cgi-bin/search/nph-movie.exe';
> $req->content_type('application/x-www-form-urlencoded');
> $req->content('STRING=revenge&TYPE=TITLE');
> my $res = $ua->request($req);
>
> if ($res->is_success) {
> print "OK Was Able to Post\n";
> } else {
> print "Failed. Posting did not succeed\n";
> }
>
> $r = $res;
>
> print "Code: ", $r->code, "\n";
> print "Message: ", $r->message, "\n";
> print "Request: ", $r->request, "\n";
> print "Base: ", $r->base, "\n";
> print "#################################### POST URL\n";
> print "As_string: ", $r->as_string, "\n";
>
>
> #Now try to get things from reel.com
> #I didn't use this all the time
>
> my $req = new HTTP::Request
> 'GET','http://www.reel.com/cgi-bin/nph-reel.exe?OBJECT=searchresults&STRING=revenge&TYPE=TITLE&STORE=ALL&MODE=BEGINS&GENRE=0&PERIOD=0&RATING=0&AVAIL=BOTH&PRICERANGE=0&PRODUCTTYPE=0';
> $req->content_type('application/x-www-form-urlencoded');
> my $res = $ua->request($req);
> if ($res->is_success) {
> print "OK Was Able to grab the page\n";
> } else {
> print "Failed. Grabbing did not succeed\n";
> }
>
> $r = $res;
>
> print "Code: ", $r->code, "\n";
> print "Message: ", $r->message, "\n";
> print "Request: ", $r->request, "\n";
> print "Base: ", $r->base, "\n";
> print "#################################### The Actual URL\n";
> print "As_string: ", $r->as_string, "\n";
>
> <end of code>
>
> This is a list of my results before subclassing UserAgent.pm
> ####################################
> MY Results:
>
> Was able to go to reel.com
> Code: 200
> Message: ok
> Request: HTTP::Request=HASH(0xb2fc8)
> Base: http://www.reel.com/
> ###################Basic URL
> Failed. Posting did not succeed
> (this changed after I used MyUserAgent. After I started using that I got the
> same results as the GET results below:)
> Code: 302
> Message: Redirect
> Request: HTTP::Request=HASH(0x23e558)
> Base: http://www.reel.com/cgi-bin/search/nph-movie.exe
> #################################### POST URL
> As_string: HTTP/1.0 302 (Found) Redirect
> Location: http://www.reel.com/cgi-bin/nph-reel.exe?OBJECT=searchresults&STRING=revenge&TYPE=TITLE&STORE=ALL&MODE=BEGINS&GENRE=0&PERIOD=0&RATING=0&AVAIL=BOTH&PRICERANGE=0&PRODUCTTYPE=0
> Content-Type: text/html
> Client-Date: Sun, 08 Nov 1998 10:17:41 GMT
> Client-Peer: 209.185.135.200:80
> Set-Cookie: ReelSession=REEL345JPMY45ML4TZJ; path=/; domain=.reel.com;
> Set-Cookie: ReelSender=; path=/; domain=.reel.com;
> OK Was Able to grab the page
> Code: 200
> Message: ok
> Request: HTTP::Request=HASH(0x31b17c)
> Base: http://www.reel.com/cgi-bin/nph-reel.exe?OBJECT=searchresults&STRING=revenge&TYPE=TITLE&STORE=ALL&MODE=BEGINS&GENRE=0&PERIOD=0&RATING=0&AVAIL=BOTH&PRICERANGE=0&PRODUCTTYPE=0
> #################################### The Actual URL
> As_string: HTTP/1.0 200 (OK) ok
> Content-Type: text/html
> Client-Date: Sun, 08 Nov 1998 10:17:42 GMT
> Client-Peer: 209.185.135.200:80
> Set-Cookie: ReelSession=REEL3ET4XDY93ESCNHS; path=/; domain=.reel.com;
> Title: Reel | Smart Search Results
> (and the rest of the HTML file, the one with no results). So there it is.
> Thanks again for all of your help.
> Nathan Good
> good@cs.umn.edu
> Try MovieLens http://www.movielens.umn.edu