Re: anyone know why extract_links() isn't working?
Nathaniel Good (good@cs.umn.edu)
Thu, 30 Oct 1997 09:13:07 -0600 (CST)
On Wed, 29 Oct 1997, Matt Silvia wrote:
> Hi there...
>
> I'm having trouble getting the extract_links() method of
> HTML::TreeBuilder to work.
>
> More specifically, I use a UserAgent to execute a request and return a
> response object, and then use HTML::Parse::parse_html to create a tree
> object.
>
> I try to run the extract links method of this tree, but it seems as if
> it's not extracting anything from the tree.
>
> Does anyone know what I'm doing wrong?
>
> Thanks,
>
> Matt
>
> ----
> example code, comments and declarations removed:
>
>
> $ua = new LWP::UserAgent;
> $ua->agent("AgentName/0.1 " . $ua->agent);
>
> $URL = 'http://www.somethin.com/';
>
> my $req = new HTTP::Request POST => $URL;
> $req->content_type('application/x-www-form-urlencoded');
> $req->content('');
>
> my $response = $ua->request($req);
> $html = $response->content();
> $tree = HTML::Parse::parse_html($html);
>
> for (@{ $tree->extract_links( qw(a) ) }) {
> $link = $_->[0];
> print "$link\n";
> }
This is what I use, and it seems to work. $ARGV[0] is the URL given on
the command line, but you could change it to a static URL and it should
work just as well. Hope this helps.
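#!/usr/local/bin/perl
use LWP::Simple;
use HTML::Parse;
use HTML::Element;

# Fetch the page; get() returns undef if the request fails.
$html = get $ARGV[0];
die "Couldn't fetch $ARGV[0]\n" unless defined $html;

# Build the parse tree, then print every link found in it.
$parsed_html = HTML::Parse::parse_html($html);
for (@{ $parsed_html->extract_links() }) {
    $link = $_->[0];    # each entry is a [link, element] pair
    print "$link\n";
}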
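
If you only want the links from certain tags (as in your extract_links( qw(a) ) call), something along these lines might be easier to debug, since the parsing step can be tried on a literal string before the network is involved. This is only a sketch: it assumes HTML::TreeBuilder is installed, and links_in() is a helper name I made up for illustration, not part of any module.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# links_in() is a hypothetical helper, not part of LWP or HTML::Tree.
# It parses an HTML string and returns the link values found in it.
# Pass tag names (lowercase, e.g. 'a') to restrict the result;
# with no tag names, links from every link-bearing tag are returned.
sub links_in {
    my ($html, @tags) = @_;
    my $tree = HTML::TreeBuilder->new;
    $tree->parse($html);
    $tree->eof;                       # tell the parser the input is complete
    my @links = map { "$_->[0]" } @{ $tree->extract_links(@tags) };
    $tree->delete;                    # free the tree explicitly
    return @links;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    require LWP::Simple;
    my $html = LWP::Simple::get($ARGV[0]);
    die "Couldn't fetch $ARGV[0]\n" unless defined $html;
    print "$_\n" for links_in($html, 'a');
}
```

Calling links_in() on a small hand-written page first shows whether the problem is in the parsing or in the request. If that prints the links you expect, I would look at what the POST actually returns before blaming extract_links().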