Re: problems with extract_links on a frame tag
Martijn Koster (m.koster@webcrawler.com)
Fri, 3 Jan 1997 09:33:33 -0800 (PST)
At 1:38 AM 1/3/97, Luuk de Boer wrote:
> I have a problem with extract_links and LinkExtor.pm.
> ... So I don't know if [...] nobody cares.
Hey, I can't let a fellow countryman think that, now can I :-)
>here's my test page which I use: (index3.htm)
>here is my test perl code (problem.pl):
>here's the output of problem.pl (problem.txt):
Did everyone make producing reproducible test cases their
New Year's resolution? Wonderful! :-)
HTML:: is definitely Gisle's turf, and I only had a quick look,
but I found that adding frame and frameset to HTML::TreeBuilder.pm's
%isBodyElement made your scripts return the missing links. Make sure
you resolve them relative to the base URL; sub extract_links seems
to give them back raw:
link = menu.htm
link = main.htm
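
To illustrate that last point, here is a minimal sketch of resolving
the raw values against a base URL. It is written in Python for brevity
rather than in Perl, and the base URL is a made-up placeholder (the
real address of index3.htm is not given in this thread), so treat it
as an illustration of the resolution rule, not as part of LinkExtor:

```python
from urllib.parse import urljoin

# Hypothetical base URL: the address the frameset page was fetched from.
base_url = "http://www.example.com/test/index3.htm"

# Raw values, as printed by the test script above.
raw_links = ["menu.htm", "main.htm"]

# Resolve each relative link against the base URL.
absolute_links = [urljoin(base_url, link) for link in raw_links]

for link in absolute_links:
    print("link =", link)
```

In Perl, the URI::URL module that ships with libwww-perl should offer
the same operation through its abs method.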
Hope this is of use/interest...
-- Martijn
Email: m.koster@webcrawler.com
WWW: http://info.webcrawler.com/mak/mak.html