Re: problems with extract_links on a frame tag

Martijn Koster (m.koster@webcrawler.com)
Fri, 3 Jan 1997 09:33:33 -0800 (PST)


At 1:38 AM 1/3/97, Luuk de Boer wrote:

> I have a problem with extract_links and LinkExtor.pm.
> ... So I don't know if [...] nobody care's.

Hey, I can't let a fellow countryman think that, now can I :-)

>here's my test page which I use: (index3.htm)
>here is my test perl code (problem.pl):
>here's the output of problem.pl (problem.txt):

Did everyone make producing reproducible test cases their
New Year's resolution? Wonderful! :-)


HTML:: is definitely Gisle's turf, and I only had a quick look,
but I found that adding frame and frameset to HTML::TreeBuilder.pm's
%isBodyElement made your scripts return the missing links. Make sure
you resolve them relative to the base URL; sub extract_links seems
to give them back raw:

link = menu.htm
link = main.htm
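
A minimal sketch of that resolution step, assuming the URI module from
CPAN is installed. The base URL below is made up for illustration; use
whatever address index3.htm was actually fetched from:

```perl
use strict;
use warnings;
use URI;

# Hypothetical base URL: the address the frameset page was fetched from.
my $base = 'http://www.example.com/frames/index3.htm';

# Raw link values, as extract_links reported them above.
my @raw = ('menu.htm', 'main.htm');

# Resolve each relative link against the base URL.
my @absolute = map { URI->new_abs($_, $base)->as_string } @raw;

print "link = $_\n" for @absolute;
```

With that assumed base, the two links come out as
http://www.example.com/frames/menu.htm and
http://www.example.com/frames/main.htm.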

Hope this is of use/interest...


-- Martijn

Email: m.koster@webcrawler.com
WWW: http://info.webcrawler.com/mak/mak.html