Re: HTML::LinkEx(trac)tor
Hakan Ardo (hakan@munin.ub2.lu.se)
Tue, 9 Jul 1996 09:45:52 +0200
>
> Something similar to this will be in the next libwww-perl release.
> This module extracts links about 4-5 times faster than the current
> method of first building an HTML syntax tree and then calling
> $t->extract_links on it. (Since it does not build a syntax tree, we
> don't leak memory if we forget to call $t->delete.)
>
> Does anyone have comments on the interface, naming or such?
What about the anchor text? Shouldn't that be extracted as well? It might
also be nice to have a more flexible extractor implemented in the same way,
though that should probably be a separate object. It would let you specify
a list of regexps, each connected to a procedure, where a regexp selects
the tags its procedure should be called with. That would allow you to
extract any information you are interested in using this fast method.
What do you say?
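
A rough sketch of the idea, in plain Perl with no modules. The name
dispatch_tags, the [regexp, procedure] rule format and the sample HTML are
all made up for illustration; none of it is libwww-perl API. The regexp
tokenizer is deliberately naive (it ignores comments and unusual quoting):

```perl
use strict;
use warnings;

# Hypothetical helper: scan $html once, without building a syntax tree,
# and call every handler whose regexp matches the name of a start tag.
# Each rule is a [ qr/pattern/, \&handler ] pair.
sub dispatch_tags {
    my ($html, @rules) = @_;

    # Find each start tag: the tag name, then the raw attribute text.
    while ($html =~ /<([a-zA-Z][a-zA-Z0-9]*)([^>]*)>/g) {
        my ($tag, $raw) = (lc $1, $2);
        my $tail = substr($html, pos($html));   # document after this tag

        # Collect name="value", name='value' and name=value attributes.
        my %attr;
        while ($raw =~ /([a-zA-Z][-\w]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/g) {
            $attr{lc $1} = defined $2 ? $2 : defined $3 ? $3 : $4;
        }

        # Text up to the first matching end tag, with nested tags removed.
        # Stays empty when no such end tag follows (e.g. for <img>).
        my $text = '';
        if ($tail =~ m{^(.*?)</\Q$tag\E\s*>}si) {
            ($text = $1) =~ s/<[^>]*>//g;
        }

        # Call every procedure whose regexp matches this tag name.
        for my $rule (@rules) {
            my ($pattern, $handler) = @$rule;
            $handler->($tag, \%attr, $text) if $tag =~ $pattern;
        }
    }
}

# Made-up sample input.
my $html = '<p>See <a href="a.html">the <b>library</b></a> and <img src="logo.gif"></p>';

my @links;
dispatch_tags($html,
    # Anchors: keep the link together with its anchor text.
    [ qr/^a$/, sub {
        my ($tag, $attr, $text) = @_;
        push @links, [ $attr->{href}, $text ] if defined $attr->{href};
    } ],
    # Images and frames: keep the SRC attribute, no text.
    [ qr/^(?:img|frame)$/, sub {
        my ($tag, $attr) = @_;
        push @links, [ $attr->{src}, '' ] if defined $attr->{src};
    } ],
);

print "$_->[0]\t$_->[1]\n" for @links;
```

For the sample above this collects a.html with the text "the library" and
logo.gif with no text. Each procedure gets the tag name, a hash of its
attributes and the text up to the matching end tag, so the anchor text comes
along with the link. Copying the rest of the document for every tag is
wasteful; a real version would want to scan forward in place.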
----------------------------------------------------------
Name: Hakan Ardo
E-Mail: hakan@munin.ub2.lu.se
WWW: HTTP://www.ub2.lu.se/~hakan/sig.html
Interests: WWW, Programming, 3D graphics
Thought for the day: As long as one understands, the
spelling does not matter :-)
----------------------------------------------------------