LWP::HTML traverse()

Chris Chubb (cchubb@codegurus.com)
Fri, 06 Dec 1996 15:47:36 -0500


I know that the LWP::HTML::* modules are considered betas. So, I want
to check to see if anybody else is getting the same problem:

I use the HTML module's traverse() sub. I pass it a function
reference, and it appeared to be working fine. When I print
from within the called procedure, it prints once for each
tag I am looking for. But the statement AFTER the
$t->traverse() call never executes. It just seems to die while
coming back out of the tag stack. I looked at the traverse()
function, and there is not much there to go on.

What really burns me is that it was working hunky-dory for a while,
then went down in flames. (Honest officer, I didn't touch a thing,
it just fell over all by itself...) I was trying to fix a problem
with parsing BIG HTML files, with around 200 <INPUT> tags in them.

I have LWP version 5.03, and the error occurs under both the UNIX and NT
versions. But I have modified the HTML:: files by taking out the
'use strict' to get the NT version to work. It worked after I did this,
and I have reverted to that version of the calling program, but now
nothing works.

Here is how I call the module:
    local $ht;
    require HTML::TreeBuilder;
    $ht = new HTML::TreeBuilder;

    $ht->parse($http_doc);

    print "About to PARSE \n";
    $ht->traverse(\&readurltags, 'ignoretext');
    print "About to DELETE the structure\n";
    #Be a good boy and clean up your messes
    $ht->delete();

readurltags is defined at the end of the .pl file.
I get the 'About to PARSE' message. Inside readurltags
I print out each tag that is visited, and they all print out.
I DON'T get the 'About to DELETE' message.
It's like there is an 'exit' hidden in the module somewhere,
but I searched all the HTML:: files and there aren't any
in there.

Any suggestions? Too small a stack? (I tried this with a 10-tag
file and it bombs too.) It fails with any HTML I throw at it.
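
One way to narrow this down might be to wrap the traverse() call in an
eval block, so that a die() raised anywhere during the traversal is
reported rather than silently ending the script. The sketch below is only
an illustration under assumptions: the sample $http_doc and the body of
readurltags are hypothetical stand-ins, and the callback arguments
($node, $start, $depth) and the "return true to keep descending"
convention reflect my understanding of the HTML::Element traverse()
interface, which should be checked against the installed version.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

$| = 1;    # unbuffer STDOUT so progress messages appear immediately

# Hypothetical stand-in document; the real script would use its own $http_doc.
my $http_doc = '<html><body><form>'
             . '<input name="a"><input name="b">'
             . '</form></body></html>';

my @seen;    # tags visited, collected so they can be inspected afterwards

# Hypothetical stand-in for the real readurltags callback.
sub readurltags {
    my ($node, $start, $depth) = @_;
    push @seen, $node->tag if $start;    # record each element on entry
    return 1;                            # true: keep descending into children
}

my $ht = HTML::TreeBuilder->new;
$ht->parse($http_doc);
$ht->eof;    # tell the parser the document is complete

# Trap any die() raised during the traversal instead of letting it end the script.
my $ok = eval { $ht->traverse(\&readurltags, 'ignoretext'); 1 };
print $ok ? "traverse returned normally\n" : "traverse died: $@";

print "About to DELETE the structure\n";
$ht->delete;
```

If the "traverse died" branch fires, $@ should carry the message that was
being lost. If neither message appears, the process is probably being
terminated by something eval cannot trap, such as an exit call inside the
callback itself.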

- Chris Chubb (cchubb@codegurus.com)- Alexandria, VA, USA