Re: Possible memory leak using HTML::Parse

Clinton Wong (clintdw@netcom.com)
Fri, 30 May 1997 08:40:27 -0700 (PDT)


Ian Beckwith wrote:
> I think I have found a problem with the HTML::Parse module
> I have narrowed it down to the following script:

> for($i=0; $i<5000; $i++)
> {
>     my $parser=parse_html($page);
> }

> the process size (as reported by ps) grew to 23Mb before I killed it.

See the bug section at:
http://www.perl.org/CPAN/doc/wwwman/libwww/lib/HTML/Element.html

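Basically, HTML::Parse::parse_html() returns a reference to an
HTML::TreeBuilder object.  The tree is internally composed
of HTML::Element objects, which use circular references:
each element refers to its children, and each child refers
back to its parent.  Perl frees memory by reference counting,
so the objects in such a cycle never reach a count of zero.
According to page 300 of Programming Perl (2nd edition), circular
references are not disposed of by the garbage collector,
at least not yet.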

In the meantime, you can explicitly free the memory with
the delete() method, so your code would look like:

 use HTML::Parse;    # exports parse_html()

 for($i=0; $i<5000; $i++)
 {
     my $parser=parse_html($page);
     $parser->delete();    # break the circular references so the tree is freed
 }
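
If it helps to see why the explicit delete() is needed, here is a
minimal sketch of the same effect in plain Perl.  The Node class, its
add_child() and delete() methods, and the leaked_after() helper are
made up for illustration; they are not the real HTML::Element code,
only a stand-in with the same parent/child cycle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tree node; NOT the real HTML::Element class.
package Node;

my $live = 0;    # number of Node objects currently alive

sub new {
    my ($class) = @_;
    $live++;
    return bless { parent => undef, children => [] }, $class;
}

# Link child and parent in both directions, which creates a reference cycle.
sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    $child->{parent} = $self;
    return $child;
}

# Break the cycle by hand, in the spirit of the delete() method above.
sub delete {
    my ($self) = @_;
    $_->delete for @{ $self->{children} };
    %$self = ();    # drop the parent and children references
}

sub DESTROY { $live-- }

sub live { return $live }

package main;

# Build a two-node tree in an inner scope and report how many nodes
# are still alive after the scope has ended.
sub leaked_after {
    my ($cleanup) = @_;
    my $before = Node::live();
    {
        my $root = Node->new;
        $root->add_child(Node->new);
        $root->delete if $cleanup;
    }
    return Node::live() - $before;
}

print "without delete: ", leaked_after(0), " nodes leaked\n";
print "with delete:    ", leaked_after(1), " nodes leaked\n";
```

Without the cleanup call, both nodes survive the end of the block
because each still holds a reference to the other.  With it, both are
destroyed as soon as $root goes out of scope.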


Regards,
Clinton