Patch suggestion for HTML::Element::delete (Was: Deleting images using libwww-perl)

Andreas Koenig (k@anna.in-berlin.de)
Mon, 12 Aug 1996 21:10:45 +0200


>>>>> WWW projekt <wwwproj@dna.lth.se> writes:

 www> I still think that a 'remove' method should be implemented that
 www> really removes the element from the tree, but it might be a bit
 www> tricky because of the use of arrays instead of linked lists in
 www> the structure.
 >> 
 >> Sorry, can't follow you. Why does ->delete() not do what you want?

 www> Because the tag is not deleted from the tree:

 www> ~/Stefan/tmp>perl -w -MHTML::TreeBuilder
 www> $html = '<I> Italic </I> <B> Bold </B> <HR>';
 
 www> $p = new HTML::TreeBuilder;
 www> $p->parse($html);
 
 
 www> $p->traverse(sub {
 www>     my ($ele, $flag, $depth) = @_;
 www>     if ($ele->tag eq 'b') {
 www>         $ele->delete();
 www>     }
 www>     return 1;
 www> }, 1);
 
 www> print "---------\n", $p->as_HTML, "--------\n";      
 
 www> $p->traverse(sub {
 www>     my ($ele, $flag, $depth) = @_;
 www>     return unless $flag;
 www>     print $ele->tag, " printing parent tag: ";
 www>     if ($ele->tag ne 'html') {
 www>         print $ele->parent->tag;
 www>     }
 www>     print "\n";
 www>     return 1;
 www> }, 1);
 www> ---------
 www> <HTML><BODY><P><I> Italic </I> <B></B> <HR></BODY></HTML>
 www> --------
 www> html printing parent tag: 
 www> body printing parent tag: html
 www> p printing parent tag: body
 www> i printing parent tag: p
 www> Can't call method "tag" without a package or object reference at - line 21.
 www> b printing parent tag: ~/Stefan/tmp>

Great, an excellent test case! I realize that I was wrong about the
exact semantics of delete(): it empties the element but leaves the
element itself in its parent's content list. I'd call that a bug and
suggest the following patch (barely tested).


*** /tmp/Element.pm.5.01	Mon Aug 12 13:46:22 1996
--- /tmp/Element.pm	Mon Aug 12 13:46:22 1996
***************
*** 400,408 ****
  
  =head2 $h->delete()
  
! Frees memory associated with the element and all children.  This is
! needed because perl's reference counting does not work since we use
! circular references.
  
  =cut
  #'
--- 400,409 ----
  
  =head2 $h->delete()
  
! Frees memory associated with the element and all children and
! eliminates the pointer to itself from a parent element--provided a
! parent exists.  This is needed because perl's reference counting does
! not work since we use circular references.
  
  =cut
  #'
***************
*** 410,415 ****
--- 411,428 ----
  sub delete
  {
      $_[0]->delete_content;
+     my $pos_within_parent;
+     no overload;
+     foreach (0..$#{$_[0]->{'_parent'}{'_content'}}) {
+ 	# looking for myself in parent and splicing me out after
+ 	if (ref($_[0]->{'_parent'}{'_content'}->[$_]) && "$_[0]->{'_parent'}{'_content'}->[$_]" eq "$_[0]"){
+ 	    $pos_within_parent = $_;
+ 	    last;
+ 	}
+     }
+     if (defined $pos_within_parent) {
+ 	splice @{$_[0]->{'_parent'}{'_content'}}, $pos_within_parent, 1;
+     }
      delete $_[0]->{'_parent'};
      delete $_[0]->{'_pos'};
      $_[0] = undef;

You like it, Stefan?
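
To make the idea easier to try outside of HTML::Element, here is a
minimal sketch of the unlinking step only. It assumes elements are plain
hashes with '_tag', '_parent' and '_content' keys, as in the patch;
make_node() and detach_node() are hypothetical names of mine, not part of
the module, and the sketch compares references with Scalar::Util::refaddr
instead of stringifying them. It does not free the element's own content.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical stand-in for an element: a hash with the same keys the
# patch touches. Not the real HTML::Element constructor.
sub make_node {
    my ($tag, $parent) = @_;
    my $node = { _tag => $tag, _content => [] };
    if ($parent) {
        $node->{_parent} = $parent;
        push @{ $parent->{_content} }, $node;
    }
    return $node;
}

# Splice $node out of its parent's content list (if it has a parent),
# then drop the back-pointer so the circular reference is broken.
sub detach_node {
    my ($node) = @_;
    if (my $parent = $node->{_parent}) {
        my $content = $parent->{_content} || [];
        for my $idx (0 .. $#$content) {
            my $child = $content->[$idx];
            # Text children are plain strings, so skip non-references.
            next unless ref $child && refaddr($child) == refaddr($node);
            splice @$content, $idx, 1;
            last;
        }
    }
    delete $node->{_parent};
    return;
}

my $p      = make_node('p');
my $italic = make_node('i', $p);
push @{ $p->{_content} }, ' ';          # a text child between the tags
my $bold   = make_node('b', $p);
my $hr     = make_node('hr', $p);

detach_node($bold);

# prints: i,text,hr
print join(',', map { ref $_ ? $_->{_tag} : 'text' } @{ $p->{_content} }), "\n";
```

Because the parent check comes first, calling detach_node() on a root
element simply does nothing to any content list.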

 www> Look at the line printed by ->as_HTML: the tag is still there, but its
 www> content is gone.
 www> Then have a look at the error: it is printed when trying to print the
 www> parent of the 'b' tag, which I really don't want to be part of the tree
 www> anymore.

 www> Maybe I have misunderstood you, but was this not the way you wanted to
 www> delete elements? Have I done anything wrong?

I don't think so.

[...]

 www> The third style was really what I wanted. I'm still not convinced that
 www> it is a general method to solve the problem, though, so I challenge you
 www> to solve this problem:

 www> Insert 
 www> <B> <P> Bold paragraph <B>
 www> in between the hr and br tags in this tree:
 www> <BODY>
 www> <HR>
 www> <BR>
 www> </BODY>

I think you're right: it can't be done. Maybe some splice method
should be invented.
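
Just to show what I have in mind, here is a rough sketch of such a
method, written against plain hashes with '_tag', '_parent' and
'_content' keys rather than against the real module. The name
insert_content_at() is hypothetical, and the sketch assumes the caller
already knows the offset at which to insert.

```perl
use strict;
use warnings;

# Hypothetical helper: insert @new into $parent's content list at
# $offset and point each new element's '_parent' back at $parent.
sub insert_content_at {
    my ($parent, $offset, @new) = @_;
    my $content = $parent->{_content} ||= [];
    die "offset out of range\n" if $offset < 0 || $offset > @$content;
    for my $child (@new) {
        # Text children are plain strings and carry no parent pointer.
        $child->{_parent} = $parent if ref $child;
    }
    splice @$content, $offset, 0, @new;
    return $parent;
}

# The tree from the challenge: <BODY><HR><BR></BODY>
my $body = { _tag => 'body', _content => [] };
insert_content_at(
    $body, 0,
    { _tag => 'hr', _content => [] },
    { _tag => 'br', _content => [] },
);

# Build the fragment on its own, then splice it in between hr and br.
my $bold = { _tag => 'b', _content => [] };
my $para = { _tag => 'p', _content => [] };
insert_content_at($para, 0, 'Bold paragraph');
insert_content_at($bold, 0, $para);
insert_content_at($body, 1, $bold);

# prints: hr b br
print join(' ', map { $_->{_tag} } @{ $body->{_content} }), "\n";
```

Since the insertion is by position, it does not depend on which tags are
inserted or which tags surround them.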

 www> Can you do this in a way that excludes the risk of errors and that does
 www> not depend on the type of tags that are inserted or that surround the
 www> inserted tag?

You got me ;-) I'd love to donate a splice method, but my time's too
limited, sorry. Thanks for the challenge, btw!

andreas