I just found an unfinished email to you, answering your recent
questions, which I never got around to completing. Better than nothing,
I suppose, so I append it below.
>>>>> WWW projekt <wwwproj@dna.lth.se> writes:
www> Due to the unfortunate fact that the contents of an element are stored
www> in an array, there is no way of actually deleting an element from another
www> element's contents.
www> What you can do is to exchange the element with something else.
www> Andreas Koenig suggested Mon, 22 Jul 1996 in message "Re: Manipulating
www> The HTML Tree", that it should be done like this:
www> -----
www> $page->traverse(sub {
www>     my($ele,$flag,$depth) = @_;
www>     if ($depth > 7) {
www>         $ele->delete;
www>         return;
www>     }
www>     if ($self->{HTML_OK}{$ele->tag}) {
www>         return 1;
www>     } else {
www>         $ele->tag("noop");
www>         1;
www>     }
www> }, 1);
www> -----
www> Note the $ele->tag("noop"); line: he sets the tag to 'noop', so the
www> element stays in the tree, but as an unknown tag NOOP.
The NOOP was intentional, not an unfortunate effect. $ele->delete
works just fine for real deletes. The NOOP is there so that I do not
censor what people wrote, but still prevent them from breaking the
guestbook page. $self->{HTML_OK} allows e.g. UL, OL, A, LI, and the
like. If they come with IMG, they get what they deserve :-) But I
can still see what they tried to do.
www> This was not quite what I wanted, so I exchanged the tag with an empty
www> string instead. It could not be done the easy way though ($ele = "";), so
www> I had to add a method exchange to the HTML::Element package.
www> -----
www> sub exchange
www> {
www>     my($self, $from, $to) = @_;
www>     # return if nothing to exchange
www>     return $self unless (defined $self and defined $from);
www>     $to = "" unless defined $to;
www>     my $el;
www>     foreach $el (@{$self->content}) {
www>         # if the element is a reference, compare pointers
www>         # else compare text
www>         if ((ref($el) and $el == $from) or $el eq $from) {
www>             # delete the element if it is a reference
www>             $el->delete if ref($el);
www>             $el = $to;
www>             return 1;
www>         }
www>     }
www>     return 0;
www> }
www> -----
www> I use it like this:
www> $ele->parent->exchange($ele, "");
www> and it seems to work all right.
www> I still think that a 'remove' method should be implemented that really
www> removes the element from the tree, but it might be a bit tricky because
www> of the use of arrays instead of linked lists in the structure.
Sorry, can't follow you. Why does ->delete() not do what you want?
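For what it's worth, storing the contents in an array does not by itself
rule out a real remove: Perl's splice takes an element out of the middle
of an array. A minimal sketch of the idea follows. ToyNode and its
remove_child method are hypothetical names made up for this illustration;
this is not HTML::Element and not how the library itself does it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tree node; NOT the real HTML::Element.
package ToyNode;

sub new {
    my ($class, $tag) = @_;
    return bless { tag => $tag, content => [], parent => undef }, $class;
}

sub tag     { $_[0]{tag} }
sub content { $_[0]{content} }

# Append children (nodes or plain text) and record the parent link.
sub push_content {
    my ($self, @kids) = @_;
    for my $kid (@kids) {
        $kid->{parent} = $self if ref $kid;
        push @{ $self->{content} }, $kid;
    }
    return $self;
}

# Remove one child from the content array; returns 1 if found, else 0.
sub remove_child {
    my ($self, $child) = @_;
    my $content = $self->{content};
    for my $i (0 .. $#$content) {
        my $el = $content->[$i];
        # compare references by identity, text by string equality
        my $same = (ref $el && ref $child)   ? ($el == $child)
                 : (!ref $el && !ref $child) ? ($el eq $child)
                 :                             0;
        next unless $same;
        splice @$content, $i, 1;    # the array shrinks by one element
        $el->{parent} = undef if ref $el;
        return 1;
    }
    return 0;
}

package main;

my $body = ToyNode->new("body");
my $img  = ToyNode->new("img");
$body->push_content("before ", $img, " after");
$body->remove_child($img);

# Show what is left in the content array.
print join("|", map { ref $_ ? $_->tag : $_ } @{ $body->content }), "\n";
```

Run as it stands, this prints "before | after": the img node is gone from
the array and nothing is left in its place.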
I append my (unfortunately not really complete) answer to your recent
posting.
---snip---
>>>>> WWW projekt <wwwproj@dna.lth.se> writes:
>>     if ($depth > 7) {
>>         $ele->delete;
>>         return;
>>     }
>>     if ($self->{HTML_OK}{$ele->tag}) {
>>         return 1;
>>     } else {
>>         $ele->tag("noop");
>>         1;
>>     }
> This is really two methods, isn't it? One where you delete the element
> and one where you make it into a <NOOP ..> element.
That's intentional. For documentation purposes I don't want to delete
these elements, just not display them.
[...]
>>> Today there is no way of inserting a new element in the tree, i.e.
>>> inserting an element into a certain position in the contents of
> ^^^^^^^^^^^^^^^^
>>> another element.
>>
>> perl -MHTML::Parse -e '
>> $page =
>> parse_html("<HEAD><TITLE>forgot the base</HEAD><BODY>Reached the end");
>> $page->traverse(sub {
>>     my($ele,$flag,$depth) = @_;
>>     if ($ele->tag eq "head") {
>>         $ele->insert_element(HTML::Element->new("base",
>>                                                 HREF=>"rtrtr"));
>>         return 0;
>>     }
>>     return 1;
>> },1);
>> print $page->as_HTML;
>> '
>> <HTML><HEAD><TITLE>forgot the base</TITLE><BASE HREF="rtrtr"></HEAD><BODY><P>Reached the end</BODY></HTML>
> This inserts a BASE _last_ in the surrounding HEAD, but what happens if
> I have:
> <HTML>
> <HEAD><TITLE>forgot something else</TITLE></HEAD>
> <BODY>
> <A HREF="calvin"> hobbes </A>
> <P>Reached the end
> </BODY>
> </HTML>
> And I want to insert something between the link to calvin and the
> following text?
It's getting a bit lengthy, but it's feasible. _Where_ exactly did
you have in mind? I'll insert something for you in three places. The
first requires knowing a bit about the source, which is kind of a
hack. The second is fine; the third needs a flag that I set
myself, but I don't think that's bad style.
% perl -MHTML::Parse -e '
$page = parse_html(q(
Reached the end
));
$page->traverse(sub {
    my($ele,$flag,$depth) = @_;
    if ($ele->tag eq "a" && $flag) {
        unshift @{$ele->content}, "|||BEFORE|||";
        $ele->push_content("|||AFTER|||");
        return 1;
    } elsif ($ele->tag eq "a" && !$flag){
        $GlobalFlag = "saw_a_href";
    } elsif ($GlobalFlag eq "saw_a_href"){
        $e = HTML::Element->new("HTML");
        $e->push_content("|||OUTSIDE|||");
        $ele->insert_element($e);
        $GlobalFlag=0;
        return 0;
    }
    return 1;
},1);
print $page->as_HTML;
'
|||BEFORE||| hobbes |||AFTER||| |||OUTSIDE|||
Reached the end
[...]
>> When _I_ tried for the first time, things messed up too ;-)
> Then I'm not alone... ;-)
HTH,
andreas
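P.S. Since ->content hands back a reference to the array itself, the
question of inserting "between the link and the following text" comes
down to a splice at the right index. A rough sketch on a toy content list
follows. insert_after is a hypothetical helper of my own, not a method of
HTML::Element, and I have not checked that splicing the real content array
directly keeps everything else in the tree consistent.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of any library: insert @new into the
# array @$content directly after the first element for which $match
# returns true. Returns 1 on success, 0 if nothing matched.
sub insert_after {
    my ($content, $match, @new) = @_;
    for my $i (0 .. $#$content) {
        next unless $match->($content->[$i]);
        splice @$content, $i + 1, 0, @new;   # length 0: nothing is removed
        return 1;
    }
    return 0;
}

# Toy content list: a hash ref plays the role of the <A> element,
# plain strings play the role of text.
my $link    = { tag => "a", content => [" hobbes "] };
my @content = ($link, "Reached the end");

my $ok = insert_after(\@content,
                      sub { ref $_[0] && $_[0]{tag} eq "a" },
                      "|||BETWEEN|||");

# Show the resulting order; elements are printed as their tag in <>.
print join("", map { ref $_ ? "<$_->{tag}>" : $_ } @content), "\n";
```

The new text lands between the link and "Reached the end", with the array
simply growing by one element.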