HTML::Element, extend language

WWW projekt (wwwproj@dna.lth.se)
Fri, 19 Jul 1996 16:35:38 +0200


Hi,

I want to parse an extension of HTML 2.0 and thought that I might use
the HTML::* modules to do this.

I found TreeBuilder especially interesting for this purpose, but as it
inherits from Element, I can not use it as it is. The problem is that
the hashes defined in HTML::Element (emptyElement, optionalEndTag, ...)
are defined in the package and are not members of the objects.
Therefore, I cannot change the values of these hashes when I want to
extend HTML 2.0 with my own tags.

In short, it looks like this:

package Element;

%emptyElement = map {$_ => 1} qw(base link meta isindex ...);

..
sub traverse
{
   ...
   &$callback($_, 1, $depth+1) unless $emptyElement{$self->{'_tag'}};
   ...
}

The problem is that the very useful method traverse cannot be used
with tags other than those defined in the %emptyElement variable.

If there were a reference to %emptyElement etc. in the *objects*, and these
were used in the module, it would be much easier to inherit from Element
and redefine the language it recognizes.

In HTML::Element::new, one could add:

    $self->{'_emptyElement'} = \%emptyElement;
    $self->{'_optionalEndTag'} = \%optionalEndTag;
    $self->{'_linkElements'} = \%linkElements;
    $self->{'_boolean_attr'} = \%boolean_attr;

and then always use these instead. Then Element could be set to
understand a completely different language.
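
To make the idea concrete, here is a rough, self-contained sketch. It does
not use the real HTML::Element: My::Element, My::ExtendedElement, the
simplified traverse and the tag "marker" are all made up for illustration,
and only the %emptyElement table is shown. The other tables would presumably
be handled the same way.

```perl
use strict;
use warnings;

# Hypothetical stand-in for HTML::Element, not the real module.
package My::Element;

# Package-level default table, as in the current module.
our %emptyElement = map { $_ => 1 } qw(base link meta isindex);

sub new {
    my ($class, $tag) = @_;
    my $self = bless { _tag => $tag, _content => [] }, $class;
    # Each object carries a reference to its language table.
    $self->{_emptyElement} = \%emptyElement;
    return $self;
}

sub push_content {
    my ($self, @nodes) = @_;
    push @{ $self->{_content} }, @nodes;
    return $self;
}

# Simplified traverse: the callback receives (node, start_flag, depth).
sub traverse {
    my ($self, $callback, $depth) = @_;
    $depth ||= 0;
    $callback->($self, 1, $depth);
    $_->traverse($callback, $depth + 1) for @{ $self->{_content} };
    # Consult the object's own table instead of the package hash.
    $callback->($self, 0, $depth)
        unless $self->{_emptyElement}{ $self->{_tag} };
    return $self;
}

# A subclass that understands one extra empty tag (assumed name: marker).
package My::ExtendedElement;
our @ISA = ('My::Element');

my %extended_empty = (%My::Element::emptyElement, marker => 1);

sub new {
    my ($class, $tag) = @_;
    my $self = $class->SUPER::new($tag);
    $self->{_emptyElement} = \%extended_empty;   # swap in the new language
    return $self;
}

package main;

# Collect the tags for which traverse reports an end event.
sub end_tags_seen {
    my ($root) = @_;
    my @ends;
    $root->traverse(sub {
        my ($node, $start) = @_;
        push @ends, $node->{_tag} unless $start;
    });
    return @ends;
}

my $plain = My::Element->new('p')
    ->push_content(My::Element->new('marker'));
my $ext = My::ExtendedElement->new('p')
    ->push_content(My::ExtendedElement->new('marker'));

print join(' ', end_tags_seen($plain)), "\n";   # marker p
print join(' ', end_tags_seen($ext)), "\n";     # p
```

With the table held in the object, the subclass only has to replace one
reference in its constructor, and traverse itself needs no changes.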

The same approach should be applied in every module where static variables
are used to define the class's behaviour.

By the way, I did see that some classes were defined in HTML::Base, one
for each tag in HTML. Are these used anywhere? I could not find any
place where they were.

---
Stefan Eriksson, Lund university, Sweden
wwwproj@dna.lth.se, dat93ser@ludat.lth.se