Re: [PATCH]HTML::Parser-XS-2.9913_mac-1
Gisle Aas (gisle@aas.no)
25 Nov 1999 14:21:30 +0100
"Michael A. Chase" <mchase@ix.netcom.com> writes:
> When declarations are parsed, two extra characters are appended to the
> declaration type. For example,
> '<!ENTITY name "replacement entity text">'
> is tokenized as
> ['ENTITY n', 'name' '"replacement entity text"']
> The attached patch fixes the problem.
Thanks! Patch applied. The test suite obviously needs more coverage.
BTW, I have decided that I want to modify the new callback interface.
The main new thing is that you should always tell the parser what
arguments you want the parser to pass to the callback handlers.
Example:
    $p->callback(start => "self,tagname,line", sub { ... });
This would set up a callback for start tags and tell the parser that
it should pass $p, the name of the tag and the line number where the
tag starts to the subroutine given.
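A rough sketch of how an argspec string could drive the dispatch, in plain Perl. Nothing here is the parser's real interface: make_dispatcher and the shape of the event hash are made up for illustration, and the event is filled in by hand instead of coming from a parser.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::Parser: turn an argspec string
# such as "self,tagname,line" into a closure that picks the requested
# fields out of an event record and passes them to the callback.
sub make_dispatcher {
    my ($argspec, $callback) = @_;
    my @names = split /\s*,\s*/, $argspec;
    return sub {
        my ($event) = @_;    # hash ref describing one parsed start tag
        for my $name (@names) {
            die "unknown argspec name '$name'" unless exists $event->{$name};
        }
        # Hash slice: the values come out in argspec order.
        return $callback->(@{$event}{@names});
    };
}

my @seen;
my $on_start = make_dispatcher("tagname,line", sub {
    my ($tagname, $line) = @_;
    push @seen, "$tagname at line $line";
});

# Simulated event; a real parser would fill these in while scanning.
$on_start->({
    self     => undef,
    tagname  => "a",
    line     => 12,
    origtext => '<a href="x">',
});

print "$_\n" for @seen;
```

The callback only ever sees the two values it asked for, in the order the argspec names them.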
The argspec allows me to get rid of several of the new parser options
(decode_text_entities, v2_compat, pass_self, attr_pos) and allows
further optimizations, as we don't have to build the values that
represent arguments the parser user doesn't need. The interface
also becomes much easier to extend.
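To illustrate the optimization point: if each argspec name maps to its own builder, only the requested values are ever constructed. This is a sketch under assumptions; the %builder table, build_args and the raw event layout are invented for the example and say nothing about how the XS code would do it.

```perl
use strict;
use warnings;

my %built;    # counts how often each value was actually constructed

# Hypothetical builders keyed by argspec name.
my %builder = (
    tagname      => sub { $built{tagname}++;      $_[0]{tag} },
    origtext     => sub { $built{origtext}++;     $_[0]{raw} },
    attr_hashref => sub { $built{attr_hashref}++; +{ @{ $_[0]{attrs} } } },
);

# Build only the arguments named in the argspec, in that order.
sub build_args {
    my ($argspec, $raw_event) = @_;
    return map {
        my $make = $builder{$_} or die "unknown argspec name '$_'";
        $make->($raw_event);
    } split /\s*,\s*/, $argspec;
}

my $raw = {
    tag   => "img",
    raw   => '<img src="a.png">',
    attrs => [ src => "a.png" ],
};

my @args = build_args("tagname,origtext", $raw);
print join(" | ", @args), "\n";
print "attr_hashref built: ", ($built{attr_hashref} // 0), " time(s)\n";
```

Since attr_hashref is not in the argspec, its builder is never called and the attribute hash is never allocated.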
The things that I think can go into argspec are:
    self
    tagname (element_name, gi)
    origtext
    decodedtext
    cdata_flag
    attr_arrayref
    attrpos_arrayref
    attr_hashref
    attrpos_hashref
    tokens_arrayref
    tokens              # separate arguments (tagname @$attr_arrayref)
    charpos
    line
Does anybody have any other ideas for how the argspec interface might
look? An array? Just peek at the prototype of the callback function?
This probably also means that the $p->accum() stuff should go.
One idea would be to allow an array ref as the third $p->callback
argument and then push the arguments onto it instead of doing a call.
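The array ref idea could look roughly like this. make_sink is a hypothetical helper written for this example, and storing each event as one array ref per record is an assumption of mine, not a decided format.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either a code ref (call it) or an
# array ref (push the argument list onto it as one record).
sub make_sink {
    my ($target) = @_;
    if (ref $target eq 'CODE') {
        return $target;
    }
    if (ref $target eq 'ARRAY') {
        return sub { push @$target, [@_]; return };
    }
    die "expected a CODE or ARRAY reference";
}

my @tokens;
my $sink = make_sink(\@tokens);

# Simulated events; with an array ref nothing is called back,
# the arguments are simply accumulated.
$sink->("a",   12);
$sink->("img", 15);

printf "%s at line %d\n", @$_ for @tokens;
```

The caller can then walk @tokens after parsing is done, which covers what accum() was meant for without a separate option.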
Regards,
Gisle