Re: SGML Parser in Perl?

Earl Hood (ehood@imagine.convex.com)
Thu, 29 Sep 94 16:56:59 CDT


> Has anyone written (or have libs to handle the functions of) an SGML 
> parser in perl? I have html-parser, but I would like a fully functional
> SGML parser that can read a DTD and parse the file based on that. Thanks
> in advance,

Writing a complete SGML parser is very difficult in Perl 4.  Perl 5
should make it easier.  However, SGML is ugly when it comes to
building a conforming parser, whatever tools you use.  I do not know
of any HTML parser in Perl that can handle all the features of SGML.

If you want to see some work done with Perl and SGML DTDs, then look at
<URL:http://www.oac.uci.edu/indiv/ehood/dtd2html.doc.html>.  You'll
find a Perl library called dtd.pl that parses a DTD.

Your best bet is to use sgmls (a C program).  You can use sgmls to
parse the SGML documents, and then use a Perl program to process the
output of sgmls.  You should be able to get sgmls at
<URL:ftp://ftp.ifi.uio.no/pub/SGML/>.
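
As a rough sketch of the Perl side of that pipeline, the code below
turns sgmls output into a list of events.  It assumes the usual
line-oriented output format, where the first character of each line
gives its type: "(" for a start tag, ")" for an end tag, "-" for data,
and "A" for an attribute of the start tag that follows.  The name
parse_esis and the event layout are made up for illustration, and
escape sequences inside data lines are left undecoded.

```perl
use strict;
use warnings;

# Turn sgmls output lines into a list of event hashes.
# parse_esis is a hypothetical helper, not part of sgmls or dtd.pl.
sub parse_esis {
    my @lines = @_;
    my (@events, %attrs);
    for my $line (@lines) {
        chomp $line;
        next if $line eq '';
        my $type = substr($line, 0, 1);
        my $rest = substr($line, 1);
        if ($type eq 'A') {
            # Assumed form: "Aname TYPE value"; the value is absent
            # for implied attributes.  Attributes come before the
            # start tag they belong to, so hold them until it arrives.
            my ($name, $kind, $value) = split ' ', $rest, 3;
            $attrs{$name} = defined $value ? $value : '';
        } elsif ($type eq '(') {
            push @events, { type => 'start', name => $rest, attrs => {%attrs} };
            %attrs = ();
        } elsif ($type eq ')') {
            push @events, { type => 'end', name => $rest };
        } elsif ($type eq '-') {
            # Data is kept raw; escapes such as "\n" are not decoded.
            push @events, { type => 'data', text => $rest };
        }
        # All other line types are ignored in this sketch.
    }
    return @events;
}

# Small demonstration on hand-written sample lines.
my @sample = ("AHREF CDATA foo.html", "(A", "-link text", ")A", "C");
for my $e (parse_esis(@sample)) {
    my $what = defined $e->{name} ? $e->{name} : $e->{text};
    print "$e->{type}: $what\n";
}
```

To run it on real output you could read from a pipe, for example
open(my $fh, '-|', 'sgmls', 'doc.sgml') and then parse_esis(<$fh>),
where doc.sgml is a placeholder file name.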

	--ewh