RE: HTML::Parser - newbie question

David Baron (davidb@versaware.com)
Sun, 5 Jul 1998 10:03:33 +0300


This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

------ =_NextPart_000_01BDA7FC.2C51AFB0
Content-Type: text/plain

Sorry about that. Who said life is easy.
Goodies is attached. You could have written them yourself (I too am a
newbie and it probably shows).
 
All ghe HTML parsers are called by getting a new class ref from "new".

With mine, then call ParseFile.

The program will call "virtual functions" (well, not really, but ... )
for starting comments, ending comments, startgeneric for tags you listed
in the call to new, and endgeneric for the untag. Endgeneric will be
called even for unary and automatically ended tags. Exported functions
will let you replace the tagged sequence, delete the tag (call twice
where appropriate), and replace attributes in a tag. Call AbortFile if
you do not need to keep processing.

------ =_NextPart_000_01BDA7FC.2C51AFB0
Content-Type: application/octet-stream;
	name="goodies.pm"
Content-Disposition: attachment;
	filename="goodies.pm"

#!/usr/local/bin/perl
# unix header

require 5;	# or else

# Headers to set this up as a class
package Goodies;
require Exporter;
@ISA = qw ( Exporter );
@EXPORT = qw ( InList, Ieq, Last);

##########################################################################################
# Useful exportable utilities that are lack in the PERL lexicon!
# 	could I use these as statics (no class ref or name argument) ??
##########################################################################################
# sub InList
#	search tag lists for inclusion ( grep returns need special treatment, apparantly)
#
#	Author:		David Baron
#	Arguments:	member?, list or array
#	returns:	if found in list

sub InList {
	my ( $tag, @list ) = @_;

	my $ret = grep (/^$tag$/i, @list );
}

##########################################################################################
# sub Ieq
#	Non-case sensitive equality test
#
#	Author:		David Baron
#	Arguments:	class ref, string1, string2
#	returns:	if found in list

sub Ieq {
	my ( $string1, $string2 ) = @_;

	return ( $string1 =~ /^$string2$/i );
}

##########################################################################################
# sub Last
#	return the last value in an array/list
#
#	Author:		David Baron
#	Arguments:	class ref, array
#	returns:	last element, if any

sub Last {
	my ( @array ) = @_;
	my $nItems = @array;

	return $nItems > 0
		? $array [ $nItems - 1 ]
		: undef;

}

####################################################################################
# sub Redir
#	make new path from old path
#
#	Author:		David Baron
#	Arguments:	base directory (or ""), subdirectory (or ""), oldpath
#	Returns:	new path

sub Redir {
	my ( $directory, $subdirectory, $oldpath ) = @_;

	# first, find the separator:
	my $dirsep;
	if ( $directory =~ m#[/\\]# ) {
		$dirsep = $&;
	}
	elsif ( $directory =~ m#[/\\]# ) {
		$dirsep = $&;
	}
	else {
		$dirsep = '/';	# the whole world except DOS
	}

	# strip off trailers if relevant:
	chop ( $directory )	if ( $directory =~ /$dirsep$/ );
	chop ( $subdirectory )	if ( $subdirectory =~ /$dirsep$/ );

	# now, first strip link to barest local path
	$oldpath =~ s/[^$dirsep]*$dirsep//g;

	# now place path components
	$oldpath = $subdirectory . $dirsep . $oldpath	if ( $subdirectory ne "" );
	$oldpath = $directory . $dirsep . $oldpath	if ( $directory ne "" );

	$oldpath;
}

package main;

1;	# required return
------ =_NextPart_000_01BDA7FC.2C51AFB0--