Re: extracting XML?
Gisle Aas (gisle@aas.no)
13 Feb 1998 00:30:48 +0100
Otis Gospodnetic <otis@populus.net> writes:
> can data in an HTML/XML document be extracted with LWP?
I don't know XML, but HTML::Parser should work with most simple
SGML-style markup.
> Things like Author of the document (name/email), Summary, Title, etc.
>
> How about META tag data - are there built-in methods that can get data out
> of keywords, description, and other META tags, or does one have to write
> one's own parser for those tags?
No.
Regards,
Gisle
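
One way to pull the title and META data out yourself is to hang a few handlers on HTML::Parser. What follows is only a sketch under stated assumptions: it assumes the version 3 handler interface of HTML::Parser (the api_version => 3 constructor argument), and extract_head_info is a made-up helper name for illustration, not something LWP provides.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper (not part of LWP or HTML::Parser): collects the
# <title> text and the name/content pairs of <meta> tags.
sub extract_head_info {
    my ($html) = @_;
    my %info = (title => '', meta => {});
    my $in_title = 0;

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Called for every start tag; tag and attribute names arrive lowercased.
        start_h => [ sub {
            my ($tag, $attr) = @_;
            $in_title = 1 if $tag eq 'title';
            if ($tag eq 'meta'
                && defined $attr->{name}
                && defined $attr->{content}) {
                $info{meta}{ lc $attr->{name} } = $attr->{content};
            }
        }, 'tagname, attr' ],
        # Called for every end tag.
        end_h  => [ sub { $in_title = 0 if $_[0] eq 'title' }, 'tagname' ],
        # Called for text; 'dtext' means entities are already decoded.
        text_h => [ sub { $info{title} .= $_[0] if $in_title }, 'dtext' ],
    );

    $parser->parse($html);
    $parser->eof;
    return \%info;
}

my $info = extract_head_info(
    '<html><head><title>Demo page</title>'
  . '<meta name="Author" content="A. N. Other">'
  . '<meta name="keywords" content="lwp, perl"></head></html>'
);
print "Title:  $info->{title}\n";
print "Author: $info->{meta}{author}\n";
```

The META names are lowercased before being stored, so "Author" and "author" end up under the same key. Fetching the document with LWP first and passing the content to the helper is left out here.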
> Wouldn't it be better if the parser does it for you?
It would be wrong. Which browsers do something like that? None of
those around me do it.
Regards,
Gisle
+# $Id: http.pm,v 1.39 1998/02/12 22:24:11 aas Exp $
package LWP::Protocol::http;
@@ -140,7 +140,7 @@
             die "short write" unless $n == length($buf);
             LWP::Debug::conns($buf);
         }
-    } else {
+    } elsif (length($$contRef)) {
         die "write timeout" if $timeout && !$sel->can_write($timeout);
         my $n = $socket->syswrite($$contRef, length($$contRef));
         die $! unless defined($n);
w-perl/lib/HTTP/Date.pm,v
retrieving revision 1.28
retrieving revision 1.29
diff -u -u -r1.28 -r1.29
--- Date.pm 1997/12/02 10:58:31 1.28
+++ Date.pm 1998/02/12 23:13:47 1.29
@@ -290,7 +290,7 @@
     # Should we compensate for the timezone?
     $tz = $default_zone unless defined $tz;
-    return Time::Local::timelocal($sec, $min, $hr, $day, $mon, $yr)
+    return eval {Time::Local::timelocal($sec, $min, $hr, $day, $mon, $yr)}
           unless defined $tz;
     # We can calculate offset for numerical time zones
@@ -299,7 +299,7 @@
         $offset += 60 * $3 if $3;
         $offset *= -1 if $1 && $1 ne '-';
     }
-    Time::Local::timegm($sec, $min, $hr, $day, $mon, $yr) + $offset;
+    eval{Time::Local::timegm($sec, $min, $hr, $day, $mon, $yr) + $offset};
 }
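
The effect of wrapping the Time::Local calls in eval can be seen in isolation. This is a small sketch, not the HTTP::Date code itself: safe_timegm is a hypothetical helper name, and the example relies on Time::Local dying when a field is out of range, which the eval turns into an undef return when called in scalar context.

```perl
use strict;
use warnings;
use Time::Local ();

# Hypothetical helper, not part of HTTP::Date: returns epoch seconds for
# a UTC date, or undef (in scalar context) if Time::Local rejects the values.
sub safe_timegm {
    my ($sec, $min, $hr, $day, $mon, $yr) = @_;   # $mon is 0-based
    return eval { Time::Local::timegm($sec, $min, $hr, $day, $mon, $yr) };
}

my $ok  = safe_timegm(0, 0, 0, 1, 0,  2000);   # 1 Jan 2000, 00:00:00 UTC
my $bad = safe_timegm(0, 0, 0, 1, 12, 2000);   # month 12 is out of range

print defined $ok  ? "valid date:   $ok\n"  : "valid date:   undef\n";
print defined $bad ? "invalid date: $bad\n" : "invalid date: undef\n";
```

Without the eval, the second call would abort the program instead of letting the caller test the result with defined.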