*** ../libwww-perl-0.30/LWP_Changes.pl Tue Sep 20 19:09:38 1994 --- LWP_Changes.pl Tue Sep 20 18:53:23 1994 *************** *** 0 **** --- 1,157 ---- + $Library = 'libwww-perl/0.40'; + __END__ + + The first line of this file sets the User-Agent and Version number of + libwww-perl and should not be changed unless you have modified the code + to work beyond its originally intended purpose AND permission has been + obtained from Roy Fielding at . + + Changes to libwww-perl + ====================== + # $Id: LWP_Changes.pl,v 1.1 1994/09/21 01:53:03 fielding Exp $ + + See the files README.html and Artistic.txt for licensing and distribution info. + See the file INSTALL.txt for installation information. + + If you have any suggestions, bug reports, fixes, or enhancements, + send them to the libwww-perl mailing list at . + + + Known problems + Documentation of the library architecture is sorely lacking, + although the code itself is fairly easy to read and understand. + + Things that need to be done (let us know if you are working on something good) + Interfaces to FTP, Gopher, WAIS, ... + A real HTML (or SGML) parser. + + NOTE: Version numbers increment according to the significance of the new + changes. The major number is incremented only for large overhauls + of the code or changes in the basic architecture/interface which + make it incompatible with prior releases. The first minor number + reflects a change in the interface (such as a new library or a new + method for making requests) which is still compatible with the old. + The last number reflects minor bug fixes and documentation updates. + + + Version 0.40 September 20, 1994 + Changed the name of this file from Changes.txt to LWP_Changes.pl + and moved the $www'Library version name so that it can be set here. + Added a new Makefile to ease the installation process. + Added sys_socket_ph.c to help find problems in SVR4 system installs. + Added hostname.pl so that people can more easily/portably get host names.
+ Fixed more usage of undefined proxy environment vars (from Martijn Koster). + + Revamped the "get" program: Added code to show original headers if they + were received; Added tout= to interactively change the timeout value; + Added ims= to interactively give If-Modified-Since; Added handling of + POST content suggested by Mel Melchner; Added command-line options, + debug and quiet modes such that the program is now ideal for testing + server/proxy responses to requests; Added "get -h" usage information. + Added $headers initialization to other test clients as well. + + Fixed a number of problems with the handling of years in wwwdates.pl, + most of which were due to limitations in timelocal.pl, which goes into + an infinite loop if (year >= 2038). Now handles 2 and 4-digit years + regardless of date format. + + In wwwhttp.pl, removed unnecessary bind() and host stuff (Jack Shirazi); + Added a bunch of alarm() calls to lessen timeout problems; Now replaces + empty paths with "/" (Marc VanHeyningen); Changed name of &timeout + routine to &timed_out to avoid confusion with $timeout parameter. + + In wwwurl.pl, the parsing sets were renamed to match the IETF draft + on Relative URLs and the method used to test them was changed to use + bitmap masks. Modified parsing algorithm to use the new sets. + Parser now handles URLs like http://host:/ and uses the leftmost "?" + as the start of query info. Added caching of base URL components so + that they don't get re-parsed for every URL in a document. Now allows + lowercase hex digits in unescape(). + + Added a source code "contrib" directory at + for use as + a half-way house for wayward programs. + + + Version 0.30 August 1, 1994 + Added the wwwmailcap.pl library for handling MIME mailcap files, + www'get_def_header() for reading the default headers, and www'lrequest() + for doing autoredirected requests (all submitted by Brooks Cutter). + Removed the default headers from the www'stat() interface. 
+ Changed the testbot and wwwbot'allowed interface to make use of the + default User-Agent header. + Firmed-up the URL parsing algorithm in wwwurl.pl (particularly relating + to the parsing of relative URLs) to coincide with the IETF standards + discussion. This fixed several potential (but unlikely) bugs and also + got rid of any "URL:" prefix parsing [finally!]. + Fixed parsing in wwwhtml.pl of href's that had a new-line after the + quote mark, causing an extra space to precede the extracted URL, + which in turn created a black hole. Also added code to extract and change + the base URL if there exists a <BASE> element. + Updated the wording in Artistic.txt to represent a Perl API rather than + a compiler written in C (as is the Perl distribution). + + + Version 0.20 July 20, 1994 + Added the wwwbot.pl library and testbot program (by Brooks Cutter) + for implementing the robot exclusion protocol. + Added the testlinks program for yet another example of how useful + programs can be easily implemented on top of libwww-perl -- it also + tests just about every aspect of the request libraries. + Added &www'set_def_header() and check_defaults() so that protocol + header defaults (such as the HTTP From: header) can be set within + the library and other default request headers can be set + once by the client and affect all requests (e.g. User-Agent). + Fixed the source of an annoying warning from "perl -w" in wwwhttp.pl. + Moved some existing code in wwwmime.pl into a separate function + set_content() which can set the "content-type" header for any given + file extension. + Added &wwwurl'get_site() for extracting the site name (server:port) + from a given URL. + Updated the get program to make use of the new interface changes. + Changed the eval of &wwwscheme'request to a simpler &$routine call + after a suggestion from Brooks. + Fixed a bug in &wwwhtml'extract_links() which was causing a segmentation + fault when a completely -free file (i.e.
a text file) was + mistakenly extracted. + + + Version 0.12 July 8, 1994 + Placed everything under RCS version control and included repository. + Added www'stat (from Brooks Cutter) for doing stat-like calls on a URL. + Added message field to wwwerror'onrequest so that error-specific + messages (e.g. $@ and $!) can be included in the canned HTML output. + Added symbolic names for all response code numbers. + Reassigned 000 Timed Out error to response code 603. + Added 602 Connection Failed response code. + Vastly improved the error-handling for wwwhttp'request(). + Now escapes the URL entries generated by wwwfile'dirlist(). + Removed buggy attempt to delete comments at start of wwwhtml'extract_links. + Updated META parsing in wwwhtml to reflect HTML 2.0 proposed spec. + Moved require of sys/socket.ph outside of wwwhttp package declaration + due to a bug in perl4 found by Martijn Koster. + Added many checks to be sure environment variables are defined before + trying to use them in wwwmime, wwwurl, get, and testdates (Martijn Koster). + Fixed bug that occurred when parsing URLs with an empty path. + Replaced complicated wwwurl'unescape loop with a simple substitute + (from Steven E. Brenner via Brooks Cutter). + Added wwwurl'escape() to %hex escape URL segments (from Brooks Cutter). + Added testescapes program for testing wwwurl'escape and unescape. + + + Version 0.11 June 17, 1994 + Changed environment variable LIBWWW-PERL to LIBWWW_PERL because + some systems can't handle the dash (Charlie Stross). + Fixed bug in "get" that caused full pathname to be used as the method + (Martijn Koster). + Fixed handling of perverse relative URLs (e.g. ../../) in wwwurl'absolute. + + + Version 0.10 June 13, 1994 + First public version. libwww-perl was developed by Roy Fielding + from the core of MOMspider, a program intended to assist multi-owner + maintenance of distributed hypertext infostructures. 
It was expanded + to a general-purpose library after some encouragement from + Oscar Nierstrasz and Martijn Koster during the First International + Conference on the World-Wide Web (WWW94). + *** ../libwww-perl-0.30/hostname.pl Tue Sep 20 19:09:39 1994 --- hostname.pl Tue Sep 20 18:25:17 1994 *************** *** 0 **** --- 1,49 ---- + # $Id: hostname.pl,v 1.1 1994/09/21 01:23:18 fielding Exp $ + # --------------------------------------------------------------------------- + # hostname.pl: A package for getting the fully-qualified domain name (FQDN) + # of the operating host machine. This is used for From: and + # Reply-To: addresses sent to other (possibly distant) machines. + # + # Usage: require "hostname.pl"; + # $my_name = $hostname'FQDN; + # + # This package has been developed by Roy Fielding + # as part of the Arcadia project at the University of California, Irvine. + # It is distributed under the Artistic License (included with your Perl + # distribution files and with the standard distribution of this package). + # + # 17 Sep 1994 (RTF): Initial version + # + # If you have any suggestions, bug reports, fixes, or enhancements, + # send them to the libwww-perl mailing list at . + # --------------------------------------------------------------------------- + + package hostname; + + chop($host = `hostname`); # The preferred BSD method + if (!$host) + { + chop($host = `uuname -l`); # The UUCP method (very old, but not dumb) + if (!$host) + { + chop($host = `uname -n`); # The POSIX method (very dumb) + if (!$host) + { + $host = $ENV{'HOST'} || # The desperation method + $ENV{'host'} || + die "Can't find the hostname for this machine, stopped"; + } + } + } + + if (index($host,'.') == -1) # Is it not fully-qualified?
+ { + ($FQDN, $aliases, $addrtype, $len, @addrs) = gethostbyname($host); + if (!$FQDN) { die "Unknown host $host, stopped"; } + } + else { $FQDN = $host; } + + # ================== + # print $FQDN, "\n"; # Uncomment for testing via "perl hostname.pl" + + 1; *** ../libwww-perl-0.30/www.pl Mon Aug 1 06:32:52 1994 --- www.pl Tue Sep 20 18:26:28 1994 *************** *** 1,4 **** ! # $Id: www.pl,v 0.15 1994/08/01 13:32:38 fielding Exp $ # --------------------------------------------------------------------------- # www.pl: A package for handling requests of any World-Wide Web URL, # including requests that should be redirected to a proxy server. --- 1,4 ---- ! # $Id: www.pl,v 0.16 1994/09/21 01:23:18 fielding Exp $ # --------------------------------------------------------------------------- # www.pl: A package for handling requests of any World-Wide Web URL, # including requests that should be redirected to a proxy server. *************** *** 24,36 **** # Changed the request eval to version suggested by Brooks. # 31 Jul 1994 (RTF): Added get_def_header() and lrequest() (from Brooks). # Removed default headers from the stat() interface. # # If you have any suggestions, bug reports, fixes, or enhancements, # send them to the libwww-perl mailing list at . # --------------------------------------------------------------------------- require "wwwurl.pl"; - require "wwwmime.pl"; require "wwwerror.pl"; require "wwwhttp.pl"; # Note that there should eventually be a wwwSCHEME require "wwwfile.pl"; # package for each supported protocol scheme. --- 24,39 ---- # Changed the request eval to version suggested by Brooks. # 31 Jul 1994 (RTF): Added get_def_header() and lrequest() (from Brooks). # Removed default headers from the stat() interface. + # 19 Sep 1994 (RTF): Added hostname.pl to satisfy those non-BSD people. + # Fixed usage of undefined proxy vars (from Martijn Koster). 
# # If you have any suggestions, bug reports, fixes, or enhancements, # send them to the libwww-perl mailing list at . # --------------------------------------------------------------------------- + require "hostname.pl"; require "wwwurl.pl"; require "wwwerror.pl"; + require "wwwdates.pl"; require "wwwhttp.pl"; # Note that there should eventually be a wwwSCHEME require "wwwfile.pl"; # package for each supported protocol scheme. *************** *** 38,44 **** # and a "request" subroutine. package www; ! $Library = 'libwww-perl/0.30'; # To be appended onto client's User-Agent # ========================================================================== # Get the default From address for HTTP requests and add it to defaults. --- 41,47 ---- # and a "request" subroutine. package www; ! require "LWP_Changes.pl"; # Imports Library Version Number # ========================================================================== # Get the default From address for HTTP requests and add it to defaults. *************** *** 47,61 **** @DefHeaderSchemes = (); @DefHeaderValues = (); - chop($host = `hostname`); - if (index($host,'.') == -1) - { - $host = join('.', $host, `domainname`); - chop($host); - } $user = ( $ENV{'USER'} || $ENV{'LOGNAME'} || 'unknown' ); ! &set_def_header('http', 'From', join('@', $user, $host)); # ========================================================================== --- 50,58 ---- @DefHeaderSchemes = (); @DefHeaderValues = (); $user = ( $ENV{'USER'} || $ENV{'LOGNAME'} || 'unknown' ); ! &set_def_header('http', 'From', join('@', $user, $hostname'FQDN)); # ========================================================================== *************** *** 67,78 **** # request(): perform a WWW request using the passed method, absolute URL, # and request headers, and return the resulting response code. # The response codes for all protocols mirror those of HTTP. ! # Also returns as parameters the response %headers and # document $content. 
$timeout is specified in seconds. # # This is the primary interface to libwww-perl. Use the following # format to request a WWW document: # # $respcode = &www'request($method, $url, *headers, *content, $timeout); # # WHERE, --- 64,79 ---- # request(): perform a WWW request using the passed method, absolute URL, # and request headers, and return the resulting response code. # The response codes for all protocols mirror those of HTTP. ! # Also returns as parameters the response $headers, %headers and # document $content. $timeout is specified in seconds. # # This is the primary interface to libwww-perl. Use the following # format to request a WWW document: # + # local($content) = ''; + # local($headers) = ''; + # local(%headers) = (); + # # $respcode = &www'request($method, $url, *headers, *content, $timeout); # # WHERE, *************** *** 83,92 **** # # $url: A WWW Uniform Resource Locator in absolute form. # # %headers: (Incoming) Request headers for request, e.g. ! # $headers{'User-Agent'} = 'MOMspider/0.1'." $www'Library"; # ! # (Returned) Response headers from result (in lower-case), e.g. # $headers{'content-type'} = 'text/html'; # # $content: (Incoming) Document to send for methods POST, PUT, etc. --- 84,96 ---- # # $url: A WWW Uniform Resource Locator in absolute form. # + # $headers: (Incoming) Ignored + # (Returned) The actual headers returned from the network request + # # %headers: (Incoming) Request headers for request, e.g. ! # $headers{'User-Agent'} = "MOMspider/0.1 $www'Library"; # ! # (Returned) Response headers from result (parsed and lower-case), # $headers{'content-type'} = 'text/html'; # # $content: (Incoming) Document to send for methods POST, PUT, etc. *************** *** 247,255 **** } } ! local($pcheck) = join(//, q/$ENV{'/, $scheme, q/_proxy'}/); ! return (eval "$pcheck;"); } --- 251,259 ---- } } ! local($pcheck) = q/$ENV{'/ . $scheme . q/_proxy'}/; ! 
return (eval "$pcheck if defined($pcheck);"); } *************** *** 302,309 **** sub stat { local($url) = @_; - local(%headers, $content, $response, $last_modified); $response = &request('HEAD', $url, *headers, *content, 30); if ($headers{'last-modified'}) --- 306,318 ---- sub stat { local($url) = @_; + local($content) = ''; + local($headers) = ''; + local(%headers) = (); + local($response) = 0; + local($last_modified) = 0; + $response = &request('HEAD', $url, *headers, *content, 30); if ($headers{'last-modified'}) *************** *** 345,352 **** # $url: A WWW Uniform Resource Locator in absolute form. If the request # is redirected, $url will be changed to reflect the new URL. # # %headers: (Incoming) Request headers for request, e.g. ! # $headers{'User-Agent'} = 'MOMspider/0.1'." $www'Library"; # # (Returned) Response headers from result (in lower-case), e.g. # $headers{'content-type'} = 'text/html'; --- 354,364 ---- # $url: A WWW Uniform Resource Locator in absolute form. If the request # is redirected, $url will be changed to reflect the new URL. # + # $headers: (Incoming) Ignored + # (Returned) The actual headers returned from the last net request + # # %headers: (Incoming) Request headers for request, e.g. ! # $headers{'User-Agent'} = "MOMspider/0.1 $www'Library"; # # (Returned) Response headers from result (in lower-case), e.g. # $headers{'content-type'} = 'text/html'; *************** *** 365,371 **** foreach $idx (1 .. 10) { ! $response = &www'request($method, $url, *headers, *content, $timeout); last unless ($response =~ /^30[12]$/); last if ($idx == 10); --- 377,383 ---- foreach $idx (1 .. 10) { ! 
$response = &request($method, $url, *headers, *content, $timeout); last unless ($response =~ /^30[12]$/); last if ($idx == 10); *************** *** 385,390 **** --- 397,403 ---- next if ($hd =~ m#^[A-Z]#); delete $headers{$hd}; } + $headers = ''; } return($response); } *** ../libwww-perl-0.30/wwwbot.pl Mon Aug 1 06:32:09 1994 --- wwwbot.pl Tue Sep 20 18:26:51 1994 *************** *** 1,4 **** ! # $Id: wwwbot.pl,v 1.2 1994/08/01 13:32:01 fielding Exp $ # --------------------------------------------------------------------------- # wwwbot.pl: This library implements the Robot Exclusion protocol # (draft 6/30/94) as documented on --- 1,4 ---- ! # $Id: wwwbot.pl,v 1.3 1994/09/21 01:23:18 fielding Exp $ # --------------------------------------------------------------------------- # wwwbot.pl: This library implements the Robot Exclusion protocol # (draft 6/30/94) as documented on *************** *** 25,30 **** --- 25,31 ---- # Wrote documentation and examples for wwwbot routines # 20 Jul 1994 (RTF): Reformatted a bit for inclusion in standard libwww-perl. # 30 Jul 1994 (RTF): Changed interface to make use of default User-Agent. + # 20 Sep 1994 (RTF): Added initialization of $headers # # If you have any suggestions, bug reports, fixes, or enhancements, # send them to the libwww-perl mailing list at . *************** *** 244,250 **** sub load_robots { local($host, $port, $user_agent) = @_; ! local(%headers, $content, $response, $url, $n, $ua, $dis); local($timeout) = 30; --- 245,251 ---- sub load_robots { local($host, $port, $user_agent) = @_; ! local($headers, %headers, $content, $response, $url, $n, $ua, $dis); local($timeout) = 30; *************** *** 253,258 **** --- 254,260 ---- $url = "http://$host:$port/robots.txt"; %headers = (); + $headers = ''; $content = ''; $response = &www'request('GET', $url, *headers, *content, $timeout); *** ../libwww-perl-0.30/wwwdates.pl Fri Jul 8 01:10:06 1994 --- wwwdates.pl Tue Sep 20 18:27:19 1994 *************** *** 1,8 **** ! 
# $Id: wwwdates.pl,v 0.12 1994/07/08 08:08:14 fielding Exp $ # --------------------------------------------------------------------------- # wwwdates: A package for manipulating date/time stamps in the format used # on the World-Wide Web. # # This package has been developed by Roy Fielding # as part of the Arcadia project at the University of California, Irvine. # Each routine in this package has been derived from the work of multiple --- 1,11 ---- ! # $Id: wwwdates.pl,v 0.13 1994/09/21 01:23:18 fielding Exp $ # --------------------------------------------------------------------------- # wwwdates: A package for manipulating date/time stamps in the format used # on the World-Wide Web. # + # NOTE: Due to the limitations of timelocal.pl, this library can only + # handle dates between "01 Jan 1970" and "01 Jan 2038". + # # This package has been developed by Roy Fielding # as part of the Arcadia project at the University of California, Irvine. # Each routine in this package has been derived from the work of multiple *************** *** 12,20 **** # distribution files). # # 09 Jun 1994 (RTF): Initial Version # # If you have any suggestions, bug reports, fixes, or enhancements, ! # send them to the author Roy Fielding at . # --------------------------------------------------------------------------- # # To convert machine time to ascii www date: --- 15,26 ---- # distribution files). # # 09 Jun 1994 (RTF): Initial Version + # 18 Sep 1994 (RTF): Fixed a number of problems with the handling of years, + # most of which were due to limitations in timelocal.pl + # which goes into an infinite loop if (year >= 2038). # # If you have any suggestions, bug reports, fixes, or enhancements, ! # send them to the libwww-perl mailing list at . # --------------------------------------------------------------------------- # # To convert machine time to ascii www date: *************** *** 66,73 **** ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = ($tz eq 'GMT') ? 
gmtime($time) : localtime($time); ! $year += ($year < 70) ? 2000 : 1900; ! sprintf("%s, %02d %s %4d %02d:%02d:%02d %s", substr($DoW[$wday],0,3), $mday, $MoY[$mon], $year, $hour, $min, $sec, $tz); } --- 72,79 ---- ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = ($tz eq 'GMT') ? gmtime($time) : localtime($time); ! $year += 1900; ! sprintf("%s, %02d %s %04d %02d:%02d:%02d %s", substr($DoW[$wday],0,3), $mday, $MoY[$mon], $year, $hour, $min, $sec, $tz); } *************** *** 98,104 **** ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = ($tz eq 'GMT') ? gmtime($time) : localtime($time); ! sprintf("%s, %02d-%s-%2d %02d:%02d:%02d %s", $DoW[$wday], $mday, $MoY[$mon], $year, $hour, $min, $sec, $tz); } --- 104,112 ---- ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = ($tz eq 'GMT') ? gmtime($time) : localtime($time); ! if ($year > 99) { $year %= 100; } ! ! sprintf("%s, %02d-%s-%02d %02d:%02d:%02d %s", $DoW[$wday], $mday, $MoY[$mon], $year, $hour, $min, $sec, $tz); } *************** *** 117,130 **** # "Thu Feb 3 17:03:55 GMT 1994" -- ctime format # "Wed, 09 Feb 1994 22:23:32 GMT" -- proposed new HTTP format # "Tuesday, 08-Feb-94 14:15:29 GMT" -- old rfc850 HTTP format # # "03/Feb/1994:17:03:55 -0700" -- common logfile format # "09 Feb 1994 22:23:32 GMT" -- proposed new HTTP format (no weekday) # "08-Feb-94 14:15:29 GMT" -- old rfc850 HTTP format (no weekday) # ! # "08-Feb-94" -- old rfc850 HTTP format (no weekday, no time) ! # "09 Feb 1994" -- proposed new HTTP format (no weekday, no time) ! 
# "03/Feb/1994" -- common logfile format (no time, no offset) sub get_gmtime { --- 125,141 ---- # "Thu Feb 3 17:03:55 GMT 1994" -- ctime format # "Wed, 09 Feb 1994 22:23:32 GMT" -- proposed new HTTP format # "Tuesday, 08-Feb-94 14:15:29 GMT" -- old rfc850 HTTP format + # "Tuesday, 08-Feb-1994 14:15:29 GMT" -- broken rfc850 HTTP format # # "03/Feb/1994:17:03:55 -0700" -- common logfile format # "09 Feb 1994 22:23:32 GMT" -- proposed new HTTP format (no weekday) # "08-Feb-94 14:15:29 GMT" -- old rfc850 HTTP format (no weekday) + # "08-Feb-1994 14:15:29 GMT" -- broken rfc850 HTTP format(no weekday) # ! # "08-Feb-94" -- old rfc850 HTTP format (no weekday, no time) ! # "08-Feb-1994" -- broken rfc850 HTTP format (no weekday, no time) ! # "09 Feb 1994" -- proposed new HTTP format (no weekday, no time) ! # "03/Feb/1994" -- common logfile format (no time, no offset) sub get_gmtime { *************** *** 148,160 **** $day = shift(@w); $atime = shift(@w); shift(@w); ! $yr = shift(@w) - 1900; } elsif ($w[0] =~ m#/#) # Must be common logfile (03/Feb/1994:17:03:55 -0700) { ($adate, $atime) = split(/:/, $w[0], 2); ($day, $mn, $yr) = split(/\//, $adate); - $yr -= 1900; shift(@w); if ( $w[0] =~ m#^([+-])(\d\d)(\d\d)$# ) { --- 159,170 ---- $day = shift(@w); $atime = shift(@w); shift(@w); ! $yr = shift(@w); } elsif ($w[0] =~ m#/#) # Must be common logfile (03/Feb/1994:17:03:55 -0700) { ($adate, $atime) = split(/:/, $w[0], 2); ($day, $mn, $yr) = split(/\//, $adate); shift(@w); if ( $w[0] =~ m#^([+-])(\d\d)(\d\d)$# ) { *************** *** 172,178 **** { $day = shift(@w); $mn = shift(@w); ! $yr = shift(@w) - 1900; $atime = shift(@w); } if ($atime) --- 182,188 ---- { $day = shift(@w); $mn = shift(@w); ! $yr = shift(@w); $atime = shift(@w); } if ($atime) *************** *** 184,190 **** $hr = $min = $sec = 0; } ! if (!$mn || ($yr < 70)) { return 0; } # Translate month name to number $midx = index($Mstr, substr($mn,0,3)); --- 194,205 ---- $hr = $min = $sec = 0; } ! 
if (!$mn || ($yr !~ /\d+/)) { return 0; } ! if (($yr > 99) && ($yr < 1970)) { return 0; } # Epoch started in 1970 ! ! if ($yr < 70) { $yr += 100; } ! if ($yr >= 1900) { $yr -= 1900; } ! if ($yr >= 138) { return 0; } # Epoch counter maxes out in year 2038 # Translate month name to number $midx = index($Mstr, substr($mn,0,3)); *** ../libwww-perl-0.30/wwwhttp.pl Wed Jul 20 09:15:06 1994 --- wwwhttp.pl Tue Sep 20 18:28:07 1994 *************** *** 1,4 **** ! # $Id: wwwhttp.pl,v 0.13 1994/07/20 16:14:56 fielding Exp $ # --------------------------------------------------------------------------- # wwwhttp: A package for sending HTTP requests and handling responses for the # World-Wide Web. This package is designed for use by www.pl --- 1,4 ---- ! # $Id: wwwhttp.pl,v 0.15 1994/09/21 01:23:18 fielding Exp $ # --------------------------------------------------------------------------- # wwwhttp: A package for sending HTTP requests and handling responses for the # World-Wide Web. This package is designed for use by www.pl *************** *** 17,28 **** # a bug in perl4 found by Martijn Koster # Fixed error handling in case of problems in eval. # 19 Jul 1994 (RTF): Fixed nagging warning from perl -w that made no sense. # # If you have any suggestions, bug reports, fixes, or enhancements, ! # send them to Roy Fielding at . # --------------------------------------------------------------------------- ! # Some of these routines are reduced versions of those distributed by ! # Oscar Nierstrasz from CUI, University of Geneva. # See for more info. # =========================================================================== require "wwwerror.pl"; --- 17,31 ---- # a bug in perl4 found by Martijn Koster # Fixed error handling in case of problems in eval. # 19 Jul 1994 (RTF): Fixed nagging warning from perl -w that made no sense. 
+ # 17 Sep 1994 (RTF): Removed unnecessary bind() and host stuff (Jack Shirazi); + # Added a bunch of alarm() calls to lessen timeout problems; + # Replaces empty paths with "/" (Marc VanHeyningen). # # If you have any suggestions, bug reports, fixes, or enhancements, ! # send them to the libwww-perl mailing list at . # --------------------------------------------------------------------------- ! # Some of these routines are enhanced versions of those distributed by ! # Oscar Nierstrasz from IAM, University of Berne. # See for more info. # =========================================================================== require "wwwerror.pl"; *************** *** 43,60 **** 'SHOWMETHOD', 1, ); - # - # Setup the socket parameters for this process - # - $SockAddr = 'S n a4 x8'; - chop($ThisHost = `hostname`); - if (!$ThisHost) { die "Can't get hostname of this host, stopped"; } - ($name, $aliases, $Proto) = getprotobyname("tcp"); - ($name, $aliases, $addrtype, $len, $ThisAddr) = gethostbyname($ThisHost); - $ThisSock = pack($SockAddr, &main'AF_INET, 0, $ThisAddr); - - # =========================================================================== # request(): perform an http request for the $object at the HTTP server # on the specified $host and $port, giving up after $timeout seconds. --- 46,52 ---- *************** *** 81,86 **** --- 73,81 ---- "Library does not allow that method for HTTP"); } + if (!$object) { $object = '/'; } # Trailing slash is optional on URLs, + # but is required for HTTP server root. + $reqstr = "$method $object HTTP/1.0\r\n"; foreach $hd (keys(%headers)) { *************** *** 108,116 **** } } ! $that = pack($SockAddr, &main'AF_INET, $port, $thataddr); ! if (!( socket(FS, &main'AF_INET, &main'SOCK_STREAM, $Proto) && ! bind(FS, $ThisSock) )) { return &wwwerror'onrequest($wwwerror'RC_connection_failed, $method, 'http', $host, $port, $object, *headers, *content, --- 103,110 ---- } } ! $that = pack('S n a4 x8', &main'AF_INET, $port, $thataddr); ! if (! 
socket(FS, &main'AF_INET, &main'SOCK_STREAM, 0)) { return &wwwerror'onrequest($wwwerror'RC_connection_failed, $method, 'http', $host, $port, $object, *headers, *content, *************** *** 119,127 **** local($/); $run_it = <<'EOF'; ! $SIG{'ALRM'} = "wwwhttp'timeout"; alarm($timeout); connect(FS, $that) || die "Cannot connect to $host:$port, $! \n"; select((select(FS), $| = 1)[0]); # Make FS unbuffered print FS $reqstr; if ($AllowedMethods{$method} == 2) { print FS $content; } --- 113,122 ---- local($/); $run_it = <<'EOF'; ! $SIG{'ALRM'} = "wwwhttp'timed_out"; alarm($timeout); connect(FS, $that) || die "Cannot connect to $host:$port, $! \n"; + alarm($timeout); select((select(FS), $| = 1)[0]); # Make FS unbuffered print FS $reqstr; if ($AllowedMethods{$method} == 2) { print FS $content; } *************** *** 128,140 **** $/ = "\n"; $_ = <FS>; ! if (m:^HTTP/\S+\s+(\d+)\s:) # HTTP/1.0 or better { $response = $1; while(<FS>) { last if /^[\r\n]+$/; # end of header $resphead .= $_; } undef($/); $content = <FS>; --- 123,142 ---- $/ = "\n"; $_ = <FS>; ! die "No response.\n" unless defined($_); ! ! $timeout <<= 2; # Quadruple timeout after 1st response ! alarm($timeout); ! if (m:^HTTP/\S+\s+(\d+)\s:) # HTTP/1.0 or better { $response = $1; + $headers = $_; # pass real headers back to client while(<FS>) { + alarm($timeout); last if /^[\r\n]+$/; # end of header $resphead .= $_; + $headers .= $_; } undef($/); $content = <FS>; *************** *** 141,153 **** } else # old style server reply { - $response = $wwwerror'RC_ok; # I have no idea if it's good or not - undef($/); $content = $_; $_ = <FS>; $content .= $_; } $SIG{'ALRM'} = "IGNORE"; EOF eval $run_it; --- 143,156 ---- } else # old style server reply { $content = $_; + $response = $wwwerror'RC_ok; # Assume it is a good response + undef($/); $_ = <FS>; $content .= $_; } $SIG{'ALRM'} = "IGNORE"; + alarm(0); EOF eval $run_it; *************** *** 154,162 **** if ($@) { $SIG{'ALRM'} = "IGNORE"; close(FS); !
if ($@ =~ /^Time/o) { $response = $wwwerror'RC_timed_out; } ! else { $response = $wwwerror'RC_connection_failed; } return &wwwerror'onrequest($response, $method, 'http', $host, $port, $object, *headers, *content, $@); --- 157,167 ---- if ($@) { $SIG{'ALRM'} = "IGNORE"; + alarm(0); close(FS); ! if ($@ =~ /^Time/o) { $response = $wwwerror'RC_timed_out; } ! elsif ($@ =~ /^No r/o) { $response = $wwwerror'RC_bad_response; } ! else { $response = $wwwerror'RC_connection_failed; } return &wwwerror'onrequest($response, $method, 'http', $host, $port, $object, *headers, *content, $@); *************** *** 166,172 **** return $response; } ! sub timeout { die "Timed Out\n"; } # =========================================================================== --- 171,177 ---- return $response; } ! sub timed_out { die "Timed Out\n"; } # =========================================================================== *** ../libwww-perl-0.30/wwwurl.pl Mon Aug 1 06:31:17 1994 --- wwwurl.pl Tue Sep 20 18:29:06 1994 *************** *** 1,4 **** ! # $Id: wwwurl.pl,v 0.14 1994/08/01 13:30:59 fielding Exp $ # --------------------------------------------------------------------------- # wwwurl: A package for parsing and manipulating World-Wide Web # Uniform Resource Locators (URL). --- 1,4 ---- ! # $Id: wwwurl.pl,v 0.15 1994/09/21 01:23:18 fielding Exp $ # --------------------------------------------------------------------------- # wwwurl: A package for parsing and manipulating World-Wide Web # Uniform Resource Locators (URL). *************** *** 19,27 **** # 27 Jul 1994 (RTF): Firmed-up algorithm for parsing relative URLs, fixing # several potential (but unlikely) bugs in the process. # Removed any hint of "URL:" prefix. # # If you have any suggestions, bug reports, fixes, or enhancements, ! # send them to Roy Fielding at . 
# --------------------------------------------------------------------------- package wwwurl; --- 19,35 ---- # 27 Jul 1994 (RTF): Firmed-up algorithm for parsing relative URLs, fixing # several potential (but unlikely) bugs in the process. # Removed any hint of "URL:" prefix. + # 17 Sep 1994 (RTF): Renamed parsing sets to match IETF draft and changed + # how they are tested to use bitmap masks; + # Modified parsing algorithm to use new sets; + # Parser now handles URLs like http://host:/ and uses + # the leftmost "?" as the start of query info; + # Added caching of base URL components so that they don't + # get re-parsed for every URL in a document. + # Allowed lowercase hex digits in unescape. # # If you have any suggestions, bug reports, fixes, or enhancements, ! # send them to the libwww-perl mailing list at . # --------------------------------------------------------------------------- package wwwurl; *************** *** 41,74 **** 'prospero', 1525, # I thought it was 191, but IETF differs ); ! %CantChange = ( # Define schemes that cannot be altered by absolute ! 'mailto', 1, ! 'news', 1, ! 'mid', 1, ! 'cid', 1, ! ); ! %NonHierarchical = ( # Define remaining schemes that can be changed ! 'telnet', 1, # but which cannot use relative URL paths ! 'rlogin', 1, ! 'tn3270', 1, ! 'whois', 1, ! 'gopher', 1, ! 'finger', 1, ! ); ! %UsesQuery = ( # Define schemes that use '?' to denote a query ! 'http', 1, ! 'wais', 1, ); ! %UsesParams = ( # Define schemes that use ';' to denote parameters ! 'ftp', 1, ! 'prospero', 1, ! ); # =========================================================================== # parse(): Parse the given URL into its component parts according to # WWW URI rules, returning '' for those that are not present. # If no scheme is given, the URL is parsed according to HTTP rules, --- 49,102 ---- 'prospero', 1525, # I thought it was 191, but IETF differs ); ! # =========================================================================== ! 
# The following six categories are bitmap masks for determining membership ! # in the corresponding URL syntactic set, as per the IETF/URI working group ! # draft specification for Relative URLs . ! $UsesRelative = 1; ! $UsesNetloc = 2; ! $NonHierarchical = 4; ! $UsesParams = 8; ! $UsesQuery = 16; ! $UsesFragment = 32; ! %InSet = ( # Define scheme membership in each category ! '', ($UsesRelative | $UsesNetloc | $UsesFragment | $UsesQuery), ! 'http', ($UsesRelative | $UsesNetloc | $UsesFragment | $UsesQuery), ! 'file', ($UsesRelative | $UsesNetloc | $UsesFragment), ! 'ftp', ($UsesRelative | $UsesNetloc | $UsesFragment | $UsesParams), ! 'prospero', ($UsesRelative | $UsesNetloc | $UsesFragment | $UsesParams), ! 'nntp', ($UsesRelative | $UsesNetloc | $UsesFragment), ! 'gopher', ($UsesRelative | $UsesNetloc | $NonHierarchical | ! $UsesFragment), ! 'wais', ($UsesRelative | $UsesNetloc | $NonHierarchical | $UsesQuery | ! $UsesFragment), ! 'mailto', ($NonHierarchical), ! 'news', ($NonHierarchical | $UsesFragment), ! 'finger', ($UsesNetloc | $NonHierarchical | $UsesFragment), ! 'whois', ($UsesNetloc | $NonHierarchical | $UsesFragment), ! 'webster', ($UsesNetloc | $NonHierarchical | $UsesFragment), ! 'telnet', ($UsesNetloc | $NonHierarchical), ! 'rlogin', ($UsesNetloc | $NonHierarchical), ! 'tn3270', ($UsesNetloc | $NonHierarchical), ); ! # =========================================================================== ! # The following package globals are used to cache the last Base URL parsed. + $Burl = ''; + $Bsch = ''; + $Baddr = ''; + $Bport = ''; + $Bpath = ''; + $Bquery = ''; + $Bfrag = ''; + $Bmem = 0; + # =========================================================================== + # =========================================================================== # parse(): Parse the given URL into its component parts according to # WWW URI rules, returning '' for those that are not present. 
# If no scheme is given, the URL is parsed according to HTTP rules, *************** *** 79,87 **** # $scheme : The access scheme (converted to lower case); # $address: The login or hostname/IP address (if appropriate); # $port : The TCP port (if appropriate); ! # $path : The object path; ! # $query : The post-'?' search info (only if scheme uses queries); ! # $frag : The post-'#' fragment identifier. # sub parse { --- 107,115 ---- # $scheme : The access scheme (converted to lower case); # $address: The login or hostname/IP address (if appropriate); # $port : The TCP port (if appropriate); ! # $path : The object path (plus any params); ! # $query : The post-'?' search info (if scheme uses queries); ! # $frag : The post-'#' fragment identifier (if uses fragments). # sub parse { *************** *** 99,119 **** $scheme =~ tr/A-Z/a-z/; } ! if ($url =~ s/#([^#]*)$//) { $frag = $1; } ! if ($url =~ m#^//#o) { $url =~ s#^//([^/]*)##; $address = $1; ! if ($address =~ s/:(\d+)$//) { $port = $1; } } ! if (!$scheme || $UsesQuery{$scheme}) { ! if ($url =~ s/\?([^?]*)$//) { $query = $1; } } $path = $url; --- 127,152 ---- $scheme =~ tr/A-Z/a-z/; } ! local($member) = $InSet{$scheme} || 0; ! if ($member & $UsesFragment) { + if ($url =~ s/#([^#]*)$//) { $frag = $1; } + } + + if (($member & $UsesNetloc) && ($url =~ m#^//#o)) + { $url =~ s#^//([^/]*)##; $address = $1; ! if ($address =~ s/:(\d*)$//) { $port = $1; } } ! if ($member & $UsesQuery) { ! if ($url =~ s/\?(.*)$//) { $query = $1; } } $path = $url; *************** *** 130,136 **** # $scheme : The access scheme; # $address: The hostname/IP address; # $port : The TCP port; ! # $path : The object path; # $query : The post-? search info # $frag : The post-'#' fragment identifier # --- 163,169 ---- # $scheme : The access scheme; # $address: The hostname/IP address; # $port : The TCP port; ! # $path : The object path (plus any params); # $query : The post-? 
search info # $frag : The post-'#' fragment identifier # *************** *** 167,173 **** { local($url) = @_; ! $url =~ s/%([\dA-F][\dA-F])/pack("C",hex($1))/ge; return $url; } --- 200,206 ---- { local($url) = @_; ! $url =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack("C",hex($1))/ge; return $url; } *************** *** 191,201 **** # =========================================================================== # absolute(): Return the absolute URL given a (possibly relative) URL ! # and its parent's absolute URL. # sub absolute { ! local($parent, $url) = @_; $url =~ s/^\s+//; # Remove any preceding whitespace $url =~ s/\s.*//; # Remove anything after first word --- 224,235 ---- # =========================================================================== # absolute(): Return the absolute URL given a (possibly relative) URL ! # and the document's absolute base URL. Uses the $B* variables ! # to cache the last Base URL parsed. # sub absolute { ! local($base, $url) = @_; $url =~ s/^\s+//; # Remove any preceding whitespace $url =~ s/\s.*//; # Remove anything after first word *************** *** 202,251 **** local($scheme, $addr, $port, $path, $query, $frag) = &parse($url); ! return $url if ($CantChange{$scheme}); RELATED: { ! if (!$parent) # If no parent was given then it can't be relative { if (!$scheme) { $scheme = 'file' } # Default to a file URL last RELATED; } ! local($psch,$paddr,$pport,$ppath,$pquery,$pfrag) = &parse($parent); if (!$scheme) { ! $scheme = $psch; ! if ($query && !$UsesQuery{$scheme}) # Restore mistaken queries { $path .= '?'. $query; $query = ''; } } ! else { last RELATED if ($scheme ne $psch); } ! last RELATED if ($addr || $port); # Child must have used '//' ! $addr = $paddr; ! $port = $pport; if (!$path) { ! $path = $ppath; ! if (!$query) { $query = $pquery; } } ! elsif ($NonHierarchical{$scheme}) {;} # Do nothing elsif ($path !~ m|^/|o) # If the child URL does not begin with '/' { ! if ($ppath) { ! if ($UsesParams{$scheme}) { ! 
$ppath =~ s#;.*##; # Trim off any parent parameters } ! $ppath =~ s#/[^/]*$#/#; # Trim off any parent filename } else { $ppath = '/'; } --- 236,296 ---- local($scheme, $addr, $port, $path, $query, $frag) = &parse($url); ! local($member) = $InSet{$scheme} || 0; + RELATED: { ! if (!$base) # If no base was given then it can't be relative { if (!$scheme) { $scheme = 'file' } # Default to a file URL last RELATED; } ! if ($base ne $Burl) # Check the Base URL cache ! { ! $Burl = $base; ! ($Bsch,$Baddr,$Bport,$Bpath,$Bquery,$Bfrag) = &parse($Burl); ! $Bmem = $InSet{$Bsch} || 0; ! } if (!$scheme) { ! $scheme = $Bsch; ! $member = $Bmem; ! if ($query && !($Bmem & $UsesQuery)) # Restore mistaken queries { $path .= '?'. $query; $query = ''; } } ! else { last RELATED if ($scheme ne $Bsch); } ! last RELATED unless ($member & $UsesRelative); ! last RELATED if ($addr || $port); # Child must have used '//' + $addr = $Baddr; # else it inherits base netloc + $port = $Bport; + if (!$path) { ! $path = $Bpath; ! if (!$query) { $query = $Bquery; } } ! elsif ($member & $NonHierarchical) {;} # Do nothing elsif ($path !~ m|^/|o) # If the child URL does not begin with '/' { ! local($ppath); ! if ($Bpath) { ! $ppath = $Bpath; ! if ($Bmem & $UsesParams) { ! $ppath =~ s#;.*##; # Trim off any base parameters } ! $ppath =~ s#/[^/]*$#/#; # Trim off any base filename } else { $ppath = '/'; } *************** *** 256,263 **** # while ($path =~ s#/\./#/#) {;} $path =~ s#/\.$#/#; ! while ($path =~ s#/[^/]+/\.\./#/#) {;} ! $path =~ s#/[^/]+/\.\.$#/#; } } --- 301,308 ---- # while ($path =~ s#/\./#/#) {;} $path =~ s#/\.$#/#; ! while ($path =~ s#/[^/]*/\.\./#/#) {;} ! $path =~ s#/[^/]*/\.\.$#/#; } } *************** *** 270,281 **** # file: as an alias for ftp: (i.e. when the IETF standard is done). # } ! elsif ($scheme eq 'http') { $path =~ s#^/\%7E#/~#; } # NOTE: Fanatical spec-followers should reverse the above substitution # because it improperly prefers the tilde character over %7E (:-b) ! 
if ($port && $port == $DefPort{$scheme}) { $port = ''; } return &compose($scheme, $addr, $port, $path, $query, $frag); } --- 315,328 ---- # file: as an alias for ftp: (i.e. when the IETF standard is done). # } ! elsif ($scheme eq 'http') { $path =~ s#^/\%7E#/~#io; } # NOTE: Fanatical spec-followers should reverse the above substitution # because it improperly prefers the tilde character over %7E (:-b) ! if ($port && ($port == $DefPort{$scheme})) { $port = ''; } ! ! if (!$path) { $path = '/'; } return &compose($scheme, $addr, $port, $path, $query, $frag); } *** ../libwww-perl-0.30/INSTALL.txt Mon Aug 1 06:40:22 1994 --- INSTALL.txt Tue Sep 20 18:23:54 1994 *************** *** 1,9 **** libwww-perl Installation Information ==================================== ! # $Id: INSTALL.txt,v 0.14 1994/08/01 13:40:13 fielding Exp $ See the files README.html and Artistic.txt for licensing and distribution info. ! See the file Changes.txt for a complete list of changes and version history. The latest version of libwww-perl can always be found at: --- 1,9 ---- libwww-perl Installation Information ==================================== ! # $Id: INSTALL.txt,v 0.15 1994/09/21 01:23:18 fielding Exp $ See the files README.html and Artistic.txt for licensing and distribution info. ! See the file LWP_Changes.pl for a complete list of changes and version history. The latest version of libwww-perl can always be found at: *************** *** 21,161 **** it will be in the form of a compressed unix tar file. If it has not already been decompressed by your WWW client, then do one of: ! % uncompress libwww-perl-0.30.tar.Z ! % gunzip libwww-perl-0.30.tar.gz ! depending on which compressed version you downloaded. ! 2. Move the resulting libwww-perl-0.30.tar file to the directory above where you want to install libwww-perl, cd to that directory, and do ! % tar xvf libwww-perl-0.30.tar ! to create the directory ./libwww-perl-0.30 containing the following: ! 
Artistic.txt -- the Artistic License governing redistribution of the libwww-perl package. ! Changes.txt -- the list of known problems and version information. ! INSTALL.txt -- this file ! README.html -- primary source of information about libwww-perl ! get -- a simple program for performing WWW GET requests from the command-line. The name of the program determines what request method to be used (i.e. create a link to it called "head" and you have a program that does HEAD requests). This program demonstrates the power and simplicity of the libwww-perl interface. ! mime.types -- the standard MIME content-types and default filename extensions in the same format as that used by NCSA httpd_1.3 and many WWW clients. ! testbot -- a simple program for testing the wwwbot.pl package. ! testdates -- a simple program for testing the wwwdates.pl package. ! testescapes -- a program for testing wwwurl'escape and unescape. ! testlinks -- a simple program for testing HTML link extraction and combinations of GET and HEAD requests. ! www.pl -- the primary entry point for WWW requests -- give it any absolute URL and a request method and it will try to perform the method using the URL's protocol scheme (or a proxy). ! wwwbot.pl -- a package for implementing the robot exclusion protocol. ! wwwdates.pl -- a package of library utilities for reading, manipulating, and writing dates as they are formatted by most World-Wide Web software and protocols. ! wwwerror.pl -- a package for defining and generating error messages for requests which did not make it outside the client program. ! wwwfile.pl -- a package for performing local file requests (URLs of the form file://localhost/*) and returning a response as if it came from an HTTP server. ! wwwhtml.pl -- a package of library utilities for reading and manipulating HTML documents. ! wwwhttp.pl -- a package for performing HTTP requests (URLs of the form http:*). ! 
wwwmailcap.pl -- a package of library utilities for handling MIME mailcap files and executing viewers by content-type. ! wwwmime.pl -- a package of library utilities for handling MIME content-types and message headers. ! wwwurl.pl -- a package of library utilities for parsing, composing, manipulating, and canonicalizing Uniform Resource Locators (URLs) as they are used by the World-Wide Web software and protocols. ! 3. You may need to change the following (with any text editor). ! The first line of each program (get and all test*) ! should point to your perl executable: ! #!/usr/local/bin/perl ! 4. The LIBWWW_PERL environment variable must be set to point to the libwww-perl directory, e.g. ! % setenv LIBWWW_PERL /usr/local/lib/libwww-perl-0.30 This allows clients like "get" to place the libwww-perl on their @INC path and also allows wwwmime.pl to find the standard mime.types file. ! 5. Make sure the programs are executable: - % chmod 755 get test* - % ln -s get HEAD ! HEAD is in uppercase only because there already exists a unix head command. ! 6. That's it. You should now be able to run get and HEAD, e.g. ! % get http://www.ics.uci.edu/ ! ========================================================================== ! Usage: ! See the "get" and test* programs for examples of how to interface with ! libwww-perl. More documentation will be available later this summer ! (Northern Hemisphere ;-) ========================================================================== Frequently Asked Questions 1. Why doesn't libwww-perl support FTP, Gopher, WAIS, ... ? ! Because you haven't written the interface yet ;-) ! Seriously, though, all you need to do to add a new protocol to the ! library is to copy an existing one (e.g. "cp wwwhttp.pl wwwftp.pl") ! and define the contents of the %AllowedMethods array and the scheme's ! request function (e.g. wwwftp'request()), and then add it to the list ! of required packages in www.pl. That's it -- determination of whether ! 
or not a protocol module exists is made dynamically by &www'request(). 2. How do I contribute my changes to the standard distribution? ! First, you should join the libwww-perl mailing list ! by sending a subscribe request, including your name and preferred e-mail ! address, to . You will be sent a welcome ! message when you are placed on the list. To see what the list looks like, ! see the Hypermail Archive of it at: ! ! After that, send a mail message describing your changes or suggestions ! to and we can all talk about them. ! If you have RCS (or CVS), you can use the included RCS repository ! to keep track of your changes and merge them with later distributions. ! You are also free to send changes to others by mail or news (or even disk), ! just as long as you don't claim they are part of the "standard distribution" ! of libwww-perl. ========================================================================== Have fun, ! ! ....Roy Fielding ICS Grad Student, University of California, Irvine USA ! (fielding@ics.uci.edu) ! About Roy --- 21,250 ---- it will be in the form of a compressed unix tar file. If it has not already been decompressed by your WWW client, then do one of: ! % uncompress libwww-perl-V.vv.tar.Z ! % gunzip libwww-perl-V.vv.tar.gz ! depending on which compressed version you downloaded. "V.vv" should ! be replaced with the library version number, e.g. "0.40". ! 2. Move the resulting libwww-perl-V.vv.tar file to the directory above where you want to install libwww-perl, cd to that directory, and do ! % tar xvf libwww-perl-V.vv.tar ! to create the directory ./libwww-perl-V.vv containing the following: ! Artistic.txt -- the Artistic License governing redistribution of the libwww-perl package. ! INSTALL.txt -- this file ! LWP_Changes.pl -- the list of known problems and version information. ! Makefile -- a Makefile for automating the initial configuration. ! RCS/ -- the complete RCS repository, including all versions. ! 
README.html -- primary source of information about libwww-perl ! get -- a simple program for performing WWW GET requests from the command-line. The name of the program determines what request method to be used (i.e. create a link to it called "head" and you have a program that does HEAD requests). This program demonstrates the power and simplicity of the libwww-perl interface. ! hostname.pl -- a library for determining the fully qualified domain ! name for the host running libwww-perl. ! mime.types -- the standard MIME content-types and default filename extensions in the same format as that used by NCSA httpd_1.3 and many WWW clients. ! sys_socket_ph.c - A simple C program for displaying your system's ! symbolic values normally found in sys/socket.ph. ! testbot -- a simple program for testing the wwwbot.pl package. ! testdates -- a simple program for testing the wwwdates.pl package. ! testescapes -- a program for testing wwwurl'escape and unescape. ! testlinks -- a simple program for testing HTML link extraction and combinations of GET and HEAD requests. ! www.pl -- the primary entry point for WWW requests -- give it any absolute URL and a request method and it will try to perform the method using the URL's protocol scheme (or a proxy). ! wwwbot.pl -- a package for implementing the robot exclusion protocol. ! wwwdates.pl -- a package of library utilities for reading, manipulating, and writing dates as they are formatted by most World-Wide Web software and protocols. ! wwwerror.pl -- a package for defining and generating error messages for requests which did not make it outside the client program. ! wwwfile.pl -- a package for performing local file requests (URLs of the form file://localhost/*) and returning a response as if it came from an HTTP server. ! wwwhtml.pl -- a package of library utilities for reading and manipulating HTML documents. ! wwwhttp.pl -- a package for performing HTTP requests (URLs of the form http:*). ! 
wwwmailcap.pl -- a package of library utilities for handling MIME mailcap files and executing viewers by content-type. ! wwwmime.pl -- a package of library utilities for handling MIME content-types and message headers. ! wwwurl.pl -- a package of library utilities for parsing, composing, manipulating, and canonicalizing Uniform Resource Locators (URLs) as they are used by the World-Wide Web software and protocols. ! 3. Edit the Makefile to match your system configuration. All you should ! need to change is the value of PERLBIN -- the full pathname of your ! perl interpreter. Then, perform the command ! % make ! If the full pathname of your perl interpreter is not "/usr/public/bin/perl", ! you should also perform the command: + % make config + ! 4. Set the LIBWWW_PERL environment variable to point to the libwww-perl directory, e.g. ! % setenv LIBWWW_PERL /usr/local/lib/libwww-perl-V.vv This allows clients like "get" to place the libwww-perl on their @INC path and also allows wwwmime.pl to find the standard mime.types file. ! 5. That's it. You should now be able to run get, HEAD and POST, as well ! as the other library test* programs. See the usage info and the ! FAQ list below for more information. ! ========================================================================== ! Usage: ! See the "get" and test* programs for examples of how to interface with ! libwww-perl. More documentation will be available later. ! The "get" program is a production-quality WWW client, useful for performing ! quick downloads from HTTP servers, translating FILE directories to HTML, ! and testing request/response headers on HTTP servers. + usage: get [-heqd] [-b BaseURL] [-t Timeout] [-i IMS_date] [-c ContentType] + [URL ...] ! GET/0.5 -- A program for sending GET requests for World-Wide Web URLs ! Options: [DEFAULT] ! -h Help -- just display this message and quit. ! -e Display the request and response headers to STDERR. [STDOUT] ! -q Don't display the request and response headers. ! 
-d Don't display the content (useful for debugging servers). ! -b Start with the given Base URL. ! [file://localhost/co/ub/fielding/public/www/lwp/libwww-perl/] ! -t Start with the given Timeout value (in seconds) [30] ! -i Add the If-Modified-Since header (an HTTP date) to GET requests. ! -c Use the given MIME Content-type for POST, PUT, and CHECKIN requests. ! [application/x-www-form-urlencoded] ! URL ... Perform the GET request on each URL listed. ! If no URLs are listed on the command-line, the program enters an ! interactive mode. The following commands are available interactively: + base=BaseURL -- changes the current Base URL to that given. + tout=NNNN -- sets the current Timeout value (in seconds). + ims=IMS_date -- sets the If-Modified-Since header value. + URL -- performs the request on the given URL. + + Here's a nice way to download information AND see the response headers: + + % get -e http://www.ics.uci.edu/WWWdocs/papers/rfc1630.txt > rfc1630.txt + + And, since the method used is equal to the program's name (uppercased), + you can use symbolic links to create other useful programs, e.g. + + % echo "tick=sunw" | POST http://www.secapl.com/cgi-bin/qs + + Give it a try. I have only tested the GET, HEAD, and POST methods, but all the + others are supported as well (though they may not be supported by any server). + ========================================================================== Frequently Asked Questions 1. Why doesn't libwww-perl support FTP, Gopher, WAIS, ... ? ! Because you haven't written the interface yet ;-) ! Seriously, though, all you need to do to add a new protocol to the ! library is to copy an existing one (e.g. "cp wwwhttp.pl wwwftp.pl") ! and define the contents of the %AllowedMethods array and the scheme's ! request function (e.g. wwwftp'request()), and then include a "require" ! statement in the main program that uses it. That's it -- determination ! of whether or not a protocol module exists is made dynamically by ! 
&www'request(). 2. How do I contribute my changes to the standard distribution? ! First, you should join the mailing list ! by sending a subscribe request, including your name and preferred e-mail ! address, to . You will be sent a welcome ! message when you are placed on the list. To see what the list looks like, ! see the Hypermail Archive of it at: ! ! After that, send a mail message describing your changes or suggestions ! to and we can all talk about them. ! If you have RCS (or CVS), you can use the included RCS repository ! to keep track of your changes and merge them with later distributions. ! You are also free to send changes to others by mail or news (or even disk), ! just as long as you don't claim they are part of the "standard distribution" ! of libwww-perl. + + 3. Help, I have encountered a bug and I don't know what to do... + + First, look at the hypertext archive (the URL above) to see if a similar + problem has already been discussed on the mailing list. If not, send a + message to the mailing list which describes the + problem and symptoms, etc. Above all, be sure to mention what platform + you are running on, since most of the problems discovered so far have + been platform-specific. Finally, if you solve a problem, be sure to send + the solution to the mailing list as well. + + + 4. Undefined subroutine "main'_BSD" called at /usr/local/lib/perl/sys/socket.ph + + Arrgh! + + This has been the big problem so far with SVR4 and mach-based system + installs. What you need to do is create a sys/socket.ph file for your + perl standard library which is valid for your system. Normally, + you can just run the "h2ph" command (part of the perl distribution) to set + up the files, but some SVR4 and mach-based systems use extra symbols which + can't be found by h2ph. So, you need to do one (or more) of the following: + + A. Comment out the lines in sys/socket.ph that generate errors (they are + rarely needed in any case). + + B. 
Add your own definitions to the sys/socket.ph, e.g. + + eval 'sub BSD { 0; }'; + + You may have to guess the correct value, or do a grep on + /usr/include/sys/*.h to find the exact definition. + + C. Create your own canned socket.ph file via the included C program + sys_socket_ph.c -- compile and run it using the commands: + + % make socket + % sys_socket_ph > my_socket.ph + + and then edit "wwwhttp.pl" (in the libwww-perl stuff) to replace + the require "sys/socket.ph" with require "my_socket.ph"; + + Depending on the vagaries of your system, at least one of the above + fixes should work. + ========================================================================== Have fun, ! ......Roy Fielding ICS Grad Student, University of California, Irvine USA ! ! *** ../libwww-perl-0.30/Makefile Tue Sep 20 19:09:38 1994 --- Makefile Mon Sep 19 05:06:49 1994 *************** *** 0 **** --- 1,45 ---- + # $Id: Makefile,v 1.1 1994/09/19 12:06:37 fielding Exp $ + # + # This Makefile is used to configure the perl scripts so that + # they all use the correct pathname for the perl interpreter. + # It also makes the programs executable and creates links from "get" + # to the other commonly-used methods. Use the following commands: + # + # % make + # % make config + # + # You should only need to change the following line to the full pathname + # of your perl interpreter (if it happens to be "/usr/public/bin/perl", + # you do not need to do a "make config"). + + PERLBIN = /usr/bin/perl + + # The rest should be automatic + + OLDPERL = /usr/public/bin/perl + CC = cc + CLIENTS = get testbot testdates testescapes testlinks + + all: + chmod 755 $(CLIENTS) + ln -s get HEAD + ln -s get POST + + config: + $(PERLBIN) -pi.orig -e 's#$(OLDPERL)#$(PERLBIN)#o' $(CLIENTS) + + # + # Now this part is only used if you are having problems with sockets + # on non-BSD systems. It just compiles the test program. 
+ # + + socket: sys_socket_ph.c + $(CC) -o sys_socket_ph sys_socket_ph.c + + # + # Use only for cleaning up after a bad config + # + + clean: + rm -f HEAD POST + $(PERLBIN) -pi.orig -e 's#$(PERLBIN)#$(OLDPERL)#o' $(CLIENTS) *** ../libwww-perl-0.30/get Wed Jul 20 11:10:47 1994 --- get Tue Sep 20 18:24:55 1994 *************** *** 1,11 **** #!/usr/public/bin/perl ! # $Id: get,v 0.14 1994/07/20 18:10:37 fielding Exp $ ! #----------------------------------------------------------------- # Perform a WWW request on a (set of) absolute or relative URL(s). # The URL(s) may be on the command line or passed via a pipe. # The method used is equal to the uppercased name of this program, ! # so the intention is to name it "get" and create a symbolic link ! # called "head" which points to "get" (two programs for the price of one). # # The program starts with the BASE URL equal to the current file directory. # To change it, enter a URL prefixed with "base=", e.g, --- 1,12 ---- #!/usr/public/bin/perl ! # $Id: get,v 0.15 1994/09/21 01:23:18 fielding Exp $ ! # ========================================================================== # Perform a WWW request on a (set of) absolute or relative URL(s). # The URL(s) may be on the command line or passed via a pipe. # The method used is equal to the uppercased name of this program, ! # so the intention is to name it "get" and create a symbolic links ! # called "HEAD" and "POST" which point to "get" (three programs for ! # the price of one). # # The program starts with the BASE URL equal to the current file directory. # To change it, enter a URL prefixed with "base=", e.g, *************** *** 18,60 **** # 06 Jul 1994 (RTF): Added extra fallback code from Martijn Koster # 20 Jul 1994 (RTF): The default From header is now set by www.pl # and &www'set_def_header() is called to set User-Agent # # Created by Roy Fielding to test the libwww-perl system ! 
#----------------------------------------------------------------- if ($libloc = $ENV{'LIBWWW_PERL'}) { unshift(@INC, $libloc); } require "www.pl"; require "wwwurl.pl"; require "wwwerror.pl"; ! $method = $0; # Method = program name ! $method =~ s#^.*/([^/]+)$#$1#; # lose the path ! $method =~ tr/a-z/A-Z/; # uppercase it ! &www'set_def_header('http', 'User-Agent', "$method/0.3"); ! # Set up User-Agent: header $pwd = ( $ENV{'PWD'} || $ENV{'cwd'} || '' ); ! $base = "file://localhost$pwd/"; # Set up initial Base URL ! #----------------------------------------------------------------- ! if ($#ARGV == 0) { # Quickie, one-line version ! $url = &wwwurl'absolute($base, $ARGV[0]); ! &do_req($method, $url); } else { # Interactive version ! print "Enter a URL (^D to exit): "; ! while (<>) { chop; ! if (/^base=(.*)$/) { $base = $1; next; } ! $url = &wwwurl'absolute($base, $_); &do_req($method, $url); } continue { ! print "===========================================================\n"; print "Enter a URL (^D to exit): "; } print "\n"; --- 19,132 ---- # 06 Jul 1994 (RTF): Added extra fallback code from Martijn Koster # 20 Jul 1994 (RTF): The default From header is now set by www.pl # and &www'set_def_header() is called to set User-Agent + # 07 Sep 1994 (RTF): Added code to show original headers if they were received; + # Added tout to interactively change the timeout value; + # Added ims to interactively give If-Modified-Since; + # Added handling of POST content suggested by Mel Melchner. + # 18 Sep 1994 (RTF): Added command-line options, debug and quiet modes. # # Created by Roy Fielding to test the libwww-perl system ! # ========================================================================== if ($libloc = $ENV{'LIBWWW_PERL'}) { unshift(@INC, $libloc); } + require "getopts.pl"; require "www.pl"; require "wwwurl.pl"; require "wwwerror.pl"; ! $pname = $0; ! $method = $pname; # Method = program name ! $method =~ s#^.*/([^/]+)$#$1#; # lose the path ! 
$method =~ tr/a-z/A-Z/; # uppercase it ! $Version = "$method/0.5"; ! # Set up User-Agent: header ! &www'set_def_header('http', 'User-Agent', $Version); $pwd = ( $ENV{'PWD'} || $ENV{'cwd'} || '' ); ! $Base = "file://localhost$pwd/"; # Set up initial Base URL ! $Tout = 30; # Time-out in seconds ! $Ims = ''; # If-Modified-Since header ! $Contype = 'application/x-www-form-urlencoded'; # Content-type for POST ! $Debug = 0; # Ask before display? ! $Quiet = 0; # No headers if Quiet ! $Out = STDOUT; ! # ========================================================================== ! # ========================================================================== ! # Print the usage information if help requested (-h) or a bad option given. ! # ! sub usage ! { ! die <<"EndUsage"; ! usage: $pname [-heq] [-b BaseURL] [-t Timeout] [-i IMS_date] [-c ContentType] ! [URL ...] ! $Version -- A program for sending $method requests for World-Wide Web URLs ! Options: [DEFAULT] ! -h Help -- just display this message and quit. ! -e Display the request and response headers to STDERR. [STDOUT] ! -q Don't display the request and response headers. ! -d Don't display the content (useful for debugging servers). ! -b Start with the given Base URL. ! [$Base] ! -t Start with the given Timeout value (in seconds) [$Tout] ! -i Add the If-Modified-Since header (an HTTP date) to GET requests. ! -c Use the given MIME Content-type for POST, PUT, and CHECKIN requests. ! [$Contype] ! URL ... Perform the $method request on each URL listed. ! ! If no URLs are listed on the command-line, the program enters an ! interactive mode. The following commands are available interactively: ! ! base=BaseURL -- changes the current Base URL to that given. ! tout=NNNN -- sets the current Timeout value (in seconds). ! ims=IMS_date -- sets the If-Modified-Since header value. ! URL -- performs the request on the given URL. ! ! 
EndUsage } + + + # ========================================================================== + # Get the command-line options + + if (!(&Getopts('heqdb:i:t:c:')) || $opt_h) { &usage; } + + if ($opt_e) { $Out = STDERR; } + if ($opt_q) { $Quiet = 1; } + if ($opt_d) { $Debug = 1; } + if ($opt_b) { $Base = $opt_b; } + if ($opt_i) { $Ims = $opt_i; } + if ($opt_c) { $Contype = $opt_c; } + if ($opt_t) { $Tout = $opt_t if ($opt_t =~ /\d+/); } + + # ========================================================================== + # Do the work + + if ($#ARGV >= 0) { # Quickie, one-line version + $Interactive = 0; + foreach $arg (@ARGV) + { + $url = &wwwurl'absolute($Base, $arg); + &do_req($method, $url); + } + } else { # Interactive version ! $Interactive = 1; ! print "Enter a command or URL (^D to exit): "; ! while (<STDIN>) { chop; ! if (/^base=(.*)$/) { $Base = $1; next; } ! if (/^tout=(\d+)$/) { $Tout = $1; next; } ! if (/^ims=(.*)$/) { $Ims = $1; next; } ! $url = &wwwurl'absolute($Base, $_); &do_req($method, $url); } continue { ! print "\n==========================================================\n"; print "Enter a URL (^D to exit): "; } print "\n"; *************** *** 68,97 **** local($hd, $response); local(%headers) = (); local($content) = ''; ! print "$method $url HTTP/1.0\n"; # Show user what it looks like ! # and then do the request ! $response = &www'request($method, $url, *headers, *content, 30); ! ! foreach $hd (keys(%headers)) # This is cheating, but it shows ! { # the default headers generated ! next if ($hd =~ m#^[a-z]#); # by the www.pl request library. ! print "$hd: $headers{$hd}\n"; } ! print "\n"; ! # And print out the result ! ! print "HTTP/1.0 $response $wwwerror'RespMessage{$response}\n"; ! foreach $hd (keys(%headers)) { ! next if ($hd =~ m#^[A-Z]#); ! print "$hd: $headers{$hd}\n"; } - print "\n"; ! print $content; ! print "\n"; } 1; --- 140,212 ---- local($hd, $response); local(%headers) = (); + local($headers) = ''; local($content) = ''; ! 
if ($method eq 'GET') ! { ! if ($Ims) { $headers{'If-Modified-Since'} = $Ims; } } ! elsif (($method eq 'POST') || ($method eq 'PUT') || ($method eq 'CHECKIN')) { ! if ($Interactive) ! { ! print "Enter content-type [$Contype]: "; ! $_ = <STDIN>; ! chop; ! if (/^\S/) { $Contype = $_; } ! print 'Enter content ("." to end): ', "\n"; ! } ! while (<STDIN>) ! { ! last if (/^\.$/); ! chop; ! $content .= $_; ! } ! $headers{'Content-type'} = $Contype; ! $headers{'Content-length'} = length($content); } ! print($Out "$method $url HTTP/1.0\n") # Show user what it looks like ! unless $Quiet; # and then do the request ! ! $response = &www'request($method, $url, *headers, *content, $Tout); ! ! if (!$Quiet) ! { ! foreach $hd (keys(%headers)) # This is cheating, but it shows ! { # the default headers generated ! next if ($hd =~ m#^[a-z]#); # by the www.pl request library. ! print($Out "$hd: $headers{$hd}\n"); ! } ! print($Out "\n"); ! # And print out the result ! if ($headers) ! { ! print($Out $headers); ! } ! else ! { ! print($Out "HTTP/1.0 $response $wwwerror'RespMessage{$response}\n"); ! foreach $hd (keys(%headers)) ! { ! next if ($hd =~ m#^[A-Z]#); ! print($Out "$hd: $headers{$hd}\n"); ! } ! } ! print($Out "\n"); ! } ! if ($Debug) ! { ! if ($Interactive) ! { ! print 'Do you want the content displayed (y/n)? [n] '; ! $_ = <STDIN>; ! chop; ! if (/^y/i) { print $content if defined($content); } ! } ! } ! 
else { print $content if defined($content); } } 1; *** ../libwww-perl-0.30/sys_socket_ph.c Tue Sep 20 19:09:38 1994 --- sys_socket_ph.c Tue Sep 20 18:25:43 1994 *************** *** 0 **** --- 1,17 ---- + /* $Id: sys_socket_ph.c,v 1.1 1994/09/21 01:23:18 fielding Exp $ */ + /* This simple program just prints out a sample sys/socket.ph file -- */ + /* useful for those people having problems with perl sockets on SVR4 */ + + #include + #include + #include + + int main() { + + printf("sub AF_INET { %d; }\n", AF_INET); + printf("sub PF_INET { %d; }\n", PF_INET); + printf("sub SOCK_DGRAM { %d; }\n", SOCK_DGRAM); + printf("sub SOCK_STREAM { %d; }\n", SOCK_STREAM); + printf("%d;\n", 1); + exit(0); + } *** ../libwww-perl-0.30/mime.types Fri Jul 8 01:14:43 1994 --- mime.types Thu Sep 1 03:57:15 1994 *************** *** 1,5 **** ! # $Id: mime.types,v 0.12 1994/07/08 08:08:14 fielding Exp $ application/octet-stream bin application/oda oda application/pdf pdf --- 1,6 ---- ! # $Id: mime.types,v 0.13 1994/09/01 10:56:42 fielding Exp $ + application/mac-binhex40 hqx application/octet-stream bin application/oda oda application/pdf pdf *** ../libwww-perl-0.30/testlinks Wed Jul 20 12:15:52 1994 --- testlinks Tue Sep 20 18:26:03 1994 *************** *** 1,5 **** #!/usr/public/bin/perl ! # $Id: testlinks,v 1.1 1994/07/20 19:14:56 fielding Exp $ # --------------------------------------------------------------------------- # GET and extract the links from the URLs passed as arguments, test them # using HEAD requests, and output an HTML index fragment describing the --- 1,5 ---- #!/usr/public/bin/perl ! 
# $Id: testlinks,v 1.2 1994/09/21 01:23:18 fielding Exp $ # --------------------------------------------------------------------------- # GET and extract the links from the URLs passed as arguments, test them # using HEAD requests, and output an HTML index fragment describing the *************** *** 14,19 **** --- 14,20 ---- # 20 Jul 1994 (RTF): The default From header is now set by www.pl # and &www'set_def_header() is called to set User-Agent. # Added to libwww-perl distribution. + # 20 Sep 1994 (RTF): Added initialization of $headers # # Created by Roy Fielding to test MOMspider and the libwww-perl system #----------------------------------------------------------------- *************** *** 43,48 **** --- 44,50 ---- $url = &wwwurl'absolute($base, $rel); $content = ''; + $headers = ''; %headers = (); $response = &www'request('GET', $url, *headers, *content, 30); *************** *** 57,63 **** # Now print out the index entry for this URL ! $nextbit = ($headers{title} || $url); print "

$nextbit

\n"; $vidx++; print "$response $wwwerror'RespMessage{$response}\n", --- 59,65 ---- # Now print out the index entry for this URL ! $nextbit = ($headers{'title'} || $url); print "

$nextbit

\n"; $vidx++; print "$response $wwwerror'RespMessage{$response}\n", *************** *** 85,90 **** --- 87,93 ---- print "\n"; undef $content; + undef $headers; undef %headers; if ($TestLinks[0]) *************** *** 120,125 **** --- 123,129 ---- local($response, $nextbit) = 0; local($content) = ''; + local($headers) = ''; local(%headers) = (); if ($parent) { $headers{'Referer'} = $parent; } *** ../libwww-perl-0.30/README.html Mon Aug 1 06:40:51 1994 --- README.html Tue Sep 20 18:24:16 1994 *************** *** 1,5 **** ! libwww-perl: Distribution Information --- 1,5 ---- ! libwww-perl: Distribution Information *************** *** 42,47 **** --- 42,52 ---- A Hypermail Archive of the mailing list is also available.

+ A contrib + directory has been established for perl source that is not (yet) + part of the libwww-perl package, but which may be useful to current + implementors.

+ Support for the initial development and distribution of libwww-perl has been provided by the Arcadia Project at UCI, part of the larger *************** *** 92,97 **** --- 97,106 ---- The following developers have contributed (either directly or indirectly) to the libwww-perl distribution:

+
Alberto Accomazzi, Harvard-Smithsonian Center for Astrophysics, USA +
Suggestions for hostname.pl +
James Casey, CERN, Switzerland +
Routines for processing HTML anchors
Brooks Cutter, STUFF.com, USA
Contributed wwwbot.pl, testbot, code for escaping and unescaping URLs, *************** *** 101,120 ****
Architect and primary developer of the library.
Martijn Koster, NEXOR Ltd., UK !
Contributed many bug fixes and part of Oscar's http. !
Oscar ! Nierstrasz, Universitaet Bern, Switzerland
Oscar's collection of useful perl scripts formed the basis on which the wwwhttp.pl and wwwhtml.pl packages were built.
Gertjan van Oosten, West Consulting bv, NL !
Contributed code for parsing WWW date formats (used in wwwdates.pl)
Gene Spafford, Purdue University, USA !
Contributed the MailStuff package for parsing rfc822 headers.
Others
These people contributed to prior packages which influenced the development of libwww-perl: Steven E. Brenner (cgi-lib), ! Marion Hakanson (ctime), Marc van Heyningen (http), ! Waldemar Kebsch (ctime), and Larry Wall (Perl).

--- 110,139 ----
Architect and primary developer of the library.
Martijn Koster, NEXOR Ltd., UK !
Many bug fixes and part of Oscar's http. !
Mel Melchner, AT&T Research, USA !
Suggested changes to get to support the POST method. !
Oscar ! Nierstrasz, University of Berne, Switzerland
Oscar's collection of useful perl scripts formed the basis on which the wwwhttp.pl and wwwhtml.pl packages were built.
Gertjan van Oosten, West Consulting bv, NL !
Code for parsing WWW date formats (used in wwwdates.pl) !
Jared Rhine, ! Harvey Mudd College, USA !
Makefile/config suggestions. !
Jack Shirazi, BIU, UK !
Many good suggestions regarding alarms and sockets.
Gene Spafford, Purdue University, USA !
MailStuff package for parsing rfc822 headers. !
! Marc VanHeyningen, Indiana University, USA !
HTML entity stuff and part of Oscar's http.
Others
These people contributed to prior packages which influenced the development of libwww-perl: Steven E. Brenner (cgi-lib), ! Marion Hakanson (ctime), ! Waldemar Kebsch (ctime), Tony Sanders (Plexus), and Larry Wall (Perl).
*************** *** 121,129 ****

The Distribution

For easy distribution, libwww-perl is available as a ! gzip'd tar file or as a ! compress'd tar file. It is also available via anonymous ftp from liege.ics.uci.edu in the directory --- 140,148 ----

The Distribution

For easy distribution, libwww-perl is available as a ! gzip'd tar file or as a ! compress'd tar file. It is also available via anonymous ftp from liege.ics.uci.edu in the directory *************** *** 134,160 ****
Artistic.txt
the Artistic License. -
Changes.txt -
the complete list of changes and version information. -
INSTALL.txt
Installation instructions and usage information.
README.html
this document.
get !
a simple program for performing WWW ! GET requests from the command-line. The name of the program determines what ! request method to be used (i.e. create a link to it called "HEAD" and ! you have a program that does HEAD requests). This program demonstrates the ! power and simplicity of the libwww-perl interface.
mime.types
the standard MIME content-types ! and default filename extensions in the same format as that used by ! NCSA httpd_1.3 and many WWW clients.
testbot
a simple program for testing the wwwbot.pl package. --- 153,190 ----
Artistic.txt
the Artistic License.
INSTALL.txt
Installation instructions and usage information. +
LWP_Changes.pl +
the complete list of changes and version information. + +
Makefile +
a Makefile for automating the initial configuration. +
README.html
this document.
get !
a simple program for performing WWW GET requests from the ! command-line. The name of the program determines what request method ! to be used (i.e. create a link to it called "HEAD" and you have a ! program that does HEAD requests). This program demonstrates the power ! and simplicity of the libwww-perl interface. +
hostname.pl +
a library for determining the fully qualified domain + name for the host running libwww-perl. +
mime.types
the standard MIME content-types ! and default filename extensions in the same format as that used by ! NCSA httpd_1.3 and many WWW clients. +
sys_socket_ph.c +
A simple C program for displaying your system's + symbolic values normally found in sys/socket.ph. +
testbot
a simple program for testing the wwwbot.pl package. *************** *** 170,204 ****
www.pl
the primary entry point for WWW requests -- give it any absolute ! URL and a request method and it will try to perform the method using ! the URL's protocol scheme (or a proxy).
wwwbot.pl
a package for implementing the ! ! robot exclusion protocol.
wwwdates.pl
a package of library utilities for reading, manipulating, and ! writing dates as they are formatted by most World-Wide Web software ! and protocols.
wwwerror.pl
a package for defining and generating error messages for requests ! which did not make it outside the client program.
wwwfile.pl
a package for performing local file requests (URLs of the form ! file://localhost/*) and returning a response as if it ! came from an HTTP server.
wwwhtml.pl
a package of library utilities for reading and manipulating HTML ! documents.
wwwhttp.pl
a package for performing HTTP requests (URLs of the form ! http:*).
wwwmailcap.pl
a package of library utilities for handling MIME mailcap files --- 200,234 ----
www.pl
the primary entry point for WWW requests -- give it any absolute ! URL and a request method and it will try to perform the method using ! the URL's protocol scheme (or a proxy).
wwwbot.pl
a package for implementing the ! ! robot exclusion protocol.
wwwdates.pl
a package of library utilities for reading, manipulating, and ! writing dates as they are formatted by most World-Wide Web software ! and protocols.
wwwerror.pl
a package for defining and generating error messages for requests ! which did not make it outside the client program.
wwwfile.pl
a package for performing local file requests (URLs of the form ! file://localhost/*) and returning a response as if it ! came from an HTTP server.
wwwhtml.pl
a package of library utilities for reading and manipulating HTML ! documents.
wwwhttp.pl
a package for performing HTTP requests (URLs of the form ! http:*).
wwwmailcap.pl
a package of library utilities for handling MIME mailcap files *************** *** 206,217 ****
wwwmime.pl
a package of library utilities for handling MIME content-types ! and message headers.
wwwurl.pl
a package of library utilities for parsing, composing, ! manipulating, and canonicalizing Uniform Resource Locators (URLs) as ! they are used by the World-Wide Web software and protocols. --- 236,247 ----
wwwmime.pl
a package of library utilities for handling MIME content-types ! and message headers.
wwwurl.pl
a package of library utilities for parsing, composing, ! manipulating, and canonicalizing Uniform Resource Locators (URLs) as ! they are used by the World-Wide Web software and protocols. *************** *** 218,225 ****

Version History

! Current version is 0.30.