libwww-perl
WWW Protocol Library for Perl
libwww-perl is a library of Perl packages/modules which
provides a simple and consistent programming interface to the
World Wide Web. This library is being developed as a collaborative
effort to assist the further development of useful WWW clients and
tools.
Two versions of libwww-perl exist, representing the evolution of Perl
as a language:
- libwww-perl4
- The first generation libwww-perl, based on version 4.036 of the
Perl Programming Language.
This library was written by Roy Fielding in 1994 as the backend for
MOMspider, with contributions by many individuals around the globe.
- libwww-perl5
- The second generation libwww-perl, based on version 5.004 of the
Perl Programming Language.
Since Roy didn't have any free time, Gisle Aas and Martijn Koster
led the Perl5 effort and creation of an object-oriented architecture
for the new library. All new projects should use the Perl5 version,
available at any CPAN archive site.
libwww-perl is freely available under the terms of
the Artistic License, which accompanies the
standard distribution. The libwww-perl software
architecture and standard distribution package are copyrighted by the
University of California for the sole purpose of retaining consistency
and coherence in the distribution of the library. Contributions to the
library are strongly encouraged and will be included in the standard
distribution with full citation to the developers.
See below for the list of past and current contributors.
A mailing list has been
established for technical discussion about libwww-perl,
including problem reports, interim fixes, suggestions for features,
and contributions. The mailing list address is
libwww-perl@ics.uci.edu
and administrivia (including subscribe requests) should be sent to
libwww-perl-request@ics.uci.edu
A Hypermail Archive of the mailing list
is also available.
A contrib directory has been established for
Perl source that is not (yet) part of the libwww-perl package, but
which may be useful to current implementors.
Support for the initial development and distribution of libwww-perl
has been provided by the
Arcadia Project at UCI, part of the larger
Arcadia Consortium for
research in software engineering environments.
WWW Requests Currently Supported
- Proxies
- Full support is provided for redirecting WWW requests
by protocol scheme to a proxy server via the HTTP protocol.
- HTTP/1.0
- All requests and responses for the
Hypertext Transfer Protocol are supported (November 1993 draft).
- FILE
- Support for GET and HEAD requests on file://localhost URLs
is provided, with results translated to HTTP responses as if they were
handled by an HTTP gateway.
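The FILE behaviour described above can be illustrated with a minimal sketch in plain Perl 5. This is not the wwwfile.pl implementation: the function name file_request, the choice of status codes, and the single Content-Length header are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of libwww-perl: answer a GET or HEAD request
# on a file://localhost URL with an HTTP-style (status, headers, body) list.
sub file_request {
    my ($method, $url) = @_;

    # Only the two methods named in the text are handled (assumed status 501).
    return (501, {}, '') unless $method eq 'GET' || $method eq 'HEAD';

    # Accept file://localhost/path and the empty-host form file:///path.
    my ($path) = $url =~ m{^file://(?:localhost)?(/.*)$}i
        or return (400, {}, '');

    # Undo %XX escapes in the path.
    $path =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;

    return (404, {}, '') unless -f $path;

    # -s yields the size in bytes, or a false value for an empty file.
    my %headers = ('Content-Length' => (-s $path) || 0);

    # A HEAD response carries the same headers as GET but no body.
    return (200, \%headers, '') if $method eq 'HEAD';

    open(my $fh, '<', $path) or return (403, {}, '');
    binmode($fh);
    my $body = do { local $/; <$fh> };   # slurp the whole file
    close($fh);
    $body = '' unless defined $body;

    return (200, \%headers, $body);
}

# Example call; the path is only an illustration.
my ($status, $headers, $body) = file_request('HEAD', 'file://localhost/etc/hosts');
print "Status: $status\n";
```

Returning the same three-part result for local files and remote documents lets calling code treat both alike, which is the point of translating file results into HTTP responses.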
Developing code to handle all the request formats and protocols present
on the World Wide Web is too big a task for any one person or organization.
For that reason, the future of this library is dependent on the contributions
of those who make use of it. Please send in your extensions so that we can
all benefit from the effort required to make distributed information systems
work.
The following developers have contributed (either directly or indirectly)
to the libwww-perl distribution:
Primary developers and project support
- Gisle Aas, Schibsted Nett AS, Norway
- Co-architect and primary developer of the Perl5 library.
- Roy Fielding, University of California, Irvine, USA
- Architect and primary developer of the Perl4 library and originator of
the libwww-perl collaborative project.
- Martijn Koster, WebCrawler, America On-Line (AOL)
- Co-architect of the Perl5 library and contributor of many bug fixes to
the Perl4 version. Martijn is the guy who, during a break at WWW94 in
Geneva, convinced Roy to make the libwww-perl4 library available
separately from MOMspider. Martijn also maintains the
World Wide Web Robots, Wanderers, and Spiders pages.
libwww-perl4 contributors
- Alberto Accomazzi, Harvard-Smithsonian Center for Astrophysics, USA
- Suggestions for hostname.pl
- James Casey, CERN, Switzerland
- Routines for processing HTML anchors
- Brooks Cutter, STUFF.com, USA
- Contributed wwwbot.pl, testbot, code for escaping and unescaping URLs,
www'stat(), wwwmailcap.pl, and many suggestions and bug fixes.
- Mel Melchner, AT&T Research, USA
- Suggested changes to the get program so that it supports the POST method.
- Oscar Nierstrasz, University of Berne, Switzerland
- Oscar's collection of useful perl scripts formed the basis on
which the wwwhttp.pl and wwwhtml.pl packages were built.
- Gertjan van Oosten, West Consulting bv, NL
- Code for parsing WWW date formats (used in wwwdates.pl)
- Jared Rhine, Harvey Mudd College, USA
- Makefile/config suggestions.
- Jack Shirazi, BIU, UK
- Many good suggestions regarding alarms and sockets.
- Gene Spafford, Purdue University, USA
- MailStuff package for parsing rfc822 headers.
- Marc VanHeyningen, Indiana University, USA
- HTML entity stuff and part of Oscar's http.
- Others
- These people contributed to prior packages which influenced the
development of libwww-perl: Steven E. Brenner (cgi-lib),
Marion Hakanson (ctime), Waldemar Kebsch (ctime),
Tony Sanders (Plexus), and Larry Wall (Perl).
The libwww-perl4 Distribution
The libwww-perl4 software package is available as a
gzip'd tar file via both HTTP and FTP.
The libwww-perl4 distribution consists of the following files:
- Artistic.txt
- the Artistic License.
- INSTALL.txt
- Installation instructions and usage information.
- LWP_Changes.pl
- the complete list of changes and version information.
- Makefile
- a Makefile for automating the initial configuration.
- README.html
- this document.
- get
- a simple program for performing WWW GET requests from the
command-line. The name of the program determines which request method
is used (e.g., create a link to it called "HEAD" and you have a
program that does HEAD requests). This program demonstrates the power
and simplicity of the libwww-perl interface.
- hostname.pl
- a library for determining the fully qualified domain
name for the host running libwww-perl.
- mime.types
- the standard MIME content-types
and default filename extensions in the same format as that used by
NCSA httpd_1.3 and many WWW clients.
- sys_socket_ph.c
- A simple C program for displaying your system's
symbolic values normally found in sys/socket.ph.
- testbot
- a simple program for testing the wwwbot.pl package.
- testdates
- a simple program for testing the wwwdates.pl package.
- testescapes
- a simple program for testing the wwwurl'escape and unescape routines.
- testlinks
- a simple program for testing HTML link extraction and
combinations of GET and HEAD requests.
- www.pl
- the primary entry point for WWW requests -- give it any absolute
URL and a request method and it will try to perform the method using
the URL's protocol scheme (or a proxy).
- wwwbot.pl
- a package for implementing the
robot exclusion protocol.
- wwwdates.pl
- a package of library utilities for reading, manipulating, and
writing dates as they are formatted by most World Wide Web software
and protocols.
- wwwerror.pl
- a package for defining and generating error messages for requests
which did not make it outside the client program.
- wwwfile.pl
- a package for performing local file requests (URLs of the form
file://localhost/*) and returning a response as if it
came from an HTTP server.
- wwwhtml.pl
- a package of library utilities for reading and manipulating HTML
documents.
- wwwhttp.pl
- a package for performing HTTP requests (URLs of the form
http:*).
- wwwmailcap.pl
- a package of library utilities for handling MIME mailcap files
and executing viewers by content-type.
- wwwmime.pl
- a package of library utilities for handling MIME content-types
and message headers.
- wwwurl.pl
- a package of library utilities for parsing, composing,
manipulating, and canonicalizing Uniform Resource Locators (URLs) as
they are used by the World Wide Web software and protocols.
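The naming trick described for the get program above can be sketched in a few lines of Perl 5. This is an illustration rather than code from the distribution: the helper name method_from_name, the set of recognized method names, and the fallback to GET are assumptions.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper, not taken from the distribution: choose the request
# method from the name the program was invoked under.
sub method_from_name {
    my ($program) = @_;

    # Strip any directory part, so /usr/local/bin/HEAD becomes HEAD.
    my $name = uc(basename($program));

    # Assumed set of method names; anything else falls back to GET.
    my %known = map { $_ => 1 } qw(GET HEAD POST);
    return $known{$name} ? $name : 'GET';
}

# $0 holds the name the running script was started with.
my $method = method_from_name($0);
print "Request method: $method\n";
```

If a script built on this helper is linked under another name (for example with `ln -s get HEAD`), starting it through the link selects the HEAD method without any command-line option.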
The current Perl5 version is maintained by Gisle Aas.
All future work should be done in Perl5.
The current Perl4 version is 0.40. It is only supported for use by older
projects.
If you have any suggestions, bug reports, fixes, or enhancements, send
them to the libwww-perl mailing list as described above.
Please see the file Artistic.txt for complete
licensing and redistribution information.
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY
FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES
ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION
(INCLUDING, BUT NOT LIMITED TO, LOST PROFITS) EVEN IF THE UNIVERSITY
OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The libwww-perl4 work was sponsored in part by the Defense Advanced
Research Projects Agency under Grant Number MDA972-91-J-1010.
This software does not
necessarily reflect the position or policy of the U.S. Government and no
official endorsement should be inferred. Their support is appreciated.
Roy Fielding
Department of Information and Computer Science,
University of California, Irvine, CA 92697-3425
Last modified: 13 May 1998