Roy T. Fielding,
Maintaining Distributed Hypertext Infostructures:
Welcome to MOMspider's Web.
The World-Wide Web (WWW) can be described in many ways. From an organizational perspective, it is an initiative aiming to give universal access to a world of documents [Hughes94]. Technically, it is a distributed object management system designed for information retrieval via textual relations. From a practical viewpoint, however, the WWW is a synergistic combination of three technologies: a means for providing potentially useful information such that it can be accessed by distributed (and sometimes distant) users; a means for users to access information stored at distributed sites without requiring knowledge of the underlying access mechanism; and a means for structuring information such that it can be discovered, retrieved and viewed by those who would find it useful. These three technologies are enabled by WWW server and client programs and a set of proposed standards which allow them to communicate, to identify information objects, and to structure and view the information such that it can be traversed via hypertext links.
The Hypertext Transfer Protocol [HTTP] is used by distributed servers to communicate with each other and with a variety of client applications. Many of these clients are also capable of communicating with other information services, such as FTP, Gopher, and WAIS. A client can send a command to any server reachable over a TCP/IP connection. Most often the command is a request to transfer (GET) an information object, which the client then displays or saves locally.
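The GET exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular client: the function names `build_get_request` and `fetch` are hypothetical, the request uses the HTTP/1.0 request-line format, and the `Host` header is included because most present-day servers expect it.

```python
import socket


def build_get_request(path: str, host: str) -> bytes:
    """Return the bytes of a minimal HTTP/1.0 GET request for `path`.

    Hypothetical helper for illustration only.
    """
    lines = [
        f"GET {path} HTTP/1.0",  # request line: method, object name, protocol version
        f"Host: {host}",         # assumption: sent because most current servers expect it
        "",                      # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, port: int, path: str, timeout: float = 10.0) -> bytes:
    """Open a TCP connection, send one GET request, and return the raw response."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_get_request(path, host))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:         # an HTTP/1.0 server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("www.example.org", 80, "/")` would return the status line, headers, and body as one byte string; a real client would go on to parse these and handle errors and redirects.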
An information object is identified by a Uniform Resource Locator [URL]
which, in its canonical form, can include the access scheme (e.g. http,
gopher, or telnet), a hostname or IP address and TCP port for the server
location (e.g. www.ics.uci.edu:80), and a name recognizable by the
server as representing that object (usually in the form of a relative
file pathname). For example, the full URL for a preliminary hypertext
version of this document is
<http://www.ics.uci.edu:80/WebSoft/MOMspider/WWW94/intro.html>.
In most cases, the URL is embedded in other documents as a hypertext
reference (link) associated with some meaningful text pointer
(anchor). Viewed graphically, these links form a hypertext "web" of
related information.
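To make the three parts of a URL concrete, the following sketch splits the example URL above into its access scheme, server location, and object name using Python's standard `urllib.parse` module. The function name `split_url` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Break a URL into the components described in the text.

    Hypothetical helper: the key names are illustrative, not standard.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,  # access scheme, e.g. "http"
        "host": parts.hostname,  # server hostname (lower-cased) or IP address
        "port": parts.port,      # TCP port as an int, or None if not given
        "name": parts.path,      # name the server uses to identify the object
    }


example = split_url(
    "http://www.ics.uci.edu:80/WebSoft/MOMspider/WWW94/intro.html"
)
```

Here `example["scheme"]` is `"http"`, `example["host"]` is `"www.ics.uci.edu"`, `example["port"]` is `80`, and `example["name"]` is the pathname portion of the URL.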
The Hypertext Markup Language [HTML] is used to structure information such that it can be readily displayed by viewing clients. Because these clients exist on heterogeneous platforms and may vary in their rendering abilities, HTML emphasizes the description of content and structure rather than form. Of primary importance is HTML's ability to define portions of a document as being hypertext -- pieces of text or images which are linked (via anchor references) to other documents. End-users of the WWW interact with a viewing client (such as NCSA's Mosaic) by reading documents and traversing to related documents by selecting the hypertext anchors. In addition to the main body of information, HTML documents can contain special header information (metainformation) such as the document's title, expiration date, and version.
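As a rough illustration of how a program might recover the hypertext links from an HTML document, the sketch below collects each anchor's reference and its text using Python's standard `html.parser` module. The class name `AnchorCollector`, the function `extract_links`, and the sample document are all hypothetical; this is not a description of how any existing tool parses HTML.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []    # completed (href, text) pairs, in document order
        self._href = None  # href of the anchor currently open, if any
        self._text = []    # text fragments seen inside the open anchor

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling this.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html_text: str) -> list:
    """Return every (href, anchor text) pair found in `html_text`."""
    parser = AnchorCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical sample document, used only to exercise the sketch.
sample = (
    "<html><head><title>Sample</title></head><body>"
    '<p>See <a href="intro.html">the introduction</a> and '
    '<A HREF="http://www.ics.uci.edu/">UCI</A>.</p>'
    "</body></html>"
)
links = extract_links(sample)
```

Each pair in `links` holds the reference (the URL) and the anchor (the text pointer), mirroring the link and anchor distinction drawn earlier.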