RFC1625 - WAIS over Z39.50-1988

Network Working Group M. St. Pierre
Request for Comments: 1625 WAIS, Inc.
Category: Informational J. Fullton
CNIDR
K. Gamiel
CNIDR
J. Goldman
Thinking Machines Corp.
B. Kahle
WAIS, Inc.
J. Kunze
UC Berkeley
H. Morris
WAIS, Inc.
F. Schiettecatte
FS Consulting
June 1994
WAIS over Z39.50-1988
Status of this Memo
This memo provides information for the Internet community. This memo
does not specify an Internet standard of any kind. Distribution of
this memo is unlimited.
1. Introduction
The network publishing system, Wide Area Information Servers (WAIS),
is designed to help users find information over a computer network.
The principles guiding WAIS development are:
1. A wide-area, network-based information system for searching,
browsing, and publishing.
2. Based on standards.
3. Easy to use.
4. Flexible and growth oriented.
From this basis, a large group of developers, publishers, standards
bodies, libraries, government agencies, schools, and users have been
helping further the WAIS system.
The WAIS software architecture has four main components: the client,
the server, the database, and the protocol. The WAIS client is a
user-interface program that sends requests for information to local
or remote servers. Clients are available for most popular desktop
environments. The WAIS server is a program that services client
requests, and is available on a variety of UNIX platforms. The
server generally runs on a machine containing one or more information
sources, or WAIS databases. The protocol, Z39.50-1988, is used to
connect WAIS clients and servers and is based on the 1988 Version of
the NISO Z39.50 Information Retrieval Service and Protocol Standard.
The goal of the WAIS network publishing system is to create an open
architecture of information clients and servers by using a standard
computer-to-computer protocol that enables clients to communicate
with servers.
WAIS development began in October 1989 with the first Internet
release occurring in April 1991. From the beginning, WAIS committed
to use the Z39.50-1988 standard as the information retrieval protocol
between WAIS clients and servers. The implementation is still in use
today by existing WAIS clients and servers, resulting in over 50,000
users of Z39.50-1988 on the Internet.
2. Purpose
The purpose of this memo is to initiate a discussion for a migration
path of the WAIS technology from Z39.50-1988 Information Retrieval
Service Definitions and Protocol Specification for Library
Applications [1] to Z39.50-1992 [2] and then to Z39.50-1994 [3]. The
purpose of this memo is not to provide a detailed implementation
specification, but rather to describe the high-level design goals and
functional assumptions made in the WAIS implementation of Z39.50-
1988. WAIS use of Z39.50-1992 and Z39.50-1994 standards will be the
subject of future RFCs.
3. Historical Design Goals of WAIS
As an aid to understanding the original WAIS implementation and its
use of Z39.50-1988, the historical design goals of WAIS are presented
in this section. Included with each goal is a brief description of
the assumptions used to meet these design goals.
1. Provide users access to bibliographic and non-bibliographic
information, including full-text and images.
Because Z39.50-1988 grew out of the bibliographic community,
additional assumptions about the protocol were required to serve
non-bibliographic information. They were also necessary to serve
documents existing in multiple formats (e.g., rtf, postscript, gif).
2. Keep the client/server interface simple and independent of
changes in the functionality of the server.
To achieve this, the text string entered by the user was transmitted
to the server as is, rather than being parsed into a Type-1 RPN
(reverse-polish notation) query, as is common for bibliographic
applications. Instead, WAIS defined a new Type-3 query containing the
text string. As a result, neither the client nor the user needed to
know which Z39.50 Attributes the server supported, unlike in many
existing Z39.50 implementations. In addition, the client software did
not require modification to support the evolving functionality of the
server.
3. Provide relevance feedback capability.
Relevance feedback is the ability to select a document, or portion of
a document, and find a set of documents similar to the selection.
WAIS included documents used in relevance feedback as part of the
Type-3 query.
4. Permit the server to operate in a stateless manner.
A WAIS server was designed to be "stateless", meaning that search
result sets were not stored by the server. In Z39.50 terms, the
server exercised its right to unilaterally delete a result set as
soon as it sent the search response. For this reason, the Present
Facility of Z39.50 was not used, and retrievals were performed using
the Search Facility. Relaxing this constraint in future
implementations may prove the most prudent path.
5. Provide the ability for a client to retrieve documents in
pieces.
Because retrieval of a portion of a document could be done several
ways with Z39.50-1988, specific assumptions were made to implement
this functionality. Accessing a portion of a document was required
for both retrieval and for relevance feedback.
6. Run over TCP.
The Z39.50-1988 standard was designed to run in the application layer
using the presentation services provided by the Open Systems
Interconnection (OSI) Reference Model. Due to the popularity of
TCP/IP and the Internet, WAIS was designed to run over TCP. Use of
Z39.50 over TCP is described in [4].
4. WAIS Implementation of Z39.50-1988
By working with the Z39.50 Implementors Group (ZIG), the WAIS
developers used a recommended subset of Z39.50-1988 and specific
assumptions to fulfill their requirements. Over time, many of these
requirements have been incorporated into the definition of subsequent
versions of Z39.50. As new requirements become apparent, WAIS will
document any additional assumptions and work with the ZIG in
developing extensions.
WAIS supported the Init and Search Facilities of Z39.50-1988. Both
search and retrieval were implemented using the Search Facility, as
described in this section.
Search was initiated by the client with a Search Request APDU
(Application Protocol Data Unit) using a Type-3 query. The query
contained two main fields:
1. The "seed words", or text, typed by the user.
2. A list of document objects, where a document object is a
full document, or portion thereof, to be used in relevance
feedback. Each document object contains a document
identifier (Doc-ID) [5], type, chunk-code, and start and
end locations. The Doc-ID and type specify the location and
format, respectively, of the document. The chunk-code
determines the unit of measure for the start and end
locations. Examples of chunk-codes used include
byte, line, paragraph, and full document. If the chunk-code
is a full document, the start and end locations are ignored.
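The two fields of the Type-3 query can be pictured as a small data structure. The following is a minimal sketch in Python, not the WAIS wire encoding: the class names, field names, and enum values are hypothetical and chosen only to mirror the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class ChunkCode(Enum):
    """Unit of measure for start/end locations (illustrative values)."""
    BYTE = "byte"
    LINE = "line"
    PARAGRAPH = "paragraph"
    FULL_DOCUMENT = "document"


@dataclass
class DocumentObject:
    """A document, or portion thereof, offered for relevance feedback."""
    doc_id: str              # location of the document
    doc_type: str            # format of the document, e.g. "text"
    chunk_code: ChunkCode    # unit of measure for start and end
    start: int = 0
    end: int = 0

    def span(self) -> Optional[Tuple[int, int]]:
        """Return (start, end), or None when the whole document is meant."""
        if self.chunk_code is ChunkCode.FULL_DOCUMENT:
            return None      # start and end are ignored for full documents
        return (self.start, self.end)


@dataclass
class Type3Query:
    """Seed words plus an optional list of relevance-feedback documents."""
    seed_words: str
    documents: List[DocumentObject] = field(default_factory=list)


# Usage: a query with one partial and one full document as feedback.
query = Type3Query(
    seed_words="information retrieval protocols",
    documents=[
        DocumentObject("doc-a", "text", ChunkCode.LINE, start=10, end=25),
        DocumentObject("doc-b", "text", ChunkCode.FULL_DOCUMENT, start=3, end=9),
    ],
)
```

The identifiers "doc-a" and "doc-b" are placeholders; real Doc-IDs follow the structure described in [5].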
A Search Response APDU returned by the server contained a
relevance-ranked list of records, or WAIS Citations. A WAIS Citation refers to
a document on the server. Each WAIS Citation contains the following
fields:
1. Headline - a set of words that convey the main idea of the
document.
2. Rank - the numerical score of the document based on its
relevance to the query, normalized to a top score of 1000.
3. List of available formats - e.g., text, postscript, tiff.
4. Doc-ID - the location of the document.
5. Length - the length of the document in bytes.
The number of WAIS Citations returned was limited by the preferred
message size negotiated during the Init.
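As an illustration of the citation fields and the two constraints just described (rank normalized to a top score of 1000, and a citation count bounded by the preferred message size), here is a hedged Python sketch. The memo does not say how servers computed raw scores or measured encoded citation size, so linear scaling and the caller-supplied `estimate_size` function are assumptions, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Citation:
    headline: str        # words conveying the main idea of the document
    rank: int            # relevance score, top score normalized to 1000
    formats: List[str]   # available formats, e.g. ["text", "postscript"]
    doc_id: str          # location of the document
    length: int          # length of the document in bytes


def normalize_ranks(raw_scores: List[float], top: int = 1000) -> List[int]:
    """Scale raw scores linearly so the best one maps to `top` (assumed scheme)."""
    best = max(raw_scores, default=0.0)
    if best <= 0:
        return [0 for _ in raw_scores]
    return [round(score / best * top) for score in raw_scores]


def fit_to_message_size(
    citations: List[Citation],
    preferred_message_size: int,
    estimate_size: Callable[[Citation], int],
) -> List[Citation]:
    """Keep the best-ranked citations whose estimated total size fits.

    `estimate_size` stands in for the real encoded size of a citation,
    which this sketch does not attempt to compute.
    """
    kept: List[Citation] = []
    used = 0
    for citation in sorted(citations, key=lambda c: c.rank, reverse=True):
        size = estimate_size(citation)
        if used + size > preferred_message_size:
            break
        kept.append(citation)
        used += size
    return kept
```

A server following this sketch would rank first, then truncate, so that the citations dropped are always the least relevant ones.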
Retrieval of a document was initiated by the client with a Search
Request APDU using a Type-1 query. The query contained up to four
terms:
1. Term: Doc-ID
Use Attribute: system-control-number code = "un"
Relation Attribute: equal code = "re"
2. Term: the requested document format
Use Attribute: data-type code = "wt"
Relation Attribute: equal code = "re"
3. Term: the start location
Use Attribute: paragraph, line, byte code = "wp", "wl", "wb"
Relation Attribute: greater-than-or-equal code = "ro"
4. Term: the end location
Use Attribute: paragraph, line, byte code = "wp", "wl", "wb"
Relation Attribute: less-than code = "rl"
Because full-text and images were often larger in size than the
receive buffer of the client, clients were designed to optionally
retrieve documents in chunks, specifying the start and end positions
of the chunk in the query. An example of a fully-specified retrieval
query is:
query = ( ( use = "un", relation = "re", term = <Doc-ID> )
          AND
          ( use = "wt", relation = "re", term = postscript )
          AND
          ( use = "wb", relation = "ro", term = 0 )
          AND
          ( use = "wb", relation = "rl", term = 2000 )
        )
A retrieval response was issued by the server with a Search Response
APDU. In this case a single record corresponding to the requested
document, or portion thereof, was returned in the specified format.
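The retrieval query and chunked retrieval described in this section can be sketched as follows. This is an illustrative Python model, not the Z39.50 encoding: the attribute codes are taken from the term list above, while the function names, the tuple representation of a term, and the half-open zero-based ranges are assumptions. The sketch also assumes the Doc-ID and format terms are always present and only the two location terms are optional.

```python
from typing import Iterator, List, Optional, Tuple

# Attribute codes as listed in the memo.
USE_DOC_ID = "un"
USE_DATA_TYPE = "wt"
USE_UNITS = {"paragraph": "wp", "line": "wl", "byte": "wb"}
REL_EQUAL = "re"
REL_GREATER_OR_EQUAL = "ro"
REL_LESS_THAN = "rl"

Term = Tuple[str, str, object]  # (use, relation, term); terms are ANDed


def retrieval_terms(
    doc_id: str,
    fmt: str,
    unit: Optional[str] = None,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> List[Term]:
    """Build the two to four terms of a retrieval query."""
    terms: List[Term] = [
        (USE_DOC_ID, REL_EQUAL, doc_id),
        (USE_DATA_TYPE, REL_EQUAL, fmt),
    ]
    if unit is not None:
        if start is None or end is None:
            raise ValueError("start and end are required when unit is given")
        use = USE_UNITS[unit]
        terms.append((use, REL_GREATER_OR_EQUAL, start))  # start <= position
        terms.append((use, REL_LESS_THAN, end))           # position < end
    return terms


def chunk_ranges(length: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) ranges covering [0, length)."""
    for start in range(0, length, chunk_size):
        yield start, min(start + chunk_size, length)


# Usage: fetch a 4500-byte postscript document in 2000-byte chunks.
queries = [
    retrieval_terms("doc-1", "postscript", "byte", start, end)
    for start, end in chunk_ranges(4500, 2000)
]
```

Because the start relation is greater-than-or-equal and the end relation is less-than, consecutive chunks can share a boundary value without overlapping. The Citation's length field gives a client the total it needs to plan such a sequence when the unit is bytes.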
5. Security Considerations
Security issues are not discussed in this memo.
6. References
[1] National Information Standards Organization (NISO). American
National Standard Z39.50, Information Retrieval Service
Definition and Protocol Specifications for Library Applications,
New Brunswick, NJ, Transaction Publishers; 1988.
[2] ANSI/NISO Z39.50-1992 (version 2) Information Retrieval Service
and Protocol: American National Standard, Information Retrieval
Application Service Definition and Protocol Specification for
Open Systems Interconnection, 1992.
[3] "Z39.50 Version 3: Draft 8", October 1993. Maintenance Agency
Reference: Z39.50MA-034.
[4] Lynch, C., "Using the Z39.50 Information Retrieval Protocol
in the Internet Environment", Work in Progress, November 1993.
[5] "Document Identifiers, or International Standard Book Numbers
for the Electronic Age", Brewster Kahle, Thinking Machines
Corporation, see URL=<FTP://wais.com/pub/class/doc-ids.txt>,
September 1991.
7. Authors' Addresses
Margaret St. Pierre
WAIS Incorporated
1040 Noel Drive
Menlo Park, California 94025
Phone: (415) 327-WAIS
Fax: (415) 327-6513
EMail: saint@wais.com
Jim Fullton
Clearinghouse for Networked Information
Discovery & Retrieval
3021 Cornwallis Road
Research Triangle Park, North Carolina 27709-2889
Phone: (919)-248-9247
Fax: (919)-248-1101
EMail: jim.fullton@cnidr.org
Kevin Gamiel
Clearinghouse for Networked Information
Discovery & Retrieval
3021 Cornwallis Road
Research Triangle Park, North Carolina 27709-2889
Phone: (919)-248-9247
Fax: (919)-248-1101
EMail: kevin.gamiel@cnidr.org
Jonathan Goldman
Thinking Machines Corporation
1010 El Camino Real, Suite 310
Menlo Park, California 94025
Phone: (415) 329-9300 x229
Fax: (415) 329-9329
EMail: jonathan@think.com
Brewster Kahle
WAIS Incorporated
1040 Noel Drive
Menlo Park, California 94025
Phone: (415) 327-WAIS
Fax: (415) 327-6513
EMail: brewster@wais.com
John A. Kunze
UC Berkeley
289 Evans Hall
Berkeley, California 94720
Phone: (510) 642-1530
Fax: (510) 643-5385
EMail: jak@violet.berkeley.edu
Harry Morris
WAIS Incorporated
1040 Noel Drive
Menlo Park, California 94025
Phone: (415) 327-WAIS
Fax: (415) 327-6513
EMail: morris@wais.com
Francois Schiettecatte
FS Consulting
435 Highland Avenue
Rochester, New York 14620
Phone: (716) 256-2850
