Doyle, L. B. (1975). Information retrieval and processing. Los Angeles: Melville.
Lauren B. Doyle, an early researcher in the field of information retrieval, prepared this volume for the firm of Becker and Hayes, drawing in part on an earlier title prepared by Joseph Becker and Robert M. Hayes in 1963 (Information Storage and Retrieval: Tools, Elements, Theories). Chapter ten, covering the processing of language data, provides a comprehensive overview of automatic indexing and language processing by means of semantic or conceptual schemes. Chapter eleven provides an overview of the evaluation techniques developed up to that time. The references at the end of each chapter list documents and reports that can be considered foundational writings on these topics.
Keenan, S. (1973). Progress in automatic indexing and prognosis for the future. In J. A. Clifton & D. Helgeson (Eds.), Computers in information data centers (pp. 97-104). Montvale, NJ: AFIPS Press.
This short paper reviews basic
readings in automated indexing and attempts to predict trends for the future in
this field. The author argues that the fields of automated language processing
and linguistics should collaborate, to their mutual benefit, on the study of
natural language handling. At the time of writing, automatic indexing techniques
were being used with large information stores, and journal production was increasingly
moving to machine-readable form.
Licklider, J. C. R. (1965). Libraries
of the future. Cambridge, MA: M.I.T. Press.
This book is dedicated
to Dr. Vannevar Bush, whose pioneering article "As We May Think"
(Atlantic Monthly, July 1945) inspired the author to prepare
this work, sponsored by the Council on Library Resources. At that time, the exploding
size of the corpus of knowledge versus the capacity of computer memories and the
speed of computer processors was a major concern. The author introduces the term
procognitive systems to name the system of the future that he predicts will bring
many disciplines together, blending computer sciences, behavioral and social sciences,
library sciences, and information storage and retrieval studies to form a system
that benefits mankind. Topics discussed include random access memory, parallel
processing, cathode ray oscilloscope displays, light pens, list structures, xerographic
output units, and time-sharing computer systems with remote user stations.
Luhn, H. P. (1959). Auto-encoding of documents for information retrieval systems.
In M. Boaz (Ed.), Modern trends in documentation (pp. 45-58). London: Pergamon
Press.
Luhn believed that the growing rate of information and document
production necessitated the invention of methods allowing data to be retrieved
from stores of documents without expensive human intervention. This paper discusses
auto-encoding based on statistical procedures performed by a machine on the original
text of a document already in machine-readable form. The prevalent machine-readable
form of that time was primarily punched cards or paper tape and less frequently
magnetic tape. The auto-encoding method used word frequency rates, a special thesaurus,
and the development of multi-dimensional patterns based on word proximity. At
the time, application of the method was limited to articles of 500 to 5000 words,
but Luhn was confident that the logical capabilities of electronic machines, statistical
methods, and "further research into the characteristics of human behavior
as manifested in writing" would lead to better information dissemination
and retrieval. Earlier articles by this author discuss the automatic creation
of abstracts and the development of thesauri.
Luhn, H. P. (1961). Automated
intelligence systems: Some basic problems and prerequisites for their solution.
In E. Tomeski, R. W. Westcott, & M. Covington (Eds.), The clarification,
unification & integration of information storage & retrieval proceedings
of February 23rd 1961 symposium (pp. 3-20). New York: Management Dynamics.
In this article, Luhn discusses the swelling flood of new information and the need
for a comprehensive intelligence system for selective dissemination of new information.
The system he outlined provides a profile reflecting the current interests of
the user and is capable of storing and retrieving information related to those
interests by comparing the profile of interests to the stored library of documents
and retrieving those with matching similarity. Additional functions of the system
were to locate others interested in similar topics and to match up any new incoming
interest profiles with those already stored in order to facilitate communication
among people in an organization. Interaction with the system was handled for the
user through an intermediary, the information specialist. Luhn proposes centralized
services that could make machine-readable texts available promptly and predicts
direct communication between electronic information processing machines and these
text centers that would be entirely automatic. He also requests that work be done
in the area of automatic recognition and characterization of pictorials and urges
that the approach he advocated be implemented without delay.
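The profile matching described in this annotation can be sketched as follows. This is only a minimal illustration under stated assumptions, not Luhn's procedure: cosine similarity over raw word counts is a stand-in technique, and the names term_vector, cosine, and disseminate, along with the threshold value, are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Count lowercase alphabetic word tokens in a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def disseminate(profiles, document, threshold=0.2):
    """Return the users whose interest profile matches the incoming document.

    The threshold is an arbitrary assumption chosen for illustration.
    """
    doc_vec = term_vector(document)
    return sorted(
        user
        for user, interests in profiles.items()
        if cosine(term_vector(interests), doc_vec) >= threshold
    )


# Hypothetical interest profiles and one incoming document.
profiles = {
    "ana": "automatic indexing abstracts",
    "ben": "magnetic tape storage",
}
print(disseminate(profiles, "A report on automatic indexing of abstracts"))
```

The same cosine function could compare a new interest profile with stored profiles, which corresponds to the function of locating people with similar interests.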
Salton, G.
(1968). Automated language processing. In C. A. Cuadra (Ed.), Annual review
of information science and technology (pp. 169-199). Chicago: Encyclopaedia
Britannica.
This review article illustrates the state of the art in syntactic
and semantic theories of language, emphasizing that the interpretation of the
meanings of words in text is a formidable task. At this point, the ideal natural
language processor traits were defined, but no actual model containing all traits
existed. Examples of user involvement in iterative search query formulation were
noted. Statistical processing for document retrieval was described as existing
in a variety of semi-operational settings; however, Salton notes the need for
evaluation of these methods. Work under way to foster evaluation includes the
SMART project and the ASLIB Cranfield work. Salton comments, "It is this writer’s
guess that the ideas of Luhn, far from being abortive, may really come into their
own within the next few years. The simple language processing methods, including
small stored dictionaries, suffix recognition procedures, and word token statistics,
appear to be far more powerful than was originally thought possible; such methods
will likely be used for most of the language processing tasks actually implemented
on computers, including applications in library science and information retrieval"
(p. 191).
Schultz, C. K. (Ed.). (1968). H. P. Luhn: Pioneer of information
science; selected works. New York: Spartan.
Hans Peter Luhn died in
1964 during his term as President of the American Documentation Institute. In
his honor, a scholarship fund for information science was established in his
name, and this book was produced to convey biographical information, list the eighty
patents held by H. P. Luhn, and compile in one place a considerable number of
his writings and speeches. Luhn was an inventor, a long-term employee of IBM,
and a person who was somewhat ahead of his time. He became interested in information
retrieval in the 1950s and is credited with creating the machine method of
Keyword-in-Context (KWIC) indexing and coining the term selective dissemination of information.
He believed in providing practical solutions to problems and produced two experimental
demonstrations in connection with the 1958 International Conference on Scientific
Information (ICSI) papers. The first was the application of his auto-abstracting
technique to the papers for one session, and the second was the production of
the KWIC index to all of the papers presented. These demonstrations were introduced at
this conference along with two new Luhn inventions, the 9900 Index Analyzer and
the Universal Card Scanner. Following the conference, newspapers across the country
carried stories about the auto-abstracting and auto-indexing system, which he
described as the machine-generated equivalent of a completely intellectual task
in the field of literature evaluation.
Stevens, M. E. (1970). Automatic
indexing: A state-of-the-art report (National Bureau of Standards Monograph
91, reissued with additions and corrections; SD Catalog No. C13.44:91). Washington,
DC: U.S. Government Printing Office.
This formidable piece of work was
initiated by the National Science Foundation and was jointly funded with the National
Bureau of Standards. The survey was originally current through February
of 1964. The February 1970 reissue added material bringing it up to date
with literature references through August 1969. The text of this volume represents
a who’s who of thought and writing in the area of automatic indexing from the
very earliest recorded instance of the concept through 1969. Each reference in
the report is accompanied by a full citation. The report is comprehensive and
thoroughly researched.
Taube, M., & Wooster, H. (Eds.). (1958). Information
storage and retrieval: Theory, systems, and devices. New York: Columbia University
Press.
This book represents the record of an Air Force symposium held
forty years ago in Washington, D.C. The list of participants in the panel discussions,
impressive for the time, included representatives from Magnavox, Documentation
Incorporated, Zator Company, Dow Chemical Company, International Business Machines,
Eastman Kodak Company, and many others who were conducting advanced research and
development in information storage and retrieval. H. P. Luhn presented a paper
titled "Indexing, Language, and Meaning" in which he distinguished between systems
for classifying collections that are "adopted" or borrowed from a pre-established
categorical scheme, those that are "synthetic" or developed by subject matter
experts for a given field, and those that are "native" or derived by statistical
analyses from the collection itself. He felt that the native category was the
most effective one for retrieval systems. At the same event, Calvin N. Mooers
predicted that two barriers of the time would be crossed: language symbols (the
control of meaning of languages), and the need for machines simple and cheap
enough to allow any individual to perform retrieval on his own collection. Crossing
them, he predicted, would give rise to unimaginable advances.
Watters, C. (1992). Dictionary of information science and
technology. San Diego: Academic Press.
This dictionary combines terms
from several specialized subject areas and presents definitions that are quite
understandable. Each definition is accompanied by a key which leads to a subject
outline in the back of the dictionary. Each term is also annotated with one or more
references to the literature: the basic or seminal discussion of the term
and a work that represents a direct usage of the term in the information
science field.
Balnaves, J., Gerrie, B., & Oxley, S. (1980). A workbook
in information retrieval (5th ed.). Canberra, Australia: Canberra College
of Advanced Education.
This workbook for students of librarianship aims
to provide practical exercises in document retrieval that build familiarity
with the vocabulary, systems, and popular databases of the time. Included in the
exercises is practice with key-word-in-context (KWIC) and key-word-out-of-context
(KWOC). This work contains a glossary of terminology used in information retrieval.
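The two index forms mentioned in this annotation can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the workbook's exercises or Luhn's original program: the stopword list, column widths, and function names (kwic_entries, format_kwic, format_kwoc) are all hypothetical.

```python
# A small, assumed stopword list; real indexes used much longer ones.
STOPWORDS = frozenset({"a", "an", "and", "for", "in", "of", "the", "to"})


def kwic_entries(titles, stopwords=STOPWORDS):
    """Build one (keyword, left context, word, right context) entry
    for every non-stopword in every title, sorted by keyword."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:()\"'").lower()
            if key and key not in stopwords:
                left = " ".join(words[:i])
                right = " ".join(words[i + 1:])
                entries.append((key, left, word, right))
    return sorted(entries)


def format_kwic(entries, width=28):
    """KWIC: the keyword stays in place, aligned in a central column."""
    return [
        f"{left[-width:]:>{width}}  {word}  {right[:width]}"
        for _, left, word, right in entries
    ]


def format_kwoc(entries):
    """KWOC: the keyword is pulled out to the margin, followed by the full title."""
    lines = []
    for key, left, word, right in entries:
        title = " ".join(part for part in (left, word, right) if part)
        lines.append(f"{key.upper():<14}{title}")
    return lines


titles = ["Automatic indexing of documents", "Libraries of the future"]
entries = kwic_entries(titles)
print("\n".join(format_kwic(entries)))
print("\n".join(format_kwoc(entries)))
```

In both forms every significant word of a title becomes an access point; the difference is only whether the keyword is shown within its context or set apart from it.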
This page is created and maintained
by Sue Soy ssoy@ischool.utexas.edu
Last Updated 11/11/98
© Copyright 1996 Susan K. Soy
Please feel free
to copy and distribute freely for academic purposes with this notice and attribution.
All other rights reserved