Graduate School of Library and Information Science, UT Austin
Information Technologies and the Information Professions

OUTLINE OF THE INFORMATION RETRIEVAL (IR) PROBLEM
Philip Doty

  • Given what we have accumulated in the cultural record, how does one find information?

  • Corollary to the "classic IR problem": How do we distinguish what we want from the sea of what we do not want, especially the bad, i.e., the irrelevant, unreliable, inaccurate, outdated, misleading, etc.?

  • The "IR problem" grows more acute as knowledge increases, as the number of media and platforms increases, as the integration of media grows, as the interoperability of platforms increases, and as we face information overload.

CLASSICAL (SIMPLE) MEASURES OF INFORMATION RETRIEVAL (IR)

Precision = (number of relevant documents or records retrieved) / (number of documents or records retrieved)
Recall = (number of relevant documents or records retrieved) / (number of relevant documents or records in collection)
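
A minimal sketch of the two measures, offered as an illustration rather than as part of the original outline. It assumes that each document or record has a unique identifier and that relevance is a simple yes/no judgment made in advance. The function name `precision_recall`, the sample identifiers, and the choice to return 0.0 when a denominator is zero are all assumptions made for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: identifiers of the documents the system returned.
    relevant:  identifiers of all relevant documents in the collection.
    Assumption: an empty denominator yields 0.0 rather than an error.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)

    # Relevant documents that were actually retrieved.
    hits = retrieved & relevant

    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 documents retrieved, 5 relevant in the collection,
# 2 of the retrieved documents are relevant.
retrieved_ids = ["d1", "d2", "d3", "d4"]
relevant_ids = ["d2", "d4", "d5", "d6", "d7"]

precision, recall = precision_recall(retrieved_ids, relevant_ids)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
# prints: precision = 0.50, recall = 0.40
```

Note that computing recall requires knowing every relevant document in the collection, which is rarely possible with large corpora.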

MAJOR DIFFICULTIES WITH RECALL AND PRECISION

  • Defining relevance

  • Searches are not discrete events

  • Problems with real users, real information needs, and large corpora

  • Assumed equivalence of information/document retrieval with the satisfaction of information need

  • "Tyranny of topic"

  • System performance is concerned with users' satisfaction only obliquely

TRENDS IN IR

  • Composite (multimedia) documents

  • Documents in electronic formats only, i.e., documents "born digital"
      o Dislocation of electronic and print "counterparts"

  • Increased end-user computing and disintermediation, including the use of intelligent agents and meta-search engines on the Web

  • Direct marketing to end-users

  • Document delivery

  • Virtual reality and navigational tools in "information space"

  • Use of fuzzy logic

  • Parallel processing, e.g., neural nets and genetic algorithms

  • Natural Language Processing

  • "Undiscovered public knowledge"
Course emailbox: l38613dw@gslis.utexas.edu
GSLIS Website: www.gslis.utexas.edu

Last updated 2001 February 14 by R. E. Wyllys