An Experiment in Enhancing Information Access by Natural Language Processing

Cheng, Isaac
Wilensky, Robert
Technical Report Identifier: CSD-97-963
July 1997

Abstract: We explore the hypothesis that lexical disambiguation could be applied to provide useful information access services. Specifically, we refined a lexical disambiguation method, and used it in a fully automatic categorization algorithm we developed. We also used this method more directly, to implement a service that retrieves documents by word sense.

To test these algorithms, we developed an experimental system, IAGO!, in which we applied these algorithms to accessing the World Wide Web. IAGO! comprises both a Web directory (i.e., a classification of articles by topic) and a Web search service. Unlike most other Web directories, IAGO!'s directory was generated by a fully automatic process. One experiment shows a cataloging accuracy of 97%.

To improve net searching, IAGO! enables users to refine their queries by first detecting lexical ambiguities, and then allowing users to select specific word senses by which to search. IAGO! returns only Web pages in which a given keyword occurs in the specified sense. To help evaluate these results, we derive performance thresholds that a disambiguation algorithm must meet in order to be useful for retrieval. Our experimental results suggest that the implemented algorithm performs well above these thresholds.
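
The query refinement described above (detect an ambiguous keyword, let the user pick a sense, return only pages using that sense) can be illustrated with a minimal sketch. It is not IAGO!'s implementation: it assumes pages have already been tagged with word senses by some disambiguation step, and the page names, sense labels, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical pre-disambiguated corpus: each page maps to (keyword, sense) pairs.
# Page names and sense labels are invented for illustration only.
PAGES = {
    "example.org/rivers": [("bank", "river_edge"), ("water", "liquid")],
    "example.org/loans": [("bank", "financial_institution"), ("loan", "debt")],
    "example.org/savings": [("bank", "financial_institution")],
}


def build_sense_index(pages):
    """Map each keyword to {sense: set of pages where it occurs in that sense}."""
    index = defaultdict(lambda: defaultdict(set))
    for url, tagged_words in pages.items():
        for word, sense in tagged_words:
            index[word][sense].add(url)
    return index


def senses_of(index, word):
    """Return the sorted senses seen for a keyword; more than one signals ambiguity."""
    return sorted(index.get(word, {}))


def search_by_sense(index, word, sense):
    """Return only the pages in which the keyword occurs in the chosen sense."""
    return sorted(index.get(word, {}).get(sense, set()))


index = build_sense_index(PAGES)
```

In this sketch, `senses_of(index, "bank")` would list two senses, so a front end could ask the user to choose one, and `search_by_sense` would then return only the pages tagged with that choice. Retrieval quality depends entirely on the accuracy of the sense tags, which is why the performance of the disambiguation step matters.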