Mind Matters: A Tribute to Allen Newell/Meaning Matters

World Knowledge versus Linguistic Knowledge
 The need for world knowledge to support language comprehension is well recognized. I have in mind ambiguous sentences like "Sam walked into the library." Some people have trouble recognizing that "Sam walked into the library" is ambiguous. "Sam walked into the tree" is also ambiguous, and in the same way. One explanation of how people resolve such ambiguities is that they draw on their knowledge of the world. It is common knowledge that libraries are things you go inside of whereas trees (generally) are not, and that trees are things you bump into whereas libraries (generally) are not. That is to say, people know about libraries and about trees, and what they know resolves the potential ambiguity so rapidly that it is not even noticed.

 That is a plausible approach. But world knowledge is so huge and amorphous that I have never had the courage to tackle it. Nothing less than a system as sophisticated as Soar could possibly encompass it, and even Soar might not suffice. So I have tried to take a shortcut. I have tried to imagine what linguistic knowledge would be required and how it might be organized. Indeed, many of the problems that seem to require world knowledge could be handled well enough if only our systems had more linguistic knowledge. The ambiguous sentence just discussed, "Sam walked into the library," can serve as an example. Suppose our system -- the system that we design to map linguistic forms into their meanings -- knows that "walk into" is a phrasal verb with more than one sense. Then the problem would be to use the linguistic context to determine which sense of "walk into" is appropriate in any given case. This is the kind of linguistic relation between a verb and its context that is called a selectional restriction (or a selectional preference): A verb selects certain nouns that can serve felicitously as its direct object.
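 As a rough illustration of how a selectional preference could pick a sense of "walk into," here is a minimal sketch in Python. Everything in it is assumed for the sake of the example: the noun classes, the two sense labels, and the function name `preferred_senses` are hypothetical and do not come from any actual lexicon or system.

```python
# Hypothetical toy lexicon: every entry is an illustrative assumption.
NOUN_CLASSES = {
    "library": {"enclosure"},
    "room": {"enclosure"},
    "tree": {"obstacle"},
    "wall": {"obstacle"},
}

# Each assumed sense of "walk into" names the noun class it prefers
# for its object.
WALK_INTO_SENSES = {
    "enter": "enclosure",
    "collide with": "obstacle",
}


def preferred_senses(noun):
    """Return the senses of 'walk into' whose preference the noun satisfies."""
    classes = NOUN_CLASSES.get(noun, set())
    matches = [
        sense
        for sense, wanted in WALK_INTO_SENSES.items()
        if wanted in classes
    ]
    # If no preference is satisfied, the phrase stays ambiguous,
    # so every sense is returned.
    return matches or list(WALK_INTO_SENSES)


print(preferred_senses("library"))  # ['enter']
print(preferred_senses("tree"))     # ['collide with']
```

 A noun the toy lexicon does not list leaves both senses in play, which matches the point that the sentence is ambiguous until some knowledge about the noun is brought to bear.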

 Because linguistic knowledge is a part of world knowledge, an appeal to linguistic knowledge might seem parsimonious. But the sheer bulk of information that is involved is still forbidding. To get a rough estimate, take the 1,700-page grammar by Quirk, Greenbaum, Leech, and Svartvik (1985) and pile it on top of the 1,500-page Webster's Ninth New Collegiate Dictionary (1987). And even that is an underestimate -- there is much that speakers of English know about their language that the authors of these great books have not yet described. But it is less than, say, The Encyclopedia Britannica, which might be taken as a comparable estimate of the amount of world knowledge a system would require.

 Whether linguistic knowledge, defined to include lexical as well as grammatical information, could suffice to answer all of the questions for which a need for world knowledge has been invoked is not something I hope to settle here. My point is merely that more and better linguistic knowledge could be a great help. In any case, I am going to discuss only linguistic knowledge -- there are enough problems involved with that to overflow the space I have available.

 One obvious linguistic need is for a good, robust parsing program. In spite of remarkable progress in syntactical theory and in parser construction during the past quarter century, we still do not have a computer system capable of parsing correctly every grammatical sentence that comes its way. It is a difficult problem, but in order to get ahead with my present concerns I am going to assume that it has been solved. (p. 121-2)