From: Jorn Barger (jorn@enteract.com)
Subject: Semantics and NLP
Newsgroups: comp.ai.nat-lang
Date: 2001-11-05 06:10:02 PST
Judging from the replies I got to my recent ‘newsdiff’ posting, it
appears university programs in AI are (still) doing an extremely bad job
of covering *semantics*…
An unabridged dictionary may have half a million entries. Roget tried
to sort these into 1000 categories, in a hierarchical tree:
1. abstract relations
   1. existence
   2. relation
   3. quantity
   4. order
   5. number
   6. time
   7. change
   8. causation
2. space
   1. generally
   2. dimensions
   3. form
   4. motion
3. matter
   1. generally
   2. inorganic
   3. organic
4. intellect
   1. formation of ideas
   2. communication of ideas
5. volition
   1. individual
   2. intersocial
6. affections
   1. generally
   2. personal
   3. sympathetic
   4. moral
   5. religious
Semantic AI has to wrestle with ontologies like Roget’s, using a
precisely defined, *limited* vocabulary to express the very wide range
of real-world concepts.
And I think so long as one stays in the top three categories of Roget’s
tree (abstract relations, space, matter) progress can be, and is
being, made.
But when human psychology enters the picture (intellect, volition,
affections) precise definitions and limited vocabularies instantly and
catastrophically *fail*.
And this barrier has apparently traumatized the field of AI so severely
that the topic is practically taboo! It’s just excluded from
discussion.
*If* a neat ontology-of-psychology could be created, all the other
problems of NLP might just evaporate. Understanding a sentence would
just require finding the node in the ontology that expresses the exact
same meaning (in a generalised form).
I’ve been arguing that this ontology _can_ be built if we think of the
nodes not as dictionary-entries, but as the whole ‘usual stories’
surrounding a concept or word.
So Roget’s ‘intellect’ node would correspond to the whole usual-story of
human intellect: children are born with limited intellect, they learn,
some learn faster, some learn more, humans use intellect to solve
problems, they teach others, etc. (This could even be encoded in the
form of a short, human-readable encyclopedia article, written in simplified
English.)
Particular concepts like ‘smarter’ can then be linked to specific parts
of this general story as ‘specialisations’. (The usual ‘smarter’ story
involves getting praised in school, becoming annoyingly self-important,
going to college, etc.)
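
As a rough sketch of what "linking a specialisation to part of a general story" might look like as a data structure: the `Story` class, its field names, and the choice of which step 'smarter' attaches to are all assumptions of mine, made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    """A 'usual story': a name plus an ordered list of typical steps."""
    name: str
    steps: list
    # Maps a step index to the more specific stories hanging off that step.
    specialisations: dict = field(default_factory=dict)

    def specialise(self, step_index, child):
        """Attach `child` as a specialisation of one step of this story."""
        if not 0 <= step_index < len(self.steps):
            raise IndexError("no such step in story %r" % self.name)
        self.specialisations.setdefault(step_index, []).append(child)
        return child


intellect = Story("intellect", [
    "children are born with limited intellect",
    "they learn",
    "some learn faster",
    "some learn more",
    "humans use intellect to solve problems",
    "they teach others",
])

smarter = Story("smarter", [
    "gets praised in school",
    "becomes annoyingly self-important",
    "goes to college",
])

# Assumption: 'smarter' is attached to the "some learn faster" step.
intellect.specialise(2, smarter)
```

The point of the design is that a node is a whole sequence of steps, not a one-line definition, and that narrower concepts point at a specific step rather than at the node as a whole.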
But remarkably, the mental skill required to think in terms of these
usual-stories is much closer to the novelist’s art than to anything
taught in comp-sci, to such a great extent that comp.ai.nat-lang barely
acknowledges the topic as appropriate!
But I think I’ve found a leverage point, finally: pseudo-XML tagging of
the entries in Web *timelines*.
Because the authors of timelines are trying to limit themselves to the
most significant discrete events (in all of history), timelines do an
excellent job of prioritising human behaviors, and so of identifying the
most-useful limited vocabulary for human history.
Examples:
person1 is born at place on date to mother person2 and father person3
person1 is educated at place by person2
person moves from place1 to place2
person creates creative-work
person founds social-institution
person joins social-institution
person discovers theory
person1 fights person2
person leads group with persons2-3-etc
group fights group
etc
These, then, become the ‘root-level’ usual-stories in the psychological
ontology.
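
The templates above can be treated as patterns with named slots. The following is only a sketch under strong assumptions: it handles just the simpler templates, it expects a timeline entry already phrased exactly in the template's wording, and the names `SLOT`, `compile_template` and `tag` are hypothetical. It returns the slot fillers as a dictionary and does not attempt any particular pseudo-XML syntax. The sample sentences are invented.

```python
import re

# Slot words that may appear in a template, optionally followed by a number.
SLOT = re.compile(
    r"\b(person|place|date|group|creative-work|social-institution|theory)"
    r"(\d*)\b"
)

TEMPLATES = [
    "person moves from place1 to place2",
    "person creates creative-work",
    "person founds social-institution",
    "person joins social-institution",
    "person discovers theory",
    "person1 fights person2",
]


def compile_template(template):
    """Turn a template into a regex with one named group per slot."""
    parts, pos = [], 0
    for m in SLOT.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))
        group = m.group(0).replace("-", "_")  # group names cannot contain '-'
        parts.append("(?P<%s>.+?)" % group)
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$")


COMPILED = [(t, compile_template(t)) for t in TEMPLATES]


def tag(sentence):
    """Return (template, slot fillers) for the first matching template."""
    for template, pattern in COMPILED:
        m = pattern.match(sentence)
        if m:
            return template, m.groupdict()
    return None


if __name__ == "__main__":
    print(tag("Alice moves from Dublin to Paris"))
    print(tag("Alice fights Bob"))
```

Real timeline entries are free prose, so the hard part (getting from the prose to the template wording) is exactly what this sketch leaves out.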
And my ‘newsdiff’ suggestion was about the need for an NLP system to have a
mega-timeline of history that’s continually updated with new
news-events, winnowing just the new developments from the standard
high-redundancy journalistic style.
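
In the simplest possible terms, the winnowing step is a set difference. This sketch assumes news items have already been reduced to a template plus slot fillers; the `newsdiff` function and the event layout are hypothetical, and the events are invented. It ignores the real difficulty, which is recognising that two differently worded reports describe the same event.

```python
def newsdiff(timeline, incoming):
    """Return the incoming events not yet on the timeline, and record them.

    `timeline` is a set of hashable event keys; `incoming` is a list of
    dicts with a 'template' string and a 'slots' dict.
    """
    fresh = []
    for event in incoming:
        key = (event["template"], tuple(sorted(event["slots"].items())))
        if key not in timeline:
            timeline.add(key)
            fresh.append(event)
    return fresh


if __name__ == "__main__":
    timeline = set()
    report = {"template": "person founds social-institution",
              "slots": {"person": "Alice",
                        "social-institution": "the Chess Club"}}
    # The second copy of the same report is dropped as redundant.
    print(newsdiff(timeline, [report, dict(report)]))
    # A later repeat of the same report yields nothing new.
    print(newsdiff(timeline, [report]))
```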
ai faq: http://www.robotwisdom.com/ai/
timelines project: http://www.robotwisdom.com/science/history.html
--
http://www.robotwisdom.com/ “Relentlessly intelligent
yet playful, polymathic in scope of interests, minimalist
but user-friendly design.” –Milwaukee Journal-Sentinel