Scoffing at Autonomy

From: CorporateInfo (
Subject: Expert scoffs at Autonomy software
Date: 2001-08-23 21:49:53 PST

At Genentech headquarters today in South San Francisco, there was a
seminar on “Metadata & Controlled Vocabularies: What Are They and What
is their Value?” given by Amy Warner, faculty member in Information
Architecture at UMichigan, Genentech’s consultant on taxonomy creation
and a well-known corporate consultant on information technology. Dr.
Warner had a final slide with “Autonomy” and “Semio” written on it, and
here’s what she said:

“I always get asked, aren’t there automated tools to build
taxonomies? There is some ‘Hierarchy Generation Software’ available,
but it generally relies on the principle of Collocation, not
aboutness, which is preferable. Collocation simply means statistical
clustering. Two examples are Autonomy and Semio. Beware of these.
They don’t work very well. They make many promises, but Collocation
only works some of the time. I shouldn’t bash, but I haven’t talked
to a company yet that likes Autonomy, including those that have bought
it. In many cases they are overselling their product.”
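
As a rough illustration of what "collocation" means here, the sketch below counts how often pairs of terms turn up in the same document. It is only a toy under stated assumptions: the function name `cooccurrence_counts` and the three sample documents are invented for this example, and nothing here is claimed to reflect how Autonomy or Semio actually work.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered pair of terms, how many documents contain both."""
    counts = Counter()
    for doc in documents:
        # Deduplicate and sort so each pair has one canonical ordering.
        terms = sorted(set(doc.lower().split()))
        for pair in combinations(terms, 2):
            counts[pair] += 1
    return counts


# Hypothetical toy corpus, invented for illustration only.
docs = [
    "bank river water",
    "bank money loan",
    "money loan interest",
]
counts = cooccurrence_counts(docs)
```

In this toy corpus "loan" and "money" share two documents, while "bank" is paired once each with "river" and with "money". Counts like these say which words travel together, not what any document is about, which is the distinction the quote draws between collocation and aboutness.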

Jorn’s Ontology

From: Jorn Barger (
Subject: Semantics and NLP
Date: 2001-11-05 06:10:02 PST

Judging from the replies I got to my recent ‘newsdiff’ posting, it
appears university programs in AI are (still) doing an extremely bad job
of covering *semantics*…

An unabridged dictionary may have half a million entries. Roget tried
to sort these into 1000 categories, in a hierarchical tree:

1. abstract relations
2. space
3. matter
4. intellect
   1. formation of ideas
   2. communication of ideas
5. volition
6. affections

Semantic AI has to wrestle with ontologies like Roget’s, using a
precisely-defined, *limited* vocabulary to express the very-wide range
of realworld concepts.

And I think so long as one stays in the top three categories of Roget’s
tree (abstract relations, space, matter) progress can be– and is
being– made.

But when human psychology enters the picture (intellect, volition,
affections) precise definitions and limited vocabularies instantly and
catastrophically *fail*.

And this barrier has apparently traumatized the field of AI so severely
that the topic is practically taboo! It’s just excluded from

*If* a neat ontology-of-psychology could be created, all the other
problems of NLP might just evaporate. Understanding a sentence would
just require finding the node in the ontology that expresses the exact
same meaning (in a generalised form).

I’ve been arguing that this ontology _can_ be built if we think of the
nodes not as dictionary-entries, but as the whole ‘usual stories’
surrounding a concept or word.

So Roget’s ‘intellect’ node would correspond to the whole usual-story of
human intellect: children are born with limited intellect, they learn,
some learn faster, some learn more, humans use intellect to solve
problems, they teach others, etc. (This could even be encoded in the
form of a short, human-readable encyclopedia article, written in simplified

Particular concepts like ‘smarter’ can then be linked to specific parts
of this general story as ‘specialisations’. (The usual ‘smarter’ story
involves getting praised in school, becoming annoyingly self-important,
going to college, etc.)
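
One possible way to picture a usual-story node with linked specialisations is sketched below. The class `UsualStory`, its method names, and the decision to attach ‘smarter’ to the "some learn faster" step are all assumptions made for this example; the post does not specify a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class UsualStory:
    """A node holding the ordered steps of the 'usual story' for one concept."""
    concept: str
    steps: list
    # Maps a specialised concept to (index of the parent step, child story).
    specialisations: dict = field(default_factory=dict)

    def specialise(self, step_index, child):
        """Link a child story to one specific step of this story."""
        if not 0 <= step_index < len(self.steps):
            raise IndexError("no such step in the parent story")
        self.specialisations[child.concept] = (step_index, child)

    def context_of(self, concept):
        """Return the parent step that a specialised concept hangs from."""
        step_index, _child = self.specialisations[concept]
        return self.steps[step_index]


intellect = UsualStory("intellect", [
    "children are born with limited intellect",
    "they learn",
    "some learn faster, some learn more",
    "humans use intellect to solve problems",
    "they teach others",
])
smarter = UsualStory("smarter", [
    "getting praised in school",
    "becoming annoyingly self-important",
    "going to college",
])
# Assumption: 'smarter' is taken to specialise the third step (index 2).
intellect.specialise(2, smarter)
```

Here `intellect.context_of("smarter")` returns the step "some learn faster, some learn more", so a specialised concept can always be traced back to the part of the general story it elaborates.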

But remarkably, the mental skill required to think in terms of these
usual-stories is much closer to the novelist’s art than to anything
taught in comp-sci– to such a great extent that barely
acknowledges the topic as appropriate!

But I think I’ve found a leverage point, finally: pseudo-XML tagging of
the entries in Web *timelines*.

Because the authors of timelines are trying to limit themselves to the
most significant discrete events (in all of history), timelines do an
excellent job of prioritising human behaviors, and so of identifying the
most-useful limited vocabulary for human history.


person1 is born at place on date to mother person2 and father person3
person1 is educated at place by person2
person moves from place1 to place2
person creates creative-work
person founds social-institution
person joins social-institution
person discovers theory
person1 fights person2
person leads group with persons2-3-etc
group fights group
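
A minimal sketch of how one of these templates might be filled in and rendered as pseudo-XML follows. The `Event` class, the `to_pseudo_xml` function, and the choice of tag and attribute names are assumptions for illustration; the post does not define a tag set. The sample entry uses the ‘person creates creative-work’ template.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One timeline entry: a template name plus its filled-in roles."""
    template: str   # e.g. "creates", "founds", "joins"
    roles: dict     # role name -> filler, in template order
    date: str = ""


def to_pseudo_xml(event):
    """Render an event as a single pseudo-XML element (no escaping is done)."""
    inner = "".join(
        f"<{role}>{filler}</{role}>" for role, filler in event.roles.items()
    )
    date_attr = f' date="{event.date}"' if event.date else ""
    return f"<{event.template}{date_attr}>{inner}</{event.template}>"


entry = Event(
    "creates",
    {"person": "James Joyce", "creative-work": "Ulysses"},
    date="1922",
)
tagged = to_pseudo_xml(entry)
```

With these inputs `tagged` is `<creates date="1922"><person>James Joyce</person><creative-work>Ulysses</creative-work></creates>`. Keeping the roles in a plain dict means a new template only needs a new name and role list, not new code.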

These, then, become the ‘root-level’ usual-stories in the psychological

And my ‘newsdiff’ suggestion was about the need for an NLP system to have a
mega-timeline of history that’s continually updated with new
news-events, winnowing just the new developments from the standard
high-redundancy journalistic style.
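
The winnowing step might be sketched as a simple set difference over already-tagged events, as below. This assumes the hard part (turning news prose into template-shaped tuples) has been done elsewhere; the function name `newsdiff` is borrowed from the post, but its signature and the placeholder events are invented for this example.

```python
def newsdiff(timeline, incoming):
    """Return events from `incoming` not already in `timeline`, in arrival order.

    Events are assumed to be hashable tuples such as
    (template, filler1, filler2, ...).
    """
    known = set(timeline)
    fresh = []
    for event in incoming:
        if event not in known:
            known.add(event)   # also suppresses repeats within `incoming`
            fresh.append(event)
    return fresh


# Placeholder events, not real data.
timeline = [
    ("founds", "person-A", "institution-X"),
    ("joins", "person-B", "institution-X"),
]
incoming = [
    ("joins", "person-B", "institution-X"),   # already known
    ("fights", "group-1", "group-2"),         # new
    ("fights", "group-1", "group-2"),         # repeated in the same batch
]
fresh = newsdiff(timeline, incoming)
```

Only the single new ‘fights’ event survives in `fresh`; the restated and repeated items are dropped, which is the redundancy the paragraph above wants filtered out.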

ai faq:
timelines project: “Relentlessly intelligent
yet playful, polymathic in scope of interests, minimalist
but user-friendly design.” –Milwaukee Journal-Sentinel