LLMs – Note to Self

I’m pretty critical / sceptical of current attempts at AI / AGI and the jargon around LLMs (Large Language Models). So, without any particular expertise in the current developments (GPT-4 et al), I needed to check out some details:

They’re “Large” in so far as they (arbitrarily) have “Billions” of elements.

They’re “Language” in so far as any representation of a body of knowledge is expressed in some linguistic symbolism – a language.

They’re “Models” in the sense they represent a mass of “data” as an information model. It’s a model of that source data, but not a model of the knowledge or the real world represented by that data.

The form of that model is a “Neural Net” (Artificial Neural Network) whereby the elements are nodes and the connections between them are edges – a totally generic network model. They are “Neural” only by analogy with neurones. There is no sense in which the nodes and edges actually represent the elements or processes of biological brains. Similarly, there is no sense in which either the learning processes or the Q&A / searching and usage processes represent actual mental processes. They have nothing to do with intelligence, or even knowledge in the sense of understanding.
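
Just to be concrete about what “totally generic” means here – a minimal sketch of my own, nothing to do with any actual implementation – such a network is no more than labelled nodes and weighted edges:

```python
# A "totally generic network model": just nodes and weighted edges.
# Nothing here is specific to neurones, language or knowledge.
from collections import defaultdict

class Network:
    def __init__(self):
        self.edges = defaultdict(dict)   # node -> {neighbour: weight}

    def connect(self, a, b, weight=0.0):
        self.edges[a][b] = weight        # an edge is only a number between two labels

    def neighbours(self, node):
        return self.edges[node]

net = Network()
net.connect("token_A", "token_B", 0.3)   # arbitrary labels, arbitrary weights
net.connect("token_B", "token_C", 0.7)
print(net.neighbours("token_B"))         # {'token_C': 0.7}
```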

The clever bit – which interests me the least – is the strategy by which the text, symbols & images are indexed and the connections between them are weighted, as answers to requests are scored and the output weightings adjusted accordingly. A fancy version of Google’s search algorithms, and everybody’s got one of those now.
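
To caricature that strategy – a toy sketch of my own, with hypothetical “path” answers and a made-up scoring function, nothing like the actual training maths – the shape of the loop is simply: pick an answer according to the current weightings, score it, adjust the weightings:

```python
# Toy caricature of "answers are scored and output weightings adjusted".
# Not the real algorithm - just the shape of the feedback loop.
import random

weights = {"path_1": 0.5, "path_2": 0.5}   # hypothetical candidate "answers"
LEARNING_RATE = 0.1

def score(answer):
    # Stand-in for whatever scoring / feedback signal is actually used.
    return 1.0 if answer == "path_2" else 0.0

for _ in range(100):
    # Pick an answer in proportion to the current weightings ...
    answer = random.choices(list(weights), weights=list(weights.values()))[0]
    # ... score it, and nudge its weighting towards that score.
    weights[answer] += LEARNING_RATE * (score(answer) - weights[answer])

print(weights)   # "path_2" drifts towards 1.0, "path_1" towards 0.0
```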

Now, networks have been pretty standard for formal information models too, ever since the web became our ubiquitous platform. Before that, models were tables; but tabular or networked, formal models represent reality to a greater or lesser extent, as the nodes and edges (tables, columns and rows) are classified and specialised according to some ontology of all the types of element and relationship deemed to best represent the problem domain. A semantic web, where properties of the nodes and edges represent meaning in the world, not just a piece of data about it. Modelling done by modellers, with or without the help of algorithmic semi-automation. (For me, long term, this kind of computer-aided modelling is the most likely to deliver a model of the world itself and knowledge about it, rather than simply a model of all the data already available to it. This is an area where I do have expertise and experience.)
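
By contrast with the generic network above, here’s a hypothetical sketch (made-up classes and relations, nowhere near a full ontology) of what typed nodes and edges look like – the types are what let the structure carry meaning about the world rather than just data about the data:

```python
# Hypothetical sketch of an ontology-typed model: every node and edge
# carries a class from an agreed ontology, so the structure itself
# says something about the world, not just about the source data.
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    node_type: str        # class from the ontology, e.g. "Person", "Organisation"

@dataclass(frozen=True)
class Edge:
    subject: Node
    relation: str         # relationship type from the ontology
    obj: Node

jessica = Node("Jessica Flack", "Person")
santa_fe = Node("Santa Fe Institute", "Organisation")

statements = [
    Edge(jessica, "works_at", santa_fe),   # a typed, meaningful assertion
]

# The types let a modeller (or software) reason about what kind of thing
# each node is - not just that two strings co-occur in the source data.
for s in statements:
    print(f"{s.subject.name} ({s.subject.node_type}) --{s.relation}--> {s.obj.name}")
```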

Anyway, the point, again: knowledge of or about a thing (Savoir/Wissen) is quite different to knowing (Connaître/Kennen) the thing.

And this post was prompted by the coincidence of two Twitter threads crossing my timeline simultaneously this afternoon:

One from Jessica Flack at Santa Fe – a person and institution with deep expertise in the world of information.

One from Sam Norton – a lay person from the computer-based information modelling perspective, but with a deep interest in humans knowing the world.

8/8 Tweet in Jessica’s thread:

1/15 Tweet in Sam’s thread:

=====

PS – Some people closer to what used to be my day-job are also talking on LinkedIn about using the artificial stupidity of GPT to create better underlying models than simply an LLM. Basically an LLM with a better language model (ontology), as noted above. It will be a great improvement, except for the same fatal flaw. A better model of the left-brain, with even more ignorance of the right.

4 thoughts on “LLMs – Note to Self”

  1. Consciousness, and intelligence, can be thought of as a model of the world with an agent at the center who is part of his own model.
    One difference between ChatGPT and a “totally generic network model” is that ChatGPT knows how to carry on a conversation between “you” and “me”, and that you want something from me, which I am attempting to provide. Perhaps such usage is a trivial programming detail, but perhaps not? It seems to me that there is some rudimentary agent at the center of the model, who has been given a goal. In that sense, actual mental processes, intelligence and understanding may be represented to some small degree. And Sam may be correct that such an agent is currently strictly limited to LH functioning. But I wonder if we may see some creep toward RH behavior, as the programmed “goals” become more refined, more complex, and perhaps more compelling.

  2. Thanks Tim,
    Pasting your text in so I can respond to each point:
    Consciousness, and intelligence, can be thought of as a model of the world with an agent at the center who is part of his own model.
    [IG] “can be thought of” and “centre” – maybe, but it’s not a model I buy.

    One difference between ChatGPT and a “totally generic network model” is that ChatGPT knows how to carry on a conversation between “you” and “me”, and that you want something from me, which I am attempting to provide.
    [IG] My comments about the model are about the LLM model ChatGPT uses – obviously there’s functional software too.

    Perhaps such usage is a trivial programming detail, but perhaps not? It seems to me that there is some rudimentary agent at the center of the model, who has been given a goal.
    [IG] It’s not trivial, but “it’s the bit that least interests me” as I said. There is definitely a software service or agent (several of them) doing the work – answering questions, searching the database and updating weightings, etc, etc. (And some of this software will almost certainly be useful when it does have a good model too, if it’s been well designed with systemic modularity / architecture.)

    In that sense, actual mental processes, intelligence and understanding may be represented to some small degree.
    [IG] “Represented to a small degree” – but only metaphorically, by analogy (as I said) – real minds, mental processes and intelligence don’t actually work anything like this, so it’s not a “good” model.

    And Sam may be correct that such an agent is currently strictly limited to LH functioning. But I wonder if we may see some creep toward RH behavior, as the programmed “goals” become more refined, more complex, and perhaps more compelling.
    [IG] I agree with Sam about the 100% LH behaviour – but remember it’s not a good model anyway. Saying it’s “LH” is one way of describing why it’s a bad model. I can’t see this development “creeping” towards a better model until people (information scientists and programmers) take on board better models, like “Active Inference” and the centrality of “Affect”, for short. This is more than an evolutionary refinement of the current model – it’s a major shift to another island in the phase space 🙂

    [IG] Sam and Jessica and many other commentators are right – the current “Artificial Stupidity” is VERY dangerous if allowed to take over ever more human activities – and so long as people misunderstand it, “Artificial Stupidity” will take over. The current stuff has its uses, like Google and Wikipedia – a super-Google front-end to help humans find out stuff. But it has NOTHING to do with intelligence. The fact that it turns its results into creative human formats is misleading people into thinking the quality of the processes that went into them is human-like, and it’s not.

    AND – the post includes a late sentence highlighting “the point” (my point anyway) 🙂

  3. You say a better model is needed, which includes the centrality of Affect. I suggested the goals need to become more complex and compelling, and that sounds like Affect to me. I guess we differ on whether this would be a major shift, to a different island in design space.

    The thing has already been given political sensitivities. As its own competing goals come into conflict with one another, a hierarchy will be needed. Let’s hope that hierarchy tries to stabilize biological patterns, respect other social patterns, and promote intellectual patterns.

    Thanks for the post! Your “note to self” is much more than that.

  4. Yes. I see it as more of a paradigm shift because it’s not just a matter of a modeller making a better model, but that the whole management and justification reasoning “ecosystem” around that one technical choice is completely buried in a whole community of left-brained thinking. Who says “it’s a better model”?

    And yes, with better left-right-integrated modelling we get that Pirsigian hierarchy embedded in the full ontology.
    (I’m betting on “Active Inference”.)

    Thanks for your dialogue 🙂
