This page in my kid’s book from school to learn how to read.

Mildly Infuriating@lemmy.world – 686 points – 10 months ago

In a sense... yes! Although of course the prediction is thought to happen across many modalities and time-scales, not just text. Another crucial piece of the picture is the Bayesian aspect, which involves estimating one's uncertainty over predictions. Further info: https://en.wikipedia.org/wiki/Predictive_coding
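
To make the "uncertainty over predictions" part a bit more concrete, here's a toy sketch in Python. It is only a textbook Gaussian belief update, not an actual predictive coding model of the brain, and the names `Belief` and `update` are made up for illustration. The idea is that the prediction error gets weighted by how uncertain the current belief is relative to the incoming evidence.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    mean: float      # current prediction
    variance: float  # uncertainty about that prediction


def update(belief: Belief, observation: float, obs_variance: float) -> Belief:
    """Conjugate Gaussian update: shift the prediction by a weighted error."""
    error = observation - belief.mean  # prediction error
    # Weight between 0 and 1: high when the belief is uncertain
    # compared to the observation noise (an assumed, fixed value here).
    gain = belief.variance / (belief.variance + obs_variance)
    return Belief(
        mean=belief.mean + gain * error,
        variance=(1.0 - gain) * belief.variance,
    )


# Start out very unsure, then see the same observation three times.
belief = Belief(mean=0.0, variance=1.0)
for obs in [1.0, 1.0, 1.0]:
    belief = update(belief, obs, obs_variance=1.0)
    print(belief)
```

Each repeated observation pulls the prediction closer to 1.0 and shrinks the variance, so later surprises move the belief less than early ones.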

It's also important to note the recent trends towards so-called "Embodied" and "4E cognition", which emphasize the importance of being situated in a body, in an environment, with control over actions, as essential to explaining the nature of mental phenomena.

But yeah, it's very exciting how in recent years we've begun to tap into the power of these kinds of self-supervised learning objectives for practical applications like Word2Vec and Large Language/Multimodal Models.
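
For anyone wondering what "self-supervised" means in practice, here's a rough Python sketch of how a Word2Vec-style (skip-gram) setup gets its training examples straight from raw text, with no human labels. The function name `skipgram_pairs` and the `window` parameter are just illustrative choices, and the real thing involves a lot more (subsampling, negative sampling, the actual embedding model).

```python
def skipgram_pairs(tokens: list[str], window: int = 1) -> list[tuple[str, str]]:
    """Return (center, context) pairs; the 'labels' are just neighbouring words."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs("the cat sat".split())
print(pairs)
# [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
```

A model is then trained to predict the second item of each pair from the first, so the text supervises itself.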

We can have robots with bodies that talk and form relationships with people now. Not deep intimate relationships, but simple things like maintaining conversations with people. You wouldn’t need much more software on top of the LLM to make a really functional person.

I have to disagree about that last sentence. Augmenting LLMs to have any remotely person-like attributes is far from trivial.

The current thinking in the field centers on so-called "Objective Driven AI", which proposes strategies to decouple the AI's internal "world model" from its language capabilities, in order to facilitate hierarchical planning and mitigate hallucination.

The latter half of this talk by Yann LeCun addresses this topic too: https://www.youtube.com/watch?v=pd0JmT6rYcI

It's very much an emerging and open-ended field with more questions than answers.