We Have Made No Progress Toward AGI

  • The hard part is that all of the things the author says disprove LLM intelligence are failings of humans too.

    * Humans tell you how they think, but it seemingly is not how they really think.

    * Humans tell you repeatedly they used a tool, but they did it another way.

    * Humans tell you facts they believe to be true but are false.

    * Humans often need to be verified by another human and should not be trusted.

    * Humans are extraordinarily hard to align.

    While I am sympathetic to the argument, and I agree that machines pursuing their own goals over a longer timeframe are still science fiction, I think this particular argument fails.

    GPT o3 is a better writer than most high school students at the time of graduation. GPT o3 is a better researcher than most high school students at the time of graduation. GPT o3 is better at lots of things than any high school student at the time of graduation. It is a better coder than the vast majority of first-semester computer science students.

    The original Turing test has been shattered. We keep setting progressively harder standards for what counts as human intelligence, and as soon as we settle on another one, we quickly achieve it.

    The gap is elsewhere: look at Devin to see the limitation. Its ability to follow its own goal plans is the next frontier, and maybe we don't want to solve that problem yet. What if we just decide not to solve that particular problem and lean further into the cyborg model?

    We don't need them to replace humans - we need them to integrate with humans.

  • My understanding was that chain-of-thought is used precisely BECAUSE it doesn't reproduce the same logic that simply asking the question directly does. In "fabricating" an explanation for what it might have done if asked the question directly, it has actually produced correct reasoning. Therefore you can ask the chain-of-thought question to get a better result than asking the question directly.

    I'd love to see the multiplication accuracy chart from https://www.mindprison.cc/p/why-llms-dont-ask-for-calculator... with the output from a chain-of-thought prompt.
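
    A rough sketch of that comparison, purely as an illustration: ask_llm below is a hypothetical stand-in for whatever model call you actually use, and the two prompts are my own assumptions, not the ones from the linked post.

      import random
      import re

      def ask_llm(prompt: str) -> str:
          """Hypothetical stand-in for a call to the model being tested."""
          raise NotImplementedError("replace with a real model call")

      def last_number(text):
          # Pull the final integer out of the model's reply.
          nums = re.findall(r"-?\d[\d,]*", text)
          return int(nums[-1].replace(",", "")) if nums else None

      def accuracy(template, digits, trials=50):
          # Fraction of random digits-by-digits multiplications answered correctly.
          correct = 0
          for _ in range(trials):
              a = random.randint(10 ** (digits - 1), 10 ** digits - 1)
              b = random.randint(10 ** (digits - 1), 10 ** digits - 1)
              reply = ask_llm(template.format(a=a, b=b))
              correct += (last_number(reply) == a * b)
          return correct / trials

      direct = "What is {a} * {b}? Answer with only the number."
      cot = "What is {a} * {b}? Work it out step by step, then give the final number."

      for digits in (2, 3, 4, 5, 6):
          print(digits, accuracy(direct, digits), accuracy(cot, digits))

    Plotting the two accuracy curves against digit count would give the chain-of-thought counterpart to the chart in the linked post.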

  • I mildly disagree with the author, but I would also be happy to argue his side on some of his points:

    Last September I used ChatGPT, Gemini, and Claude in combination to write a complex piece of code from scratch. It took four hours and I had to be very actively involved. A week ago o3 solved it on its own: the Python version ran as-is, but the Common Lisp version needed some tweaking (maybe 5 minutes of my time).

    This is exponential improvement, and it is not so much the base LLMs getting better; rather, it is familiarity with me (chat history) and much better tool use.

    I may be incorrect, but I think improvements in very long user-event and interaction context, increasingly intelligent tool use, perhaps some form of RL to develop per-user policies for correcting bad tool use, and increasingly good base LLMs will get us to a place where, in the domain of digital knowledge work, we will have personal agents that are AGI for a huge range of use cases.

  • So the “reasoning” text from OpenAI is no more than the old broken Windows “loading” animation.

  • One point that I think separates AI and human intelligence is an LLM's inability to tell me how it feels or to give its individual opinion on things.

    I think to be considered alive you have to have an opinion on things.

  • Fascinating look at how AI actually reasons. I think it's pretty close to how the average human reasons.

    But he's right that the efficiency of AI is much worse, and that matters, too.

    Great read.

  • People ditch symbolic reasoning for statistical models, then are surprised when the model does, in fact, use statistical features and not symbolic reasoning.

  • > All of the current architectures are simply brute-force pattern matching

    This explains hallucinations, and I agree with the 'braindead' argument. To move toward AGI, I believe some kind of social-awareness component should be added, since that is an important part of human intelligence.

  • The author says we made no progress toward AGI, but also gives no definition for what the "I" in AGI is, or how we would measure meaningful progress in this direction.

    In a somewhat ironic twist, it seems like the author's internal definition of "intelligence" fits much closer with 1950s good old-fashioned AI, doing proper logic and algebra. Literally all the progress we have made in AI in the last 20 years is precisely because we abandoned this narrow-minded definition of intelligence.

    Maybe I'm a grumpy old fart, but none of these are new arguments. Philosophy of mind has an amazingly deep and colorful wealth of insights into this matter, and I don't know why it is not required reading for anyone writing a blog on AI.

  • I really dislike what I now call the American We.

    "We made it!" "We failed!" written by somebody who doesn't have the slightest connection to the projects they're talking about. e.g. this piece doesn't even have an author but I highly doubt he has done anything more than using chatgpt.com a couple times.

    Maybe this could be Neumann's law of headlines: if it starts with We, it's bullshit.

  • So? Who even wants it? Whatever the definition is, it sounds like AGI and sentient AI are really close concepts. Sentient AI is like a can of worms for ethics.

    On the other hand, while definitely not having AGI, we do have all these building blocks for AI tools to build on top of for decades to come. We've only barely scratched the surface of it.

  • This idea that AI can improve itself seems to me to be in violation of the second law. I'm not a physicist by training, merely an engineer, but my argument is as follows:

    - I think the reason humans are clever is because nature spent 6 billion years * millions of energetic lifetimes (that is, something on the order of quettajoules of energy) optimizing us to be clever.

    - Life is a system which does nothing more than optimize and pass on information. An organism is a thing which reproduces itself well enough to pass its DNA (i.e., information) along. In some sense, it is a gigantic heat engine which exploits the energy gradient to organize itself, in the manner of a dissipative structure [1].

    - Think of how "AI" was invented: all of these geometric intuitions we have about deep learning, all of the cleverness we use, to imagine how backpropagation works and invent new thinking machines. All of the cleverness humanity has used to create the training dataset for these machines. This cleverness. It could not arise spontaneously, instead, it arose as a byproduct, from the long existence of a terawatt energy gradient. This captured energy was expended, to compress information/energy from the physical world, in a process which created highly organized structures (human brains) that are capable of being clever.

    - The cleverness of human beings and the machines they make is, in fact, nothing more than the byproduct of an elaborate dissipative structure whose emergence and continued organization requires enormous amounts of physical energy: 1-2% of all solar radiation hitting earth (terawatts), times 3 billion years (existence of photosynthesis).

    - If you look at it this way, it's incredibly clear that the remarkable cleverness of these machines is nothing more than a bounded image of the cleverness of human beings. We have a long way to go before we are training artificial neural networks with energy on the order of 10^30 joules [2]. Until then, we will not become capable of making machines that are cleverer than human beings.

    - Perhaps we could make a machine that is cleverer than one single human. But we will never have an AI that is more clever than a collection of us, because the thing itself must be, in a 2nd law sense, less clever than us, for the simple reason that we have used our cleverness to create it.

    - That is to say, there is no free lunch. A "superhuman" AI will not happen in 10, 100, or even 1,000 years unless we find the vast amount of energy (10^30 J) that will be required to train it. Humans will always be better and smarter. We have had 3 billion years of photosynthesis; this thing was trained in, what, 120 days? A petajoule?

    [1] https://pmc.ncbi.nlm.nih.gov/articles/PMC7712552/

    [2] Where do we get 10^30J?

    Total energy hitting earth in one year: 5.5×10^24 J

    Fraction of that energy used by all plants: 0.05%

    Time plants have been alive on earth: 3 billion years

    You get to about 8×10^30 J if you multiply these numbers; round down to the order of 10^30.
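
    A quick back-of-the-envelope check of that arithmetic (the inputs are the rough estimates listed above, not measured values):

      # Rough check of the ~10^30 J figure using the numbers above.
      solar_energy_per_year_j = 5.5e24   # total solar energy reaching Earth per year (J)
      plant_fraction = 0.0005            # ~0.05% of that captured by all plants
      years_of_photosynthesis = 3e9      # ~3 billion years of photosynthesis

      total_j = solar_energy_per_year_j * plant_fraction * years_of_photosynthesis
      print(f"{total_j:.2e} J")          # ~8.25e+30 J, i.e. on the order of 10^30 J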

  • A red flag nowadays is when a blog post tries to judge whether AI is AGI, because these goalposts are constantly moving and there is no agreed-upon benchmark to meet. More often than not, the author reasons from their own perspective about why exactly something is not AGI yet, while another user happily uses AI as a full-fledged employee, depending on the use case. I'm personally using AI as a coding companion, and it seems to be doing extremely well for being brain dead, at least.

  • Imma be honest with you, this is exactly how I would do that math, and that is exactly the lie I would tell if you asked me to explain it. This is me-level AGI.

  • > Which means these LLM architectures will not be producing groundbreaking novel theories in science and technology.

    Is it not possible that new theories and breakthroughs could result from this so-called statistical pattern matching? The information necessary could be present in the training data and the relationship simply never before considered by a human.

    We may not be on a path to AGI, but it seems premature to claim LLMs are fundamentally incapable of such contributions to knowledge.

    In fact, it seems that these AI labs are leaning in such a direction. Keep producing better LLMs until the LLM can make contributions that drive the field forward.