LLMs Don't Know What They Don't Know–and That's a Problem

  • The day can’t come fast enough when we just see things like this as trivial misuse of the tool, like using a hammer to drive in a screw. We use the hammer for nails and the screwdriver for screws. We use the LLM for exploring data with language, and our brains for reasoning.

  • Claude does ask clarifying questions, or asks me to provide something it does not know; at least, that has happened to me many times. At other times I have to ask whether it needs X or Y to answer more accurately, though this may be the case with other LLMs too. The former, though, was quite a surprise to me, coming from GPT.

  • I would suggest that LLMs don't actually know anything. The knowing is inferred.

    An LLM might be seen as a kind of very elaborate linguistic hoax (at least as far as knowledge and intelligence are concerned).

    And I like LLMs, don't get me wrong. I'm not a hater.

  • I wonder how much of this is an inherent problem that is hard to engineer a solution for, versus "confidently guessing the answer every time yields a +x% gain for a model on all of the other benchmark results, so nobody wants to reward the opposite of that" (see the sketch below).
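
    To make that incentive concrete, here is a sketch with made-up numbers, assuming a grading scheme that gives one point per correct answer and nothing for either a wrong answer or an "I don't know"; no real benchmark is being quoted here.

    ```python
    # Expected score under an assumed right/wrong-only grading scheme:
    # no credit for abstaining, no penalty for a wrong guess.

    def expected_score(p_known: float, guess_accuracy: float, abstain: bool) -> float:
        """Score for a model that truly knows a fraction p_known of the questions
        and either abstains on the rest or guesses with the given accuracy."""
        known_points = p_known
        if abstain:
            unknown_points = 0.0                             # honesty earns nothing
        else:
            unknown_points = (1 - p_known) * guess_accuracy  # wrong guesses cost nothing
        return known_points + unknown_points

    # Knowing 70% of the answers and guessing at 25% accuracy on the rest:
    print(expected_score(0.70, 0.25, abstain=True))   # 0.70
    print(expected_score(0.70, 0.25, abstain=False))  # 0.775, the confident guesser wins
    ```

    Under that assumed scheme, a model that abstains is strictly outscored by one that always guesses, which is the reward structure the comment is pointing at.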

  • I use Copilot every day, and every day I'm more and more convinced that LLMs aren't going to rule the world but will continue to be "just" neat autocomplete tools whose utility degrades the more you expect from them.

  • Well, humans don't know what they don't know either. I think the bigger problem is that LLMs don't know what they do know.

  • Is it that they're overconfident, or that we are overconfident in their responses?

    LLMs aren't an all-knowing power, much like ourselves, but we still take the opinions and ideas of others as true to some extent.

    If you are using LLMs and taking their outputs as complete truths or working products, then you're not using them correctly to begin with. You need to exercise a degree of professional and technical skepticism with their outputs.

    Luckily, LLMs are moving into the arena of being able to reason with themselves and test their assumptions before giving us an answer.

    LLMs can push me in the wrong direction just as much as an answer to a problem on a forum can.

  • LLMs don't know, period. They can be useful for summarizing widely and redundantly published information, but they don't "know" even that.

  • To quote someone else: "at least when I ask an intern to find something they'll usually tell me they don't know and then flail around; AI will just lie with full confidence to my face"

  • LLMs learn from the internet and refuse to admit they don't know something. I have to admit I'm not entirely surprised by this.

  • I consider this to be a solved problem. Reasoning models are exceptionally good at this. In fact, if you use ChatGPT with Deep Research, it can bug you with questions to the point of annoyance!

    It could also have been the fact that my custom GPT instructions included stuff like “ALWAYS clarify something if you don’t understand. Do not assume!” (see the sketch below for an API-side equivalent).
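
    For what it's worth, here is a minimal sketch of applying the same kind of instruction outside the custom-GPT UI, assuming the OpenAI Python SDK; the model name, instruction wording, and user prompt are placeholders, not anything from the comment above.

    ```python
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # System message mirroring the kind of custom instruction quoted above;
    # the exact wording and the model name are illustrative.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "ALWAYS ask a clarifying question if anything is ambiguous. Do not assume."},
            {"role": "user",
             "content": "Summarise this quarter's incident reports."},
        ],
    )
    print(response.choices[0].message.content)
    ```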

  • I find most articles of the sort "LLMs have this flaw" to be a kind of cynical one-upmanship.

    "If you say please LLMs think you are a grandma". Well then don't say you are a grandma. At this point we have a rough idea of what these things are, what their limitations are, people are using them to great effect in very different areas, their objective is usually to hack the LLM into doing useful stuff, while the article writers are hacking the LLM into doing stuff that is wrong.

    If a group of guys is making applications with an LLM and another dude is making shit applications with the LLM, am I supposed to be surprised by the latter instead of the former? Anyone can make an LLM do weird shit; the skill and the area of interest are in the former.