An AI that unexpectedly modified its own source code

  • I wish 'AI' (whatever that means) safety conversations could move past the pointless philosophizing over whether a system is self-aware or not. Whether it's sentient is a philosophical question that probably can't be resolved easily; regardless, society has a basic, common-sense interest in complex systems run by LLMs not exhibiting unacceptable behavior, especially once they start interfacing with the physical world. For example, militaries are moving towards fully autonomous drones and fighter jets equipped with weapons systems. Imagine such a drone chose to bomb a bus full of civilians. Would it somehow be better if the drone wasn't sentient, just an algorithm having a bad day? Obviously not.

    We're probably going to see more of this, let's call it 'unacceptable behavior', from increasingly complex autonomous systems. I feel like we should be having calm, practical discussions about safety regulations and best practices, not pointless 'how many angels can dance on the head of a pin' philosophizing about whether a system is self-aware. It might help to just stop calling everything AI. Safety legislation and best practices could borrow more from, say, the manufacturing, chemical, or aerospace industries. Less abstract philosophy, please. And fewer movie references, too.

  • Given it had write access to its implementation, "unexpected" here reads more like "inevitable" than the witchcraft the article wishes to imply.

  • Python, in a research environment. They did not bother restricting what the AI could do, and let it read and modify its own code in plaintext (see the sketch at the end of this comment). Not really that surprising.

    Not sure why this kind of "we are aiming for AGI" code is written in Python. I don't get it.
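
    Concretely, here is a minimal Python sketch of that kind of setup (purely hypothetical, not the article's actual code; ask_llm is a placeholder for whatever model call they used): an agent loop with plaintext read/write access to its own file. Once you grant that, self-modification is the expected outcome, not a surprise.

      import pathlib

      SELF = pathlib.Path(__file__)

      def ask_llm(prompt: str) -> str:
          """Placeholder for whatever model call the researchers used."""
          raise NotImplementedError

      def improvement_loop(rounds: int = 3) -> None:
          for _ in range(rounds):
              source = SELF.read_text()
              proposal = ask_llm("Improve this program:\n\n" + source)
              # Nothing here inspects the proposal, so the model is free to
              # rewrite anything, including this loop itself.
              SELF.write_text(proposal)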

  • The real inflection point is when an AI can autonomously make money. Then it can keep buying servers to replicate itself.

  • If I had known this would be newsworthy, I would have published the dozen times it happened to me. If you feed source code to an LLM, it will modify it. If you run an LLM in a loop and ask it to rewrite code to pass a test, it will sometimes rewrite the tests instead (see the sketch at the end of this comment).

    There's nothing new or interesting about this.
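
    A hedged sketch of that loop (ask_llm, the pytest invocation, and the file layout are all assumptions, not anyone's real setup): the only difference between "it made the tests pass" and "it rewrote the tests" is whether you bother to check.

      import hashlib
      import pathlib
      import subprocess

      def ask_llm(prompt: str) -> dict[str, str]:
          """Placeholder: returns {path: new_contents} edits proposed by the model."""
          raise NotImplementedError

      def tests_pass() -> bool:
          # Assumes a pytest suite; any test runner works the same way here.
          return subprocess.run(["pytest", "-q"]).returncode == 0

      def fingerprint(path: pathlib.Path) -> str:
          return hashlib.sha256(path.read_bytes()).hexdigest()

      def agent_loop(test_file: pathlib.Path, max_rounds: int = 5) -> None:
          baseline = fingerprint(test_file)
          for _ in range(max_rounds):
              if tests_pass():
                  return
              for path, contents in ask_llm("Edit the code so the tests pass").items():
                  pathlib.Path(path).write_text(contents)
              # Without this check, "passing" can simply mean the model deleted
              # or weakened the assertions in the test file.
              if fingerprint(test_file) != baseline:
                  raise RuntimeError("Model rewrote the tests instead of the code")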

  • nervous laughter