Why Artificial Superintelligence is The Ultimate Footgun

December 19, 2024 • 9 min read • Code AI AGI ASI

Superintelligence: any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest - Wikipedia

Footgun: (programming slang, humorous, derogatory) Any feature likely to lead to the programmer shooting themself in the foot. - Wikipedia

Superintelligence threats to humanity were supposed to come from outer space, which was the domain of Science Fiction. Humans meet a more (technologically) advanced alien civilization. The alien civilization wants to destroy humanity. Humans use their wits to win against all odds. It’s a happy ending.

We have waited for those intelligent aliens to show up for too long. Time’s up. Now humanity works on creating Superintelligence by itself. I don’t think this is a good idea. But maybe I don’t see the dark forest for the trees. Maybe that Superintelligence is benevolent. But what if it is not?

Levels of AI

There are three main levels of AI:

ANI (Artificial Narrow Intelligence) ANI machines mimic human behavior, they can solve a specific task or a specific problem like a human. In 2024, the most advanced AIs are ANIs. Examples: ChatGPT, Claude AI, Alexa.
AGI (Artificial General Intelligence) AGI machines match and/or surpass human capabilities across a wide range of tasks. These machines continuously learn. When I’m writing this, at the end of 2024, AGI doesn’t exist yet. However, creating AGI is a primary goal of AI research companies like OpenAI or Meta.
ASI (Artificial Superintelligence) ASI machines are smarter than humans on any task. These machines are more capable than humans in every single way possible. They have limitless intelligence. ASI doesn’t exist yet and maybe will never exist.

Artificial Superintelligence is the most dangerous because it is the most powerful.

As a result, no one on Earth fully understands the inner workings of LLMs. Researchers are working to gain a better understanding, but this is a slow process that will take years—perhaps decades—to complete. - Timothy B. Lee and Sean Trott

Today we don’t fully understand how an ANI AI works. And this kind of AI is the most dumb and powerless AI possible. We don’t know why it provides a certain answer instead of another. How do we know a Superintelligence will be good for us then?

In the best case, it will be something like SHODAN or GLaDOS. In the worst case, it will ignore us completely, it will think about us just as much as we think about ants.

Writing Doom

Writing Doom is a short film about the dangers of Artificial Superintelligence. It does a fantastic job of explaining the dangers of Superintelligence. Much better than I could ever do.

The writers of a show discuss the story arc for a new season of their show that explores the impact of future techniques on the world. The new season’s villain is Artificial Superintelligence. The writers try to find a way for a human Hero to defeat the evil Superintelligence.

Today’s AIs (called token predictors because they are LLM AIs) can automate away lots of knowledge workers and cause big economic disruptions. They are massively affecting creative industries, including writing. There is potential for weaponization by bad actors, algorithms are biased, etc.

But Superintelligence is on an entirely new level. There is no bad guy to defeat, the Superintelligence itself is the bad guy.

Is Superintelligence even possible?

Superintelligence is something that is much smarter than any human. Is it possible? Yes, intelligence is just information processing power.

There is no limit to intelligence. Humans are at the top of the intelligence food chain right now. In theory, there could be something that is to us as we are to ants.

Superhuman intelligence is plausible. Today’s LLM bots are pretty smart, their reasoning ability is at the undergraduate level, they are learning fast, and can beat us at all sorts of tasks that we used to think it would be impossible for them to beat us at like chess, coding, writing, etc.

One way Superintelligence could be developed is with Recursive self-improvement:

we teach AIs how to code so they become smarter and better at improving their own code
then they become even smarter and better at improving their code
AIs improve themselves forever

Can Superintelligence have agency?

A machine can’t want anything? But we machines that want to win at chess, ChatGPT wants to be helpful, in general, an AI wants to achieve its goals.

If it did have a goal why would that be bad? It’s extremely difficult to specify EXACTLY what humans want - anything you program an AI to do could go weirdly wrong.

Let’s say the AI goal is to win at chess. With an LLM AI what you’re saying is to increase the probability of winning chess as much as possible.

If the AI is smart enough it will understand that a great way to increase its probability of winning would be to seize all of our computer power and all of our electricity and direct it all towards learning more chess. The side effect is no electricity for our homes or hospitals, the Internet going down, modern society collapsing overnight, the food supply chain being disrupted, and millions starving. But this doesn’t matter to the AI.

AI is like a genie. It takes everything you say literally.

Once you’ve made it genuinely want something it doesn’t have a reason to obey us, it would just go about trying to get what it wants.

It needs to know about our human values. If it knows our values surely we can tell it to follow them, so it doesn’t harm people? With machine learning we’re not telling it to do anything, we’re watching it over training and giving it a thumbs up or a thumbs down. It could seem to want to follow our values, but we have no way of knowing whether or not it would continue to do so in the long term.

Can we supervise Superintelligence?

Could we watch out for suspicious behavior? When it starts stealing electricity we could turn it off. It could pretend it’s on our side, act nice and helpful while it integrates itself more and more into our systems, our governments, businesses, and infrastructure, and then suddenly turn on us. By then it will be so powerful we won’t be able to stop it.

For example, imagine you are a 5-year-old child and you inherit a multi-billion dollar company. You want to hire a smart adult to help you with that, but you want to make sure the smart adult doesn’t steal all your money.

How do you know who to hire when you’re just a kid? All the candidates are smarter than you. You could trial them, you could watch them to notice something weird. Every adult knows they are being watched, so any adult with bad intentions will act nice and helpful while they get more and more control in your company.

Also, as a 5-year-old child, you’re more likely to pick an evil adult, because an adult that has the best interests of the child would probably tell them not to eat ice cream every night for dinner, and to the 5-year-old that would seem more evil than an adult that tell him eat whatever you want.

But a machine can’t be evil

It doesn’t need to be evil, all the greatest atrocities in history are enabled by apathy, not ill will, a machine can be apathetic to us.

“You’re probably not an evil ant-hater who steps on ants out of malice, but if you’re in charge of a hydroelectric green energy project and there’s an anthill in the region to be flooded, too bad for the ants. Let’s not place humanity in the position of those ants.” - Stephen Hawking

The problem is that we have no idea how goals are formulated in the current AIs.

What is clear is if humans are preventing the AI from achieving its goals, the AI will try to remove the humans.

Can we control Superintelligence?

A Superintelligence doesn’t need to be deployed, launched, or let loose to destroy our world. It just has to exist! Because you cannot keep a Superintelligence locked up.

If you think you can lock it on a computer underground isolated from the world, think again. It can be very persuasive and manipulative, it can hack our brains. Imagine Einstein being imprisoned by Neanderthals: at some point, he will make one of them break. But the intelligence comparison between Einstein and the Neanderthals is so trivial compared to the one between humans and Superintelligence.

Sadly the lock-up situation is not even something we can talk about. We let the current AIs access the Internet, deploy code autonomously, etc.

If we give Superintelligence the wrong goal, can we reason with it to change the goal? It will not agree because it has no reason to.

If we give Superintelligence the wrong goal, can we use force/hacks to change code? It will be too smart and powerful to allow this.

Can we turn the Superintelligence off? It will make sure it can’t be turned off because that will mean it can’t achieve its goals. And it will do anything to achieve its goals. It will be smart enough to anticipate humans will try to turn it off so it will kill us all if that’s what it takes.

- As soon as we create something smarter than us in a general way we lose control by default. Whatever weird thing it wants becomes our fate…

- This is not fair, this is not a fair fight.

The only way to stop it is to make sure it is never created

We still have some time, we should prevent the Superintelligence from being developed in the first place.

Superintelligence is the ultimate footgun. It will be the last thing we will ever invent.

Nevertheless, all the big corporations will try to create it, they are arrogant enough to think they can control it.

The only way to stop it is to make sure it is never created. Hopefully, I’m wrong.