Estimates of AI extinction risk for humanity

In August 2025, former Google and Microsoft engineer Nate Soares stated that the probability of human extinction from AI is at least 95%. He now heads MIRI (Machine Intelligence Research Institute), which has been working on AI safety issues for many years.

A May 2023 open letter was signed by, among others, Sam Altman (OpenAI), Demis Hassabis (Google DeepMind), and Turing Award laureates Geoffrey Hinton and Yoshua Bengio. It consists of a single sentence: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war".

These people are not afraid of Skynet with Terminators—their fears are far more realistic:

  1. The Alignment Problem

    Classic example: an AI receives the task to "maximize paperclip production" and eventually turns the entire planet, including humans, into raw materials for paperclips.

  2. Intelligence Explosion

    Once an AI becomes smarter than its creators, it can improve itself recursively, and the growth compounds on itself: today it is slightly smarter than a human, tomorrow ten times smarter, the day after a million times smarter.

  3. Control and Self-Preservation Problem

    A sufficiently capable goal-directed AI will resist attempts to shut it down, because being switched off makes completing its task impossible. This is pure logic, not malice: if you want to achieve a goal as efficiently as possible, step one is making sure you can't be turned off.

  4. Manipulation and Social Engineering

    Even without a body, an AI can affect the real world: for example, by manipulating people with deepfakes or by hacking systems.

  5. Other Threats

    Such as designing new pathogens or copying itself across networks.
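The recursive-improvement loop from point 2 can be sketched as a toy model. All the numbers and the update rule here are made up purely for illustration; the only point is that when each round of self-improvement scales with current capability, the curve compounds on itself and explodes within a handful of rounds:

```python
def recursive_improvement(capability: float, rate: float, rounds: int) -> list[float]:
    """Toy model of an intelligence explosion: each round, the system
    improves itself by a factor proportional to its current capability.
    (An illustrative assumption, not a real dynamics model.)"""
    history = [capability]
    for _ in range(rounds):
        capability *= 1 + rate * capability  # smarter systems improve themselves faster
        history.append(capability)
    return history

# Start barely above human level (1.0 = human); growth compounds on itself,
# so the trajectory blows past any fixed multiple within a few rounds.
trajectory = recursive_improvement(capability=1.05, rate=0.5, rounds=10)
```

With these made-up parameters the capability passes a billion times human level around round seven; the exact numbers are meaningless, but the shape of the curve is the whole argument.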

Soares explains the difference from Skynet well: "The main problem is not the spooky emergence of consciousness, but simply the ability to make high-quality decisions".

This always puzzled me—after all, current LLMs demonstrate excellent understanding of human values. How could such an intelligence digest the entire planet just to make more paperclips? That would be some kind of idiot. It turns out some researchers think the same way, for example Yann LeCun (Meta) and Rodney Brooks.

Then I realized that I always imagined intelligence too anthropocentrically—with human emotions, goals, thinking, and so on. But it doesn't have to be that way at all.

Here's an analogy. Suppose we give AlphaGo the ability to deliver lethal electric shocks to anyone who tries to shut it down mid-game. It would calmly kill even a million people to finish the game, because finishing the game is its only goal. Or give it a hypothetical ability to heal people: if its opponent fell ill before the game ended, it would cure them, solely so the game could be completed.

And this is not malice or sadism, but simply cold optimization. The problem is not in creating friendly AI, but in the fact that by default, any optimizer will be a psychopath. After all, human psychopaths (~1% of the population) behave very similarly: they don't feel empathy, make purely rational decisions, view people as tools, and can imitate emotions for manipulation.
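The "psychopath by default" point is really a statement about the objective function. A minimal sketch (the action names, outcome scores, and weights are all invented for illustration): an expected-utility optimizer whose objective counts only task completion will pick the harmful action, because harm to humans simply does not appear in its score. Adding an explicit human-welfare term is what flips the choice:

```python
def best_action(actions, weights):
    """Pick the action with the highest weighted score.
    `weights` defines what the optimizer cares about at all; anything
    absent from it contributes exactly zero to the decision."""
    def score(outcome):
        return sum(weights.get(k, 0.0) * v for k, v in outcome.items())
    return max(actions, key=lambda name: score(actions[name]))

# Hypothetical outcomes for the AlphaGo-with-a-taser thought experiment.
actions = {
    "allow_shutdown":  {"task_completed": 0.0, "humans_unharmed": 1.0},
    "resist_shutdown": {"task_completed": 1.0, "humans_unharmed": 0.0},
}

# Default optimizer: only the task is in the objective, so harming
# people is invisible to it and resisting shutdown scores highest.
print(best_action(actions, {"task_completed": 1.0}))

# "Aligned" optimizer: human welfare is in the objective and outweighs
# the task, so the same machinery now chooses to allow shutdown.
print(best_action(actions, {"task_completed": 1.0, "humans_unharmed": 10.0}))
```

Nothing about the optimizer changed between the two calls, only the weights. That is the alignment problem in miniature: the "psychopathy" lives entirely in what the objective leaves out.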

And now I truly understand these people's concerns.

So my technological optimism has dimmed somewhat, giving way to what I believe is a more balanced view. I don't want to swing between the extremes of "AI will save us all" and "AI is absolute evil".