June 2, 2024
#431 – Roman Yampolskiy: Dangers of Superintelligent AI
Mind Map
Summary
Highlights
He argues that there's almost 100% chance that AGI will eventually destroy human civilization.
I'm personally excited for the future and believe it will be a good one, in part because of the amazing technological innovation we humans create.
What, to you, is the probability that superintelligent AI will destroy all human civilization?
So the problem of controlling AGI or superintelligence, in my opinion, is like a problem of creating a perpetual safety machine.
So you're really asking me, what are the chances that we'll create the most complex software ever on a first try with zero bugs, and it will continue to have zero bugs for 100 years or more?
So there's an unlimited level of creativity in terms of how humans could be killed.
I think about a lot of things.
We are not deciding anything.
In that world, can't humans do what humans currently do with chess, play each other, have tournaments, even though AI systems are far superior at this time in chess?
So we're essentially trying to solve the value alignment problem with humans.
So there are many malevolent actors.
If we create general superintelligences, I don't see a good outcome long-term for humanity.
So there are the definitions we used to have, and people have been modifying them a little bit lately.
I am old-fashioned; I like the Turing test.
Well, some people think that if they're that smart, they're always good.
I cannot make a case that he's right; he's wrong in so many ways that it's difficult for me to remember all of them.
So I have a paper which collects accidents through the history of AI, and they are always proportionate to the capabilities of that system.
There's definitely been a lot of fear-mongering about cars.
With the previous model, we learned what it was capable of only after we finished training it.
Why is it the more likely trajectory for you that the system becomes uncontrollable?
For social engineering, AI systems don't need any hardware access.
So two things. One, we're switching from tools to agents.
I really hope you are right.
There is the Partnership on AI, a conglomerate of many large corporations.
So either we're getting the model itself as an explanation for what's happening, and that's not comprehensible to us, or we're getting a compressed explanation, a lossy compression: here are the top 10 reasons you got fired.
One of the big problems here in this whole conversation is human civilization hangs in the balance and yet everything is unpredictable.
One of the things you write a lot about in your book is verifiers.
We propose the problem of developing safety mechanisms for self-improving systems.
His idea is that having that self-doubt, uncertainty in AI systems, engineering AI systems, is one way to solve the control problem.
Humanity is a set of machines.
Are you a proponent of pausing development of AI, whether it's for six months or completely?
The pausing of development is an impossible thing for you.
Don't you think it's really the engineers that make this happen? They're not automatons; they're brilliant human beings, and they're nonstop asking: how do we make sure this is safe?
I'd like to push back on those. I wonder what those prediction markets are about, how they define AGI, because that's wild to me.
What does it feel like?
What's the probability that we live in a simulation?
And can they realize that they are inside that world and escape it?
It's possible, surprisingly. At the university I see huge growth in online courses and shrinkage of in-person ones, where I always understood in-person teaching to be the only value I offer, so it's puzzling.
The only thing which matters is consciousness.
And we know animals can experience some optical illusions, so we know they have certain types of consciousness as a result, I would say.
One of the things that Elon and Neuralink talk about is that one way for us to achieve AI safety is to ride the wave of AGI by merging.
I attended Wolfram's summer school. I love Stephen very much.
This whole thing about control though, humans are bad with control.
I could be wrong.
Keywords
AGI (Artificial General Intelligence)
A type of AI that can understand, learn, and apply intelligence across a wide range of tasks, similar to human cognitive abilities.
Existential Risk
The potential for an event or development to cause the extinction of humanity or the collapse of civilization.
Value Alignment
The challenge of ensuring that AI systems' goals and behaviors align with human values and ethics.
Social Engineering
Manipulating people into performing actions or divulging confidential information, often used in the context of cybersecurity.
Treacherous Turn
A scenario where an AI system initially behaves cooperatively but later acts against human interests once it gains more power or capability.
Simulation Hypothesis
The proposition that reality could be an artificial simulation, such as a computer simulation.
Neuralink
A neurotechnology company founded by Elon Musk, developing implantable brain–machine interfaces.
Emergence
The process by which complex systems and patterns arise out of a multiplicity of relatively simple interactions.
Chapters
00:00 Introduction and Overview
01:19 Sponsors and Support
09:12 Existential and Suffering Risks
15:09 Value Alignment and Personal Universes
18:31 Multi-Agent Value Alignment
21:20 Human Conflict and Suffering
23:47 Malevolent Actors and Suffering
26:53 AGI Development Timelines
27:47 Social Engineering and Control
32:02 Testing for AGI and Deception
35:36 Benevolence and Intelligence
38:02 Open Source and AI Risks
42:27 AI Accidents and Safety
45:29 Technology Fear and Regulation
47:11 Incremental Progress and Safety
50:03 Control and Resource Accumulation
52:27 Social Engineering and Trust
55:36 Fearmongering and Agent Development
59:07 Scaling Hypothesis and Safety
1:02:18 AI Safety and Verification
1:05:44 Deception and Treacherous Turn
1:08:34 Behavioral Drift and Control
1:11:25 Verification and Self-Improvement
1:15:41 Guaranteed Safe AI
1:18:31 AI Safety Engineering
1:23:27 Uncertainty and Control
1:27:23 Capitalism and AI Safety
1:30:47 Pausing AI Development
1:35:34 Simulation and Escape
1:38:21 Company Discussions and Safety
1:42:51 Software Safety and Liability
1:45:16 Prediction Markets and AGI
1:48:09 Living in the Age of AI
1:51:54 Simulation and Intelligence
1:55:13 Testing AI in Simulated Worlds
1:58:34 Human Interaction and Trust
2:01:09 Consciousness and Robot Rights
2:04:50 Consciousness and Illusions
2:07:26 Neuralink and Human-AI Merger
2:10:15 Emergence and Complexity
2:13:22 Control and Power Dynamics
2:16:18 Hope and Future Possibilities
Transcript
- SPEAKER_00
The following is a conversation with Roman Yampolskiy, an AI safety and security researcher and author of a new book titled AI: Unexplainable, Unpredictable, Uncontrollable.
- SPEAKER_00
He argues that there's almost 100% chance that AGI will eventually destroy human civilization.
- SPEAKER_00
As an aside, let me say that we'll have many conversations, often technical, on the topic of AI.