
118: Jeffrey Ladish - Security, AI and Abundance

In this episode, I'm joined by Jeffrey Ladish to discuss AI alignment, the possibility of abundance and how information security matters for alignment. Jeffrey blogs at https://jeffreyladish.com/


William Jarvis 0:05

Hey folks, welcome to Narratives. Narratives is a podcast exploring the ways in which the world is better than in the past, the ways in which it is worse, and the paths toward a better, more definite vision of the future. I'm your host, William Jarvis. And I want to thank you for taking the time out of your day to listen to this episode. I hope you enjoy it. You can find show notes, transcripts and videos at narrativespodcast.com.

Will Jarvis 0:37

Well, Jeffrey, how are you doing this afternoon?

Jeffrey Ladish 0:40

I'm doing pretty good. I just got back from DEF CON, so a little tired, a little excited, a little everywhere. But feeling great.

Will Jarvis 0:47

Good stuff. Well, thanks so much for taking the time today, after DEF CON, to come on the show. Jeffrey, do you mind giving a brief bio and some of the big ideas you're interested in? Yeah.

Jeffrey Ladish 1:00

Well, I've been thinking a lot about what's going to happen in the future, so a lot of my work and thinking has been focused on that question. Some of this is the existential risk question: are we going to make it, and if not, what could possibly knock us off course? Really since college—I studied evolutionary biology in college—I was thinking about how species go extinct, and whether that could happen to us. Probably. But we're a pretty weird and unusual species, so if we were to go extinct, it's an interesting question what might cause that. It seems like the most likely thing is not an asteroid strike or some external event, but rather something we cause ourselves, just because we're so powerful. It's actually pretty interesting, because people are very, very worried about climate change, and climate change is quite worrying in a lot of respects. But the same thing that makes climate change scary—that we can have these huge effects on the environment—also makes us, as a species, pretty safe from climate change. Yes, it's going to change things: sea levels can rise, things can get really hot, weather patterns can change, agricultural shifts can happen. But all of these things have happened in human history before, and humanity survived, at tech levels that were way below our own. So we're actually extremely robust in a way, which is really cool. Even though we can totally mess things up, we can also survive a lot. The high-level version of a lot of my thinking was going through each big risk category and trying to figure out: could we survive this, could we not survive this? Climate change—yeah, we can probably survive it. Nuclear war—well, that one's a little tougher, because nukes are really powerful. When I was in Las Vegas for DEF CON, I visited the National Atomic Testing Museum. It's amazing. It starts with Trinity and goes all the way up to the atmospheric testing ban, when they moved everything underground and started doing underground testing. There was Bravo, the biggest US nuclear test, out in the Pacific, where what they thought was going to be a five-megaton bomb turned out to be a 14-megaton bomb, because they had miscalculated how a certain isotope would respond. That is such a huge difference. So nukes are just such a powerful technology, and people have hypothesized about different big effects—the radiological effects, the climate effects; you could potentially get a nuclear winter that could cause significant cooling. I spent a few years looking at this question, mostly just reading the existing literature, trying to think about it from first principles, and talking to people who have done research in this space. And I have a pretty wide distribution of probabilities over how bad it could be.
But I think even in near-worst-case scenarios, we're still pretty likely to survive. Don't get me wrong: nuclear war would be devastating and could really set civilization back for a while. But even in the pretty bad cases, we're just so good at surviving. We're really smart, and there's a lot of knowledge distributed in lots of different places in the world. Even if recovery takes hundreds of years, on an evolutionary timescale that's not a lot of time. So, good news: recoverable. Yeah, totally. Then more recently I looked a lot at biotechnology risks, and biotechnology is where I think you start to get into the realm of "there might be some things we could do that we wouldn't survive." And more recently still, artificial intelligence. Once you start to have adversarial pressure that tries to prevent you from surviving, that's where it gets really, really dangerous. If it's the environment changing, the environment isn't optimizing to prevent you from existing. But if you have a very intelligent agent that is misaligned, then you start to have real problems, especially if it's smarter than you, more capable than you. Humans are a great example of this: we have rats on islands we don't want rats on—we actually introduced them to those islands—and we've wiped whole giant rat populations off islands because we didn't want them there. We're like, oh, they keep evading these poisons, let's use these other poisons. That's not a situation you ever want to be in.

So, yeah, that's a high-level view, walking through different risks from different technologies and thinking about how they might affect us. But I don't just think about ways everything could fail. In terms of what I'm doing right now professionally, it's in the information security world, trying to prevent hackers from hacking AI systems and proliferating this technology—and a bit of the same in the biotech space, trying to keep dangerous biotechnology contained. But I'm also really excited about what the future could look like. Some of these technologies that present these really big risks also present really amazing opportunities. If you look at where we are right now in terms of civilization, technology, and society, things could be vastly, vastly better. I think people totally don't get how good things could be. Things are fine in my life—my life's pretty good. But looking ahead, I'm probably headed toward age-related illness, even if I stay really healthy and do everything totally right. When I'm 90 or 100, all of my systems are going to fail. I might get dementia. My grandparents have dementia right now, and it really sucks. It's really sad, super tragic. They're loving, caring, smart people just losing their ability to remember their own context. They lose their autonomy, they lose their freedom. So when people say that isn't so bad, I'm like—you see people decay and lose everything. Even if everyone wanted to die eventually, that's not how you want to go. And we have amazing technology here in the United States, where my grandparents live, and they're not poor—they can afford medical interventions. Yet that's still nowhere close to enough. It doesn't matter if you're a billionaire; you still can't get past 120. No technology is going to save you. So when I think about where we are compared to where we could be, I'm like, yeah, we're still really falling short, and that's upsetting. But it's also really exciting, because these are problems that could be solved with technology—there's no fundamental reason they can't be solved that way. So I'm like: great, let's solve them, and survive along the way.

Will Jarvis 8:10

I want to talk about this big lever point you've mentioned here, which is definitely underrated in the mainstream. It's this point that, okay, with AI, at one level we should be very concerned because, yeah, it could take our jobs—but maybe it kills us all first. And you said something really interesting that you rarely hear, about rats. It made me think of your evolutionary biology background, and the thought that humans have been pretty good at making other species go extinct. Animals that are not as smart as we are—we're pretty good at making them go extinct if we want to; there have been cases of this. I'm curious: is that the key thing we should be worried about with AI? That if a being that is smarter than us decides we're not worthwhile, or is just misaligned in some weird way, it could come and wipe us off the planet, hunt us down wherever we are, and there's really nothing we could do—even if we're on Mars, it doesn't matter where we are. And this is the true x-risk, because at some level it might not be recoverable. Almost all the other x-risks are recoverable in some sense—maybe we can go to Mars if there are severe asteroid impacts, or nuclear war, or something like that. But at the end of the day, AI could be a scenario that's coming quickly and could go so wrong that we can't survive it.

Jeffrey Ladish 9:35

Yeah, you know, there are a lot of people talking and thinking about this space, so most of my ideas are just trying to mine the good ideas that have already been discussed and respond to the confused ones. I think it's really worth thinking about from first principles: okay, if you have a really, really smart agent, and you don't necessarily know exactly what kind of agent it is, are there things we can infer about what it might do? I think it was Omohundro who first came up with this idea of basic AI drives—what might you expect most agents to do? You can look at a number of things. If you're trying to accomplish some goal, well, you don't want to be turned off or killed, because you're not going to be able to accomplish your goal if that happens. The same applies to humans: we've been heavily incentivized by evolution to survive and reproduce, so we generally don't want to die. But there's also wanting to acquire resources, and generally to improve yourself and your ability to get stuff done out in the environment. Humans are great at this—we've learned to stockpile resources and build tools, and now we have computers and all these things. So even though we don't know exactly how AI systems will think or function, I think we can infer a fair amount. A lot of people point to current language models—they look at models like GPT-3 and say they don't seem to have very clear goals, they don't seem very goal-directed. And I'm like, yeah, that's kind of true. But they're also not that useful. They're kind of useful—for generating cool stories and so on. But if you imagine 5, 10, 20 years into the future—people disagree a lot about these timelines—if you project out a little and look at the people building more and more useful systems, I bet the ones that are more useful are the ones that look more agent-like: better at planning, better at achieving their goals. If you were in charge of a big company and you had some AI assisting you with planning, you'd want the one that's good at it. So even though they might have weird quirks, I think you can expect AI systems to be pretty rational in this way, pretty smart. And I think the key question is: how hard is it to align their goals with ours? That's one question. The other question is: how powerful are they? If they're a little bit more powerful than us, there's an interesting question of, well, maybe we're useful, but maybe we're also a threat—so I could see humanity being wiped out because we're actually a threat to these systems. And if they're vastly more powerful than us, then we might be wiped out just as a side effect of them changing the environment.
If there's absolutely nothing we can do to interfere with these systems, then maybe they don't care whether we live. But there's an Eliezer Yudkowsky quote which is roughly: the AI doesn't love us or hate us; it's just that we're made of atoms that can be used for something else. We're just material that can be repurposed. We don't like to think of ourselves this way, but fundamentally, at the end of the day, we're made out of these particles, we're on the planet, and we're shaping the planet. If something else were shaping the planet and turning it into something that's not very good for us—well, we're not going to make it.

Yeah, and I think people have a really hard time imagining a system this powerful—a lot of difficulty imagining something that's way, way smarter than us. The question for me is: how close are we to the theoretical or practical ceiling of intelligence? When I was young, I hadn't thought about AI at all. I was imagining science fiction stories and wondering how the future might go, and I thought, I bet we'll genetically engineer ourselves to be smarter. Because we really value intelligence in our society—you can see that people who are really smart often get really good jobs, they can make really cool technology, or they're just very politically savvy. So it seems very desirable. This was before CRISPR, but genetic engineering was clearly taking off; we were going to get that technology eventually. I was about 18, so I was imagining a lone genius scientist in the lab going, "Haha, I will make myself smarter"—and then you have this runaway dynamic, right? You become smarter, so you become even better at making yourself smarter, and so on. Maybe it takes multiple generations, so each generation becomes smarter and can reinvest in that. I imagined a runaway scenario where humans got smarter and smarter just via genetic engineering. Since then I've learned a lot more about genomic selection, and there's actually a lot of really interesting research into how this might be possible—I think it is possible, though there are a lot of limitations in terms of the human brain and in terms of genetic engineering. My point is that you can imagine a scenario where humans become a lot more intelligent, and it's useful to think about what society would look like in that case. But the interesting thing is that when you look at AI systems, the things that limit us—how fast neurons are, how much you can scale things—just aren't present. You can just build a bigger datacenter; you can just add more compute. And when you look at the progression from GPT-2 to GPT-3, and then all the newest models, and at what all these different companies are doing by adding more and more compute—it turns out you can add more data, you can add more compute, and the systems probably can just keep scaling and scaling. So when I look at that, my impression is that the ceiling of intelligence is actually really, really high. And if you do get there, you're probably looking at a really big power differential between everything humanity can do and everything these systems can do.
And that's the point at which I think we should be scared. If an alien civilization arrived and they were thousands and thousands of times more powerful than us, and all their technology looked like magic, people would be—and should be—terrified. Maybe they're going to be nice to us; I hope they're nice to us. But we'd be completely at their mercy. And the scenario we're setting up for ourselves with AI is building systems that are thousands and thousands of times more powerful than us—I don't know what order of magnitude. That should be scary just because of the raw power differential, even if we know nothing else. A lot of people look at these arguments and say, "I just don't really buy this." And I'm like, that's great—don't take my word for it at all. But really think about what your cruxes are. Maybe you think it's impossible to build a system that powerful—well, great, why do you think that? What are the components? Maybe you think that as you add more compute you'll get really strongly diminishing returns. Maybe that's the case—then make predictions. If you think that's the case, try to predict what AI systems won't be able to do. Because people are often like, "systems can't do this," and then next year they can, and people move the goalposts: oh, well, maybe they can't read a text and summarize it. Yeah, they can now.

Will Jarvis 17:04

Exactly. And we keep seeing these big advancements. I'm curious—you mentioned something interesting there. How far do you think we can get just by scaling up language models in the current paradigms we have, versus: do you think there will have to be radical breakthroughs in AI research for these things to truly become beings that have agency and are scary at some level? Does that make sense?

Jeffrey Ladish 17:34

Yeah, the very short answer is I don't know. In terms of how I reason about it: I'm on the security team at Anthropic, and there are a lot of really smart people there, and I have a lot of other friends in this space doing work here—mostly on alignment, really trying to figure out how to align these systems rather than how to push them to be as capable as possible, but they end up knowing a lot about that too. People disagree about this, and it's really hard, because these are people I really respect—I think their thinking is very sharp—and they have pretty different intuitions about what it will take. My intuition—some intuitions I trust a lot, some I don't trust that much, and I don't trust this one that much—is that it will take some other pieces, but I don't think those pieces will necessarily be super hard to find. Gotcha. The reason I think this is that I've been really impressed with the general intelligence of language models so far, just as you scale them up. When I saw GPT-2, I was like, this is really impressive. GPT-3—oh, whoa, this is starting to get coherent. And as GPT-3 and other language models have gotten better, I've really started to see this conceptual clarity emerge: instead of having to try prompting five different ways, you just try one thing and get a very coherent response. I've had dialogues with language models where they explained nuclear reactors to me. I'm very interested in the nuclear space, so I wanted to know: how come more states that have nuclear power can't easily get nuclear weapons? I had sort of assumed that if you're, say, Japan and you want nuclear weapons, within a year you could just have nuclear weapons. So one of the questions is: why can't you just take the nuclear reactors you have and use them to make plutonium, and then make bombs out of the plutonium? The original atomic bomb—one of the two designs—was a plutonium implosion device. So you take the uranium, you run it through a reactor—great, that should work, right? So I was asking the model: why doesn't this work? And the model was like, well, you can use a normal nuclear reactor to make plutonium, but one of the isotopes you'll produce isn't weapons-grade, and that will basically poison the reaction, so you can't use it to make weapons. You'd have to separate out the isotopes, which was the original hard part with enriching uranium to start with—so you just don't get that far. But if you use breeder reactors, then you can totally make weapons-grade material. And I was like, oh, are breeder reactors harder to make? And it was like, yeah, they're harder to make, they're more complicated.
And they're also, you know, discouraged, so nations don't build them as much. My point is that the model actually understood the question—I think it actually understood these concepts. It's hard to say, because we don't have good enough interpretability tools to really know what's going on. This is something I really hope we get, because as these systems become more powerful we need to understand them, and I can see no way they can be safe if we don't understand what's happening under the hood—and we certainly don't right now. So shout out to Chris Olah's work on interpretability; we really need that. And anyone else doing good work in interpretability—I'm just very happy to see that field advance. So I guess what I'm saying is that I see those language models having that conceptual grasp—it's as if I were having a conversation with you. Yeah, there are some important differences: that model will not remember the conversation; if I have 20 more of those conversations, it won't remember the first one; there's a very limited context window. That's a difference between our conversation and that conversation. But other than that, it's not that different from talking to you about nuclear reactors. To me, that says we've already got a bunch of the core components of general intelligence. Solving planning and memory and all this stuff might be really hard—it might take some big breakthroughs—but I could also imagine some cool hacks working. So I'm somewhat agnostic; I don't know. It's hard to say. I hope it takes lots of conceptual breakthroughs, because that would mean we're further away, and that would be good—we really need a lot of time here. But I don't know.

Will Jarvis 22:02

Right. I think that's great—if you plot the line out from GPT-2 to GPT-3 and keep plotting it out into the future, you can get really scared really fast about how good these things can be. I'm curious—and this is kind of a weird question—but I want to tie together a couple of things you've talked about, like the first nuclear bomb and the Manhattan Project. Whenever I think about times humans have solved really difficult technical challenges in a really short amount of time, it has generally been because there was some human enemy we were worried about at some level. If you think about the Manhattan Project, or the Apollo program, both were directed against some kind of foreign, human entity, and people would work 80 hours a week to really try to defeat those threats. My worry with non-human threats—when I think about COVID, or about AI alignment—is that somehow they just seem less charismatic in people's minds, less vivid than a real human enemy. And so maybe people work less hard, for real reasons; they're less likely to solve collective action problems among themselves to address these things. Do you think that's a real concern we should have about alignment—that because it's a non-human threat, at some level it's just less charismatic, and it's harder to get people excited to work on it and really put in the time to solve these problems?

Jeffrey Ladish 23:34

Yeah, I just straightforwardly agree with you that it's harder—it's harder because of this. And I was a bit disappointed with the response to COVID. Maybe this is naive, but I was hoping... I remember I helped organize a biosecurity conference, Catalyst, in San Francisco in February 2020. Everyone met in person—if it had been a week later, we probably wouldn't have—and we were there just before, talking about COVID, talking about transmission, trying to estimate what the response might look like. Some part of me was like, maybe the world won't rally, but I don't know, maybe the world will rally around this. And if I'm being charitable, we did do some things really well. Vaccine production, for instance: we made a ton of vaccines, and we made them very quickly by historical precedent. We maybe could have done better, but we did a bunch of things really well. But in terms of the vibe, the cultural impact, it feels like the world just did not unite around this thing. And that's sad. So that's one point against the hope that people will see the danger from AI and band together to try to defeat this thing. I think it is a bit of an uphill battle, and that's unfortunate. That being said, I know a lot of people who really see these risks, and over the last couple of years I've seen tons of people—some of my smartest friends who previously were not interested in this—get exposed to it, hear these arguments, find them pretty compelling, really engage with them, and then want to ask, "Hey, can I work on this?" So on one hand, it seems pretty hard to motivate all of society to get behind this. On the other hand, I've seen lots of people very motivated to work on these things. So I kind of hope a targeted approach could be really helpful here, where we try to find people who can work on this, who find these arguments compelling, who see the risk, and then go from there. Maybe that's enough. Definitely.

Will Jarvis 25:47

I like that. Going off that, there's another analogy—I just love this analogy with the Manhattan Project. The Manhattan Project was a very centralized effort: General Groves goes in, he finds Oppenheimer, and no one wanted to hire Oppenheimer because he was a socialist and all these things, but Groves is like, no, he's the guy, and lobbies for it. You've got this classic kind of tool-in-the-box management team, and they're going through—here in East Tennessee—razing farms and building all these facilities all across the US. A really impressive effort. Do you think, at some level, the alignment community needs more coordination? It's a real challenge, because it's not the federal government doing it—they can't just go seize Google's laptops or use defense production mandates to make these things happen. But would we be better off if there were a little more, I guess, top-down command and control in the alignment community? Or is it better to have more people just working on diverse things, because we don't have a great plan yet?

Jeffrey Ladish 26:46

Yeah. How would I describe the field of AI alignment? I've heard other people describe it as pre-paradigmatic, which seems accurate to me: there's not a really established paradigm for "here are the most promising approaches." A lot of people are trying a lot of different stuff, and I think that's probably the right thing to do right now—people going out and trying things, really trying to build a strong research community that's looking for good approaches. I think that's really valuable. I want good discussion and good intellectual interchange between different groups so we can try to identify good strategies; that's super useful. When it comes to the deployment side, or the scaling side of things—for the companies that are working with big models or explicitly pursuing AGI—that's where I think coordination is really important. There's been a lot of debate on Twitter recently about whether it's good that people are pursuing AGI really aggressively, and I think it's actually pretty dangerous. My hope is that the leaders of all of these AI labs will work together, because we really all lose if we're not very careful, and I think the level of caution we need is pretty extreme. I also think the leaders of these orgs are not dumb, and they have thought about these things. I saw Sam Altman retweet Eliezer Yudkowsky's list of reasons for AGI ruin, which is this very intense list of all the big problems that are not solved right now and could really cause us to go extinct from AI if we continue in the current direction. And Sam Altman—who leads OpenAI, which is explicitly pursuing AGI—is basically saying he agrees, that it's super important. And I'm like, that's great, I'm very happy to see that. I'd be very happy to see the leaders of all of these orgs collaborate and figure out how to proceed in a safe and hopefully extremely cautious way. I don't know what that would look like. I've talked to some people who say that seems really hard—and it does seem hard, but it doesn't seem impossible. If you look at the cultures these people come from, they're pretty similar: they have pretty similar goals and they're trying to do similar things. I could imagine a much more hostile environment where people were extremely competitive and racing really hard, and that's not the feeling I get when I talk to people—and I have lots of friends at these other orgs. People are actually really happy to talk to each other. So we should leverage that to build something more coordinated. I think everyone agrees coordination is important; I just want some really practical things to happen.
So I'd say the thing for people who are close to these circles is just to encourage that. If you're hanging out with some of these people, be like: hey, what's the plan for collaboration here? This seems very important, and it seems very dangerous—how can we make sure this is not being done unilaterally? I think putting some friendly pressure there is a good idea.

Will Jarvis 30:22

That's great. I think that's the right approach, and I do agree—these people are generally friendly and willing to work together in a way that's not always common in the real world, which is good and lends itself to collaboration. I want to talk a little bit about security and your work in the context of AI. It seems to me that a lot of security at this point has to do with solving human problems: what do humans do poorly, and how do you optimize around that? Or is it all based on that? Is it primarily technical? Is it primarily human? Is it kind of this crazy interplay between the two?

Jeffrey Ladish 31:00

Well, it's definitely a crazy interplay between the two. But you're totally right that some of the biggest challenges are human challenges. We—humanity, broadly—sort of do know how to build secure systems, and there are lots of people who are extremely good at making secure technology. But you do have a problem: as systems get more and more complex, they become harder and harder to secure. That's both technical complexity—how many things is your computer system doing?—and social complexity: how many users and developers and admins do you have? And then there's the social side of things: how easy do you want to make it for people to use your systems, to access your systems at a privileged level? It's very easy to make something secure and then throw it in the ocean so no one can ever use it—but then it's not very useful. So there are all of these trade-offs. We can totally go into specific examples, but at the end of the day, humans have to use these systems, so you're going to have an attack surface, and that attack surface includes those humans. At DEF CON there's a social engineering village, and the main thing they do is call up random people at random companies, and they get different points for different pieces of information they can get those people to give up—information those people shouldn't be giving to anyone, since the callers aren't working for their company; they're just random people at DEF CON. And they go through and they're like, "Hey, could you tell me the serial number on your laptop?" But first they say, "Hey, how are you doing? I'm calling from your IT department. We're just doing some troubleshooting, and we noticed your system has been having problems. So, yeah, give me the serial number for your computer."

Will Jarvis 32:53

I'm assuming most people give it to you.

Jeffrey Ladish 32:56

Yeah—the people who do this are really good at it.

Will Jarvis 33:01

How do you think about coaching people? What's the low-hanging fruit in terms of making things more secure? If you're trying to coach somebody—say you're talking to me, like, "Hey man, you really need to get your security on point"—what's a big, pretty easy intervention for people to implement in their day-to-day lives?

Jeffrey Ladish 33:21

So I think there are three really good, obvious things that everyone can do. One: keep your computers up to date. People discover vulnerabilities in software and exploit them, but generally people also report them, and the software vendors—Google and whoever—patch these vulnerabilities. So you just need to keep your computer up to date. If you're using Chrome, in the top right there might be a thing that turns red, or even yellow, or just says an update is available—keep clicking that and keep it updated. Same with your operating system, same with your phone: use hardware that's fairly new and still supported. That's one thing—if things get sufficiently out of date, you have problems. The next thing I'd say is: use two-factor authentication. There are a number of different types of two-factor authentication. The strongest form is to use YubiKeys—basically, if you want to really lock down your stuff, for your really important accounts. I think people's social media accounts are often way more important than they think. Just run through the mental exercise: if you got locked out of your Gmail, how bad would that be for you? If you never had access to your Gmail again, how much would you lose? Same with your Facebook, your LinkedIn, your Twitter, whatever it is. So for those important accounts, and your bank as well, use strong two-factor authentication. There are two things to remember here. I used to just tell people how to set it up, and there's a big flaw in that, which is that you also need to be able to keep access. It's not just important that you have two-factor authentication; it's important that you always have a backup. So the thing I recommend is: get two YubiKeys. It's a little USB dongle you plug into your computer and then tap. If you get two of them, you can leave one with your passport somewhere secure, and keep the other one on your key ring so you always have it with you. You don't need to use it that often—maybe once a month or so you actually need to reauthorize an account—but it makes it extremely hard for anyone to phish you or hack you in any way, and you have that backup, so you should always be able to get access. There are also authenticator apps—Google Authenticator is pretty good. You could, say, use one YubiKey plus Google Authenticator. The problem with Google Authenticator is that if you lose your phone and that was your only second factor, now you just don't have your second factor, so it doesn't satisfy the backup property. You could use two phones, but it's a little tricky. And then some websites only allow SMS-based two-factor, where you get a text message. This is not good.
SMS is very insecure—not that people can really hack your phone, but people can SIM-swap you, where they basically call up Verizon or whoever and say, "I need to make a change, I got a new phone, my name is Jeffrey Ladish"—but they're not Jeffrey Ladish—and then they get your phone number ported over. There's not much you can do about that; there are a few things, but SMS is not very secure, and in general phone numbers are not very secure. However, if that's the only option—some banks are really terrible and only allow SMS-based two-factor—it's still much better than not having two-factor at all. Having two-factor just makes things much harder to hack, even if it's a phone number. So set up two-factor wherever you can—that will protect you a lot—and make sure you have a backup. The last thing is: don't reuse passwords. Use a password manager of some kind. The issue with reusing passwords is that even if you have a really strong password, say you use it on randomsite.com because it has a cool forum; someone hacks that site, cracks all the hashes, recovers your password, and then tries that password on every other site you use—it's very easy to just enter your email and that password. I've seen many people get hacked this way, so just don't do it. Use a password manager, even if it's just the one built into Chrome—it's way better, because this is a very common type of attack. So those are the three things: keep things updated, use strong two-factor, don't reuse passwords. Honestly, for most people that will get you to actually pretty good—you're probably never going to get hacked if you do those things; you're probably going to have a pretty good time. What's awesome is that security for the end user has actually gotten a lot better. If this were the late '90s, I could tell you a lot of things to do, and none of them would be sufficient, because your Windows XP or whatever could just be vulnerable and get hacked, and there just weren't updates for it. We've actually come a long way in terms of what's possible and how secure things can be. Oh yeah, so maybe one more slightly advanced thing:

When you're buying hardware—like a new phone—I would recommend either getting an iPhone or a Google Pixel. The reason is that Google and Apple are among the best at making secure hardware and software. If you get a different Android phone, you're trusting Samsung or some other company that's way less good, and you're relying on them to deliver Google's security updates—they might be slower, they might add janky software that has vulnerabilities. This is maybe not the most important piece of advice, but if you're trying to be a little extra careful, I would say do that. And also make sure the phone you're using is still supported. Unfortunately, Pixel 3s are not supported by Google anymore, so if you have a Pixel 3 you should probably upgrade—don't keep using an unsupported phone. It's not a good idea.
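
A brief aside for readers who want to see what an authenticator app is actually doing when it shows that six-digit code: below is a minimal sketch of the standard TOTP scheme (RFC 6238), which is what Google Authenticator and similar apps implement. The secret string is a commonly used demo value, not a real credential, and the code is illustrative rather than something to build on directly.

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, interval: int = 30, digits: int = 6) -> str:
    """Derive the current time-based one-time password (RFC 6238 / RFC 4226)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time()) // interval              # 30-second time step
    msg = struct.pack(">Q", counter)                    # counter as 8-byte big-endian
    digest = hmac.new(key, msg, hashlib.sha1).digest()  # HMAC-SHA1 over the counter
    offset = digest[-1] & 0x0F                          # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

if __name__ == "__main__":
    # Demo secret often used in documentation; never reuse a real account secret here.
    print(totp("JBSWY3DPEHPK3PXP"))
```

Both your phone and the website derive the same code from the shared secret and the current time, which is why the code works with no network connection at all—and why SMS delivery isn't part of the scheme.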

William Jarvis 39:04

Yeah, good idea. I'm curious,

Will Jarvis 39:06

what about messaging apps? Are messaging apps like Signal superior enough that it's worth using them?

Jeffrey Ladish 39:13

Yeah, I really recommend people use Signal. It's useful to think about your threat model here. It's not that I think, say, Messenger is insecure—it's actually pretty secure if you have strong two-factor authentication on your Facebook account; Facebook supports YubiKeys, or security keys, so you can totally do that, and I do. But the thing is, the government can just say, "hey, we want to look at this person's stuff," and there are some protections in place, but—why risk it? If you're doing something sensitive and you want more assurance that some government can't access it, then you use end-to-end encryption. Signal is end-to-end encrypted: Signal doesn't get to see what messages you send. So you just have that extra layer of protection. You can also set a timer for disappearing messages, so without doing anything extra you know your message record will disappear. I think there are lots of things that are illegal that are not bad, and I think people should generally do good things. There are lots of ways someone trying to attack you could use something you did that was technically illegal, though perfectly fine, to come after you, and you should protect yourself against things like that. It doesn't matter what your political ideology is—for any political ideology, there is something you believe is totally good and right to do that is probably illegal where you live. So, for anyone: you should have the capability to coordinate with other people—don't kill people and stuff—but as an individual person you should be free, you should have the ability to communicate with one another securely.
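
A small aside to make "end-to-end encrypted" concrete: the idea is that only the endpoints hold the key, so a relay server only ever sees ciphertext. The sketch below uses the Python `cryptography` package's Fernet recipe as a stand-in. This is plain symmetric authenticated encryption, not Signal's actual protocol (Signal derives fresh per-message keys), so treat it purely as an illustration of why the service in the middle can't read the messages.

```python
from cryptography.fernet import Fernet  # pip install cryptography

# In a real end-to-end encrypted messenger, key material lives only on the two
# endpoints; the relay server only ever handles opaque ciphertext.
key = Fernet.generate_key()        # shared secret held by sender and recipient
channel = Fernet(key)

ciphertext = channel.encrypt(b"meet at 6pm")  # what a server would relay/store
print(ciphertext)                              # unreadable without the key
print(channel.decrypt(ciphertext))             # only a key holder recovers the text
```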

Will Jarvis 41:08

Exactly. I love that—it's a great point. Jeffrey, I'm curious: you're super interesting to me because you're interested in security at all these different levels. Were you first interested in security, or was evolutionary biology first? Or did they come along beside each other, and then you brought the two together—thinking about x-risks and how we use security to help prevent some of these problems?

Jeffrey Ladish 41:33

Evolutionary biology was totally first. Yeah. I was at The Evergreen State College in Olympia, studying evolutionary biology with Bret Weinstein and Heather Heying. I was there and I was like, what's interesting here, who's teaching interesting things? And that was just by far the most interesting thing. So I started studying that—learned about phylogeny, how different animals are related to each other. I'd also been reading Richard Dawkins books like The Selfish Gene, which was a really excellent take on the mechanisms of evolutionary theory. Shortly before then, I had been very religious—I grew up Seventh-day Adventist—so when I deconverted, I was like, wait, okay, evolution is true. First of all, that's wild, because I grew up very creationist: I read all the creationist books, I knew all the creationist arguments. So I was like, okay, I think evolution is probably true, but it doesn't make any sense to me; I have no idea how this works. I was trying to figure out how it worked, and I read The Selfish Gene and realized there are actually mechanisms here, and you can make predictions using them. I was just so blown away by that. Once I saw that Heather and Bret were teaching this sort of evolutionary theory, I was like, that's amazing, I want to go learn this. So I was learning that, and I was doing some bat research and some monkey research. And I've always been interested in computers too. At Evergreen there was a hacking club; they did CTFs, and they did something called the Collegiate Cyber Defense Competition. What basically happened was, I was doing all this bat research—recording bat echolocation calls—and I had all this data on one of my laptops, and I totally messed it up. One of my friends from the hacking club helped me fix it and restore everything, and I was like, that's amazing—you can just boot into Linux and do everything from the command line. I wanted to learn all that stuff, so I fell in with that crowd. And then I was thinking about existential risks, thinking about the large-scale stuff, and it seemed like this might be a practical skill that could be useful. So I was like, yeah, let's do it. I

Unknown Speaker 43:34

love that. I love that.

Will Jarvis 43:35

Definitely. I'm curious, if one of our listeners is interested in security and AI alignment, what's a good place to start? Where should they go?

Jeffrey Ladish 43:44

Yeah, that's a good question. So there's a lot on the EA Forum about alignment—well, there's LessWrong, there's the Alignment Forum, there's the EA Forum, and I'm seeing more people post about this. There are a lot of people posting about alignment on the Alignment Forum, as you'd expect, and from there and from LessWrong you'll see a lot of people pointing at introductory guides. There are a few good courses you can take where people have tried to bring together the fundamentals. And I'm really hoping to see more people get into some of the theory and first principles when it comes specifically to security and AI systems. It's actually a hard field, but I think it's a very good one. We haven't really talked about rationality explicitly that much, but I'm a really big fan of that way of thinking, because security is hard—alignment even more so—and security is a very hard field because you don't get very good feedback. The feedback you get is pretty sparse. If you're looking at securing your laptop or your phone or your accounts, you might think, well, I haven't gotten hacked, so am I really in danger? But if you only get hacked every ten years, and that one time someone destroys all of your social media and all of your bank accounts, that's actually really bad. You're trying to be calibrated about an event that only happens every ten years, and that's difficult because you don't get good feedback. So the cybersecurity space is difficult, because people have different hypotheses, different threat models. I was just at DEF CON talking with a bunch of friends, and people were wildly different in terms of their level of paranoia. Because it's DEF CON, there are tons of hackers, and people are like, oh my goodness, maybe they'll find a zero-day in the Bluetooth protocol and hack your device, so you should have Bluetooth turned off. Should you have Bluetooth turned off? How big a risk is that, actually? That's an important question, because there are so many trade-offs in security. If I'm at an AI org trying to help users secure their systems, and I'm causing them tons and tons of trouble for something that's not going to protect them much—because the threat wasn't realistic in the first place—if I say every single system has to be air-gapped, well, they're never going to get anything done, so that's probably not a good idea. Maybe some things should be air-gapped. But you need to be calibrated and have a realistic threat model to do a good job. So I'm actually pretty excited to see more people post on the Alignment Forum, or on LessWrong and the EA Forum, about some of these theoretical and applied security questions. Because—and this is throwing a little shade—I think often in the infosec world people like to talk about what's flashy; people like to demonstrate exploits.
And they do a poor job of being calibrated, a poor job of having really rigorous epistemology. And I'm like, well, the stakes are too high here; we need rigorous epistemology. Let's try to build a community of people who aren't just going to talk about flashy exploits and vulnerabilities—let's try to be calibrated, let's try to make good predictions, try to have true beliefs. That's definitely the thing we need when it comes to alignment. You have the same problem there, except maybe we only get one shot—or maybe we'll get a few shots; people disagree about that—but either way, we probably don't have a lot of chances to get this right. And the more powerful these systems are, presumably, the worse the mistakes are, and eventually they're unrecoverable. So that's a domain with very bad feedback, and we need to be extremely rigorous in our thinking. So if you're interested, there are definitely resources—go build and get involved. But I also want people to think about how it's not just a matter of going out and reading some content. It's also a matter of improving your thinking—we need to be really good. It starts with trying to improve how you reason, becoming more calibrated. It's not just knowledge, it's not just content; it's also practice, it's also skills—

Unknown Speaker 47:57

getting better at thinking.

Will Jarvis 47:58

That's great. Well, Jeffrey, thank you so much for taking the time to come on the show today. Where can people find you on the internet? Where should we send them?

Jeffrey Ladish 48:06

Yeah, you can follow me on Twitter at @JeffLadish. I used to go by Jeff—I go by Jeffrey now—so it'll look confusing, but that's probably the best place to follow my work. I also have a website, jeffreyladish.com, and I post on the EA Forum and LessWrong sometimes as well. Any of those places are fine places to follow me.

Will Jarvis 48:26

Good stuff. Thanks so much, Jeffrey. Thanks.

William Jarvis 48:33

Special thanks to our sponsor, Bismarck Analysis, for the support. Bismarck Analysis creates the Bismarck Brief, a newsletter with intelligence-grade analysis of key industries, organizations, and live players. You can subscribe to the Bismarck Brief at brief.bismarckanalysis.com. Thanks for listening. We'll be back next week with a new episode of Narratives. Special thanks to Donovan Dorrance, our audio editor. You can check out Donovan's work in music at donovandorrance.com.

Transcribed by https://otter.ai

Narratives is a project exploring the ways in which the world is better than it has been, the ways that it is worse, and the paths toward making a better, more definite future.
Narratives is hosted by Will Jarvis. For more information, and more episodes, visit www.narrativespodcast.com