Vibes & Benchmarks Ep 03: Is SpaceX really worth $2 trillion?
Episode 3 of Vibes & Benchmarks is up. Ali Rohde and Josh Albrecht cover what actually happened in AI this week.
Topics this week:
- The SpaceX S-1. Why a ~$3B launch market can't justify a $2 trillion valuation — and why Josh would rather own two Anthropics than one SpaceX.
- George Hotz's "Eternal Sloptember." Can agents actually program, or are you using them wrong? Why AI coding tools magnify bad engineering instead of fixing it.
- "100% written by Claude Code." Whether that's a flex or a red flag.
- The cost-optimization phase. Uber admits AI token spend is "harder to justify" — what changes when tokens stop being 1% of the budget.
- AI solving Erdős problems. Why the "AI can't reason" camp is running out of road.
- The Pope's AI take. Chris Olah spoke at the Vatican, and Pope Leo might understand AI better than Gary Marcus.
- OpenAI's reported IPO. What public-market discipline does to a mission-driven lab.
- Is AI making us safer? Short term, no — and the atomic-bomb, ozone, and self-driving-car analogies for when we finally react.
Subscribe to Outset Capital on YouTube for new episodes each week.
Full transcript
Intro
[00:19] Ali: Josh, welcome to Vibes and Benchmarks episode three.
[00:10] Josh: Well, we have a new name on episode three. Sick.
[00:11] Ali: Yeah. I didn't run it by you. I figured you wouldn't care.
[00:16] Josh: I don't care.
T1 — GeoHot "The Eternal Sloptember"
[00:19] Ali: First of all I wanna talk about this new George Hotz piece called The Eternal Sloptember. I sent it to you. Did you see it?
[00:38] Josh: I did. I read the first half of it.
[00:43] Ali: Let me just summarize. This is the core claim, and I'm quoting: "I'm calling it now — the adoption of AI agents into software development will be one of the most costly mistakes in the field's history. Agents cannot program, and it's taking longer and longer to realize that they can't. They are a highly sophisticated statistical model designed to mimic the distribution of programming. The output is broken, but in a way that's getting harder and harder to detect — which is exactly what you would expect from an increasingly accurate statistical model." Do you agree?
[01:20] Josh: No. As he says in here — "and in before, you're using it wrong" — sorry, you're using it wrong. It just isn't the case. He's using it wrong, and other people are using it wrong. To not get value out of these is just you're using it wrong. Because you can still write all the code by hand, you can still read it all, you can still do exactly what you did before.
[01:48] Josh: It doesn't force you to do anything else, but it gives you these amazing capabilities.
[01:56] Ali: How do you use it right?
[01:58] Josh: I've talked about this many times, but to use it correctly, you should plan what you're gonna do, and then you should check if it actually did it — which is just the same fucking thing you do when you're normally coding. So just do programming like you're supposed to. It just happens that you don't have to type out as many of the words anymore.
[02:14] Ali: Got it. So you're saying he's doing it wrong. Many people are doing it wrong. Maybe you'd agree that a failure mode is that it's really easy to do it wrong with these tools, but it's not the tools, it's you.
[02:23] Josh: It's even worse than that. It's not the tools. You were already doing it wrong. You were doing it wrong the whole time before. Enterprise code bases were a giant fucking mess that made no sense. And now you can do that even faster. Since you were already doing it wrong, doing it wrong even faster doesn't help you. So this has nothing to do with the tools. They're just letting you do what you were gonna do, but better. And if you were gonna do a bad job, it's not gonna make you suddenly magically do a good job.
[02:51] Ali: So it's a magnifier. If you're bad, it makes you worse. If you're great, it makes you even better. He says he has been using it wrong, or it's not helping him.
[03:02] Josh: Sorry, what did this guy write before? Who's this guy?
[03:07] Ali: I actually had the same question. He's very well known, has a cult following. He's a security hacker and engineer. I think he became famous for jailbreaking.
[03:14] Josh: I mean, security — then I can definitely see it.
[03:19] Ali: He also said later, "Without fully endorsing all their ideas, I'm now in the LeCun/Marcus camp on LLMs. I don't think models like this will ever be able to program."
[03:29] Josh: That's very silly.
[03:31] Ali: "I think deep learning is still a solution, but real programming agents will need world models, not some RLVR shit that comments out the failing test and tells you all the tests are now passing. The real story of this era will be who manages to avoid harming themselves in their AI psychosis."
[03:48] Josh: Comments out the tests. Don't you have a CI pipeline? Aren't you checking how many tests ran? What are you doing?
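Editor's sketch (not from the episode): one way to catch the "comments out the failing test" failure Josh describes is to have CI compare how many tests actually ran against a stored baseline. The function name, its parameters, and the numbers below are hypothetical, and the sketch assumes your test runner can report these counts.

```python
def check_test_run(baseline_count: int, ran: int, failed: int) -> list[str]:
    """Return a list of problems; an empty list means the run looks healthy.

    baseline_count is a hypothetical number saved from the last known-good run.
    """
    problems = []
    if failed > 0:
        problems.append(f"{failed} test(s) failed")
    if ran < baseline_count:
        # A green run with fewer tests is suspicious: something may have
        # been deleted, skipped, or commented out.
        problems.append(f"only {ran} tests ran, expected at least {baseline_count}")
    return problems


if __name__ == "__main__":
    # Made-up numbers: everything "passes", but one test quietly disappeared.
    for problem in check_test_run(baseline_count=120, ran=119, failed=0):
        print("CI check:", problem)
```

The point of the design is that "all tests pass" is only meaningful next to "and the same number of tests ran as last time."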
[03:58] Ali: It sounds though that you kind of agree with his other point, which you alluded to — about large organizations. That agents will end up hurting large organizations more than high-performing individuals or small orgs. He says he's seen how his friends and coworkers adopt these tools and they're smart with it. But at large organizations with slower feedback loops and less alignment, they're gonna just basically use this to produce slop.
[04:22] Josh: Yeah, I agree with this part of it. One of the lines he has here is, you know, Apple is pushing AI on all their engineers. When people think in the abstract, they think AI will do all this stuff. But let's focus on a concrete example: do you think macOS will get better or worse in the next two years? For sure worse. No question.
[04:44] Ali: Because all of these engineers are shipping AI slop.
[04:47] Josh: Yeah, exactly. I agree it will end up with a bunch of low-quality code being produced. But you weren't supposed to be writing your operating system by vibe coding. You can use AI agents in a responsible way, even when writing an operating system — for example, to have them check for a lot more shit for you. But you shouldn't be just vibe-creating an operating system. Of course that's not gonna work out very well.
[05:15] Ali: But people will do that.
[05:16] Josh: They will for sure. But that's people holding it wrong and doing it wrong, nothing more than that.
T2 — "Claude Code is 100% written by Claude Code"
[05:23] Ali: This reminds me of another question I had prepared. Not really new — they've been claiming this for a while, but it came up in my feed again — Anthropic brags that Claude Code is a hundred percent written by Claude Code. Is that something to brag about? Does that make you more or less likely to want to use Claude Code?
[05:41] Josh: I think — show us the code. Wait, you did show us the code. It was a bunch of crap. People did. I remember looking at some reviews of it. It does not seem like it was code that you would be proud of. Also, it has this feeling of vibe-codedness to it. There's a really good example with fast mode. There's a whole multi-page document on fast mode about how this setting can only be enabled in this way, it has these different environment variables, and it works different than the other things.
[06:10] Josh: Who made this? How about try making it a fucking boolean? Like, it's true or false. It's either on or off. You already have settings. Just make it a regular boolean like everything else. I'm sure this one was a combination of Claude and PMs — like the worst of both worlds.
[06:29] Ali: You're saying AI is bad because, one, it produces AI slop by engineers, but two, it allows PMs to write AI slop.
[06:36] Josh: Well, in this case, it was because they allowed PMs to meddle more than they should have — "we don't want people to be able to turn off fast mode for all their things, we want it to stay on by default and then they'll pay us more." Like, yeah, okay. How about not? Regular engineers would have pushed back, but now Claude made it so that it's lower cost to them to do it.
T3 — Demis Hassabis: "Foothills of the singularity"
[06:59] Ali: All right, next topic. Google DeepMind CEO Demis Hassabis said at Google's developer conference last week that humanity is standing in the quote, "foothills of the singularity." Do you agree? I thought that was a nice turn of phrase.
[07:12] Josh: Yeah, sounds about right.
[07:18] Ali: He was really trying to get people jazzed about safety, saying we have just a few years left. I think his AGI timeline is 2030 or 2029. Do you agree?
[07:29] Josh: I don't know what AGI means. We have things that are definitely better programmers than some programmers who exist today. Okay, cool, better mathematicians, et cetera. They make all sorts of mistakes, but they're just gonna make fewer and fewer mistakes. There's not gonna be this big jump at 2030 where we're like, now it's AGI.
[07:48] Ali: Basically, you don't think there's going to be this big moment ever.
[07:52] Josh: No. But the big moment is now. It's the next few years. It's this continual process of: there's new models, there's new techniques, things are getting better and better. You can do new and newer things every week. There's new infrastructure. Things are changing constantly. That's what it means to be in the singularity. It's just really high flux.
[08:11] Ali: He also mentioned self improvement — recursive self improvement, which we've talked about a little bit before. He said we're not yet at the point where the systems are getting better on their own, but the pace of development is clearly accelerating. Do you agree with that?
[08:25] Josh: It's one of these things where it's a spectrum. Already the systems do get better on their own. Google even has a thing called Vizier from years ago that does automated hyperparameter optimization for their systems. How is that not the system getting better on its own? It definitely is. It takes a lot of time to tune that stuff without such a system. And I guarantee that they have other similar systems for running experiments right now. So it's just a question of the fraction of the stuff, or the exponent on the improvement. It's low right now. It will go up a little bit, but it can only ever go so high anyway. There are fundamental limits to how fast you can do science. If you want to know what happens when I run a one billion parameter model or a one trillion parameter model with these hyperparameters, you kind of need to run it at some point.
[09:13] Ali: Yeah, I think this touches on our previous discussion about everyone talking about self-improvement models, but perhaps that's misguided and they're not gonna change that much.
[09:26] Josh: It's more that it'll just be a slow improvement. As we get better and better engineering practices and make the thing comment out the tests less and less. That's a thing you can fix. It'll happen. Maybe it happens one percent this year, maybe point five percent next year, maybe point one percent the next year. Those are all meaningful.
T4 — SpaceX S-1: "Not worth $2T. Not worth $100B."
[09:52] Ali: Let's talk about SpaceX. You sent me a piece about the SpaceX S-1 titled "SpaceX is not worth two trillion. It's not worth a hundred billion."
[10:02] Josh: They want two trillion for it? No way. That's insane.
[10:04] Ali: Yeah. He could get it easily, I think. There's just so much retail demand.
[10:11] Josh: Wow. Y'all are crazy. Yeah, I mean, that's what that is.
[10:19] Ali: You are not buying this story.
[10:20] Josh: I am not buying this story. I would much rather have two Anthropics than one SpaceX.
[10:26] Ali: Same. Reading the piece, was there anything in particular that stood out to you?
[10:32] Josh: It's the same thing that you and I talk about a lot with companies. If you have a really small market like launching shit into space, it's hard to make a really big company. As they point out, the total launch market is about three billion dollars. It's hard to make a one trillion, two trillion dollar company when you can only make three billion dollars a year. And then yes, you can glue on some AI things, and you can glue on whatever else they glued on to it — Twitter or whatever. But that's not SpaceX.
[11:03] Ali: Apparently they said that the whole TAM is twenty eight point five trillion and that of that, the AI business is twenty six point five trillion.
[11:13] Josh: Wait, wait, wait. Twenty eight point — sorry — twenty eight point five trillion for —
[11:20] Ali: This is what they said the TAM was.
[11:22] Josh: For launching stuff to space? No.
[11:24] Ali: No, no, no. For SpaceX AI, i.e. AI.
[11:27] Josh: Okay. Yeah, 'cause — wait, but twenty eight point whatever trillion minus point three trillion is —
[11:32] Ali: Is two trillion, which is not the size of everything else. Maybe he's already factoring in rolling Tesla up in it as well. No, he's not.
[11:44] Josh: I don't have any idea how those numbers check out to say this is a company worth two trillion dollars. Now, you could say, like, we're gonna launch a lot more stuff into space. We talked about how the space data centers are stupid. I don't think we're gonna be launching — there's no reason to do that. If you wanted to make two trillion dollars, somebody would have to have paid some of the trillions of dollars to put their stuff in space. Why not just put them on Earth? You weren't gonna make that many trillions of dollars worth of data centers. So yeah — doesn't make sense to put this in space.
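Editor's sketch (not from the episode): the arithmetic behind this exchange, using only the figures quoted in the conversation (a $28.5 trillion total TAM, $26.5 trillion of it AI, a roughly $3 billion per year launch market, and a $2 trillion valuation). The variable names are illustrative, and the figures are taken as stated by the hosts, not independently checked.

```python
# All amounts are in billions of US dollars, as quoted in the episode.
total_tam_b = 28_500      # "twenty eight point five trillion"
ai_tam_b = 26_500         # "twenty six point five trillion"
launch_market_b = 3       # "about three billion dollars" per year
valuation_b = 2_000       # the "$2 trillion" in the episode title

# What is left of the claimed TAM once the AI portion is removed.
non_ai_tam_b = total_tam_b - ai_tam_b

# How many years of the ENTIRE launch market the valuation represents.
valuation_to_launch_market = valuation_b / launch_market_b

print(f"non-AI TAM: ${non_ai_tam_b:,}B")
print(f"valuation / annual launch market: about {valuation_to_launch_market:.0f}x")
```

On these numbers the non-AI remainder of the TAM is $2 trillion, and the valuation is several hundred times the whole annual launch market, which is the gap Josh is pointing at.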
[12:15] Ali: You don't buy the story. What about Cursor? How does this fit into things? Does that make you more bullish or less bullish?
[12:22] Josh: Buying a negative 23% gross margin business for $50 billion or $60 billion does not make you have a better business. I can make a very big business that sells dollars at a loss. It'll be called the dollars-for-less-than-a-dollar business. Those are not good businesses. Now, you have distribution and stuff or maybe doing some future thing — but what would you distribute to them? Your models that people don't like and don't use?
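Editor's sketch (not from the episode): what a negative gross margin means per dollar of revenue, taking the negative 23% figure exactly as Josh quotes it. The function name is hypothetical; the only formula used is the standard definition, gross margin = (revenue - cost) / revenue.

```python
def cost_per_revenue_dollar(gross_margin: float) -> float:
    """Cost of delivering $1 of revenue, given gross margin as a fraction.

    From gross_margin = (revenue - cost) / revenue with revenue = 1,
    cost = 1 - gross_margin.
    """
    return 1.0 - gross_margin


quoted_margin = -0.23  # figure as stated in the conversation, not verified
cost = cost_per_revenue_dollar(quoted_margin)
loss = cost - 1.0

print(f"cost per $1 of revenue: ${cost:.2f}")
print(f"gross loss per $1 of revenue: ${loss:.2f}")
```

At that margin every dollar sold costs about $1.23 to deliver, which is the "dollars for less than a dollar" business in the quote.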
[12:42] Ali: At least you have an AI business.
[12:47] Josh: Anthropic's models, because they're selling — you're using your compute. That's just called Anthropic. That's why I want Anthropic.
[12:54] Ali: So it doesn't help. Do we think it'll actually happen, this acquisition?
[12:58] Josh: Well — is it before or after the IPO?
[13:06] Ali: After.
[13:08] Josh: If it's after the IPO, it feels like — I don't know, I would almost investigate that for securities fraud or something.
[13:17] Ali: Why?
[13:17] Josh: Well, because it's a weird, like — we're gonna acquire it — never mind. It's a little strange.
[13:22] Ali: That was the plan the whole time, because basically acquiring the company wasn't really feasible before going public. It just would probably slow down the process.
[13:30] Josh: What I'm saying is, if they don't acquire them after they go public, then it's a little —
[13:34] Ali: They have a built-in ten billion dollar breakup fee.
[13:37] Josh: I know, I know. So it's a little strange — but then as shareholders, you're like, why did you just lose ten billion dollars for no reason at all? Like this is — and you did it pretty clearly just to — it's not a two trillion dollar business. It's a business that loses tens to hundreds of billions of dollars.
[13:46] Ali: Hey — if it's a two trillion dollar business.
T5 — Uber COO Andrew McDonald: AI token spend "harder to justify"
[13:56] Ali: All right. Next subject. I read this quote by the COO of Uber, Andrew McDonald, and I thought it was really interesting. He basically said that AI token spend is getting harder to justify. He said headline stats like AI-driven code commits, token usage, and employee adoption can look explosive, but Uber still struggles to draw a direct line from those metrics to more useful consumer features, especially as they slow down hiring. I thought that was very honest. That's not what you hear. Everyone else is like, no, no, no. Salesforce, Marc Benioff is like, we spend 500K per engineer. It was quite honest. For Uber especially — what are you gonna do to make Uber better? It's not really about having great code or a ton of code. It's a marketplace that works great already that they've been working on for a long time. So it's like, okay, what is AI gonna do?
[14:52] Josh: Mostly AI can be used to justify layoffs.
[15:00] Josh: Yeah, for Uber, for example. It's like — what do any of your engineers do? Uber shouldn't really be — name a new feature that happened in Uber in the past, like, three years.
[15:01] Ali: Yeah, and just cut down on costs, like customer support and all that. Let's just get rid of all the engineers now. Actually, they just allowed people to intentionally prioritize female drivers.
[15:25] Josh: Okay, cool. Now you can discriminate based on gender against your drivers. Awesome feature. We're not using this one, but — okay.
[15:28] Ali: They allowed women to do this, I think, only.
[15:41] Ali: There was a great tweet — this was unrelated to Uber, but I thought it fit in well. This guy "Stay Sassy," S-A-S-S-Y, naturally. "Hello, 100x engineer. You spent 100k in tokens this month. What have you to show for it? I was building a harness for my AI tooling setup. Nothing that impacts the company bottom line. Sounds good to me. FYI, we're gonna go lay off half the company because we're over budget. Keep up the good work, buddy."
[16:11] Josh: Accurate.
[16:15] Ali: This I raised because we've been talking a little bit about the different phases of AI and how thus far we've been in the mostly experimentation phase, which is why people are kind of consuming as many tokens as possible. Everyone's using frontier models. And I think you've said that that will change once companies start moving into the cost optimization phase of AI. Maybe Uber is kind of early to that game. I'm guessing you expect other companies to share this realization soon.
[16:50] Josh: Yes. When the tokens are one percent of your overall budget, who cares? Doesn't matter. Has no impact on anything. When it is the same as headcount, which is like fifty percent of your overall spend — or more or less depending on your company — then you start to care about this in the same way that you start to care when you're giving people raises or how should we hire, or should we do layoffs. Those things matter a lot to the health of the business. So the token spend will also matter. And then you'll start to think about open source models or ways to cut down on this.
T6 — Andrej Karpathy joins Anthropic
[17:29] Ali: Let's talk about Andrej Karpathy. Did you see the recent news? You didn't see it? Josh, get on Twitter. He is joining Anthropic.
[17:33] Josh: No, no.
[17:41] Ali: Now a member of technical staff at Anthropic.
[17:44] Josh: He joined OpenAI before and then left after a year.
[17:47] Ali: He was a founder of OpenAI.
[17:48] Josh: Yeah, exactly. And then he came back, right? And then he left again, didn't he?
[17:53] Ali: Maybe, I don't remember.
[17:55] Josh: Yeah, I thought he did.
[17:57] Ali: It definitely lit up the timeline — obviously not yours, or you're not checking it. Does it matter?
[18:06] Josh: Not really. I'm glad that he's there. I like Andrej, I like Anthropic, but I mean — yeah, he's one of ten thousand people or whatever there. Cool. I'm sure he'll do good work. There's a lot of stuff to do.
[18:22] Ali: I think a lot of people paid attention because, one, he has such a huge following, but two, he is a co-founder of OpenAI. So for him to join Anthropic, even though he hasn't been at OpenAI for a long time, it's kind of a sign of the times.
[18:32] Josh: Yeah, but all the co-founders of Anthropic worked at OpenAI. So —
[18:36] Ali: Yes, but they weren't co-founders of OpenAI.
T7 — AI solving Erdős-class math problems
[18:42] Ali: We are seeing so many announcements of huge scientific improvements. OpenAI just solved an Erdős problem, then Google solved like eight the next day. I don't really know how to weigh that. Are those huge? Do they matter? It's almost commonplace at this point, hearing like, AI solved this new challenge that has stymied people for a hundred years.
[19:14] Josh: Yeah. We had other AI improvements before, like making matrix multiplies faster, eking out really small improvements on some of these technical benchmarks. Those are pretty hard problems. I think most of them don't have direct real-world implications. It's more of a mathematical curiosity than something world-changing, like if you invented room-temperature superconductors or something. So it's cool, but I think we already thought they were pretty powerful.
[19:58] Ali: You think they're precursors to scientific discoveries that do affect the real world that are not just academic?
[20:03] Josh: Yeah. Over the long term, yes. And this is why I found that original blog post we were talking about to be a little bit strange. You can't really be in the Marcus or Yann LeCun camp anymore when things are solving math problems that people have not solved for a long time. They might not be smart in the same way that we are, but they're certainly capable of doing useful stuff.
T8 — Pope Leo XIV + Chris Olah at the Vatican
NOTE — false start: Ali bails on the first take at 21:17 ("let's just skip this one"). They detour briefly (Obsidian/Claude Code/Google Docs aside) and Ali restarts at 22:15 with the clean take below. The clean version is what goes in the episode; first take should be cut entirely.
[22:15] Ali: All right, so this weekend Anthropic announced its newest partnership with Pope Leo, naturally, and Chris Olah, the Anthropic co-founder and interpretability lead, was invited to speak. He spoke.
[22:38] Josh: He spoke.
[22:39] Ali: What does that say?
[22:43] Josh: As much shit as I give Anthropic for different things, they're by far my favorite of the three companies making these models. I do think that a lot of the people there are trying to think about how do we make these things actually useful for humanity. Most of the people are doing this with good intentions. And I find myself usually agreeing with the Pope, despite not being particularly religious myself — that yes, these things are technologies that are gonna change the world quite a bit, and it's worth thinking about what kind of change we want and how we can do this responsibly. So yeah, I think the Pope is probably one of the better people to listen to about AI.
[23:25] Ali: He framed AI as a transformation of similar magnitude to the industrial revolution. I assume you agree.
[23:32] Josh: Seems like he knows what he's doing more than Gary Marcus or Yann LeCun.
[23:37] Ali: He knows his stuff.
T9 — Eric Schmidt booed at graduation / AI sentiment
[23:41] Ali: Kind of along these lines — he's responding to AI being a huge force and also kind of the effects of it being a negative force. Another place this showed up, or is increasingly showing up, is at graduation ceremonies. I don't know if you saw this, but Eric Schmidt got booed for mentioning AI in his commencement address, as did a number of other speakers. So people have learned: do not mention AI if you are a commencement speaker.
[23:56] Josh: This I heard about.
[24:09] Ali: Were you surprised?
[24:13] Josh: Not particularly. I saw something yesterday that this is the worst job market for young people ever in recorded history or something like that. Yeah, okay, makes sense. I wonder why they'd be booing AI, huh?
[24:27] Ali: Yeah. The crazy thing is that the S&P 500 just hit all-time highs.
[24:32] Josh: Unsurprising. It's not about — AI isn't about there not being new economic value created. It's gonna make a lot of value. Anthropic — wow, look at those shares so high. Wow. But does any of it get back to people? No, because they got laid off in the AI layoffs. So these two things are very congruent.
[24:56] Ali: Is there anything we can do about that? Do you think AI sentiment will ever improve, or is it gonna be like Congress or the media where it just declines dramatically and stays low, and the best you can hope for is that it doesn't get lower?
[25:11] Josh: Probably going down and just staying low. I don't know if you even should be hoping that it doesn't go lower. We hope that it goes lower and lower until we do something about it.
[25:22] Ali: Ooh, that's a dangerous game.
[25:23] Josh: Well, similarly for Congress and the media, I almost want it to go a little bit lower so we'd be like, you know what? It's time. We gotta fix this shit.
[25:35] Ali: Do you think we will ever do anything about it?
[25:38] Josh: I think some things will get broken and things will change in the next decade.
[25:43] Ali: When you say things, what do you mean? For instance?
[25:45] Josh: Everything. For instance, just employment and jobs and the normal bargain that society makes — that if you work hard, then you'll get money and you can have a good life. That's gonna get broken when instead a company can hire a robot that looks the exact same as you and is ten times as smart and is much stronger and can work for them twenty-four-seven and never says no.
T10 — Anthropic memory (skipped mid-thread)
NOTE: Ali starts this topic at 26:15 and bails at 26:34 ("Skipping that one"). Cut entirely.
[26:15] Ali: Speaking of AI getting better, Anthropic says they're now working on shipping memory and improving memory so that you don't have to keep spending 20 minutes every day re-explaining things that you've done that the AI should know.
[26:30] Josh: Don't they already have this?
[26:34] Ali: I think it's working on getting better and better, and maybe also not just for engineering, but context that they're learning about you and your styles. Skipping that one.
T11 — Jensen Huang: NVIDIA conceded China to Huawei
[26:34] Ali: Another headline: Jensen Huang said NVIDIA has largely conceded China to Huawei. Surprised?
[26:53] Josh: Nope. Not really surprised. I bet they'll still export stuff though.
[26:55] Ali: NVIDIA still exports stuff?
[26:59] Josh: I'm sure they'll still try to export stuff. Yeah.
[26:55] Ali: Some things. They'll still try. But it does seem like — I guess to what we were discussing last week — with all this uneven legislation, it has empowered China and Huawei in particular to become independent, to figure out their domestic manufacturing of chips so that they're not dependent on us.
[27:22] Josh: Yep. Yep.
T12 — OpenAI reportedly preparing IPO
[27:26] Ali: OpenAI is reportedly preparing to file for an IPO. This week, next week, ASAP. I think it is a little bit of a surprise in terms of timing. I think some people expected Anthropic to go first.
[27:37] Josh: Wait, but will Anthropic not still go first?
[27:42] Ali: I don't think so, because also Anthropic just closed like a thirty billion dollar round at a nine hundred billion valuation, which allows them a little bit more time and breathing room before going public.
[27:57] Josh: They could do it whenever, yeah.
[27:59] Ali: They could, but going public requires certain steps, and it seems like OpenAI is working on those faster. Neither of them has a date, neither of them has filed an S-1 the way that SpaceX has.
[28:06] Josh: Yeah, they're definitely both working towards it. So I don't know that we know which one will be first yet. That seems obvious that they're both working towards it. Probably racing, you know. They like racing.
[28:19] Ali: They like racing or raising?
[28:24] Josh: Racing. Sorry. Racing as in trying to see who can go as fast as possible. Right. Who can IPO first. And also raising. Yeah.
[28:28] Ali: Yes. Yes. Also raising. Yeah. Does public market discipline help OpenAI or hurt them? Or does it change anything?
[28:44] Josh: I don't know that it changes anything in the short term, but in the long run, one of the things that I think all of the well-intentioned people at OpenAI and Anthropic are forgetting is that when you have a public company, it doesn't matter if it's a B Corp or if all of your friends work there right now and you're all well-intentioned. What happens when all of you leave because you've made a bunch of money in the IPO and then the whole place is filled with Microsoft execs or whatever? What do you think is gonna happen to that company that you built? Is it gonna be the same good, amazing company doing good things for the world, or is it not? So yeah — that's what happens when it's a public company over time.
[29:29] Josh: Because there's just pressures. If you make a public company, it has to make money. There are shareholders who are like, "you've got to do this thing." "Well, our shareholders — so we've got to return money to our shareholders. That's what we're supposed to do." And then yeah — they don't say "we're gonna give it all away as UBI." That's not flying as a public company. "Well, they're investors, like, our value that we create is really creating value for other people. It's really trickling down to them, you see? So we gotta get all this money, and then our investors will also make a lot of money, and then the world will be good." Sure.
[29:59] Ali: You're basically saying it's a lot easier to do all of this — say all this — when you're a private company, and that once you go public, you kind of necessarily lose some control.
[30:08] Josh: It's both easier to say and to do, and is more realistic. As a private company, you have a lot more control over what you can do versus when you're a public company.
[30:20] Ali: That's why Stripe has stayed private for so long. But Stripe doesn't need the cash that Anthropic does.
[30:22] Josh: That's right.
T13 — Anthropic Mythos: vulnerability discovery at scale
[30:33] Ali: Anthropic is continuing to help companies find vulnerabilities via Mythos. Apparently, they have found 23,000-plus candidate vulnerabilities so far, 1,900 of which have been reviewed by external security firms. 90.8% have been confirmed valid. Is this significant? Do you think Mythos represents a step-function change in capabilities?
[30:59] Josh: It's like at least a half-step. It's hard to tell without being able to use it how good it is. I think you probably could take Opus and do a lot of this work as well, just not as efficiently. Instead of it being 98%, it might be 90%. Instead of finding 1,900, maybe it finds 500. Okay, so that's still a lot more, but it's not — yeah. And the next model is also gonna be able to do even more and find even crazier ways of breaking things.
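Editor's sketch (not from the episode): what the figures Ali quotes work out to. Only the three inputs come from the conversation (23,000-plus candidates, 1,900 externally reviewed, 90.8% confirmed valid). The extrapolation in the last step is purely an assumption for illustration, since nothing in the episode says the unreviewed candidates would confirm at the same rate.

```python
candidates = 23_000       # "23,000-plus candidate vulnerabilities"
reviewed = 1_900          # reviewed by external security firms
confirmed_rate = 0.908    # "90.8% have been confirmed valid"

# Confirmed-valid findings among the reviewed sample.
confirmed = round(reviewed * confirmed_rate)

# Share of all candidates that has been externally reviewed so far.
reviewed_share_pct = round(reviewed / candidates * 100, 1)

# ASSUMPTION: the same confirmation rate holds for every candidate.
extrapolated_valid = round(candidates * confirmed_rate)

print(f"confirmed valid so far: about {confirmed:,}")
print(f"share of candidates reviewed: {reviewed_share_pct}%")
print(f"if the rate held across all candidates: about {extrapolated_valid:,}")
```

The takeaway is that the confirmed count rests on a small reviewed slice of the total, so the headline 23,000 number and the 90.8% number describe different populations.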
T14 — Is AI making us more or less safe?
[31:32] Ali: Is AI making us more or less safe?
[31:36] Josh: From a computer security perspective, you mean?
[31:40] Ali: Let's start there.
[31:36] Josh: From computer security perspective, in the short term, probably less safe. We can see there are all these exploits — package attacks, supply chain attacks in open source. So yes, sort of obviously less safe. A bunch of people got keys exfiltrated and everything from that.
[32:01] Josh: Long term, arguably safer, because maybe people will start taking it seriously. It's a sort of game about attack and defense. Maybe people will just be like, our coworker Andrew — he's an open source guy who loves using all sorts of crazy open source tools — and just live in their own little private network of their own making. You can just live in your own little bubble if you want.
[32:32] Ali: What about outside of cybersecurity? More or less safe?
[32:36] Josh: Well, probably less safe again in the short term, because now you can damage real infrastructure that matters or something. I don't know that people have done a ton of that yet. Most of these things don't have a huge impact on the physical world. We don't have tons of robots. But somebody could hijack all Waymo self-driving cars and crash them, I suppose. That would set self-driving back a good decade or two. So less safe.
[33:11] Ali: So less safe.
[33:12] Josh: Less safe until people get their shit together because there's a big enough problem.
[33:17] Ali: You keep on saying this. I'm kind of skeptical that people will ever get their shit together.
[33:23] Josh: I think people did get their shit together after, say, the atomic bomb was dropped. People were like, whoa, whoa, whoa, wait a second. We can't keep fighting in the same normal way as that. There was a reconciling. We didn't drop atomic bombs after the first two, ever. So it was definitely a "people getting their shit together" thing.
[33:50] Josh: We were putting a giant hole in the ozone at some point in the past. And we were like, wait, wait, this is a terrible idea. We gotta stop that. So we do — sometimes when we do enough damage — realize, like, we should probably stop. Nuclear power, I think, maybe overreacted, right? Even with Three Mile Island and Chernobyl, there are probably other better ways of doing that, but there was so much of a reaction that we just don't really make new nuclear power plants anymore.
[34:13] Ali: So we don't react, we don't react, and then we way overreact, and then we don't react, we don't react. Okay, so we're gonna have a way overreaction.
[34:19] Josh: Well, if you have something as big as Chernobyl or whatever — but most computer security things don't really matter that much. Some people lose some money, you get locked out of your computer, you have to reset some keys. Okay, whatever. But when it crashes fifty percent of cars that are on the road, we will have a reckoning and we will do computers differently.
T15 — Google I/O: Search upgrade
[34:39] Ali: At Google I/O, Google's CEO launched the biggest upgrade to its search box in over 25 years, acknowledging that search was a model of the past and that its core product had to be more aligned with the AI age. I read that and I was kind of sad. I was also very impressed, because we've talked to Aravind from Perplexity and others, and the X-pholes, and they've been calling this for years. And it was just said so plainly.
[35:13] Josh: Yeah. I mean, they've been doing it for a while now though — really, nothing has really changed.
[35:16] Ali: Yes, they've been innovating and doing AI Mode — but this was, I think, kind of the final nail in the coffin of the Blue Link. The world has really changed.
[35:24] Josh: Yep. I mean, it was already a little bit busted, so yeah.
Close
[35:31] Ali: Josh, thank you so much for joining.
[35:33] Josh: You're welcome. Thanks, Ali.