# Episode 21: Anthropic Mythos & Project Glasswing, Recursive Improving Agents, and Your Parallel Agent Limit

> This week Shimin and Dan (Rahul is out on vacation) unpack MiniMax M2.7 and the first experimental evidence of recursive self-improvement; Anthropic's Mythos model and Project Glasswing, under which the cybersecurity-capable model is being withheld while major corporations and the Federal Reserve patch infrastructure first; Adam Argyle on why AI sucks at front-end; Addy Osmani's framework for finding your parallel agent limit; Pere Villega on code commoditization and industrialization; Dan's rant about installing Claude Code on WSL; and two minutes to midnight with new financial signals from the AI bubble.

Published: 2026-04-17
Source: https://adipod.ai/episodes/21-anthropic-mythos-project-glasswing-recursive-improving-agents-and-your-parallel-agent-limit/

---
This week Shimin and Dan (Rahul is out on vacation) cover MiniMax M2.7 and the first experimental result on recursive self-improvement; Anthropic's Mythos model and Project Glasswing, where a vulnerability-finding model is being held back while major corporations and the Fed patch infrastructure; Adam Argyle on why AI sucks at front-end; Addy Osmani on finding your parallel agent limit; Pere Villega on code commoditization as industrialization; Shimin's vibe and tell about losing Pi agent and rebuilding the ADI site in Astro; a Dan's Rant on installing Claude Code on WSL; and two minutes to midnight.

## Takeaways

- MiniMax M2.7 is the first public experimental result on recursive self-improvement — ~30% benchmark gains from the model running its own arena-style experiments
- RSI shifts model improvement from a labor bottleneck (hire the best PhDs) to a capital bottleneck (who can run the most parallel experiments)
- Anthropic's Mythos found decades-old Linux/OpenBSD vulnerabilities by chaining primitives, not inventing new techniques — withheld under Project Glasswing while Amazon, Apple, Cisco, CrowdStrike, and the Fed patch first
- If vulnerability detection scales, open source may paradoxically get more secure long-term — given enough eyeballs (and GPUs), all bugs are shallow
- Front-end is harder to automate than back-end: models can't see, were trained on ancient garbage, and there's no accounting for human taste
- Parallel agent capacity doesn't scale linearly — context switching plus vigilance overhead grows geometrically; find your ceiling by deliberately blowing past it
- Code commoditization is industrialization — handcrafted code still matters, but marketplaces reward industrialized output, so learn the tools
- OpenAI's CRO publicly accusing Anthropic of inflating run rate by $8B via gross-vs-net revenue accounting is a signal OpenAI feels the enterprise heat
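
The parallel-agent takeaway can be made concrete with a toy model. The sketch below is illustrative only: the function names (`supervision_cost`, `net_throughput`, `find_ceiling`) and every parameter value (per-agent output, base overhead, growth ratio) are assumptions chosen for demonstration, not figures from Addy Osmani's post or from the episode. It assumes the supervision cost of each additional agent grows geometrically, then looks for the agent count where net output peaks.

```python
def supervision_cost(agents: int, base: float = 0.1, ratio: float = 1.6) -> float:
    """Total overhead of supervising `agents` agents (hypothetical units).

    Assumption: the first agent is free to supervise; each additional agent
    costs `ratio` times more than the previous one, starting at `base`.
    """
    return sum(base * ratio**k for k in range(agents - 1))


def net_throughput(agents: int, per_agent: float = 1.0, **cost_kwargs: float) -> float:
    """Useful output left after subtracting supervision overhead."""
    return agents * per_agent - supervision_cost(agents, **cost_kwargs)


def find_ceiling(max_agents: int = 12, **kwargs: float) -> int:
    """Return the agent count with the highest net throughput."""
    return max(range(1, max_agents + 1), key=lambda n: net_throughput(n, **kwargs))


if __name__ == "__main__":
    # Walk deliberately past the peak to see where net output starts falling.
    for n in range(1, 9):
        print(f"{n} agents -> net {net_throughput(n):.2f}")
    print("ceiling:", find_ceiling())
```

With these assumed defaults, net output peaks at six agents and falls once a seventh is added. Setting `ratio=1.0` makes overhead linear, and the ceiling disappears within the tested range, which is the contrast the takeaway is drawing.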

## Resources Mentioned

- [MiniMax M2.7: The Agentic Model That Helped Build Itself](https://firethering.com/minimax-m2-7-agentic-model/)
- [Anthropic debuts preview of powerful new AI model Mythos](https://techcrunch.com/2026/04/07/anthropic-mythos-ai-model-preview-security/)
- [Assessing Claude Mythos Preview's cybersecurity capabilities](https://red.anthropic.com/2026/mythos-preview/)
- [Why AI Sucks At Front End](https://nerdy.dev/why-ai-sucks-at-front-end)
- [Your parallel Agent limit](https://addyosmani.com/blog/cognitive-parallel-agents/)
- [Code Is Cheap Now, And That Changes Everything](https://perevillega.com/posts/2026-03-16-code-is-cheap-now/)
- [The AI gold rush is pulling private wealth into riskier, earlier bets](https://techcrunch.com/2026/04/07/the-ai-gold-rush-is-pulling-private-wealth-into-riskier-earlier-bets/)
- [OpenAI CRO Tells Staff Anthropic Inflates Run Rate by $8 Billion](https://www.implicator.ai/openai-cro-tells-staff-anthropic-inflates-run-rate-by-8-billion/)

## Chapters

- (00:00) - Introduction to AI and Software Development
- (02:45) - MiniMax M2.7 Model and Recursive Self-Improvement
- (05:04) - Anthropic's Mythos Model and Security Vulnerabilities
- (08:15) - AI's Limitations in Front-End Development
- (18:13) - Cognitive Debt and Managing Multiple AI Agents
- (32:01) - Managing Multiple Agents Effectively
- (34:42) - The Evolution of Code Value
- (38:29) - The Industrialization of Coding
- (41:00) - Navigating Claude Code Challenges
- (45:39) - Ranting About Technology Installations
- (50:16) - The State of the AI Bubble


## Transcript

<details>
<summary>Show full transcript</summary>

Shimin (00:00)
right, I'm now sleeping this week.

Dan (00:00)
That's why I'm always pushing open-weight stuff on the podcast. It's like, you know, control the means of your token generation.

Shimin (00:21)
Hello and welcome back to Artificial Developer Intelligence, a weekly conversation show about AI and software development. We discuss the most interesting set of AI news, tools, and techniques each week by going through hundreds of links and dozens of newsletters so you don't have to. My name is Shimin Zhang and with me today is my co-host, Dan "He's Trained on Ancient Garbage" Lasky.

Dan, how are you? Rahul is out this week. He's busy on vacation somewhere.

Dan (00:50)
Yeah.

I'm doing okay. I guess I kind of am trained on ancient garbage.

Years and years of cruft built up in my brain.

Shimin (00:58)
Yeah, if by ancient garbage you mean, like, you know, the classic classical texts, like the ancient Babylonian.

Dan (01:06)
I thought you meant like really, really old, like outdated HTML and JavaScript specs. Like, like, I think it was from.

Shimin (01:10)
That too, yeah. Egyptian glyphs and the break tags, what difference does that make?

Dan (01:18)
It was when they called it, like, DHTML, which basically meant that you could start manipulating the DOM via JavaScript. Yeah. Anyway, that's pretty ancient.

Shimin (01:27)
I thought me calling it Ajax is ancient. The rabbit hole goes deep. Yeah. I bet it does. Well, on the show this week, we're going to start with the news Threadmail as always, where we're going to talk about the MiniMax M2.7 model as well as Anthropic's latest model that's so good that we couldn't use it.

Dan (01:31)
No, this predated Ajax. Yeah.

True, and then we will be going to the, this is questionably the tool shed. I'll give it to you, but why AI sucks at front end. It's really maybe more of a post-processing, but yeah. So we'll be talking about that a little bit. And then.

Shimin (02:05)
We'll have a debate about where that post belongs. And then we're going to go to the technique corner where we'll talk about your parallel agent limit.

Dan (02:09)
You

Then followed quickly by post-processing where we're going to have a couple different posts we're going through today. So there's the LLM Wiki, which we'll be chatting about. And then a sort of counter-post to some of the opinions we've talked about previously on here, which is like code is cheap now and that changes everything. So that's a pretty interesting perspective to look at. And then...

Shimin (02:33)
Yep.

I'm going to vibe and tell about what it is like to lose my favorite agent harness and also working on the new ADI site. Followed by...

Dan (02:45)
Well, a couple things. So we're actually gonna do a Dan's Rant this week. I've found something that I'm angry about on the internet, so we'll talk about that. Stay tuned to find out what it is. And then we're bringing back an old favorite.

Shimin (02:58)
Yeah, as always, other than last week, really, we're going to talk about two minutes to midnight where we talk about the latest development in the financial side of the AI bubble.

And let's get into it. Yes. Well, to start, let's talk about Minimax 2.7.

Dan (03:08)
Let's get into it.

I guess.

I actually had to Google MiniMax to know if it was a separate lab or not. So I just wanted to disclose that upfront because I didn't know. So it turns out they are, in fact, a Shanghai-based lab, which I didn't realize. But yeah, so they've released the new M2.7 model and, you know, we like to cover

Shimin (03:24)
Mm-hmm.

Dan (03:43)
I like to cover open-weight stuff a lot on this podcast, but the thing that is particularly interesting about this one is that it trained itself, sort of. So there was a pretty big announcement from them that essentially they... you're probably going to describe this better than I am, but I'll try first.

They gave the model itself the ability to run, like, an LLM arena kind of thing on a discrete GPU. And then it could analyze its own performance based on that and then suggest changes to its own... I don't know if it was purely code or weights or what, but essentially there was a feedback loop. This is the part where you can help me a little bit. And that was then fed back into the next training cycle. And they did that, what, 10 times or something like that. And overall, they claimed essentially a 30% improvement in some of the benchmark scores. And overall, it scored fairly well on benchmarks.

Yeah, so pretty

Shimin (04:42)
Yeah.

I was going to say, I pretty much agree. They didn't release the harness in which they were running these agent experiments. They gave the model itself, which is already a 229 billion parameter model, like a medium to large size model, three 24-hour trials with a harness around memory, feedback, and self-optimization. There has been a lot of talk in the AI space about recursive self-improvement, RSI, the idea that as the models get better, they will reach a certain point where they will be able to self-improve, kind of like the singularity type event. I've heard...

Dan (05:30)
Mm-hmm.

Shimin (05:33)
folks like Marc Andreessen talk about RSI as a thing that is already happening in the industry. But personally, I haven't come across any actual experimental results or even announcements around RSI. So the fact that MiniMax is able to use its own model to get a 30% performance improvement is the first experimental result announcement that I have seen in this RSI field.

So I thought it would be good to discuss it. The hard part would be talking about recursive self-improvement without sounding like one of those "machines will take over, we're doomed" singularity nerds. Terminator.

Dan (06:13)
Hehehehehe

Well, the thing that's a little different about this is that it still wound up being a baked model, right? It's not recursive self-improvement in the wild, if that makes sense. Like, it's not continuously happening. This was part of the training process, essentially, like an additional post-training step almost. But that does make me think, right? That was the first thing I thought of when I read this, like, ooh.

How far out are we from someone deploying that essentially live continuous training, especially if you could somehow do it on real data that you were then collecting from users? Did this work or not? The little thumbs up, thumbs down kind of thing that you get on each of the responses.

Shimin (07:00)
Rahul isn't here today, so I'm going to try and speak for him. What changes here is... Yeah, let's get rid of all the technical... No. What happens here is the model improvement process, if RSI is established, moves from a labor bottleneck, we need to hire the best PhDs to improve our models into a...

Dan (07:03)
You're firing all the copywriters using ⁓ M2.7. writers, that's right.

Shimin (07:25)
a capital bottleneck. Who can spend the most money running the most parallel experiments? And I think that has fundamental implications for where the industry is going. Namely, that it's going to suck up even more capital than it is.

Dan (07:32)
Mm-hmm.

We could be, but it didn't seem like this used a ton of it. And I guess that's the other part to maybe note here: the runs that they gave it were just more or less standard inference runs, right? So it's not like training, where it took a huge network cluster of GPUs. They were running each experiment on a single, I think they say a 30 in there.

Shimin (08:03)
Yeah, but if it is correct.

Dan (08:05)
So yes, I guess if

you do them massively parallel, you'd in theory get better results. I don't know.

Shimin (08:11)
Yeah. And my suspicion is, like, Anthropic and OpenAI are already doing this. Like, that's what, yeah, Marc Andreessen's comments lead me to think: a lot of companies are already doing this. So I think the shift is happening.

Dan (08:16)
Hmm, they're like huge scales, yeah.

Yeah, because it wouldn't be surprising, assuming the original sort of six-month lead that we claim for the US frontier labs over Chinese labs is still true. Then you could assume that they've been doing it for six months at least. Is that a fair assumption? I don't know, but good enough for the podcast.

Shimin (08:35)
Mm-hmm.

Yeah, exactly. Six to twelve months already.

I think it seems reasonable.

Yeah, let's wait for the US frontier labs to also announce that they've been doing RSI. I think that announcement is probably coming pretty soon. The other news item we have this week is, of course, speaking of, yeah, and self-improving models, rumored. Yeah.

Dan (09:06)
I was going to say speaking of US Frontier Lab announcements or lack thereof, sort of.

Shimin (09:14)
How do we even? This is the news that everyone's been talking about all week. Anthropic?

Dan (09:19)
Well, let's start with

the backstory a little bit, right? So, outside of the scope of the article, I think I'd even hinted at it in one of the previous podcasts: we'd read a little bit of stuff where, in addition to the Claude Code source leak, they also accidentally published some internal blogs or something like that. They had references to this Mythos model. And so there'd been hints of it sort of floating around and people weren't sure, like,

Is it real? Is that just a code name? Like what's going on? And then fast forward to, go ahead.

Shimin (09:52)
Right. Tuesday of last week, really the same day we were recording last week's episode, Anthropic pushed out the system card for its new model named Mythos. But Anthropic also stated that because Mythos is such a step up in capabilities from Opus, it is not going to release it to the public due to the thousands of security vulnerabilities that it has found. What Anthropic is going to do instead is partner with large corporations like Amazon, Apple, Broadcom, Cisco, CrowdStrike, et cetera, to patch up the infrastructure software layer that is the foundation of our modern world, I want to say, in order to give these established players time to patch the vulnerabilities before releasing the model to the public. This partner preview program is called Project Glasswing. And, you know, we are reading the report from TechCrunch, but

Dan (10:38)
Kinda, yeah.

Shimin (11:00)
Anthropic itself also has its own page about, you know, securing critical software for the AI era. I think the biggest thing we should be looking at when it comes to Mythos is what kind of vulnerabilities they're actually finding, right? Like, are we talking about fairly minor bugs or are we talking about system-crashing bugs?

So if we read the Mythos preview blog post from red.anthropic.com, we see that they're talking about 27-year-old bugs from Linux and OpenBSD. And these vulnerabilities include anything from gaining root-level access to crashing servers by sending requests through the network. So.

It does sound like

Dan (11:45)
The one I saw, which I don't know if it's covered in this, it's more of an internet rumor than a real confirmed thing, was someone pointed it at Firefox's JavaScript virtual machine and it was able to essentially escape the sandbox of the virtual machine 85% of the time. Now, what that percent means, I don't know. I guess they asked it to escape or something. And I guess that basically means pretty much the backbone of web application security, like having interactive websites in the first place, would be, you know, crippled, assuming it can do something similar with Chromium, right, and V8. So, be pretty wild.

Shimin (12:22)
Right.

Yeah,

I think I read a tweet that said something along the lines of regular folks are looking at the stock market or whatnot and professional security researchers are converting their cash into gold bars. Something along the lines of that as a way of indicating how serious these security threats are.

Dan (12:44)
You

Shimin (12:48)
Anthropic is in talks with the Fed and major banking institutions in the United States as a part of this project, Project Glasswing. Look, man, a lot of times we talk about, hey, is this marketing or is this actually the Manhattan Project? Yeah.

Dan (13:01)
I'm glad you brought that up because I was thinking it too. I was like, I might need to put on my fake security

researcher hat, or not a security researcher, like a skeptic hat for a minute and be like, if you were to do a viral marketing campaign, I don't know that you could do one more masterfully than this.

Shimin (13:20)
All right. Yeah. And the fact that they are not releasing the details of these exploits, for obvious reasons. Yeah.

Dan (13:24)
Right? Exactly. And that's why it's like, yeah,

it reminds me of like when people were concerned about like quantum computers breaking like SSL or other like the true, you know, fundamental underpinnings of security online, which may still happen, but like, you know, it's the people are treating it with that level of like doom, I guess is what it feels like.

Shimin (13:33)
Right.

I've also heard this being described as a Y2K 2.0, a little bit. In a couple of weeks, once the security issues are patched, I think we're going to learn a lot more. But as of right now, it really does feel like the United States has detonated an experimental nuclear blast.

Dan (13:48)
Yeah, it does. It feels like that, right? Yeah.

Shimin (14:05)
Everybody else around the world is like, how will this affect my life going forward? I don't know. The model was not explicitly trained to look for security exploits. So I thought that was interesting. This is an emergent behavior from just the general capacities of the model. A and point B is

They also mentioned in this blog post that, you know, they're not doing any groundbreaking security vulnerability techniques. What the model is able to do is take multiple, ⁓ vulnerability primitives and chain them into a stack in order to do, you know, these more major exploits.

Dan (14:45)
And that's something we're sort of seeing writ large too: chaining is getting easier and faster across the board as threat actors start using AI generally. And then the other one we're seeing is supply chain attacks are getting much easier to do, and much easier to do at scale, because previously you would have had to, like, submit PRs, and now you're like, "Claude, submit these PRs. Go. All repos that look like they would take them," and they do all that

Shimin (15:09)
Right.

Dan (15:13)
stuff like invisible unicode characters.

Shimin (15:15)
Yeah. Yeah. It's mostly a mechanical process, but humans are not very good at spending a huge amount of time doing very tedious things of like chaining primitives together, especially when they're exploits that, you know, or memory leaks, and you know, stack attacks. Right.

Dan (15:29)
And or like really deep chains like,

you need to execute these 15 vulnerabilities sequentially in this order with this timing or else it doesn't work.

Shimin (15:39)
Right. So we talked about the jagged frontier of AI before. This seems like one such jagged frontier, where the AI's capabilities are so far ahead of what humans can do.

Every script kitty is now a script tiger, a script leopard. I've been looking for a new large mythical creature for maybe like a script griffin. Ooh, I would like to be a script griffin.

Dan (15:57)
Ha ha ha ha.

Even if they were kiddies, now they're

more like script adults. I don't know.

Shimin (16:05)
Yeah. So, yeah, what do you think about where this will go in a couple of weeks?

Dan (16:12)
You know, I'll partially take off my skeptic hat, like, halfway. I think that this will probably have some impact. Will it be as big as everyone's forecasting? Probably not. And it's going to depend on how they go about releasing the model, assuming they do. Like, is it just going to be a, you know, Opus 4.6-style rollout where, like,

you get it if you pay for it and it's more expensive? Or is it still going to be more like a beta kind of thing where big companies can opt in and run it for a while, like they're doing now, but broader, you know what I mean? And then after that, maybe then they roll it out to the public. I'm not sure.

Shimin (16:46)
Mm. Yep.

I also wonder what this means for open source software. Like, is it a good thing, where open source software will get hit by the greatest number of Mythos requests and therefore be extra safe, or will folks go back to closed-source software because

Dan (17:10)
I don't think the security model of open source changes. The claim has always been that open source is more secure, at least when it comes to security projects, because it's vetted by more people, more experts have a chance to look at it, and more people can contribute, which means that, in theory, at least in a world where supply chain attacks aren't happening all the time, you wind up with a more hardened thing overall.

Shimin (17:38)
Yeah. And, yeah, given enough eyeballs, all bugs are shallow. I think they mentioned that in this blog post.

Dan (17:42)
Right. And who's to say

that you couldn't then take the same model and apply it to fix open source at scale, you know? Yeah.

Shimin (17:49)
binary. ⁓

Lastly, I want to end this discussion on a positive note. There is a silver lining in all of this. Consider this: currently a lot of governments and government agencies, both in the United States and abroad (not looking at you, Gulf nation states, and North Korea, let's tag them on too)

Dan (17:57)
⁓

Shimin (18:13)
are willing to pay millions of dollars for zero day exploits created by humans and discovered by humans in order to do all kinds of nefarious things. If AI is indeed supernaturally good at looking for security vulnerabilities, then in the long term, this would actually make our open source software that much more secure.

Dan (18:31)
Hopefully, yeah.

Shimin (18:32)
Yeah. Hot takes. Cool. Moving on.

we have a

Dan (18:35)
Why AI

sucks at front end. Brought to you by...

Shimin (18:40)
Adam Argyle

Dan (18:41)
Yes.

I thought this was pretty interesting. It's almost a little bit borderline Dan's Rant territory. Whether or not it belongs in the tool shed, I don't know. I'm pretty sure I misclicked when I was categorizing it and you just ran with it. So the gist of this blog post is that AI is great at doing the sort of UI grunt work that you may not want to handle as a software engineer. So the example that they gave was updating or swapping around design tokens. So, you know, things like your colors and sort of style primitives and stuff like that in a design system. But it is particularly not great at coming up with unique designs. So if you ask AI to design something, you will get something very run-of-the-mill. Like, it's not bad looking. It's just not innovative design, I guess. And then the other issue they call out specifically is you can one-shot the entire interface for your application. But as soon as you go try to tweak something, especially if it's highly involved CSS-wise, like you're doing some sort of CSS tricks, as it were, it just falls apart and either, in the words of the blog, pulls something you haven't seen since IE6, or, yeah, just doesn't do a good job at giving you a pixel-perfect thing.

And so the rationale that they're claiming behind this is, one, it is trained on ancient garbage. Just like me, LLMs are trained on what's publicly available. And unfortunately, just by the law of averages, there's not a lot of exceptional and cutting-edge CSS floating around on the internet, at least in terms of open source repos for it to be trained on. Their second point, which is one that I feel deeply from having used

LLMs for UI stuff, is they literally cannot see, right? I mean, I think we've probably talked about this in the past, but I feel like Gemini is much better at sort of visual stuff than Claude is. But even then, it literally can't necessarily understand what you mean when you describe, like, the vertical line here is off by four pixels or something like that. Or my other personal favorite is

when I was playing around with essentially a canvas-based renderer, it was like, as you can see, the red box is on the screen. And I was like, no, no, it's not. There is no red box. And then it turned out the issue was, I was using a scene graph and it didn't know that you needed to attach parents to children in the scene graph, otherwise it's more of a... Which is, like, you know, I think really kind of

Shimin (21:23)
Mm-hmm.

Mm.

Dan (21:35)
speaks to the first point of trained on ancient garbage, right? There probably wasn't very much scene graph code floating around out there, whereas there is a crap ton of React code, you know?

So yeah, they both resonate with me a little bit.

And also I thought it was a little bit of a hot take in terms of, you know, I have had reasonable experiences using Claude particularly for doing, like, especially UX development. And I've even done some pixel-perfect shuffling with it that it actually did reasonably well on. And I think it helps if you can, like,

explain what the elements are that you're trying to move around in a really succinct way, instead of just sort of talking about it in a general sense. Like, you know that the thing that's actually rendering as a circle on your screen is actually an LI tag with some styling on it. So you go, no, it's the LI that's doing this, that's off by X. And then it's a little bit better.

Shimin (22:32)
Right. But if we're talking about LI tags, then we're back to talking about HTML, which is a thing that it knows fairly well, right? Like, it has an idea of the DOM because it's just tokens to it.

Dan (22:41)
Yeah. Yeah. Well, yeah, but I mean, like, specifying what element so that it can be smarter about what it's doing with the styling, essentially. Like, what Claude did when I was messing with that kind of layout is it actually went in and, I gave it this CLI tool that I sort of vibe-forked from someone that's, like, essentially the Chrome

Shimin (22:53)
Right. ⁓

Dan (23:06)
dev protocol stuff, as a CLI instead of an MCP. So nothing really earth-shattering, but then it's able to execute JavaScript as well in the browser context. So it was running, like, get-bounding-rects all over the place to understand the layout parameters and then just doing math to figure it out, which I thought was actually kind of a pretty clever approach, you know, 'cause it can't see. Yeah, it was just doing it with pure math, which is like.

Shimin (23:27)
It's impressive when it does work, yeah.

Dan (23:31)
Okay.

Shimin (23:32)
I think, well, a few thoughts. One, I think the current tools just haven't been trained to mimic the human aesthetic sense, right? Like, that's a limitation of our training. I think if some frontier lab spends a ton of time, and, like, Google being the one with all the data and all the training,

they can produce something that does a much better job of matching human aesthetic preferences. And in fact, they have: they have created Google Stitch, which is a huge leg up on the generic Claude Code "let's make a cool website" prompt that I've used to almost no success. It's always generating, like, the most generic-looking

Dan (24:17)
Yeah.

Shimin (24:17)
mismatching color kind of layouts possible. So there needs to be, like, an image-native way. Give it feedback, give it an actual browser so it can take... it does, it has the ability of taking snapshots and then parsing the image files to do a deeper dive.

Dan (24:21)
Yeah.

But I will say

that to some degree, I don't think we'll ever necessarily be successful there without it being some sort of like averaged, boring thing, right? And the reason I think that is because there's this like element of human taste that there's just no accounting for. You know? Like...

Shimin (24:56)
Mm-hmm.

Otherwise.

Dan (25:00)
There's people that genuinely love, like, Comic Sans that are out there and, like, set their phone's custom Android UI to be Comic Sans, you know, and they're like, let me show you this thing. And they hand you their phone and you just look at it and you're like, what am I looking at? And they're like, there's this cool link. And I'm like, no, I'm looking at your font. What happened? What did you do?

Shimin (25:23)
Like the Ryan Gosling Papyrus skit from SNL. I don't know if you saw that.

Dan (25:26)
Yeah. Yeah.

I just have to.

Shimin (25:28)
I'll just send that to you afterwards.

It's also interesting that for a long time, folks were saying, hey, frontend will be the first one to get automated. But this actually posits that maybe backend development is easier to automate via AI than frontend development. It's the revenge of the frontend developers. ⁓ I am one, so I'm biased here.

Dan (25:46)
Well, I'd argue that it's a little more nuanced than that, right? Because if you give a decent vision model like Gemini a mock that's been done by a professional designer, it will execute the mock reasonably well, assuming it's not doing anything too crazy. So from that perspective, you could argue that front-end development is dead. But quite frankly,

I haven't seen that style of standalone front-end development, where you take the picture and you spit out the CSS and the HTML and do nothing else, be a thing in quite some time. You know, this is like back in the days when people were making blog templates and selling them online. I guess people still do that to some degree, but like.

Shimin (26:29)
Yeah, the dot-com bubble days where all you need is to know what a tag is.

Dan (26:30)
That's yeah, that's not that's not

what front end development in 2026 means. It's like basically be full stack from the control plane on up and like, yeah.

Shimin (26:41)
I still think Google probably has the most potential among the Western labs when it comes to creating and productizing a, you know, visual-based LLM, or multimodal LLM, that works well for front-end development. But we will see about that. And also, I'm a big, big fan of Adam Argyle. He used to be the, I think he was the Chrome DevRel person for a long time. So I've been following his work for a number of years. He had a web-based podcast on CSS for a long time. And to go from Adam to Addy Osmani, we're just in Google land today. Yeah, we have a post-processing post

Dan (27:20)
Yeah, dev relations.

Shimin (27:25)
from Addy titled Your Parallel Agent Limit.

Remember last week when we were talking about what happens when your cognitive debt gets you cognitively bankrupt, and how you just collect a little bit of cognitive interest this entire time? This article talks about how you find your sustainable agent interest rate so you don't hit cognitive bankruptcy. So with a country, you have this idea, right? A country can always borrow money as long as the GDP growth is keeping up with the interest rate. So your parallel agent limit is like the right interest rate you can pay so that you're not incurring too much cognitive debt. And we've talked a lot about

the phenomenon of cognitive debt on this podcast, but we haven't really talked about how you know when you're incurring cognitive debt and what the guidelines are for determining that. This blog post first defines the problem of working with multiple agents at once, which is: the more agents you have,

the more context switching costs you have. And context switching is very expensive for a developer. And the more different the tasks are, and the longer running the tasks are, the more switching costs you're going to incur. It's this constant switching back and forth between multiple agents and multiple projects that are all long running. So every time you have to switch, you have to load a lot of context into your head, and that makes working with

multiple agents exhausting. He also talks about how, when you have multiple agents running, you have to be vigilant the whole time to make sure they don't go off the rails. So that vigilance overhead is also taxing for you. The whole thing becomes draining when you have, you know, 12 agents running at the same time and you're going back and forth. Oh, is agent C doing okay? Or is agent D stuck?

I wonder what agent E is doing at the same time. That all becomes an issue, and he calls it comprehension debt. We call it cognitive debt. The idea is the same. You can't keep up, until eventually you give up and you go off the rails. A phenomenon that was really interesting from this post is this idea that adding agents doesn't scale linearly. Like, having

one agent is really straightforward. Having two agents becomes significantly harder, but still doable. At some point though, it's almost like a geometric increase, right? At some point you hit a point where you just can no longer keep up. And that's why agent orchestration is a thing. So all that...

Dan (30:04)
But the

reason why I was giggling in the background when you said that is, I'm sitting here thinking, wow, a large percentage of my day is just context switching anyway. And yeah. And so then the other thing that popped into my head was, do you remember, it was probably about three months ago, there was this whole spate of articles from people saying, like, working with LLMs, or, you know, agentic coding,

Shimin (30:17)
So you're very skilled at it.

Dan (30:31)
is basically a manager skill. So the better you are as a manager, the more you can handle that. And I'm just looking at this thinking, yeah, that is actually kind of dead on, now that it's been put through this very nicely worded lens. It's not that different from having reports, you know: I'm stuck on this. I'm blocked. This team is whatever, you know.

Shimin (30:47)
Yeah, I think

Right, right.

It's still exhausting, just in a different way. Yeah. And so then he talks about finding your ceiling. And the way to do that is usually by blowing past it. Because when we first get a coding agent, we just want to spin up as many as we can. And then eventually it blows by your cognitive capacity to handle these agents. So.

So what do we do? What he does is he treats long agentic sessions the way he treats deep focus work. You're trying to time box each agent with explicit scopes per thread. So don't do a thing and then see the output and be like, by the way, let's also add this thing while we're at it and fix A, B, and C, because now all that is going into your cognitive context. Create, you know,

a small scope and start a new agent with completely new context if you need to. And then start with fewer threads than may feel comfortable or right to you. It's always a little better to go a little slower than a little faster, I think to counter the fact that we naturally always want to have more.

Since the total number of agents you can run at the same time is going to be very much thread dependent and context dependent to begin with, there probably isn't a natural limit for all tasks. But for me personally, on my side projects, it's probably four agents if they're working on different projects and like three if it's the same project, or the other way around.

But usually they're fairly long running. So I have like four of them: do some research, write a thing, and then do the code and do a bug fix. Like, that I can handle on the same project. Yeah. What's your ceiling?

Dan (32:38)
I don't multiplex personal projects at all. Like that's usually just one Claude.

Shimin (32:42)
Yeah, that's because you don't use

agents.

Dan (32:46)
I mean, I do, but probably not as long running, I think, as you're talking about. It's mostly more like pair programming, at least for personal stuff. For work, I tend to have anywhere from four to six agent windows open at a given time, but usually not more than

like max four are doing something at once, and typically only two are doing real work and the other ones are looking up relevant things that I'll then piece back into the other contexts, for like big sort of distributed kind of things, you know. It's like, I need to know how this system works in detail, and then this one needs to work, you know, like, yeah.

Shimin (33:12)
Alright.

Now, so

we are opposites. On my personal project, I do a lot more, here's the research task, here's the find stuff. Where at my work project, I already know the code base fairly well, so I tend to follow a lot more closely. But it seems like we're pretty much in agreement, right? Like if you're going to follow the code very closely, you'll probably end up at like two or three. If you have more research-y tasks where you can push it off with a

Dan (33:27)
Yeah, fair enough. Yeah.

Shimin (33:45)
fairly tight scope and come back and read the artifact then you can do a little more. Yeah.

Dan (33:50)
Yeah. And I found that to

be a really useful pattern too, generally, just like go out and... The other one I've been doing lately that I think is fun is I'll clone a whole bunch of repos into one place and, like, at-mention files that I think are relevant in the different repos, and then kick off a research task and just give it carte blanche to read all of them. And it's pretty interesting to see how well an agent can really comprehend

a complex distributed system that has, you know, multiple separate code bases that all work together in some way, especially without necessarily being able to see the language, right? Because I don't necessarily always have the infra repos and stuff too. So it's like, it did a pretty good job inferring how those two things work together from just a very vague, you know, prompt and some, you know, digging. So it's pretty cool.

Shimin (34:20)
Mmm.

Yeah, they're getting better

at doing autonomous research for sure.

Dan (34:45)
Yeah. Yeah.

Shimin (34:49)
yeah, next on the docket, let's talk about code is cheap and that changes everything by Pierre Villega.

Dan (34:57)
Mm-hmm with a little Kent Beck reference right to start us off with.

Shimin (34:59)
Harry.

Dan (35:02)
So there's the sort of now almost famous quote from Kent Beck: 90% of my skills went to $0 and 10% of my skills just went up a thousand x. And essentially what they're getting at with that is, the ability to have a vision and sort of chart your path through a large complex project became incredibly valuable, and the ability to stare at a screen and crank out

like nitty gritty little detail work became, I wouldn't argue worthless, but significantly less valuable than it was in the past. So yeah, the core point of this is one that I don't necessarily agree with, which is that code has always been expensive.

Shimin (35:33)
Mm-hmm.

Dan (35:50)
And I think that's where I land as a little bit of a counterpoint to it. At least my general opinion is that coding has never really been the problem. Like it's definitely a part of the job, but it's not the expensive part. But I do agree with this post in the sense that coding has gotten a lot cheaper than it was before, even. But anyway, I'm putting in too much of my own opinion. So, yeah.

Shimin (35:59)
Mm-hmm.

Dan (36:17)
Good little chunk on the, you know, LLMs are not an immediate, how can I explain this better, good-code button, right? You can't push them and expect to get great, you know, perfectly working code out of the box. It still requires being a fairly good developer yourself, looking at what's going on, deciding whether or not that's a good thing. And

allowing it in or not.

Yeah. And then, I guess, getting more into the stuff that I do agree with a lot. And this is where, you know, I've mentioned this before too, but I feel like the camps that developers are breaking down into are getting more nuanced as AI happens over time. Right. So I feel like this post kind of falls into the camp of the

product-focused developer who doesn't necessarily care as much about the code and cares more about shipping the artifact, right? In the sense that he kind of talks a little bit about, here's an example of a really successful C2C... wait, what is C2C, consumer-to-consumer sales?

Yeah. Marketplace. Okay. Yes. So a marketplace, I don't know, like Craigslist or something like that. Right. Super successful. What's in it? Is it the code, like all the, you know, JavaScript? Is it the sort of non-functional requirements, like the latency, is it up, you know, can you use it all the time, blah, blah? Or is it the fact that it's a

successful C2C sales marketplace that is the asset, right? That's the other way I kind of looked at it.

Shimin (37:49)
Mm-hmm.

I think they make a point that the definition of good code really hasn't changed. We still care about testability, maintainability, elegance of the code. But most importantly, we care about whether or not the code does the work that it's supposed to do. That part hasn't changed. And good developers will care about the technical aspects of

creating well-maintainable, well-architected systems, that part hasn't changed at all. Right. And the output at the end of the day, the thing that, the only thing that ultimately matters to the marketplace has always been whether or not the system is of value to others.

Dan (38:30)
Yeah.

Shimin (38:30)
Yeah. And my biggest takeaway from this article, well, maybe not the biggest takeaway, but the one where my heart was screaming with agreement, was this line here: but this isn't really about AI enthusiasm or AI skepticism. It's about industrialization. It has happened over and over in every sector. And the pattern is always the same. The people who industrialize outcompete those who don't.

You can buy handmade pottery from Etsy or you can buy mass-produced pottery from a store. Each proposition values different things. But if you're running a business that depends on pottery, you better understand the economics. And that I think gets at the heart of the issue. Like we can still handcraft fun weekend projects, but the marketplace that pays us our good salaries for now cares about good

code that produces economically valuable systems for people. And that part is always the same.

Dan (39:26)
Right, yeah, it's really about the business

value. ⁓

Shimin (39:30)
Yeah, so

it doesn't hurt to learn the tools.

Dan (39:33)
Yeah. And I mean, we're seeing that, right? I think even folks that have been pretty skeptical are coming to that realization, perhaps painfully or not: now is the, well, we're a little bit past the now-is-the-time-to-learn-it point. Like, if you haven't, there's a very real chance that you will be left behind. And I'm seeing that personally as attitudes shift, right? Like there was a period of time where

I would intentionally instruct agents to remove all trace of it being agent coding, because I didn't want to be judged for using it on my work. And now it's all over the place. So there might be still people judging, but if so, they are doing it internally instead of publicly.

Shimin (40:15)
Yeah.

Silently judging. That's my favorite kind of judging. Sitting in the corner, crossing my legs. Yeah. All that. I think, you know, we started the podcast right around the time when Opus 4.5 was released, and looking back, that was probably the beginning. Yeah.

Dan (40:19)
I mean, you know.

I couldn't do it fully silently because it's a podcast.

Sea change, yeah. Yep.

Shimin (40:37)
And

now we're halfway through, somewhere along the adoption cycle. All right. Let's move on to Vibe and Tell, where I'm going to share some personal experience with Claude Code. Yeah. So I don't know if you heard, but Anthropic has been threatening to disable third-party OAuth

Dan (40:51)
Sad news.

Shimin (41:00)
for their regular subscriptions. Really so that OpenClaw is not running on a Pro or Max subscription and burning up tokens all night long. They've done that for a while. They apparently killed all the OpenClaws. So now they're moving on to Pi Agent, and Thursday or Friday last week, they included the Pi Agent system prompt as one of the

signatures they were looking for. I started getting 401s when I was trying to log in using my Claude Code subscription for Pi Agent. So I had to adapt. And what I ended up doing, instead of going to OpenAI, is I ended up just chucking all my existing Pi Agent skills and existing work

into a Claude Code instance and just told Claude, like, hey, either reference these skills or produce a skill that does what these Pi skills would do. And I can report I did not use channels. I'm just using it like a regular Claude Code instance, but with these new skills that I'd been creating using Pi. I have to say I

Dan (41:59)
Did you use channels too?

Shimin (42:11)
don't like the output as much,

potentially because I was just used to the old Pi outputs. But in theory, it's not that big of a difference. It's just some system prompt differences. But it feels like, since the old skills were created using a different harness, I could tell, and it just doesn't sit right with me. This feels like when a company does a huge redesign of their website and you're just like, this doesn't sit right. Something feels off, man.

Dan (42:23)
Mm-hmm.

I don't like it. I don't like it.

It's purple now and it used to be blue.

Shimin (42:41)
Uh, yeah, it would

do things that it didn't use to do. And I get a little ticked off. It's a little less eager. But that is fine. I'm learning to deal with the loss of Pi Agent as of now. But what I did end up doing was I switched a lot of my podcast-related workflows over to Claude Code, and I was using it to

update our existing site, which is just a boilerplate site from our hosting service, over to a brand spanking new Astro site from scratch using Claude Code this past weekend. And I did run into some of the issues that we talked about, which is that Claude is not good at doing front end design at all. And what I had to go back to was, are you familiar with the site Codrops?

Dan (43:31)
No.

Shimin (43:31)
It's like a high-end magazine for web designs. So it showcases designer portfolios. just has a lot of cool design examples and templates. Think like a coffee table magazine, but for web design.

Dan (43:47)
I've seen something

similar which is like godly I think. Something like that.

Shimin (43:52)
Yeah.

And I've had Codrops in my RSS feed for the longest time. So I went in there looking for design inspiration and I found a cool WebGL-based dithering image animation, and used that as the new background image for our new site, along with blog posts and topics and glossary and all that good stuff.

Overall a smooth experience,

but I am still mourning no longer being able to use Pi Agent with my Claude Code subscription. I need to find a way to deal with that.

Dan (44:25)
I'm glad that I didn't

invest super heavily in Pi Agent yet. I was still futzing around trying to give it a home lab home where it could run free and unfettered and, yeah, didn't get there yet.

Shimin (44:38)
Yeah, I may end up switching to like, Kimi or GLM or ZAI or something like that. I need to run some experiments to see which one I prefer.

Dan (44:44)
Well,

what I was playing with on that was, because I have a desktop with like 128 gigs of video RAM, I can run, like, decently big open-weight models. So what I was trying to do was make it so that it would run on Claude for anything that required sort of real thinking. And then

Shimin (44:54)
Mm-hmm.

Dan (45:04)
stuff like heartbeat processing or small tasks, I could just run locally. Not that it was necessarily saving anything, because it was through the subscription, but you could also look at a model like that where you just use the API when you need it, right? Like legit. And then

Shimin (45:09)
Mm.

Yeah. Well, not using

API, but Pi Agent has access to the command line. So Claude Code is just one skill that I can use.

Dan (45:30)
⁓

Interesting. I hadn't thought of it that way. Yeah. I guess that's true.

Shimin (45:35)
Whoa. Well,

look forward to next week's Vibe and Tell, guys.

Dan (45:40)
What does it mean? Five codes? A container for Claude code?

Shimin (45:44)
That's gonna be fun. Anywho, so that's been my experience working.

Dan (45:47)
How much video RAM do

you have to run a model? You have like a 40 something, right? Yeah.

Shimin (45:52)
I have a 4090 that I haven't been

playing around with, which I really should. So,

right, I'm now sleeping this week.

Dan (45:56)
That's why I'm always pushing

open weight stuff on the podcast is it's like, you know, control the means of your token generation.

Shimin (46:02)
I love that.

Okay, yeah, onto my favorite segment, Dan's Rant. Dan, what new technology do you have to talk about this week? To rant about?

Dan (46:08)
Deep breath everyone.

It's

not new technology. Through a set of circumstances that we shouldn't name, I was challenged with installing Claude Code on Windows Subsystem for Linux, which is like the least accurately named product ever, because it sort of implies that Windows is running on Linux, but really it's the other way around.

We won't get into why, but holy cow, it was the most frustrating experience I've had in a long time. And the way that I solved it actually had nothing to do with Claude and everything to do with old-school sleuthing. So the issue is WSL basically runs in its own little VM sandbox, right? And so when you install Claude Code, you just install the Linux version, which is great.

And then it tries to pop a browser. And I guess these days you can actually run browsers in WSL too, which I didn't realize. But I didn't have one installed. So it instead just poops out the whole giant, like, OAuth URL, you know, and it's like, press C to copy this, which is great. Nice little feature there. Props, Anthropic.

Shimin (47:04)
I don't see why not, yeah. Graphics interface, but yeah.

Mm-hmm.

Dan (47:17)
And you paste it into a browser and then it's like, log in if you're not logged in, do you authorize this Claude Code instance? Okay. And then it gives you a code that you paste back in, because you can't just directly authorize it, because it's running in a sandbox. So

for some reason, and I don't know why, Windows Terminal mangles the OAuth string that you paste back in. It just doesn't accept it. No matter the variant, you know, Ctrl+Shift+V, Ctrl+V, right-click paste, nothing worked. I was just like, what do I do here? So I was, you know, old-fashioned Googling it. Because the first thing you do, of course, is ask Claude, and Claude had no idea.

Which is unfortunate. So that was old-fashioned Googling, and I came across a GitHub issue for this very problem. So apparently there are other people actually running on Windows, who knew? I never thought I'd be one of them, but here we are. And the way to make it work is probably the most hilarious thing I've ever seen, and it worked. And then the funny part was, Claude left off with like a

"does this work?" when I was like, I'm going to try this. And I told it, and it was like, I'm genuinely surprised that it worked. I was like, me too, Claude. And what it was, sorry, is literally a PowerShell script that invokes typing.

Shimin (48:26)
Yeah.

Of course.

clever.

Dan (48:39)
And it types

every single character of the, I mean, first of all, it's kind of wild that that's even a thing, but basically it says sleep for three seconds and then type the auth thing into the terminal window from the clipboard, essentially. So it worked, and it even presses Enter for you at the end of the script, which I thought was cute. But it was like, well,

Shimin (48:52)
Yeah, that's very clever.

Dan (49:00)
how is it so bugged that you can't just paste the damn thing? Like, I don't understand what in the custom terminal renderer they're doing there, either on the Windows side or on the Ink side, which is, as we found out, the internals. Not React Ink, Vadim, but the other, their custom Ink that they wrote.
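
A minimal sketch of the workaround described in this rant: wait a few seconds, type the clipboard contents one character at a time, then press Enter. The script itself is not reproduced in the episode, so this is only a Python stand-in for the idea, not the PowerShell script from the GitHub issue. `send_key` is a hypothetical callback standing in for whatever OS-level keystroke API a real script would call, and the three-second default just mirrors the delay mentioned above.

```python
import time
from typing import Callable, List


def type_text_after_delay(
    text: str,
    send_key: Callable[[str], None],
    delay_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Wait, then emit `text` one keystroke at a time, then press Enter."""
    # The pause gives the user time to focus the terminal window.
    sleep(delay_seconds)
    # Sending characters individually sidesteps the terminal's paste handling.
    for ch in text:
        send_key(ch)
    # "ENTER" is a placeholder token for the Enter key (an assumption).
    send_key("ENTER")


# Demo: a list recorder stands in for a real keystroke API.
typed: List[str] = []
type_text_after_delay("abc#123", typed.append, delay_seconds=0.0)
print(typed)
```

The point of the design is that the terminal only ever sees ordinary key presses, so whatever goes wrong on the paste path never comes into play.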

Shimin (49:17)
Did you have any other issues running Claude Code in WSL? Are there any other system issues? Because then you have to mount and unmount in order to get your context in there.

Dan (49:19)
Once I got through the auth it ran fine. Yeah.

I mean, I didn't use it super extensively, so I don't know. But I'm pretty sure if you use VS Code, it spins up WSL, or you can set it to spin up WSL for you in the project folder, and it auto-mounts and all that stuff. So I think on a per-project basis, it's probably not that bad, but I couldn't tell you. That might be a future Dan's Rant if I have to keep using this.

Shimin (49:40)
okay. Yeah.

It makes sense. Dan's Rant this

week brought to you by Microsoft. Thank you, Microsoft. You are not a sponsor, but you're certainly not going to be one now after all that.

Dan (49:57)
Microsoft x Anthropic. Yeah.

Not unless you fix your damn terminal. Anyway, on to happier news, or less happy news, let's see how we go this week.

Shimin (50:10)
Alright, well, good rant.

Yeah,

let's talk about Two Minutes to Midnight, where as always, we talk about the state of the potential AI bubble using the atomic clock analogy from the Bulletin of the Atomic Scientists. We were at... Now that we're no longer worried about nuclear war. Yes, I feel so much better. Yeah, we were at a minute 45 and...

Dan (50:29)
which we can talk about again now that we're close to actually slinging nukes

Shimin (50:41)
What you got for us this week?

Dan (50:43)
Yeah. So first up is a TechCrunch article. I feel like TechCrunch is always good for the latest, greatest AI financial news. And it is entitled "The AI gold rush is pulling private wealth into riskier, earlier bets." So

the premise of the article is, you know, for a long time, VCs

have been investing in, you know, all kinds of different startups.

And I guess you can also kind of get private equity on some of those bets too. But what we're seeing now is that people are so freaked out by the relative pace of advancement in AI that they're treating it as an investment risk to not have a sufficient market

position in AI stocks. And because everyone is scrambling for that sort of private market share, the only way to grow the pie of what's available to invest in is to go earlier and, you know, down the investment chain, so to speak. So smaller, earlier, less good ideas, et cetera. And that is

you know, having a large impact on the sort of chain of money that shapes startup investment.

Shimin (51:57)
Yeah, there was a time when I worked at a small family office in New York managing money for billionaires, and I would not put it past them. That's the kind of thinking that is very in vogue right now on Wall Street.

But I think this could be a good thing when it comes to the overall impact of the bubble. If most of that money is coming from high net worth individuals, let's call them, then if the bubble does burst, it's only the high net worth individuals that will be impacted as opposed to the common stockholder, the yeoman farmers of the Midwest who are not billionaires.

Dan (52:30)
Yeah, I suppose.

I suppose that's true, but I think you also have to look at it through the lens of like, this is essentially a forcing function on the risk tolerance of the investors, right? So they,

are now being driven towards crappier bets overall just to stay in this market. And I don't see that as a good thing. I think it's the equivalent of, like, what was that really odd startup that everyone always likes to talk about, the pet one or something like that? I forget. Yeah, something like that. Webvan, yeah, that's the one I was thinking of.

Shimin (53:09)
Pets.com or Webvan.

Dan (53:15)
You know, because it's like that, where when you sit there and look at the business model in the cold light of day, you're like, what is this? You know, and I think the LLM version of that is, this is just a wrapper around something, and you're getting how much funding? You know. So

Shimin (53:28)
Yeah.

Well,

what we are saying is it's time to build your own LLM wrapper, gang. Listeners, Dan's nodding. Dan's nodding in agreement.

Dan (53:39)
Get funded, use it to buy,

use it to buy hardware so you can run your own model. I don't know.

Shimin (53:44)
Yeah. I'm surprised. Okay.

Well, my article this week is from implicator.ai, a great name. It's titled OpenAI CRO tells staff Anthropic inflates run rate by eight million... eight billion, sorry. Eight million, what is that, chump change? Exactly.

Dan (53:51)
Implicator, which is a great name.

That's Billion with a B! That'll barely even fund my AI wrapper. What are you talking about?

Shimin (54:10)
So what happened was this week the OpenAI chief revenue officer stated that Anthropic has overstated its 30 billion run rate by about 8 billion through revenue accounting tricks. And as always, we love it when the podcast talks about accounting topics, right?

Dan (54:31)
You

Shimin (54:32)
As someone who was also an ex-accountant, let's talk about this. So when OpenAI sells its models through a third-party provider, OpenAI only recognizes its net revenue. If I sell my model to Dan for $9 a model, and Dan sells it to the end customer for $10, taking $1

as a cut, OpenAI in this case will only recognize the $9. Anthropic, however, will recognize the entire $10 as its raw revenue number. Now you may ask, hey, who is right? Both are actually correct according to US GAAP. So both are valid. But

what I find actually most interesting about this news article is that OpenAI is not in a good spot if your chief revenue officer has to go on record saying Anthropic is overstating its revenue number. We know for a fact that Anthropic's revenue has been growing pretty much like mad since the whole Pentagon thing, and its enterprise lead has just kind of

Dan (55:29)
Mm-hmm.

Shimin (55:41)
broadened over time. I think Anthropic almost has as much revenue as OpenAI at this point. Maybe it's even about to surpass it. So that could be another reason why the chief revenue officer is here going, like, hey, well, it's only going to surpass it if you don't account for the cut that, like, AWS takes. But that's not a good sign. If you have to go into accounting rule details, you're in trouble. And Dan, I know you've been

Dan (55:59)
Yeah.

Shimin (56:10)
on team OpenAI, in the sense that OpenAI is the first one that is going to fall, and I am on team Oracle. So this does not look good for my bets here.
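
To make the gross-versus-net distinction from this segment concrete, here is a small sketch using the $9 / $10 figures from the example above. The class and function names are made up for illustration, and the numbers are the toy numbers from the conversation, not real figures from either company.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResaleDeal:
    """One sale made through a third-party reseller (toy numbers only)."""
    end_customer_price: float  # what the end customer pays the reseller
    reseller_cut: float        # what the reseller keeps for itself


def net_revenue(deal: ResaleDeal) -> float:
    """Net reporting: recognize only what the model provider receives."""
    return deal.end_customer_price - deal.reseller_cut


def gross_revenue(deal: ResaleDeal) -> float:
    """Gross reporting: recognize the full end-customer price."""
    return deal.end_customer_price


# The example from the episode: customer pays $10, reseller keeps $1.
deal = ResaleDeal(end_customer_price=10.0, reseller_cut=1.0)
print(net_revenue(deal), gross_revenue(deal))
```

The same sale yields two different headline numbers depending on the reporting convention, which is why two run rates are hard to compare unless they are stated on the same basis.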

Dan (56:22)
⁓

Team OpenAI.

Shimin (56:25)
Team OpenAI

is winning, if by winning you mean losing. Yes.

Dan (56:27)
Yeah,

maybe. I mean, it's like, you know, with most things, are they too big to fail? Blah, blah. But like, yeah, it is pretty interesting signal.

Shimin (56:36)
All that said, how do we feel about the two minutes to midnight clock this week?

Dan (56:42)
We're at 1:45. Neither of the two things that we've brought is good news by any means, but I'm also not super pessimistic about it, because I think that OpenAI has recognized the fact that they have an enterprise weakness right now and they're pivoting towards it actively. They've just announced, I mean, they've killed Sora, a bunch of other things. They've done a lot of little moves like this where

Shimin (56:54)
No, me neither.

Dan (57:08)
They're getting a lot more focused as a company. And I, at least in the current environment, don't see that as a bad thing.

Shimin (57:15)
Yeah. I, yeah.

Dan (57:17)
Will it pan out for them? Who knows? But I don't

think they're, you know, in immediate danger of dying.

Shimin (57:22)
I mean, this is a segment where we bring up financial news from the web about AI companies. But I think the biggest news item is actually the Anthropic Mythos news item, right? The fact that...

The Federal Reserve Bank of the United States is having active discussions along with other large banks to talk about the latest AI model. We are already in too big to fail territory.

Dan (57:46)
Yeah, that's fair.

Shimin (57:47)
Right. So, I mean, I'm happy to move this bad boy back, like,

45 seconds, even a minute, until the whole Project Glasswing thing shakes out. Because if you can hack any computer in the world, what is money?

Dan (58:01)
Yeah.

Like maybe now we're seeing the step change that people have been talking about. You know, we talk a lot about the 4.5 to 4.6-ish era step change with the Opus models and, you know, Codex 5.4 and all that, which really kind of moved the needle. And it's like, well, maybe there is a big thing and some of the anti-doomers are right. So we'll see.

Shimin (58:24)
The anti-doomers are right. Yeah. Even if it's a 25% chance.

Dan (58:26)
I don't know. I don't even know where I fall on

that spectrum. I change every week. I think that's partially why we go back and forth on the clock.

Shimin (58:37)
I'm putting my foot down. I

think we should go back. Even if it's only a 25% chance of this thing being a true step change, if it is a step change, it's so earth-shattering that the expected value is still just much higher. You know, 25% of a billion is a very large number. Yeah, I'm happy to move it back like a minute.
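
The expected-value argument here is just probability times payoff. A tiny sketch, using the 25% and one-billion figures from the conversation as stand-ins (the units are unspecified in the episode, and `expected_value` is an illustrative helper, not anything from the articles discussed):

```python
def expected_value(probability: float, payoff: float) -> float:
    """Expected value of a single all-or-nothing outcome."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * payoff


# 25% chance of a payoff of one billion (toy figures from the episode).
ev = expected_value(0.25, 1_000_000_000)
print(f"{ev:,.0f}")  # 250,000,000
```

Even at a one-in-four chance, a large enough payoff keeps the expected value large, which is the reasoning behind moving the clock back.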

Dan (58:56)
Mm-hmm.

Okay.

I'm sold. Okay.

Shimin (59:04)
2:45, 2:45 and change. All right. And we're sold. And as always, if I had like a gavel or something, that would be nice. Ka-chunk, ka-chunk, ka-chunk. Yeah.

Dan (59:14)
That's my best

gavel sound effect for you.

Shimin (59:18)
Yeah.

Well, with the sound of the gavel, that brings us to the end of the show. Thank you again for joining us for our discussion this week. If you like the show, if you learned something new, please share the show with a friend. You can also leave us a review on Apple Podcasts or Spotify. It really helps people to discover the show and we really appreciate it.

Dan (59:24)
Mmm

Shimin (59:37)
If you have a segment idea, a question for us or a topic you want us to cover, please shoot us an email at humans at adipod.ai. We'd love to hear from you. You can find the full show notes, transcripts and everything else mentioned today at www.adipod.ai. Thank you again for listening and we'll catch you next week. Bye. Dan, you've got to work on it and give me like 10 sound effect ideas by next week.

Dan (1:00:02)
I know.

Shimin (1:00:06)
I only have the laser pew pew pew pew. That's my favorite, but...

Dan (1:00:10)
Excellent. Thank you.

</details>