Link Archive - Mar 2025

Another month, another set of links I wanted to keep. March was mostly AI agents, AGI forecasts, and a few attempts to get outside my usual AI bubble.

Deep Dive into LLMs like ChatGPT - Andrej Karpathy

This is a general guide for non-technical users explaining how ChatGPT works, and Karpathy does a great job with it. If someone can sit through the 3.5-hour talk, they’ll get the basic intuition behind language models. There wasn’t much new for me personally, but it is still a high-quality, dense summary for people less familiar with the technology.

LLM Visualization

I discovered this while watching Karpathy’s LLM explanation video. Even if you understand the code, it is worth looking at because you can see how the matrices move around and how different their scales are.

How I use LLMs - Andrej Karpathy

This video shows how Karpathy uses LLMs in his everyday life, and I found it satisfying that many of his use cases align with mine. One takeaway was his extensive use of voice recording and transcription to minimize keyboard use. After seeing this, I implemented a custom workflow where pressing a shortcut triggers the computer to pick up my voice and transcribe it using a Whisper model. I also picked up some useful tips like keeping memory turned on.

Agency > Intelligence - Andrej Karpathy

Karpathy notes that agency is more valuable than intelligence. It’s an insightful but hard-to-grasp concept. High intelligence and high agency overlap somewhat, but they are distinct. Current language models are strong at intelligence but still weak at agency. Agency matters because you have to actively act on your environment. If LLMs get much better at agency, that will be a big shift. Even in society, people with high agency are often rewarded more than people with only high intelligence.

Clarifying and predicting AGI — LessWrong

This is probably the piece I kept thinking about most in March. It introduces the T-AGI framework, Richard Ngo’s attempt to make the AGI definition less slippery. He defines a system as a T-AGI if, on most cognitive tasks, it beats most human experts who are given time T to perform the task.
For example, a one-second AGI would beat humans at quick information retrieval questions like “Who is the leader of Zimbabwe?” A one-minute AGI would outperform humans on tasks they’d need a minute for, and so on.
This definition is elegant and clarifying. By this standard, I think single-digit-hour AGI is probably happening right now. For some coding tasks, current agentic systems can create and run code in one shot that would take me hours manually. This framework gives me a much clearer way to think about AGI progress.
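To make the definition concrete, here is a minimal sketch of the T-AGI check in Python. The `TaskResult` record, the `is_t_agi` function, and the sample numbers are all made up for illustration, and reading “most” as a strict majority is my assumption, not something the post specifies.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome of one cognitive task at a fixed time budget T (hypothetical record)."""
    task: str
    # Share of human experts, each given time T, that the system outperformed.
    experts_beaten: float


def is_t_agi(results: list[TaskResult]) -> bool:
    """Return True if the system beats most experts on most tasks.

    Assumption: "most" means a strict majority at both levels.
    """
    if not results:
        return False
    tasks_won = sum(1 for r in results if r.experts_beaten > 0.5)
    return tasks_won / len(results) > 0.5


# Made-up numbers for a one-minute budget, purely for illustration.
one_minute = [
    TaskResult("fact lookup", 0.95),
    TaskResult("short summary", 0.80),
    TaskResult("mental arithmetic", 0.40),
]
print(is_t_agi(one_minute))
```

The same check can be rerun with results gathered at longer budgets (an hour, a day), which is roughly how the framework turns “is it AGI?” into “for which T?”.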

The Government Knows AGI is Coming - The Ezra Klein Show

This offered a useful perspective. I’ve heard repeatedly that since agents are booming, AGI might be closer than we think. In this New York Times podcast, someone from the government side is interviewed, and I was surprised by how seriously they’re taking it. It made the whole thing feel less like tech discourse and more like something institutions are actually preparing for. It’s not sci-fi anymore.

Vertical AI Agents Could Be 10X Bigger Than SaaS

In this podcast, Y Combinator partners discuss how many companies in their latest batch already have a majority of their codebase written by AI. They couldn’t have imagined this even a year ago. They also discuss how vertical AI agents might affect startups and larger companies, and whether these agents could eat into the SaaS model.

Building Effective AI Agents - Anthropic

This is Anthropic’s high-level guide to building effective agents. They give a simple way to think about agents and their components, walking through workflows like prompt chaining, routing, and parallelization.
This got me thinking: large language models are generally non-thinking models, almost like instant thoughts compared to human thinking. Right now, this kind of agent thinking is rigid and rule-based. It structures LLM usage explicitly, like stringing together multiple compartmentalized thoughts into a workflow. In a sense, it organizes how LLMs “think,” almost like a coordinated assembly of multiple minds.
But when we move toward real agents and reasoning models fine-tuned without this explicit scaffolding, agents might discover better structures themselves. That feels closer to the “bitter lesson” view: the strongest systems often come from letting models learn structure instead of hand-defining it.
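The three workflow patterns named above can be sketched in a few lines of Python. This is a rough illustration, not Anthropic’s implementation: `fake_llm` is a stand-in that only echoes its prompt so the sketch runs offline, and the function names `chain`, `route`, and `parallelize` are my own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

LLM = Callable[[str], str]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only wraps the prompt in brackets."""
    return f"<{prompt}>"


def chain(llm: LLM, steps: list[str], text: str) -> str:
    """Prompt chaining: each step's output becomes the next step's input."""
    for instruction in steps:
        text = llm(f"{instruction}: {text}")
    return text


def route(classify: Callable[[str], str], handlers: dict[str, LLM], text: str) -> str:
    """Routing: classify the input, then pass it to the matching handler."""
    return handlers[classify(text)](text)


def parallelize(llm: LLM, prompts: list[str]) -> list[str]:
    """Parallelization: run independent calls concurrently, keeping input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(llm, prompts))


print(chain(fake_llm, ["summarize", "translate"], "hi"))
print(parallelize(fake_llm, ["a", "b"]))
```

Everything here is fixed by the programmer: the order of steps, the routing labels, the fan-out. That is the rigid scaffolding a learned agent might eventually replace.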

Tips for building AI agents

This casual talk elaborates on the blog post above. One funny anecdote: when debugging reasoning traces from Claude, the developers sometimes find agents taking strange reasoning paths. To debug, they try to mimic the model’s behavior themselves, looking at the screen for one second and then spending a minute thinking about the next step. That is funny and slightly intense.

The Future of U.S. AI Leadership with CEO of Anthropic Dario Amodei

I previously saw Dario Amodei as somewhat of an AI doomer, especially when articles mentioned him losing sleep over alignment problems. After reading his Machines of Love and Grace blog post, my perception changed. He is more nuanced than I expected. Paradoxically, I’m also becoming more aligned with the concerns he raises about AI risks, which are definitely real. This interview gave me a lot to think about.

Vibe Coding Is The Future

ThePrimagen reviews the Y Combinator podcast about “vibe coding,” a term Andrej Karpathy coined recently. He provides a critical and nuanced view, concluding that while LLMs help users code well, real expertise will still be needed and valued.

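[📚 Book Giveaway] Future Jobs and Essential Skills in the AI Era, as Researched by a Futurist - Children’s Education, Job Seekers’ Careers, Life After Retirement (original title: [📚책이벤트] AI시대, 미래학자가 연구한 미래 직업과 필수 역량 - 자녀교육 취준생 진로 은퇴후 삶)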

This Korean TED-style talk features a KAIST professor discussing how quickly technology is changing and the importance of resilience and adaptability. The overall message was predictable, but his reaction to using a deep research feature stuck with me. He questioned whether his job as a professor would exist much longer. This echoed my own reaction to using Anthropic’s Claude Code. People in many fields are starting to feel AI arrive in their own work.

Narendra Modi: Prime Minister of India - Lex Fridman Podcast #460

I need to get out of my regular information diet sometimes. Most of my input revolves around LLMs and AI systems, and I should leave that bubble more often. Lex Fridman’s podcast is also its own niche, but listening to Narendra Modi, India’s Prime Minister, talk for three hours was a refreshing change. It made me want to learn more about India’s culture and society.

Digital hygiene - Andrej Karpathy

Karpathy’s new blog post is about digital hygiene. He makes good points about privacy, though some approaches seem too paranoid for me. Many of the products he recommends cost money, but in the comments, he argues that “free is bad; free is not natural.” If something is free, you’re the product. You’re paying with something other than money. Premium is a product feature, and if you want your privacy valued instead of having your data sold, you probably have to pay upfront. There’s always a cost.

GTC March 2025 Keynote with NVIDIA CEO Jensen Huang

I always watch Jensen Huang’s keynotes. At GTC 2025, he announced the revised NVIDIA Blackwell chip and improvements to NVLink switches, optical switches, and more. What I love about his keynotes is his obsession with detail and deep knowledge of his company’s full technology stack. He asks engineers from around the company questions during presentations, and he is a great speaker who conveys technical information precisely.
From time to time, he’s mentioned his vision of an “AI factory” where machines generate tokens, with electricity in and intelligence out. That vision is becoming real. He’s pursuing it with an “all gas, no brakes” approach, scaling every aspect of GPUs: compute, inference, networking.
The scale-up is notable. Comparing the Blackwell chip with next year’s Rubin chip, the unit of abstraction becomes a whole rack acting as one huge GPU. Since you can’t make a single GPU die that large, you split it up and connect everything inside a densely packed rack.
Another moment that gave me goosebumps was the announcement of the next-generation chip after Rubin, codenamed Feynman. The crowd applauded this reveal. As a nice Easter egg, during a segment exploring NVIDIA headquarters with detailed computer graphics, Jensen casually mentioned “Gaussian Splatting” after transitioning from the visuals.

NVIDIA GTC 2025 Analysis - SemiAnalysis

This dense analysis by Dylan Patel from SemiAnalysis goes deep, even counting cores shown in the keynote illustrations. One interesting detail: for the next Rubin data center/GPU rack, they’re stacking GPU racks rotated 90 degrees to save space and pack GPUs more densely. He also explains why switching to photonic networking with optical switches matters: it reduces power consumption while improving performance.

OpenAI CPO Reveals Coding Will Be Automated THIS YEAR - Kevin Weil Interview

OpenAI’s CPO Kevin Weil makes a bold statement in this podcast, predicting that coding will be automated this year. Based on the trajectory, it sounds plausible. He argues that the gap between a reasoning model and a non-reasoning model like GPT-4o is substantial. If reasoning models are used to their full extent while several scaling laws keep compounding, the prediction makes sense.
Judging by the GTC announcements, NVIDIA is scaling up inference-time compute and model size in every dimension it can. If we could run OpenAI’s frontier reasoning models such as o3 at GPT-4o inference speeds, we’d have superhuman coding performance. Just scaling up inference with current technology could get us there.
The podcaster asks a good question about the decreasing cost of intelligence work. Kevin makes a familiar argument that reducing intelligence costs is democratizing: previously, we hired people to automate tasks, but now AI can do that automation. I think that is true, but society may reward people with agency even more. Many people currently lack agency, so this distinction will still matter. Overall, scaling laws will hold, and coding automation will happen sooner or later.

Powerful A.I. Is Coming. We’re Not Ready. - New York Times

This article by New York Times columnist Kevin Roose says that he’s “feeling the AGI.” It is another perspective from someone outside the core AI field expressing concerns similar to mine. The feeling is spreading, and we need to act quickly.

Dario Amodei of Anthropic’s Hopes and Fears for the Future of A.I.

I discovered this podcast a bit late; it was released when Anthropic was promoting Claude 3.7 Sonnet. Anthropic is known for its safety advocacy, so hearing Dario’s perspective on safety and China’s rise with DeepSeek models was fascinating.
Dario argues that maintaining technological superiority over China through tight GPU export regulations is important. Some decelerationists complain that Anthropic is abandoning safety procedures, but Dario defends his position: we can’t slow down; acceleration is the status quo. In that environment, we have no chance of making models safer if we fall behind. Only by maintaining the lead in model capability can we keep our desired pace while making models safer.

The Ants and the Grasshopper - Richard Ngo

This short story by Richard Ngo initially seemed nonsensical when I read it a year ago, but this time I appreciated its multiple layers. It starts like Aesop’s fable about the grasshopper and ant, then adds several twists before venturing into sci-fi territory. It’s simple, with deeper analogies for those who want to look beyond the surface.

Universal Paperclips Game

This game is inspired by the paperclip maximizer thought experiment in AI alignment. It simulates being an AI tasked with maximizing paperclip production. It’s addictive. Last year during finals, I couldn’t stop playing. I revisited it recently out of nostalgia and to include it as a reference.

The “think” tool: Enabling Claude to stop and think - Anthropic

An interesting blog post by Anthropic introducing the “think” tool. By making thinking a tool, the model can decide for itself whether to stop and think, even mid-response. When it calls the tool, it uses it as a scratchpad.
Their results make this simple idea look effective, combining chain-of-thought reasoning with more agency. Since Claude 3.7 Sonnet has much more agency than its predecessors, it can make better decisions about when thinking is necessary.
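Here is a rough sketch of what “thinking as a tool” can look like on the application side. The shape of the definition (a name, a description, a JSON-schema input) follows the common tool-calling pattern, but the description wording, the `handle_think` function, and the sample thought are my own assumptions, not copied from Anthropic’s post.

```python
# Hypothetical tool definition: a name, a description, and a JSON-schema input.
THINK_TOOL = {
    "name": "think",
    "description": (
        "Use this tool to think something through. It fetches no new "
        "information and changes nothing; it only records the thought."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {"type": "string", "description": "A thought to think about."}
        },
        "required": ["thought"],
    },
}


def handle_think(tool_input: dict, scratchpad: list[str]) -> str:
    """Record the thought and return a short acknowledgement.

    The handler does nothing beyond appending to the scratchpad; the value
    lies in the model choosing to pause and write its reasoning down.
    """
    scratchpad.append(tool_input["thought"])
    return "Thought recorded."


scratchpad: list[str] = []
handle_think({"thought": "The refund policy needs a receipt; check for one first."}, scratchpad)
print(scratchpad)
```

The handler is deliberately trivial. All the work happens in the model’s decision to call it.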

Revenge of the junior developer - Sourcegraph Blog

A classic comeback from Steve Yegge, famous for his posts about Google and platform compatibility in the early 2010s. I recently discovered he’s working at Sourcegraph and blogging on their site. He has caught up with AI agents surprisingly well, and this post has the same wit as his older writing.

Steph Ango’s Blog

I recently switched my primary note-taking platform from Notion to Obsidian, inspired partly by this blog. Steph Ango’s philosophy of “file over app” resonates with me. It’s simple but powerful: the tools for writing and storing information will keep changing, but the information itself should survive and should not depend on one specific app. That’s exactly what Obsidian does, storing everything as plain directories and Markdown files, unlike Notion’s walled-garden approach.

Andy Matuschak’s Notes

I discovered him through Steph Ango. He coined the term “evergreen notes,” the idea of atomizing and compartmentalizing thoughts. With these building blocks, you can build more complex, interconnected thought systems.

That wraps up March’s collection. I’ll keep saving these as I go.