Note: This post is based on a presentation I gave at SKKAI on March 29, 2025.

What is “vibe coding”?

The term “vibe coding” took off after a tweet from Andrej Karpathy in February 2025. He described it as a new way of coding where you “fully give in to the vibes, embrace exponentials, and forget that the code even exists.” In practice, this usually means using powerful LLMs inside tools like Cursor, often through voice commands and with very little keyboard use.

Karpathy explained the process: make simple requests, accept code changes without much review (“Accept All”), and feed error messages back to the LLM, which usually fixes them. He mentioned that the code can grow beyond easy comprehension, and bugs might get worked around rather than deeply debugged. It is less traditional coding and more directing, running, and assembling while the LLM handles the implementation details.
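The loop Karpathy describes can be sketched in a few lines. This is only an illustration under my own assumptions: `vibe_loop` and `fake_llm` are hypothetical names, and `fake_llm` is a stand-in for a real model call, returning a buggy snippet first and a fixed one once it sees an error in the prompt.

```python
import traceback
from typing import Callable


def vibe_loop(request: str, ask_llm: Callable[[str], str], max_rounds: int = 3) -> str:
    """Ask for code, run it unreviewed, and feed any error back to the model."""
    prompt = request
    for _ in range(max_rounds):
        code = ask_llm(prompt)  # "Accept All": take whatever comes back
        try:
            exec(code, {})  # run it as-is (unsafe outside a sandbox)
            return code
        except Exception:
            # Paste the error message back in and let the model try again.
            prompt = (
                f"{request}\n\nThis code failed:\n{code}\n\n"
                f"Error:\n{traceback.format_exc()}"
            )
    raise RuntimeError("no working code after max_rounds attempts")


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: buggy first, fixed after an error."""
    if "Error:" in prompt:
        return "result = sum([1, 2, 3])"
    return "result = sum([1, 2, '3'])"  # mixes int and str, raises TypeError


final_code = vibe_loop("add up 1, 2 and 3", fake_llm)
print(final_code)
```

Note that nothing in the loop checks whether the code is correct, only whether it runs without raising. That gap is exactly where the "bugs get worked around" behavior comes from.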

Karpathy is a co-founder of OpenAI and formerly led Tesla’s Autopilot vision team, so the idea is worth paying attention to. The core of it is simple: humans set high-level goals, and the LLM does the detailed work. The workflow shifts toward observing, directing, and integrating the AI’s output, sometimes without deeply understanding the code yourself.

From intelligence to agency

The natural question is: is vibe coding actually feasible? Based on recent progress, the answer is increasingly yes, though with catches. This wasn’t really practical just a few months ago. The big change is LLM agency.

To understand vibe coding, it helps to separate LLM intelligence from LLM agency. Intelligence is about knowledge, reasoning, and understanding, areas where LLMs are improving fast. As I wrote before, benchmarks like MMLU show that modern LLMs can already have superhuman knowledge in certain areas. But this intelligence often stays inside a box, like a chat window, with limited ability to interact with the outside world.

Agency is about taking action, having initiative, and controlling things in an environment. It is the LLM’s ability to do tasks and make decisions that affect the real or digital world, not just talk about them. This is where standard LLM use often stops. For LLMs to really help with complex tasks like coding, they need enough agency to act for us.

Building agentic LLMs

Developing this agency is what makes vibe coding possible. LLMs need to use tools, change files, run code, and interact with systems, not just generate text. The ability to execute tasks from instructions is the foundation.

How do we give LLMs more agency? There seem to be two main routes. The first is to build smarter models designed for agency. This means training them on data showing agentic behavior, using methods like reinforcement learning, and designing models focused on task execution rather than only language. Evaluation is shifting too: benchmarks like SWE-bench test whether a model can resolve real GitHub issues, which requires real agency. Anthropic showing Claude 3.7 Sonnet playing Pokémon is a good example of this focus on action and decision-making.

The second route is giving LLMs useful tools and ways to interact with the world. An LLM’s intelligence is limited if it can’t act. Tools connect the model to APIs, databases, file systems, web searches, and so on. The model needs reliable interfaces for getting information, running code, and changing its environment, turning “thinking” into “doing.”
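As a rough sketch of what "giving a model tools" looks like in code: the application registers ordinary functions, the model emits a structured request naming one of them, and the application runs it and hands back the result. Everything here is an assumption made for illustration (the `tool` decorator, the `dispatch` function, the JSON shape, and the in-memory `FAKE_FILES` standing in for a real file system); real products each have their own format.

```python
import json
from typing import Callable, Dict

# Registry of functions the model is allowed to call (hypothetical design).
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function under its own name so the model can request it."""
    TOOLS[fn.__name__] = fn
    return fn


# In-memory stand-in for a file system, to keep the sketch self-contained.
FAKE_FILES = {"notes.txt": "ship the demo"}


@tool
def read_file(path: str) -> str:
    return FAKE_FILES.get(path, "")


@tool
def word_count(text: str) -> str:
    return str(len(text.split()))


def dispatch(model_output: str) -> str:
    """Turn the model's structured output into an actual function call."""
    call = json.loads(model_output)
    fn = TOOLS.get(call["tool"])
    if fn is None:
        return f"unknown tool: {call['tool']}"
    return fn(**call["args"])


# Pretend the model produced this string instead of plain prose.
result = dispatch('{"tool": "read_file", "args": {"path": "notes.txt"}}')
print(result)
```

The model never touches the file system directly. It only produces text, and the application decides which requests to honor, which is what makes the access "controlled."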

Standardizing interaction with MCP

Integrating tools used to be a headache. Developers had to manually specify exactly how an LLM should use each API or system, a messy, non-standard process.

To fix this, the Model Context Protocol (MCP) was introduced. First proposed by Anthropic and now also adopted by OpenAI, MCP is an open standard for connecting AI models with external tools and data. Think of it as a “USB-C for AI applications.”

MCP works by creating a standard communication layer between two sides. On one side, applications built around models, such as those from OpenAI and Anthropic, act as MCP clients. On the other, tool providers like GitHub, Slack, or custom local tools expose their tools through MCP servers. This lets a model use any MCP tool through one interface, and each tool becomes usable by any compatible model. The flow is typically: LLM decides a tool is needed -> client sends MCP request -> tool executes -> server returns MCP result -> LLM proceeds.
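To make that flow concrete, here is a small sketch of what one round trip might look like on the wire. MCP messages are JSON-RPC 2.0, and tool invocations use a `tools/call` method. The field layout below follows my reading of the specification, so check the spec for the exact schema. The `create_issue` tool and the `fake_server` function are hypothetical, and no real transport or SDK is involved.

```python
import json


def make_tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Build the request a client would send when the model wants a tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


def fake_server(raw_request: str) -> str:
    """Hypothetical stand-in for an MCP server exposing one tool."""
    req = json.loads(raw_request)
    args = req["params"]["arguments"]
    text = f"Created issue: {args['title']}"  # the "tool executes" step
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req["id"],  # echoes the request id so the client can match it
        "result": {
            "content": [{"type": "text", "text": text}],
            "isError": False,
        },
    })


request = make_tool_call(1, "create_issue", {"title": "Fix login bug"})
response = json.loads(fake_server(request))
tool_text = response["result"]["content"][0]["text"]
print(tool_text)  # this text is what gets handed back to the model
```

Because every tool speaks the same envelope, the client code does not change when a new tool is added; only the `name` and `arguments` differ.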

It’s no accident that the places where vibe coding is starting to happen, like the Cursor editor or the Claude Code interface, rely heavily on tool access. These tools give LLMs agency by providing controlled access to the file system, terminal, web search, and other interfaces. The formula is something like: smart model + useful tools = agentic capability = vibe coding.

Limitations and the road ahead

Of course, vibe coding isn’t perfect right now.

  • Supervision Needed: You still need to watch closely. LLMs can make mistakes or get stuck.
  • Scalability: It works better for smaller projects. Complexity can become an issue.
  • Understanding: Relying only on the LLM can mean you don’t understand your own codebase well (black box risk).
  • Maturity: The tech is new and not ready for everything, especially critical systems.
  • Debugging: Fixing bugs in code you didn’t write and barely understand is hard.

But this is likely just the beginning. LLM reasoning and agency are improving, and standards like MCP are making tool integration easier. “Vibe coding” still feels experimental, but the core trend is real: LLMs are becoming agents that can handle more complex tasks. It is worth watching closely.