Sunday, June 14, 2026

The Taste You Can't Outsource

It was late, and I was doing the kind of work that never makes it into a demo: adding guardrails to my Claude Code setup. While I was in there I pulled in SkillSpector, NVIDIA's security scanner for AI agent skills. It checks a skill for malicious patterns before you let it near your machine. The docs were stale and a couple of things were broken, so I did what I do now. I asked Claude what else was missing.

It came back with two recommendations. The second one stopped me cold.

Remove the call to OSV. Add an offline mode that doesn't reach the internet.

Wait, what is OSV, and why does it even need to connect? OSV is the Open Source Vulnerabilities database, a free public service (osv.dev) that maps known security flaws to specific package versions. When SkillSpector spots a dependency, it asks OSV one question: is this exact version known to be vulnerable? That single call is how the scanner knows what "bad" looks like today, instead of whatever happened to be true the day the code was written.
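To make that one question concrete, here is a minimal sketch of what such a lookup can look like against OSV's public query endpoint. This is not SkillSpector's code: the helper names `build_osv_query` and `known_vulnerability_ids` are mine, and the timeout is an arbitrary assumption. The endpoint and the request body shape follow OSV's documented API.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name: str, version: str, ecosystem: str) -> dict:
    """Build the JSON body for OSV's query endpoint: one package, one exact version."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def known_vulnerability_ids(name: str, version: str, ecosystem: str,
                            timeout: float = 10.0) -> list[str]:
    """Ask OSV which known vulnerabilities affect this exact version (network call)."""
    body = json.dumps(build_osv_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # OSV answers with an empty object when nothing matches, so "vulns" may be absent.
    return [vuln["id"] for vuln in payload.get("vulns", [])]
```

An empty list here means "OSV answered and knows of nothing." A network failure raises instead of returning an empty list, and that difference is the whole argument of this post.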

So for a scanner whose whole job is to catch known-bad code, the call to OSV isn't a feature. It's the part that does the looking. A mode that skips it isn't a leaner tool. It's a cheeseburger with the cheese and the bun, just without the beef.

And the suggestion wasn't wrong, exactly. I pushed on it, and Claude made a coherent case: air-gapped CI, no network egress, faster runs. Every one of those is real. In a different tool it would be good advice. The model wasn't hallucinating. It was reasoning. It was just reasoning about everything except the one thing that made the tool worth building.

Build anything. In a day.

We're deep in the season of the grand claim. AI will replace engineers. You can ship a feature-complete product in an afternoon. There's a skill that turns an agent into your chief of staff, and a thread every week where someone stands up a whole app over a weekend and a thousand replies ask for the prompt.

I want to be generous, because the capability underneath is astonishing and I use it daily. But I think we're mistaking a capability demo for a product. Those builds are samples. They show what the clay can do. They are not the same as knowing what to make from it. And a product was never "what the model can build." It's what you wanted it to build. Different sentence.

The taste it can't have

The model has taste. Ask Claude to make something nice and it will. What it doesn't have, and I'd argue can't, is taste specific to you: to the single reason this thing exists and not some adjacent thing that would also be defensible.

That reason isn't in the code. It's in the point. And the point lives in your head, not the repository. So the model optimizes what it can see, like "faster" or "more flexible" or "offline," and quietly trades away the thing it can't: this is a security tool, and a security tool that doesn't check is worse than no tool, because it returns green without looking.

Here's the part that got under my skin. I'm proud of my Claude setup. It knows my preferences, my level, the work I do; it doesn't hand me the vanilla answer. By any measure it's well-grounded in me. And it still told me to unscrew the sensor. Which means this isn't a prompting problem you tune your way out of.

Knowing what to build is the job now

So who does well here? Not the fastest prompter. The person who can put on the product hat and keep the engineering skill to get it done, and knows which is which.

Knowing SkillSpector must call OSV is product knowledge. It's a judgment about what would, and wouldn't, bring value, and it's exactly the judgment the model skipped. The engineering question is what you reach for after: SkillSpector already falls back gracefully when OSV is unreachable, and that's the careful version of "offline." Deciding the database is optional is not the same as handling the day it's down. One is a product decision. The other is engineering.

And the engineering is the part I'm actually building right now. A scanner only protects you if you remember to run it, so I'm putting it in front of the door: a guardrail that checks a skill before it ever installs, and hands back a clean allow, ask, or deny. To the agent reaching for a new skill, and to the CI pipeline doing the same thing on a human's behalf. The OSV call stays non-negotiable; what gets easier is everything around it. Telling those two apart, the line you must never cross and the capability you can keep extending, is becoming the real skill. More on the build another time.
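As a rough sketch of that allow, ask, or deny gate, assuming each dependency lookup ends in one of three states: the `Lookup` enum and the `decide` function are hypothetical names for illustration, not the guardrail I am building and not SkillSpector's API. The point it shows is that "could not reach the database" is its own state and never collapses into "clean."

```python
from enum import Enum


class Lookup(Enum):
    CLEAN = "clean"            # the database answered: no known vulnerabilities
    VULNERABLE = "vulnerable"  # the database answered: at least one known vulnerability
    UNVERIFIED = "unverified"  # the database could not be reached, so nothing was checked


def decide(lookups: list[Lookup]) -> str:
    """Collapse per-dependency lookup results into one verdict: allow, ask, or deny."""
    if Lookup.VULNERABLE in lookups:
        return "deny"
    if Lookup.UNVERIFIED in lookups:
        # An unreachable database is not a green light; hand the decision to a human.
        return "ask"
    return "allow"
```

In this sketch a skill with no dependencies yields an empty list and is allowed, on the assumption that there was nothing to look up in the first place.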

What I'm not saying

This isn't an "AI is overhyped" piece; I don't believe that, and the story doesn't support it. The model found real bugs in that library, fixed the stale docs in seconds, and its other recommendation was good. I shipped it. On the how, it's a genuine force multiplier.

But the harder a thing is to write down, and the reason a tool exists is almost impossible to write down, the longer it stays ours. So I screwed the sensor back in, kept the OSV call, and left the dangerous advice on the floor. At the end of a late night of plumbing I didn't feel threatened. I felt useful. The model could build almost any version of that tool I asked for. It just needed me to know which one was worth building.


Ideated and dictated by me, written by Claude

Thursday, June 4, 2026

A brief history of plugin.json

The evolution of Claude Code's plugin manifest system into a full-fledged dependency management engine.

1. The Era of Fragmentation (Late 2025)

When Anthropic first stabilized the Claude Code CLI and the Model Context Protocol (MCP), extensibility was highly fragmented.

  • MCP Servers handled external tools (APIs, databases, filesystems).
  • Settings files (.claude/settings.json) handled rules and configurations.
  • Skills (like custom prompts) lived as standalone instruction documents.
  • No unified concept of a "packaged extension" existed; workflows had to be manually wired together via scripts.

2. The Birth of the Manifest: .claude-plugin/plugin.json (Early 2026)

Anthropic introduced the Claude Code Plugin Architecture to unify components into a singular standalone structure.

  • Plugins bundled skills, custom sub-agents, hooks, and local .mcp.json tool declarations.
  • Introduced plugin.json as a passive metadata descriptor handling basic identity and simple version strings.
  • Versioning remained loose, resolving primarily by binding a project path to whatever git SHA happened to be HEAD at runtime.

{
  "name": "deploy-kit",
  "description": "Handles infrastructure provisioning and AWS EKS hooks",
  "version": "1.0.0"
}

3. The Broken Cache & Chaos Crisis (Spring 2026)

As enterprise adoption scaled, major operational and environmental cracks emerged in production environments.

  • Non-Deterministic Environments: Shifting git SHAs caused different team members to resolve different variations of the same plugin.
  • Cache Nightmares: Broken builds cached locally (~/.claude/plugins/cache/) persisted indefinitely due to a lack of native self-healing mechanisms.
  • Silent Failures: Master plugins relying on utility plugins had no mechanism to declare relationships, leading to fragile manual setup guides.

4. The Modern Era: Pure Parity & Version Constraints (June 2026)

Anthropic rolled out native Plugin Dependency Resolution (v2.1.143+), transforming plugin.json into an active package manager specification heavily inspired by Node's package.json.

  • Graph Enforcement: The CLI actively blocks disabling a plugin if active core systems rely on it, tracking the entire transitive dependency graph.
  • Cross-Marketplace Guardrails: Auto-installing dependencies across disparate marketplaces is locked down by default to prevent supply-chain attacks.
  • Strict Release Tagging: Enforced tag pushing (claude plugin tag --push) matches git tags directly to the manifest version.

{
  "name": "deploy-kit",
  "version": "3.1.0",
  "dependencies": [
    "audit-logger",
    {
      "name": "secrets-vault",
      "version": "~2.1.0"
    }
  ]
}

This structural shift marks the transition of AI tools away from loose "prompt scripts" and into production-grade, tightly governed software engineering components.

All content above is provided courtesy of Gemini. None of the claims are mine.

Claude Code Shipped a Dependency Manager, and I Think It Is a New Frontier

Claude shipped a plugin dependency manager

Claude Code recently shipped dependency management for plugins, and as someone working on making collaboration better, I feel very excited about it. It means a skill can now be built on top of another skill. It means a platform team can publish a foundation and a hundred other teams can build on it without copying a single file. An entire layer of reuse that simply was not possible a few weeks ago is now sitting in the changelog, described in one modest sentence. I read it twice, and then I sat back, because I have seen this exact moment before in other corners of our industry, and I know what comes after it. It is a new frontier. Think about the introduction of pom.xml, or of package.json.

A quick word on how this post came together. I worked through most of this thinking out loud with Claude itself, tracing the feature from the docs into the actual GitHub issues, arguing about what counts as a dependency manager and what does not. Some of the sharpest framing here came out of that back and forth. I mention it because the irony is not lost on me that I used the tool to understand the tool.

Why those two files are the right comparison

The reason I reach for Maven and npm is that I have leaned on both, and the contrast between them taught me what dependency management actually buys.

In Maven, a dependency is an intent rather than a list of files.

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
  <version>3.2.0</version>
</dependency>

I ask for one artifact and Maven brings in dozens, because Spring Boot declares its own dependencies, and those declare theirs, all the way down. The transitive set arrives without my touching it. The project owns a pom.xml, and when the build runs, Maven reads that file and provisions the world it describes. The project is the thing that sets resolution in motion.

npm expresses the same idea, with one addition I have come to value.

{
  "dependencies": {
    "express": "^4.18.0",
    "pino": "~9.0.0"
  }
}

Those carets and tildes are quietly profound. They are a contract about change. The caret welcomes compatible updates but refuses to cross a major boundary in silence, and the tilde is stricter still, accepting only patch releases. When I install, npm resolves every package's constraint, shares a single version wherever the ranges overlap, nests a separate copy where they do not, and writes the exact resolved set into a lockfile so the next machine reproduces it down to the patch. The constraint expresses tolerance for change, and the lockfile expresses intolerance for surprise, and a mature ecosystem needs both.
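The contract behind those two symbols is small enough to sketch. This is a simplified illustration in Python, not npm's implementation: it assumes full `major.minor.patch` versions and handles only the `^` and `~` prefixes, and the function names are my own.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Parse a full 'major.minor.patch' string into a tuple that compares numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def bounds(range_spec: str) -> tuple[tuple[int, int, int], tuple[int, int, int]]:
    """Return (inclusive lower, exclusive upper) for a '^x.y.z' or '~x.y.z' range."""
    operator = range_spec[0]
    major, minor, patch = parse(range_spec[1:])
    if operator == "~":
        # Tilde: patch releases only.
        upper = (major, minor + 1, 0)
    elif operator == "^":
        # Caret: anything that leaves the left-most non-zero component unchanged.
        if major > 0:
            upper = (major + 1, 0, 0)
        elif minor > 0:
            upper = (0, minor + 1, 0)
        else:
            upper = (0, 0, patch + 1)
    else:
        raise ValueError(f"unsupported range: {range_spec}")
    return (major, minor, patch), upper


def satisfies(version: str, range_spec: str) -> bool:
    """True when the version falls inside the range."""
    lower, upper = bounds(range_spec)
    return lower <= parse(version) < upper
```

With the ranges from the snippet above, `satisfies("4.19.2", "^4.18.0")` is true and `satisfies("5.0.0", "^4.18.0")` is false, while `~9.0.0` accepts `9.0.5` but rejects `9.1.0`.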

Both files, underneath the syntax, do the same four things. They let a consumer declare what it needs, version what it needs, resolve the transitive graph so one install pulls in everything below it, and reproduce that resolution elsewhere. Declare, version, resolve, reproduce. That quartet is what turned two configuration files into the foundation of entire economies, and it is the lens I want to hold Claude Code up against.

What actually shipped

https://github.com/anthropics/claude-code/issues/64457

A plugin now declares its dependencies in its manifest, in a form anyone who has opened a package.json will recognize.

{
  "name": "platform-observability",
  "version": "2.3.0",
  "description": "Shared observability conventions and tooling",
  "dependencies": [
    "audit-logging",
    { "name": "metrics-core", "version": "~2.1.0" }
  ]
}

There are two forms, and the choice between them is the same lesson npm teaches. A bare name accepts whatever the marketplace currently offers. An object with a version field pins a semantic version range, and the highest release that satisfies it is selected.
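For anyone writing tooling around these manifests, the two forms reduce to one shape. A small sketch, assuming only the `name` and `version` fields shown in the manifest above; the `normalize` helper is hypothetical and not part of Claude Code.

```python
from typing import Optional, Union

Dependency = Union[str, dict]


def normalize(entry: Dependency) -> tuple[str, Optional[str]]:
    """Return (name, version_range); the range is None for a bare-name dependency."""
    if isinstance(entry, str):
        return entry, None
    return entry["name"], entry.get("version")
```

A `None` range then reads as "whatever the marketplace currently offers," and anything else is a range that still has to be resolved.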

Now a higher-level plugin builds on that foundation.

{
  "name": "team-incident-toolkit",
  "version": "1.0.0",
  "dependencies": [
    { "name": "platform-observability", "version": "^2.0.0" }
  ]
}

Someone installs the leaf.

claude plugin install team-incident-toolkit@acme-marketplace

The runtime reads the manifest, sees the declared dependency, and pulls in platform-observability automatically, and with it the things that plugin in turn depends on. One command, the entire closure. The top of the tree arrives with the tree attached. Versions are anchored to git tags on the marketplace repository, and when two installed plugins constrain the same dependency, the runtime intersects their ranges and chooses the highest version satisfying both, raising a clear error when no version can satisfy everyone rather than quietly loading something that will misbehave later. That last behavior is the one I respect most, because the failures that hurt are never the loud ones.
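The behavior described above (walk the closure, intersect the ranges, take the highest version that satisfies all of them, fail loudly when none does) can be sketched in a few dozen lines. This is my own simplified model, not Claude Code's resolver: the `catalog` structure stands in for a marketplace, only `^` and `~` ranges on full versions are handled, and ranges contributed by a version that is later replaced are kept rather than backtracked.

```python
class ResolutionError(Exception):
    """Raised when no available version satisfies every declared range."""


def parse(version):
    """Turn 'major.minor.patch' into a tuple that compares numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, range_spec):
    """Check a version against a '^x.y.z' or '~x.y.z' range (the only forms handled here)."""
    major, minor, patch = parse(range_spec[1:])
    if range_spec[0] == "~":
        upper = (major, minor + 1, 0)
    elif range_spec[0] == "^":
        if major > 0:
            upper = (major + 1, 0, 0)
        elif minor > 0:
            upper = (0, minor + 1, 0)
        else:
            upper = (0, 0, patch + 1)
    else:
        raise ValueError(f"unsupported range: {range_spec}")
    return (major, minor, patch) <= parse(version) < upper


def pick(name, available, ranges):
    """Choose the highest available version that satisfies every range at once."""
    candidates = [v for v in available if all(satisfies(v, r) for r in ranges)]
    if not candidates:
        raise ResolutionError(f"no version of {name} satisfies {ranges}")
    return max(candidates, key=parse)


def resolve(root, catalog):
    """Resolve root and everything beneath it.

    catalog maps name -> {version: [(dependency_name, range_or_None), ...]}.
    """
    ranges = {root: []}
    resolved = {}
    pending = [root]
    while pending:
        name = pending.pop()
        version = pick(name, catalog[name], ranges[name])
        if resolved.get(name) == version:
            continue  # already resolved to this version; nothing new to add
        resolved[name] = version
        for dependency, range_spec in catalog[name][version]:
            ranges.setdefault(dependency, [])
            if range_spec is not None:
                ranges[dependency].append(range_spec)
            pending.append(dependency)  # re-check it under the tightened ranges
    return resolved
```

The loud failure is the `ResolutionError` in `pick`: when two plugins ask for ranges that do not overlap, the sketch stops instead of loading one of them and hoping.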

Three of the four properties are unmistakably here. Declaration, versioning with real semantic ranges, and transitive resolution with honest conflict handling. The fourth, reproducibility, is where I have to temper the enthusiasm, because there is no lockfile yet. Resolution happens against live tags at the moment of install, so reproducibility rests on tight pinning by convention rather than on a committed artifact that guarantees every engineer and every build agent lands on an identical set. For anyone who has come to treat a lockfile as the thing that makes a teammate's checkout match their own, its absence is the conspicuous gap, and I expect it to be among the first things to close.
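To show what that committed artifact would buy, here is a purely hypothetical sketch. Claude Code has no lockfile today, so the format, the `lockVersion` field, and both function names are invented for illustration. The idea is only that the exact resolved set gets written deterministically, so two machines can compare it byte for byte.

```python
import hashlib
import json


def render_lock(resolved: dict[str, str]) -> str:
    """Serialize an exact resolved set deterministically so it can be committed and diffed."""
    document = {"lockVersion": 1, "plugins": resolved}
    return json.dumps(document, indent=2, sort_keys=True) + "\n"


def lock_digest(lock_text: str) -> str:
    """Fingerprint the lock so two machines can cheaply confirm they hold the same set."""
    return hashlib.sha256(lock_text.encode("utf-8")).hexdigest()
```

Sorting the keys is the design choice that matters here: the same resolved set always renders to the same text, regardless of the order in which plugins were resolved.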

It is worth noting that this did not arrive fully formed, and the community is actively shaping it. The original request lived as issue #9444, asking for exactly this, plugins that declare dependencies on other plugins, with a shared library plugin underneath. Reading that thread is a small lesson in how these features mature in the open. And the rough edges are being found the same way. The version resolution itself was reported broken in issue #64457 and, as I write this, is only partially fixed, with local folder marketplaces still misbehaving, tracked in issue #65337. If you build on this today, build on a remote marketplace and verify your version pins resolve before you trust them.

So far this is a celebration with one footnote. Now I want to turn to the two things that keep me from calling this finished, because a frontier is exciting precisely because it is not yet settled.

It is Claude's, not a standard, and I hope that changes

The first thing to sit with is that this is Claude Code's dependency model, and Claude Code's alone. The manifest format, the tag convention, the resolution behavior, all of it lives inside one vendor's tooling. That is not a criticism. It is how every one of these stories begins. npm was a JavaScript thing before it was a movement. Maven was a Java thing. A dependency model almost always arrives as one ecosystem's local invention and only later, if it earns it, hardens into something the wider world agrees on.

But I find myself wishing for the standard already, because the underlying artifact is more portable than the plumbing around it. The skill itself, the folder with its instructions, is largely tool-agnostic. What differs is the wrapper. A different agent tool expresses the same dependency idea in a different manifest with a different resolver, and a team that lives across more than one tool ends up maintaining two disciplines for one set of skills. We have watched a parallel effort try to standardize the skill format itself. I would love to see the same energy reach the dependency layer, so that "this plugin depends on that plugin, in this version range" means the same thing no matter whose runtime reads it. We already have a standard for how agents talk to tools, and the field is converging on one for how agents talk to each other. A shared way to express how agent capabilities depend on one another feels like the natural third piece, and I hope someone takes it up. Until then, what we have is a very good vendor implementation, and a very good vendor implementation is exactly the seed a standard grows from.

There is no lifecycle, so the trigger is still a person

The second thing is subtler, and it is the one that took me a moment to articulate. Go back to pom.xml for a second. The reason that file is powerful is not only that it declares dependencies. It is that a lifecycle reads it. I run a build, and the build looks at the project, sees what it needs, and provisions it. The project itself is the trigger.

Claude Code has the declarations and the resolution, but not that lifecycle. The dependencies belong to the plugin, and resolution fires when a person installs a plugin, not when a repository is opened. A project can describe the plugins it wants, but nothing yet reads that description on open and provisions it the way a build reads a pom.xml. The engine is solid. What is missing is the project acting as the trigger.

In practice this means onboarding still begins with a deliberate human act. A new teammate clones a repository, and the right capabilities do not simply appear because the repository asked for them. Someone has to issue the first install. The dependency graph beneath that first install is fully automatic and genuinely impressive. The initiating step is not. It is a small manual seam stitched on top of a hard problem that has already been solved, and that ratio is exactly why I am optimistic rather than impatient. The expensive machinery is built. What remains is a convenience, and conveniences tend to follow quickly once the hard part is done.

Why I bothered to write this down

A changelog entry says plugins can declare dependencies. I read the same line and saw the thing I have learned to recognize.

The reason npm and Maven grew into economies was never the syntax. It was that dependency resolution is what turns a shared library into something a hundred teams can build on without copying it. Once that primitive exists, a platform team can own a foundation, publish it, version it, and let others compose on top while pinning the compatibility they have actually tested. That is now possible for agent capabilities, and it was not a month ago. A platform group can own a shared plugin, conventions and wiring and a few well-tested skills, and another team can depend on it in a single line and receive updates inside a range it trusts.

The question for those of us who think in systems is no longer how to share a skill, which has quietly become an ordinary solved problem. It is how to architect a layered set of capabilities the same way we architect a layered set of libraries. Stable foundations underneath, faster-moving work at the edges, version contracts between teams who do not sit together. These are old disciplines, and they have found a new substrate.

I will close with the honesty this deserves, because the ground is still moving as I write. The behavior I have described landed recently. The missing lockfile, the vendor-specific format, the install-time trigger that has not yet become a project-time one, all of it is under active iteration. What I have written is a snapshot. The durable point is not today's exact feature set but the direction, which is that agent tooling is compressing into months the same evolution that language ecosystems took years to work through. We are watching a pom.xml moment happen in real time. I do not know exactly what it grows into, and I am genuinely excited to find out.

And FWIW, I am excited about the analysis and haven't tried the feature yet.

If this resonated, I would love to hear your perspective, especially if you are an engineer or engineering leader thinking about how shared capabilities should be built and distributed in an AI-first world.

Monday, March 16, 2026

Thunderstorms and Sunshine | A Principal Software Engineer's perspective

By a Principal Engineer, March 2026

It's a mid-March morning. Sunny, a little cloudy, with a pleasant breeze in the air. I sit here with my coffee, thinking about software engineering: where it's been, where it's going, and what AI really means for those of us who have given our lives to this craft. It's a calm morning. But I know what's coming. By afternoon, severe thunderstorms. Tornadoes on watch. Schools closing early.

Tomorrow, though, will be beautiful again.

That's where we are with AI and software engineering. And I think it's worth talking about honestly. Not with hype, not with fear, but with the perspective of someone who has been in this industry long enough to have seen a few storms before.

Where It Started

I was in 8th grade when I fell in love with software engineering. Not through a class, not through a mentor. Through a GW-BASIC program I found printed somewhere, typed up by hand, and ran on a DOS machine.

On the screen, an apple tree appeared. Apples grew, circled, disappeared, grew back. It was visually enigmatic. Beautiful. And the thing that hit me wasn't "I ran a program." It was: I made something beautiful that wouldn't have existed otherwise.

That feeling is one I've been chasing ever since.

I grew up, got my education, started engineering professionally. As a junior, you learn a lot and influence little. You're a small piece of something enormous, and that's humbling in a good way. But every now and then, something happens that cements why you're here. For me, one of those moments was tracking down a bug that had been crashing production servers for months. Objective-C code. An invalid memset, hiding in plain sight. It wasn't easy to find. It took patience, stubbornness, and a refusal to accept "we don't know why it's crashing." When I finally found it and fixed it, the joy was extraordinary. Not because it was glamorous. Because it was hard, and it mattered.

Those early moments shaped how I see software engineering. It is never easy. It is never one-shot. But there is absolute joy in doing things that would otherwise be very difficult to do.

The Drift, and the Return

Over the years, I moved deeper into architecture, design, and management. My teams wrote the code. I shaped the thinking, made the calls, unblocked the hard problems. I'd still jump in when something needed to move faster than anyone else could move it. That instinct never leaves you. But the hands-on building became rarer.

Then, in February 2025, vibe coding arrived.

I'd experimented with AI-assisted coding before. You'd describe a problem and get some code back. It was interesting but inefficient. More of a curiosity than a capability. Vibe coding changed that. For the first time, I could sit down with an idea and build, really build, with AI as a genuine collaborator.

I started with a side project. And within hours, I felt it again. The same joy. The apple tree from 8th grade. The feeling of making something that wouldn't have existed without me.

Here's what I think vibe coding actually unlocked: it removed a specific kind of friction that had accumulated over years. Not the intellectual friction, which is the fun part. The volume friction. The syntax differences between languages. The boilerplate. The fact that you can see exactly what needs to exist, you know it down to your bones, but materializing it takes days. AI collapsed that gap. And in doing so, it gave back the builder's joy to people like me who had drifted away from the code.

We don't write code because writing code is the end goal. We write code to build things. To make something real out of an idea. AI made that more accessible, not less meaningful.

Speed Goes Up. Judgment Matters More.

Let's be clear about something. AI accelerating code output does not mean engineering judgment matters less. It means it matters more.

Software engineering has always been both science and art. Over decades, our industry has accumulated hard-won principles: DRY, SOLID, 12-factor. Not arbitrary rules, but distilled lessons from countless projects that went sideways. A senior engineer doesn't always recite these principles by name. But they feel them. They look at a piece of code and know, almost instinctively, whether it's going to cause pain in six months.

That instinct doesn't transfer to AI. Not yet.

Here's a real example. Recently, I was reviewing AI-generated code that needed to process log data. The AI was stuck trying to read those logs top-to-bottom. It kept grinding away at the problem that way because that's the obvious path. But anyone who looked at how that log was actually structured would immediately know: read it bottom-to-top. That's it. Problem solved. The AI couldn't see that because it was constrained by its framing of the problem. I wasn't.
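For readers who have not done it, reading a file bottom-to-top is a few lines once you decide to do it. This is a generic sketch, not the code from that review: it assumes a UTF-8 text log with `\n` line endings, and the function name and block size are mine.

```python
import os
from typing import Iterator


def read_lines_reversed(path: str, block_size: int = 4096) -> Iterator[str]:
    """Yield the lines of a text file from last to first without loading it all."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        size = position = handle.tell()
        remainder = b""
        at_end = True
        while position > 0:
            read_size = min(block_size, position)
            position -= read_size
            handle.seek(position)
            block = handle.read(read_size) + remainder
            if at_end and block.endswith(b"\n"):
                block = block[:-1]  # ignore the newline that terminates the final line
            at_end = False
            lines = block.split(b"\n")
            # The first piece may be a partial line that continues in the previous block.
            remainder = lines.pop(0)
            for line in reversed(lines):
                yield line.decode("utf-8")
        if size > 0:
            yield remainder.decode("utf-8")
```

Because it is a generator, the caller can stop at the first line it cares about, for example `next(line for line in read_lines_reversed("app.log") if "ERROR" in line)`, and never touch the rest of the file.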

Another example: a developer on my team was running into issues where an AI was failing to generate detailed outputs for a batch of work items. His instinct was to increase the context window, to just throw more at it. Classic junior mistake, honestly. The right move was the opposite: break the problem into smaller chunks, give AI manageable pieces, and work through them sequentially. The same judgment I'd give a junior engineer, I now give to AI-assisted workflows. The nature of the guidance hasn't changed. The recipient has.
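The chunking move is equally small. A sketch under assumptions: `handle_chunk` stands in for whatever call sends one batch of work items to the model, and the chunk size is something you tune, not a recommendation.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Split items into consecutive pieces of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_chunks(items: Sequence[T], size: int,
                      handle_chunk: Callable[[Sequence[T]], list[R]]) -> list[R]:
    """Run handle_chunk on each piece in order and collect the results."""
    results: list[R] = []
    for chunk in chunked(items, size):
        results.extend(handle_chunk(chunk))
    return results
```

Each call sees only a manageable piece, and the results come back in the original order, which is usually all the "bigger context window" was being asked to provide.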

AI is an amplifier. If your thinking is good, it makes you significantly more productive. If your thinking is flawed, it produces flawed code faster. The engineering judgment, the taste, the architecture, the "wait, why are we even approaching it this way" instinct: that is not being replaced. It is being put to work more than ever.

The Factory Problem

Last week I saw a post about building a software engineering factory: end-to-end automated delivery of software. The ideas were genuinely interesting and I respect the ambition. But I keep coming back to something.

I graduated as a civil engineer before becoming a software engineer. In civil engineering, once the design is fixed, it's fixed. You don't change the foundations of a bridge mid-construction. The beauty and the complexity of software engineering both arise from the same source: it is fluid.

Scope creep exists because it can exist. Agile and SAFe came into being because until people actually see software running, they don't fully know what they want. We are great as humans at imagining abstract ideas. We are not great at knowing exactly how we want them realized until we can see and touch them. Two architects in a room will have an animated, sometimes heated discussion about design tradeoffs. That's not a bug. That's the process.

A factory model will deliver something standardized. That's valuable, maybe for 40 or 60% of use cases. But the cases that stand out, the products that resonate, the software that people love, those come from someone having a point of view. A taste. An opinion about what this specific thing should feel and do and be.

AI can build. But AI cannot want. It cannot tell you which vision is worth building. It cannot feel whether something will resonate with your specific audience in your specific context. That judgment, the product mindset, the vision, the "this is what I want it to be and here's why," is irreducibly human. You wouldn't delegate that to a developer without product sense. You wouldn't delegate it to a PM who doesn't understand the users. And you shouldn't delegate it to AI either.

The engineers and leaders who will thrive in this era are the ones who develop that product sensibility alongside their technical depth. Not one or the other. Both.

The Question I Don't Have an Answer To

Here's the thing that keeps me up at night. I want to be honest that I don't have a clean answer.

Taste is earned. You don't graduate as an architect. Nobody hands you the instinct for good system design. You build it over years: through production bugs, through painful refactors, through the memset hunts and the log-reading optimizations and the countless small decisions that add up to something called experience.

If AI absorbs more and more of that work, the debugging, the boilerplate, the small architectural decisions, where does the next generation of principal engineers come from? How do you develop taste without the struggle that forges it?

I don't know. I think there will be mistakes. Some will be caught by the architects and principal engineers who still have the eye for it. Some will make it to production. Companies I deeply respect have already shipped AI-assisted code that caused real revenue impact. That will keep happening for a while.

I believe it will get better. My rough mental model is that we are at step 3 of this evolution. Step 0 was humans writing everything. Step 1 was AI assistance. Step 2 was agentic and vibe coding. Step 3, where we are now, is AI that is starting to develop a kind of flavor. A taste. It doesn't always get it right, but sometimes it does, and you can feel the difference. In five years, I think AI will write reliably good code. In ten years, production-grade code without heavy supervision. But we are not there yet. And in the gap, human judgment is not optional.

Tomorrow Will Be Beautiful

This morning started with sunshine and a pleasant breeze. By afternoon, severe thunderstorms are coming. Tornadoes. Schools are closing early.

AI is going to speed things up. It is going to cause chaos. Teams will shrink. Roles will shift. Some of what we've taken for granted about how software gets built will be upended. That disruption is real, and pretending otherwise helps nobody.

But here's what I'm confident about: software engineering as a profession is going to stay. The number of people will change. The skills that matter will shift. But the need for humans who have taste, who have a vision, who know what good looks like, who can tell the difference between code that will hold up and code that will crumble, that need is not going away. If anything, as AI makes building cheaper and faster, the premium on knowing what to build and why goes up, not down.

The thunderstorms are coming. I'm not going to pretend they won't be disruptive.

But tomorrow is going to be a beautiful day.


If this resonated, I'd love to hear your perspective, especially if you're an engineer or engineering leader navigating these same questions.