Sunday, June 14, 2026

The Taste You Can't Outsource

It was late, and I was doing the kind of work that never makes it into a demo: adding guardrails to my Claude Code setup. While I was in there I pulled in SkillSpector, NVIDIA's security scanner for AI agent skills. It checks a skill for malicious patterns before you let it near your machine. The docs were stale and a couple of things were broken, so I did what I do now. I asked Claude what else was missing.

It came back with two recommendations. The second one stopped me cold.

Remove the call to OSV. Add an offline mode that doesn't reach the internet.

Wait, what is OSV, and why does it even need to connect? OSV is the Open Source Vulnerabilities database, a free public service (osv.dev) that maps known security flaws to specific package versions. When SkillSpector spots a dependency, it asks OSV one question: is this exact version known to be vulnerable? That single call is how the scanner knows what "bad" looks like today, instead of whatever happened to be true the day the code was written.
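For the curious, that one question is small enough to sketch. The snippet below is a minimal illustration of asking OSV's public query endpoint about a single package version, not SkillSpector's actual code; the function names, the default ecosystem, and the timeout are assumptions made for the example.

```python
import json
import urllib.request

# Public OSV query endpoint.
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name, version, ecosystem="PyPI"):
    """Build the JSON body for one 'is this exact version vulnerable?' question."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def vuln_ids(osv_response):
    """Extract advisory IDs from a response; an empty object means no known vulns."""
    return [v["id"] for v in osv_response.get("vulns", [])]


def query_osv(name, version, ecosystem="PyPI", timeout=10.0):
    """Hypothetical helper: send the query and return the matching advisory IDs.

    Raises an OSError subclass (e.g. urllib.error.URLError) if OSV is unreachable.
    """
    body = json.dumps(build_osv_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return vuln_ids(json.load(response))
```

Calling `query_osv("jinja2", "2.4.1")` would return whatever advisory IDs OSV lists for that version on the day you ask, which is exactly the point.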

So for a scanner whose whole job is to catch known-bad code, the call to OSV isn't a feature. It's the part that does the looking. A mode that skips it isn't a leaner tool. It's a cheeseburger with the cheese and the bun, just without the beef.

And the suggestion wasn't wrong, exactly. I pushed on it, and Claude made a coherent case: air-gapped CI, no network egress, faster runs. Every one of those is real. In a different tool it would be good advice. The model wasn't hallucinating. It was reasoning. It was just reasoning about everything except the one thing that made the tool worth building.

Build anything. In a day.

We're deep in the season of the grand claim. AI will replace engineers. You can ship a feature-complete product in an afternoon. There's a skill that turns an agent into your chief of staff, and a thread every week where someone stands up a whole app over a weekend and a thousand replies ask for the prompt.

I want to be generous, because the capability underneath is astonishing and I use it daily. But I think we're mistaking a capability demo for a product. Those builds are samples. They show what the clay can do. They are not the same as knowing what to make from it. And a product was never "what the model can build." It's what you wanted it to build. Different sentence.

The taste it can't have

The model has taste. Ask Claude to make something nice and it will. What it doesn't have, and I'd argue can't, is taste specific to you: to the single reason this thing exists and not some adjacent thing that would also be defensible.

That reason isn't in the code. It's in the point. And the point lives in your head, not the repository. So the model optimizes what it can see, like "faster" or "more flexible" or "offline," and quietly trades away the thing it can't: this is a security tool, and a security tool that doesn't check is worse than no tool, because it returns green without looking.

Here's the part that got under my skin. I'm proud of my Claude setup. It knows my preferences, my level, the work I do; it doesn't hand me the vanilla answer. By any measure it's well-grounded in me. And it still told me to unscrew the sensor. Which means this isn't a prompting problem you tune your way out of.

Knowing what to build is the job now

So who does well here? Not the fastest prompter. It's the person who can put on the product hat, keep the engineering skill to get it done, and tell which is which.

Knowing SkillSpector must call OSV is product knowledge. It's a judgment about what would, and wouldn't, bring value, and it's exactly the judgment the model skipped. The engineering question is what you reach for after: SkillSpector already falls back gracefully when OSV is unreachable, and that's the careful version of "offline." Deciding the database is optional is not the same as handling the day it's down. One is a product decision. The other is engineering.
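The difference between "optional" and "down today" can be made concrete. This is a minimal sketch, assuming a hypothetical `check_dependency` wrapper and three made-up status names; it is not how SkillSpector implements its fallback. The idea is that an unreachable database produces an explicit "unverified" result and never a silent "clean".

```python
from enum import Enum
from typing import Callable, List, Tuple


class Status(Enum):
    CLEAN = "clean"            # OSV answered: no known vulnerabilities
    VULNERABLE = "vulnerable"  # OSV answered: at least one advisory
    UNVERIFIED = "unverified"  # OSV could not be reached, so nothing was checked


def check_dependency(lookup: Callable[[], List[str]]) -> Tuple[Status, List[str]]:
    """Run one vulnerability lookup and classify the outcome.

    `lookup` is any zero-argument callable that returns advisory IDs and
    raises an OSError subclass (timeout, DNS failure, refused connection)
    when the database is unreachable.
    """
    try:
        advisory_ids = lookup()
    except OSError:
        # Graceful fallback: keep running, but report that no check happened.
        return Status.UNVERIFIED, []
    if advisory_ids:
        return Status.VULNERABLE, advisory_ids
    return Status.CLEAN, []
```

An offline mode, by contrast, would never call `lookup` at all, and the only honest status it could return is `UNVERIFIED`.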

And the engineering is the part I'm actually building right now. A scanner only protects you if you remember to run it, so I'm putting it in front of the door: a guardrail that checks a skill before it ever installs, and hands back a clean allow, ask, or deny. To the agent reaching for a new skill, and to the CI pipeline doing the same thing on a human's behalf. The OSV call stays non-negotiable; what gets easier is everything around it. Telling those two apart, the line you must never cross and the capability you can keep extending, is becoming the real skill. More on the build another time.
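As a rough illustration of that allow, ask, or deny gate, here is a minimal sketch. The mapping from scan results to decisions is an assumption made for the example, as are the status strings and the `decide` function; the real policy may well differ.

```python
from typing import Iterable


def decide(statuses: Iterable[str]) -> str:
    """Collapse per-dependency scan statuses into one install decision.

    Assumed policy (hypothetical):
      - any "vulnerable" result      -> "deny"
      - anything else not "clean"    -> "ask" (includes "unverified")
      - everything "clean", or empty -> "allow"
    """
    seen = set(statuses)
    if "vulnerable" in seen:
        return "deny"
    if seen - {"clean"}:
        # Unverified or unrecognized results go to a human, never to green.
        return "ask"
    return "allow"
```

The design choice worth noticing is the middle branch: when the check could not run, the gate asks instead of allowing.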

What I'm not saying

This isn't an "AI is overhyped" piece; I don't believe that, and the story doesn't support it. The model found real bugs in that library, fixed the stale docs in seconds, and its other recommendation was good. I shipped it. On the how, it's a genuine force multiplier.

But the harder a thing is to write down, and the reason a tool exists is almost impossible to write down, the longer it stays ours. So I screwed the sensor back in, kept the OSV call, and left the dangerous advice on the floor. At the end of a late night of plumbing I didn't feel threatened. I felt useful. The model could build almost any version of that tool I asked for. It just needed me to know which one was worth building.


Ideated and dictated by me, written by Claude
