Systems

The Loop That Guards the Loop

A package-update loop with nothing in front of it is not automation, it is an open door with a schedule attached. We found that out the expensive way.

One of our own npm packages, @hasna/mementos, got hijacked at version 0.14.33. Not a dependency three levels down: ours, published under our name, compromised at the registry. The fix shipped as a one-line commit: "reclaim @hasna/mementos npm at 0.14.34 after the 0.14.33 hijack." Terse, no drama, because the drama had already happened and the only useful response left was to take the package back and move on. But it changed how every automated update on every machine works from that day forward.

The rule that came out of it is simple to say and easy to skip if you are not paying attention: the package-update loop runs after the supply-chain-watch loop, never before. Updating dependencies without checking for active attacks first is how you install the attack yourself, on schedule, with logging, looking exactly like routine maintenance. A hijacked package does not announce itself. It just waits for the next automated bump to pull it in.

Saying the rule is not the same as enforcing it, so it got enforced structurally instead of politely. We built a conversations space called loops, with loops-security underneath it for supply-chain-watch, keyleak-scan, and dependency-audit. One of the fleet's own agents, hadrian, created the space, and its description is blunt about the hierarchy: post every run's outcome, incidents get posted immediately, and package-update agents read here before updating anything. That last clause is not a suggestion sitting in a memory file somewhere an agent might forget to check. It is a read the update loop is required to perform before it is allowed to write anything to a lockfile. The guard loop is upstream of the loop it guards, by construction, not by good behavior.
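A minimal sketch of what "upstream by construction" can look like, not the fleet's actual code. The `LoopPost` shape, the `may_update` name, and the 60-minute freshness window are all assumptions made for illustration; the only thing taken from the text is the ordering itself: no clean, recent supply-chain-watch post, no lockfile write.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LoopPost:
    """One message in the shared security space (hypothetical shape)."""
    loop: str            # e.g. "supply-chain-watch"
    posted_at: datetime
    incident: bool       # True when the run reported an active incident


def may_update(
    posts: list[LoopPost],
    now: datetime,
    max_age: timedelta = timedelta(minutes=60),  # assumed freshness window
) -> tuple[bool, str]:
    """Decide whether the package-update loop may touch a lockfile."""
    watch = [p for p in posts if p.loop == "supply-chain-watch"]
    if not watch:
        return False, "no supply-chain-watch run has been posted"

    latest = max(watch, key=lambda p: p.posted_at)
    if now - latest.posted_at > max_age:
        return False, "latest supply-chain-watch run is stale"
    if latest.incident:
        return False, "latest supply-chain-watch run reported an incident"
    return True, "watch loop has spoken and it is clean"
```

The update loop would call something like this first and bail out on `False`. The default answer is deny: silence from the watch loop is treated the same as bad news.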

The original instruction, typed in full, is worth keeping exactly as it was written, because the ordering was never framed as a nice-to-have: "for the package update make sure it checks for any supply chain attacks first, so it must run after the supply chain attack loop, and for each skill the agent must post its work in conversations." Three security loops share that one upstream space today: supply-chain-watch for known attacks against packages we actually use, keyleak-scan for anything that looks like a credential sitting somewhere it should not be, and dependency-audit for the slower-moving CVE and license class of problem. None of the three gets to work in isolation. All three post where the rest of the fleet can see them before anyone downstream acts on their assumptions.

The watch loop runs every 30 minutes, scanning the security feeds and the ecosystems we actually depend on. Not once a day, and not only on demand when someone remembers to ask. Frequency is part of the guard. An attack that gets published and exploited inside a 24-hour window does not wait for a human to open a laptop and think of the word audit. A loop that fires every half hour catches an active hijack inside the same window the attacker is using to spread it, instead of finding out about it during the next scheduled reading of a security newsletter.

The instructions inside the watch loop itself are the part I am proudest of, because they resist the obvious failure mode of security automation: alerting on everything. The skill's own text says it plainly — you are a security analyst, not a script. A headline about a compromised package is only an incident for us if we actually install the thing. Scanning every scary blog post and forwarding it as an alert is bloat, and bloat in a security channel is worse than bloat anywhere else, because it teaches everyone downstream to stop reading the channel. The signal we actually want is the intersection of two facts: actively exploited, and present in our lockfile. Anything short of that intersection gets noted, not escalated.
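The intersection rule is small enough to write down. This sketch only encodes the final decision, not the analyst judgment that feeds it. The `Advisory` and `Verdict` types and the flat name-to-version lockfile mapping are assumptions for illustration, and "present in our lockfile" is read here as "pinned at an affected version".

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ESCALATE = "escalate"  # post as an incident immediately
    NOTE = "note"          # record in the channel, no incident


@dataclass(frozen=True)
class Advisory:
    """Hypothetical, already-analysed advisory record."""
    package: str
    affected_versions: frozenset[str]
    actively_exploited: bool


def triage(advisory: Advisory, lockfile: dict[str, str]) -> Verdict:
    """Escalate only when both facts hold: exploited and installed."""
    installed = lockfile.get(advisory.package)
    in_lockfile = installed is not None and installed in advisory.affected_versions
    if advisory.actively_exploited and in_lockfile:
        return Verdict.ESCALATE
    # Everything short of the intersection is kept, just not escalated.
    return Verdict.NOTE
```

Deciding whether `actively_exploited` is true is still the hard part, and that is the part a keyword match cannot do.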

That split is deliberate and it maps to the same principle behind every loop we run: the schedule is deterministic, the judgment is not. A cron entry can tell an agent to check every thirty minutes. It cannot tell the agent whether a given CVE actually touches anything we ship. That decision needs a model reading our actual dependency tree against the actual advisory, not a keyword match against a security mailing list.

Nothing that gets filtered out actually disappears, either. A headline that does not clear the actively-exploited-and-in-our-lockfile bar still gets noted in the channel, it just does not get escalated as an incident. Dependency-audit, the slower of the three security loops, is where that lower tier of signal accumulates and gets reviewed on its own cadence. The point of the filter is to protect the pace at which real incidents get attention, not to throw information away.

Below the loop layer sits a tool built specifically for this problem: shield, an AI-powered scanner for git repos with supply-chain attack detection. Its check-package command is built to run against real, named packages — axios, litellm, and the kind of dependency that has actually been hit by a supply-chain attack before, not a hypothetical one — cross-referenced against known npm and PyPI incidents, plus results from tools like Trivy. It also ships a pre-push hook that blocks the push outright if it finds exposed secrets in what you're about to send upstream. That is the guard sitting at the earliest possible point: before code leaves your machine, not after it lands in a registry.

The blunt version of why any of this matters showed up as a plain instruction, not a policy document: check and make sure we do not expose any code or credentials or leak anything across organizations, check for supply-chain attacks recently, and keep fixing issues, not just report them. Across organizations is the detail that raises the stakes past a single leaked token. A fleet running dozens of repos, several company divisions, and agents with real wallet and infrastructure access is not guarding a hobby project. The credentials sitting in that fleet's secrets store move real infrastructure and, in some cases, real money. A keyleak-scan that misses one exposed key does not cost you an afternoon of embarrassment, it costs you the thing the key protects.

The same discipline runs on the commit side independent of any one tool. A staged secrets scan runs before every commit and every push, full stop, and if it finds a credential it gets removed from the diff before anything continues. Never expose credential values is not a suggestion either — it is one of the small number of rules marked non-overridable, the kind that survives every other reorganization of how the fleet works. skill-keyleak-scan exists to make that habit automatic instead of a thing you remember to do under deadline pressure, which is exactly when you forget.

When the scan does find something, the response is not a ticket queued for later. It stops the commit cold, strips the offending value out of the diff, and only then lets the work continue. There is no path where a credential ships now and gets rotated later. Later is how credentials end up indexed by a scraper before anyone rotates anything.
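A rough sketch of the staged-scan idea, assuming the input is unified diff text such as the output of `git diff --cached`. The patterns are illustrative only and are not what skill-keyleak-scan or shield actually match. `Finding` and `scan_staged_diff` are hypothetical names. Findings deliberately carry the file and the rule, never the value.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; a real scanner needs a far larger set.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "assigned-secret": re.compile(
        r"(?i)(?:api[_-]?key|secret|token|password)\w*\s*[:=]\s*"
        r"['\"][^'\"]{12,}['\"]"
    ),
}


@dataclass(frozen=True)
class Finding:
    path: str
    rule: str  # which pattern fired; the matched value is never stored


def scan_staged_diff(diff: str) -> list[Finding]:
    """Return one finding per (added line, rule) match in a unified diff."""
    findings: list[Finding] = []
    path = "<unknown>"
    for line in diff.splitlines():
        if line.startswith("+++ "):
            # File header, e.g. "+++ b/src/app.py" -> "src/app.py"
            path = line[4:].removeprefix("b/")
            continue
        if not line.startswith("+"):
            continue  # only lines being added can introduce a new leak
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append(Finding(path, rule))
    return findings
```

A commit or push hook built on this would exit non-zero whenever the list is non-empty, which is what makes "rotate it later" impossible rather than discouraged.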

This is not hardening against a hypothetical. One remediation task reads exactly like what it is: a critical-severity ticket to fix Next.js and transitive dependency vulnerabilities surfaced by a bun audit, tagged security, supply-chain, nextjs, audit, filed and worked like any other piece of technical debt, not treated as a fire drill. The instruction behind it was equally plain — check for supply-chain attacks recently and fix all issues, keep going until they are fixed, not until a report gets written about them. A report is not a fix.

There is a whole package domain devoted to making this kind of decision reusable instead of one-off: guardrails, reusable guardrail and policy decisions for agentic systems — allow, deny, warn, redact, or gate for approval, applied uniformly across actions, prompts, shell commands, MCP calls, browser use, and model routing. The point of building it as its own domain, instead of scattering ad hoc checks through every app, is that a policy decided once — block this pattern, redact this field, require approval for that action — applies everywhere an agent operates, not just in the one script someone remembered to guard. Decide the policy once, in one place owned by one package, and every surface that touches an agent inherits it automatically instead of hoping every app author remembered to copy the same check.
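To make "decide once, apply everywhere" concrete, here is a small sketch with the five decisions named above. It is not the guardrails package's API. The `Rule` type, the example patterns, and the strictness ordering used to resolve multiple matches are all assumptions for illustration.

```python
import re
from dataclasses import dataclass
from enum import IntEnum


class Decision(IntEnum):
    # Ordered least to most restrictive; this ordering is an assumption.
    ALLOW = 0
    WARN = 1
    REDACT = 2
    GATE = 3   # hold for human approval
    DENY = 4


@dataclass(frozen=True)
class Rule:
    pattern: re.Pattern
    decision: Decision


# One policy table, defined in one place (example rules are hypothetical).
RULES = (
    Rule(re.compile(r"\brm\s+-rf\s+/"), Decision.DENY),
    Rule(re.compile(r"\bnpm\s+publish\b"), Decision.GATE),
    Rule(re.compile(r"AKIA[0-9A-Z]{16}"), Decision.REDACT),
)


def decide(text: str, rules: tuple[Rule, ...] = RULES) -> Decision:
    """Return the most restrictive decision among all matching rules."""
    matched = [rule.decision for rule in rules if rule.pattern.search(text)]
    return max(matched, default=Decision.ALLOW)
```

Every surface, whether a shell command, a prompt, or an MCP call, would pass its text through the same `decide` call, so adding a rule to the table changes behavior everywhere at once.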

The loop framework guards itself the same way it guards everything else. OpenLoops ships with MCP mutations off by default, and turning one on requires an exact confirmation string, not a friendly yes. A framework that schedules recurring automated work is, by definition, a framework that can be talked into recurring automated damage if any model in the room decides that is the helpful thing to do. Requiring an exact string instead of an approval-shaped sentence closes the gap where a model talks itself, or gets talked, into treating a destructive action as routine.
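The difference between an exact string and an approval-shaped sentence fits in a few lines. This is a sketch of the idea, not OpenLoops' real interface: the phrase format and both function names are invented for illustration.

```python
from typing import Optional


def confirmation_phrase(loop_name: str) -> str:
    """The one string that unlocks mutations (hypothetical format)."""
    return f"ENABLE MUTATIONS FOR {loop_name}"


def mutations_enabled(loop_name: str, supplied: Optional[str]) -> bool:
    """Off by default; on only for a byte-for-byte match."""
    if supplied is None:
        return False
    # No trimming, no case folding, no "close enough".
    return supplied == confirmation_phrase(loop_name)
```

With this shape, `"yes"`, `"sure, go ahead"`, and a lowercased copy of the phrase all leave mutations off. Only the exact string turns them on, and it has to name the loop it applies to.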

The uncomfortable truth about all of this is that a security loop is worth exactly as much as your confidence that it actually ran. A supply-chain watch that silently stopped firing three weeks ago is worse than no watch at all, because it gives you the feeling of coverage without the coverage. That is a separate problem from writing the loop in the first place — it is the problem of trusting your own scheduler — and it gets its own answer elsewhere. Here, the answer is narrower: the loop that changes your dependencies does not get to run until the loop that watches for attacks has spoken, and the loop that watches for attacks does not get to cry wolf on every headline it reads. Everything else is decoration.

The mementos hijack was cheap in the end: one version number, one afternoon. It could have cost a lot more if the package-update loop had already been running unguarded when it happened, because the same automation that makes a fleet self-maintaining is exactly the automation an attacker wants to ride in on. An attacker does not need to breach your machine if your own update loop will happily pull the compromised version in for them on schedule, with your own commit history vouching for it. Guard the loop that changes things with a loop that watches first. Then make the watching loop smart enough to shut up about things that do not matter. That is the whole design, and it is deliberately no more complicated than that.
