Good automation, bad automation
The same technology compounds in one workflow and corrodes in the next. Five tests separate them — none require understanding the models.
The wrong question
“Should we use AI?” is a question about identity, and identity questions produce identity answers: all in, or all out. Both failure patterns we track on this site begin exactly there — with a company answering at the level of who it is rather than at the level of what each workflow needs.
The useful question is smaller and asked more often: is this task suited to automation, and what happens when the automation is wrong? Asked per workflow, it produces boring, defensible answers. Asked once for the whole company, it produces press releases.
The five tests
Cost of a wrong answer. Reversibility. The escape hatch. The feedback loop. Volume versus judgment. Each test can be scored in an afternoon by anyone who knows the workflow, without a data scientist in the room. A workflow that passes four or five is a candidate; a workflow that fails three or more is a liability, regardless of how good the demo looked.
The tests are deliberately model-agnostic. Models improve quarterly; the structure of a workflow — who bears the cost of an error, whether a human can take over with context, whether corrections feed back into quality — changes slowly. Strategy built on the stable part survives the release cycle.
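The pass/fail rule above can be written down as a short sketch. The five test names come from the article; the function name, the dictionary layout, and the "borderline" label for a workflow that passes exactly three tests are assumptions made for illustration, since the article does not say how to treat that case.

```python
# The five tests named in the article. Identifiers are illustrative.
TESTS = (
    "cost_of_wrong_answer",
    "reversibility",
    "escape_hatch",
    "feedback_loop",
    "volume_vs_judgment",
)


def classify(results: dict[str, bool]) -> str:
    """Classify one workflow from its five pass/fail results.

    `results` maps each test name to True (pass) or False (fail).
    """
    missing = set(TESTS) - set(results)
    if missing:
        raise ValueError(f"unscored tests: {sorted(missing)}")

    passed = sum(1 for test in TESTS if results[test])
    if passed >= 4:
        return "candidate"   # passes four or five
    if passed <= 2:
        return "liability"   # fails three or more
    return "borderline"      # assumption: the article leaves three passes undefined


# Hypothetical scoring of a single workflow: passes everything except reversibility.
example = dict.fromkeys(TESTS, True)
example["reversibility"] = False
print(classify(example))
```

Keeping each result a plain boolean reflects the article's point that anyone who knows the workflow can fill it in; nothing in the sketch depends on model quality.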
Good automation is boring
The automations that compound are almost embarrassingly unglamorous: first drafts, ticket triage, reconciliation, internal search, meeting summaries. Nobody keynotes them. They win because errors are cheap and visible, humans stay in the loop, and every correction makes the system better the following week.
Their economics are also quieter than the headlines suggest. The return is rarely a headcount story; it is cycle time, error rates, and the senior people who stop doing rote work. Firms that insist on measuring these automations in redundancies usually break the very loop that made them work.
Bad automation is a press release
Automations chosen for narrative value invert every test: they are customer-facing, carry a high error cost, and are hard to reverse. They are hard to reverse because the announcement has already been made: rolling back a system is cheap, but rolling back a story is not. And because the announcement is the point, the tests get skipped.
The pressure to choose these projects is structural, not foolish. When the market pays a premium for the word, boards ask why the word is missing. The defense is procedural: run every proposed automation, however strategic, through the same five tests, and let the ones that fail become roadmap items instead of launches.
Takeaways
- 01 Never answer “should we use AI” at the company level. Answer it per workflow, with the five tests.
- 02 Score workflows on the stable properties (error cost, reversibility, feedback), not on model quality, which changes quarterly.
- 03 The compounding automations are boring by nature; if a project is exciting, check whose story it serves.
- 04 Measure automation in cycle time and error rates first. Headcount framing breaks the feedback loop that makes automation good.
How to apply this on Monday
- Score your ten highest-volume workflows against the five tests; the ranking, not the scores, is the deliverable.
- Automate the top two. Leave the rest alone until the first two are measured against their human baseline.
- Re-score quarterly. Model releases change which workflows pass — that is the whole review.
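The Monday steps can be sketched as a ranking exercise. This is a minimal illustration, not a prescribed tool: the workflow names and pass counts below are hypothetical placeholders, and breaking ties alphabetically is an assumption the article does not make.

```python
def rank_workflows(passes: dict[str, int]) -> list[str]:
    """Order workflows from most to fewest tests passed.

    `passes` maps a workflow name to how many of the five tests it passed.
    Ties are broken alphabetically (an assumption, only to make the order stable).
    """
    return sorted(passes, key=lambda name: (-passes[name], name))


def pick_to_automate(passes: dict[str, int], n: int = 2) -> list[str]:
    """Return the top `n` workflows of the ranking; the article suggests two."""
    return rank_workflows(passes)[:n]


# Hypothetical pass counts, for illustration only.
passes = {
    "ticket_triage": 5,
    "meeting_summaries": 4,
    "internal_search": 4,
    "customer_refunds": 1,
}

print(rank_workflows(passes))
print(pick_to_automate(passes))
```

The ordered list is the deliverable here, which matches the article's advice that the ranking matters more than the individual scores. Re-running the same function each quarter with fresh pass counts gives the quarterly review.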
Cases are anonymized composites: patterns assembled from public filings, court records, interviews and post-mortems, with identifying details changed. We analyze patterns, not people.