I have shipped products at Amazon for nine years. I sat in standups, signed off on PRs, watched engineers argue about deployment safety, read RCAs after outages. I thought I understood the discipline.
Today I spent the whole day undoing three weeks of consequences that proved I did not.
This is the writeup of one day untangling the messes a solo founder makes when they have the intellectual knowledge of engineering discipline but none of the institutional guardrails that enforce it. Every single one of these mistakes would have been caught in five minutes by a real team. None of them were caught for weeks, because the team is one person, and the one person was me.
The Day the Deploy Script Lied
Between April 24 and May 17, I shipped backend changes to my side project most days. My deploy-backend.sh reported success every time. It even had a verification step at the end.
This morning I realised the verification step was lying.
The script was pushing a new Docker image to ECR and calling update-service --force-new-deployment, which rolls the running task. But it never registered a new task-definition revision, so the image pinned in the task definition never changed. Every roll landed on the same April 24 image. Three weeks of code changes never reached a user.
In a real eng org, somebody would have noticed. A QA pass. A second engineer asking, “wait, did the SHA change?” A customer raising a bug that should have been fixed last week. Solo, the only check is whether I personally remembered to run aws ecs describe-services after each deploy. I didn’t. Not once in 23 days.
Fixed with two PRs. The script now SHA-pins the image, registers a new task-definition revision, and verifies the running task is on that revision before exiting. The verification is what makes it safe. Before today, “verification” meant “the script exited 0.”
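The final check can be sketched roughly as follows. This is a minimal illustration, not the actual deploy-backend.sh: the function names and the cluster, service, and ARN arguments are hypothetical placeholders, and it assumes the AWS CLI is installed and configured.

```shell
# Hypothetical sketch of a post-deploy check; not the real deploy-backend.sh.

# Succeeds only if at least one task-definition ARN is passed after the
# expected one, and every ARN passed matches it exactly.
all_tasks_on_revision() {
  expected_arn=$1
  shift
  [ "$#" -gt 0 ] || return 1
  for running_arn in "$@"; do
    [ "$running_arn" = "$expected_arn" ] || return 1
  done
  return 0
}

# Asks ECS which task definition each RUNNING task of the service uses and
# compares the answers against the revision the deploy just registered.
# Arguments (all placeholders): cluster name, service name, expected ARN.
verify_deploy() {
  cluster=$1
  service=$2
  expected_arn=$3

  task_arns=$(aws ecs list-tasks --cluster "$cluster" \
    --service-name "$service" --desired-status RUNNING \
    --query 'taskArns' --output text) || return 1
  [ -n "$task_arns" ] || return 1

  # Word splitting is intentional: each ARN becomes a separate argument.
  running=$(aws ecs describe-tasks --cluster "$cluster" --tasks $task_arns \
    --query 'tasks[].taskDefinitionArn' --output text) || return 1

  all_tasks_on_revision "$expected_arn" $running
}
```

A deploy script could run aws ecs wait services-stable first, then call verify_deploy and exit non-zero if it fails. That way an exit code of 0 means the new revision is the one actually running, not merely that every command returned.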
The Branches That Quietly Drifted
I run a next branch (Amplify auto-deploys it to my staging subdomain) and a main branch (prod). Promotion is supposed to be a next-to-main PR every single time.
A few PRs ago I got impatient with a small docs change and merged directly to main. Then again with another bundle. Then once more. None of them got backported to next.
By today, the branches differed on seven files. Six existed only on main. That meant my next planned promotion would have deleted those six files from prod, because the next branch genuinely did not know they existed.
I caught it because I happened to diff the two branches before merging. Nothing structural would have stopped me from shipping the regression. On private repos, GitHub branch protection requires a paid plan. I don’t have one, so gh pr merge --admin happily merges PRs with red or pending checks. There is no enforcement.
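The manual diff that caught this can be turned into a repeatable check. The sketch below is one possible version, assuming the branches are reachable as origin/main and origin/next; the function names are hypothetical.

```shell
# List paths present in the tree of ref $1 but absent from the tree of ref $2.
# --diff-filter=D keeps only paths that disappear going from $1 to $2;
# --no-renames makes a moved file show up as a plain delete plus add.
files_only_on() {
  git diff --name-only --no-renames --diff-filter=D "$1" "$2"
}

# Pre-promotion guard: fail when main holds files that next has never seen.
# Defaults to the remote-tracking branches; both arguments are optional.
check_promotion() {
  missing=$(files_only_on "${1:-origin/main}" "${2:-origin/next}") || return 2
  if [ -n "$missing" ]; then
    printf 'Files on main that are missing from next:\n%s\n' "$missing" >&2
    return 1
  fi
}
```

Running check_promotion after a git fetch and before opening the promotion PR gives a non-zero exit whenever a direct-to-main merge was never backported.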
The Doc That Hallucinated Its Own Reality
The repo has a CLAUDE.md file, which is the doc my AI coding assistant reads to understand the project. It is supposed to be the canonical answer to “where are we?”
When I read it cold this morning, it was wrong in eight separate ways.
- It pointed at a worktree path that no longer existed
- It pointed at the wrong path for a publisher CLI
- A stale V2 commit pointer 19 commits behind reality
- A dead localhost:3001 ritual nobody ran any more
- Internal contradiction inside the same section about whether Phase 2 was done
- V2 still presented as in-flight, weeks after it shipped
- Em-dashes and emojis the folder-level rule explicitly forbids
The cost wasn’t reading time. The cost was that every Claude Code session for the past month had been building plans on hallucinated context. I was getting confident answers rooted in a version of the repo that had not existed for over a month.
Rewrote it end to end. 281 lines. +226/-156. Now reflects reality. The reason it rotted: nobody read it cold to check. There’s only me, and I am the one whose work was making it stale.
The Files That Existed Only in Finder
I had deleted two HTML mockup files locally weeks ago because they were V2 review artifacts and the review was done. I deleted them in Finder, not in git. They sat as uncommitted deletions in my working tree until today, when a routine git pull --rebase blew up because of them.
The fix took thirty seconds. The lesson is bigger: any manual operation on a tracked directory creates a future failure mode. Always. Use git rm or do not touch the file.
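A small guard can surface this kind of deletion before it breaks a pull. The following is a sketch under the assumption that it runs inside the repository; unstaged_deletions and checked_pull are hypothetical names.

```shell
# Lists tracked files that have vanished from the working tree without the
# deletion being staged (for example, files dragged to the Trash in Finder).
unstaged_deletions() {
  git ls-files --deleted
}

# Hypothetical wrapper: refuse to pull while such deletions exist.
checked_pull() {
  missing=$(unstaged_deletions) || return 2
  if [ -n "$missing" ]; then
    printf 'Deleted on disk but not in git:\n%s\n' "$missing" >&2
    printf 'Run "git rm <file>" or "git restore <file>" first.\n' >&2
    return 1
  fi
  git pull --rebase
}
```

The wrapper does not decide for you: git rm records the deletion, git restore brings the file back, and either one clears the warning.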
The .gitignore That Did Not Ignore
I started using a RESUME.md file at the repo root as a session-handoff document between Claude Code sessions. Personal session state, not project truth. I added it to .gitignore today, expecting that to be enough.
It was not. The file had been committed back in April, and .gitignore only blocks new tracking. Already-tracked files stay tracked. I had to ship a separate PR with git rm --cached RESUME.md for the rule to take effect. Then I had to manually unblock my own pull on the main clone because the incoming change was about to delete the file I was actively using.
Forty minutes of yak-shaving for a one-line gitignore that should have existed since April.
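The gap between "ignored" and "still tracked" is easy to check for. A minimal sketch, assuming it runs at the repo root and that the ignore rule is written as /RESUME.md (the leading slash anchors the pattern to the directory holding the .gitignore); the function names are hypothetical.

```shell
# Lists files that are tracked even though an ignore rule matches them.
# An empty result means every ignore rule is actually in effect.
tracked_but_ignored() {
  git ls-files --cached --ignored --exclude-standard
}

# Stops tracking a file while leaving the copy on disk untouched.
# The removal still has to be committed before other clones see it.
untrack_keep_local() {
  git rm --quiet --cached -- "$1"
}
```

Running tracked_but_ignored right after editing .gitignore shows immediately whether a follow-up untrack_keep_local is needed. Other clones that pull the resulting commit will have their working copy of the file removed, which matches the blocked pull described above.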
What This Taught a PM
Every one of these messes was preventable with discipline I have known about for over a decade. PRs need second eyes. Direct-to-main merges need same-day backports. Docs go stale and need periodic audits. Tooling needs verification, not just optimism. git rm, not Finder.
So why didn’t I do them? Because solo means there is no system enforcing them. In an eng org, the guardrails are people. The reviewer who pushes back. The QA pass. The on-call who pages you when prod drifts. The teammate who diffs your PR before you merge.
Solo, the guardrails are nothing unless you build them as code or rituals you actually follow.
The intellectual understanding and the muscle memory are not the same skill. As a PM I could have written a Confluence page on every one of these practices. As an engineer of one, I trusted velocity and skipped them.
What I Shipped Today
14 PRs of hardening across two sessions. The shape of it:
- A GitHub Actions cron that diffs main and next weekly and opens a tracking issue when they diverge
- A safe-merge.sh wrapper that warns and sleeps 10 seconds before any non-next merge to main
- A new CLAUDE.md rule that direct-to-main merges must be backported same-day
- A SHA-pinned deploy script with mandatory post-deploy verification
- A rewritten CLAUDE.md that reflects reality
- A documented session-resume convention with proper .gitignore anchoring
- The cleanup of every regression listed above
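To make the wrapper idea concrete, here is a rough sketch of how a warn-and-sleep merge guard might look. It is not the actual safe-merge.sh: the function names and the SAFE_MERGE_DELAY variable are hypothetical, and it assumes the GitHub CLI (gh) is installed and authenticated.

```shell
# True when a merge targets main from any branch other than next.
is_risky_merge() {
  head_branch=$1
  base_branch=$2
  [ "$base_branch" = "main" ] && [ "$head_branch" != "next" ]
}

# Hypothetical wrapper around "gh pr merge". The first argument is the PR
# number; any remaining arguments are passed through to gh unchanged.
safe_merge() {
  pr=$1
  shift
  head_branch=$(gh pr view "$pr" --json headRefName --jq .headRefName) || return 1
  base_branch=$(gh pr view "$pr" --json baseRefName --jq .baseRefName) || return 1

  if is_risky_merge "$head_branch" "$base_branch"; then
    echo "WARNING: merging $head_branch straight into main; backport to next today." >&2
    echo "Press Ctrl-C to abort." >&2
    # SAFE_MERGE_DELAY is an assumed override; the default is 10 seconds.
    sleep "${SAFE_MERGE_DELAY:-10}"
  fi

  gh pr merge "$pr" "$@"
}
```

The delay enforces nothing. It only creates a pause long enough to reconsider, which is about as much as a wrapper can do when there is no server-side branch protection.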
The Lesson, Plainly
For PMs thinking about building solo, especially in the AI-coding era when the velocity feels infinite: you will rebuild the entire institutional guardrail stack of your old company, or you will eat the cost of not having it.
Today was my bill. It came due all at once, on a day when I shipped zero product features. The hardening I shipped is the discipline I should have had on day one. The day to build it was always day one, and there is no point pretending otherwise.