The stateless agent problem isn't a bug, it's the default design! Every session starts from zero. Every tool call is amnesiac. Every "autonomous" agent needs you to supervise it because it can't build on what it learned yesterday.
I spent the last year trying to fix that. Finally, I built Genesis:
- Memory that compounds: 4-layer hybrid (Qdrant vectors + SQLite FTS5 + knowledge graph + relational), Reciprocal Rank Fusion across all four in parallel. Day 180 is architecturally different from day 1 because the memory system is actually different.
- A six-stage learning pipeline: not thumbs up/thumbs down. Outcome classification, causal attribution (why did this happen?), delta measurement, observation extraction, procedural learning with Laplace-smoothed confidence. Runs after every session automatically.
- Earned autonomy, not toggled autonomy: 7 levels per action category, Bayesian regression on failure. Trust is granular and earned through evidence.
- Background cognition that leverages free-tier compute: while idle, it researches topics you'll ask about tomorrow, audits its own memory for contradictions, tests whether its learned procedures still hold up.
- Two systems monitoring each other: one on the host VM, one inside the container. If the system breaks itself, the host-side system spawns a Claude Code session to diagnose and fix it. If the host-side system fails, the container detects the stale heartbeat and restarts it over SSH.
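The Reciprocal Rank Fusion step in the first bullet has a standard form; a minimal sketch, assuming each of the four layers returns a ranked list of memory IDs (the k=60 constant is the textbook default, not necessarily Genesis's value, and the IDs here are made up):

```python
def rrf(ranked_lists, k=60):
    """Fuse ranked result lists: score(d) = sum over lists of 1/(k + rank)."""
    scores = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical hits from the four layers, queried in parallel:
fused = rrf([
    ["m1", "m2", "m3"],   # Qdrant vector search
    ["m2", "m4"],         # SQLite FTS5 keyword search
    ["m3", "m2"],         # knowledge-graph traversal
    ["m1", "m5"],         # relational lookup
])
print(fused)  # "m2" ranks first: it appears in three of the four layers
```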
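"Laplace-smoothed confidence" in the learning-pipeline bullet presumably means add-one smoothing over a procedure's success/trial counts; a sketch of that idea, not necessarily the project's exact formula:

```python
def procedure_confidence(successes, trials):
    # Laplace (add-one) smoothing: keeps confidence away from the 0.0/1.0
    # extremes when a learned procedure has only a few observations.
    return (successes + 1) / (trials + 2)

print(procedure_confidence(0, 0))   # 0.5: no evidence yet, stay agnostic
print(procedure_confidence(9, 10))  # ~0.83, not the overconfident raw 0.9
```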
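"7 levels per action category, Bayesian regression on failure" can be read as a Beta-posterior trust estimate gating the level for each category; an illustrative sketch under that assumption (the class name, the mean-to-level mapping, and the 0-6 scale are mine, matching the post's seven levels but not its implementation):

```python
class ActionTrust:
    """Per-category trust: a Beta posterior over the success rate
    gates an autonomy level from 0 (always ask) to 6 (fully autonomous)."""

    def __init__(self):
        self.a, self.b = 1, 1  # uniform Beta(1, 1) prior

    def record(self, success):
        # Conjugate update: successes bump a, failures bump b.
        if success:
            self.a += 1
        else:
            self.b += 1

    def level(self, max_level=6):
        mean = self.a / (self.a + self.b)  # posterior mean success rate
        return min(max_level, int(mean * (max_level + 1)))
```

A failure immediately drags the posterior mean down, so the level regresses without a manual toggle: ten successes reach level 6, and a single failure afterward drops the category back to 5.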
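The stale-heartbeat check in the last bullet reduces to an mtime comparison; a sketch assuming a file-based heartbeat (the threshold and function name are hypothetical, and the real system talks over SSH rather than a shared file):

```python
import os
import time

STALE_AFTER = 90  # seconds of silence before the watcher declares failure

def heartbeat_is_stale(path, now=None, stale_after=STALE_AFTER):
    """The monitored process touches `path` periodically; the watcher on the
    other side of the container boundary checks how old the file is."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # a missing heartbeat file counts as stale
    current = now if now is not None else time.time()
    return (current - mtime) > stale_after
```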
Genesis uses Claude Code as the reasoning engine. The thesis: CC already had the brain. The missing piece was everything else: memory, learning, reflection, autonomy, continuity.
MIT licensed.
The biggest remaining gap: autonomous action--the ego cycle that gives the system coherent self-directed behavior--is still in private testing. Everything else is live. You can tell Genesis to do something autonomously and it will execute, but the system that autonomously decides what to do without being told is not yet released.
One thing I still don't know: six months from now, when the system has been compounding across thousands of sessions and directing its own development in ways I didn't explicitly write, how would I know it's growing in the right direction? Right now I supervise and correct. But as it earns more autonomy, that feedback loop gets harder and harder to close. I don't have the answer to that yet. I'm just building the infrastructure to facilitate that and hoping that Genesis will know what's best for itself.
Clone it. Run it. Tell me what breaks.
"Earned autonomy"...
These kinds of pareidolic, co-dependent projections emitted by today's AI cognoscenti will be studied a hundred years from now for clues about how the advent of television skrod a republic starting with Nixon and ending in Trump.