For an example of what an "async" agent implementation should help you accomplish: https://youtu.be/hGhnB0LTBUk?si=q78QjgsN5Kml5F1E&t=5m15s
You can use the idea to spin-off background agent tasks that can then be seamlessly merged back into context when they complete.
The example above is a product-specific approach, but the idea should be applicable in other environments. It's really an attempt to integrate long-running background tasks while continuing with the existing context in an interactive manner.
Once you start working with automation programs (AKA agents) in an interactive, human-in-the-loop fashion, you naturally run into these kinds of problems.
We've all seen sci-fi movies with AI assistants that work seamlessly with humans in a back-and-forth manner; async spin-offs are essential for making that work in practice for long-running background tasks.
Paraphrase: It's not the time, or location, or even concurrency, it's `join()`.
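In code, a minimal sketch of the same point, using plain Python threads (nothing agent-specific here):

```python
# the work runs concurrently either way; what makes the caller
# synchronous is the join(), not where or how long the work runs
import threading

t = threading.Thread(target=lambda: print("agent working in the background"))
t.start()   # non-blocking: the caller keeps going
print("caller does other things meanwhile")
t.join()    # only here does the caller block on the task
```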
I like the term "asynchronous coding agent", which I define as the category of coding agent which runs in a container somewhere and files a PR when it's done.
OpenAI Codex Cloud, Claude Code for the web, Gemini Jules and I think Devin (which I've not tried) are four examples.
I like that "asynchronous coding agent" is more specific than "asynchronous agent" - I don't have a firm idea of what an "asynchronous agent" is.
One catch, though, is that asynchronous coding agents are getting less asynchronous. Claude Code for the web lets you prompt it while it's running, which makes it feel much more like regular Claude Code.
An example of a non-coding "asynchronous agent" in my mind is something like deep research. It runs for a while in ChatGPT or Gemini, and when it's done, it produces a markdown file or google doc with its findings. The parallel with your definition of "asynchronous coding agent" is that I'm not watching it work or involved in the process.
But your last point highlights exactly why the sync/async distinction doesn't hold up as a binary classification. Even with deep research, I go back and forth on a plan synchronously before sending it off to run async. Any good asynchronous coding agent should work the same way.
"Background job"?
The real question is what happens when the background job wants attention. Does that only happen when it's done? Does it send notifications? Does it talk to a supervising LLM? The author is correct that it's the behavior of the invoking task that matters, not the invoked task.
(I still think that guy with "Gas Town" is on to something, trying to figure out how to connect up LLMs as a sort of society.)
"background job" is actually the more honest framing.
the interesting design question you're pointing at, what happens when it wants attention, is where the real complexity lives. in practice i've found three patterns:
(1) fire-and-forget with a completion webhook
(2) structured checkpointing where the agent emits intermediate state that a supervisor can inspect
(3) interrupt-driven where the agent can escalate blockers to a human or another agent mid-execution.
most "async agent" products today only implement (1) and call it a day. But (2) and (3) are where the actual value is: being able to inspect a running agent's reasoning mid-task and course-correct before it burns 10 minutes going down the wrong path.
the supervision protocol is the product, not the async dispatch.
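a minimal sketch of what (2) and (3) could look like with a plain event queue; every name here is illustrative, not any real product's API:

```python
import queue
import threading

def agent(task: str, events: queue.Queue):
    for step in ["read code", "draft fix", "run tests"]:
        # (2) structured checkpoint: a supervisor can inspect state mid-task
        events.put(("checkpoint", step, f"working on {task}"))
        if step == "run tests":
            # (3) interrupt-driven: escalate the blocker instead of grinding on
            events.put(("blocked", step, "tests need credentials"))
    events.put(("done", None, None))  # (1) fire-and-forget is just this last event

events: queue.Queue = queue.Queue()
threading.Thread(target=agent, args=("fix login bug", events)).start()

# the supervisor (human or another LLM) consumes events as they arrive
while True:
    kind, step, detail = events.get()
    print(kind, step, detail)
    if kind == "done":
        break
```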
I've written an async agent. It's triggered by an HTTP request. It does some specific processing and updates a database table with its output.
Marvin Minsky thought of it a long time before Gas Town, and yes, he was on to something.
https://en.wikipedia.org/wiki/Society_of_Mind
>The Society of Mind is both the title of a 1986 book and the name of a theory of natural intelligence as written and developed by Marvin Minsky.
>In his book of the same name, Minsky constructs a model of human intelligence step by step, built up from the interactions of simple parts called agents, which are themselves mindless. He describes the postulated interactions as constituting a "society of mind", hence the title. [...]
>The theory
>Minsky first started developing the theory with Seymour Papert in the early 1970s. Minsky said that the biggest source of ideas about the theory came from his work in trying to create a machine that uses a robotic arm, a video camera, and a computer to build with children's blocks.
>Nature of mind
>A core tenet of Minsky's philosophy is that "minds are what brains do". The society of mind theory views the human mind – and any other naturally evolved cognitive system – as a vast society of individually simple processes known as agents. These processes are the fundamental thinking entities from which minds are built, and together produce the many abilities we attribute to minds. The great power in viewing a mind as a society of agents, as opposed to the consequence of some basic principle or some simple formal system, is that different agents can be based on different types of processes with different purposes, ways of representing knowledge, and methods for producing results.
>This idea is perhaps best summarized by the following quote:
>What magical trick makes us intelligent? The trick is that there is no trick. The power of intelligence stems from our vast diversity, not from any single, perfect principle. —Marvin Minsky, The Society of Mind, p. 308
That puts Minsky either neatly in the scruffy camp, or scruffily in the neat camp, depending on how you look at it.
https://en.wikipedia.org/wiki/Neats_and_scruffies
Neuro-symbolic AI is the modern name for combining both; the idea goes back to the neat/scruffy era, the term to the 2010s. In 1983 Nils Nilsson argued that "the field needed both".
https://en.wikipedia.org/wiki/Neuro-symbolic_AI
For example, combining Gary Drescher’s symbolic learning with LLMs grounds the symbols: the schema mechanism discovers causal structure, and the LLM supplies meanings, explanations, and generalization—we’re doing that in MOOLLM and spell it out here:
MOOLLM: A Microworld Operating System for LLM Orchestration
See: Schema Mechanism: Drescher's Causal Learning
https://github.com/SimHacker/moollm/blob/main/designs/LEELA-...
Also: LLM Superpowers for the Gambit Engine:
https://github.com/SimHacker/moollm/blob/main/designs/LEELA-...
Schema Mechanism Skill:
https://github.com/SimHacker/moollm/blob/main/skills/schema-...
Schema Factory Skill:
https://github.com/SimHacker/moollm/blob/main/skills/schema-...
Example Schemas:
https://github.com/SimHacker/moollm/tree/main/skills/schema-...
One weird skill I have is the ability to describe simple concepts as complex and confusing systems. I'll have a go at that now.
When working with LLMs, one of my primary concerns is keeping tabs on their operating assumptions. I often catch them red-handed running with assumptions like they were scissors, and I’m forced to berate them.
So my ideal “async agents” are agents that keep me informed not of the outcome of a task, but of the assumptions they hold as they work.
I’ve always been a little slow recognizing things that others find obvious, such as “good enough” actually being good enough. I obtusely disagree. My finish line isn’t “good enough”, it’s “correct”, and yes, I will die on that hill still working on the same product I started as a younger man.
Jokes aside, I really would like to see:
1. Periodic notifications informing me of important working assumptions.
2. The ability to interject and course-correct - likely requiring a bit of backtracking.
3. In addition to periodic working-assumption notifications, I'd also like periodic "mission statements" - worded in the context of the current task - as assurance that the agent still has its eye on the ball.
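A hypothetical shape for those three, sketched as event types an agent could emit (none of these names come from any real agent API):

```python
from dataclasses import dataclass

@dataclass
class AssumptionReport:        # 1. periodic notification of working assumptions
    assumptions: list[str]

@dataclass
class CourseCorrection:        # 2. human interjects; agent backtracks a bit
    revised_assumption: str
    backtrack_to_step: int

@dataclass
class MissionStatement:        # 3. the goal restated in current-task wording
    goal: str

# e.g. emitted every N steps alongside the agent's normal output
report = AssumptionReport(["the config file is YAML", "tests cover the happy path"])
```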
I've never heard anyone speak of "async agents". Autonomous agents, yes. Async? No. Sounds like an information bubble, if you ask me. A quick Google Trends lookup validates this: https://trends.google.com/explore?q=async%2520agents%2Cauton...
And I agree, "async agents" makes little sense
Here's Stripe using the term today - https://x.com/stevekaliski/status/2021034048945070360?s=20
And here's Google using the term to describe Jules - https://news.ycombinator.com/item?id=44813854
So fairly large players are using “async agent” to mean something specific, which seems enough to warrant defining it. It also makes sense that it’s far less common than “autonomous agent”, since “async” is mostly used by technical folks, which is a much smaller audience. I’m definitely in that sf/swe/tech/startup information bubble, but that's where this stuff is taking off.
Sometimes senior staff misunderstand / misread a buzz word and, in their rush to make sure that everyone knows they are down with the kids, start saying "async agent" instead of "autonomous agent" and everyone just goes with it.
How about framing this in terms of two orthogonal axes the article doesn’t name: concurrency (actors) and continuity (durable execution).
* Durable execution: long‑running, resumable workflows with persistence, replay, and timeouts.
* Actors: isolated entities that own their state and logic, process one message at a time, and get concurrency by existing in large numbers (regardless of whether the runtime uses threads, async/await, or processes under the hood).
Combine the two and you get a "Durable actor", which seems close to what the article calls an “async agent”: a component that can receive messages, maintain state, pause/resume, survive restarts, and call out to an LLM or any other API.
And since spawning is already a primitive in the actor model, the article’s "subagent" fits naturally here too: it’s just another actor the first one creates.
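For concreteness, a toy durable actor along these lines - JSON-file persistence standing in for a real store, one message at a time, spawn as a primitive:

```python
import json
import pathlib
import queue

class DurableActor:
    def __init__(self, name: str):
        self.path = pathlib.Path(f"{name}.json")
        # continuity: reload persisted state, so the actor survives restarts
        self.state = json.loads(self.path.read_text()) if self.path.exists() else {}
        self.mailbox: queue.Queue = queue.Queue()

    def send(self, msg: dict):
        self.mailbox.put(msg)  # concurrency: others just enqueue messages

    def step(self):
        msg = self.mailbox.get()               # one message at a time
        self.state[msg["key"]] = msg["value"]  # the actor owns its state
        self.path.write_text(json.dumps(self.state))  # checkpoint after each message

    def spawn(self, name: str) -> "DurableActor":
        return DurableActor(name)  # a "subagent" is just another actor
```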
I like this way of thinking about it. I wish more comments would be as thoughtful as this!
hey, ishaan here (kartik's cofounder). this post came out of a lot of back-and-forth between us trying to pin down what people actually mean when they say "async agents."
the analogy that clicked for me was a turn-based telephone call—only one person can talk at a time. you ask, it answers, you wait. even if the task runs for an hour, you're waiting for your turn.
we kept circling until we started drawing parallels to what async actually means in programming. using that as the reference point made everything clearer: it's not about how long something runs or where it runs. it's about whether the caller blocks on it.
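in asyncio terms, a minimal sketch of that reference point (the sleeping task is a stand-in, not a real agent call):

```python
import asyncio

async def agent_task(prompt: str) -> str:
    await asyncio.sleep(3600)  # stand-in for an hour-long run
    return f"result for {prompt!r}"

async def main():
    # sync shape: the caller awaits immediately and is blocked for the hour
    #   result = await agent_task("fix the bug")
    # async shape: same task, same duration, but the caller stays free
    task = asyncio.create_task(agent_task("fix the bug"))
    print("caller keeps working; nothing about the task itself changed")
    print(await task)  # the caller decides when (or whether) to block

asyncio.run(main())
```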
Not to be all captain hindsight, but I was puzzled as I was skimming the post, as this seemed obvious to me:
Something is async when it takes longer than you're willing to wait without going off to do something else.
i just imagine it as the swap between "human watching agent while it runs"
vs "agent runs for a long time, tells the user over human interfaces when it's done", e.g. sends a Slack message, or something like Gemini Deep Research.
an extension would be that they are triggered by events and complete autonomously, with human interfaces only when they get stuck.
there's a bit of a quality difference rather than a strictly functional one, in that the agent mostly doesn't need human interaction beyond a starting prompt, and a notification of completion or stuckness. even if i'm not blocking on a result, it can't immediately need babying or i can't actually leave it alone
Async means I can delegate stuff to it and expect it's fixed when I come back. Also I can text it from my phone while I'm on the toilet. Very important.
OpenClaw meets this definition, but so does a 50 line Telegram wrapper around Claude Code / Codex ;)
https://github.com/a-n-d-a-i/ULTRON
Spoiler: it just pipes the msg into `claude -p $msg` or `codex exec $msg`
You can do anything if you believe
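For flavor, a hypothetical sketch of that kind of wrapper, assuming python-telegram-bot and the `claude` CLI on PATH (the token and length cap are placeholders):

```python
import subprocess
from telegram.ext import ApplicationBuilder, MessageHandler, filters

async def handle(update, context):
    # pipe the message straight into claude -p and reply with stdout
    # (blocking subprocess call; fine for a toy)
    out = subprocess.run(
        ["claude", "-p", update.message.text],
        capture_output=True, text=True,
    )
    await update.message.reply_text(out.stdout[:4096])  # Telegram's message cap

app = ApplicationBuilder().token("YOUR_BOT_TOKEN").build()
app.add_handler(MessageHandler(filters.TEXT, handle))
app.run_polling()
```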
The toilet test is definitely a winner, but "delegate from beach" has a nicer ring to it...
I'll take a stab at it. An async agent is an agent that is triggered autonomously, without direct human intervention, where its execution does not have tight temporal coupling with the other components of the system (agent or otherwise).
Practically speaking, it means they often operate within a larger system that, due to its open-ended nature, produces emergent behavior - behavior that was not explicitly designed.
that's a great stab :) We dig into this in the post, but the key distinction we landed on is that the trigger can be asynchronous without the agent itself being async. A cron job, webhook, or autonomous trigger is really about scheduling, not a property of the agent's execution model.
In other words: triggering without a human ≠ async by itself. What matters is whether the caller blocks on the agent’s work, as opposed to how or when it was kicked off.
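A tiny illustration of that distinction: an autonomous trigger whose caller still blocks. The Flask route and `run_agent` below are stand-ins, not anything from the post:

```python
from flask import Flask, request

app = Flask(__name__)

def run_agent(payload) -> str:
    return "done"  # stand-in for a long-running agent

@app.post("/webhook")
def handle():
    # the trigger is fully autonomous, yet the caller blocks right here,
    # so by the post's definition the execution is still synchronous
    return run_agent(request.get_json())
```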
- I ask for butter and walk away.
- It passes the butter to where I expect it to be when I return.
- That is its purpose.
(Rick and Morty reference: https://www.youtube.com/watch?v=X7HmltUWXgs)
That's just a slow response with extra steps.
There's also the concept of a daemon process that looks for work to do and tells you about it without being prompted.
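e.g. the classic polling shape, where `find_work` and `notify` stand in for whatever queue and notification channel you actually use:

```python
import time

def find_work():
    return None  # stand-in: poll a queue, an inbox, a repo, ...

def notify(item):
    print("unprompted: found", item)  # stand-in: Slack, email, pager, ...

while True:
    item = find_work()
    if item is not None:
        notify(item)
    time.sleep(60)  # the daemon wakes itself; nobody is blocking on it
```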
Do you try to pull the butter onto your knife periodically, or do you wait somehow until it pushes the butter onto your knife? When does it become less work to just go get the butter yourself?
Async Agent = an LLM-powered application using a well-understood thinking/planning loop and reasonably clear success criteria to process a prompt that takes longer than a tacitly agreed-upon amount of time, so that the user needs to be notified of the outcome instead of waiting for it.
This is mostly because the actual description is boring and not exciting marketing.
Nothing new when people make up bullshit words that no one can define. "Life" is a really good one. Or AGI is a recent popular one. And don't forget my favorite bullshit word: Spirituality.
For async agents and for "life" I sort of have a blurry shape of what the thing is in my head, but spirituality is the strangest one. There is no shape. The word is utter bullshit, yet people can carry the word and use it without realizing it has no shape or meaning. It's not that I don't get what "spirituality" is. I "get" it as much as everyone else, but I've taken the extra step to uncover the utter meaninglessness of the word.
Don't spend too much time thinking about this stuff. It's not profound. You are spending time debating and discussing a linguistic issue. Pinpointing the exact arbitrary definition of some arbitrary set of letters and sounds we call a "word" is an exercise in both arbitrariness and pointlessness.
So, say you want this. How do you do it with Claude Code?
If you're talking about the async agent described in the post (already regretting calling it that, let's call it orchestrator agent instead), looks like https://code.claude.com/docs/en/agent-teams is trying to achieve that
Read my post on this from 9 months ago: https://jdsemrau.substack.com/p/designing-agents-architectur...