Dispatch Tasks To Claude Code Headless Mode

by Admin
Dispatching Tasks to Claude Code Using Headless Mode: A Deep Dive

Hey guys, ever wondered how we can make our virtual assistants super smart by letting them talk directly to powerful tools like Claude Code? Well, today we're diving deep into a super cool feature that allows us to implement a seamless task dispatch mechanism from our VirtualAssistant to Claude Code, all thanks to its nifty headless mode (claude -p). This isn't just about sending a task; it's about establishing a fully tracked, transactional flow, ensuring every step is accounted for. Imagine OpenCode triggering a new task, and our VirtualAssistant elegantly handing it over to Claude Code, tracking its progress, and confirming completion. Sounds awesome, right? Let's break down how this magic happens.

The Nuts and Bolts: How It Works

At its core, this whole system revolves around our existing agent_tasks table, where all the tasks are neatly stored. The real game-changer here is Claude Code's headless mode. What does that mean, you ask? It means we can drive Claude Code programmatically from another process, without opening its interactive terminal session. The command claude -p "task" --output-format json is our golden ticket. It allows us to send a task directly to Claude Code and get a structured JSON response back on standard output, which includes super useful info like a session_id, the result of the task, and even the cost involved. This JSON output is key for us to understand what Claude Code did and how it did it.
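To make the JSON handling concrete, here is a minimal C# sketch of reading that output with System.Text.Json. ClaudeResult and ClaudeOutputParser are hypothetical names invented for this example. The session_id and result property names come from the description above, while total_cost_usd is an assumption about how the cost field is named, so check it against the output of your installed Claude Code version.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical DTO holding only the fields this article cares about.
// Properties in the JSON that are not listed here are ignored by default.
public sealed record ClaudeResult(
    [property: JsonPropertyName("session_id")] string? SessionId,
    [property: JsonPropertyName("result")] string? Result,
    // Assumed field name for the cost; verify against real output.
    [property: JsonPropertyName("total_cost_usd")] decimal? TotalCostUsd);

public static class ClaudeOutputParser
{
    // Returns null when stdout is empty or is not valid JSON for this shape.
    public static ClaudeResult? TryParse(string stdout)
    {
        if (string.IsNullOrWhiteSpace(stdout)) return null;
        try
        {
            return JsonSerializer.Deserialize<ClaudeResult>(stdout);
        }
        catch (JsonException)
        {
            return null;
        }
    }
}
```

Returning null instead of throwing lets the caller treat unparseable output as one more reason to mark the task as failed.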

We're aiming for a robust, transactional flow: we send the task, get an acknowledgement that Claude Code has received and started working on it, and finally, confirm when it's complete. This ensures reliability and allows us to build trust in the system. And who kicks things off? Our OpenCode platform will be initiating this whole process through a new API endpoint. It’s like a chain reaction of efficiency!

Unpacking the Architecture: A Visual Guide

Let's visualize how this all fits together. Picture this:

  1. OpenCode is our initiator. It sends a command, say, "Pošli úkol #123" (which translates to 'Send task #123').
  2. This command lands on our VirtualAssistant's doorstep.
  3. The VirtualAssistant, being the smart agent it is, first finds task #123 in its agent_tasks table.
  4. It then performs some validation – is this task ready to go? Is it already done?
  5. Crucially, it checks if Claude Code is idle. We don't want to overwhelm our AI buddy, right?
  6. If everything checks out, the VirtualAssistant proceeds to dispatch the task to Claude Code.
  7. As soon as Claude Code receives the task, it sends back a signal that it's started working on it. The VirtualAssistant registers this by marking the task as in_progress.
  8. The VirtualAssistant then returns a confirmation to OpenCode, letting it know the task has been successfully handed over and is now being processed.
  9. Once Claude Code has finished its work, it signals completion.
  10. The VirtualAssistant receives this completion signal and marks the task as completed in the agent_tasks table.

This step-by-step process ensures that we have full visibility and control over each task, from its inception to its successful execution by Claude Code.

Mastering Task States: Keeping Track of Everything

To manage this workflow effectively, we need to update our agent_tasks table with new states. Think of these states as milestones in a task's journey:

  • pending: This is the starting point. The task is created and waiting patiently to be sent out for processing. It hasn't been dispatched yet.
  • in_progress: This state signifies that the task has been successfully sent to Claude Code, and Claude has confirmed its receipt and has started working on it. The wheels are in motion!
  • completed: Hooray! Claude Code has finished processing the task, and the results are available. The mission is accomplished.
  • failed: Uh oh. Something went wrong during the processing of the task. This could be due to an error within Claude Code, an issue with the task itself, or a problem during dispatch. We need to investigate this.
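The four states above form a small state machine, and one way to keep it honest is to write the allowed moves down in code. The sketch below is an illustration, not the project's actual implementation: AgentTaskStatus is a hypothetical helper, and the transition table (including treating completed and failed as final) is an assumption based on the descriptions above.

```csharp
using System;
using System.Collections.Generic;

public static class AgentTaskStatus
{
    public const string Pending = "pending";
    public const string InProgress = "in_progress";
    public const string Completed = "completed";
    public const string Failed = "failed";

    // Assumed transition table: a task can fail during dispatch (from pending)
    // or during processing (from in_progress); completed and failed are final.
    private static readonly Dictionary<string, string[]> Allowed = new()
    {
        [Pending] = new[] { InProgress, Failed },
        [InProgress] = new[] { Completed, Failed },
        [Completed] = Array.Empty<string>(),
        [Failed] = Array.Empty<string>(),
    };

    // True only when moving from one known state to a listed successor.
    public static bool CanTransition(string from, string to) =>
        Allowed.TryGetValue(from, out var targets) && Array.IndexOf(targets, to) >= 0;
}
```

For example, AgentTaskStatus.CanTransition("pending", "in_progress") is true, while AgentTaskStatus.CanTransition("completed", "pending") is false.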

To support these states, we'll be adding a few new columns to our agent_tasks table. These columns are essential for tracking the lifecycle of a task and for debugging purposes:

ALTER TABLE agent_tasks ADD COLUMN sent_at TIMESTAMP NULL;
ALTER TABLE agent_tasks ADD COLUMN started_at TIMESTAMP NULL;  -- This is when Claude confirmed receipt and started working
ALTER TABLE agent_tasks ADD COLUMN completed_at TIMESTAMP NULL;
ALTER TABLE agent_tasks ADD COLUMN claude_session_id TEXT NULL;
ALTER TABLE agent_tasks ADD COLUMN status TEXT DEFAULT 'pending';

The sent_at timestamp will record when the task was dispatched. started_at will mark when Claude Code actually began processing. completed_at will be set once the task is successfully finished. The claude_session_id is super important; it's like a unique fingerprint for each interaction with Claude Code, which will be invaluable if we ever need to follow up on a specific session or link related tasks. And of course, the status column will hold our new states: pending, in_progress, completed, or failed. This structured approach gives us unparalleled insight into our task execution pipeline.
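Here is a rough sketch of how those columns could be stamped as a task moves along. AgentTask and its Mark* methods are hypothetical names for illustration only. One assumption worth calling out: with plain --output-format json the session_id is expected to arrive together with the final result, so this sketch stores it at completion time.

```csharp
using System;

// Hypothetical in-memory mirror of one agent_tasks row.
public sealed class AgentTask
{
    public int Id { get; init; }
    public string Status { get; private set; } = "pending";
    public DateTime? SentAt { get; private set; }
    public DateTime? StartedAt { get; private set; }
    public DateTime? CompletedAt { get; private set; }
    public string? ClaudeSessionId { get; private set; }

    // sent_at: the task was handed to the claude process.
    public void MarkSent(DateTime utcNow)
    {
        Require("pending");
        SentAt = utcNow;
    }

    // started_at: Claude began working, so the task becomes in_progress.
    public void MarkStarted(DateTime utcNow)
    {
        Require("pending");
        StartedAt = utcNow;
        Status = "in_progress";
    }

    // completed_at: the result is in; keep the session id when one is given.
    public void MarkCompleted(DateTime utcNow, string? sessionId)
    {
        Require("in_progress");
        CompletedAt = utcNow;
        ClaudeSessionId = sessionId ?? ClaudeSessionId;
        Status = "completed";
    }

    // Guards against out-of-order updates such as completing a pending task.
    private void Require(string expected)
    {
        if (Status != expected)
            throw new InvalidOperationException(
                $"Task {Id} is '{Status}', expected '{expected}'.");
    }
}
```

Passing the clock value in as a parameter keeps the class easy to test and leaves the choice of UTC versus local time to the caller.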

The New Gateway: API Endpoint for Task Dispatch

To make this all happen, we're introducing a new API endpoint: POST /api/tasks/dispatch. This will be the primary way for external systems, like OpenCode, to trigger the task dispatch process. Let's look at what you can expect when you interact with this endpoint.

Request:

When you send a request to this endpoint, you can specify a particular task you want to dispatch, or you can let the system pick the next available task. It looks like this:

{
  "taskId": 123,        // Optional: Specify a particular task ID.
  "targetAgent": "claude"  // Optional: Defaults to "claude" if not provided.
}

If you omit the taskId, the VirtualAssistant will intelligently select the first pending task from the agent_tasks table that's designated for the Claude agent. The targetAgent parameter allows for future expansion, letting us dispatch to different agents, but for now, it's primarily focused on claude.

Response (Success):

If everything goes smoothly, you'll get a response indicating success, along with the task ID that was dispatched and its new status:

{
  "success": true,
  "taskId": 123,
  "status": "in_progress",
  "message": "Task dispatched to Claude"
}

This confirms that the task has been handed over to Claude Code and is now being processed. The status in_progress is a key indicator that the dispatch was successful and Claude has acknowledged the task.

Response (Error - Task Already Completed):

Sometimes, you might try to dispatch a task that has already been completed. In such cases, the API will return an error, letting you know the situation, and importantly, providing a Text-to-Speech (TTS) notification for the user:

{
  "success": false,
  "taskId": 123,
  "error": "task_already_completed",
  "message": "Úkol #123 je již dokončený"  // "Task #123 is already completed" in Czech
}

This prevents redundant processing and keeps our task statuses accurate. The TTS message is a nice touch for user feedback.

Response (Error - Claude Busy):

We also need to handle scenarios where Claude Code might be busy with another task. To avoid conflicts and ensure efficient processing, the API will return an error if Claude is currently occupied:

{
  "success": false,
  "error": "agent_busy",
  "message": "Claude pracuje na jiném úkolu" // "Claude is working on another task" in Czech
}

This agent_busy error is crucial for managing concurrency and ensuring that Claude Code isn't overloaded. It prompts the caller to perhaps try again later or to queue the task.
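The request and response shapes above map naturally onto a pair of C# records. This is only a sketch: DispatchRequest, DispatchResponse, and DispatchJson are hypothetical names, and the serializer options (camelCase names, nulls omitted) are an assumption chosen so that the output matches the examples shown.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Both fields are optional, as in the request example above.
public sealed record DispatchRequest(int? TaskId = null, string? TargetAgent = null)
{
    // The target defaults to "claude" when the caller does not send one.
    public string ResolveTarget() => TargetAgent ?? "claude";
}

public sealed record DispatchResponse(
    bool Success,
    int? TaskId = null,
    string? Status = null,
    string? Error = null,
    string? Message = null)
{
    public static DispatchResponse Ok(int taskId) =>
        new(true, taskId, "in_progress", Message: "Task dispatched to Claude");

    public static DispatchResponse AlreadyCompleted(int taskId) =>
        new(false, taskId, Error: "task_already_completed",
            Message: $"Úkol #{taskId} je již dokončený");

    public static DispatchResponse AgentBusy() =>
        new(false, Error: "agent_busy", Message: "Claude pracuje na jiném úkolu");
}

public static class DispatchJson
{
    // camelCase property names, and null fields left out of the payload.
    public static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
    };

    public static string Serialize(DispatchResponse response) =>
        JsonSerializer.Serialize(response, Options);

    public static DispatchRequest? ParseRequest(string body) =>
        JsonSerializer.Deserialize<DispatchRequest>(body, Options);
}
```

Note that the default System.Text.Json encoder writes the accented Czech characters as \u escape sequences. That is still valid JSON, and clients decode it back to the original text.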

Crucial Validation Rules: Before We Send

Before we even think about firing up Claude Code, our VirtualAssistant performs a series of crucial validation checks. This ensures we're only dispatching valid tasks and that our system behaves predictably. Here’s what we’ll be checking:

  1. Does the Task Exist? The very first thing we do is check if the taskId provided actually exists in our agent_tasks table. If we can't find it, we return an error straight away: "Task not found". We can't process something that isn't there, right?
  2. Is the Task Already Completed? This is a big one. We need to ensure we're not trying to re-process a task that's already finished. If the task's status is completed, we return an appropriate error, like "task_already_completed", and trigger that helpful TTS notification: "Úkol už je dokončený" ("The task is already completed"). This saves us from doing unnecessary work and keeps our system clean.
  3. Is Claude Available? This is where dependency #199 comes into play, specifically the agent_responses table. We need to know if Claude Code is currently busy with another task. If our check reveals that Claude is occupied (i.e., the agent is not idle), we return an "agent_busy" error with the message "Claude pracuje na jiném úkolu" ("Claude is working on another task"). This prevents us from sending multiple tasks simultaneously to Claude and ensures a smoother workflow. We only proceed if Claude gives us the green light.

These validation rules are the gatekeepers of our dispatch system. They ensure that every task dispatched is legitimate, hasn't been done already, and that our AI agent is ready to receive it, preventing errors and maintaining the integrity of our task execution pipeline.
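The three checks can be sketched as one small pure function that runs them in the order listed. DispatchError and DispatchValidator are hypothetical names for this example; in the real service, the two inputs would come from lookups against agent_tasks and agent_responses.

```csharp
public enum DispatchError
{
    None,
    TaskNotFound,
    TaskAlreadyCompleted,
    AgentBusy,
}

public static class DispatchValidator
{
    // taskStatus is null when no row with the requested id exists.
    // The checks run in the same order as the rules above.
    public static DispatchError Validate(string? taskStatus, bool agentIdle)
    {
        if (taskStatus is null) return DispatchError.TaskNotFound;
        if (taskStatus == "completed") return DispatchError.TaskAlreadyCompleted;
        if (!agentIdle) return DispatchError.AgentBusy;
        return DispatchError.None;
    }
}
```

The order matters: a completed task reports task_already_completed even while Claude is busy, so the caller hears about the more permanent problem first.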

Step-by-Step Implementation: Bringing It All Together

Alright, let's get down to the nitty-gritty of how we're going to build this feature. It involves a few key components working in harmony:

1. Database Migration: Setting the Stage

First things first, we need to update our database schema. This involves adding the new columns we discussed earlier to the agent_tasks table: sent_at, started_at, completed_at, claude_session_id, and status. These columns are the backbone for tracking our tasks. We'll use a database migration tool (as mentioned in dependency #200 for auto migrations) to safely roll out these changes.

2. The TaskDispatchService: Our Orchestrator

We'll create a dedicated TaskDispatchService. This service will house the core logic for dispatching tasks. It will expose an interface, ITaskDispatchService, with a method like DispatchTaskAsync(int? taskId = null). This method will encapsulate the validation rules, the logic for picking a task (if taskId is null), and the interaction with Claude Code. It will also include a helper method, IsAgentIdleAsync(string agentName), to check the availability of our agents, specifically Claude Code.
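As a starting point, the interface might look something like the sketch below. The method names come from the description above, but the return types and the DispatchResult record are assumptions. TaskSelection.PickNextPending is a hypothetical helper showing one reading of "first pending task", namely the pending task with the lowest id for the given agent.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Assumed result shape; the source only names the methods.
public sealed record DispatchResult(bool Success, int? TaskId, string? Error);

public interface ITaskDispatchService
{
    // Dispatches the given task, or the next pending one when taskId is null.
    Task<DispatchResult> DispatchTaskAsync(int? taskId = null);

    // True when the named agent is not working on anything.
    Task<bool> IsAgentIdleAsync(string agentName);
}

public static class TaskSelection
{
    // Returns the lowest id among pending tasks for the agent, or null if none.
    public static int? PickNextPending(
        IEnumerable<(int Id, string Status, string Agent)> tasks,
        string agent = "claude") =>
        tasks
            .Where(t => t.Status == "pending" && t.Agent == agent)
            .OrderBy(t => t.Id)
            .Select(t => (int?)t.Id)
            .FirstOrDefault();
}
```

Ordering by id is just the simplest stand-in for "first"; ordering by a creation timestamp would work the same way if the table has one.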

3. Claude Code Integration: The Headless Commander

This is where the magic of headless mode comes alive. We'll write C# code to execute the claude command-line tool. This involves using the System.Diagnostics.Process class to launch claude with the correct arguments: the task content (taskContent) and the --output-format json flag. We need to make sure we're capturing the standard output (RedirectStandardOutput = true) so we can parse the JSON response from Claude Code. Here’s a peek at the code:

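using System.Diagnostics;

// Example of executing headless Claude (taskContent holds the task text)
var startInfo = new ProcessStartInfo
{
    FileName = "claude",
    RedirectStandardOutput = true,
    UseShellExecute = false
};
// ArgumentList escapes each argument separately, so quotes or spaces
// inside taskContent cannot break the command line.
startInfo.ArgumentList.Add("-p");
startInfo.ArgumentList.Add(taskContent);
startInfo.ArgumentList.Add("--output-format");
startInfo.ArgumentList.Add("json");

using var process = Process.Start(startInfo)
    ?? throw new InvalidOperationException("Could not start claude");
// Read stdout before waiting for exit so a full pipe cannot block the child
string json = await process.StandardOutput.ReadToEndAsync();
await process.WaitForExitAsync();
// ... then parse the JSON held in json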

4. API Controller: The Public Face

We'll build a new controller with a POST /api/tasks/dispatch endpoint. This controller will receive the incoming requests, delegate the work to the TaskDispatchService, handle the validation logic (as outlined previously), and format the appropriate success or error responses. It will also be responsible for triggering TTS notifications when errors occur, ensuring our users are informed.

5. Completion Callback: Knowing When It's Done

This is a crucial part: how does our VirtualAssistant know when Claude Code has finished a task? We have a couple of options:

  • Option A: Polling for Process Completion: We could periodically check if the claude process we started has finished. This is straightforward, but checking on a timer is wasteful. In .NET the same information is available without polling, by awaiting Process.WaitForExitAsync or subscribing to the process's Exited event.
  • Option B: Claude Calls Back: A more elegant solution is to have Claude Code (or a related service) call an existing hook, like our /api/hub/complete endpoint, once it finishes a task. This is an asynchronous callback mechanism, which is generally more efficient and reactive. This approach relies on Claude Code being configured to notify us upon completion, likely passing the session_id so we can match it to the correct task.

We'll need to decide which completion mechanism is most suitable and implement it accordingly. The callback option (Option B) is often preferred for its efficiency.
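For Option B, the receiving side needs a way to map an incoming session_id back to a task. The sketch below shows one possible shape for that lookup. CompletionTracker is a hypothetical class, and it assumes the dispatcher already knows the session_id when the callback arrives. In a real deployment the mapping would live in the claude_session_id column rather than in memory.

```csharp
using System.Collections.Concurrent;

public sealed class CompletionTracker
{
    // session_id -> task id for tasks that are still running.
    private readonly ConcurrentDictionary<string, int> _taskBySession = new();

    // Called at dispatch time, once the session id is known.
    public void Register(string sessionId, int taskId) =>
        _taskBySession[sessionId] = taskId;

    // Called by the completion callback. Returns the task id to mark as
    // completed, or null for an unknown session or a repeated callback.
    public int? Complete(string sessionId) =>
        _taskBySession.TryRemove(sessionId, out var taskId) ? taskId : (int?)null;
}
```

Because TryRemove succeeds only once per key, a callback that is delivered twice completes the task only the first time.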

Dependencies: What We Need to Make This Work

As we implement this feature, we're relying on a couple of other pieces of work:

  • #199 - AgentResponse Entity: This is essential for our IsAgentIdleAsync check. We need the agent_responses table to know if Claude Code is currently busy processing another request.
  • #200 - Auto Migrations: Having automated database migrations will make deploying the schema changes (adding new columns to agent_tasks) much smoother and less error-prone.

Acceptance Criteria: How We Know We're Done

Before we can confidently say this feature is complete, we need to make sure it meets all the requirements. Here’s our checklist:

  • [ ] The POST /api/tasks/dispatch endpoint is implemented and accessible.
  • [ ] The endpoint correctly validates if the requested task exists and is not already completed.
  • [ ] We accurately check if Claude Code is idle before attempting to dispatch a new task.
  • [ ] The claude -p command is executed with the correct task content and output format.
  • [ ] The task status in agent_tasks is updated sequentially: pending → in_progress → completed.
  • [ ] Timestamps (sent_at, started_at, completed_at) are accurately recorded in the database.
  • [ ] Appropriate error messages are returned for validation failures, including TTS notifications where required.
  • [ ] The claude_session_id is successfully stored for each dispatched task, enabling future tracking and potential multi-turn interactions.

Future Enhancements: What's Next?

While we're focusing on the core functionality now, it's always good to think ahead. Here are a few ideas for future enhancements that are outside the scope of this initial implementation but would make this feature even more powerful:

  • --force Flag: Imagine needing to re-run a task that was previously completed, perhaps with updated parameters. A --force flag in the dispatch request could allow us to override the completed status and re-process the task.
  • Automatic Retry on Failure: What if a task fails due to a temporary glitch? Implementing an automatic retry mechanism could help in recovering from transient errors without manual intervention.
  • Task Priority Queue: For systems with many tasks and varying importance, a priority queue system would allow us to ensure that critical tasks are processed before less urgent ones.

These are just a few thoughts, but they highlight the potential for growth and improvement in our task dispatch system. For now, let's focus on getting the core functionality right!

Labels

  • enhancement
  • backend
  • agent:claude