As we build more complex AI workflows, we run into a new kind of “technical debt”: behavioral bugs. Agents hallucinate, ignore instructions, or get confused by edge cases. Unlike traditional software, you can’t step through a debugger to see why an LLM made a specific decision.
Or can you?
I recently discovered a powerful meta-workflow: using the AI agent to debug its own past behaviors.
Because my agent’s “brain” is defined by text files (personas) and its “execution logs” are chat histories, I can use the agent’s own tools to diagnose and fix issues. It turns prompt engineering into a rigorous debugging loop.
Here is the workflow I developed, illustrated by a recent session where I optimized my automated Code Reviewer assistants.
## The Meta-Workflow
The core idea is to treat the AI agent not just as the doer of tasks, but as the analyst of its own performance. The loop consists of three steps:
- Log Analysis (read_thread): Feed the agent the history of a failed session. Ask it to trace its own decision tree.
- Source Review (Read): Have the agent inspect the markdown files that define its persona and rules.
- Patching (edit_file): Ask the agent to rewrite its own instructions to prevent the error from happening again.
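To make the loop concrete, here is a minimal Python sketch of one pass. I drive this through chat rather than code, so the agent object and its read_thread / read_file / edit_file / ask methods below are illustrative stand-ins for the real tools, not an actual API.

```python
def debug_persona(agent, failed_thread_id, persona_path):
    """One pass of the meta-workflow: analyze log -> review source -> patch logic.

    `agent` is assumed to expose thread/file tools plus an `ask` method that
    sends a prompt with context and returns text. All names are illustrative.
    """
    # 1. Log analysis: have the agent trace its own decisions in the failed run.
    history = agent.read_thread(failed_thread_id)
    diagnosis = agent.ask(
        "Trace your decision tree in this session. Where did you first "
        "deviate from your instructions, and why?",
        context=history,
    )

    # 2. Source review: inspect the persona file that produced the behavior.
    persona = agent.read_file(persona_path)

    # 3. Patching: have the agent rewrite its own instructions.
    patched = agent.ask(
        "Rewrite this persona so the failure above cannot recur. "
        "Return the full updated file.",
        context=diagnosis + "\n\n" + persona,
    )
    agent.edit_file(persona_path, patched)
    return diagnosis
```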
## The Experiment: Fixing a Code Reviewer
To test this, I used a set of custom personas I built for reviewing Merge Requests (MRs). The system was acting up: failing to find its tools, posting comments when it shouldn’t, and ignoring context.
Instead of manually tweaking prompts and hoping for the best, I asked my current agent to look at the failure threads and fix them.
### Case Study 1: Tracing Logic Failures
The Bug: My agent refused to use the GitLab tools, falling back to the CLI, even though the tools were available.
The Debug: I pointed the assistant to the failed thread and asked, “Why did you switch to CLI here?”
Using read_thread, the assistant analyzed the timeline and found the root cause:
“In the failed session, I checked for tools before running the discovery command. In the successful session, you forced me to list tools first.”
The Fix: The assistant patched the mr_agent.md persona to force a “Discovery Phase”—explicitly running list_tools() before attempting any action. It wasn’t a random guess; it was a fix derived from the diff between success and failure.
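Paraphrased rather than copied verbatim (and with an assumed personas/ directory), the patch boils down to appending a rule like this:

```python
# Roughly the rule added to mr_agent.md (paraphrased, not the verbatim text).
DISCOVERY_PHASE = """\
### Discovery Phase (run before any other action)
1. Call list_tools() and record which tools are available.
2. If the GitLab tools are present, you MUST use them.
3. Fall back to the CLI only if list_tools() shows no GitLab tools.
"""

# Appending it via the agent's own edit tool or by hand has the same effect:
# the persona is just a text file.
with open("personas/mr_agent.md", "a", encoding="utf-8") as f:
    f.write("\n" + DISCOVERY_PHASE)
```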
### Case Study 2: Diagnosing Hallucinations
The Bug: An agent tried to post a comment, failed, and then “helpfully” posted it as a public thread instead of a draft.
The Debug: I asked the assistant to find out why the initial tool call failed. It reviewed the tool inputs in the logs and realized the agent was hallucinating a JSON wrapper (draft_note) that didn’t exist in the tool’s schema.
The Fix: The assistant didn’t just tell me “it hallucinated.” It opened the mr_commenter.md persona file and added a Schema Anchoring section with the exact JSON structure required, effectively “pinning” the correct syntax for future runs. It also added a hard failure clause: “If this fails, STOP. Do not fallback.”
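The anchoring section looks roughly like this, again paraphrased; the tool name create_draft_note and the field names are placeholders for whatever the real tool actually expects:

```python
# Rough shape of the "Schema Anchoring" section added to mr_commenter.md.
# create_draft_note and its fields are placeholders, not the real tool schema.
SCHEMA_ANCHORING = """\
### Schema Anchoring: create_draft_note
Call the tool with EXACTLY this payload shape:

    {"body": "<comment text>", "position": { ...diff position... }}

Do NOT invent a wrapper key such as "draft_note"; it is not in the schema.
If the call fails, STOP. Do not fall back to posting a public comment.
"""
```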
### Case Study 3: Injecting Missing Context
The Bug: The reviewer flagged a try-catch block with a “Behavior Change” warning, ignoring the fact that the JIRA ticket explicitly requested that error suppression.
The Debug: By reviewing the decision logic, the assistant identified that the “Code Quality” persona was operating in a silo. It had access to the code diff but was never instructed to cross-reference the JIRA Acceptance Criteria.
The Fix: The assistant edited the reviewer persona to include a Context Injection rule: “Before flagging a behavior change, check if the JIRA ticket requested it.”
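Paraphrased once more, the added rule reads something like this; the exact wording and the JIRA lookup mechanism aren’t shown here:

```python
# Approximate wording of the "Context Injection" rule added to the reviewer
# persona; the precise text in the real file may differ.
CONTEXT_INJECTION = """\
### Context Injection
Before flagging a "Behavior Change":
1. Read the linked JIRA ticket's Acceptance Criteria.
2. If the ticket explicitly requests the change (e.g. suppressing an error),
   do not flag it; note that it matches the ticket instead.
"""
```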
## Conclusion
We often treat LLMs as black boxes that we have to poke and prod until they work. But if you have an agent with filesystem access and history awareness, you can invert that relationship.
By using this meta-workflow—Analyze Log -> Review Source -> Patch Logic—you move from “vibes-based” prompting to actual engineering. You aren’t just chatting with the AI; you’re pair-programming with it to refactor its own brain.