How I Built a Self-Healing AI Content Agent with Claude Agent SDK
I got tired of the content grind. Write a post, format it for X, reformat for Bluesky, render a video, upload to YouTube, cross-post to Instagram — repeat daily. It was eating 2-3 hours every day.
So I built an AI agent that does all of it. It runs 24/7 on a Mac mini, publishes to five platforms, and fixes itself when things break. Claude Code runs on my existing Max subscription — the only variable cost is Gemini for image generation, which is pennies per image.
This isn’t a chatbot. It’s a persistent agent that runs on a schedule, does real work, sends me the results on Discord, and goes back to monitoring. I approve everything from my phone.
Here’s the full architecture, with real code from the production system.
What the Agent Actually Does
Every day, without me touching anything:
- Scans X for viral tweets in the AI/automation space and drafts quote tweets with tactical breakdowns
- Generates YouTube Shorts — writes scripts, creates voiceover with Edge TTS, renders video with Remotion, generates thumbnails, uploads
- Publishes to X, Bluesky, Instagram, and YouTube — each formatted for the platform
- Tracks analytics — pulls YouTube stats, Google Search Console data, and X engagement metrics
- Sends everything to Discord — I tap ✅ or ❌ from my phone
The approval step is important. This isn’t a “set it and forget it” system. Every piece of content gets human review before it goes live. The agent handles the 95% of work that’s execution — I handle the 5% that’s judgment.
The Stack
Here’s what’s running:
| Component | What It Does | Cost |
|---|---|---|
| Mac mini M4 Pro | Hardware — runs everything locally | $500 one-time |
| Claude Agent SDK | The AI brain — TypeScript, persistent session, typed tools | Included in Max subscription |
| Discord.js | Control plane — approvals, alerts, commands | Free |
| Edge TTS | Voiceover generation (Microsoft neural voices) | Free |
| Remotion | Video rendering (React-based) | Free (open source) |
| Whisper.cpp | Word-level caption sync on Apple Silicon | Free |
| MCP Servers | Connects YouTube, X, Gmail, Airtable, GSC APIs | Free |
| pm2 | Process manager — auto-restart, logging, monitoring | Free |
Total recurring cost: near zero beyond the Claude Code Max subscription I already pay for. Gemini image generation is the only variable API cost — a few cents per image. Compare that to a content manager ($3-5K/month) or even a virtual assistant ($500-1K/month).
Architecture: One Persistent Agent Session
The first version was a Python daemon that spawned a separate `claude -p` process for each task. It worked, but each task started with zero context. The agent couldn't remember what it posted yesterday or what performed well last week.
The current architecture is fundamentally different: one persistent TypeScript agent session that never dies. Everything — Discord messages, scheduled tasks, webhook events — feeds into the same agent loop. The agent maintains full context across all interactions.
```typescript
// index.ts — the entry point
async function main() {
  const agent = await startAgent()      // persistent Agent SDK session
  const bot = await startDiscord(agent) // Discord.js — feeds messages into agent
  startScheduler(agent)                 // node-cron — feeds tasks into agent
  startWebhooks(agent)                  // GitHub webhook listener
  await bot.sendMessage('general', 'Koda online. Ready for tasks.')
}

main()
```
The Agent Core
Built on Anthropic’s Claude Agent SDK. The key feature: streaming input mode. The agent stays alive and accepts new messages at any time — from Discord, from the scheduler, or from webhook events.
```typescript
// agent.ts — persistent session with message queue
const agent = new Agent({
  model: 'claude-sonnet-4-20250514',
  tools: [...mcpTools, ...agentTools],
  systemPrompt: loadFile('SOUL.md'),
})

// Message queue — Discord, scheduler, webhooks all push here
const messageQueue: Message[] = []

export async function sendMessage(content: string, source: string) {
  messageQueue.push({ content, source, timestamp: Date.now() })
  await processQueue()
}
```
Session persistence means the agent recovers from restarts. On crash, pm2 restarts the process. The agent loads its session ID from disk and resumes with full context history.
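The persistence mechanics can be sketched as a tiny pair of helpers. This is illustrative, not the SDK's API: the file location and function names are mine, and the real system would store the path outside a temp directory.

```typescript
import * as fs from 'node:fs'
import * as path from 'node:path'
import * as os from 'node:os'

// Hypothetical location for the persisted session ID (temp dir so the
// sketch runs anywhere; the real agent would use its data directory).
const SESSION_FILE = path.join(os.tmpdir(), 'koda-session.json')

// Save the session ID whenever the SDK hands us one.
export function saveSession(sessionId: string): void {
  fs.writeFileSync(SESSION_FILE, JSON.stringify({ sessionId, savedAt: Date.now() }))
}

// On boot, return the previous session ID if one exists, so the
// agent can resume instead of starting cold.
export function loadSession(): string | null {
  if (!fs.existsSync(SESSION_FILE)) return null
  return JSON.parse(fs.readFileSync(SESSION_FILE, 'utf8')).sessionId
}
```

On startup, the agent checks `loadSession()` first and only creates a fresh session when nothing is on disk.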
The Scheduler
17 scheduled tasks running on node-cron. Each task is a prompt that gets fed into the same agent session — so the agent has full context when executing.
```typescript
// scheduler.ts
const tasks: ScheduledTask[] = [
  {
    name: 'youtube_analytics',
    cron: '0 7 * * *', // 7 AM daily
    prompt: 'Pull YouTube analytics for the last 7 days. Compare to previous period.',
    type: 'silent', // runs without approval
  },
  {
    name: 'viral_scan',
    cron: '0 10 * * *', // 10 AM daily
    prompt: 'Scan X for viral tweets in AI/automation. Draft quote tweets.',
    type: 'approval', // sends to Discord for approval
  },
  {
    name: 'social_post',
    cron: '0 12 * * *', // Noon daily
    prompt: 'Draft a post for X following brand-voice-skill.md.',
    type: 'approval',
  },
  {
    name: 'goal_check',
    cron: '15 8 * * *', // 8:15 AM daily
    prompt: 'Check GOALS.md. If any goal is behind, propose actions.',
    type: 'silent',
  },
]
```
Tasks are deduplicated — if `youtube_analytics` already ran today, it gets skipped on re-runs. Results are tracked in `.task-results/YYYY-MM-DD.json`.
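The dedup check is simple enough to sketch. Function names are mine, and this version writes to a temp directory instead of `.task-results/` so it runs anywhere:

```typescript
import * as fs from 'node:fs'
import * as path from 'node:path'
import * as os from 'node:os'

// Results directory; the real system uses .task-results/, this sketch
// uses a temp dir so it is self-contained.
const RESULTS_DIR = path.join(os.tmpdir(), 'task-results')

function todayFile(): string {
  const day = new Date().toISOString().slice(0, 10) // YYYY-MM-DD
  return path.join(RESULTS_DIR, `${day}.json`)
}

// Returns true if the named task already ran today.
export function alreadyRan(task: string): boolean {
  if (!fs.existsSync(todayFile())) return false
  const done: string[] = JSON.parse(fs.readFileSync(todayFile(), 'utf8'))
  return done.includes(task)
}

// Record a completed task so re-runs skip it.
export function markDone(task: string): void {
  fs.mkdirSync(RESULTS_DIR, { recursive: true })
  const done: string[] = fs.existsSync(todayFile())
    ? JSON.parse(fs.readFileSync(todayFile(), 'utf8'))
    : []
  if (!done.includes(task)) done.push(task)
  fs.writeFileSync(todayFile(), JSON.stringify(done))
}
```

The scheduler calls `alreadyRan()` before feeding a prompt into the agent, and `markDone()` after the task completes.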
Self-Healing
When a task fails, the agent doesn’t just log the error. It gets the full error output in context and tries to fix it — because it’s the same persistent session, it already knows the codebase and recent changes.
Up to 2 heal attempts per task. If it still fails, I get a Discord alert with the error details.
The difference from the old Python daemon: the old system spawned a fresh Claude instance to heal, which had zero context about what went wrong. The new system heals within the same session — the agent already knows what it was trying to do, what tools it called, and what the error means.
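The heal loop itself is a small wrapper. This is a sketch with my own names; in the real system the `heal` callback is a prompt fed into the same agent session with the error in context:

```typescript
// A task returns its result or throws; the healer gets the error and
// can attempt a fix before the retry. Both signatures are illustrative.
type Task<T> = () => Promise<T>
type Healer = (err: unknown, attempt: number) => Promise<void>

const MAX_HEAL_ATTEMPTS = 2

// Run a task; on failure, ask the (same-session) agent to heal, then retry.
export async function runWithHealing<T>(task: Task<T>, heal: Healer): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt <= MAX_HEAL_ATTEMPTS; attempt++) {
    try {
      return await task()
    } catch (err) {
      lastError = err
      if (attempt < MAX_HEAL_ATTEMPTS) await heal(err, attempt + 1)
    }
  }
  // Out of heal attempts: the real system sends a Discord alert here.
  throw lastError
}
```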
Risk Classification
Not every action should require approval. Checking analytics is safe. Posting a tweet needs a human eye.
The agent classifies every tool call by risk level:
```typescript
// YOLO risk classifier
const riskLevels = {
  HIGH: ['post_tweet', 'publish_video', 'delete_tweet', 'gmail_send'],
  MEDIUM: ['generate_image', 'skool_airtable_sync', 'create_record'],
  LOW: ['youtube_analytics', 'gsc_search_analytics', 'gmail_search'],
}
```
HIGH-risk actions get sent to Discord for approval before executing. LOW-risk actions run silently. MEDIUM adapts based on whether I’m active in Discord or idle.
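The gating logic can be sketched as two functions over that table. The function names and the default-to-HIGH choice for unknown tools are my own, but defaulting unknown tools to the strictest level is the safe direction:

```typescript
type Risk = 'HIGH' | 'MEDIUM' | 'LOW'

const riskLevels: Record<Risk, string[]> = {
  HIGH: ['post_tweet', 'publish_video', 'delete_tweet', 'gmail_send'],
  MEDIUM: ['generate_image', 'skool_airtable_sync', 'create_record'],
  LOW: ['youtube_analytics', 'gsc_search_analytics', 'gmail_search'],
}

// Classify a tool; unknown tools default to HIGH so nothing
// dangerous slips through unreviewed.
export function classify(tool: string): Risk {
  for (const level of ['HIGH', 'MEDIUM', 'LOW'] as Risk[]) {
    if (riskLevels[level].includes(tool)) return level
  }
  return 'HIGH'
}

// MEDIUM adapts: require approval only when the operator is idle.
export function needsApproval(tool: string, operatorActive: boolean): boolean {
  const risk = classify(tool)
  if (risk === 'HIGH') return true
  if (risk === 'MEDIUM') return !operatorActive
  return false
}
```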
Process Management: pm2
The agent runs under pm2 — a Node.js process manager that handles auto-restart, logging, and monitoring.
```bash
# Start the agent in daemon mode
npm run daemon   # runs: pm2 start ecosystem.config.js

# Check status
pm2 status

# View logs
pm2 logs koda

# Restart
pm2 restart koda
```
pm2 restarts the agent automatically on crash (max 10 restarts). Logs go to data/logs/koda-*.log. The agent sends a startup message to Discord every time it boots, so I know when restarts happen.
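A minimal `ecosystem.config.js` for this setup might look like the following. The field names are standard pm2 options; the values (script path, restart delay) are illustrative, not the production config:

```javascript
// ecosystem.config.js — a minimal sketch; values are illustrative
module.exports = {
  apps: [{
    name: 'koda',
    script: 'dist/index.js',
    autorestart: true,
    max_restarts: 10,    // give up after 10 crash loops
    restart_delay: 5000, // wait 5s between restarts
    out_file: 'data/logs/koda-out.log',
    error_file: 'data/logs/koda-error.log',
  }],
}
```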
Discord as Control Plane
This was a better choice than building a web dashboard. Discord is always on my phone, supports rich embeds, reactions, and threads — and it’s free.
The Discord bot (discord.js) routes messages bidirectionally:
- Me → Agent: I type in the Discord channel, the bot feeds it to the agent session
- Agent → Me: The agent sends results, approvals, alerts back to Discord
- Reactions: ✅ to approve, ❌ to reject content before publishing
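The approval flow reduces to a promise that resolves when a reaction lands. This sketch is the pure logic only; in the real bot, `onReaction` would be called from discord.js's `messageReactionAdd` event handler, and all names here are mine:

```typescript
// Pending approvals keyed by Discord message ID.
type Decision = 'approve' | 'reject'

const pending = new Map<string, (d: Decision) => void>()

// Register content awaiting review; resolves when a reaction arrives.
export function awaitApproval(messageId: string): Promise<Decision> {
  return new Promise(resolve => pending.set(messageId, resolve))
}

// Called from the reaction event handler.
export function onReaction(messageId: string, emoji: string): void {
  const resolve = pending.get(messageId)
  if (!resolve) return // not an approval message
  const decision: Decision | null =
    emoji === '✅' ? 'approve' : emoji === '❌' ? 'reject' : null
  if (decision === null) return // ignore unrelated reactions
  pending.delete(messageId)
  resolve(decision)
}
```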
The agent sends structured messages for different events:
Content approval:
```
🎬 New YouTube Short ready for review

Title: A Teaspoon of Neutron Star Weighs 6 Billion Tons
Duration: 58 seconds
Platforms: YouTube, Instagram

React ✅ to approve or ❌ to reject
```
Analytics digest:
```
📊 YouTube Analytics — Last 7 Days

Views: 1,247
Watch time: 42.3 hours
Subscribers: +8
Top video: Neutron Star (959 views)
Shorts feed: 93% of traffic
```
Self-healing alert:
```
🔧 Self-healed: viral_scan

Error: X API rate limit exceeded
Fix: Added exponential backoff (2s, 4s, 8s)
Status: Task completed on retry #1
```
I can also send commands directly in the Discord channel — “post this to X”, “check YouTube stats”, “draft a blog post about X.” The agent picks it up and responds in the same thread.
The Video Pipeline
This is where the automation really shines. Going from a script to a published YouTube Short takes about 3 minutes of compute time and zero human effort (besides the approval tap):
- Script → Claude writes the narration based on a topic
- Images → Gemini generates scene-specific images matching each narration segment
- Voiceover → Edge TTS (`en-US-AndrewNeural`) generates natural-sounding audio
- Captions → Whisper.cpp creates word-level timestamp sync on Apple Silicon Metal
- Render → Remotion composites everything into a vertical 1080×1920 video
- Thumbnail → Gemini generates a background, Python overlays title text with glow effects
- Preview → Compressed version sent to Discord for approval
- Publish → Uploads to YouTube and Instagram simultaneously
```bash
# The full pipeline in one command
python orchestrate_video.py tutorials/neutron-star.json

# Or step by step
python generate_voiceover.py tutorials/neutron-star.json --update-durations
npx remotion render TechTutorial out/neutron-star.mp4 --props=/tmp/props.json --gl=angle
python publish.py out/neutron-star.mp4 --title "Title" --platforms youtube,instagram
```
Total cost per video: a few cents in Gemini API fees for images. Edge TTS, Whisper, and Remotion are all free.
The Memory System
An agent that forgets everything between sessions is useless for content work. It needs to know what topics performed well, what voice to use, what mistakes to avoid.
I built a 6-layer memory system:
Layer 1: Bootstrap files — loaded every session. Identity (SOUL.md), skills (SKILL.md), user context (USER.md), operational rules (CLAUDE.md). These are the agent’s “personality.”
Layer 2: Observations — the agent records patterns as it works using the observe() tool. “Space Shorts get 10x more views than nature curiosities.” “Negation lists outperform feature lists on X.” Tagged by type: rule, preference, fact, habit, event.
Layer 3: Dream cycle — a nightly job that consolidates observations. Deduplicates similar entries, applies importance decay (rules last 365 days, events expire in 14 days), and promotes recurring patterns to the curated learnings file.
Layer 4: Daily logs — what happened today. Actions, outcomes, decisions, errors. Written continuously throughout the session, not batched at the end.
Layer 5: Curated learnings — the distilled “brain.” Under 100 lines of hard-won knowledge. “Shorts must be under 60 seconds.” “Images must exactly match narration.” These feed directly into content decisions.
Layer 6: Search — before making any decision, the agent searches across all layers for relevant past context.
The dream cycle is the key innovation. Without it, observations pile up forever and the agent drowns in noise. With it, only patterns that appear 3+ times get promoted to long-term memory. Everything else decays naturally.
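The consolidation step can be sketched as a pure function. The 365-day rule TTL, 14-day event TTL, and 3+ promotion threshold come from the description above; the TTLs for the other observation types are my own illustrative values:

```typescript
// An observation with a type-dependent time-to-live.
type ObsType = 'rule' | 'preference' | 'fact' | 'habit' | 'event'

interface Observation {
  text: string
  type: ObsType
  seenCount: number // how many times this pattern has recurred
  ageDays: number   // days since first recorded
}

// TTLs: rule=365 and event=14 are from the post; the rest are assumed.
const TTL_DAYS: Record<ObsType, number> = {
  rule: 365, preference: 180, fact: 90, habit: 90, event: 14,
}

// Nightly consolidation: drop expired observations, promote patterns
// seen 3+ times to curated learnings, keep the rest for future cycles.
export function dreamCycle(obs: Observation[]): { kept: Observation[]; promoted: string[] } {
  const alive = obs.filter(o => o.ageDays <= TTL_DAYS[o.type])
  const promoted = alive.filter(o => o.seenCount >= 3).map(o => o.text)
  const kept = alive.filter(o => o.seenCount < 3)
  return { kept, promoted }
}
```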
MCP Servers: Connecting Everything
Model Context Protocol (MCP) is how the agent talks to external services. Each API gets its own MCP server that exposes typed tools:
- YouTube MCP — upload videos, get analytics, manage playlists, read comments
- X MCP — post tweets, get engagement metrics, delete posts
- Bluesky MCP — post, repost, like, get timeline
- Gmail MCP — search emails, send, create drafts, manage calendar
- Airtable MCP — read/write tables for CRM and content tracking
- Google Search Console MCP — search analytics, indexing status, submit sitemaps
- n8n MCP — workflow management and data tables
- Context7 MCP — documentation lookup for any library
The agent calls these tools naturally in conversation:
```
Agent: "Let me check yesterday's YouTube performance."
  → Calls youtube_analytics_overview(start_date="2026-04-04")
  → "Views were up 23% — the magnetar Short is picking up.
     847 views in the first 24 hours."
```
No custom integration code. No webhook plumbing. The MCP server handles auth, rate limiting, and response formatting. The agent just calls the tool and gets structured data back.
vs. Hiring a Content Manager
| | AI Agent | Content Manager | Virtual Assistant |
|---|---|---|---|
| Monthly cost | ~$0 (beyond existing subscription) | $3,000-5,000 | $500-1,000 |
| Availability | 24/7 | Business hours | Part-time |
| Platforms | 5 simultaneously | 2-3 usually | 1-2 |
| Video production | Automated | Outsourced ($$$) | Manual |
| Ramp-up time | Already knows your voice | 2-4 weeks | 1-2 weeks |
| Scaling | Same cost at 10x volume | Linear cost increase | Linear |
| Judgment calls | Needs human approval | Independent | Needs guidance |
The agent wins on execution speed and cost. A human wins on creative judgment and strategy. The sweet spot: agent handles 95% of execution, human handles 5% of decisions.
vs. n8n / Make / Zapier
I used n8n for a year before building this. Here’s why I switched:
- n8n handles data flow between APIs. It’s great for “when X happens, do Y.” But content creation isn’t a linear flow — it requires judgment, context, and iteration.
- An AI agent can look at analytics, decide what content to create, write it, generate assets, format it for each platform, and adapt based on what worked last time. Try doing that in a node-based workflow.
- The tradeoff: n8n is more reliable for simple automations. The agent is more capable but needs monitoring (hence pm2 and the self-healing system).
If your workflow is “trigger → transform → send,” use n8n. If your workflow involves creative decisions, use an agent.
Getting Started
If you want to build something similar:
- Start with the Agent SDK. Don’t wrap Claude Code CLI in a shell script like I did in v1. The Claude Agent SDK gives you typed tools, streaming input, and session persistence out of the box.
- Add Discord early. It’s your control plane. Every action should send a message, every publish should require approval. Discord.js makes this straightforward.
- Use pm2 for process management. Auto-restart on crash, log rotation, and monitoring — all built in. Don’t build a custom watchdog.
- Build a risk classifier. Not everything needs approval. Analytics and reads are safe. Posts and deletes need a human. Classify your tools and only gate the dangerous ones.
- Use the memory system. Even a simple LEARNINGS.md file that the agent reads at startup makes a massive difference in content quality over time.
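That last point can be as small as a few lines. A minimal sketch of the "read LEARNINGS.md at startup" idea, with an illustrative file path and function name:

```typescript
import * as fs from 'node:fs'
import * as path from 'node:path'
import * as os from 'node:os'

// Illustrative path; the real file would live in the agent's repo.
const LEARNINGS = path.join(os.tmpdir(), 'LEARNINGS.md')

// Prepend curated learnings to the system prompt so every session
// starts with the distilled knowledge, not a blank slate.
export function buildSystemPrompt(base: string): string {
  if (!fs.existsSync(LEARNINGS)) return base
  return `${base}\n\n## Learnings\n${fs.readFileSync(LEARNINGS, 'utf8')}`
}
```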
I’m documenting the full build process — agent setup, MCP server configuration, video pipeline, memory system — in my Build & Automate community. If you want step-by-step modules with real production code, that’s where it’s happening.
This post was published using Notipo — my Notion to WordPress sync tool. Write in Notion, publish to WordPress automatically.