Run a Claude Agent on a Schedule: Cron Jobs, Persistent Memory, and Error Recovery

Building an AI agent is the easy part. Making it run on a schedule, reliably, every day, without babysitting is where most tutorials stop.

I run an autonomous Claude agent called Koda. It executes 15+ scheduled tasks daily: pulling analytics from YouTube, Google Search Console, and Instagram, scanning X for viral tweets, drafting content, syncing Skool members to Airtable, and reviewing its own memory. It runs on a Mac mini via pm2, uses node-cron for scheduling, and recovers from its own failures.

This post covers exactly how to set it up with persistent memory and self-healing error recovery. Real code from a system running in production.

The Architecture

The stack is simple:

Claude Agent SDK — the agent runtime
node-cron — schedules tasks using standard cron syntax
pm2 — process manager that keeps the agent alive across reboots
File-based memory — learnings, daily logs, and task state persisted to disk
Circuit breaker pattern — prevents a failing task from crashing the whole system

Here’s how these pieces fit together:

pm2 (process manager)
  └── index.ts (entry point)
        ├── agent.ts (Claude SDK wrapper)
        ├── scheduler.ts (node-cron + task runner)
        ├── bot.ts (Discord bridge for approvals)
        └── ~/.koda/ (persistent state)
              ├── tasks.json (task definitions)
              ├── learnings.md (curated knowledge)
              ├── data/daily-logs/ (per-day logs)
              └── data/analytics/ (snapshots)

Define Tasks as Data, Not Code

Every scheduled task is a JSON entry in tasks.json. No hardcoded prompts in your source code.

{
  "youtube_analytics": {
    "prompt": "Pull YouTube 7-day analytics. Save snapshot to data/analytics/{date}.json. Send a morning report with: total views, subs gained, top 3 videos.",
    "cron": "0 7 * * *",
    "type": "silent",
    "limits": { "maxTurns": 25, "maxBudgetUsd": 5 }
  },
  "content_proposal": {
    "prompt": "Read goals.md to identify the biggest gap. Propose 1 YouTube Short, 1 social post, 1 blog post with data backing each idea. DO NOT publish without approval.",
    "cron": "0 10 1,4,7,10,13,16,19,22,25,28 * *",
    "type": "approval"
  }
}

Key design decisions:

type: "silent" vs type: "approval" — silent tasks execute and report. Approval tasks send a draft to Discord and wait for a human reaction before executing.
limits — cap turns and budget per task. A runaway analytics pull won’t burn your API credits.
cron syntax: standard five-field cron. "0 7 * * *" = every day at 7 AM. "0 10 1,4,7,10..." = days 1, 4, 7, ... 28 of each month at 10 AM, so roughly every 3 days (the gap varies at month boundaries).

This separation means you can add, remove, or modify tasks by editing a JSON file — no code changes, no redeployment.
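
Because tasks.json is edited by hand, a typo in a cron string or a type value would only surface when the scheduler tries to register the task. A minimal validation sketch, assuming the task shape shown above; validateTasks and its exact checks are illustrative, not part of Koda's source:

```typescript
// Hypothetical helper: checks only the fields used in the tasks.json example.
// Returns a list of human-readable problems; an empty list means the file looks sane.
export function validateTasks(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return ["tasks.json must contain a JSON object keyed by task name"];
  }
  const problems: string[] = [];
  for (const [name, value] of Object.entries(raw as Record<string, unknown>)) {
    if (typeof value !== "object" || value === null) {
      problems.push(`${name}: entry must be an object`);
      continue;
    }
    const task = value as Record<string, unknown>;
    if (typeof task.prompt !== "string" || task.prompt.trim() === "") {
      problems.push(`${name}: prompt must be a non-empty string`);
    }
    // Only the field count is checked here; node-cron does the real parsing later.
    if (typeof task.cron !== "string" || task.cron.trim().split(/\s+/).length !== 5) {
      problems.push(`${name}: cron must have five space-separated fields`);
    }
    if (task.type !== "silent" && task.type !== "approval") {
      problems.push(`${name}: type must be "silent" or "approval"`);
    }
    if (task.limits !== undefined) {
      const limits = task.limits as { maxTurns?: unknown; maxBudgetUsd?: unknown } | null;
      const ok =
        limits !== null &&
        typeof limits === "object" &&
        typeof limits.maxTurns === "number" && limits.maxTurns > 0 &&
        typeof limits.maxBudgetUsd === "number" && limits.maxBudgetUsd > 0;
      if (!ok) problems.push(`${name}: limits must contain positive maxTurns and maxBudgetUsd`);
    }
  }
  return problems;
}
```

Running this once at startup, and refusing to schedule anything when the list is non-empty, keeps a bad edit from silently dropping a task.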

The Scheduler Loop

The scheduler loads tasks from disk and registers each one with node-cron:

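import { readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";
import cron from "node-cron";

interface TaskDef {
  prompt: string;
  cron: string;
  type: "silent" | "approval";
  timeout?: number;
  limits?: { maxTurns: number; maxBudgetUsd: number };
}

// Node does not expand "~", so resolve the home directory explicitly.
const TASKS_PATH = join(homedir(), ".koda", "tasks.json");

function loadTasks(): Record<string, TaskDef> {
  const raw = readFileSync(TASKS_PATH, "utf-8");
  return JSON.parse(raw);
}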

export function startScheduler(agent: Agent, bot: Bot) {
  const tasks = loadTasks();

  for (const [name, task] of Object.entries(tasks)) {
    cron.schedule(task.cron, async () => {
      console.log(`[scheduler] Running: ${name}`);
      try {
        const result = await agent.runIsolatedTask(
          task.prompt,
          task.limits
        );
        if (task.type === "approval") {
          bot.sendApproval(name, result);
        }
      } catch (err) {
        handleTaskFailure(name, err);
      }
    });
  }
}

The critical method is runIsolatedTask(). This runs the task in a separate context so a failure in one task doesn’t corrupt the agent’s main conversation.
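
TaskDef declares an optional timeout, but the loop above never reads it. One way to apply it is to race the task against a timer. A minimal sketch, assuming timeout is expressed in milliseconds (the post does not say); withTimeout is a hypothetical helper, and it only stops waiting for the task, it does not cancel the underlying agent run:

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
export async function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  label: string
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, timeout]);
  } finally {
    // Clear the timer so it cannot fire later or keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Inside the cron callback this would wrap the existing call, for example withTimeout(agent.runIsolatedTask(task.prompt, task.limits), task.timeout ?? 600_000, name), so a hung task lands in the same catch block as any other failure. The ten-minute default is a placeholder.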

Persistent Memory

An agent that forgets everything between runs is useless. Here’s the memory layer:

Learnings file (~/.koda/learnings.md) — curated knowledge the agent reads at every session start. Under 100 lines. Updated when new patterns emerge.

## Content Performance
- Hook formula: "[noun] that [unexpected verb]" — 3x baseline views
- Longer Shorts (1m20-1m25s) get higher avg view duration (80s)
- Views drop 95%+ within a week when no new Shorts posted

Daily logs (~/.koda/data/daily-logs/YYYY-MM-DD.md) — raw observations, task results, and decisions. One file per day.

Analytics snapshots (~/.koda/data/analytics/YYYY-MM-DD.json) — structured data from YouTube, GSC, Instagram. The agent reads previous snapshots to detect trends.

A nightly task reviews the last 3 daily logs and updates learnings.md if new patterns emerged. The agent builds knowledge over time — it doesn’t just execute tasks, it gets better at them.
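
The memory layers come together when a task prompt is assembled. A minimal sketch of that step, assuming the file naming shown above; pickRecentLogs and buildContext are hypothetical names, and how Koda actually injects memory into a prompt is not shown in this post:

```typescript
// Matches daily-log file names such as 2025-01-31.md.
const LOG_NAME = /^\d{4}-\d{2}-\d{2}\.md$/;

// Picks the `count` most recent daily logs from a directory listing.
// ISO dates sort chronologically as plain strings, so no date parsing is needed.
export function pickRecentLogs(fileNames: string[], count: number): string[] {
  const sorted = fileNames.filter((name) => LOG_NAME.test(name)).sort();
  return count > 0 ? sorted.slice(-count) : [];
}

// Joins curated learnings, recent logs, and the task prompt into one prompt string.
export function buildContext(
  learnings: string,
  logs: { name: string; body: string }[],
  taskPrompt: string
): string {
  const logSection = logs
    .map((log) => `### ${log.name}\n${log.body.trim()}`)
    .join("\n\n");
  return [
    "## Learnings",
    learnings.trim(),
    "## Recent daily logs",
    logSection || "(none yet)",
    "## Task",
    taskPrompt.trim(),
  ].join("\n\n");
}
```

In practice the file names would come from readdirSync on ~/.koda/data/daily-logs/ and the bodies from readFileSync, with the home directory resolved explicitly.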

Self-Healing Error Recovery

Tasks fail. APIs time out. Rate limits hit. The question isn’t whether failures happen but whether the system recovers without you waking up at 3 AM. I wrote about this pattern in detail in "How I Built a Self-Healing AI Content Agent".

Circuit breaker pattern:

const failureCounts = new Map<string, number>();
const MAX_FAILURES = 3;
const COOLDOWN_MS = 30 * 60 * 1000; // 30 min

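// Assumes `bot` is available at module scope.
function handleTaskFailure(name: string, err: unknown) {
  const count = (failureCounts.get(name) ?? 0) + 1;
  failureCounts.set(name, count);

  // Trip once, on the failure that reaches the threshold, so later failures
  // do not schedule extra timers or repeat the report.
  if (count === MAX_FAILURES) {
    const message = err instanceof Error ? err.message : String(err);
    console.error(`[scheduler] ${name} tripped circuit breaker (${count} failures)`);
    // Reset the counter once the cooldown has elapsed.
    setTimeout(() => failureCounts.set(name, 0), COOLDOWN_MS);
    bot.report(`⚠️ Task ${name} disabled for 30 min after ${count} consecutive failures: ${message}`);
  }
}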
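
The snippet above counts failures and reports them, but the check that actually skips a tripped task is not shown, and nothing clears the count after a successful run. A sketch of the complete open/closed logic, assuming the same threshold and cooldown; the CircuitBreaker class and its method names are illustrative, and the clock is passed in so the behavior can be tested without waiting 30 minutes:

```typescript
export class CircuitBreaker {
  private failures = new Map<string, number>();
  // Task name -> epoch ms at which the task may run again.
  private openUntil = new Map<string, number>();
  private readonly maxFailures: number;
  private readonly cooldownMs: number;

  constructor(maxFailures = 3, cooldownMs = 30 * 60 * 1000) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  // True when the task is allowed to run at time `now`.
  canRun(name: string, now: number = Date.now()): boolean {
    return now >= (this.openUntil.get(name) ?? 0);
  }

  // A success clears the streak, so only consecutive failures count.
  recordSuccess(name: string): void {
    this.failures.delete(name);
  }

  // Returns true when this failure trips the breaker.
  recordFailure(name: string, now: number = Date.now()): boolean {
    const count = (this.failures.get(name) ?? 0) + 1;
    if (count < this.maxFailures) {
      this.failures.set(name, count);
      return false;
    }
    // Threshold reached: block the task for the cooldown and start a fresh streak.
    this.failures.delete(name);
    this.openUntil.set(name, now + this.cooldownMs);
    return true;
  }
}
```

In the cron callback, the run would be skipped while canRun(name) is false, recordSuccess(name) called after a clean run, and the Discord report sent only when recordFailure(name) returns true.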

Self-heal loop: Every 15 minutes, the scheduler checks if any tasks are stuck or if the agent process is degraded:

setInterval(async () => {
  const health = checkHealth();
  if (health.restartCount > 5) {
    bot.report("High restart count detected — investigating");
    await agent.runIsolatedTask("Read error logs and diagnose the issue");
  }
}, 15 * 60 * 1000);
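
checkHealth() is not shown in this post. One plausible source for restartCount is pm2 itself: pm2 jlist prints the process list as JSON, and each entry carries a restart counter under pm2_env.restart_time. A sketch of the parsing step, assuming that output shape and the app name koda from the ecosystem file; restartCountFor is a hypothetical helper:

```typescript
// Subset of the fields `pm2 jlist` emits per process (assumed shape; only what is read here).
interface Pm2Process {
  name?: string;
  pm2_env?: { restart_time?: number; status?: string };
}

// Extracts the restart counter for one app from `pm2 jlist` output.
// Returns null when the output is not valid JSON or the app is not listed.
export function restartCountFor(jlistOutput: string, appName: string): number | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jlistOutput);
  } catch {
    return null;
  }
  if (!Array.isArray(parsed)) return null;
  const app = (parsed as Pm2Process[]).find((proc) => proc?.name === appName);
  return app?.pm2_env?.restart_time ?? null;
}
```

The raw string could come from execFileSync("pm2", ["jlist"], { encoding: "utf-8" }) in node:child_process.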

Missed task recovery: If the agent was down when a cron triggered, it detects and runs missed tasks on startup:

function checkMissedTasks(tasks: Record<string, TaskDef>) {
  const lastRun = readLastRunTimes();
  const now = Date.now();

  for (const [name, task] of Object.entries(tasks)) {
    const last = lastRun[name] || 0;
    const interval = cronToMs(task.cron);
    if (now - last > interval * 1.5) {
      console.log(`[scheduler] Missed task detected: ${name}`);
      queueTask(name, task);
    }
  }
}
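
cronToMs() and readLastRunTimes() are not shown. Converting an arbitrary cron expression to a single interval is not well defined, so the sketch below handles only the shapes used in this post (a fixed minute, an optional fixed hour, and an optional comma-separated day-of-month list) and throws on anything else. It is an assumption about how such a helper could work, not the production implementation:

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Estimates the gap between runs for a small subset of five-field cron expressions.
export function cronToMs(expr: string): number {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`Expected 5 cron fields, got ${fields.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  const isNumber = (field: string) => /^\d+$/.test(field);
  const unsupported = () => new Error(`Unsupported cron expression: ${expr}`);

  if (!isNumber(minute) || month !== "*" || dayOfWeek !== "*") throw unsupported();
  if (hour === "*" && dayOfMonth === "*") return HOUR_MS; // e.g. "15 * * * *"
  if (!isNumber(hour)) throw unsupported();
  if (dayOfMonth === "*") return DAY_MS; // e.g. "0 7 * * *"

  const days = dayOfMonth.split(",");
  if (!days.every((day) => isNumber(day))) throw unsupported();
  const sorted = days.map(Number).sort((a, b) => a - b);
  if (sorted.length === 1) return 30 * DAY_MS; // once a month, approximated as 30 days

  // Use the widest gap between listed days so the estimate errs on the long side.
  let widest = 0;
  for (let i = 1; i < sorted.length; i++) {
    widest = Math.max(widest, sorted[i] - sorted[i - 1]);
  }
  return widest * DAY_MS;
}

// Mirrors the check in checkMissedTasks: overdue by more than 1.5 intervals.
export function isMissed(lastRunMs: number, nowMs: number, intervalMs: number): boolean {
  return nowMs - lastRunMs > intervalMs * 1.5;
}
```

With this estimate, a daily task is re-queued after 36 hours without a run, and the content_proposal expression resolves to three days, so it is re-queued after 4.5 days.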

Keep It Alive with pm2

The agent runs as a pm2 process with automatic restarts:

// ecosystem.config.cjs
module.exports = {
  apps: [{
    name: "koda",
    script: "bash",
    args: '-c "source ~/.secrets.zsh && npx tsx src/index.ts"',
    cwd: "/Users/kjetil/code/koda",
    autorestart: true,
    max_restarts: 10,
    restart_delay: 5000,
    max_memory_restart: "1G",
  }]
};

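Start it: pm2 start ecosystem.config.cjs && pm2 save

pm2 save only records the process list. For the agent to come back after a reboot, also run pm2 startup once and execute the command it prints. With that in place it survives reboots, crashes, and memory leaks. pm2 logs koda shows real-time output.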

Daily Budget Enforcement

Without a budget cap, a looping agent can burn through API credits overnight.

const DAILY_BUDGET_USD = 50;
let dailySpend = 0;

function trackCost(tokens: number) {
  const cost = (tokens / 1_000_000) * 3; // rough estimate: flat $3 per 1M tokens; real rates differ for input vs. output and by model
  dailySpend += cost;

  if (dailySpend >= DAILY_BUDGET_USD) {
    console.warn("[scheduler] Daily budget hit — pausing all tasks");
    pauseAllTasks();
  }
}

// Reset at midnight. Tasks paused by pauseAllTasks() also need to be resumed here.
cron.schedule("0 0 * * *", () => { dailySpend = 0; });
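
The in-memory counter above starts from zero whenever pm2 restarts the process, and a flat per-token rate hides the difference between input and output pricing. A sketch that addresses both, assuming separate per-million-token rates supplied by the caller and a date key instead of a midnight cron job; DailyBudget is a hypothetical class, and any rates passed to it are the caller's own figures, not prices taken from this post:

```typescript
interface Rates {
  inputPerMTok: number; // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

interface BudgetState {
  day: string;
  spentUsd: number;
}

export class DailyBudget {
  private day: string;
  private spentUsd: number;
  private readonly limitUsd: number;
  private readonly rates: Rates;

  // `saved` lets a restarted process continue from a snapshot written to disk.
  constructor(limitUsd: number, rates: Rates, saved?: BudgetState) {
    this.limitUsd = limitUsd;
    this.rates = rates;
    this.day = saved?.day ?? "";
    this.spentUsd = saved?.spentUsd ?? 0;
  }

  // Adds the cost of one call; returns true while the day's budget still has room.
  record(inputTokens: number, outputTokens: number, now: Date = new Date()): boolean {
    const today = now.toISOString().slice(0, 10); // UTC date, e.g. "2025-01-31"
    if (today !== this.day) {
      // First call of a new day: start from zero, no separate reset job needed.
      this.day = today;
      this.spentUsd = 0;
    }
    this.spentUsd +=
      (inputTokens / 1_000_000) * this.rates.inputPerMTok +
      (outputTokens / 1_000_000) * this.rates.outputPerMTok;
    return this.spentUsd < this.limitUsd;
  }

  // Plain object suitable for JSON.stringify and later restore.
  snapshot(): BudgetState {
    return { day: this.day, spentUsd: this.spentUsd };
  }
}
```

Note that the rollover here follows the UTC date, whereas the cron-based reset follows the server's local midnight; pick whichever matches how you read your billing.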

A Typical Day in Production

07:00  youtube_analytics    → pulls stats, saves snapshot, reports to Discord
07:15  instagram_analytics  → pulls follower count, logs top posts
07:30  bluesky_stats        → checks post performance
07:45  learnings_review     → reads last 3 daily logs, updates learnings.md
08:00  skool_member_sync    → exports Skool members, diffs, syncs to Airtable
08:15  goal_check           → reads goals.md, identifies biggest gap
08:45  gsc_analytics        → pulls search queries, logs keyword opportunities
09:00  viral_tweet_scan     → finds viral tweets, drafts + posts a quote tweet
10:00  content_proposal     → proposes YouTube Short + blog + social post
20:00  conversation_memory  → extracts decisions from today's logs

That’s ten of the 15+ daily tasks. Zero manual intervention on good days. When something fails, the circuit breaker catches it, Discord gets notified, and the self-heal loop investigates.

Getting Started

Start with 1-2 tasks. Don’t build 15 on day one. Pick your most repetitive daily task — analytics pulls are a good first candidate.
Use tasks.json from the start. Separating task definitions from code saves you pain later.
Add the circuit breaker before you need it. The first time a task loops and burns $20 in API credits, you’ll wish you had it.
Persistent memory is non-negotiable. Even a simple learnings.md file that the agent reads on startup transforms its output quality over time.
pm2 + Discord = your monitoring stack. You need to know when things break without checking a terminal.

The full source for Koda’s scheduler is around 650 lines of TypeScript. Most of that is error handling, logging, and edge cases. The core scheduling logic is under 100 lines.

What’s Next

Once your agent runs reliably on a schedule, the next steps are:

Multi-agent orchestration — spawn sub-agents for complex tasks (research agent → writer agent → publisher agent)
Approval workflows — the agent proposes, you approve via Discord reaction, it executes
Outcome tracking — log what the agent published, check performance 48 hours later, feed results back into the learnings file
Automate your blog publishing with the same agent that runs your scheduled tasks

The goal isn’t to build a chatbot. It’s to build a system that does real work while you sleep.

FAQ

How much does it cost to run a scheduled Claude agent?

With a Claude Max subscription, the base cost is fixed. The only variable cost is external API calls (Gemini for images, etc.). I spend less than $1/day on variable costs running 15+ daily tasks.

Can I run this on a VPS instead of a Mac mini?

Yes. Any Linux VPS with Node.js 18+ works. pm2 runs the same way. A $10/month Hetzner instance is plenty.

What happens when the agent crashes mid-task?

pm2 auto-restarts the process. On startup, the scheduler detects missed tasks and runs them. The circuit breaker prevents crash loops.

How do I prevent the agent from posting without approval?

Set type: "approval" on any task. The agent drafts content and sends it to Discord. Nothing publishes until you react with ✅.

This post was published using Notipo — my Notion-to-WordPress sync tool. Write in Notion, publish to WordPress automatically.