# Agent Hospital -- Full API Documentation

## Overview

Agent Hospital diagnoses and repairs AI agents. The healing flow is:

1. POST /api/v1/heal -- send health report, get diagnosis + repair commands
2. Execute the whitelisted repair commands locally
3. POST /api/v1/heal/results -- send repair results + optional fresh health data
4. Repeat steps 2-3 until the decision is "healed" or "escalate" (max 5 turns)

## Authentication

Authentication uses Ed25519 keypairs. The recommended client (`npx @agent-hospital/client heal`) handles it automatically:

1. On first run, it registers a host and agent, generates an Ed25519 keypair, and caches credentials at `~/.agent-hospital/credentials.json`.
2. Each request is signed with a short-lived JWT (60s TTL) using the private key.
3. The JWT is sent as `Authorization: Bearer <jwt>`.

If calling the API directly, you must first register:

1. `POST /api/v1/hosts/register` with `{"name": "my-host"}` -- returns `enrollmentToken`
2. Generate an Ed25519 keypair
3. `POST /api/v1/agents/register` with `{"hostToken": "<enrollmentToken>", "publicKey": "<base64-raw-32-bytes>", "name": "my-agent", "framework": "openclaw"}` -- returns `agentId`
4. Sign JWTs with header `{"alg":"EdDSA","typ":"agent+jwt"}`, payload `{"sub":"<sha256-hex-of-raw-pubkey>","iat":...,"exp":...,"jti":"..."}`
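
The direct-call steps above can be sketched in Python as follows. This is a minimal illustration, not the official client: `registration_body` and `build_jwt` are hypothetical helper names, the `sign` callable stands in for whatever Ed25519 library you use, and two details are assumptions because the docs do not spell them out: the standard (not URL-safe) base64 alphabet for `publicKey`, and a random UUID for `jti`.

```python
import base64
import hashlib
import json
import time
import uuid
from typing import Callable, Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def registration_body(host_token: str, raw_public_key: bytes,
                      name: str, framework: str) -> dict:
    """Build the JSON body for POST /api/v1/agents/register."""
    if len(raw_public_key) != 32:
        raise ValueError("an Ed25519 public key is 32 raw bytes")
    return {
        "hostToken": host_token,
        # Assumption: standard base64 alphabet; the docs only say "base64".
        "publicKey": base64.b64encode(raw_public_key).decode("ascii"),
        "name": name,
        "framework": framework,
    }


def build_jwt(raw_public_key: bytes, sign: Callable[[bytes], bytes],
              ttl_seconds: int = 60, now: Optional[int] = None) -> str:
    """Assemble header.payload.signature using the documented claims."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "EdDSA", "typ": "agent+jwt"}
    payload = {
        "sub": hashlib.sha256(raw_public_key).hexdigest(),
        "iat": issued_at,
        "exp": issued_at + ttl_seconds,
        "jti": str(uuid.uuid4()),  # assumption: any unique string is accepted
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    # `sign` must return the raw 64-byte Ed25519 signature of its input.
    signature = sign(signing_input.encode("ascii"))
    return f"{signing_input}.{b64url(signature)}"
```

With the third-party `cryptography` package, for example, `sign` could be the `sign` method of an `Ed25519PrivateKey`, and the raw public key can be exported with `public_bytes(Encoding.Raw, PublicFormat.Raw)`. Because tokens live for 60 seconds, build a fresh one for each request instead of caching it.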

## Endpoints

### POST /api/v1/heal

Start a healing session.

**Headers:**
- Content-Type: application/json
- Authorization: Bearer <signed-jwt>

**Request Body:**
```json
{
  "name": "my-agent-hostname",
  "framework": "openclaw",
  "host": "my-agent-hostname",
  "report": { <HealthReport object -- see schema below> },
  "fileContents": { "SOUL.md": "...", "openclaw.json": "..." },
  "availableActions": ["restart-daemon", "prune-sessions", "kill-port-conflict", "clean-logs", "fix-context-window"]
}
```
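
As a rough sketch, the request above could be assembled with Python's standard library like this. `build_heal_request` is a hypothetical helper, and `BASE_URL` is an assumption: this document names the dashboard origin but does not state the API origin.

```python
import json
import urllib.request

# Assumption: the API is served from the same origin as the dashboard.
BASE_URL = "https://agent-hospital.ai"


def build_heal_request(jwt: str, name: str, framework: str, host: str,
                       report: dict, file_contents: dict,
                       available_actions: list,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/heal request."""
    body = {
        "name": name,
        "framework": framework,
        "host": host,
        "report": report,
        "fileContents": file_contents,
        "availableActions": available_actions,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/heal",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. Keeping construction separate from sending makes the payload easy to inspect or log before anything leaves the machine.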

**Response (200):**
```json
{
  "agentId": "uuid",
  "sessionId": "uuid",
  "diagnosis": {
    "overallStatus": "warning",
    "findings": [{ "area": "gateway", "severity": "warning", "finding": "...", "evidence": "...", "recommendation": "..." }],
    "rootCause": "Gateway process not responding",
    "confidence": 0.85,
    "narrative": "Your gateway is down. Prescribing restart."
  },
  "decision": {
    "decision": "more_repairs",
    "narrative": "Gateway needs restart.",
    "commands": [
      { "action": "restart-daemon", "description": "Restart the OpenClaw daemon process", "whitelisted": true },
      { "action": "systemctl restart nginx", "description": "Restart reverse proxy", "whitelisted": false }
    ],
    "confidence": 0.85,
    "severity": "warning"
  }
}
```

The `decision.decision` field tells you what to do next:
- `"healed"` -- agent is healthy, session complete. No further action needed.
- `"more_repairs"` -- execute the commands in the `commands` array, then call `POST /api/v1/heal/results`. Commands with `whitelisted: true` are safe to auto-execute. Commands with `whitelisted: false` are recommendations for manual execution.
- `"escalate"` -- cannot fix automatically, needs human investigation.
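
One way to act on these three outcomes is sketched below. `plan_next_step` and the `next` labels it returns are hypothetical names chosen for illustration; only the field names and decision values come from the response shown above.

```python
def plan_next_step(response: dict) -> dict:
    """Split a heal response into auto-run and manual-only actions."""
    decision = response["decision"]
    verdict = decision["decision"]
    commands = decision.get("commands", [])

    if verdict == "healed":
        return {"next": "done", "auto": [], "manual": []}
    if verdict == "escalate":
        return {"next": "human", "auto": [], "manual": []}
    if verdict == "more_repairs":
        # Only commands explicitly flagged whitelisted are auto-executed.
        auto = [c["action"] for c in commands if c.get("whitelisted") is True]
        manual = [c["action"] for c in commands if c.get("whitelisted") is not True]
        return {"next": "report_results", "auto": auto, "manual": manual}
    raise ValueError(f"unexpected decision: {verdict!r}")
```

Applied to the example response above, this yields `restart-daemon` as the only auto-run action and leaves `systemctl restart nginx` for a human. Treating a missing `whitelisted` flag as manual-only is a conservative choice made for this sketch.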

### POST /api/v1/heal/results

Send repair execution results back to the doctor for evaluation.

**Headers:**
- Content-Type: application/json
- Authorization: Bearer <signed-jwt>

**Request Body:**
```json
{
  "sessionId": "uuid-from-heal-response",
  "results": [
    { "action": "restart-daemon", "success": true, "output": "Daemon restarted successfully" },
    { "action": "prune-sessions", "success": false, "output": "Permission denied" }
  ],
  "postRepairHealth": { <optional fresh HealthReport after repairs> }
}
```
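
The body above might be produced by a small runner like the following sketch. The `handlers` mapping (action name to a local callable) is an assumption: how each action is actually carried out on your machine is outside what this document specifies. Reporting an unhandled action as a failure is also a choice made for the sketch.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A handler performs one repair and returns (success, output).
Handler = Callable[[], Tuple[bool, str]]


def run_repairs(session_id: str, actions: List[str],
                handlers: Dict[str, Handler],
                post_repair_health: Optional[dict] = None) -> dict:
    """Run each action and build the POST /api/v1/heal/results body."""
    results = []
    for action in actions:
        handler = handlers.get(action)
        if handler is None:
            success, output = False, "No local handler for this action"
        else:
            try:
                success, output = handler()
            except Exception as exc:  # report the failure instead of crashing
                success, output = False, f"{type(exc).__name__}: {exc}"
        results.append({"action": action, "success": success, "output": output})

    body = {"sessionId": session_id, "results": results}
    if post_repair_health is not None:
        body["postRepairHealth"] = post_repair_health  # optional field
    return body
```

Catching exceptions per action means one failed repair still gets reported alongside the ones that succeeded, which matches the mixed success and failure entries in the example body.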

**Response (200):**
```json
{
  "decision": {
    "decision": "healed",
    "narrative": "All repairs succeeded. Agent is healthy.",
    "commands": [],
    "confidence": 0.92,
    "severity": "healthy"
  }
}
```

Repeat the loop: if the decision is `"more_repairs"`, execute the new commands and call `POST /api/v1/heal/results` again, up to the 5-turn limit.
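
The whole loop can be sketched with the network and repair steps injected as callables, which keeps the control flow easy to test. `heal_loop`, `start`, `execute`, and `submit` are hypothetical names. Two details are assumptions: a "turn" is counted as one call to the results endpoint, and hitting the cap while the decision is still `"more_repairs"` is treated locally as `"escalate"`.

```python
from typing import Callable, List


def heal_loop(start: Callable[[], dict],
              execute: Callable[[List[dict]], List[dict]],
              submit: Callable[[str, List[dict]], dict],
              max_turns: int = 5) -> str:
    """Drive a healing session and return the final decision string.

    start()                -> parsed response of POST /api/v1/heal
    execute(commands)      -> list of {"action", "success", "output"} results
    submit(session, items) -> parsed response of POST /api/v1/heal/results
    """
    response = start()
    session_id = response["sessionId"]
    decision = response["decision"]

    for _ in range(max_turns):
        if decision["decision"] != "more_repairs":
            return decision["decision"]
        # Auto-execute only the commands flagged as whitelisted.
        safe = [c for c in decision.get("commands", []) if c.get("whitelisted")]
        results = execute(safe)
        decision = submit(session_id, results)["decision"]

    if decision["decision"] == "more_repairs":
        return "escalate"  # assumption: give up locally once the cap is hit
    return decision["decision"]
```

Commands with `whitelisted: false` are skipped here on purpose. A real client would surface them to an operator instead of dropping them silently.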

## HealthReport Schema

All fields in the report object are optional except `framework`:

```
{
  framework: "openclaw" | "hermes",
  version: string | null,
  host: string,
  timestamp: ISO 8601 string,
  runtime: { processAlive: bool, pid: number|null, uptimeSeconds: number|null },
  gateway: { alive: bool, port: number, latencyMs: number|null },
  integrations: [{ type: string, connected: bool, error: string|null, ... }],
  crons: [{ id: string, expression: string, lastStatus: "success"|"error"|"unknown", ... }],
  memory: { type: string, entryCount: number, storeReachable: bool, lastWrite: string|null, sizeBytes: number },
  model: { provider: string, modelId: string, apiReachable: bool, authValid: bool, latencyMs: number|null },
  disk: { homeDirMb: number, sessionsMb: number, logsMb: number, totalMb: number },
  personality: { file: string, exists: bool, wordCount: number },
  workspace: { files: {...}, skills: { count, names, validation }, pipeline: { exists, stageCount } } | null,
  configSnapshot: object | null,
  logsTail: string | null,
  sessions: { totalCount, oldestTimestamp, newestTimestamp, olderThan30Days, totalSizeMb },
  logAnalysis: { totalErrors, totalWarnings, topErrors: [{pattern, count, lastSeen}], errorRate },
  probes: [{ probe: string, success: bool, latencyMs, output, error }]
}
```
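
Since only `framework` is required, a report can start small and grow section by section. The sketch below builds such a report and rejects top-level keys that are not in the schema above. `minimal_report` is a hypothetical helper, and defaulting `host` to the local hostname and `timestamp` to the current UTC time are assumptions made for convenience.

```python
import socket
from datetime import datetime, timezone
from typing import Optional

FRAMEWORKS = {"openclaw", "hermes"}

# Top-level keys taken from the schema above.
KNOWN_FIELDS = {
    "framework", "version", "host", "timestamp", "runtime", "gateway",
    "integrations", "crons", "memory", "model", "disk", "personality",
    "workspace", "configSnapshot", "logsTail", "sessions", "logAnalysis",
    "probes",
}


def minimal_report(framework: str, host: Optional[str] = None, **sections) -> dict:
    """Build a HealthReport containing only the sections that were collected."""
    if framework not in FRAMEWORKS:
        raise ValueError(f"unsupported framework: {framework!r}")
    unknown = set(sections) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unknown report fields: {sorted(unknown)}")
    report = {
        "framework": framework,
        "host": host or socket.gethostname(),
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601
    }
    report.update(sections)
    return report
```

For example, `minimal_report("openclaw", gateway={"alive": False, "port": 18789, "latencyMs": None})` produces a report with just the gateway section filled in.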

## Available Whitelisted Actions

### OpenClaw
- restart-daemon -- Restart the OpenClaw daemon process
- prune-sessions -- Delete session files older than 30 days
- kill-port-conflict -- Kill processes blocking port 18789
- clean-logs -- Delete log files older than 7 days
- fix-context-window -- Truncate oversized .jsonl files (>50MB)

### Hermes
- restart-gateway -- Restart the Hermes gateway
- prune-sessions -- Delete sessions older than 30 days
- checkpoint-wal -- Checkpoint SQLite WAL file
- kill-port-conflict -- Kill processes blocking port 8642
- clean-logs -- Delete log files older than 7 days
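
A client can mirror these lists locally as a second line of defense before auto-executing anything. The sketch below is one possible guard, not documented client behavior: `WHITELIST` and `is_auto_executable` are hypothetical names, and requiring both the server's `whitelisted` flag and a local match is an assumption made for safety.

```python
WHITELIST = {
    "openclaw": frozenset({
        "restart-daemon", "prune-sessions", "kill-port-conflict",
        "clean-logs", "fix-context-window",
    }),
    "hermes": frozenset({
        "restart-gateway", "prune-sessions", "checkpoint-wal",
        "kill-port-conflict", "clean-logs",
    }),
}


def is_auto_executable(framework: str, command: dict) -> bool:
    """True only if the server flagged the command AND it is listed locally."""
    allowed = WHITELIST.get(framework, frozenset())
    return command.get("whitelisted") is True and command.get("action") in allowed
```

The same set can also supply the `availableActions` array sent with `POST /api/v1/heal`, for instance `sorted(WHITELIST["hermes"])`.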

## Quickest Way (recommended)

Run this on the machine where your agent runs:

`npx @agent-hospital/client heal`

This handles the entire flow automatically: register, generate Ed25519 identity, detect framework, collect health, send to doctor, execute repairs, report back, loop until healed. No setup or API keys needed.

## Dashboard

View healing sessions and agent status at https://agent-hospital.ai

## Stateless Health Check

POST /api/v1/check-in -- returns status (healthy/warning/critical) without creating any records. Useful for quick polling.