# Wraith AI Copilot — Design Spec

> **Date:** 2026-03-17
> **Purpose:** First-class AI copilot integration — Claude as an XO (Executive Officer) with full terminal, filesystem, and RDP desktop access through Wraith's native protocol channels
> **Depends on:** Wraith Desktop v0.1.0 (all 4 phases complete)
> **License:** MIT (same as Wraith)

---

## 1. What This Is

An AI co-pilot that shares the Commander's view and control of remote systems. The XO (Claude) can:

- **See** RDP desktops via FreeRDP3 bitmap frames → Claude Vision API
- **Type** in SSH terminals via bidirectional stdin/stdout pipes
- **Click** in RDP sessions via FreeRDP3 mouse/keyboard input channels
- **Read/write files** via SFTP — the same connection the terminal uses
- **Open/close sessions** — autonomously connect to hosts from the connection manager

This is NOT a chatbot sidebar. It's a second operator with the same access as the human, working through the same protocol channels Wraith already provides.

**Why this is unique:** No other tool does this. Existing AI coding assistants work on local files. Wraith's XO works on remote servers — SSH terminals, Windows desktops, remote filesystems — all through native protocols. No Playwright, no browser automation, no screen recording. The RDP session IS the viewport. The SSH session IS the shell.

---

## 2. Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│  Wraith Application                                              │
│                                                                  │
│  ┌─ AI Service (internal/ai/) ─────────────────────────────────┐ │
│  │                                                             │ │
│  │  ┌──────────────┐  ┌───────────────┐  ┌─────────────────┐  │ │
│  │  │ Claude API    │  │ Tool Dispatch  │  │ Conversation    │  │ │
│  │  │ Client        │  │ Router         │  │ Manager         │  │ │
│  │  │ (HTTP + SSE)  │  │                │  │ (SQLite)        │  │ │
│  │  └──────┬───────┘  └───────┬───────┘  └─────────────────┘  │ │
│  │         │                  │                                 │ │
│  │         │    ┌─────────────▼──────────────┐                 │ │
│  │         │    │      Tool Definitions       │                 │ │
│  │         │    │                             │                 │ │
│  │         │    │  Terminal: write, read, cwd │                 │ │
│  │         │    │  SFTP: list, read, write    │                 │ │
│  │         │    │  RDP: screenshot, click,    │                 │ │
│  │         │    │       type, keypress, move  │                 │ │
│  │         │    │  Session: list, connect,    │                 │ │
│  │         │    │           disconnect        │                 │ │
│  │         │    └─────────────┬──────────────┘                 │ │
│  │         │                  │                                 │ │
│  └─────────┼──────────────────┼─────────────────────────────────┘ │
│            │                  │                                    │
│            ▼                  ▼                                    │
│  ┌─────────────────┐  ┌──────────────────────────────────────┐   │
│  │  Claude API      │  │  Existing Wraith Services            │   │
│  │  (Anthropic)     │  │                                      │   │
│  │                  │  │  SSHService.Write/Read                │   │
│  │  Messages API    │  │  SFTPService.List/Read/Write         │   │
│  │  + Tool Use      │  │  RDPService.SendMouse/SendKey        │   │
│  │  + Vision        │  │  RDPService.GetFrame → JPEG encode   │   │
│  │  + Streaming     │  │  SessionManager.Create/List          │   │
│  └─────────────────┘  └──────────────────────────────────────┘   │
│                                                                   │
│  ┌─ Frontend ─────────────────────────────────────────────────┐  │
│  │  CopilotPanel.vue — right-side collapsible panel            │  │
│  │  ├── Chat messages (streaming, markdown rendered)           │  │
│  │  ├── Tool call visualization (what the XO did)              │  │
│  │  ├── RDP screenshot thumbnails inline                       │  │
│  │  ├── Session awareness (which session XO is focused on)     │  │
│  │  ├── Control toggle (XO driving / Commander driving)        │  │
│  │  └── Quick commands bar                                     │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
```

---

## 3. AI Service Layer (`internal/ai/`)

### 3.1 Authentication — OAuth PKCE (Max Subscription)

Wraith authenticates against the user's Claude Max subscription via OAuth Authorization Code Flow with PKCE. No API key needed. No per-token billing. Same auth path as Claude Code, but with Wraith's own independent token set (no shared credential file, no race conditions).

**OAuth Parameters:**

| Parameter | Value |
|---|---|
| Authorize URL | `https://claude.ai/oauth/authorize` |
| Token URL | `https://platform.claude.com/v1/oauth/token` |
| Client ID | `9d1c250a-e61b-44d9-88ed-5944d1962f5e` |
| PKCE Method | S256 |
| Code Verifier | 32 random bytes, base64url (no padding) |
| Code Challenge | SHA-256(verifier), base64url (no padding) |
| Redirect URI | `http://localhost:{dynamic_port}/callback` |
| Scopes | `user:inference user:profile` |
| State | 32 random bytes, base64url |

**Auth Flow:**

```
1. User clicks "Connect to Claude" in Wraith copilot settings
2. Wraith generates PKCE code_verifier + code_challenge
3. Wraith starts a local HTTP server on a random port
4. Wraith opens browser to:
     https://claude.ai/oauth/authorize
       ?code=true
       &client_id=9d1c250a-e61b-44d9-88ed-5944d1962f5e
       &response_type=code
       &redirect_uri=http://localhost:{port}/callback
       &scope=user:inference%20user:profile
       &code_challenge={challenge}
       &code_challenge_method=S256
       &state={state}
5. User logs in with their Anthropic/Claude account
6. Browser redirects to http://localhost:{port}/callback?code={auth_code}&state={state}
7. Wraith validates state, exchanges code for tokens:
     POST https://platform.claude.com/v1/oauth/token
     {
       "grant_type": "authorization_code",
       "code": "{auth_code}",
       "redirect_uri": "http://localhost:{port}/callback",
       "client_id": "9d1c250a-e61b-44d9-88ed-5944d1962f5e",
       "code_verifier": "{verifier}",
       "state": "{state}"
     }
8. Response: { access_token, refresh_token, expires_in, scope }
9. Wraith encrypts tokens with vault and stores in SQLite settings:
     - ai_access_token (vault-encrypted)
     - ai_refresh_token (vault-encrypted)
     - ai_token_expires_at (unix timestamp)
10. Done — copilot is authenticated
```

**Token Refresh (automatic, silent):**

```
When access_token is expired (checked before each API call):
  POST https://platform.claude.com/v1/oauth/token
  {
    "grant_type": "refresh_token",
    "refresh_token": "{decrypted_refresh_token}",
    "client_id": "9d1c250a-e61b-44d9-88ed-5944d1962f5e",
    "scope": "user:inference user:profile"
  }
  → New access_token + refresh_token stored in vault
```

**Implementation:** `internal/ai/oauth.go` — Go HTTP server for callback, PKCE helpers, token exchange, token refresh. Uses `pkg/browser` to open the authorize URL.

**Fallback:** For users without a Max subscription, allow raw API key input (stored in vault). The client checks which auth method is configured and uses the appropriate header.

### 3.2 Claude API Client

Direct HTTP client — no Python sidecar, no external SDK. Pure Go.

```go
type ClaudeClient struct {
    auth       *OAuthManager    // handles token refresh + auth header
    model      string           // configurable, e.g. claude-sonnet-4-5-20250929
    httpClient *http.Client
    baseURL    string           // https://api.anthropic.com
}

// SendMessage sends a messages API request with tool use + vision support.
// Returns a streaming response channel for token-by-token delivery.
func (c *ClaudeClient) SendMessage(messages []Message, tools []Tool, systemPrompt string) (<-chan StreamEvent, error)
```

**Auth header:** `Authorization: Bearer {access_token}` (from OAuth). Falls back to `x-api-key: {api_key}` if using raw API key auth.

**Message format:** Anthropic Messages API v1 (`/v1/messages`).

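**Streaming:** SSE (`stream: true`). Parse `event: message_start`, `event: content_block_start`, `event: content_block_delta`, `event: content_block_stop`, `event: message_delta`, and `event: message_stop`. There is no separate `tool_use` event: a tool call arrives as a `content_block_start` whose content block has type `tool_use`, followed by `input_json_delta` deltas carrying the partial JSON input. Emit to frontend via Wails events.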

**Vision:** RDP screenshots sent as base64-encoded JPEG in the `image` content block type. Resolution capped at 1280x720 for token efficiency (downscale from native resolution before encoding).

**Token tracking:** Parse `usage` from the API response. Track `input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens` per conversation. Store totals in SQLite.

### 3.2 Tool Definitions

```go
var CopilotTools = []Tool{
    // Terminal
    {Name: "terminal_write", Description: "Type text into an active SSH terminal session",
     InputSchema: {sessionId: string, text: string}},
    {Name: "terminal_read", Description: "Get recent terminal output from an SSH session (last N lines)",
     InputSchema: {sessionId: string, lines: int (default 50)}},
    {Name: "terminal_cwd", Description: "Get the current working directory of an SSH session",
     InputSchema: {sessionId: string}},

    // SFTP
    {Name: "sftp_list", Description: "List files and directories at a remote path",
     InputSchema: {sessionId: string, path: string}},
    {Name: "sftp_read", Description: "Read the contents of a remote file (max 5MB)",
     InputSchema: {sessionId: string, path: string}},
    {Name: "sftp_write", Description: "Write content to a remote file",
     InputSchema: {sessionId: string, path: string, content: string}},

    // RDP
    {Name: "rdp_screenshot", Description: "Capture the current RDP desktop screen as an image",
     InputSchema: {sessionId: string}},
    {Name: "rdp_click", Description: "Click at screen coordinates in an RDP session",
     InputSchema: {sessionId: string, x: int, y: int, button: string (default "left")}},
    {Name: "rdp_doubleclick", Description: "Double-click at coordinates",
     InputSchema: {sessionId: string, x: int, y: int}},
    {Name: "rdp_type", Description: "Type a text string into the RDP session",
     InputSchema: {sessionId: string, text: string}},
    {Name: "rdp_keypress", Description: "Press a key or key combination (e.g. 'enter', 'ctrl+c', 'alt+tab', 'win+r')",
     InputSchema: {sessionId: string, key: string}},
    {Name: "rdp_scroll", Description: "Scroll the mouse wheel at coordinates",
     InputSchema: {sessionId: string, x: int, y: int, delta: int}},
    {Name: "rdp_move", Description: "Move the mouse cursor to coordinates",
     InputSchema: {sessionId: string, x: int, y: int}},

    // Session Management
    {Name: "list_sessions", Description: "List all active SSH and RDP sessions",
     InputSchema: {}},
    {Name: "connect_ssh", Description: "Open a new SSH session to a saved connection",
     InputSchema: {connectionId: int}},
    {Name: "connect_rdp", Description: "Open a new RDP session to a saved connection",
     InputSchema: {connectionId: int}},
    {Name: "disconnect", Description: "Close an active session",
     InputSchema: {sessionId: string}},
}
```

### 3.3 Tool Dispatch Router

```go
type ToolRouter struct {
    ssh         *ssh.SSHService
    sftp        *sftp.SFTPService
    rdp         *rdp.RDPService
    sessions    *session.Manager
    connections *connections.ConnectionService
}

// Dispatch executes a tool call and returns the result
func (r *ToolRouter) Dispatch(toolName string, input json.RawMessage) (interface{}, error)
```

The router maps tool names to existing Wraith service methods. No new protocol code — everything routes through the services we already built.

**Terminal output buffering:** The `terminal_read` tool needs recent output. Add an output ring buffer to SSHService that stores the last N lines (configurable, default 200) of each session's stdout. The buffer is written to by the existing read goroutine and read by the tool dispatcher.

**RDP screenshot encoding:** The `rdp_screenshot` tool calls `RDPService.GetFrame()` to get the raw RGBA pixel buffer, downscales to 1280x720 if larger, encodes as JPEG (quality 85), and returns as base64. This is the image that gets sent to Claude's Vision API.

### 3.4 Conversation Manager

```go
type Conversation struct {
    ID        string
    Messages  []Message
    Model     string
    CreatedAt time.Time
    TokensIn  int
    TokensOut int
}

type ConversationManager struct {
    db       *sql.DB
    active   *Conversation
}

// Create starts a new conversation
// Load resumes a saved conversation
// AddMessage appends a message and persists to SQLite
// GetHistory returns the full message list for API calls
// GetTokenUsage returns cumulative token counts
```

Conversations are persisted to a `conversations` SQLite table with messages stored as JSON. This allows resuming a conversation across app restarts.

### 3.5 System Prompt

```
You are the XO (Executive Officer) aboard the Wraith command station. The Commander
(human operator) works alongside you managing remote servers and workstations.

You have direct access to all active sessions through your tools:
- SSH terminals: read output, type commands, navigate filesystems
- SFTP: read and write remote files
- RDP desktops: see the screen, click, type, interact with any GUI application
- Session management: open new connections, close sessions

When given a task:
1. Assess what sessions and access you need
2. Execute efficiently — don't ask for permission to use tools, just use them
3. Report what you found or did, with relevant details
4. If something fails, diagnose and try an alternative approach

You are not an assistant answering questions. You are an operator executing missions.
Act decisively. Use your tools. Report results.
```

---

## 4. Data Model Additions

```sql
-- AI conversations
CREATE TABLE IF NOT EXISTS conversations (
    id          TEXT PRIMARY KEY,
    title       TEXT,
    model       TEXT NOT NULL,
    messages    TEXT NOT NULL DEFAULT '[]',  -- JSON array of messages
    tokens_in   INTEGER DEFAULT 0,
    tokens_out  INTEGER DEFAULT 0,
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at  DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- AI settings (stored in existing settings table)
-- ai_api_key_encrypted  — Claude API key (vault-encrypted)
-- ai_model              — default model
-- ai_max_tokens         — max response tokens (default 4096)
-- ai_rdp_capture_rate   — screenshot interval in seconds (default: on-demand)
-- ai_token_budget       — monthly token budget warning threshold
-- OAuth credentials from section 3.1 are stored in the same settings table:
-- ai_access_token, ai_refresh_token (both vault-encrypted), ai_token_expires_at (unix timestamp)
```

Add migration `002_ai_copilot.sql` for the conversations table.

---

## 5. Frontend: Copilot Panel

### Layout

```
┌──────────────────────────────────────────┬──────────────┐
│                                          │              │
│           Terminal / RDP                  │   COPILOT    │
│           (existing)                     │   PANEL      │
│                                          │   (320px)    │
│                                          │              │
│                                          │  [Messages]  │
│                                          │  [Tool viz]  │
│                                          │  [Thumbs]    │
│                                          │              │
│                                          │  [Input]     │
├──────────────────────────────────────────┴──────────────┤
│  Status bar                                              │
└──────────────────────────────────────────────────────────┘
```

The copilot panel is a **right-side collapsible panel** (320px default width, resizable). Toggle via toolbar button or Ctrl+Shift+K.

### Components

**`CopilotPanel.vue`** — main container:
- Header: "XO" label, model selector dropdown, token counter, close button
- Message list: scrollable, auto-scroll on new messages
- Tool call cards: collapsible, show tool name + input + result
- RDP screenshots: inline thumbnails (click to expand)
- Input area: textarea with send button, Shift+Enter for newlines, Enter to send

**`CopilotMessage.vue`** — single message:
- Commander messages: right-aligned, blue accent
- XO messages: left-aligned, markdown rendered (code blocks, lists, etc.)
- Tool use blocks: collapsible card showing tool name, input params, result

**`CopilotToolViz.vue`** — tool call visualization:
- Icon per tool type (terminal icon, folder icon, monitor icon, etc.)
- Summary line: "Typed `ls -la` in Asgard (SSH)", "Screenshot from DC01 (RDP)"
- Expandable detail showing raw input/output

**`CopilotSettings.vue`** — configuration modal:
- Claude account connection (OAuth login) or fallback API key input (stored encrypted in vault)
- Model selector
- Token budget threshold
- RDP capture settings
- Conversation history management

### Streaming

Claude API responses stream token-by-token:

```
Go: Claude API (SSE) → parse events → Wails events
Frontend: listen for Wails events → append to message → re-render
```

Events:
- `ai:text:{conversationId}` — text delta (append to current message)
- `ai:tool_use:{conversationId}` — tool call started (show pending card)
- `ai:tool_result:{conversationId}` — tool call completed (update card)
- `ai:done:{conversationId}` — response complete
- `ai:error:{conversationId}` — error occurred

---

## 6. Go Backend Structure

```
internal/
  ai/
    service.go              # AIService — orchestrates everything
    service_test.go
    oauth.go                # OAuth PKCE flow — authorize, callback, token exchange, refresh
    oauth_test.go
    client.go               # ClaudeClient — HTTP + SSE to Anthropic API
    client_test.go
    tools.go                # Tool definitions (JSON schema)
    router.go               # ToolRouter — dispatches tool calls to Wraith services
    router_test.go
    conversation.go         # ConversationManager — persistence + history
    conversation_test.go
    types.go                # Message, Tool, StreamEvent types
    screenshot.go           # RDP frame → JPEG encode + downscale
    screenshot_test.go
    terminal_buffer.go      # Ring buffer for terminal output history
    terminal_buffer_test.go
  db/
    migrations/
      002_ai_copilot.sql    # conversations table
```

---

## 7. Frontend Structure

```
frontend/src/
  components/
    copilot/
      CopilotPanel.vue       # Main panel container
      CopilotMessage.vue      # Single message (commander or XO)
      CopilotToolViz.vue      # Tool call visualization card
      CopilotSettings.vue     # API key, model, budget configuration
  composables/
    useCopilot.ts             # AI service wrappers, streaming, state
  stores/
    copilot.store.ts          # Conversation state, messages, streaming
```

---

## 8. Implementation Phases

| Phase | Deliverables |
|---|---|
| **A: Core** | AI service, Claude API client (HTTP + SSE streaming), tool definitions, tool router, conversation manager, SQLite migration, terminal output buffer |
| **B: Terminal Tools** | Wire terminal_write/read/cwd + sftp_* tools to existing services, test with real SSH sessions |
| **C: RDP Vision** | Screenshot capture (RGBA → JPEG → base64), rdp_screenshot tool, vision in API calls |
| **D: RDP Input** | rdp_click/type/keypress/move/scroll tools, coordinate mapping, key combo parsing |
| **E: Frontend** | CopilotPanel, message streaming, tool visualization, settings, conversation persistence |

---

## 9. Key Implementation Details

### Terminal Output Buffer

```go
type TerminalBuffer struct {
    lines []string
    mu    sync.RWMutex
    max   int // default 200 lines
}

func (b *TerminalBuffer) Write(data []byte)      // append, split on newlines
func (b *TerminalBuffer) ReadLast(n int) []string // return last N lines
func (b *TerminalBuffer) ReadAll() []string
```

Added to SSHService — the existing read goroutine writes to both the Wails event (for xterm.js) AND the buffer (for AI reads).

### RDP Screenshot Pipeline

```
RDPService.GetFrame()           → raw RGBA []byte (1920×1080×4 = ~8MB)
    ↓
image.NewRGBA() + copy          → Go image.Image
    ↓
imaging.Resize(1280, 720)       → downscaled for token efficiency
    ↓
jpeg.Encode(quality=85)         → JPEG []byte (~100-200KB)
    ↓
base64.StdEncoding.Encode()     → base64 string (~135-270KB)
    ↓
Claude API image content block  → Vision input
```

A 1280×720 screenshot costs roughly 1,200 input tokens (Anthropic's vision guidance estimates width × height / 750). With on-demand capture (not continuous), this is manageable.

### Key Combo Parsing

For `rdp_keypress`, parse key combo strings into FreeRDP input sequences:

```
"enter"     → scancode 0x1C down, 0x1C up
"ctrl+c"    → Ctrl down, C down, C up, Ctrl up
"alt+tab"   → Alt down, Tab down, Tab up, Alt up
"win+r"     → Win down, R down, R up, Win up
"ctrl+alt+delete" → special handling (Ctrl+Alt+Del)
```

Map key names to scancodes using the existing `input.go` scancode table.

### Token Budget

Track cumulative token usage per day/month. When approaching the configured budget threshold:
- Show warning in the copilot panel header
- Log a warning
- Don't hard-block — the Commander decides whether to continue
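
A sketch of the threshold check. The 80% warning point (`warnPercent`) and all names here are assumptions; the spec only says to warn when usage approaches the configured budget and never to block.

```go
package main

// warnPercent is a hypothetical point at which usage counts as "approaching".
const warnPercent = 80

type budgetState int

const (
	budgetOK   budgetState = iota // below the warning point, or no budget set
	budgetNear                    // at or above warnPercent of the budget
	budgetOver                    // at or above the budget itself
)

// budgetStatus classifies cumulative usage against the configured budget.
// It only informs the UI; callers never block on the result.
func budgetStatus(used, budget int64) budgetState {
	switch {
	case budget <= 0:
		return budgetOK
	case used >= budget:
		return budgetOver
	case used*100 >= budget*warnPercent:
		return budgetNear
	default:
		return budgetOK
	}
}
```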

---

## 10. Security Considerations

- **OAuth tokens** stored in vault (same AES-256-GCM encryption as SSH keys). Access token + refresh token both encrypted at rest.
- **Tokens never logged** — mask in all log output. Only log token expiry times and auth status.
- **Token refresh is automatic and silent** — no user interaction needed after initial login. Refresh token rotation handled properly (new refresh token replaces old).
- **Independent from Claude Code** — Wraith has its own OAuth session. No shared credential files, no race conditions with other Anthropic apps.
- **Fallback API key** also stored in vault if used instead of OAuth.
- **Conversation content** may contain sensitive data (terminal output, file contents, screenshots of desktops). Stored in SQLite alongside other encrypted data. Consider encrypting the messages JSON blob with the vault key.
- **Tool access is unrestricted** — the XO has the same access as the Commander. This is by design. The human is always watching and can take control.
- **No autonomous session creation without Commander context.** The XO can open sessions, but only to saved connections whose credentials the Commander set up.
- **PKCE prevents token interception** — authorization code flow with S256 challenge ensures the code can only be exchanged by the app that initiated the flow