# Wraith AI Copilot — Design Spec

> **Date:** 2026-03-17
> **Purpose:** First-class AI copilot integration — Claude as an XO (Executive Officer) with full terminal, filesystem, and RDP desktop access through Wraith's native protocol channels
> **Depends on:** Wraith Desktop v0.1.0 (all 4 phases complete)
> **License:** MIT (same as Wraith)

---

## 1. What This Is

An AI co-pilot that shares the Commander's view and control of remote systems. The XO (Claude) can:

- **See** RDP desktops via FreeRDP3 bitmap frames → Claude Vision API
- **Type** in SSH terminals via bidirectional stdin/stdout pipes
- **Click** in RDP sessions via FreeRDP3 mouse/keyboard input channels
- **Read/write files** via SFTP — the same connection the terminal uses
- **Open/close sessions** — autonomously connect to hosts from the connection manager

This is NOT a chatbot sidebar. It's a second operator with the same access as the human, working through the same protocol channels Wraith already provides.

**Why this is unique:** No other tool does this. Existing AI coding assistants work on local files. Wraith's XO works on remote servers — SSH terminals, Windows desktops, remote filesystems — all through native protocols. No Playwright, no browser automation, no screen recording. The RDP session IS the viewport. The SSH session IS the shell.

---

## 2. Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                       Wraith Application                        │
│                                                                 │
│ ┌─ AI Service (internal/ai/) ─────────────────────────────────┐ │
│ │                                                             │ │
│ │ ┌──────────────┐  ┌───────────────┐  ┌─────────────────┐    │ │
│ │ │ Claude API   │  │ Tool Dispatch │  │ Conversation    │    │ │
│ │ │ Client       │  │ Router        │  │ Manager         │    │ │
│ │ │ (HTTP + SSE) │  │               │  │ (SQLite)        │    │ │
│ │ └──────┬───────┘  └───────┬───────┘  └─────────────────┘    │ │
│ │        │                  │                                 │ │
│ │        │    ┌─────────────▼──────────────┐                  │ │
│ │        │    │ Tool Definitions           │                  │ │
│ │        │    │                            │                  │ │
│ │        │    │ Terminal: write, read, cwd │                  │ │
│ │        │    │ SFTP: list, read, write    │                  │ │
│ │        │    │ RDP: screenshot, click,    │                  │ │
│ │        │    │      type, keypress, move  │                  │ │
│ │        │    │ Session: list, connect,    │                  │ │
│ │        │    │          disconnect        │                  │ │
│ │        │    └─────────────┬──────────────┘                  │ │
│ │        │                  │                                 │ │
│ └────────┼──────────────────┼─────────────────────────────────┘ │
│          │                  │                                   │
│          ▼                  ▼                                   │
│ ┌─────────────────┐ ┌──────────────────────────────────────┐    │
│ │ Claude API      │ │ Existing Wraith Services             │    │
│ │ (Anthropic)     │ │                                      │    │
│ │                 │ │ SSHService.Write/Read                │    │
│ │ Messages API    │ │ SFTPService.List/Read/Write          │    │
│ │ + Tool Use      │ │ RDPService.SendMouse/SendKey         │    │
│ │ + Vision        │ │ RDPService.GetFrame → JPEG encode    │    │
│ │ + Streaming     │ │ SessionManager.Create/List           │    │
│ └─────────────────┘ └──────────────────────────────────────┘    │
│                                                                 │
│ ┌─ Frontend ──────────────────────────────────────────────────┐ │
│ │ CopilotPanel.vue — right-side collapsible panel             │ │
│ │ ├── Chat messages (streaming, markdown rendered)            │ │
│ │ ├── Tool call visualization (what the XO did)               │ │
│ │ ├── RDP screenshot thumbnails inline                        │ │
│ │ ├── Session awareness (which session XO is focused on)      │ │
│ │ ├── Control toggle (XO driving / Commander driving)         │ │
│ │ └── Quick commands bar                                      │ │
│ └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```

---

## 3. AI Service Layer (`internal/ai/`)

### 3.1 Claude API Client

Direct HTTP client — no Python sidecar, no external SDK. Pure Go.

```go
type ClaudeClient struct {
	apiKey     string // decrypted from vault on demand
	model      string // configurable: claude-sonnet-4-5-20250929, etc.
	httpClient *http.Client
	baseURL    string // https://api.anthropic.com
}

// SendMessage sends a messages API request with tool use + vision support.
// Returns a streaming response channel for token-by-token delivery.
func (c *ClaudeClient) SendMessage(req *MessageRequest) (<-chan StreamEvent, error)
```

**Message format:** Anthropic Messages API v1 (`/v1/messages`).

**Streaming:** SSE (`stream: true`). Parse the `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, and `message_stop` events. There is no separate `tool_use` event: a tool call arrives as a `content_block_start` whose content block has type `tool_use`, followed by `input_json_delta` deltas that carry the tool input as partial JSON. Emit to frontend via Wails events.

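
The SSE framing described above can be parsed with the standard library alone. The following is a minimal sketch, not Wraith's actual client: the `SSEEvent` type and the `ParseSSE` / `ParseSSEString` helpers are hypothetical names, and the sketch collects events into a slice for simplicity where a real client would push them onto the `StreamEvent` channel as they arrive.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// SSEEvent is one server-sent event: the `event:` name plus the joined
// `data:` payload. Hypothetical type, for illustration only.
type SSEEvent struct {
	Name string
	Data string
}

// ParseSSE reads an SSE stream and returns its events in order.
// Events are terminated by a blank line; lines starting with ':' are comments.
func ParseSSE(r io.Reader) ([]SSEEvent, error) {
	var (
		events []SSEEvent
		cur    SSEEvent
		data   []string
	)
	// flush finishes the event being built, if any, and resets the state.
	flush := func() {
		if cur.Name != "" || len(data) > 0 {
			cur.Data = strings.Join(data, "\n")
			events = append(events, cur)
		}
		cur, data = SSEEvent{}, nil
	}

	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long data lines
	for sc.Scan() {
		line := sc.Text()
		switch {
		case line == "":
			flush()
		case strings.HasPrefix(line, ":"):
			// comment or keep-alive: ignore
		case strings.HasPrefix(line, "event:"):
			cur.Name = strings.TrimSpace(strings.TrimPrefix(line, "event:"))
		case strings.HasPrefix(line, "data:"):
			payload := strings.TrimPrefix(line, "data:")
			data = append(data, strings.TrimPrefix(payload, " "))
		}
	}
	flush() // the stream may end without a trailing blank line
	return events, sc.Err()
}

// ParseSSEString is a convenience wrapper for string input.
func ParseSSEString(s string) ([]SSEEvent, error) {
	return ParseSSE(strings.NewReader(s))
}

func main() {
	stream := "event: content_block_delta\n" +
		"data: {\"type\":\"content_block_delta\"}\n\n" +
		"event: message_stop\n" +
		"data: {\"type\":\"message_stop\"}\n\n"
	events, err := ParseSSEString(stream)
	if err != nil {
		panic(err)
	}
	for _, e := range events {
		fmt.Println(e.Name, e.Data)
	}
}
```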

**Vision:** RDP screenshots sent as base64-encoded JPEG in the `image` content block type. Resolution capped at 1280x720 for token efficiency (downscale from native resolution before encoding).

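
As an illustration of the `image` content block, the sketch below builds the JSON shape the Messages API accepts for a base64 image. The Go type names (`ImageBlock`, `ImageSource`) and the `NewJPEGBlock` helper are assumptions made for this example, not names taken from Wraith.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// ImageSource mirrors the `source` object of an image content block.
type ImageSource struct {
	Type      string `json:"type"`       // "base64"
	MediaType string `json:"media_type"` // "image/jpeg" for RDP screenshots
	Data      string `json:"data"`       // base64-encoded image bytes
}

// ImageBlock is a single `image` content block inside a message.
type ImageBlock struct {
	Type   string      `json:"type"` // always "image"
	Source ImageSource `json:"source"`
}

// NewJPEGBlock wraps already-encoded JPEG bytes in an image content block.
func NewJPEGBlock(jpegBytes []byte) ImageBlock {
	return ImageBlock{
		Type: "image",
		Source: ImageSource{
			Type:      "base64",
			MediaType: "image/jpeg",
			Data:      base64.StdEncoding.EncodeToString(jpegBytes),
		},
	}
}

func main() {
	// Three placeholder bytes stand in for a real JPEG here.
	block := NewJPEGBlock([]byte{0xFF, 0xD8, 0xFF})
	out, err := json.Marshal(block)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```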

**Token tracking:** Parse `usage` from the API response. When streaming, `usage` arrives in `message_start` and is updated in `message_delta`. Track `input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens` per conversation. Store totals in SQLite.

### 3.2 Tool Definitions

```go
// InputSchema is written in shorthand below; the API expects a JSON Schema object per tool.
var CopilotTools = []Tool{
	// Terminal
	{Name: "terminal_write", Description: "Type text into an active SSH terminal session",
		InputSchema: {sessionId: string, text: string}},
	{Name: "terminal_read", Description: "Get recent terminal output from an SSH session (last N lines)",
		InputSchema: {sessionId: string, lines: int (default 50)}},
	{Name: "terminal_cwd", Description: "Get the current working directory of an SSH session",
		InputSchema: {sessionId: string}},

	// SFTP
	{Name: "sftp_list", Description: "List files and directories at a remote path",
		InputSchema: {sessionId: string, path: string}},
	{Name: "sftp_read", Description: "Read the contents of a remote file (max 5MB)",
		InputSchema: {sessionId: string, path: string}},
	{Name: "sftp_write", Description: "Write content to a remote file",
		InputSchema: {sessionId: string, path: string, content: string}},

	// RDP
	{Name: "rdp_screenshot", Description: "Capture the current RDP desktop screen as an image",
		InputSchema: {sessionId: string}},
	{Name: "rdp_click", Description: "Click at screen coordinates in an RDP session",
		InputSchema: {sessionId: string, x: int, y: int, button: string (default "left")}},
	{Name: "rdp_doubleclick", Description: "Double-click at coordinates",
		InputSchema: {sessionId: string, x: int, y: int}},
	{Name: "rdp_type", Description: "Type a text string into the RDP session",
		InputSchema: {sessionId: string, text: string}},
	{Name: "rdp_keypress", Description: "Press a key or key combination (e.g. 'enter', 'ctrl+c', 'alt+tab', 'win+r')",
		InputSchema: {sessionId: string, key: string}},
	{Name: "rdp_scroll", Description: "Scroll the mouse wheel at coordinates",
		InputSchema: {sessionId: string, x: int, y: int, delta: int}},
	{Name: "rdp_move", Description: "Move the mouse cursor to coordinates",
		InputSchema: {sessionId: string, x: int, y: int}},

	// Session Management
	{Name: "list_sessions", Description: "List all active SSH and RDP sessions",
		InputSchema: {}},
	{Name: "connect_ssh", Description: "Open a new SSH session to a saved connection",
		InputSchema: {connectionId: int}},
	{Name: "connect_rdp", Description: "Open a new RDP session to a saved connection",
		InputSchema: {connectionId: int}},
	{Name: "disconnect", Description: "Close an active session",
		InputSchema: {sessionId: string}},
}
```

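
The `InputSchema` entries in the list are shorthand. To show what one of them might look like once expanded, here is a sketch of `terminal_read` as a full JSON Schema object. The `Tool` struct layout and the `TerminalReadTool` constructor are assumptions for this example; treating only `sessionId` as required is an inference from the `default 50` note on `lines`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Tool is an assumed shape for a tool definition as sent in the `tools`
// array of a Messages API request.
type Tool struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	InputSchema map[string]any `json:"input_schema"`
}

// TerminalReadTool expands the shorthand
// {sessionId: string, lines: int (default 50)} into JSON Schema.
func TerminalReadTool() Tool {
	return Tool{
		Name:        "terminal_read",
		Description: "Get recent terminal output from an SSH session (last N lines)",
		InputSchema: map[string]any{
			"type": "object",
			"properties": map[string]any{
				"sessionId": map[string]any{"type": "string"},
				"lines":     map[string]any{"type": "integer", "default": 50},
			},
			// lines has a default, so only sessionId is required.
			"required": []string{"sessionId"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(TerminalReadTool(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```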

### 3.3 Tool Dispatch Router

```go
type ToolRouter struct {
	ssh         *ssh.SSHService
	sftp        *sftp.SFTPService
	rdp         *rdp.RDPService
	sessions    *session.Manager
	connections *connections.ConnectionService
}

// Dispatch executes a tool call and returns the result
func (r *ToolRouter) Dispatch(toolName string, input json.RawMessage) (interface{}, error)
```

The router maps tool names to existing Wraith service methods. No new protocol code — everything routes through the services we already built.

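
One way to implement the name-to-method mapping is a handler table keyed by tool name. The sketch below is an assumption about structure, not the actual router: `Handler`, `Router`, `Register`, `ErrUnknownTool`, and `NewDemoRouter` are hypothetical, and the SSH write is replaced by an injected function so the example runs without a live session.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ErrUnknownTool is returned when no handler is registered for a tool name.
var ErrUnknownTool = errors.New("unknown tool")

// Handler decodes a tool's JSON input and performs the call.
type Handler func(input json.RawMessage) (any, error)

// Router is a minimal name-to-handler dispatch table.
type Router struct {
	handlers map[string]Handler
}

// Register adds or replaces the handler for a tool name.
func (r *Router) Register(name string, h Handler) {
	if r.handlers == nil {
		r.handlers = make(map[string]Handler)
	}
	r.handlers[name] = h
}

// Dispatch looks up the handler for name and runs it.
func (r *Router) Dispatch(name string, input json.RawMessage) (any, error) {
	h, ok := r.handlers[name]
	if !ok {
		return nil, fmt.Errorf("%w: %q", ErrUnknownTool, name)
	}
	return h(input)
}

// NewDemoRouter wires terminal_write to the given write function, which
// stands in for a call into the SSH service.
func NewDemoRouter(write func(sessionID, text string) error) *Router {
	r := &Router{}
	r.Register("terminal_write", func(input json.RawMessage) (any, error) {
		var args struct {
			SessionID string `json:"sessionId"`
			Text      string `json:"text"`
		}
		if err := json.Unmarshal(input, &args); err != nil {
			return nil, fmt.Errorf("terminal_write: bad input: %w", err)
		}
		if err := write(args.SessionID, args.Text); err != nil {
			return nil, err
		}
		return map[string]any{"written": len(args.Text)}, nil
	})
	return r
}

func main() {
	r := NewDemoRouter(func(sessionID, text string) error {
		fmt.Printf("session %s <- %q\n", sessionID, text)
		return nil
	})
	out, err := r.Dispatch("terminal_write",
		json.RawMessage(`{"sessionId":"s1","text":"ls -la\n"}`))
	fmt.Println(out, err)
}
```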

**Terminal output buffering:** The `terminal_read` tool needs recent output. Add an output ring buffer to SSHService that stores the last N lines (configurable, default 200) of each session's stdout. The buffer is written to by the existing read goroutine and read by the tool dispatcher.

**RDP screenshot encoding:** The `rdp_screenshot` tool calls `RDPService.GetFrame()` to get the raw RGBA pixel buffer, downscales to 1280x720 if larger, encodes as JPEG (quality 85), and returns as base64. This is the image that gets sent to Claude's Vision API.

### 3.4 Conversation Manager

```go
type Conversation struct {
	ID        string
	Messages  []Message
	Model     string
	CreatedAt time.Time
	TokensIn  int
	TokensOut int
}

type ConversationManager struct {
	db     *sql.DB
	active *Conversation
}

// Create starts a new conversation
// Load resumes a saved conversation
// AddMessage appends a message and persists to SQLite
// GetHistory returns the full message list for API calls
// GetTokenUsage returns cumulative token counts
```

Conversations are persisted to a `conversations` SQLite table with messages stored as JSON. This allows resuming a conversation across app restarts.

### 3.5 System Prompt

```
You are the XO (Executive Officer) aboard the Wraith command station. The Commander
(human operator) works alongside you managing remote servers and workstations.

You have direct access to all active sessions through your tools:
- SSH terminals: read output, type commands, navigate filesystems
- SFTP: read and write remote files
- RDP desktops: see the screen, click, type, interact with any GUI application
- Session management: open new connections, close sessions

When given a task:
1. Assess what sessions and access you need
2. Execute efficiently — don't ask for permission to use tools, just use them
3. Report what you found or did, with relevant details
4. If something fails, diagnose and try an alternative approach

You are not an assistant answering questions. You are an operator executing missions.
Act decisively. Use your tools. Report results.
```

---

## 4. Data Model Additions

```sql
-- AI conversations
CREATE TABLE IF NOT EXISTS conversations (
    id          TEXT PRIMARY KEY,
    title       TEXT,
    model       TEXT NOT NULL,
    messages    TEXT NOT NULL DEFAULT '[]', -- JSON array of messages
    tokens_in   INTEGER DEFAULT 0,
    tokens_out  INTEGER DEFAULT 0,
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at  DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- AI settings (stored in existing settings table)
-- ai_api_key_encrypted — Claude API key (vault-encrypted)
-- ai_model — default model
-- ai_max_tokens — max response tokens (default 4096)
-- ai_rdp_capture_rate — screenshot interval in seconds (default: on-demand)
-- ai_token_budget — monthly token budget warning threshold
```

Add migration `002_ai_copilot.sql` for the conversations table.

---

## 5. Frontend: Copilot Panel

### Layout

```
┌──────────────────────────────────────────┬──────────────┐
│                                          │              │
│  Terminal / RDP                          │   COPILOT    │
│  (existing)                              │   PANEL      │
│                                          │   (320px)    │
│                                          │              │
│                                          │  [Messages]  │
│                                          │  [Tool viz]  │
│                                          │  [Thumbs]    │
│                                          │              │
│                                          │  [Input]     │
├──────────────────────────────────────────┴──────────────┤
│ Status bar                                              │
└─────────────────────────────────────────────────────────┘
```

The copilot panel is a **right-side collapsible panel** (320px default width, resizable). Toggle via toolbar button or Ctrl+Shift+K.

### Components

**`CopilotPanel.vue`** — main container:
- Header: "XO" label, model selector dropdown, token counter, close button
- Message list: scrollable, auto-scroll on new messages
- Tool call cards: collapsible, show tool name + input + result
- RDP screenshots: inline thumbnails (click to expand)
- Input area: textarea with send button, Shift+Enter for newlines, Enter to send

**`CopilotMessage.vue`** — single message:
- Commander messages: right-aligned, blue accent
- XO messages: left-aligned, markdown rendered (code blocks, lists, etc.)
- Tool use blocks: collapsible card showing tool name, input params, result

**`CopilotToolViz.vue`** — tool call visualization:
- Icon per tool type (terminal icon, folder icon, monitor icon, etc.)
- Summary line: "Typed `ls -la` in Asgard (SSH)", "Screenshot from DC01 (RDP)"
- Expandable detail showing raw input/output

**`CopilotSettings.vue`** — configuration modal:
- API key input (stored encrypted in vault)
- Model selector
- Token budget threshold
- RDP capture settings
- Conversation history management

### Streaming

Claude API responses stream token-by-token:

```
Go:       Claude API (SSE) → parse events → Wails events
Frontend: listen for Wails events → append to message → re-render
```

Events:
- `ai:text:{conversationId}` — text delta (append to current message)
- `ai:tool_use:{conversationId}` — tool call started (show pending card)
- `ai:tool_result:{conversationId}` — tool call completed (update card)
- `ai:done:{conversationId}` — response complete
- `ai:error:{conversationId}` — error occurred

---

## 6. Go Backend Structure

```
internal/
  ai/
    service.go               # AIService — orchestrates everything
    service_test.go
    client.go                # ClaudeClient — HTTP + SSE to Anthropic API
    client_test.go
    tools.go                 # Tool definitions (JSON schema)
    router.go                # ToolRouter — dispatches tool calls to Wraith services
    router_test.go
    conversation.go          # ConversationManager — persistence + history
    conversation_test.go
    types.go                 # Message, Tool, StreamEvent types
    screenshot.go            # RDP frame → JPEG encode + downscale
    screenshot_test.go
    terminal_buffer.go       # Ring buffer for terminal output history
    terminal_buffer_test.go
  db/
    migrations/
      002_ai_copilot.sql     # conversations table
```

---

## 7. Frontend Structure

```
frontend/src/
  components/
    copilot/
      CopilotPanel.vue       # Main panel container
      CopilotMessage.vue     # Single message (commander or XO)
      CopilotToolViz.vue     # Tool call visualization card
      CopilotSettings.vue    # API key, model, budget configuration
  composables/
    useCopilot.ts            # AI service wrappers, streaming, state
  stores/
    copilot.store.ts         # Conversation state, messages, streaming
```

---

## 8. Implementation Phases

| Phase | Deliverables |
|---|---|
| **A: Core** | AI service, Claude API client (HTTP + SSE streaming), tool definitions, tool router, conversation manager, SQLite migration, terminal output buffer |
| **B: Terminal Tools** | Wire terminal_write/read/cwd + sftp_* tools to existing services, test with real SSH sessions |
| **C: RDP Vision** | Screenshot capture (RGBA → JPEG → base64), rdp_screenshot tool, vision in API calls |
| **D: RDP Input** | rdp_click/type/keypress/move/scroll tools, coordinate mapping, key combo parsing |
| **E: Frontend** | CopilotPanel, message streaming, tool visualization, settings, conversation persistence |

---

## 9. Key Implementation Details

### Terminal Output Buffer

```go
type TerminalBuffer struct {
	lines []string
	mu    sync.RWMutex
	max   int // default 200 lines
}

func (b *TerminalBuffer) Write(data []byte)       // append, split on newlines
func (b *TerminalBuffer) ReadLast(n int) []string // return last N lines
func (b *TerminalBuffer) ReadAll() []string
```

Added to SSHService — the existing read goroutine writes to both the Wails event (for xterm.js) AND the buffer (for AI reads).

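
A possible body for these methods is sketched below. It is one reading of the declarations, with assumptions called out: it keeps a bounded slice that drops the oldest lines (a simpler stand-in for a true circular buffer), it adds a hypothetical `partial` field to hold bytes that arrive before their newline, and it does not strip ANSI escape sequences. `NewTerminalBuffer` is also a hypothetical constructor.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// TerminalBuffer keeps the most recent complete lines of terminal output.
type TerminalBuffer struct {
	mu      sync.RWMutex
	lines   []string
	partial []byte // bytes received after the last newline (assumed field)
	max     int
}

// NewTerminalBuffer returns a buffer holding at most max lines (200 if max <= 0).
func NewTerminalBuffer(max int) *TerminalBuffer {
	if max <= 0 {
		max = 200
	}
	return &TerminalBuffer{max: max}
}

// Write appends raw output, splitting it into lines on '\n'.
// A trailing fragment without a newline is held until the next Write.
func (b *TerminalBuffer) Write(data []byte) {
	b.mu.Lock()
	defer b.mu.Unlock()

	b.partial = append(b.partial, data...)
	for {
		i := bytes.IndexByte(b.partial, '\n')
		if i < 0 {
			break
		}
		line := bytes.TrimRight(b.partial[:i], "\r") // tolerate CRLF
		b.lines = append(b.lines, string(line))
		b.partial = b.partial[i+1:]
	}
	// Copy the remainder so the consumed prefix can be garbage collected.
	b.partial = append([]byte(nil), b.partial...)

	// Drop the oldest lines once the cap is exceeded.
	if over := len(b.lines) - b.max; over > 0 {
		b.lines = append([]string(nil), b.lines[over:]...)
	}
}

// ReadLast returns a copy of the last n complete lines (fewer if not available).
func (b *TerminalBuffer) ReadLast(n int) []string {
	b.mu.RLock()
	defer b.mu.RUnlock()

	if n > len(b.lines) {
		n = len(b.lines)
	}
	if n < 0 {
		n = 0
	}
	out := make([]string, n)
	copy(out, b.lines[len(b.lines)-n:])
	return out
}

// ReadAll returns a copy of every buffered line.
func (b *TerminalBuffer) ReadAll() []string {
	return b.ReadLast(b.max)
}

func main() {
	buf := NewTerminalBuffer(3)
	buf.Write([]byte("one\ntwo\nthree\nfour\n"))
	fmt.Println(buf.ReadLast(2))
}
```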

### RDP Screenshot Pipeline

```
RDPService.GetFrame()                 → raw RGBA []byte (1920×1080×4 = ~8MB)
        ↓
image.NewRGBA() + copy                → Go image.Image
        ↓
imaging.Resize(1280, 720)             → downscaled for token efficiency
        ↓
jpeg.Encode(quality=85)               → JPEG []byte (~100-200KB)
        ↓
base64.StdEncoding.EncodeToString()   → base64 string (~135-270KB)
        ↓
Claude API image content block        → Vision input
```

One 1280×720 screenshot costs roughly 1,200 tokens by Anthropic's published estimate (width × height / 750). With on-demand capture (not continuous), this is manageable.

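
The pipeline can be sketched with the standard library only. Note the swap: instead of `imaging.Resize`, this example uses a hand-written nearest-neighbour downscale so that it has no third-party dependency, which gives lower visual quality than a filtered resize. `EncodeFrame`, `fitWithin`, and `DecodedSize` are hypothetical names, and the frame is assumed to be tightly packed RGBA as stated in the spec.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"errors"
	"fmt"
	"image"
	"image/jpeg"
)

// fitWithin returns the largest size inside maxW x maxH that keeps the
// w:h aspect ratio. Frames already within bounds are left unchanged.
func fitWithin(w, h, maxW, maxH int) (int, int) {
	if w <= maxW && h <= maxH {
		return w, h
	}
	dw, dh := w*maxH/h, maxH
	if w*maxH > h*maxW { // width is the limiting side
		dw, dh = maxW, h*maxW/w
	}
	if dw < 1 {
		dw = 1
	}
	if dh < 1 {
		dh = 1
	}
	return dw, dh
}

// EncodeFrame turns a raw RGBA frame into a base64 JPEG string, downscaling
// first if needed. It also returns the encoded image's width and height.
func EncodeFrame(rgba []byte, w, h, maxW, maxH, quality int) (string, int, int, error) {
	if w <= 0 || h <= 0 || len(rgba) != w*h*4 {
		return "", 0, 0, errors.New("frame size does not match width*height*4")
	}
	src := &image.RGBA{Pix: rgba, Stride: w * 4, Rect: image.Rect(0, 0, w, h)}
	dst := src
	dw, dh := fitWithin(w, h, maxW, maxH)
	if dw != w || dh != h {
		// Nearest-neighbour sampling: pick one source pixel per target pixel.
		dst = image.NewRGBA(image.Rect(0, 0, dw, dh))
		for y := 0; y < dh; y++ {
			sy := y * h / dh
			for x := 0; x < dw; x++ {
				si := (sy*w + x*w/dw) * 4
				di := y*dst.Stride + x*4
				copy(dst.Pix[di:di+4], rgba[si:si+4])
			}
		}
	}
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, dst, &jpeg.Options{Quality: quality}); err != nil {
		return "", 0, 0, err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), dw, dh, nil
}

// DecodedSize decodes the base64 JPEG header and reports its dimensions.
func DecodedSize(b64 string) (int, int, error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return 0, 0, err
	}
	cfg, err := jpeg.DecodeConfig(bytes.NewReader(raw))
	if err != nil {
		return 0, 0, err
	}
	return cfg.Width, cfg.Height, nil
}

func main() {
	frame := make([]byte, 1920*1080*4) // all-black test frame
	b64, w, h, err := EncodeFrame(frame, 1920, 1080, 1280, 720, 85)
	if err != nil {
		panic(err)
	}
	fmt.Printf("encoded %dx%d, %d base64 chars\n", w, h, len(b64))
}
```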

### Key Combo Parsing

For `rdp_keypress`, parse key combo strings into FreeRDP input sequences:

```
"enter"           → scancode 0x1C down, 0x1C up
"ctrl+c"          → Ctrl down, C down, C up, Ctrl up
"alt+tab"         → Alt down, Tab down, Tab up, Alt up
"win+r"           → Win down, R down, R up, Win up
"ctrl+alt+delete" → special handling (Ctrl+Alt+Del)
```

Map key names to scancodes using the existing `input.go` scancode table.

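
The press and release ordering in the table (keys go down left to right, then come up in reverse) can be sketched as follows. `KeyEvent` and `ParseKeyCombo` are hypothetical names. The sketch stops at key names and leaves scancode lookup to the existing table; it treats `ctrl+alt+delete` like any other combo rather than giving it the special handling noted above, and it cannot express a literal `+` key.

```go
package main

import (
	"fmt"
	"strings"
)

// KeyEvent is one key transition: Down is true for press, false for release.
type KeyEvent struct {
	Key  string
	Down bool
}

// modifiers lists the key names allowed before the final key of a combo.
var modifiers = map[string]bool{"ctrl": true, "alt": true, "shift": true, "win": true}

// ParseKeyCombo turns "ctrl+c" into: ctrl down, c down, c up, ctrl up.
// Key names are lower-cased; every key except the last must be a modifier.
func ParseKeyCombo(combo string) ([]KeyEvent, error) {
	parts := strings.Split(strings.ToLower(strings.TrimSpace(combo)), "+")
	events := make([]KeyEvent, 0, 2*len(parts))

	// Press keys in the order written.
	for i, p := range parts {
		p = strings.TrimSpace(p)
		if p == "" {
			return nil, fmt.Errorf("empty key name in combo %q", combo)
		}
		if i < len(parts)-1 && !modifiers[p] {
			return nil, fmt.Errorf("%q is not a modifier in combo %q", p, combo)
		}
		parts[i] = p
		events = append(events, KeyEvent{Key: p, Down: true})
	}
	// Release in reverse order.
	for i := len(parts) - 1; i >= 0; i-- {
		events = append(events, KeyEvent{Key: parts[i], Down: false})
	}
	return events, nil
}

func main() {
	events, err := ParseKeyCombo("ctrl+c")
	if err != nil {
		panic(err)
	}
	for _, e := range events {
		fmt.Printf("%s down=%v\n", e.Key, e.Down)
	}
}
```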

### Token Budget

Track cumulative token usage per day/month. When approaching the configured budget threshold:
- Show warning in the copilot panel header
- Log a warning
- Don't hard-block — the Commander decides whether to continue

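
The warn-but-never-block policy above can be expressed as a pure function that only classifies usage. This is a sketch under assumptions: the `BudgetStatus` type, the `CheckBudget` name, and the `warnPercent` parameter (how close to the budget counts as "approaching") are all hypothetical, and a budget of zero or less is treated as "no budget configured".

```go
package main

import "fmt"

// BudgetStatus classifies current usage against the configured budget.
type BudgetStatus int

const (
	BudgetOK       BudgetStatus = iota // comfortably under budget
	BudgetWarning                      // at or above the warning percentage
	BudgetExceeded                     // at or above the budget itself
)

// CheckBudget reports where used tokens stand relative to budget.
// It never blocks anything; callers decide how to surface the status.
func CheckBudget(used, budget, warnPercent int64) BudgetStatus {
	switch {
	case budget <= 0:
		return BudgetOK // no budget configured
	case used >= budget:
		return BudgetExceeded
	case used*100 >= budget*warnPercent:
		return BudgetWarning
	default:
		return BudgetOK
	}
}

func main() {
	// 850k of a 1M monthly budget with an 80% warning level.
	switch CheckBudget(850_000, 1_000_000, 80) {
	case BudgetWarning:
		fmt.Println("approaching monthly token budget")
	case BudgetExceeded:
		fmt.Println("monthly token budget exceeded")
	default:
		fmt.Println("within budget")
	}
}
```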

---

## 10. Security Considerations

- **API key** stored in vault (same AES-256-GCM encryption as SSH keys)
- **API key never logged** — mask in all log output
- **Conversation content** may contain sensitive data (terminal output, file contents, screenshots of desktops). Stored in SQLite alongside other encrypted data. Consider encrypting the messages JSON blob with the vault key.
- **Tool access is unrestricted** — the XO has the same access as the Commander. This is by design. The human is always watching and can take control.
- **No autonomous session creation without Commander context** — the XO can open sessions, but the connections (with credentials) were set up by the Commander