Webchat Attachment Support Implementation Plan

Date: 2026-02-13
Status: Planning
Scope: Minimal — Image attachments first (audio as follow-up)

Overview

Enable webchat users to send image attachments through the gateway to the agent. This completes the "webchat" work deferred from the P6 media pipeline: phase 6c_gateway_protocol_attachments completed the protocol-level support, but the UI integration was left for later.

Current State

  • Protocol: GatewayAttachment type exists in src/gateway/protocol.ts (added in phase 6c_gateway_protocol_attachments)
  • Backend: The agent.send handler in src/gateway/handlers/agent.ts already accepts an optional attachments parameter and converts it to the channel Attachment[] format
  • Media pipeline: All model clients support multimodal messages (images via base64 or URL)
  • Other channels: Telegram, Discord, Slack, WhatsApp all extract and pass image attachments
  • Webchat: Currently text-only

Scope

In Scope

  • Image attachments (jpeg, png, gif, webp)
  • File input UI in webchat
  • Client-side base64 encoding
  • Passing attachments through the existing agent.send attachments parameter (no protocol changes)
  • Basic error handling (file size, type validation)
  • Tests for gateway attachment handling
  • Update docs/plans/state.json (mark phase 6d_webchat_attachments complete)

Out of Scope (Follow-up)

  • Audio attachments (requires different UX — separate mic button, audio preview)
  • Document attachments (PDF, etc.)
  • Image preview/thumbnail UI in message list
  • Drag-and-drop file upload
  • Multiple file selection
  • Image compression/resizing client-side

Implementation Plan

Phase 1: UI Changes (Frontend)

File: src/gateway/ui/pages/chat.js

1.1 Add File Input Element

Add a hidden file input and an "attach" button next to the send button.

<!-- Inside the chat input wrapper -->
<input type="file" id="file-input" accept="image/jpeg,image/png,image/gif,image/webp" style="display:none" />
<button id="attach-btn" class="btn-icon" title="Attach image">📎</button>
<button id="send-btn" class="btn-primary">Send</button>

Tasks:

  • Add file input and attach button to render() function
  • Store references in _elements object
  • Wire handlers: a click on the attach button triggers the file input, and the file input's change event calls handleFileSelect()
  • Accept only image MIME types
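
The wiring described in these tasks could look roughly like the sketch below. The helper name wireAttachControls and the attachBtn / fileInput keys are assumptions made for illustration; the plan only says that references are stored in _elements.

```javascript
// Hypothetical helper: `elements` mirrors the plan's _elements object and is
// assumed to expose attachBtn and fileInput DOM nodes.
function wireAttachControls(elements, onFileSelect) {
  const { attachBtn, fileInput } = elements;
  if (!attachBtn || !fileInput) return false;

  // The visible button forwards clicks to the hidden <input type="file">.
  attachBtn.addEventListener('click', (event) => {
    event.preventDefault();
    fileInput.click();
  });

  // The browser fires "change" once the user has picked a file.
  fileInput.addEventListener('change', onFileSelect);
  return true;
}
```

Returning a boolean lets render() notice when one of the elements is missing instead of failing silently.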

1.2 File Selection Handler

Read selected file, validate, encode to base64, store in component state.

let _selectedFile = null; // { mimeType, data, filename }

async function handleFileSelect(event) {
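  const input = event.target;
  const file = input.files?.[0];
  // Reset the input right away so that picking the same file again still
  // fires "change". The File object held in `file` stays readable.
  input.value = '';
  if (!file) return;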
  
  // Validate type
  const supportedTypes = ['image/jpeg', 'image/png', 'image/gif', 'image/webp'];
  if (!supportedTypes.includes(file.type)) {
    showSystemMessage('❌ Unsupported file type. Only JPEG, PNG, GIF, and WebP images are supported.');
    return;
  }
  
  // Validate size (5MB limit)
  const MAX_SIZE = 5 * 1024 * 1024;
  if (file.size > MAX_SIZE) {
    showSystemMessage('❌ File too large. Maximum size is 5MB.');
    return;
  }
  
  // Read and encode
  const reader = new FileReader();
  reader.onload = () => {
    const base64 = reader.result.split(',')[1]; // Strip data URI prefix
    _selectedFile = {
      mimeType: file.type,
      data: base64,
      filename: file.name,
    };
    updateAttachmentUI();
  };
  reader.onerror = () => {
    showSystemMessage('❌ Failed to read file.');
  };
  reader.readAsDataURL(file);
}

Tasks:

  • Implement handleFileSelect()
  • Validate MIME type against ['image/jpeg', 'image/png', 'image/gif', 'image/webp']
  • Validate file size (5MB max)
  • Read file as base64 using FileReader
  • Store in _selectedFile state variable
  • Clear file input after reading
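
To make the validation and prefix-stripping steps testable without a browser, they can be pulled out into pure functions. This is a sketch, not part of the plan: the names validateImageFile and base64FromDataUrl are hypothetical, and the limits are the ones stated above.

```javascript
// Limits taken from the plan: four image types and a 5 MiB cap.
const SUPPORTED_IMAGE_TYPES = ['image/jpeg', 'image/png', 'image/gif', 'image/webp'];
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

// Returns an error string, or null when the file is acceptable.
// `file` only needs `type` and `size`, so a browser File works as-is.
function validateImageFile(file) {
  if (!SUPPORTED_IMAGE_TYPES.includes(file.type)) {
    return 'Unsupported file type. Only JPEG, PNG, GIF, and WebP images are supported.';
  }
  if (file.size > MAX_IMAGE_BYTES) {
    return 'File too large. Maximum size is 5MB.';
  }
  return null;
}

// Splits a FileReader data URL ("data:<mime>;base64,<payload>") and returns
// the payload, or null if the string is not a base64 data URL.
function base64FromDataUrl(dataUrl) {
  if (typeof dataUrl !== 'string') return null;
  const comma = dataUrl.indexOf(',');
  if (comma === -1 || !dataUrl.slice(0, comma).endsWith(';base64')) return null;
  return dataUrl.slice(comma + 1);
}
```

handleFileSelect() could then call validateImageFile(file) before reading and base64FromDataUrl(reader.result) inside onload, treating a null result as a read failure.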

1.3 Attachment UI Indicator

Show selected file name and allow removal before sending.

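function updateAttachmentUI() {
  const indicator = _elements.attachmentIndicator;
  if (!indicator) return;

  // Write into the name <span> only; setting textContent on the wrapper
  // would also remove the clear button.
  const name = indicator.querySelector('.attachment-name');
  if (!name) return;

  if (_selectedFile) {
    name.textContent = `📎 ${_selectedFile.filename}`;
    indicator.classList.remove('hidden');
  } else {
    name.textContent = '';
    indicator.classList.add('hidden');
  }
}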

function clearAttachment() {
  _selectedFile = null;
  const fileInput = _elements.fileInput;
  if (fileInput) fileInput.value = '';
  updateAttachmentUI();
}

UI element (add to input wrapper):

<div id="attachment-indicator" class="attachment-indicator hidden">
  <span class="attachment-name"></span>
  <button class="btn-clear" title="Remove">×</button>
</div>

Tasks:

  • Add attachment indicator element to UI
  • Implement updateAttachmentUI() to show/hide indicator
  • Implement clearAttachment() to reset state
  • Add clear button handler

1.4 Update Send Handler

Pass attachments array to agent.send RPC call.

async function sendMessage() {
  const text = _elements.input.value.trim();
  const attachment = _selectedFile;
  
  if (!text && !attachment) return; // Nothing to send
  
  // ... existing command parsing ...
  
  // Prepare RPC params
  const params = { message: text };
  if (attachment) {
    params.attachments = [attachment];
  }
  
  // Add user message to UI (with attachment indicator if present)
  let displayText = text;
  if (attachment) {
    // Avoid a trailing newline when the message has no text
    displayText = text ? `📎 ${attachment.filename}\n${text}` : `📎 ${attachment.filename}`;
  }
  _elements.messages.appendChild(createMessageEl('user', displayText));
  
  // Clear input and attachment
  _elements.input.value = '';
  clearAttachment();
  
  // Send RPC
  try {
    await client.call('agent.send', params, handleStreamingEvent);
  } catch (err) {
    // ... error handling ...
  }
}

Tasks:

  • Modify sendMessage() to include attachments in RPC params if _selectedFile is set
  • Update user message display to show attachment indicator
  • Clear _selectedFile after sending
  • Allow sending attachment-only messages (text optional)
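
The parameter-building part of sendMessage() can be isolated so the "text optional" rule is easy to unit test. The sketch below assumes a hypothetical helper named buildSendRequest; the params shape ({ message, attachments }) follows the examples in this plan.

```javascript
// Hypothetical helper: returns null when there is nothing to send, otherwise
// the RPC params plus the text to show in the message list.
function buildSendRequest(rawText, attachment) {
  const text = (rawText || '').trim();
  if (!text && !attachment) return null;

  const params = { message: text };
  if (attachment) params.attachments = [attachment];

  // Put the filename on its own line; skip the newline for attachment-only messages.
  const label = attachment ? `📎 ${attachment.filename || 'image'}` : '';
  const displayText = [label, text].filter(Boolean).join('\n');
  return { params, displayText };
}
```

sendMessage() would return early on a null result and otherwise pass params to client.call('agent.send', ...).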

1.5 CSS Styling

File: src/gateway/ui/style.css

/* File input and attach button */
#attach-btn {
  background: transparent;
  border: none;
  font-size: 1.2rem;
  cursor: pointer;
  padding: 0.5rem;
  opacity: 0.7;
}

#attach-btn:hover {
  opacity: 1;
}

/* Attachment indicator */
.attachment-indicator {
  display: flex;
  align-items: center;
  gap: 0.5rem;
  padding: 0.5rem;
  background: var(--primary-bg);
  border: 1px solid var(--border);
  border-radius: 4px;
  margin-bottom: 0.5rem;
  font-size: 0.9rem;
}

.attachment-indicator.hidden {
  display: none;
}

.attachment-indicator .btn-clear {
  background: transparent;
  border: none;
  color: var(--text-muted);
  cursor: pointer;
  font-size: 1.2rem;
  line-height: 1;
  padding: 0 0.25rem;
}

.attachment-indicator .btn-clear:hover {
  color: var(--text-primary);
}

Tasks:

  • Add styles for attach button
  • Add styles for attachment indicator
  • Add styles for clear button
  • Ensure responsive layout

Phase 2: Testing

2.1 Backend Integration Tests

File: src/gateway/handlers/handlers.test.ts (extend existing tests)

Add tests for agent.send with attachments:

describe('agent.send with attachments', () => {
  it('should accept image attachments and convert to channel format', async () => {
    const params = {
      message: 'What is this?',
      attachments: [
        {
          mimeType: 'image/jpeg',
          data: 'base64data',
          filename: 'photo.jpg',
        },
      ],
    };
    // ... test that agent.process receives correct Attachment[]
  });

  it('should allow attachment-only messages', async () => {
    const params = {
      message: '',
      attachments: [{ mimeType: 'image/png', data: 'base64' }],
    };
    // ... test that message is processed
  });

  it('should handle missing data/url gracefully', async () => {
    const params = {
      message: 'test',
      attachments: [{ mimeType: 'image/jpeg' }], // no data or url
    };
    // ... test graceful handling (attachment ignored or error)
  });
});

Tasks:

  • Add tests for agent.send with attachments parameter
  • Verify conversion from GatewayAttachment[] to Attachment[]
  • Test attachment-only messages (empty text)
  • Test malformed attachments (missing data/url)
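
As a reference point for what "graceful handling" might mean in these tests, here is a sketch of a conversion that drops attachments lacking both data and url. It is not the project's implementation: the real logic lives in src/gateway/handlers/agent.ts, and the output shape used here (type, mimeType, data or url, filename) is an assumption rather than the actual Attachment type.

```javascript
// Sketch only. The function name and the output fields are hypothetical.
function toChannelAttachments(gatewayAttachments) {
  if (!Array.isArray(gatewayAttachments)) return [];
  return gatewayAttachments
    .filter((a) => a && typeof a.mimeType === 'string')
    // An attachment needs a payload: inline base64 data or a URL.
    .filter((a) => (typeof a.data === 'string' && a.data.length > 0) ||
                   (typeof a.url === 'string' && a.url.length > 0))
    .map((a) => ({
      type: a.mimeType.startsWith('image/') ? 'image' : 'file',
      mimeType: a.mimeType,
      ...(a.data ? { data: a.data } : { url: a.url }),
      ...(a.filename ? { filename: a.filename } : {}),
    }));
}
```

If the handler is instead expected to reject malformed attachments with an error, the third test should assert on that error rather than on an empty list.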

2.2 Manual Testing Checklist

Test cases:

  1. Select and send a JPEG image → agent receives image, responds with description
  2. Select and send a PNG image → agent receives image
  3. Select an invalid file type (e.g., .txt) → error message shown, file rejected
  4. Select a large file (>5MB) → error message shown, file rejected
  5. Select an image, then clear it before sending → no attachment sent
  6. Send attachment-only message (no text) → agent receives image
  7. Send text + image → agent receives both
  8. Send message, then send another with attachment → both messages processed correctly
  9. Test on mobile (touch events for attach button)

Tasks:

  • Run full test suite: pnpm test:run
  • Run typecheck: pnpm typecheck
  • Manual testing in browser (Desktop Chrome, Firefox)
  • Manual testing on mobile (Android/iOS)

Phase 3: Documentation Updates

3.1 Update docs/plans/state.json

Add new phase entry under p6-enhanced-media-pipeline:

"6d_webchat_attachments": {
  "priority": "P6",
  "status": "completed",
  "description": "Webchat UI for image attachments: file input, base64 encoding, attachment indicator, send handler, gateway protocol integration",
  "files_modified": [
    "src/gateway/ui/pages/chat.js",
    "src/gateway/ui/style.css",
    "src/gateway/handlers/handlers.test.ts"
  ],
  "test_status": "manual + integration tests passing"
}

Tasks:

  • Add 6d_webchat_attachments phase
  • Update p6-enhanced-media-pipeline summary to include webchat
  • Update overall test count if applicable

3.2 Update README or User Docs (if needed)

File: README.md (webchat section)

Add note about attachment support:

### Webchat Features

- Real-time chat with streaming responses
- Markdown rendering with syntax highlighting
- Slash commands (/help, /reset, /compact, /usage, /status, /model)
- Web search toggle
- **Image attachments** (JPEG, PNG, GIF, WebP up to 5MB)
- Message actions (copy, edit)

Tasks:

  • Update README webchat section (if exists)
  • Add attachment feature to feature list

File Changes Summary

Modified Files

  1. src/gateway/ui/pages/chat.js

    • Add file input element and attach button
    • Implement handleFileSelect() for base64 encoding
    • Add _selectedFile state variable
    • Add updateAttachmentUI() and clearAttachment() helpers
    • Modify sendMessage() to include attachments in RPC params
    • Update user message display to show attachment indicator
  2. src/gateway/ui/style.css

    • Styles for attach button
    • Styles for attachment indicator
    • Styles for clear button
  3. src/gateway/handlers/handlers.test.ts

    • Add tests for agent.send with attachments
    • Test attachment-only messages
    • Test malformed attachments
  4. docs/plans/state.json

    • Add 6d_webchat_attachments phase entry
  5. README.md (optional)

    • Update webchat feature list

No Changes Needed

  • src/gateway/protocol.ts (already has GatewayAttachment type)
  • src/gateway/handlers/agent.ts (already handles attachments parameter)
  • src/models/media.ts (already has conversion logic)
  • src/channels/types.ts (already has Attachment type)

Follow-up Work (Not in Scope)

Audio Attachments

  • Requires different UX: separate mic button, recording UI, audio waveform preview
  • Audio transcription via Whisper (already supported in backend)
  • Estimate: 1-2 days

Enhanced UX

  • Image preview/thumbnail in message list
  • Drag-and-drop file upload
  • Multiple file selection
  • Client-side image compression/resizing (reduce bandwidth)
  • Copy/paste image from clipboard
  • Estimate: 2-3 days

Notes

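  • Security: Client-side validation can be bypassed and is not sufficient on its own. MIME type and size checks on the backend are currently delegated to the model providers; the gateway should add its own checks if that proves insufficient.
  • Performance: Base64 encoding inflates data by about a third, so a 5MB file becomes a payload of roughly 6.7MB, which is still reasonable for a WebSocket message. Larger files should use URL-based attachments (out of scope).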
  • Compatibility: FileReader API is widely supported (IE10+, all modern browsers).
  • Mobile: File input works on mobile browsers (triggers camera/gallery picker).
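
The size arithmetic behind the performance note can be checked with two small helpers. Both function names are hypothetical, and the decoded-length helper assumes standard padded base64 whose length is a multiple of 4, which is what readAsDataURL produces.

```javascript
// Length of the base64 text produced from `byteLength` raw bytes (with padding).
function base64EncodedLength(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Raw byte count represented by a padded base64 string, without decoding it.
function base64DecodedLength(base64) {
  const padding = base64.endsWith('==') ? 2 : base64.endsWith('=') ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // 5,242,880 bytes, as in the plan
console.log(base64EncodedLength(MAX_IMAGE_BYTES)); // 6990508 characters, roughly 6.67 MiB
```

A gateway-side size check, if one is added later, could compare base64DecodedLength(attachment.data) against the same limit without decoding the payload.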

Estimated Effort

  • Phase 1 (UI): 2-3 hours
  • Phase 2 (Testing): 1 hour
  • Phase 3 (Docs): 30 minutes
  • Total: ~4 hours