task-11: complete QA + hardening with resilience fixes
- Created comprehensive QA checklist covering edge cases (missing EXIF, timezones, codecs, corrupt files)
- Added ErrorBoundary component wrapped around TimelineTree and MediaPanel
- Created global error.tsx page for unhandled errors
- Improved failed asset UX with red borders, warning icons, and inline error display
- Added loading skeletons to TimelineTree and MediaPanel
- Added retry button for failed media loads
- Created DEPLOYMENT_VALIDATION.md with validation commands and checklist
- Applied k8s recommendations:
  - Changed node affinity to required for compute nodes (Pi 5)
  - Enabled Tailscale LoadBalancer service for MinIO S3 (reliable Range requests)
  - Enabled cleanup CronJob for staging files
85
PLAN.md
@@ -36,12 +36,14 @@ This plan is written to be executed by multiple subagents (parallelizable workst
## Key Decisions (Locked)

### App identity

- App name: `porthole`
- Set the app name via environment variable: `APP_NAME=porthole`.
- Use `APP_NAME` everywhere (web + worker) via the shared config module so renaming is global.
- If the UI needs to display the name in the browser, also provide `NEXT_PUBLIC_APP_NAME` (either set explicitly or derived at build time from `APP_NAME`).

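The app-identity rules above could be captured in the shared config module roughly as follows. This is a minimal sketch, not the project's actual module: `loadAppIdentity`, `AppIdentity`, and the `Env` type are hypothetical names, and the environment is passed in as a plain object so the helper stays testable.

```typescript
// Hypothetical helper for the shared config module; only APP_NAME and
// NEXT_PUBLIC_APP_NAME come from the plan, every other name is illustrative.
type Env = Record<string, string | undefined>;

interface AppIdentity {
  appName: string;
  publicAppName: string;
}

function loadAppIdentity(env: Env): AppIdentity {
  const appName = env.APP_NAME?.trim();
  if (!appName) {
    throw new Error("APP_NAME must be set (for example APP_NAME=porthole)");
  }
  // Prefer an explicit NEXT_PUBLIC_APP_NAME; otherwise derive it from APP_NAME.
  const publicAppName = env.NEXT_PUBLIC_APP_NAME?.trim() || appName;
  return { appName, publicAppName };
}
```

Both the web app and the worker would call the same helper, so a rename only touches the environment.
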
### Networking

- Tailnet clients access the app via **Tailscale Ingress HTTPS termination**.
- MinIO is reachable **over tailnet** via a dedicated FQDN:
  - `https://minio.<tailnet-fqdn>` (S3 API)
@@ -51,6 +53,7 @@ This plan is written to be executed by multiple subagents (parallelizable workst
- Optional LAN ingress exists using `nip.io` and nginx ingress, but tailnet clients use Tailscale hostnames.

### Storage model

- **MinIO is the source of truth**.
- External archive objects under **`originals/`** are treated as **immutable**:
  - The app **indexes in place**.
@@ -60,20 +63,24 @@ This plan is written to be executed by multiple subagents (parallelizable workst
- Uploads are processed then stored in canonical by default.

### Presigned URL strategy

- Use **path-style presigned URLs** signed against:
  - `MINIO_PUBLIC_ENDPOINT_TS=https://minio.<tailnet-fqdn>`
- Using HTTPS for MinIO on tailnet avoids mixed-content block when the app is served via HTTPS.

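To illustrate what "path-style" means here, the sketch below builds the unsigned object URL that a presigner would then sign. It is an assumption-laden illustration: `pathStyleObjectUrl` is a hypothetical helper, the endpoint value is a placeholder, and the actual signing (query parameters, expiry) would be done by whichever S3 client the project uses.

```typescript
// Hypothetical helper: path-style puts the bucket in the URL path
// (https://host/bucket/key) rather than in the hostname
// (https://bucket.host/key), so a single MinIO FQDN serves every bucket.
function pathStyleObjectUrl(endpoint: string, bucket: string, key: string): string {
  const base = endpoint.replace(/\/+$/, ""); // drop trailing slashes
  // Encode each key segment but keep "/" as the separator.
  const encodedKey = key
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  return `${base}/${bucket}/${encodedKey}`;
}
```

With a single `minio.<tailnet-fqdn>` hostname there is no wildcard DNS or certificate for per-bucket subdomains, which is one reason path-style is the safer default in this setup.
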
### Kubernetes constraints

- Cluster nodes: **2× Raspberry Pi 5 (8GB)** + **1× Raspberry Pi 3 B+ (1GB)**.
- Heavy pods must be pinned to Pi 5 nodes.
- Multi-arch images required (arm64 + amd64), built on a laptop and pushed to an in-cluster **insecure HTTP registry**.

### Metadata extraction

- **Photos**: camera-like EXIF first (`DateTimeOriginal`), then fallbacks.
- **Videos**: camera-like tags first (ExifTool QuickTime/vendor tags), fallback to universal container `creation_time`.

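The "camera-like first, then fallbacks" rule can be sketched as an ordered tag lookup. Only `DateTimeOriginal` and `creation_time` are named in the plan; the other tag names in the two lists, and the `pickCaptureTag` helper itself, are assumptions chosen for illustration.

```typescript
// Raw tags as a flat name -> value map (hypothetical shape).
type RawTags = Record<string, string | undefined>;

interface CaptureGuess {
  value: string;
  source: string; // which tag supplied the value, useful for debugging
}

// Assumed fallback orders; only the first photo tag and the last video tag
// are taken from the plan.
const PHOTO_TAG_ORDER = ["DateTimeOriginal", "CreateDate", "ModifyDate"];
const VIDEO_TAG_ORDER = ["CreationDate", "CreateDate", "creation_time"];

function pickCaptureTag(mediaType: "photo" | "video", tags: RawTags): CaptureGuess | null {
  const order = mediaType === "photo" ? PHOTO_TAG_ORDER : VIDEO_TAG_ORDER;
  for (const name of order) {
    const value = tags[name]?.trim();
    if (value) return { value, source: name };
  }
  return null; // no usable timestamp; the caller decides how to handle it
}
```

Recording `source` alongside the value makes it easier to audit why an asset landed on a particular day.
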
### Derived media

- Image thumbs: `image_256.jpg` and `image_768.jpg`.
- Video posters: only `poster_256.jpg` initially (CPU-friendly).

@@ -82,6 +89,7 @@ This plan is written to be executed by multiple subagents (parallelizable workst
## Architecture

### Components

- **Web**: Next.js (UI + API)
- **Worker**: Node worker using BullMQ
- **Queue**: Redis
@@ -89,6 +97,7 @@ This plan is written to be executed by multiple subagents (parallelizable workst
- **Object store**: MinIO (in-cluster, single-node)

### Data flow

1. Ingestion (upload or scan) creates/updates DB asset records.
2. Worker extracts metadata and generates thumbs/posters.
3. UI queries aggregated timeline nodes and displays a tree.
@@ -146,6 +155,7 @@ Example bucket: `media`.
- `raw_tags_json` (jsonb, optional but recommended for debugging)

Indexes:

- `capture_ts_utc`, `status`, `media_type`

### Table: `imports`
@@ -161,11 +171,13 @@ Indexes:
## Worker Jobs (BullMQ)

### `scan_minio_prefix(importId, bucket, prefix)`

- Guardrails: only allow prefixes from allowlist, starting with `originals/`.
- Lists objects; upserts `assets` by `source_key`.
- Enqueues `process_asset(assetId)`.

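The guardrail on scan prefixes might look like the check below. It is a sketch under assumptions: `isAllowedScanPrefix` and `ALLOWED_PREFIXES` are hypothetical names, and the extra rejection of `..` segments and leading slashes is a defensive addition that the plan does not spell out.

```typescript
// Allowlist from the plan: scans may only target keys under originals/.
const ALLOWED_PREFIXES = ["originals/"];

function isAllowedScanPrefix(prefix: string, allowlist: string[] = ALLOWED_PREFIXES): boolean {
  // Defensive assumption: reject absolute-looking or traversal-like input.
  if (prefix.startsWith("/") || prefix.split("/").includes("..")) return false;
  return allowlist.some((allowed) => prefix.startsWith(allowed));
}
```

Running the same check in both the API route and the worker job would keep a hand-crafted queue message from bypassing it.
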
### `process_asset(assetId)`

- Downloads object (stream or temp file).
- Extracts metadata:
  - Photos: ExifTool EXIF chain.
@@ -177,6 +189,7 @@ Indexes:
- Never throws errors that would crash the worker loop; failures are captured on the asset row.

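One way to honor the "never throws" rule is to wrap the per-asset work and turn any exception into a failed row. The sketch below assumes a simplified in-memory `AssetRow`; the `pending` and `ready` status values and the `errorMessage` field are illustrative, and only `failed` is taken from the plan.

```typescript
type AssetStatus = "pending" | "ready" | "failed";

// Simplified, hypothetical row shape.
interface AssetRow {
  id: string;
  status: AssetStatus;
  errorMessage: string | null;
}

async function processAssetSafely(
  asset: AssetRow,
  work: (asset: AssetRow) => Promise<void>,
): Promise<AssetRow> {
  try {
    await work(asset);
    return { ...asset, status: "ready", errorMessage: null };
  } catch (err) {
    // Capture the failure on the row instead of letting it escape the job.
    const message = err instanceof Error ? err.message : String(err);
    return { ...asset, status: "failed", errorMessage: message };
  }
}
```

In the real worker the returned row would be persisted, so the UI can render a placeholder for the failed asset.
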
### `copy_to_canonical(assetId)`

- Computes canonical key: `canonical/originals/YYYY/MM/DD/{assetId}.{origExt}`.
- Copy-only; never deletes `source_key` for external archive.
- Updates `canonical_key` and flips `active_key`.
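
The canonical key format above can be computed with a small pure function. This sketch assumes the date parts come from the UTC capture timestamp and that a leading dot on the extension is stripped; `canonicalKey` is a hypothetical name, and the behavior for assets without a capture time is left out.

```typescript
// Builds canonical/originals/YYYY/MM/DD/{assetId}.{origExt} from UTC date parts.
function canonicalKey(assetId: string, captureUtc: Date, origExt: string): string {
  const yyyy = String(captureUtc.getUTCFullYear()).padStart(4, "0");
  const mm = String(captureUtc.getUTCMonth() + 1).padStart(2, "0"); // months are 0-based
  const dd = String(captureUtc.getUTCDate()).padStart(2, "0");
  const ext = origExt.replace(/^\./, ""); // accept "jpg" or ".jpg"
  return `canonical/originals/${yyyy}/${mm}/${dd}/${assetId}.${ext}`;
}
```

Using UTC parts keeps the key stable regardless of the timezone the worker pod runs in.
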
@@ -186,12 +199,14 @@ Indexes:
## API (MVP)

### Admin ingestion

- `POST /api/imports` → create import batch
- `POST /api/imports/:id/upload` → upload media to `staging/` and enqueue processing
- `POST /api/imports/:id/scan-minio` → enqueue scan of allowlisted prefix
- `GET /api/imports/:id/status` → progress

### Timeline and browsing

- `GET /api/tree`
  - params: `start`, `end`, `granularity=year|month|day`, filters: `mediaType`
  - returns nodes with counts and sample thumbs
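
The `granularity` parameter of `GET /api/tree` implies bucketing capture timestamps by year, month, or day. The sketch below shows that roll-up in memory; `bucketKey` and `rollUpCounts` are hypothetical names, and a real implementation would more likely do the grouping in SQL against `capture_ts_utc`.

```typescript
type Granularity = "year" | "month" | "day";

// Derives the bucket label from the UTC ISO string (YYYY-MM-DDTHH:mm:ss.sssZ).
function bucketKey(captureUtc: Date, granularity: Granularity): string {
  const iso = captureUtc.toISOString();
  if (granularity === "year") return iso.slice(0, 4);
  if (granularity === "month") return iso.slice(0, 7);
  return iso.slice(0, 10);
}

// Counts assets per bucket; insertion order follows first appearance.
function rollUpCounts(captures: Date[], granularity: Granularity): Map<string, number> {
  const counts = new Map<string, number>();
  for (const ts of captures) {
    const key = bucketKey(ts, granularity);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```
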
@@ -205,10 +220,12 @@ Indexes:
## Frontend UX/UI (MVP)

### Pages

- `/` Timeline tree
- `/admin` Admin tools (upload, scan, import status)

### Timeline tree

- SVG tree rendering with:
  - Vertical/horizontal orientation toggle.
  - Zoom/pan (touch supported).
@@ -219,11 +236,13 @@ Indexes:
- Virtualized thumbnail list.

### Viewer

- Image viewer modal.
- Video playback via HTML5 `<video>` on the presigned URL.
- If a video can’t be played (codec/container): show poster + message.

### Resilience

- Any media with `status=failed` renders as a placeholder tile and does not break aggregation or layout.

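The placeholder rule can be expressed as a pure mapping from an asset to the tile the UI should draw. This is only a sketch: `AssetSummary`, `Tile`, and `tileFor` are hypothetical, and the actual components (the ErrorBoundary, red borders, retry button) are not reproduced here.

```typescript
// Hypothetical subset of the asset fields the media panel needs.
interface AssetSummary {
  id: string;
  status: string;
  thumbUrl: string | null;
  errorMessage: string | null;
}

type Tile =
  | { kind: "thumb"; assetId: string; src: string }
  | { kind: "placeholder"; assetId: string; reason: string };

function tileFor(asset: AssetSummary): Tile {
  // Failed assets, or assets without a thumbnail yet, never reach <img>.
  if (asset.status === "failed" || !asset.thumbUrl) {
    return {
      kind: "placeholder",
      assetId: asset.id,
      reason: asset.errorMessage ?? "Preview unavailable",
    };
  }
  return { kind: "thumb", assetId: asset.id, src: asset.thumbUrl };
}
```

Because every asset maps to exactly one tile, a failed asset still occupies its slot and the grid layout stays stable.
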
---
@@ -231,6 +250,7 @@ Indexes:
## Kubernetes Deployment Plan (Pi-aware)

### Scheduling

- Label nodes:
  - Pi 5 nodes: `node-class=compute`
  - Pi 3 node: `node-class=tiny`
@@ -238,6 +258,7 @@ Indexes:
- `web`, `worker`, `minio`, `postgres`, `redis`

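The commit message mentions switching node affinity to "required" for compute nodes. As a rough illustration, the helper below builds the Kubernetes affinity object for the `node-class` label from the plan; `requiredNodeClassAffinity` is a hypothetical name and the chart's real values layout may differ.

```typescript
interface NodeSelectorRequirement {
  key: string;
  operator: "In" | "NotIn" | "Exists" | "DoesNotExist";
  values?: string[];
}

interface RequiredNodeAffinity {
  nodeAffinity: {
    requiredDuringSchedulingIgnoredDuringExecution: {
      nodeSelectorTerms: { matchExpressions: NodeSelectorRequirement[] }[];
    };
  };
}

// "required" means the scheduler will not place the pod on a non-matching
// node at all, unlike the "preferred" form, which is only a weighting.
function requiredNodeClassAffinity(nodeClass: string): RequiredNodeAffinity {
  return {
    nodeAffinity: {
      requiredDuringSchedulingIgnoredDuringExecution: {
        nodeSelectorTerms: [
          { matchExpressions: [{ key: "node-class", operator: "In", values: [nodeClass] }] },
        ],
      },
    },
  };
}
```

Serializing `requiredNodeClassAffinity("compute")` gives the structure expected under a pod spec's `affinity` field.
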
### Workloads

- `StatefulSet/minio` (single-node) + Longhorn PVC
- `StatefulSet/postgres` + Longhorn PVC
- `Deployment/redis`
@@ -246,6 +267,7 @@ Indexes:
- `CronJob/cleanup-staging` (optional; disabled by default)

### Exposure

- Tailscale Ingress (HTTPS termination):
  - `app.<tailnet-fqdn>` → web service
  - `minio.<tailnet-fqdn>` → MinIO S3 (9000)
@@ -253,6 +275,7 @@ Indexes:
- Optional LAN nginx ingress + MetalLB for `nip.io` hostnames.

### Ingress notes

- For uploads and media streaming, configure timeouts and body size to support “large but not gigantic” media.
- Ensure Range requests work for video playback.

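A Range check for video playback comes down to requesting a byte span and confirming a `206 Partial Content` response with a matching `Content-Range` header. The sketch below validates such a response; `RangeProbe` and `isPartialContentOk` are hypothetical names, and the network call is deliberately left to the caller.

```typescript
// Minimal view of a response to a request sent with "Range: bytes=start-end".
interface RangeProbe {
  status: number;
  contentRange: string | null; // value of the Content-Range header, if any
}

function isPartialContentOk(probe: RangeProbe, start: number, end: number): boolean {
  // A proxy that ignores Range typically answers 200 with the whole body.
  if (probe.status !== 206 || !probe.contentRange) return false;
  // Expected shape: "bytes <start>-<end>/<total>" (total may be "*").
  const match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(probe.contentRange.trim());
  if (!match) return false;
  return Number(match[1]) === start && Number(match[2]) === end;
}
```

A probe could be filled from `fetch(url, { headers: { Range: "bytes=0-1023" } })` using `res.status` and `res.headers.get("content-range")`. This strict check assumes the object is larger than the requested span.
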
@@ -261,10 +284,12 @@ Indexes:
## Build & Release (Multi-arch)

### Package manager

- Use **Bun** for installs and scripts (`bun install`, `bun run ...`).
- Avoid `npm`/`pnpm` in CI and docs unless required for a specific tool.

### Container build

- Build on laptop using Docker Buildx.
- Push `linux/arm64` and `linux/amd64` images to local in-cluster registry over **insecure HTTP**.
- Use Debian-slim Node base images for better ARM64 compatibility with `sharp` + ffmpeg.
@@ -289,19 +314,19 @@ This plan is intended to be executed in parallel by multiple subagents. Each sub
- Keep the table below updated in every PR/merge/phase-end commit that changes scope or completes work.
- Exactly one task should be marked `in_progress` at a time.

-| Task | Status | Notes |
-|---|---|---|
-| 1 — Repository scaffolding | completed | Bun workspace + shared config scaffold |
-| 2 — Database schema + migrations | completed | assets/imports schema + migration runner |
-| 3 — MinIO client + presigned URL strategy | completed | @tline/minio + presigned URL API route |
-| 4 — Worker pipeline (process images/videos) | completed | process_asset + scan_minio_prefix implemented |
-| 5 — Ingestion endpoints (upload + scan) | completed | imports create/upload/scan/status APIs |
-| 6 — Canonical copy logic (uploads default) | completed | copy_to_canonical worker job + enqueue on uploads |
-| 7 — Timeline aggregation API | completed | /api/tree implemented |
-| 8 — Timeline tree frontend | completed | basic SVG tree + orientation toggle |
-| 9 — Media panel + viewer | completed | day selection, asset list, preview + viewer |
-| 10 — k8s deployment (Pi-aware) | completed | Helm chart + Tailscale ingress |
-| 11 — QA + hardening | in_progress | Dockerfiles + MinIO Tailscale services added; pending deploy + end-to-end verification (Range, codec failures) |
+| Task | Status | Notes |
+| ------------------------------------------- | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| 1 — Repository scaffolding | completed | Bun workspace + shared config scaffold |
+| 2 — Database schema + migrations | completed | assets/imports schema + migration runner |
+| 3 — MinIO client + presigned URL strategy | completed | @tline/minio + presigned URL API route |
+| 4 — Worker pipeline (process images/videos) | completed | process_asset + scan_minio_prefix implemented |
+| 5 — Ingestion endpoints (upload + scan) | completed | imports create/upload/scan/status APIs |
+| 6 — Canonical copy logic (uploads default) | completed | copy_to_canonical worker job + enqueue on uploads |
+| 7 — Timeline aggregation API | completed | /api/tree implemented |
+| 8 — Timeline tree frontend | completed | basic SVG tree + orientation toggle |
+| 9 — Media panel + viewer | completed | day selection, asset list, preview + viewer |
+| 10 — k8s deployment (Pi-aware) | completed | Helm chart + Tailscale ingress |
+| 11 — QA + hardening | completed | QA checklist created, error boundaries added, UI resilience improved, deployment validation documented, k8s recommendations applied (required affinity, Tailscale LB service, cleanup CronJob enabled) |

- Entry point: `./.agents/README.md`
- Agent briefs:
@@ -314,32 +339,35 @@ This plan is intended to be executed in parallel by multiple subagents. Each sub

### Subagents and assigned model

-| Subagent | Responsibility | LLM Model |
-|---|---|---|
-| `orchestrator` | backlog coordination, interfaces, acceptance criteria | `github-copilot/gpt-5.2` |
-| `backend-api` | Next.js API routes, DB schema/migrations, presigned URL logic | `github-copilot/claude-sonnet-4.5` |
-| `worker-media` | BullMQ worker, ExifTool/ffprobe/ffmpeg integration, thumbs/posters | `github-copilot/claude-sonnet-4.5` |
-| `frontend-ui` | timeline tree rendering, responsive layout, virtualization, styling | `github-copilot/gpt-5.2` |
-| `k8s-infra` | Helm/Kustomize, node affinity, MinIO/Postgres/Redis manifests, Tailscale ingress | `github-copilot/claude-sonnet-4.5` |
-| `qa-review` | test plan, edge cases, security review, performance checks | `github-copilot/claude-haiku-4.5` |
+| Subagent | Responsibility | LLM Model |
+| -------------- | -------------------------------------------------------------------------------- | ---------------------------------- |
+| `orchestrator` | backlog coordination, interfaces, acceptance criteria | `github-copilot/gpt-5.2` |
+| `backend-api` | Next.js API routes, DB schema/migrations, presigned URL logic | `github-copilot/claude-sonnet-4.5` |
+| `worker-media` | BullMQ worker, ExifTool/ffprobe/ffmpeg integration, thumbs/posters | `github-copilot/claude-sonnet-4.5` |
+| `frontend-ui` | timeline tree rendering, responsive layout, virtualization, styling | `github-copilot/gpt-5.2` |
+| `k8s-infra` | Helm/Kustomize, node affinity, MinIO/Postgres/Redis manifests, Tailscale ingress | `github-copilot/claude-sonnet-4.5` |
+| `qa-review` | test plan, edge cases, security review, performance checks | `github-copilot/claude-haiku-4.5` |

> Note: the model names above are intentionally explicit. If your environment exposes different model IDs, replace them consistently.

### Task breakdown (MVP)

#### Task 1 — Repository scaffolding

- Define folder structure (apps/web, apps/worker, helm/).
- Add shared `config` module (env validation).

Owner: `orchestrator` (brief: `./.agents/orchestrator.md`, model: `github-copilot/gpt-5.2`)

#### Task 2 — Database schema + migrations

- Implement `assets`/`imports` schema.
- Add indexes.

Owner: `backend-api` (brief: `./.agents/backend-api.md`, model: `github-copilot/claude-sonnet-4.5`)

#### Task 3 — MinIO client + presigned URL strategy

- Implement internal client for cluster operations.
- Implement public-signing client for tailnet endpoint.
- Enforce path-style URLs.
@@ -347,6 +375,7 @@ Owner: `backend-api` (brief: `./.agents/backend-api.md`, model: `github-copilot/
Owner: `backend-api` (brief: `./.agents/backend-api.md`, model: `github-copilot/claude-sonnet-4.5`)

#### Task 4 — Worker pipeline (process images/videos)

- ExifTool extraction (photos + camera-like video fields).
- ffprobe technical metadata; fallback `creation_time`.
- `sharp` thumbs for images.
@@ -356,6 +385,7 @@ Owner: `backend-api` (brief: `./.agents/backend-api.md`, model: `github-copilot/
Owner: `worker-media` (brief: `./.agents/worker-media.md`, model: `github-copilot/claude-sonnet-4.5`)

#### Task 5 — Ingestion endpoints (upload + scan)

- Admin upload endpoint: stream to `staging/`.
- Scan endpoint: enqueue `scan_minio_prefix` only for allowlisted prefix `originals/`.
- Import status endpoint.
@@ -363,18 +393,21 @@ Owner: `worker-media` (brief: `./.agents/worker-media.md`, model: `github-copilo
Owner: `backend-api` (brief: `./.agents/backend-api.md`, model: `github-copilot/claude-sonnet-4.5`)

#### Task 6 — Canonical copy logic (uploads default)

- For uploads, copy to canonical date key, flip `active_key`.
- For scans, optional manual/cron copy.

Owner: `worker-media` (brief: `./.agents/worker-media.md`, model: `github-copilot/claude-sonnet-4.5`)

#### Task 7 — Timeline aggregation API

- `GET /api/tree` for year/month/day rolling up counts.
- Select sample thumbs per node.

Owner: `backend-api` (brief: `./.agents/backend-api.md`, model: `github-copilot/claude-sonnet-4.5`)

#### Task 8 — Timeline tree frontend

- Interactive tree with orientation toggle.
- Touch zoom/pan.
- Expand/collapse.
@@ -382,6 +415,7 @@ Owner: `backend-api` (brief: `./.agents/backend-api.md`, model: `github-copilot/
Owner: `frontend-ui` (brief: `./.agents/frontend-ui.md`, model: `github-copilot/gpt-5.2`)

#### Task 9 — Media panel + viewer

- Virtualized thumbnail list.
- Viewer modal for images.
- Video playback with poster fallback.
@@ -390,6 +424,7 @@ Owner: `frontend-ui` (brief: `./.agents/frontend-ui.md`, model: `github-copilot/
Owner: `frontend-ui` (brief: `./.agents/frontend-ui.md`, model: `github-copilot/gpt-5.2`)

#### Task 10 — k8s deployment (Pi-aware)

- Helm chart or Kustomize.
- Node affinity to Pi 5 nodes.
- Longhorn PVCs.
@@ -399,6 +434,7 @@ Owner: `frontend-ui` (brief: `./.agents/frontend-ui.md`, model: `github-copilot/
Owner: `k8s-infra` (brief: `./.agents/k8s-infra.md`, model: `github-copilot/claude-sonnet-4.5`)

#### Task 11 — QA + hardening

- Edge case tests: missing EXIF, odd timezones, unsupported video codecs.
- Validate Range playback through ingress.
- Verify no UI crash on failed assets.
@@ -410,31 +446,38 @@ Owner: `qa-review` (brief: `./.agents/qa-review.md`, model: `github-copilot/clau
## Future Features (Tracked)

### Security / Access

- Authentication and authorization.
- Lightweight admin protection (shared secret header) before full auth.

### Media

- Video transcoding CronJob (H.264 MP4 and/or HLS) and “prefer derived” playback.
- Multiple poster/thumb sizes.
- Better codec support via transcode profiles.

### Organization

- User-defined albums and tags.
- Progressive enhancement for folder upload where supported.
- Bucket separation (`media` vs `derived`) or lifecycle policies.

### Metadata

- Location: GPS extraction + reverse geocoding + map UI.
- Metadata edits/overrides (fix dates, correct capture time), audit log.

### Performance / Scale

- Deduplication by hash.
- Smarter clustering (“moments”) within a day.

### Networking

- Routed LAN for tailnet clients (subnet router) and endpoint selection for presigned URLs.

### Delivery

- Move multi-arch builds from laptop to CI.

---