chore: initialize repository scaffolding

2025-09-23 10:15:09 -07:00
commit fed8b629c7
3 changed files with 712 additions and 0 deletions
@@ -0,0 +1,17 @@
+__pycache__/
+*.py[cod]
+*.egg-info/
+.env
+.venv/
+.env.*
+.pytest_cache/
+.coverage
+htmlcov/
+node_modules/
+.next/
+dist/
+build/
+.DS_Store
+.idea/
+.vscode/
+coverage/
@@ -0,0 +1,692 @@
+# SPEC-1 – Classy Perplexity‑style News Aggregator (Raspberry Pi 5 K8s)
+
+## Background
+
+You want a Perplexity‑style web app that aggregates news from a defined pool of reference websites and presents results in a classy, attractive, highly responsive UI. The target runtime is a Raspberry Pi 5 Kubernetes cluster, so the system must be lightweight, ARM64‑friendly, and resilient to node churn or SD‑card fragility. The product should feel like a modern AI assistant for news discovery: fast search, crisp summaries, clear source attributions, and mobile‑first ergonomics.
+
+Initial working assumptions (to be confirmed):
+
+* Content sources are a curated list of reputable outlets and blogs that permit aggregation with proper linking and snippet‑length quoting.
+* We will index headlines, metadata, and short excerpts; full‑text storage will be minimized or avoided unless licensed.
+* The app will support semantic search + conversational Q\&A over the indexed corpus, with citations to original articles.
+* Real‑time(ish) freshness target: new articles discoverable within 2–5 minutes of publication.
+* UI aims to echo Perplexity’s clean card layout, with source badges, inline citations, and a composer panel for queries.
+* Deployment must fit on 2–4 ARM64 nodes, using lightweight containers and a small replicated datastore.
+
+## Requirements
+
+**Scope for MVP**: Start with **Reuters** as the single source. Use official **RSS/Atom feeds and daily sitemaps** when available; gracefully fall back to HTML scraping for sections without feeds, storing only metadata/snippets with links. Freshness target 2–5 minutes. UI mirrors Perplexity’s card+chat layout with inline citations.
+
+### MoSCoW
+
+**Must‑have**
+
+* Aggregate from Reuters via RSS/Atom + sitemaps; fallback HTML scraper with robots.txt compliance toggle.
+* ARM64‑ready containers deployable on Raspberry Pi 5 K8s (the deployment sections below target k0s; k3s or MicroK8s are alternatives).
+* Ingest pipeline with deduplication, canonical URL normalization, and rate‑limit/backoff.
+* Index headlines, authors, timestamps, topics, short excerpt (<= 320 chars), and source URL.
+* Full‑text search over stored fields; semantic search embeddings over titles+snippets.
+* Summarization and on‑page Q\&A with **clear citations** to source URLs.
+* Classy, responsive UI with Perplexity‑style query composer, results cards, and source badges.
+* Observability: structured logs, basic metrics (ingest latency, queue depth, p95 response time), and alerting.
+* Legal safety rails: configurable snippet length, per‑domain robots policy, and kill‑switch per source.
+
+**Should‑have**
+
+* Topic taxonomy and tags (World, Business, Tech, etc.).
+* Incremental sitemap polling (by date) + change‑list RSS polling with jitter to avoid burst load.
+* Reader mode extraction (readability‑style) used **only for summarization** in memory, not stored.
+* Caching layer (HTTP + summary cache) to reduce load on the Raspberry Pi nodes and avoid repeated API calls.
+* Multi‑node HA for index and queue; rolling updates.
+
+**Could‑have**
+
+* User accounts for saved searches and daily digests.
+* Multi‑source expansion via declarative YAML for new sites.
+* Related‑story clustering and timeline views.
+* Basic mobile PWA installability and offline read‑later for snippets.
+
+**Won’t‑have (MVP)**
+
+* Paywalled content bypassing or full‑text storage of copyrighted articles.
+* Personalized recommendations or email digests.
+* Editorial curation tooling beyond tags and pinning.
+
+## Method
+
+### High‑level architecture
+
+```plantuml
+@startuml
+skinparam componentStyle rectangle
+skinparam shadowing false
+skinparam ArrowColor #888
+skinparam DefaultFontName Inter
+
+rectangle "k0s Cluster (ARM64 Raspberry Pi 5)" as K8S {
+  node "Namespace: news" as NS {
+    component "Ingest Scheduler\n(CronJobs)" as Scheduler
+    component "Feed+Sitemap Poller\n(FastAPI Worker)" as Poller
+    component "HTML Scraper\n(Worker, Trafilatura)" as Scraper
+    component "Normalizer/Dedupe\n(Worker)" as Normalizer
+    component "Embedder\n(Worker -> OpenAI embeddings)" as Embedder
+    component "Summarizer\n(Worker -> OpenAI gpt-4o-mini)" as Summarizer
+
+    database "PostgreSQL + pgvector" as PG
+    component "Redis\n(Cache + Queue)" as Redis
+
+    component "API Gateway\n(FastAPI)" as API
+    component "Web UI\n(Next.js, Tailwind, shadcn)" as Web
+  }
+}
+
+Poller --> Scraper
+Scraper --> Normalizer
+Normalizer --> PG
+Embedder --> PG
+Summarizer --> PG
+
+Scheduler --> Poller
+Embedder --> [OpenAI Embeddings API]
+Summarizer --> [OpenAI Chat Completions]
+
+API --> PG
+API --> Redis
+Web --> API
+@enduml
+```
+
+**Why these choices (MVP):**
+
+* **Source**: Start with **Reuters** using news sitemaps (with pagination parameters) and RSS; where feeds don’t exist, scrape respectfully with robots awareness.
+* **Storage**: **PostgreSQL + pgvector** keeps the stack compact (one DB for metadata, text search, and vectors). Postgres full‑text covers keyword search; pgvector powers semantic search.
+* **Workers**: Python **FastAPI** workers using **Trafilatura** for robust article extraction and metadata parsing. **Redis** as the lightweight queue/cache (Dramatiq or RQ).
+* **Summaries/Q\&A**: On‑demand summaries and answer synthesis via **gpt‑4o‑mini** with **inline citations**. Embeddings via **text‑embedding‑3‑small**. Both accessed through API keys/secrets in Kubernetes.
+* **UI**: **Next.js 14 App Router**, Tailwind + shadcn for a Perplexity‑style, low‑latency interface.
+* **k0s**: ARM64‑friendly. Use **nginx‑ingress** for HTTP routing, with optional **HAProxy Ingress** for TCP/advanced policies.
+
+### Data model (PostgreSQL)
+
+```sql
+-- Sources (static for MVP)
+CREATE TABLE sources (
+  id SERIAL PRIMARY KEY,
+  name TEXT NOT NULL UNIQUE,        -- e.g., 'Reuters'
+  base_url TEXT NOT NULL,           -- e.g., https://www.reuters.com
+  rss_urls TEXT[] NOT NULL DEFAULT '{}',
+  sitemap_urls TEXT[] NOT NULL DEFAULT '{}',
+  robots_txt TEXT,
+  enabled BOOLEAN NOT NULL DEFAULT true
+);
+
+-- Raw fetch jobs (observability + retries)
+CREATE TABLE fetch_jobs (
+  id BIGSERIAL PRIMARY KEY,
+  source_id INT REFERENCES sources(id),
+  url TEXT NOT NULL,
+  kind TEXT NOT NULL CHECK (kind IN ('rss','sitemap','article')),
+  status TEXT NOT NULL CHECK (status IN ('queued','fetched','parsed','failed')),
+  http_status INT,
+  etag TEXT,
+  last_modified TIMESTAMPTZ,
+  attempts INT NOT NULL DEFAULT 0,
+  error TEXT,
+  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
+  updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
+);
+CREATE INDEX ON fetch_jobs (status, created_at);
+
+-- Canonical articles (no copyrighted full text stored)
+CREATE TABLE articles (
+  id BIGSERIAL PRIMARY KEY,
+  source_id INT REFERENCES sources(id) NOT NULL,
+  canonical_url TEXT NOT NULL,
+  url_hash BYTEA NOT NULL,          -- SHA-256 of canonical_url
+  title TEXT NOT NULL,
+  author TEXT,
+  category TEXT,                    -- World, Business, Tech, etc.
+  published_at TIMESTAMPTZ,
+  fetched_at TIMESTAMPTZ NOT NULL DEFAULT now(),
+  snippet TEXT,                     -- <= 320 chars, from feed/lede
+  summary TEXT,                     -- model-generated abstract
+  image_url TEXT,
+  language TEXT DEFAULT 'en',
+  UNIQUE (source_id, url_hash)
+);
+CREATE INDEX ON articles (published_at DESC);
+CREATE INDEX ON articles USING GIN (to_tsvector('english', coalesce(title,'') || ' ' || coalesce(snippet,'')));
+
+-- Embeddings for semantic search (title+snippet)
+CREATE EXTENSION IF NOT EXISTS vector;
+CREATE TABLE article_embeddings (
+  article_id BIGINT PRIMARY KEY REFERENCES articles(id) ON DELETE CASCADE,
+  embedding vector(1536) -- dimension for text-embedding-3-small
+);
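+-- ivfflat clusters existing rows, so build (or rebuild) this index after an initial data load
+CREATE INDEX ON article_embeddings USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);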
+
+-- Tags and mapping (optional but handy)
+CREATE TABLE tags (
+  id SERIAL PRIMARY KEY,
+  name TEXT UNIQUE
+);
+CREATE TABLE article_tags (
+  article_id BIGINT REFERENCES articles(id) ON DELETE CASCADE,
+  tag_id INT REFERENCES tags(id) ON DELETE CASCADE,
+  PRIMARY KEY (article_id, tag_id)
+);
+```
+
+### Ingestion flow
+
+1. **Discovery**
+
+* Poll **RSS/Atom** endpoints with ETag/Last‑Modified to minimize bandwidth.
+* Poll **news sitemaps** using incremental parameters (e.g., `from=` offsets when supported). Maintain per‑endpoint cursors.
+* For sections without feeds, enqueue **HTML pages** discovered from site index pages (rate‑limited) and respect `robots.txt` (configurable).
+
+2. **Fetch & Extract**
+
+* HTTP client with retry + exponential backoff and per‑host concurrency caps (e.g., 2–4). Respect `Cache-Control` where present.
+* Use **Trafilatura** with `favor_precision=True` to extract main content for **in‑memory summarization only**; do not persist full text.
+* Generate a **canonical URL** (resolve redirects, strip tracking params) and compute `url_hash`.
+
+3. **Normalize & Deduplicate**
+
+* If `(source_id, url_hash)` exists, skip insert; else create `articles` row with metadata and **snippet** (<=320 chars).
+* Classify category using rule‑based hints (URL path, RSS category) with a fallback lightweight classifier.
+
+4. **Summaries & Embeddings**
+
+* Create a short **summary** (60–90 words, neutral tone) with inline citation marker `[1]` → canonical URL.
+* Compute **embedding** on `(title + "\n" + snippet)` and upsert into `article_embeddings`.
+
+5. **Indexing & Cache**
+
+* Postgres GIN index supports keyword search; pgvector handles ANN semantic search.
+* Cache hot queries and summaries in Redis for 5–15 minutes.
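+
The canonicalization, hashing, and snippet steps above could be sketched roughly as follows. This is a minimal illustration using only the standard library: the tracking-parameter list and the function names `canonicalize_url`, `url_hash`, and `make_snippet` are assumptions, and redirect resolution (which needs a network call) is left out.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking parameters to drop; extend per source as needed.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}


def canonicalize_url(url: str) -> str:
    """Lowercase scheme/host, drop fragment and tracking params, sort the rest."""
    parts = urlsplit(url.strip())
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in TRACKING_KEYS and not k.startswith(TRACKING_PREFIXES)
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(sorted(query)), "")
    )


def url_hash(canonical_url: str) -> bytes:
    """SHA-256 digest, suitable for the BYTEA `url_hash` column."""
    return hashlib.sha256(canonical_url.encode("utf-8")).digest()


def make_snippet(text: str, max_chars: int = 320) -> str:
    """Collapse whitespace and cut at a word boundary within max_chars."""
    text = " ".join(text.split())
    if len(text) <= max_chars:
        return text
    cut = text[: max_chars - 1].rsplit(" ", 1)[0]
    return cut + "…"
```

Sorting the remaining query parameters means two URLs that differ only in parameter order hash to the same value, which is what the `(source_id, url_hash)` uniqueness check relies on.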
+
+### API design (FastAPI)
+
+* `GET /v1/search?q=&mode=hybrid&page=` — Hybrid search (keyword + vector rerank), returns cards with title, snippet, badges, and citations.
+* `GET /v1/articles/{id}` — Metadata + summary.
+* `POST /v1/ask` — Conversational answer over top‑k retrieved articles, always with citations.
+* `POST /v1/feedback` — Thumbs up/down and optional comment.
+
+### UI flows (Next.js 14)
+
+* **Home**: Center composer, query suggestions, trending topics.
+* **Results**: Perplexity‑style answer at top with source chips; below, cards for each cited article; sticky composer for follow‑ups.
+* **Interactions**: Cmd/Ctrl‑K global search, `?` keyboard help, skeleton loaders, optimistic UI.
+
+### Kubernetes (k0s) deployment sketch
+
+* **Namespaces**: `news`, `news-observe`.
+* **Ingress**: `nginx-ingress` for HTTPS; optional parallel **HAProxy Ingress** for TCP/advanced use. Certs via cert‑manager + DNS‑01 or HTTP‑01.
+* **Deployments** (ARM64 images):
+
+  * `api` (FastAPI on Uvicorn, optionally managed by Gunicorn): 2 replicas, HPA on CPU 60%; scaling on the p95 latency SLI additionally requires a custom‑metrics adapter.
+  * `web` (Next.js): 2 replicas served by the Node.js server; static export is optional.
+  * `worker` (ingest/summarize/embed): 2–4 replicas, separate queues for `poll`, `scrape`, `summ`, `embed`.
+  * `postgres` (Bitnami ARM64) with persistent volume; enable `pgvector` extension.
+  * `redis` (Bitnami ARM64) for cache/queue.
+* **RBAC/Secrets**: Kubernetes Secrets for API keys; service accounts per deployment.
+* **Resources (starting)**: api 200m/512Mi; web 100m/256Mi; worker 300m/1Gi; redis 50m/256Mi; postgres 250m/2Gi.
+* **Autoscaling**: HPA + VPA recommendations; cluster metrics via metrics‑server.
+
+### Ranking & answer synthesis
+
+* **Hybrid search**: Postgres full‑text ranking (`ts_rank`, used as a BM25 stand‑in since core Postgres has no built‑in BM25) for recall → take top 50; compute cosine similarity on vectors → rerank → top 8.
+* **Answer**: Prompt model with the top 6 snippets + titles and URLs; enforce **citation after each sentence** where evidence exists. Refuse to answer beyond source material.
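+
The rerank stage described above can be illustrated in plain Python. This is only a sketch: in the deployed system the similarity would be computed by pgvector inside Postgres, and the names `cosine` and `rerank` as well as the `(article_id, embedding)` candidate shape are assumptions.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rerank(query_vec, candidates, k=8):
    """Rerank keyword-recall candidates by similarity to the query embedding.

    candidates: (article_id, embedding) pairs, e.g. the top 50 keyword hits.
    Returns the k best (article_id, similarity) pairs, most similar first.
    """
    scored = [(article_id, cosine(query_vec, emb)) for article_id, emb in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```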
+
+### Rate limiting & ethics
+
+* Per‑source QPS caps (e.g., 0.5–1 rps) and adaptive backoff.
+* Honor robots.txt by default; switchable per your policy. Always link prominently to original.
+* Snippets limited; no storage of full article text.
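+
A rough sketch of the robots check and backoff policy listed above, using the standard library's `urllib.robotparser`. The user agent string, the function names, and the choice of full-jitter backoff are assumptions made for illustration, not settled decisions in this spec.

```python
import random
from urllib.robotparser import RobotFileParser

USER_AGENT = "news-agg-bot"  # hypothetical user agent string


def allowed_by_robots(robots_txt: str, url: str, honor: bool = True) -> bool:
    """Return True if fetching is permitted; honor=False models the per-source toggle."""
    if not honor:
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, url)


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def min_interval(rps: float) -> float:
    """Seconds to wait between requests for a given per-source rate cap."""
    return 1.0 / rps
```

With the 0.5–1 rps caps mentioned above, `min_interval` gives a 1–2 second gap between requests to the same source.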
+
+## Implementation
+
+### 0) Repo layout
+
+```
+news-agg/
+  apps/
+    api/            # FastAPI (Python 3.11)
+    web/            # Next.js 14 UI
+    workers/        # poll/scrape/summarize/embed (FastAPI tasks + RQ/Dramatiq)
+  deploy/
+    base/           # K8s Kustomize base (namespaces, RBAC, NetworkPolicies)
+    overlays/
+      pi-prod/
+        kustomization.yaml
+        postgres.yaml
+        redis.yaml
+        api.yaml
+        web.yaml
+        workers.yaml
+        cron-poller.yaml
+        ingress-nginx.yaml
+        ingress-haproxy.yaml (optional)
+        secrets.example.yaml
+  ops/
+    helm-values/
+      bitnami-postgresql.yaml
+      bitnami-redis.yaml
+  scripts/
+    build.sh        # multi-arch docker buildx
+    db_migrate.sql  # tables + pgvector
+```
+
+### 1) Container images (ARM64)
+
+* **Python base**: `python:3.11-slim` + `uv`/`pip-tools`; compile wheels at build time.
+* **Node**: `node:18-alpine` → `next build` then run with `node` or export static.
+* Use **`docker buildx`** to produce `linux/arm64` images. Example:
+
+```
+docker buildx build --platform linux/arm64 -t registry/pi/news-api:0.1 -f apps/api/Dockerfile --push .
+```
+
+**apps/api/Dockerfile** (snippet)
+
+```Dockerfile
+FROM python:3.11-slim
+RUN apt-get update && apt-get install -y build-essential libpq-dev && rm -rf /var/lib/apt/lists/*
+WORKDIR /app
+COPY apps/api/pyproject.toml apps/api/uv.lock ./
+RUN pip install -U pip && pip install uv
+# Install the dependencies declared in pyproject.toml; let the build fail on errors
+RUN uv pip install --system -r pyproject.toml
+COPY apps/api/ .
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
+```
+
+### 2) k0s cluster prep (once)
+
+* Install **nginx‑ingress** and (optionally) **HAProxy Ingress** via manifests/Helm.
+* Install **cert-manager** for TLS if exposing publicly.
+* Add **metrics‑server** for HPA and **KEDA** (optional) for queue-based scaling.
+
+### 3) Datastores
+
+**PostgreSQL (Bitnami, pgvector)**
+
+```yaml
+# deploy/overlays/pi-prod/postgres.yaml
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata: { name: pgdata, namespace: news }
+spec:
+  accessModes: ["ReadWriteOnce"]
+  resources: { requests: { storage: 20Gi } }
+---
+apiVersion: v1
+kind: ConfigMap
+metadata: { name: pg-init, namespace: news }
+data:
+  00-init.sql: |
+    CREATE EXTENSION IF NOT EXISTS vector;
+    -- migrations applied by apps on startup too
+---
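+# NOTE: HelmChart (helm.cattle.io/v1) is the k3s helm-controller CRD; k0s does not ship it by default,
+# so install that controller or use k0s's own Helm extension mechanism instead.
+apiVersion: helm.cattle.io/v1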
+kind: HelmChart
+metadata: { name: pg, namespace: kube-system }
+spec:
+  chart: oci://registry-1.docker.io/bitnamicharts/postgresql
+  targetNamespace: news
+  version: 15.x.x
+  valuesContent: |
+    image:
+      repository: bitnami/postgresql
+      tag: 15-debian-12
+    primary:
+      extraVolumes:
+        - name: pg-init
+          configMap: { name: pg-init }
+      extraVolumeMounts:
+        - name: pg-init
+          mountPath: /docker-entrypoint-initdb.d
+      persistence:
+        existingClaim: pgdata
+    auth:
+      username: news
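+      # Placeholder: plain YAML is not templated, so substitute this at deploy time
+      # or point the chart at a Secret via auth.existingSecret.
+      password: ${PG_PASSWORD}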
+      database: news
+```
+
+**Redis (Bitnami)**
+
+```yaml
+# deploy/overlays/pi-prod/redis.yaml
+apiVersion: helm.cattle.io/v1
+kind: HelmChart
+metadata: { name: redis, namespace: kube-system }
+spec:
+  chart: oci://registry-1.docker.io/bitnamicharts/redis
+  targetNamespace: news
+  version: 18.x.x
+  valuesContent: |
+    architecture: standalone
+    auth:
+      enabled: false
+```
+
+### 4) Secrets & Config
+
+```yaml
+# deploy/overlays/pi-prod/secrets.example.yaml (copy to secrets.yaml and fill)
+apiVersion: v1
+kind: Secret
+metadata: { name: app-secrets, namespace: news }
+type: Opaque
+data:
+  OPENAI_API_KEY: <base64>
+  APP_SIGNING_KEY: <base64>
+  PG_PASSWORD: <base64>   # referenced as $(PG_PASSWORD) in DATABASE_URL
+---
+apiVersion: v1
+kind: ConfigMap
+metadata: { name: app-config, namespace: news }
+data:
+  SNIPPET_MAX: "320"
+  SOURCES: |
+    - name: Reuters
+      base_url: https://www.reuters.com
+      rss:
+        - https://www.reuters.com/rss/worldNews
+      sitemaps:
+        - https://www.reuters.com/sitemap_news.xml
+      robots_policy: honor
+  RANKING: "hybrid"
+```
+
+### 5) Workers (poll, scrape, summarize, embed)
+
+```yaml
+# deploy/overlays/pi-prod/workers.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata: { name: workers, namespace: news }
+spec:
+  replicas: 3
+  selector: { matchLabels: { app: workers } }
+  template:
+    metadata: { labels: { app: workers } }
+    spec:
+      containers:
+        - name: workers
+          image: registry/pi/news-workers:0.1
+          envFrom:
+            - secretRef: { name: app-secrets }
+            - configMapRef: { name: app-config }
+          env:
+            - { name: REDIS_URL, value: redis://redis-master.news.svc.cluster.local:6379/0 }
+            - { name: DATABASE_URL, value: postgresql://news:$(PG_PASSWORD)@pg-postgresql.news.svc.cluster.local:5432/news }
+          resources:
+            requests: { cpu: "300m", memory: "1Gi" }
+            limits:   { cpu: "900m", memory: "2Gi" }
+          livenessProbe: { httpGet: { path: /healthz, port: 8080 }, initialDelaySeconds: 15 }
+          readinessProbe:{ httpGet: { path: /readyz,  port: 8080 }, initialDelaySeconds: 5 }
+```
+
+**Cron: feed/sitemap polling**
+
+```yaml
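+# deploy/overlays/pi-prod/cron-poller.yaml
+apiVersion: batch/v1
+kind: CronJob
+metadata: { name: poller, namespace: news }
+spec:
+  schedule: "*/2 * * * *"  # every 2 minutes
+  concurrencyPolicy: Forbid  # skip a run if the previous poll is still in progress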
+  jobTemplate:
+    spec:
+      template:
+        spec:
+          restartPolicy: OnFailure
+          containers:
+            - name: poll
+              image: registry/pi/news-workers:0.1
+              args: ["poll"]
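+              envFrom:
+                - secretRef: { name: app-secrets }
+                - configMapRef: { name: app-config }
+              env:
+                - { name: REDIS_URL, value: redis://redis-master.news.svc.cluster.local:6379/0 }
+                - { name: DATABASE_URL, value: postgresql://news:$(PG_PASSWORD)@pg-postgresql.news.svc.cluster.local:5432/news }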
+```
+
+### 6) API service (FastAPI)
+
+```yaml
+# deploy/overlays/pi-prod/api.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata: { name: api, namespace: news }
+spec:
+  replicas: 2
+  selector: { matchLabels: { app: api } }
+  template:
+    metadata: { labels: { app: api } }
+    spec:
+      containers:
+        - name: api
+          image: registry/pi/news-api:0.1
+          ports: [{ containerPort: 8080 }]
+          envFrom:
+            - secretRef: { name: app-secrets }
+            - configMapRef: { name: app-config }
+          env:
+            - { name: REDIS_URL, value: redis://redis-master.news.svc.cluster.local:6379/0 }
+            - { name: DATABASE_URL, value: postgresql://news:$(PG_PASSWORD)@pg-postgresql.news.svc.cluster.local:5432/news }
+          resources:
+            requests: { cpu: "200m", memory: "512Mi" }
+            limits:   { cpu: "600m", memory: "1Gi" }
+---
+apiVersion: v1
+kind: Service
+metadata: { name: api, namespace: news }
+spec:
+  selector: { app: api }
+  ports:
+    - name: http
+      port: 80
+      targetPort: 8080
+```
+
+**FastAPI search (sketch)**
+
+```python
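+# apps/api/search.py
+import numpy as np
+import psycopg
+from openai import OpenAI
+from pgvector.psycopg import register_vector
+
+EMBED_MODEL = "text-embedding-3-small"  # 1536 dimensions, matches vector(1536)
+# Same expression as the GIN index on articles, so the planner can use it
+TS_DOC = "to_tsvector('english', coalesce(title,'') || ' ' || coalesce(snippet,''))"
+
+def embed(text: str) -> np.ndarray:
+    """Embed text via the OpenAI embeddings API (reads OPENAI_API_KEY from the environment)."""
+    resp = OpenAI().embeddings.create(model=EMBED_MODEL, input=text)
+    return np.asarray(resp.data[0].embedding, dtype=np.float32)
+
+def hybrid_search(conn: psycopg.Connection, q: str, k: int = 8):
+    register_vector(conn)  # lets psycopg adapt numpy arrays to the vector type
+    with conn.cursor() as cur:
+        # 1) Keyword recall
+        cur.execute(f"""
+          SELECT id, ts_rank({TS_DOC}, plainto_tsquery('english', %s)) AS rank
+          FROM articles
+          WHERE {TS_DOC} @@ plainto_tsquery('english', %s)
+          ORDER BY rank DESC
+          LIMIT 50
+        """, (q, q))
+        ids = [r[0] for r in cur.fetchall()]
+        if not ids:
+            return []  # nothing matched; skip the embedding call
+        # 2) Embed the query, then rerank the recalled candidates by cosine similarity
+        v = embed(q)
+        cur.execute("""
+          SELECT a.id, a.title, a.snippet, a.canonical_url,
+                 1 - (e.embedding <=> %s::vector) AS sim
+          FROM articles a
+          JOIN article_embeddings e ON e.article_id = a.id
+          WHERE a.id = ANY(%s)
+          ORDER BY sim DESC LIMIT %s
+        """, (v, ids, k))
+        return cur.fetchall()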
+```
+
+### 7) Web UI (Next.js 14)
+
+* App Router, Tailwind, shadcn/ui. Server actions call API.
+* Components: `Composer`, `AnswerBox` (with sentence-level citations), `ResultCard`, `SourceChip`.
+* Add **PWA** manifest + basic offline cache for shell.
+
+### 8) Ingress (nginx primary, HAProxy optional)
+
+```yaml
+# deploy/overlays/pi-prod/ingress-nginx.yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: news
+  namespace: news
+  annotations:
+    nginx.ingress.kubernetes.io/proxy-body-size: "1m"
+spec:
+  ingressClassName: nginx  # replaces the deprecated kubernetes.io/ingress.class annotation
+  tls:
+    - hosts: [news.local]
+      secretName: news-tls
+  rules:
+    - host: news.local
+      http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend: { service: { name: web, port: { number: 80 } } }
+          - path: /v1
+            pathType: Prefix
+            backend: { service: { name: api, port: { number: 80 } } }
+```
+
+### 9) Observability
+
+* **Logging**: JSON logs via `structlog` (API/workers), written to `stdout` and captured by the container runtime (viewable with `kubectl logs`).
+* **Metrics**: Prometheus scraping (use `prometheus-fastapi-instrumentator`), Grafana dashboards.
+* **Tracing**: OpenTelemetry SDK exporting to Tempo/OTLP (optional).
+* SLOs: p95 search < 600ms (warm); ingest freshness p95 < 5 min.
+
+### 10) CI/CD (GitHub Actions)
+
+* Build multi-arch images with `setup-buildx-action`, push to your registry.
+* Deploy via `kubectl` or ArgoCD (optional). Gate with manual approval.
+
+### 11) Prompts & safety rails
+
+* **Summary prompt**: 60–90 words, neutral tone, forbid speculation, 1–2 citations with URLs.
+* **Answer prompt**: Use only retrieved snippets; every sentence that makes a claim must cite `[n]`. If insufficient evidence, say so.
+* **Guardrails**: Max 6 articles per answer; truncate inputs to token budget.
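+
One way the answer prompt and guardrails above might be assembled is sketched below. The `Source` dataclass, the `build_answer_prompt` name, the prompt wording, and the use of a character limit as a stand-in for a token budget are all assumptions for illustration.

```python
from dataclasses import dataclass

MAX_ARTICLES = 6  # guardrail: at most 6 articles per answer


@dataclass
class Source:
    title: str
    snippet: str
    url: str


def build_answer_prompt(question: str, sources: list[Source], max_chars: int = 4000) -> str:
    """Number each source so the model can cite it as [n]; trim to a rough budget."""
    lines = [
        "Answer using only the numbered sources below.",
        "Cite a source as [n] after every sentence that makes a claim.",
        "If the sources are insufficient, say so.",
        "",
    ]
    for n, src in enumerate(sources[:MAX_ARTICLES], start=1):
        lines.append(f"[{n}] {src.title} ({src.url})\n{src.snippet}")
    # Character count is a crude proxy for tokens (assumption); swap in a tokenizer if needed.
    context = "\n".join(lines)[:max_chars]
    return f"{context}\n\nQuestion: {question}"
```

Numbering the sources in the prompt keeps the `[n]` markers in the answer mappable back to canonical URLs when the UI renders source chips.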
+
+### 12) Performance knobs (Raspberry Pi friendly)
+
+* Enable HTTP caching (ETag/If‑Modified‑Since).
+* Redis cache TTL 10m for hot queries.
+* Per‑host concurrency: 2 (scraper); global QPS: 0.5–1 for Reuters.
+* Use gzip/deflate when fetching; strip images when scraping.
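+
The HTTP caching knob above amounts to sending the stored validators back on each poll. A minimal sketch, assuming the `etag` and `last_modified` values come from the `fetch_jobs` table and that `last_modified` is a timezone-aware datetime; the function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Optional


def conditional_headers(etag: Optional[str], last_modified: Optional[datetime]) -> dict:
    """Build request headers for a conditional GET from stored validators."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        # HTTP dates must be expressed in GMT
        headers["If-Modified-Since"] = format_datetime(
            last_modified.astimezone(timezone.utc), usegmt=True
        )
    return headers


def should_parse(status: int) -> bool:
    """A 304 Not Modified means the previous copy is current, so parsing is skipped."""
    return status == 200
```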
+
+### 13) Data retention
+
+* Keep `articles` 30 days rolling (configurable). Older rows archived to `articles_archive` without embeddings.
+
+### 14) Security
+
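+* NetworkPolicies: only API/worker → DB/Redis; web → API; deny egress by default except to the OpenAI API and, for workers, the configured source sites (e.g., `www.reuters.com`). Standard NetworkPolicies match IPs/CIDRs rather than domain names, so domain‑based rules need a CNI or egress proxy that supports them.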
+* Secrets from Kubernetes; rotate quarterly. Read‑only service accounts for web.
+* TLS everywhere; CSP headers on web.
+
+## Milestones
+
+**MVP timeline: 2 weeks (LAN only, no TLS)**
+
+### Week 1 — Foundations & ingest
+
+* **Day 1–2**: Cluster prep (k0s), namespaces, nginx Ingress (HTTP only), metrics‑server. Registry access + buildx pipeline.
+* **Day 3**: Postgres (pgvector) + Redis live; migrations applied.
+* **Day 4**: Workers scaffolded (poll, scrape) with Reuters RSS + sitemap pollers; ETag/Last‑Modified implemented; robots policy set to *honor*.
+* **Day 5**: Normalizer/dedupe; article schema writes; minimal admin page to view ingest logs.
+
+**Exit criteria**: Reuters articles flowing into DB with title/snippet/category/published\_at; p95 freshness under 10 min.
+
+### Week 2 — Search, summaries, UI polish
+
+* **Day 6**: Embeddings worker + index (pgvector ivfflat). Hybrid search in API.
+* **Day 7**: Summarizer worker; store 60–90 word summaries; cache.
+* **Day 8**: Next.js UI (composer, answer box, cards, source chips). Basic keyboard nav.
+* **Day 9**: Observability: Prometheus scrape + Grafana dashboard; SLOs wired.
+* **Day 10**: Hardening (quotas, retries), data retention job; smoke tests; cut **MVP v0.1.0**.
+
+**Exit criteria**: Query returns an answer with citations in < 800ms warm path; summaries stable; LAN users can search and read cited sources.
+
+## Gathering Results
+
+### KPIs (Primary)
+
+* **Freshness (p95)**: time from article publication → available in search. Target: ≤ 5 minutes; stretch ≤ 2 minutes.
+* **Answer Accuracy**: % of answer sentences that have at least one valid citation to the retrieved set. Target: ≥ 95%.
+
+### KPIs (Secondary)
+
+* **Coverage**: % of Reuters articles discovered vs. listed in sitemaps over last 24h. Target: ≥ 98%.
+* **Latency (p95)**: query → first contentful paint (UI) and API response time. Targets: API ≤ 600ms warm; UI FCP ≤ 1.5s on LAN.
+* **Stability**: worker error rate < 1%; scraper retry rate < 10%.
+
+### Instrumentation
+
+* **Prometheus metrics**
+
+  * `ingest_freshness_seconds{source=…}` (histogram)
+  * `ingest_discovered_total{kind=rss|sitemap|scrape}`
+  * `scrape_http_status_total{code=…}`
+  * `search_latency_seconds` (histogram)
+  * `answer_citation_coverage_ratio` (gauge)
+  * `worker_queue_depth{queue=…}`
+* **Structured logs** (JSON): include `trace_id`, `job_id`, and normalized URL.
+* **Dashboards (Grafana)**: Freshness, Search Latency, Coverage vs Sitemap, Error budget burn.
+
+### Accuracy evaluation
+
+* **Automatic**:
+
+  * Parse answer into sentences; verify each sentence has at least one citation.
+  * Check that citation URLs match the top‑k retrieved set and that snippets contain supporting tokens (simple ROUGE‑like overlap).
+  * Flag low‑evidence sentences for review.
+* **Human review** (1–2×/week):
+
+  * 50 sampled answers; label: correct / partially supported / unsupported / off‑topic.
+  * Compute **hallucination rate** (unsupported sentences ÷ total) and track trend.
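+
The automatic citation check above could start from something like the following sketch. It assumes citation markers appear before each sentence's closing punctuation (as in `... rose [1].`) and uses a deliberately naive sentence splitter; the name `citation_coverage` is hypothetical, and the token-overlap check is not included.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")
# Naive splitter (assumption): break after ., ! or ? when a capital letter follows.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def citation_coverage(answer: str, num_sources: int) -> float:
    """Share of sentences carrying at least one [n] with 1 <= n <= num_sources."""
    sentences = [s for s in SENTENCE_END.split(answer.strip()) if s]
    if not sentences:
        return 0.0
    cited = 0
    for sentence in sentences:
        ids = [int(n) for n in CITATION.findall(sentence)]
        if any(1 <= n <= num_sources for n in ids):
            cited += 1
    return cited / len(sentences)
```

A value like this could feed the `answer_citation_coverage_ratio` gauge; sentences with out-of-range markers count as uncited, which surfaces citations to sources that were never retrieved.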
+
+### Feedback loop
+
+* UI **thumbs up/down** with optional comment saved to `feedback` table:
+
+```sql
+CREATE TABLE feedback (
+  id BIGSERIAL PRIMARY KEY,
+  query TEXT NOT NULL,
+  answer_id TEXT,
+  verdict TEXT CHECK (verdict IN ('up','down')),
+  comment TEXT,
+  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
+);
+```
+
+* Downvotes auto‑create a Jira/GitHub issue if `answer_citation_coverage_ratio < 0.9`.
+
+### Experimentation
+
+* **Prompt variants** A/B via header flag in API (e.g., `x-prompt=v2`).
+* **Ranking tweaks**: adjust keyword (full‑text rank) weight vs. vector weight; record NDCG\@10 on labeled queries.
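+
For the NDCG@10 metric mentioned above, a small reference computation may help when comparing ranking variants. This sketch uses the linear-gain form of DCG and, as a simplifying assumption, derives the ideal ordering from the labels of the returned results only, not from every labeled document for the query.

```python
import math


def dcg(relevances, k=10):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg(relevances, k=10):
    """DCG normalized by the DCG of the same labels in ideal (descending) order."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

Here `relevances` is the list of human labels (for example 0 to 3) for the results in the order the system returned them.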
+
+### Post‑mortems & safety
+
+* Blameless post‑mortem for any incident where hallucination rate > 10% in a day or freshness p95 > 10 min for >1h.
+* Daily data retention job verified; no full‑text persists beyond in‑memory summary context.
+
@@ -0,0 +1,3 @@
+# Classy Perplexity-style News Aggregator
+
+This repository houses the scaffolding for a Perplexity-inspired Reuters news aggregator designed for Raspberry Pi 5 clusters. See `INSTRUCTIONS.md` for the full specification and implementation guidelines.