# How This Site Was Made

## Purpose
This site is a focused retrieval-augmented generation (RAG) assistant built to answer questions about my leadership approach, engineering execution, and real-world scenarios. Its scope is intentionally narrow, which reduces the risk of hallucination and keeps answers grounded in my actual experience.

## Principles
- Truth over fluency: if the source material does not support an answer, the assistant should say so.
- Small surface area: fewer moving parts, fewer ways to fail.
- Fast learning loops: short deploy cycles with clear feedback.
- Operational realism: design for a small team and real uptime constraints.

## Architecture (high level)
- Frontend: Next.js (Amplify SSR)
- API: API Gateway for /api/token, CloudFront + Lambda URL for /api/chat streaming
- Retrieval: content and embeddings stored in S3
- Models: OpenAI or Bedrock, selected by environment variables
- Security: Turnstile, JWTs, WAF, strict CORS
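
As an illustration of the "selected by environment variables" point, here is a minimal sketch of provider selection. The variable name `MODEL_PROVIDER`, the default of `openai`, and the function name are assumptions made for this example, not the site's actual configuration.

```python
from typing import Mapping

# Providers named in the architecture list above.
SUPPORTED_PROVIDERS = ("openai", "bedrock")


def select_provider(env: Mapping[str, str]) -> str:
    """Return the model provider named in the environment.

    MODEL_PROVIDER is a hypothetical variable name, and defaulting to
    "openai" is an assumption made for this sketch.
    """
    provider = env.get("MODEL_PROVIDER", "openai").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        # Fail loudly at startup rather than silently picking a provider.
        raise ValueError(f"unsupported MODEL_PROVIDER: {provider!r}")
    return provider
```

In a Lambda handler this would typically be called once at import time with `os.environ`, so a misconfigured deployment fails before it serves traffic.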

## Data flow
1) User opens the site and asks a question.
2) The UI calls /api/token for a short-lived JWT (a Turnstile challenge is optional, depending on the WAF signal).
3) /api/chat enforces scope, validates the JWT, retrieves sources, and queries the model.
4) The response is streamed to the UI as Server-Sent Events (SSE).
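
Steps 2 and 3 can be sketched with a standard-library HS256 JWT. This is only a sketch under stated assumptions: the signing algorithm, the `scope` claim, the five-minute lifetime, and the function names `mint_token` and `verify_token` are all hypothetical, and a production service would more likely use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def mint_token(secret: bytes, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """What /api/token might return: a signed token that expires quickly."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"iat": issued, "exp": issued + ttl_seconds, "scope": "chat"}  # claim names assumed
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(secret, signing_input))}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """What /api/chat might do before retrieval: check signature, then expiry."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(secret, f"{header_b64}.{claims_b64}")
    # compare_digest avoids leaking timing information about the signature.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Keeping the lifetime short limits how long a leaked token is useful, which fits a design where the token endpoint is cheap to call again.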

## Content pipeline
- Source materials live in /content as Markdown.
- A local script chunks content and (optionally) embeds it.
- Embeddings are uploaded to S3 and retrieved by Lambda at runtime.
- The model receives only the question and the retrieved sources.
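
The runtime half of this pipeline could look roughly like the sketch below: rank stored chunks by cosine similarity to the question embedding, then build a prompt that contains only the question and the retrieved sources. The record shape (`text`, `embedding`), the `min_score` threshold, and the prompt wording are assumptions for illustration, not the site's actual implementation.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(
    query_vec: Sequence[float],
    index: List[Dict],
    k: int = 3,
    min_score: float = 0.2,  # hypothetical cutoff; tune against real content
) -> List[Dict]:
    """Return up to k chunks whose similarity clears min_score, best first."""
    scored = [(cosine(query_vec, item["embedding"]), item) for item in index]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:k]]


def build_prompt(question: str, chunks: List[Dict]) -> Optional[str]:
    """Build a prompt from the question and sources only.

    Returns None when nothing was retrieved, so the caller can reply that
    the source material does not cover the question instead of guessing.
    """
    if not chunks:
        return None
    sources = "\n\n".join(
        f"[{i}] {chunk['text']}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the sources below. "
        "If they do not cover the question, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

Returning `None` for an empty retrieval is one way to apply the "truth over fluency" principle in code: the model is never asked to answer without sources.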

## Streaming
- /api/chat streams SSE through CloudFront + Lambda URL.
- /api/token stays on API Gateway.
- This split keeps auth simple while enabling true streaming for chat.
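
For reference, the SSE wire format that /api/chat emits is simple: each event is a few `field: value` lines followed by a blank line. The sketch below shows one way to frame model tokens. The event names `token` and `done` and the JSON payload shape are assumptions for this example.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_event(data: str, event: Optional[str] = None) -> str:
    """Format one Server-Sent Event frame.

    Multi-line data is split into one "data:" line per line, and the frame
    ends with a blank line, as the SSE format requires.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"


def stream_answer(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one frame per model token, then a final "done" frame.

    Tokens are JSON-encoded so that newlines inside a token cannot break
    the framing. Event names here are hypothetical.
    """
    for token in tokens:
        yield sse_event(json.dumps({"text": token}), event="token")
    yield sse_event("{}", event="done")
```

An explicit terminal event lets the UI distinguish a finished answer from a dropped connection.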

## Tradeoffs and constraints
- Retrieval quality is only as good as the content and chunking strategy.
- Local chunking is intentionally simple and should evolve as content grows.
- Streaming improves the end-user experience but adds infrastructure complexity, since chat needs its own CloudFront + Lambda URL path.

## How to update content
1) Edit files in /content.
2) Rebuild the index.
3) Upload embeddings to S3.
4) Deploy if needed.
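
Step 2 might look something like the sketch below: walk the Markdown files, pack paragraphs into chunks, and write the records to a JSON file. The chunk size, record fields, and function names are assumptions. Embedding and the S3 upload are deliberately left out.

```python
import json
from pathlib import Path
from typing import Dict, List


def chunk_markdown(text: str, max_chars: int = 800) -> List[str]:
    """Pack blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this paragraph would overflow, so close the chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def build_index(content_dir: Path, max_chars: int = 800) -> List[Dict[str, object]]:
    """Chunk every .md file under content_dir into flat index records."""
    records: List[Dict[str, object]] = []
    for path in sorted(content_dir.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        source = path.relative_to(content_dir).as_posix()
        for n, chunk in enumerate(chunk_markdown(text, max_chars)):
            records.append({"source": source, "chunk": n, "text": chunk})
    return records


def write_index(records: List[Dict[str, object]], out_path: Path) -> None:
    """Write the index as JSON, ready for embedding and upload."""
    out_path.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
```

Sorting the file list keeps the output deterministic, so a rebuild with unchanged content produces an identical index and is easy to diff.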

## Roadmap
- Improve chunking and retrieval quality.
- Add explicit citation formatting in responses.
- Add internal QA checks for high-risk topics.

## Why this design works
It stays grounded, keeps the stack small, and emphasizes clarity and reliability over novelty.
