Task List — AI Estimation Tool

Week 1–2

Infrastructure, Knowledge Base, Seed Data

Minh

Set up AWS environment (account, region, IAM roles for deployment)
Deploy auth stack — Cognito User Pool + App Client
Deploy data stack — DynamoDB tables, S3 buckets
Deploy API stack — API Gateway HTTP API (v2) + JWT authorizer + route stubs
Deploy agent stack — SQS queue, Lambda stubs, IAM execution roles
Set up local dev environment — configure .env.local pointing to real AWS dev account
Confirm local login works end-to-end (Cognito → JWT → API Gateway)

Vinh

Write 5 seed estimation sheets covering: web app, mobile, system integration, data pipeline, AI feature addition
Tag each document with metadata: project_type, tech_stack, scale, duration_weeks, total_person_hours
Set up Bedrock Knowledge Base — S3 data source, embedding model, hybrid search enabled
Load seed documents into Knowledge Base
Confirm RAG search returns relevant results for a test query against Bedrock Knowledge Base
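
One way to attach the metadata tags above is a sidecar file per seed document: Bedrock Knowledge Bases read a `<document>.metadata.json` object stored next to the source file in S3, with the fields under a `metadataAttributes` key. A minimal sketch, assuming the five field names from the task list. The `SeedDocMetadata` type, the `buildMetadataSidecar` helper, the allowed values, and the sample numbers are all hypothetical.

```typescript
// Field names come from the task list; the allowed values are assumptions.
type ProjectType =
  | "web_app"
  | "mobile"
  | "system_integration"
  | "data_pipeline"
  | "ai_feature_addition";

interface SeedDocMetadata {
  project_type: ProjectType;
  tech_stack: string; // assumed: comma-separated, e.g. "nuxt3,nodejs,dynamodb"
  scale: "small" | "medium" | "large"; // assumed buckets
  duration_weeks: number;
  total_person_hours: number;
}

interface MetadataSidecar {
  key: string; // S3 key of the sidecar file
  body: string; // JSON content to upload
}

// Hypothetical helper: builds the sidecar that sits next to a seed document.
function buildMetadataSidecar(documentKey: string, meta: SeedDocMetadata): MetadataSidecar {
  if (meta.duration_weeks <= 0 || meta.total_person_hours <= 0) {
    throw new Error(`Invalid metadata for ${documentKey}: values must be positive`);
  }
  return {
    key: `${documentKey}.metadata.json`,
    body: JSON.stringify({ metadataAttributes: meta }, null, 2),
  };
}

// Sample values are illustrative only.
const sidecar = buildMetadataSidecar("seed/web-app.md", {
  project_type: "web_app",
  tech_stack: "nuxt3,nodejs,dynamodb",
  scale: "medium",
  duration_weeks: 12,
  total_person_hours: 1900,
});
console.log(sidecar.key);
```

Keeping the tags in a typed structure means a missing or misspelled field fails at compile time rather than silently producing an unfilterable document.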

Vinh — Reusable Workflow Proposal

Research and propose reusable workflow/skill patterns across three areas. Output is a short proposal doc per area — not code. Implementation follows in Week 7 once patterns are proven.

Infrastructure — research patterns for AWS architecture and scaffolding (Claude Code skills, IaC generators, community prompt libraries). Propose: what artifact helps a new team go from product brief → CloudFormation stubs + local dev setup fastest?
Backend — research patterns for Node.js/TypeScript Lambda projects (code generation, API scaffolding, test generation, handler conventions). Propose: what workflow helps a backend dev go from API spec → working tested Lambda handler fastest?
Frontend — research patterns for Nuxt 3 / Vue component work (component generation, page scaffolding, composable patterns). Propose: what workflow helps a frontend dev go from design/spec → working page fastest?
For each area: recommend whether the artifact should be a Claude Code skill, a workflow, a template, or a combination — with rationale
Share proposals with Minh (infra + backend review) and Hoat (frontend review)

🔄

Coordination Checkpoint — End of Week 2

Agree on agent output contract: exact fields, markdown structure, citation format
Lock example.html as the reference output before prompts are written
Vinh's reusable workflow proposals (infra / backend / frontend) are shared and agreed

Week 3–4

Prompt Design, Estimation Agent, Backend API

Vinh

Write estimation agent system prompt — structured output, citation instructions, assumptions, confidence rating
Write RAG search sub-prompt — extracts search queries from user input
Test prompts against Bedrock (Claude Haiku) — do not finalise on Groq alone
Build prompt test suite: 3 inputs with expected output shape
Seed SSM parameters: prompt text, model IDs, guardrail ID
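
The "expected output shape" check in the prompt test suite could start as a simple heading check on the agent's markdown. A minimal sketch, assuming the output contract uses one markdown heading per section and that the section names match those shown on the result page. Both are assumptions until the Week 2 contract is locked, and `missingSections` is a hypothetical helper.

```typescript
// Assumed section names; replace with the agreed output contract.
const REQUIRED_SECTIONS = [
  "Summary",
  "Phase Breakdown",
  "Role Breakdown",
  "Assumptions",
  "Confidence",
  "Citations",
] as const;

// Returns the required sections that have no matching markdown heading.
function missingSections(markdown: string): string[] {
  const headings = new Set(
    markdown
      .split("\n")
      .map((line) => /^#{1,6}\s+(.*\S)\s*$/.exec(line))
      .filter((m): m is RegExpExecArray => m !== null)
      .map((m) => m[1].toLowerCase()),
  );
  return REQUIRED_SECTIONS.filter((s) => !headings.has(s.toLowerCase()));
}

// Illustrative agent output with the citations section left out.
const sampleOutput = [
  "# Estimate",
  "## Summary",
  "## Phase Breakdown",
  "## Role Breakdown",
  "## Assumptions",
  "## Confidence",
].join("\n");

console.log(missingSections(sampleOutput)); // [ 'Citations' ]
```

A check like this only verifies structure; judging whether the content of each section is reasonable still needs human review.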

Vinh + Minh

Define keyword blocklist — client names, internal rate cards, NDA terms
Configure Bedrock Guardrail with keyword blocklist only
Run test inputs through guardrail — confirm legitimate content passes
Iterate blocklist based on false positives
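
Before each guardrail run, a local pre-check can flag likely false positives cheaply. A minimal sketch, assuming case-insensitive whole-term matching; this only approximates the guardrail and is not how Bedrock itself matches words. The `findBlockedTerms` helper and the sample terms are hypothetical.

```typescript
// Escape regex metacharacters so blocklist terms are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the blocklist terms that appear in the text as whole terms,
// i.e. not embedded inside a longer alphanumeric word.
function findBlockedTerms(text: string, blocklist: string[]): string[] {
  return blocklist.filter((term) => {
    const pattern = new RegExp(
      `(^|[^a-z0-9])${escapeRegExp(term)}($|[^a-z0-9])`,
      "i",
    );
    return pattern.test(text);
  });
}

// Hypothetical blocklist entries for illustration.
const blocklist = ["acme corp", "nda"];

console.log(findBlockedTerms("Rates follow the Acme Corp rate card.", blocklist)); // [ 'acme corp' ]
console.log(findBlockedTerms("Use a standard agenda for kickoff.", blocklist)); // []
```

The second call shows why whole-term matching matters: "standard" and "agenda" both contain "nda" but should not be flagged.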

Hoat + Minh

POST /sessions — create session, store in DynamoDB
POST /sessions/{id}/run — validate input, enqueue to SQS
GET /sessions/{id}/status — poll for result
GET /sessions — list sessions for tenant
Consumer Lambda — dequeue SQS → call estimation agent → write result to DynamoDB
Estimation agent Lambda — RAG search → Bedrock invoke → return structured output
POST /proposals — convert result to downloadable .md, store in S3, return presigned URL
getLLMClient() factory — wraps Bedrock InvokeModel
getVectorStoreClient() factory — wraps Bedrock Knowledge Base
Unit tests for all handlers and lib modules
Confirm full pipeline works against AWS dev: form input → SQS → agent → result in DynamoDB
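
The `getLLMClient()` factory above could be shaped so handlers never know which provider they are calling. A minimal sketch, assuming the provider and model ID are read from configuration; the `LLMClient` interface, the `LLM_PROVIDER` and `LLM_MODEL_ID` names, and the stub client are hypothetical. A real Bedrock implementation would call InvokeModel through the AWS SDK where the stub returns a canned string.

```typescript
// Hypothetical interface that handlers depend on.
interface LLMClient {
  readonly provider: "bedrock" | "groq";
  invoke(prompt: string): Promise<string>;
}

type Provider = LLMClient["provider"];

// Stub only: stands in for a provider-specific client in this sketch.
function createStubClient(provider: Provider, modelId: string): LLMClient {
  return {
    provider,
    invoke: async (prompt) => `[${provider}:${modelId}] ${prompt.slice(0, 40)}`,
  };
}

// Picks the client from configuration; defaults to Bedrock.
function getLLMClient(env: Record<string, string | undefined>): LLMClient {
  const provider = env["LLM_PROVIDER"] ?? "bedrock";
  const modelId = env["LLM_MODEL_ID"];
  if (provider !== "bedrock" && provider !== "groq") {
    throw new Error(`Unsupported LLM provider: ${provider}`);
  }
  if (!modelId) {
    throw new Error("LLM_MODEL_ID is not set");
  }
  return createStubClient(provider, modelId);
}

// Example: local development pointing at Groq (model ID is a placeholder).
const localClient = getLLMClient({ LLM_PROVIDER: "groq", LLM_MODEL_ID: "example-model" });
console.log(localClient.provider);
```

Passing the environment in as an argument, rather than reading it inside the factory, keeps the unit tests free of global state. The same shape would fit `getVectorStoreClient()`.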

🔄

Coordination Checkpoint — End of Week 4

Run one full estimation end-to-end locally — review output against example.html
Hoat confirms the markdown structure is renderable in the frontend

Week 5–6

Frontend, Integration, End-to-End Testing

Hoat

Login page — email/password, error handling, redirect on success
Session list page — table of past sessions, status indicator, link to result
New session page — input form (project name, type, tech stack, scale, features, integrations, deadline, context)
Session result page — renders estimation output (summary, phase breakdown, role breakdown, assumptions, confidence, citations)
Download button — fetches presigned URL, triggers .md file download
Loading/polling state — show progress while agent runs
"AI-suggested — requires human review" label on every result page
Mobile-responsive layout
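
The loading/polling state could be driven by a small helper that polls the status endpoint with a capped backoff. A minimal sketch, assuming four status values and a 1 s to 8 s delay range; the status names, timings, and both helper names are hypothetical. `fetchStatus` is injected so the helper does not depend on a particular HTTP client.

```typescript
// Assumed status values; align with what GET /sessions/{id}/status returns.
type SessionStatus = "queued" | "running" | "done" | "failed";

// Capped exponential backoff: 1s, 2s, 4s, 8s, 8s, ...
function pollDelayMs(attempt: number, baseMs = 1000, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Polls until the session settles or the attempt budget runs out.
async function pollUntilSettled(
  fetchStatus: () => Promise<SessionStatus>,
  maxAttempts = 20,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<SessionStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (status === "done" || status === "failed") {
      return status;
    }
    await sleep(pollDelayMs(attempt));
  }
  throw new Error("Timed out waiting for the estimation result");
}

console.log([0, 1, 2, 3, 4].map((n) => pollDelayMs(n))); // [ 1000, 2000, 4000, 8000, 8000 ]
```

In a Nuxt page this would typically be wrapped in a composable that exposes the current status to the template, so the progress indicator and the timeout message share one source of truth.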

All

End-to-end test: submit form → poll status → view result → download .md
RAG citations appear on result page and match documents in knowledge base
Guardrail blocks a test input containing a blocklisted keyword
Guardrail does not block legitimate estimation content
Multi-tenant isolation: user from tenant A cannot see sessions from tenant B
Test with minimal input — confirm agent still produces output
Test with detailed input — confirm assumptions list is shorter than with minimal input
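
The multi-tenant isolation check above can also be covered at unit level, before the end-to-end run. A minimal sketch, assuming every session record carries a tenant ID and that the caller's tenant is taken from the verified JWT rather than from the request body; the record shape and both helper names are hypothetical.

```typescript
// Hypothetical session record shape.
interface SessionRecord {
  tenantId: string;
  sessionId: string;
  status: string;
}

// Lists only the sessions that belong to the caller's tenant.
function listSessionsForTenant(all: SessionRecord[], callerTenantId: string): SessionRecord[] {
  return all.filter((s) => s.tenantId === callerTenantId);
}

// Looks up one session; a session from another tenant is treated as not found.
function getSessionForTenant(
  all: SessionRecord[],
  callerTenantId: string,
  sessionId: string,
): SessionRecord | undefined {
  return all.find((s) => s.sessionId === sessionId && s.tenantId === callerTenantId);
}

// Illustrative data: two tenants, three sessions.
const sessions: SessionRecord[] = [
  { tenantId: "tenant-a", sessionId: "s1", status: "done" },
  { tenantId: "tenant-b", sessionId: "s2", status: "done" },
  { tenantId: "tenant-a", sessionId: "s3", status: "running" },
];

console.log(listSessionsForTenant(sessions, "tenant-a").map((s) => s.sessionId)); // [ 's1', 's3' ]
console.log(getSessionForTenant(sessions, "tenant-a", "s2")); // undefined
```

Returning "not found" rather than "forbidden" for another tenant's session avoids confirming that the session ID exists.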

Week 7–8

Bug Fixes, Demo Preparation, Documentation

All

Fix issues found in integration testing
Prompt iteration — improve output quality based on real Bedrock test runs
Guardrail iteration — adjust blocklist based on false positives or misses
Load at least 5 realistic estimation documents into Knowledge Base
Prepare 3 demo scenarios: web portal (high confidence), AI feature addition (medium), legacy migration (low confidence)
Run all 3 scenarios end-to-end, review outputs
Deploy to production AWS (all stacks)
Confirm production Bedrock + Knowledge Base works (not local Groq/ChromaDB)

Minh + Vinh

Update README.md — local dev quick start, env vars, deploy steps
Document prompt design decisions — what was tried, what was rejected, why
Document guardrail keyword list and rationale
Document RAG retrieval strategy — metadata tags, hybrid search config
Write architecture summary — patterns proven, what comes next

Vinh — Reusable Workflow Implementation

Implement the three workflows agreed in Week 1–2. Do not start before Week 7 — patterns must be proven end-to-end first.

Infrastructure workflow — implement and validate against a hypothetical new product brief
Backend workflow — implement and validate by scaffolding a sample Lambda handler outside this project
Frontend workflow — implement and validate by scaffolding a sample Nuxt page outside this project
For each: confirm output is actionable without manual cleanup
Store all artifacts in .claude/commands/ or the agreed location; document how a new project adopts them

Week-by-Week Breakdown