What makes a RAG answer trustworthy?
Notes on citations, retrieval quality, chunking, and why confident language can hide weak context.
Short writeups on machine learning experiments, RAG systems, model evaluation, debugging datasets, and turning notebooks into useful software.
A practical breakdown of retrieval quality, chunking, citations, confidence, and how to debug answers that sound correct but are not grounded enough.
All Notes
A practical debugging checklist for noisy labels, leakage, imbalance, and misleading accuracy.
How I wrap experiments into endpoints with validation, logging, and a deployment-friendly structure.
Why model demos should include failure cases, baselines, and measurable improvement before being called useful.
A folder structure I use for experiments, configs, datasets, notebooks, APIs, and reproducible training.
Design decisions, performance notes, responsive layout fixes, and how a portfolio can communicate engineering taste.