Two Years of Writing, One Afternoon of Code
A hands-on RAG build, and an honest look at AI coding agents
Recently, in an afternoon, I built a RAG application that lets me chat with two years of my own writing. 146 posts, now searchable through natural conversation. I did this while taking notes for the article you’re reading.
I’ve been writing about AI and software development since before ChatGPT existed. Until now, I’ve focused on chasing the next development. But I realized I’d accumulated a corpus worth exploring, and with today’s tooling, I could build the harness to do it in a single sitting.
This post is two things: a practical guide to building your own personal RAG, and an honest assessment of Claude Opus 4.5 as a coding agent. If you’ve heard about RAG but never built one, or you built one a year ago and haven’t touched it since, this is your on-ramp. I’ll walk through the decisions that matter: choosing a vector database, chunking strategy, metadata design, and the iteration required to get useful results. We’re not building enterprise infrastructure here. We’re building a functional POC that solves a personal need, and, as you’ll see, one that charts a path toward scaling the solution.
For the vector database, I chose ChromaDB. Beyond my positive experience with it during my AWS days, ChromaDB offers something valuable for developers who want to experiment without commitment: a frictionless scaling path. You can start with an ephemeral in-memory database for quick tests, move to local file persistence for real development, then shift to a client-server architecture, and finally to their hosted cloud offering. Each transition requires minimal code changes. You’re never locked into a decision you can’t easily reverse.
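To make that concrete, here is roughly what the progression looks like in Chroma’s Python client. This is a sketch rather than the code from my build: the collection name is illustrative, and the Cloud credentials are placeholders you’d want to check against Chroma’s current docs.

```python
import chromadb

# Each deployment tier is just a different client constructor; everything
# downstream of `client` stays the same.

# 1. Ephemeral: in-memory, gone when the process exits -- quick experiments.
client = chromadb.EphemeralClient()

# 2. Local persistence: same API, but the index survives restarts.
client = chromadb.PersistentClient(path="./chroma_data")

# 3. Client/server: point the same code at a Chroma server you run yourself.
# client = chromadb.HttpClient(host="localhost", port=8000)

# 4. Hosted: Chroma Cloud (placeholder credentials; check the docs for params).
# client = chromadb.CloudClient(api_key="...", tenant="...", database="my-writing")

# Moving up a tier is a one-line change.
collection = client.get_or_create_collection("posts")
print(collection.count())
```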
Note: all the code for this post is on GitHub.
Throughout this build, I’ll include periodic check-ins on how Opus 4.5 performed. Benchmarks tell one story; using a model for real work tells another. Spoiler: I was impressed, with some caveats worth discussing.
Let’s build.
Choosing a Stack
I knew I wanted a vector embeddings solution. During my time at AWS, even before Bedrock existed, I built RAG systems using a self-hosted ChromaDB. I remembered it being pleasant to work with, so I checked in on the project.
ChromaDB has been busy. They published a seminal paper on Context Rot, and they’ve expanded their deployment options. The progression now looks like this:
Ephemeral (in-memory) for quick experiments
Local persistent file storage for real development
Client/server architecture where you control the server
Cloud-hosted, fully managed by Chroma
This scaling ramp matters. You can get started with zero infrastructure, iterate locally, and transition to production hosting without rewriting your application. For a “scratch your own itch” project like this, that flexibility removes the friction of “but what if I want to scale this POC?”
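Here’s a sketch of what that local iteration looks like. The collection name, metadata fields, and sample chunks are illustrative, not the exact code from my build; by default Chroma embeds the documents with its built-in embedding model unless you configure another one.

```python
import chromadb

client = chromadb.PersistentClient(path="./chroma_data")
collection = client.get_or_create_collection("posts")

# Add a couple of post chunks, each carrying metadata about its source post.
collection.add(
    ids=["post-001-chunk-0", "post-001-chunk-1"],
    documents=[
        "RAG pairs a retriever with a language model so answers stay grounded.",
        "Chunking strategy determines what the retriever can actually find.",
    ],
    metadatas=[
        {"title": "What RAG Actually Does", "published": "2024-03-12"},
        {"title": "What RAG Actually Does", "published": "2024-03-12"},
    ],
)

# Ask a question in natural language; get back the most relevant chunks plus
# their metadata, ready to hand to the chat model.
results = collection.query(
    query_texts=["How should I chunk long blog posts?"],
    n_results=3,
)
for doc, meta in zip(results["documents"][0], results["metadatas"][0]):
    print(meta["title"], "->", doc[:80])
```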
ChromaDB also provides an llms.txt file for their documentation, which would help Claude Code understand their APIs without hallucinating outdated patterns.
Honorable Mention: LanceDB
LanceDB deserves a mention. I explored it while learning Rust, as they had an early and excellent Rust SDK. Returning to their site after a year away, I see they’ve expanded significantly: storage, search, feature engineering, analytics, and training. They also offer cloud hosting with reasonable pricing for hobby projects.
Their hybrid search capabilities are particularly interesting. There’s a good chance I’ll return to LanceDB for a future exploration, especially to compare hybrid search approaches.
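For a taste of what that comparison might look like, here is a rough sketch of hybrid search in LanceDB’s Python SDK, combining vector similarity with full-text search over the same table. The embedding model name and the default rank fusion are assumptions worth verifying against LanceDB’s docs, and a full-text index dependency must be installed.

```python
import lancedb
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector

# Embedding function from LanceDB's registry (model name is an assumption;
# any sentence-transformers model should work).
embedder = get_registry().get("sentence-transformers").create(name="BAAI/bge-small-en-v1.5")

class Post(LanceModel):
    text: str = embedder.SourceField()          # raw chunk text
    vector: Vector(embedder.ndims()) = embedder.VectorField()  # auto-embedded

db = lancedb.connect("./lancedb")
table = db.create_table("posts", schema=Post, mode="overwrite")
table.add([{"text": "Chunking strategy determines what the retriever can find."}])

# Full-text index over the raw text enables the keyword half of hybrid search.
table.create_fts_index("text")

# Hybrid query: vector similarity and keyword scores fused into one ranking.
results = (
    table.search("how should I chunk blog posts?", query_type="hybrid")
    .limit(5)
    .to_list()
)
```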
But for this build, ChromaDB’s familiarity and documentation support made it the pragmatic choice.
First Prompt, First Working Code