# How I Built My Portfolio's AI Chatbot Memory with pgvector
My portfolio has a chatbot that answers questions about me - my experience, my skills, whether I'm available for work. The interesting part isn't the chat UI; it's the retrieval layer that decides what the model is allowed to know before it answers. I built that on Postgres with the pgvector extension, inside a Django app.
## Two tables: facts, and corrections that win
Knowledge about me lives in two places.
`ProfileFact` rows hold the raw material - a title, the content, a category, and tags. Categories cover things like experience, skills, certifications, contact, and availability.
`AnswerCorrection` rows are the override layer. When the model gives a wrong answer to a specific question, I store the question paired with the correct answer. Corrections take priority over facts, so I can fix a bad answer without reworking the underlying facts.
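A minimal sketch of the two record shapes, written as plain Python dataclasses rather than the actual Django models. The field names (`correct_answer` in particular) and the placeholder values are assumptions for illustration, not the real schema.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileFact:
    """Raw material about me; field names here are illustrative."""
    title: str
    content: str
    category: str  # e.g. "experience", "skills", "contact", "availability"
    tags: list[str] = field(default_factory=list)


@dataclass
class AnswerCorrection:
    """Override layer: a specific question paired with its correct answer."""
    question: str
    correct_answer: str


# Placeholder rows, purely for illustration.
fact = ProfileFact(
    title="Example fact",
    content="Placeholder content.",
    category="skills",
    tags=["example"],
)
correction = AnswerCorrection(
    question="Example question?",
    correct_answer="Placeholder corrected answer.",
)
```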
## Embeddings live next to the data
Every row gets an embedding from OpenAI's `text-embedding-3-small` model - 1536 dimensions - stored in a pgvector column right next to the row it describes. No separate vector database, no sync job. The vector and the source of truth are the same row.
The embeddings stay fresh through a Django `post_save` signal: whenever a row is saved, the handler recomputes that row's embedding. To avoid burning API calls on no-op saves, I hash the source text and skip the embedding call when the hash hasn't changed. Editing the text re-embeds; toggling an unrelated flag doesn't.
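The hash check can be sketched in plain Python, outside Django. This is an illustration under assumptions: the row is a dict, the source text is a single `content` field, the stored hash lives under a hypothetical `content_hash` key, and `embed` stands in for the call to the embedding API.

```python
import hashlib
from typing import Callable, Sequence


def text_hash(text: str) -> str:
    """Stable fingerprint of the source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def refresh_embedding(row: dict, embed: Callable[[str], Sequence[float]]) -> bool:
    """Re-embed `row` only if its text changed. Returns True when it re-embedded."""
    new_hash = text_hash(row["content"])
    if row.get("content_hash") == new_hash:
        return False  # no-op save: skip the API call
    row["embedding"] = list(embed(row["content"]))
    row["content_hash"] = new_hash
    return True


# Stand-in for the real embedding call, so the sketch runs offline.
def fake_embed(text: str) -> list[float]:
    return [float(len(text))]


row = {"content": "hello", "is_active": True}
refresh_embedding(row, fake_embed)   # first save: embeds
row["is_active"] = False
refresh_embedding(row, fake_embed)   # unrelated flag changed: skipped
row["content"] = "hello world"
refresh_embedding(row, fake_embed)   # text changed: embeds again
```

In a real `post_save` handler, the new embedding and hash also have to be written back without firing the signal again. One option is `QuerySet.update()`, which does not send `post_save`.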
## Retrieval: cosine distance with guardrails
At query time I rank candidates by cosine distance against the question's embedding:
- Top 12 facts within a 0.55 distance threshold.
- Top 3 corrections within a tighter 0.45 threshold - corrections have to be a closer match to fire.
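
To make the ranking rule concrete, here is a pure-Python sketch of "top N within a distance threshold". In the real system pgvector does this in SQL (its `<=>` operator is cosine distance), so the functions below only illustrate the logic. The dict layout, the helper names, and the inclusive threshold comparison are assumptions.

```python
import math
from typing import Sequence

FACT_LIMIT, FACT_MAX_DISTANCE = 12, 0.55
CORRECTION_LIMIT, CORRECTION_MAX_DISTANCE = 3, 0.45


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity. Assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_matches(query, rows, limit, max_distance):
    """Closest `limit` rows whose distance to `query` is within `max_distance`."""
    scored = [(cosine_distance(query, row["embedding"]), row) for row in rows]
    within = [pair for pair in scored if pair[0] <= max_distance]
    within.sort(key=lambda pair: pair[0])  # smallest distance first
    return [row for _, row in within[:limit]]


# Toy 2-dimensional vectors; real embeddings have 1536 dimensions.
rows = [
    {"id": "a", "embedding": [1.0, 0.0]},  # distance 0.0 from the query
    {"id": "b", "embedding": [1.0, 1.0]},  # distance about 0.29
    {"id": "c", "embedding": [0.0, 1.0]},  # distance 1.0, outside both thresholds
]
query = [1.0, 0.0]
facts = top_matches(query, rows, FACT_LIMIT, FACT_MAX_DISTANCE)
print([row["id"] for row in facts])  # -> ['a', 'b']
```

The tighter threshold for corrections is just a different `max_distance` passed to the same function.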
Two guardrails matter:
- **Corrections are injected ahead of facts**, so they override the model's defaults.
- **`contact` and `availability` rows are always included**, even when they aren't the closest match. Whatever someone asks, the bot should never be unable to say how to reach me or whether I'm open to work.
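
Both guardrails can be sketched as one context-assembly step. The function name, the dict keys, and the text format of the assembled context are hypothetical; the point is only the ordering (corrections first) and the always-included categories.

```python
ALWAYS_INCLUDE = {"contact", "availability"}


def build_context(matched_corrections, matched_facts, all_facts):
    """Assemble prompt context: corrections first, then facts plus pinned rows."""
    facts = list(matched_facts)
    seen_ids = {fact["id"] for fact in facts}
    # Pin contact/availability rows even if they were not close matches.
    for fact in all_facts:
        if fact["category"] in ALWAYS_INCLUDE and fact["id"] not in seen_ids:
            facts.append(fact)
            seen_ids.add(fact["id"])

    lines = []
    for correction in matched_corrections:  # corrections go ahead of facts
        lines.append(f"CORRECTION: Q: {correction['question']} A: {correction['answer']}")
    for fact in facts:
        lines.append(f"FACT [{fact['category']}]: {fact['content']}")
    return "\n".join(lines)


# Placeholder rows, purely for illustration.
all_facts = [
    {"id": 1, "category": "skills", "content": "Placeholder skills fact."},
    {"id": 2, "category": "contact", "content": "Placeholder contact fact."},
    {"id": 3, "category": "availability", "content": "Placeholder availability fact."},
]
corrections = [{"question": "Example question?", "answer": "Placeholder answer."}]
context = build_context(corrections, [all_facts[0]], all_facts)
print(context)
```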
## Degrading gracefully
Embeddings depend on an external API, so I planned for that API being unreachable. If the embedding service is completely unavailable - say the key is revoked - retrieval falls back to recent active facts instead of failing the request. The bot still answers; it just can't pick the facts most relevant to the question. A slightly worse answer beats an error page.
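The fallback path can be sketched as a try/except around the embedding call. Everything named here is an assumption: the `is_active` and `updated_at` keys, the fallback limit of 12, the `embed` and `search` callables, and the broad `except Exception`, which a real handler would likely narrow to the API client's error types.

```python
def recent_active_facts(facts, limit):
    """Most recently updated active facts, newest first."""
    active = [fact for fact in facts if fact["is_active"]]
    active.sort(key=lambda fact: fact["updated_at"], reverse=True)
    return active[:limit]


def retrieve_facts(question, facts, embed, search, fallback_limit=12):
    """Vector search when embeddings work, recent active facts when they don't."""
    try:
        query_vector = embed(question)
    except Exception:
        # Embedding service unavailable: degrade instead of failing the request.
        return recent_active_facts(facts, fallback_limit)
    return search(query_vector, facts)


# Simulate a revoked key with an embed function that always raises.
def broken_embed(text):
    raise RuntimeError("embedding service unavailable")


def unused_search(query_vector, facts):
    return facts


facts = [
    {"id": 1, "is_active": True, "updated_at": 10},
    {"id": 2, "is_active": False, "updated_at": 30},
    {"id": 3, "is_active": True, "updated_at": 20},
]
fallback = retrieve_facts("anything", facts, broken_embed, unused_search)
print([fact["id"] for fact in fallback])  # -> [3, 1]
```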
If you're building retrieval-augmented features and want to keep your vectors in Postgres rather than standing up a separate vector store, I'm glad to talk through the trade-offs. Reach out via [tahayusufkomur.me](https://tahayusufkomur.me).