Spiritual AI Guide
A production-deployed RAG chatbot that semantically searches 1,649 personal Obsidian notes and generates spiritually grounded, cited responses.
What is this project?
Over three years, I accumulated 1,649 Markdown notes in Obsidian while reading 75+ books on spirituality, psychology, philosophy, and neuroscience. This project turns that private knowledge base into a conversational AI that can answer questions, surface relevant passages, and cite its sources, acting as an AI study partner for the material I have studied.
The system implements Retrieval-Augmented Generation (RAG) end-to-end: notes are chunked semantically, embedded with all-MiniLM-L6-v2 into 384-dimensional vectors stored in ChromaDB, retrieved via hybrid BM25 + dense search, and passed as grounded context to GPT-4 Turbo for citation-anchored response generation.
RAG Pipeline
Ingestion
Parse 1,649 Obsidian .md files, extract WikiLink graph, preserve metadata (category, book, file path).
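As a rough illustration of the ingestion step, the sketch below pulls WikiLink targets out of a note and bundles them with the preserved metadata. The names `extract_wikilinks` and `parse_note` are hypothetical, and the `category/book/note.md` folder layout is an assumption made for the example, not a description of the actual vault.

```python
import re
from pathlib import PurePosixPath

# Matches [[Target]], [[Target|alias]], [[Target#Heading]] and ![[Target]] embeds.
# Group 1 captures only the target note name.
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_wikilinks(markdown: str) -> list[str]:
    """Return the note names a Markdown note links to, in document order."""
    return [m.group(1).strip() for m in WIKILINK_RE.finditer(markdown)]


def parse_note(path: str, markdown: str) -> dict:
    """Bundle a note's text with its metadata and outgoing links.

    Assumption: paths look like "category/book/note.md". Notes stored at
    a shallower depth get an empty category and/or book.
    """
    parts = PurePosixPath(path).parts
    return {
        "file_path": path,
        "category": parts[0] if len(parts) > 1 else "",
        "book": parts[1] if len(parts) > 2 else "",
        "links": extract_wikilinks(markdown),
        "text": markdown,
    }
```

Collecting the `links` field across all notes gives an adjacency list for the WikiLink graph, which a later stage can use for a link-based signal.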
Semantic Chunking
Split by Markdown headers → paragraph overflow → 800-token chunks with 150-token sliding overlap.
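The chunking strategy can be sketched as follows. This is a simplified version under stated assumptions: tokens are approximated by whitespace-separated words rather than a real tokenizer, the intermediate paragraph-level split is omitted for brevity, and the function names are hypothetical.

```python
import re


def split_by_headers(text: str) -> list[str]:
    """Split Markdown into sections, each starting at a header line."""
    # Zero-width split just before any line that begins with 1-6 '#' and a space.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    return [s.strip() for s in sections if s.strip()]


def sliding_window(tokens: list, size: int = 800, overlap: int = 150) -> list[list]:
    """Cut a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the section
    return windows


def chunk_markdown(text: str, size: int = 800, overlap: int = 150) -> list[str]:
    """Header-first chunking; oversized sections fall back to sliding windows."""
    chunks = []
    for section in split_by_headers(text):
        tokens = section.split()  # whitespace tokens stand in for model tokens
        for window in sliding_window(tokens, size, overlap):
            chunks.append(" ".join(window))  # note: original line breaks are lost
    return chunks
```

With the default settings each window starts 650 tokens after the previous one, so a 2,000-token section yields three chunks of 800, 800, and 700 tokens.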
Dense Embedding
Encode all 1,772 chunks with all-MiniLM-L6-v2, L2-normalise, store in ChromaDB HNSW index (cosine metric).
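To show why L2-normalisation pairs naturally with a cosine metric, here is a small pure-Python sketch. It is not the project's code (in practice sentence-transformers can normalise during encoding via `normalize_embeddings=True`), and the helper names are hypothetical; it only demonstrates that the dot product of unit-length vectors equals their cosine similarity.

```python
import math


def l2_normalise(vec: list[float]) -> list[float]:
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed from the raw, unnormalised vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy 2-D vectors stand in for the 384-dimensional embeddings.
a, b = [3.0, 4.0], [4.0, 3.0]
unit_a, unit_b = l2_normalise(a), l2_normalise(b)

# After normalisation, a dot product alone reproduces the cosine similarity.
assert math.isclose(dot(unit_a, unit_b), cosine_similarity(a, b))
```

Normalising once at index time means every later comparison reduces to a dot product, and it keeps similarity scores on a fixed scale for downstream re-ranking.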
Hybrid Retrieval + Re-ranking
ANN search (top-10 candidates) → composite re-rank: 70% cosine similarity + 20% keyword overlap + 10% link density.
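The composite score can be illustrated with a short sketch. Only the 70/20/10 weights come from the description above; the definitions used here are assumptions for the example: keyword overlap is taken as the fraction of query terms present in the chunk, and link density as the chunk's WikiLink count divided by the maximum count in the candidate set. The `Candidate` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    chunk_id: str
    text: str
    cosine: float    # similarity returned by the ANN search, assumed in [0, 1]
    link_count: int  # number of WikiLinks found in the chunk


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of distinct query terms that also appear in the chunk text."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms)


def rerank(query: str, candidates: list[Candidate], top_k: int = 5):
    """Return (candidate, score) pairs sorted by the weighted composite score."""
    max_links = max((c.link_count for c in candidates), default=0)
    scored = []
    for c in candidates:
        # Assumed definition: link count normalised by the busiest candidate.
        link_density = c.link_count / max_links if max_links else 0.0
        score = (0.7 * c.cosine
                 + 0.2 * keyword_overlap(query, c.text)
                 + 0.1 * link_density)
        scored.append((c, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Because cosine similarity carries 70% of the weight, the lexical and link signals mostly act as tie-breakers among candidates with close embedding scores.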
Grounded Generation
Structured prompt injects retrieved chunks with [Source: X] labels. GPT-4 Turbo generates a cited response streamed via SSE.
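A minimal sketch of the two mechanics named above: labelling retrieved chunks with [Source: X] inside the prompt, and framing streamed text as Server-Sent Events. The instruction wording in the prompt and the function names are hypothetical; only the label format and the use of SSE come from the description.

```python
def build_prompt(question: str, chunks: list[tuple[str, str]]) -> str:
    """Assemble a grounded prompt from (source, text) pairs.

    The instruction text is illustrative; the real prompt may differ.
    """
    context = "\n\n".join(
        f"[Source: {source}]\n{text}" for source, text in chunks
    )
    return (
        "Answer the question using only the context below. "
        "Cite each claim with its [Source: X] label.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def sse_event(data: str) -> str:
    """Frame a text fragment as one Server-Sent Event.

    Each line of the payload gets its own "data: " prefix, and a blank
    line terminates the event.
    """
    lines = data.split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"
```

On the server side, each token or fragment produced by the model would be passed through something like `sse_event` and written to a response with the `text/event-stream` content type, so the browser can render the answer as it arrives.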
AI Skills Involved
Technical Stack
Backend
- Python 3.11 + FastAPI
- ChromaDB (HNSW, cosine)
- sentence-transformers
- Pydantic v2 validation
- Uvicorn async server
AI / ML
- OpenAI GPT-4 Turbo
- Ollama Llama 3.1 8B
- all-MiniLM-L6-v2 (384D)
- Anthropic Claude 3+
- Google Gemini (optional)
Frontend & DevOps
- Next.js 14 + TypeScript
- Tailwind CSS
- Docker + docker-compose
- Netlify (frontend)
- Render (backend)