2026 · GraphRAG knowledge graph
GraphMind
A GraphRAG system over financial trade data that fuses a Neo4j knowledge graph, dense vector search, and BM25 into one retrieval surface for a LangGraph agent.
Why I built this
Pure vector RAG misses relationship questions, and a pure knowledge graph misses fuzzy semantic ones. Financial data needs both, plus exact-match on tickers like HDFCBANK that embeddings smear together. Building all three retrievers and fusing them taught me where each one breaks and why a hybrid, agent-routed approach beats any single retriever.
Architecture
Three retrievers, one agent
- Knowledge graph · Neo4j Cypher for multi-hop analyst, fund, trade, instrument, and sector relationships
- Dense · ChromaDB vector search over sentence-transformer embeddings
- Sparse · BM25Okapi for exact terms like tickers
- Fusion · dense and BM25 merged with Reciprocal Rank Fusion before ranking
- Routing · a LangGraph ReAct agent picks the right tool per question
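The fusion step can be illustrated with a minimal Reciprocal Rank Fusion sketch. This is not the project's actual code: the function name, the document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions for illustration. Each document scores the sum of `1 / (k + rank)` over every result list it appears in, so a document ranked well by both dense and BM25 retrieval rises to the top.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists: the IDs are placeholders, not real data.
dense_hits = ["doc_b", "doc_a", "doc_c"]  # e.g. from the vector store
bm25_hits = ["doc_a", "doc_d", "doc_b"]   # e.g. from the BM25 index

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
for doc_id, score in fused:
    print(f"{doc_id}\t{score:.5f}")
```

Because RRF uses only ranks, the dense similarity scores and BM25 scores never need to be normalized onto a common scale.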
Graph ingestion and constraints
Uniqueness constraints on Instrument, Fund, Trade, Analyst, and Sector nodes are created before ingestion so repeated MERGE statements never silently duplicate nodes. NetworkX handles the smaller in-process graph and visualization alongside the Neo4j store.
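A rough sketch of the constraint step, under stated assumptions: each node type is keyed on a property named `id` (hypothetical, the real key may differ), the constraint names are invented here, and the Cypher uses the `FOR ... REQUIRE` form accepted by recent Neo4j versions. The `session` argument is any object exposing `run(query)`, such as a Neo4j driver session.

```python
# Labels taken from the text; the "id" key property is an assumption.
LABELS = ["Instrument", "Fund", "Trade", "Analyst", "Sector"]


def constraint_statements(labels, key="id"):
    """Build one idempotent uniqueness-constraint statement per label."""
    return [
        f"CREATE CONSTRAINT {label.lower()}_{key}_unique IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.{key} IS UNIQUE"
        for label in labels
    ]


def apply_constraints(session, labels=LABELS, key="id"):
    """Run each statement on an object exposing run(query)."""
    for statement in constraint_statements(labels, key):
        session.run(statement)


# With the constraint in place, a MERGE on the key property matches the
# existing node instead of creating a second one (hypothetical query).
MERGE_INSTRUMENT = "MERGE (i:Instrument {id: $id}) SET i.name = $name"

statements = constraint_statements(LABELS)
for statement in statements:
    print(statement)
```

`IF NOT EXISTS` makes the setup safe to rerun, so it can sit at the top of every ingestion job.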
Key highlights
Proof points
- 01 · Fuses three retrievers (a Neo4j knowledge graph, ChromaDB dense vectors, and BM25) into a single retrieval surface.
- 02 · Dense and sparse results are combined with Reciprocal Rank Fusion, so semantic meaning and exact-match terms like tickers both contribute.
- 03 · A LangGraph ReAct agent routes each question to graph traversal, vector search, or hybrid retrieval as the question demands.
- 04 · Uniqueness constraints on Instrument, Fund, Trade, Analyst, and Sector nodes prevent duplicate graph nodes during MERGE-based ingestion.