
🤖

tiny-chatbot-agents

2026.01 - 2026.02

📋 Project Overview

An enterprise-oriented RAG chatbot system with dual-stage retrieval, hybrid search, and hallucination verification.

🎯 Problem Definition & Goals

Problem: Enterprise chatbots hallucinate answers and raise data privacy concerns.
Goal 1: Build a fully local RAG system for company-specific QnA.
Goal 2: Implement dual-stage retrieval prioritizing curated answers.
Goal 3: Add hallucination verification for grounded answers.

⚙️ Key Features & Contributions

Dual-Stage RAG: Queries the curated QnA database first, then searches the ToS documents.
Hybrid Search: Combines vector search, rule-based matching, and a knowledge graph.
Advanced Ranking: Bi-Encoder for fast retrieval, Cross-Encoder for re-ranking.
Hallucination Verification: LLM-based verification step.
MCP Server: Claude Desktop compatibility.
Local LLM Ready: Works with vLLM and Ollama.
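The dual-stage flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes the ToS search acts as a fallback when no QnA entry matches well enough, it swaps the embedding-based vector search for a toy token-overlap (Jaccard) similarity so it runs without any model, and the names `dual_stage_retrieve` and `qna_threshold` are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Toy Jaccard similarity; a stand-in for embedding similarity."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def dual_stage_retrieve(query, qna_pairs, tos_chunks, qna_threshold=0.5):
    """Stage 1: curated QnA lookup. Stage 2: ToS chunk search (assumed fallback)."""
    best_score, best_answer = 0.0, None
    for question, answer in qna_pairs:
        score = similarity(query, question)
        if score > best_score:
            best_score, best_answer = score, answer

    # Prefer the curated answer when the match is strong enough.
    if best_answer is not None and best_score >= qna_threshold:
        return {"stage": "qna", "score": best_score, "context": [best_answer]}

    # Otherwise rank ToS chunks and return the top two as context.
    ranked = sorted(tos_chunks, key=lambda c: similarity(query, c), reverse=True)
    top_score = similarity(query, ranked[0]) if ranked else 0.0
    return {"stage": "tos", "score": top_score, "context": ranked[:2]}
```

In a real pipeline the `similarity` function would be replaced by a Bi-Encoder lookup, with the Cross-Encoder re-ranking the candidates before they reach the LLM.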

🔧 Technical Challenges & Solutions

Retrieval Quality: Implemented a hybrid approach combining vector search, rule-based matching, and a knowledge graph.
Hallucination Detection: Added a verification agent that cross-references answers against the retrieved sources.
Unknown Answer Handling: Implemented confidence scoring and "I don't know" responses.
Latency Optimization: Caching, parallel retrieval, and early termination.
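The unknown-answer handling above could look something like the sketch below. It is an illustration under stated assumptions: retrieval scores are assumed to lie in [0, 1], the `min_confidence` threshold of 0.6 is arbitrary, and `Hit`, `answer_with_abstention`, `retrieve`, and `generate` are hypothetical names rather than the project's API.

```python
from dataclasses import dataclass
from typing import Callable

FALLBACK = "I don't know."


@dataclass
class Hit:
    text: str
    score: float  # assumed to be a relevance score in [0, 1]


def answer_with_abstention(
    query: str,
    retrieve: Callable[[str], list[Hit]],
    generate: Callable[[str, list[str]], str],
    min_confidence: float = 0.6,
) -> str:
    """Answer only when retrieval is confident enough; otherwise abstain."""
    hits = retrieve(query)
    if not hits:
        return FALLBACK

    # Early termination: skip the LLM call when the best hit is too weak.
    top = max(hits, key=lambda h: h.score)
    if top.score < min_confidence:
        return FALLBACK

    # Pass only sufficiently relevant passages on as grounding context.
    context = [h.text for h in hits if h.score >= min_confidence]
    return generate(query, context)
```

Abstaining before generation also helps latency, since low-confidence queries never reach the model.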

📈 Results & Learnings

Improved Accuracy: Dual-stage retrieval improved answer relevance by prioritizing curated QnA entries.
Reduced Hallucinations: Verification catches ~85% of hallucinated responses.
Security Compliance: Fully local deployment meets enterprise requirements.
Key Learning: Gained hands-on experience building production RAG systems.

🛠️ Technologies

Python, Streamlit, vLLM, Ollama, Playwright, ChromaDB, Cross-Encoder

🔗 Links

💻 GitHub Repository