Key Components
A comprehensive architecture that handles everything from data ingestion to secure deployment
Ingestion Layer
Supports multiple memory types, including Google Docs, PDFs, Confluence, Jira, and audio transcripts.
Embedding & Vectorization
Content chunking and vector embeddings using OpenAI or Microsoft Azure AI Services.
Storage & Indexing
Vector storage in MongoDB Atlas with metadata optimization and relevance filtering.
Retrieval Engine
High-performance similarity search with batch processing for datasets with 90K+ rows.
LLM Generation
Context-aware response generation with OpenAI GPT and Mistral AI, plus enrichment from Perplexity.ai.
Security & Orchestration
Enterprise-grade deployment on Google Kubernetes Engine (GKE) with Nest.js and HIPAA compliance.
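To make the chunking and batch processing mentioned in the components above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Hana's actual implementation: the `Chunk` class, the `chunk_text` and `batched` helpers, the character-based window, and the default sizes are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of source text plus metadata kept alongside its embedding."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_text(text, source, size=200, overlap=50):
    """Split text into overlapping character windows (sizes are assumptions)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + size]
        chunks.append(Chunk(piece, {"source": source, "index": index, "start": start}))
        # Stop once this window already reaches the end of the text.
        if start + size >= len(text):
            break
    return chunks


def batched(items, batch_size):
    """Yield fixed-size batches so embedding calls can be grouped."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


chunks = chunk_text("a" * 450, source="example.pdf")
batches = list(batched(chunks, batch_size=2))
```

Overlapping windows keep a sentence that straddles a boundary visible in both neighboring chunks, and grouping chunks into batches reduces the number of calls made to an embedding service.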
Workflow Example
See how Hana's RAG pipeline processes data from ingestion to accurate response generation
Ingestion
User adds memory via dashboard or chat command
Example: @Hana /memory add "Q3 goals: Increase efficiency by 20%"
Processing
Content is chunked, embedded, and stored in MongoDB Atlas
Example: Vector embeddings generated with metadata
Query
User asks a question in natural language
Example: @Hana What are our Q3 goals?
Retrieval
Similarity search retrieves relevant chunks
Example: Only chunks with a similarity score above 0.5 are kept as highly relevant
Generation
LLM generates contextualized response
Example: Based on memory: Q3 goals include a 20% efficiency increase
Output
Accurate, cited response delivered to user
Example: Response with source attribution
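The retrieval and generation steps above can be sketched in a few lines of Python. This is a toy illustration only: the `embed` function is a bag-of-words stand-in for a real embedding model, the `STOPWORDS` set, `retrieve`, and `build_prompt` are hypothetical names, and the in-memory list replaces a real vector store. Only the 0.5 score threshold and the sample memory come from the workflow described above.

```python
import math
import re
from collections import Counter

# Assumption: a tiny stopword list so the toy embedding focuses on content words.
STOPWORDS = {"what", "are", "our", "the", "a", "an", "by", "on", "is", "of"}


def embed(text):
    """Toy stand-in for an embedding model: counts of non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, memories, threshold=0.5, top_k=3):
    """Keep only memories scoring above the threshold, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(m["text"])), m) for m in memories]
    hits = [(score, m) for score, m in scored if score > threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:top_k]


def build_prompt(query, hits):
    """Number each retrieved chunk so the LLM can cite its sources."""
    context = "\n".join(
        f"[{i}] {m['text']} (source: {m['source']})" for i, (_, m) in enumerate(hits, 1)
    )
    return (
        "Answer using only the context and cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


memories = [
    {"text": "Q3 goals: Increase efficiency by 20%", "source": "chat"},
    {"text": "Office party on Friday", "source": "dashboard"},
]
hits = retrieve("What are our Q3 goals?", memories)
prompt = build_prompt("What are our Q3 goals?", hits)
```

The resulting `prompt` would then be sent to whichever LLM handles generation. Numbering the context entries is one simple way to let the response carry source attribution.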
Unique Optimizations
Advanced techniques that make Hana's RAG pipeline reliable, efficient, and enterprise-ready
Multi-Stage RAG
Initial semantic search followed by re-ranking for better precision.
Auto-Resync
Daily automatic updates keep data fresh without manual intervention.
Deep Integrations
Real-time data from Google Workspace, Stability.ai, and more.
Efficiency Optimized
Batch processing, metadata grouping, and score thresholds reduce costs.
Enterprise Ready
CASA Tier-2 and HIPAA compliant, with no data used for external training.
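The multi-stage approach listed above (a broad semantic search followed by re-ranking) might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the two-dimensional vectors are made up, and `term_coverage` is a deliberately simple stand-in for the more precise scoring model (for example, a cross-encoder) that a production re-ranker would likely use.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def first_stage(query_vec, docs, k=3):
    """Stage 1: cheap vector similarity to collect a broad candidate set."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vector"]), reverse=True)
    return ranked[:k]


def term_coverage(query, text):
    """Toy re-ranking score: fraction of query terms present in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def rerank(query, candidates, score_fn=term_coverage, top_n=2):
    """Stage 2: re-score only the candidates with a more precise function."""
    ranked = sorted(candidates, key=lambda d: score_fn(query, d["text"]), reverse=True)
    return ranked[:top_n]


# Made-up documents and vectors, purely for demonstration.
docs = [
    {"id": "a", "text": "q3 goals increase efficiency", "vector": [0.9, 0.1]},
    {"id": "b", "text": "q3 budget review", "vector": [1.0, 0.0]},
    {"id": "c", "text": "holiday schedule", "vector": [0.0, 1.0]},
]
candidates = first_stage([1.0, 0.0], docs, k=2)
final = rerank("q3 goals", candidates, top_n=1)
```

In this example the first stage ranks document `b` slightly ahead of `a`, and the re-ranker then promotes `a` because it covers both query terms. That is the precision gain the second stage is meant to provide, while the expensive scoring runs only on a small candidate set.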