# Obelisk RAG Documentation

This document serves as the central reference for Obelisk's Retrieval Augmented Generation (RAG) implementation. It provides an overview of components, configuration options, and integration points, while directing readers to specific documentation files for detailed information.

## Current Implementation

The Obelisk RAG system provides document indexing, vector storage, and query capabilities through a modular architecture with the following key components:

### Core Components

| Component | Documentation | Purpose |
| --- | --- | --- |
| Document Processor | implementation.md | Handles markdown parsing, chunking, and metadata extraction |
| Embedding Service | implementation.md | Generates vector embeddings using Ollama |
| Vector Storage | vector-database.md | Stores and retrieves document vectors using ChromaDB |
| Query Pipeline | query-pipeline.md | Processes queries to retrieve relevant context |
| RAG Service | implementation.md | Central service orchestrating all RAG components |
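The table above can be read as a data flow: the RAG Service wires the other components together. A minimal sketch of that orchestration is shown below; the class and method names are entirely hypothetical and not Obelisk's actual interfaces:

```python
# Illustrative sketch of how the RAG components fit together.
# All names here are hypothetical, not Obelisk's actual API.

class RAGService:
    """Orchestrates the processor, embedder, vector store, and LLM."""

    def __init__(self, processor, embedder, store, llm):
        self.processor = processor   # Document Processor
        self.embedder = embedder     # Embedding Service
        self.store = store           # Vector Storage
        self.llm = llm               # Generation backend (e.g. Ollama)

    def index(self, path):
        # Parse and chunk a markdown file, then embed and store each chunk.
        for chunk in self.processor.chunk(path):
            vector = self.embedder.embed(chunk.text)
            self.store.add(vector, chunk.text, chunk.metadata)

    def query(self, question, top_k=3):
        # Embed the query, retrieve context, and generate an answer.
        query_vector = self.embedder.embed(question)
        context = self.store.search(query_vector, top_k)
        return self.llm.generate(question, context)
```

The point of the sketch is the separation of concerns: each component can be swapped (for example, a different vector backend) without touching the others.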

### APIs & Integration

| Feature | Documentation | Status |
| --- | --- | --- |
| OpenAI-compatible API | using-rag.md | ✅ Implemented |
| Ollama Integration | ollama-integration.md | ✅ Implemented |
| OpenWebUI Integration | openwebui-integration.md | ✅ Implemented |
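Because the API is OpenAI-compatible, clients interact with it using standard chat-completion requests. The helper below only builds such a payload; the function name, default model, and example endpoint URL are illustrative assumptions, not taken from Obelisk's code:

```python
import json

def build_chat_request(question, model="llama3"):
    """Build an OpenAI-style chat completion payload.

    The payload shape follows the standard OpenAI chat completions
    format; the model name depends on your deployment.
    """
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": question},
        ],
    }

# POST this as JSON to the service's OpenAI-compatible endpoint,
# e.g. http://localhost:8000/v1/chat/completions (host/port per config).
payload = json.dumps(build_chat_request("What is Obelisk?"))
```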

## User Guides

## Vector Database Implementation

The current implementation uses ChromaDB for vector storage. The development roadmap includes a potential future migration to Milvus Lite for improved performance and scalability.

| Database | Status | Documentation |
| --- | --- | --- |
| ChromaDB | Current Implementation | vector-database.md |
| Milvus Lite | Future Roadmap | vector-database.md |
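Whichever backend is used, retrieval reduces to nearest-neighbor search over embedding vectors. The pure-Python sketch below illustrates the idea with cosine similarity; a real vector database such as ChromaDB does this with optimized indexes, and these function names are illustrative only:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, store, k=3):
    """Return the texts of the k stored (text, vector) pairs most
    similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

In the real system the vectors are 1024-dimensional mxbai-embed-large embeddings and `k` corresponds to the RETRIEVE_TOP_K setting.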

## Documentation Progress

Based on the analysis in task.md, we've made significant improvements to the documentation:

### Completed Tasks

#### High Priority

- Consolidated the API to use only the OpenAI-compatible endpoint
- Created comprehensive openwebui-integration.md documentation
- Updated vector database documentation to reflect the ChromaDB implementation
- Fixed WebUI integration inconsistencies

#### Medium Priority

- Updated API documentation with the OpenAI-compatible format
- Fixed environment variable inconsistencies (OLLAMA_URL vs. OLLAMA_API_URL)
- Added detailed document chunking documentation to implementation.md
- Documented error handling strategies and debug mode

### All Tasks Completed ✅

#### Low Priority Items

- ✅ Document the discrepancy between documented embedding batching and the implementation
- ✅ Update roadmap documentation in architecture-draft.md to clearly mark implemented vs. planned features
- ✅ Add debug mode documentation
- ✅ Improve troubleshooting guidance with more examples in using-rag.md

## Documentation Maintenance Guidelines

When updating the RAG documentation, follow these principles:

  1. Accuracy: Documentation must precisely match the implemented code
  2. Completeness: Cover all aspects including setup, configuration, usage, and APIs
  3. Examples: Include working code samples for key functionality
  4. Future Direction: Clearly label future roadmap items vs. current implementation

### Document Update Matrix

| When changing... | Update these docs |
| --- | --- |
| API endpoints | using-rag.md |
| Configuration options | using-rag.md, getting-started.md |
| Vector database | vector-database.md, implementation.md |
| Query pipeline | query-pipeline.md, implementation.md |
| OpenWebUI integration | openwebui-integration.md |
| Ollama integration | ollama-integration.md |

Architecture documentation and detailed implementation information are maintained in the separate files referenced above.

## Implementation Notes

The current RAG implementation includes:

1. Document Processing
   - Markdown parsing with YAML frontmatter extraction
   - RecursiveCharacterTextSplitter for document chunking
   - Real-time file watching with the watchdog library

2. Embedding Generation
   - Ollama integration using the mxbai-embed-large model
   - 1024-dimension vector embeddings
   - Support for both document and query embedding

3. Vector Storage
   - ChromaDB for vector database storage
   - Metadata filtering for search refinement
   - Local persistence on disk

4. Query Processing
   - Top-k similarity search with configurable parameters
   - Contextual prompt construction
   - LLM response generation via Ollama

5. API & Integration
   - OpenAI-compatible API endpoint for tool integration
   - Source information included in API responses
   - Docker containerization for deployment
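The query-processing step above hinges on assembling retrieved chunks into a contextual prompt before calling the LLM. A hypothetical template is sketched below; Obelisk's actual prompt format is not specified in this document:

```python
def build_prompt(question, chunks):
    """Assemble retrieved chunks and the user question into an LLM prompt.

    The template here is illustrative only; the real prompt format
    used by the RAG service may differ.
    """
    # Number each chunk so the answer can cite its sources.
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering the chunks also makes it straightforward to include source information in API responses, as noted above.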

## Configuration Reference

The RAG system can be configured through environment variables:

| Variable | Description | Default |
| --- | --- | --- |
| `VAULT_DIR` | Directory containing markdown files | /app/vault |
| `CHROMA_DIR` | Directory for the vector database | /app/data/chroma_db |
| `OLLAMA_URL` | URL of the Ollama service | http://ollama:11434 |
| `OLLAMA_MODEL` | Ollama model for generation | llama3 |
| `EMBEDDING_MODEL` | Model for embeddings | mxbai-embed-large |
| `RETRIEVE_TOP_K` | Number of document chunks to retrieve | 3 |
| `API_HOST` | Host to bind the API server | 0.0.0.0 |
| `API_PORT` | Port for the API server | 8000 |
| `LOG_LEVEL` | Logging level | INFO |
| `RAG_DEBUG` | Enable debug mode | false |
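A typical way to consume these variables looks like the sketch below, with the defaults from the table baked in. The `load_config` helper is illustrative, not Obelisk's actual code:

```python
import os

# Defaults mirror the configuration reference table above.
DEFAULTS = {
    "VAULT_DIR": "/app/vault",
    "CHROMA_DIR": "/app/data/chroma_db",
    "OLLAMA_URL": "http://ollama:11434",
    "OLLAMA_MODEL": "llama3",
    "EMBEDDING_MODEL": "mxbai-embed-large",
    "RETRIEVE_TOP_K": "3",
    "API_HOST": "0.0.0.0",
    "API_PORT": "8000",
    "LOG_LEVEL": "INFO",
    "RAG_DEBUG": "false",
}

def load_config():
    """Read each setting from the environment, falling back to its default."""
    cfg = {key: os.environ.get(key, default) for key, default in DEFAULTS.items()}
    # Coerce non-string settings to their natural types.
    cfg["RETRIEVE_TOP_K"] = int(cfg["RETRIEVE_TOP_K"])
    cfg["API_PORT"] = int(cfg["API_PORT"])
    cfg["RAG_DEBUG"] = cfg["RAG_DEBUG"].lower() in ("1", "true", "yes")
    return cfg
```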