Chatbot Integration¶
Obelisk includes integration with Ollama and Open WebUI to provide AI-powered chat capabilities directly within your documentation site. This section describes how to set up and use these features.
Overview¶
The chatbot integration consists of three key components:
- Ollama: A lightweight, local AI model server that runs models like Llama2, Mistral, and others.
- Open WebUI: A web interface for interacting with the AI models served by Ollama.
- RAG System: A Retrieval Augmented Generation system that enhances responses with content from your documentation.
Together, these services provide a complete AI chat experience directly connected to your documentation content, so answers are accurate and contextually relevant.
How It Works¶
The chatbot integration uses Docker Compose to orchestrate the services:
```mermaid
graph TD
    User[User] --> WebUI[Open WebUI]
    WebUI --> Ollama[Ollama Model Server]
    WebUI --> RAG[RAG System]
    RAG --> Ollama
    RAG --> VectorDB[(Vector Database)]
    Ollama --> Models[(AI Models)]
    WebUI --> Config[(Configuration)]
    User --> Obelisk[Obelisk Docs]
    Obelisk --> DocContent[(Documentation Content)]
    DocContent --> RAG
```
For a comprehensive view of all components and their interactions, see the complete architecture diagram.
- Users interact with the Open WebUI interface at http://localhost:8080
- Queries can be processed either directly by Ollama or through the RAG system
- When using RAG, the system retrieves relevant content from your documentation
- Ollama loads and runs AI models to generate responses enhanced with your content
- The Obelisk documentation server runs independently at http://localhost:8000
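As a quick sanity check of this flow, you can query Ollama's generate endpoint directly from the host, bypassing Open WebUI and the RAG layer. This is a minimal sketch; it assumes a model such as mistral has already been pulled (see Available Models below):

```bash
# Query Ollama directly on its published port; the RAG system is not involved here
curl http://localhost:11434/api/generate \
  -d '{"model": "mistral", "prompt": "What is Obelisk?", "stream": false}'
```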
Services Configuration¶
Ollama Service¶
The Ollama service runs the model server with GPU acceleration:
```yaml
ollama:
  container_name: ollama
  image: ollama/ollama:latest
  runtime: nvidia
  environment:
    - NVIDIA_VISIBLE_DEVICES=all
    - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    - CUDA_VISIBLE_DEVICES=0
    - LOG_LEVEL=debug
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            capabilities: [gpu]
            count: all
  volumes:
    - ollama:/root/.ollama
    - models:/models
  ports:
    - "11434:11434"
  networks:
    - ollama-net
  restart: unless-stopped
```
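Once the stack is running, you can confirm the service is reachable and that GPU passthrough works. This is a sketch that reuses the container name and published port from the configuration above:

```bash
# List models currently available to Ollama (also confirms the API is up)
curl http://localhost:11434/api/tags

# Confirm the container can see the GPU (requires the NVIDIA Container Toolkit on the host)
docker exec -it ollama nvidia-smi
```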
Open WebUI Service¶
The Open WebUI service provides the chat interface:
```yaml
open-webui:
  container_name: open-webui
  image: ghcr.io/open-webui/open-webui:main
  environment:
    - MODEL_DOWNLOAD_DIR=/models
    - OLLAMA_API_BASE_URL=http://ollama:11434
    - OLLAMA_API_URL=http://ollama:11434
    - LOG_LEVEL=debug
  volumes:
    - data:/data
    - models:/models
    - open-webui:/config
  ports:
    - "8080:8080"
  depends_on:
    - ollama
  networks:
    - ollama-net
  restart: unless-stopped
```
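A quick way to check that the interface is serving and connecting to the Ollama backend (a sketch; the container name and port follow the configuration above):

```bash
# Confirm the web interface answers on the published port
curl -I http://localhost:8080

# Look for backend connection errors against http://ollama:11434 in the logs
docker logs open-webui --tail 50
```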
Getting Started¶
To start using the chatbot integration:
- Ensure you have Docker and Docker Compose installed
- For GPU acceleration, install the NVIDIA Container Toolkit
- Start the full stack with Docker Compose (see the example after this list)
- Access the chat interface at http://localhost:8080
- Access your documentation at http://localhost:8000
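A minimal sketch of bringing the stack up, assuming the Compose file defining the ollama and open-webui services is in the current directory:

```bash
# Start Ollama, Open WebUI, and supporting services in the background
docker compose up -d

# Confirm the containers are running and watch their logs
docker compose ps
docker compose logs -f ollama open-webui
```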
Available Models¶
By default, no models are pre-loaded. You can pull models through the Open WebUI interface or directly via Ollama commands:
```bash
# Connect to the Ollama container
docker exec -it ollama bash

# Pull a model (example: mistral)
ollama pull mistral
```
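After the pull completes, you can confirm the model is available (run inside the container, as in the step above):

```bash
# List the models Ollama has downloaded locally
ollama list
```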
Popular models to consider:
- llama2 - Meta's Llama 2 model
- mistral - Mistral AI's 7B model
- phi - Microsoft's Phi model
- gemma - Google's Gemma model
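Models can also be pulled without an interactive shell by calling Ollama's HTTP API on the published port; a hedged sketch:

```bash
# Ask Ollama to download the mistral model via its HTTP API
curl http://localhost:11434/api/pull -d '{"name": "mistral"}'
```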
Customizing the Chat Experience¶
You can customize the chat experience by:
- Configuring Open WebUI settings through the interface
- Creating custom model configurations
- Using the RAG system to enhance responses with your documentation
- Customizing the RAG system parameters for better retrieval
See the Open WebUI documentation, Ollama documentation, and our RAG documentation for more details.
RAG System Integration¶
The Retrieval Augmented Generation (RAG) system enhances your chatbot with knowledge from your documentation:
- Index your documentation:
- Start the RAG API server:
- In Open WebUI, add a new API-based model:
  - Name: "Obelisk RAG"
  - Base URL: "http://localhost:8000"
  - API Path: "/query"
  - Request Format: {"query": "{prompt}"}
  - Response Path: "response"
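Before wiring the model into Open WebUI, it can help to test the endpoint directly. This sketch assumes the RAG API server is already running on port 8000 and accepts a JSON POST matching the request format above:

```bash
# Send a test query to the RAG API; the answer is returned in the "response" field
curl -s -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I configure the Ollama service?"}'
```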
For detailed instructions on setting up and using the RAG system, see the RAG Getting Started Guide and Using RAG.