# Quick Start
This guide walks you through getting Cognitive Companion running on your local network.
## Prerequisites
| Component | Purpose | Notes |
|---|---|---|
| NVIDIA GPU (10 GB+ VRAM) | Person-ID service + vLLM + Ollama | RTX 3060 or better |
| Docker + NVIDIA Container Toolkit | Container runtime | For all services |
| Home Assistant | Sensor integration, audio playback, actions | REST API + long-lived token |
| MinIO (or S3-compatible) | Media object storage | Pre-signed URL support required |
| vLLM | Vision + translation model serving | Cosmos-Reason2-8B, TranslateGemma-12b |
| Ollama | Logic reasoning model | gemma3:4b |
| Python 3.11+ | Backend runtime | 3.12 recommended |
| uv | Python package manager | For local development |
| Node.js 18+ | Frontend build | For admin console, websocket audio interface |
### Optional Components
| Component | Purpose |
|---|---|
| Telegram Bot | Caregiver alert notifications |
| Google Gemini API | Real-time voice conversations |
| TTS service | Text-to-speech announcements |
## Step 1: Configure Environment
```bash
git clone https://github.com/SilverMind-Project/cognitive-companion.git
cd cognitive-companion
cp .env.example .env
```

Edit `.env` with your service URLs and API keys:
```bash
# LLM Providers
VISION_MODEL_URL=http://localhost:8001/v1
TRANSLATE_MODEL_URL=http://localhost:8002/v1
LOGIC_MODEL_URL=http://localhost:11434

# Home Assistant
HOME_ASSISTANT_URL=http://homeassistant.local:8123
HOME_ASSISTANT_TOKEN=your_long_lived_access_token

# Object Storage
MINIO_ENDPOINT=localhost:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin

# Person Identification
PERSON_ID_SERVICE_URL=http://localhost:8100

# Authentication
CC_ADMIN_API_KEY=your_admin_key
CC_CAREGIVER_API_KEY=your_caregiver_key
CC_MCP_API_KEY=your_mcp_key
```

Review `config/settings.yaml` for application behavior: event aggregation windows, LLM model names, polling intervals, and more. See Configuration for a full reference.
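A missing variable often surfaces only as a confusing connection error later, so it can help to sanity-check the environment before starting the stack. This is a minimal sketch, not part of the project; the variable names come from the `.env` example above:

```python
import os

# Required settings, taken from the .env example above.
REQUIRED_VARS = [
    "VISION_MODEL_URL",
    "TRANSLATE_MODEL_URL",
    "LOGIC_MODEL_URL",
    "HOME_ASSISTANT_URL",
    "HOME_ASSISTANT_TOKEN",
    "MINIO_ENDPOINT",
    "MINIO_ACCESS_KEY",
    "MINIO_SECRET_KEY",
    "PERSON_ID_SERVICE_URL",
]

def missing_vars(env: dict[str, str]) -> list[str]:
    """Return the required settings that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]

# Example: report anything missing before starting the stack.
missing = missing_vars(dict(os.environ))
print("missing:", missing or "none")
```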
## Step 2: Start All Services
### Option A: Docker Compose (recommended)
The fastest way to run the full stack. From the parent directory containing both repositories:
```bash
# Start backend, frontend, person-ID (GPU), and MinIO
docker compose up -d

# Verify
curl http://localhost:8000/api/v1/health   # Backend
curl http://localhost:8100/health          # Person-ID service
```

Docker Compose handles inter-service networking automatically. The backend connects to `person-id:8100` and `minio:9000` internally.
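The GPU services can take a while to load models, so a one-shot `curl` may fail even though everything is fine. A small polling helper (a sketch, not part of the project; the endpoint paths are the two shown above) waits until a health endpoint answers:

```python
import time
import urllib.error
import urllib.request

def wait_for(url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll a health endpoint until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not up yet; retry
        time.sleep(interval_s)
    return False

# Usage:
#   wait_for("http://localhost:8000/api/v1/health")   # backend
#   wait_for("http://localhost:8100/health")          # person-ID service
```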
> **Tip:** The person-ID service requires GPU access. Ensure the NVIDIA Container Toolkit is installed.
See Deployment for the full Docker Compose and Kubernetes reference.
### Option B: Run Services Individually
Start the Person Identification Service (GPU-accelerated face recognition):
```bash
cd ../person-identification-service
docker build -t person-id-service .
docker run --gpus all -p 8100:8100 -v ./data:/app/data person-id-service
```

See the Person Identification Service README for enrollment instructions and API documentation.
**Start the Backend:**
```bash
# With Docker
docker build -t cognitive-companion .
docker run -p 8000:8000 \
  -v ./data:/app/data \
  -v ./config:/app/config \
  --env-file .env \
  cognitive-companion

# Or for local development (requires uv: https://docs.astral.sh/uv/)
cd backend && uv sync --extra gemini && cd ..
uv run --directory backend uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload
```

The `gemini` extra installs the `google-genai` package for voice companion support. Omit it if you don't need real-time voice.
**Start the Frontend:**
```bash
cd frontend
npm install
npm run dev   # Development server at http://localhost:5173
```

For production, the frontend is containerized with nginx:
```bash
cd frontend
docker build -t cognitive-companion-ui .
docker run -p 80:80 cognitive-companion-ui
```

## Step 3: Initial Setup
- Open the admin console at `http://localhost:5173/admin` and set your admin API key in the settings
- Create rooms: define the physical spaces in your home (kitchen, bedroom, etc.)
- Register sensors: add cameras and presence sensors, assigning each to a room
- Enroll household members: go to Members & Enrollment, register each person, then click the face-recognition icon to upload 5-10 reference photos per person
- Create rules: use the visual pipeline builder to assemble step sequences
### Your First Rule
A basic camera monitoring rule might look like:
```
person_identification → llm_call (vision) → llm_call (reasoning) → notification
```

- Go to Rules → New Rule, enter a name, and click Create; you'll land on the rule detail page
- On the Settings tab, set the trigger type to `sensor_event` and bind it to a camera sensor
- Switch to the Pipeline tab and add steps from the palette in order:
  - Person Identification: identify who is in the frame
  - LLM Call (vision): describe what is happening
  - LLM Call (reasoning): decide if a notification is warranted
  - Translation: translate the message to Tamil (or your target language)
  - Notification: send the alert to configured channels
- Configure each step's settings in its config dialog
- Enable the rule and save
The rule will now execute whenever the bound camera sends an event. You can monitor execution in the Workflows view and inspect pipeline data in the Events log.
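One way to picture what the builder produces: a rule is a trigger plus an ordered list of steps, each with its own config. The sketch below is a hypothetical, simplified model for illustration only; the step names mirror this guide, not the project's actual rule schema:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    # Step kind as named in the palette (hypothetical identifiers).
    kind: str
    config: dict = field(default_factory=dict)

@dataclass
class Rule:
    name: str
    trigger: str          # e.g. "sensor_event"
    steps: list[Step]

    def step_kinds(self) -> list[str]:
        """Ordered list of step kinds, i.e. the pipeline shape."""
        return [s.kind for s in self.steps]

# The camera-monitoring rule assembled above, expressed in this model.
camera_rule = Rule(
    name="kitchen-monitor",
    trigger="sensor_event",
    steps=[
        Step("person_identification"),
        Step("llm_call", {"mode": "vision"}),
        Step("llm_call", {"mode": "reasoning"}),
        Step("translation", {"target_language": "Tamil"}),
        Step("notification", {"channels": ["telegram"]}),
    ],
)
```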
## What's Next
- Configuration Reference: All settings explained
- Architecture Deep Dive: How the system is designed
- Pipeline Step Types: All 10 step types explained
- Hardware Setup: Setting up cameras and displays