Quick Start

This guide walks you through getting Cognitive Companion running on your local network.

Prerequisites

| Component | Purpose | Notes |
| --- | --- | --- |
| NVIDIA GPU (10 GB+ VRAM) | Person-ID service + vLLM + Ollama | RTX 3060 or better |
| Docker + NVIDIA Container Toolkit | Container runtime | For all services |
| Home Assistant | Sensor integration, audio playback, actions | REST API + long-lived token |
| MinIO (or S3-compatible) | Media object storage | Pre-signed URL support required |
| vLLM | Vision + translation model serving | Cosmos-Reason2-8B, TranslateGemma-12b |
| Ollama | Logic reasoning model | gemma3:4b |
| Python 3.11+ | Backend runtime | 3.12 recommended |
| uv | Python package manager | For local development |
| Node.js 18+ | Frontend build | For admin console, websocket audio interface |

Optional Components

| Component | Purpose |
| --- | --- |
| Telegram Bot | Caregiver alert notifications |
| Google Gemini API | Real-time voice conversations |
| TTS service | Text-to-speech announcements |

Step 1: Configure Environment

```bash
git clone https://github.com/SilverMind-Project/cognitive-companion.git
cd cognitive-companion
cp .env.example .env
```

Edit .env with your service URLs and API keys:

```bash
# LLM Providers
VISION_MODEL_URL=http://localhost:8001/v1
TRANSLATE_MODEL_URL=http://localhost:8002/v1
LOGIC_MODEL_URL=http://localhost:11434

# Home Assistant
HOME_ASSISTANT_URL=http://homeassistant.local:8123
HOME_ASSISTANT_TOKEN=your_long_lived_access_token

# Object Storage
MINIO_ENDPOINT=localhost:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin

# Person Identification
PERSON_ID_SERVICE_URL=http://localhost:8100

# Authentication
CC_ADMIN_API_KEY=your_admin_key
CC_CAREGIVER_API_KEY=your_caregiver_key
CC_MCP_API_KEY=your_mcp_key
```
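
A quick sanity check: with the token exported in your shell (the variable name below simply mirrors the .env entry), Home Assistant's REST API root confirms the credentials work:

```bash
# Should print {"message": "API running."} when the token is valid
curl -H "Authorization: Bearer $HOME_ASSISTANT_TOKEN" \
  http://homeassistant.local:8123/api/
```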

Review config/settings.yaml to tune application behavior: event aggregation windows, LLM model names, polling intervals, and more. See Configuration for a full reference.

Step 2: Start All Services

Option A: Docker Compose

This is the fastest way to run the full stack. From the parent directory containing both repositories:

```bash
# Start backend, frontend, person-ID (GPU), and MinIO
docker compose up -d

# Verify
curl http://localhost:8000/api/v1/health   # Backend
curl http://localhost:8100/health          # Person-ID service
```

Docker Compose handles inter-service networking automatically. The backend connects to person-id:8100 and minio:9000 internally.
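
If either health check fails, standard Compose tooling is the quickest way to diagnose the problem (the service names below assume they match those in the compose file):

```bash
# Show each service's state and health
docker compose ps

# Follow the logs of a single service, e.g. the backend
docker compose logs -f backend
```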

TIP

The person-ID service requires GPU access. Ensure the NVIDIA Container Toolkit is installed.
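
A minimal way to verify GPU passthrough before starting the stack (the CUDA image tag is only an example; any recent tag works):

```bash
# Should print the same GPU table as running nvidia-smi on the host
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```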

See Deployment for the full Docker Compose and Kubernetes reference.

Option B: Run Services Individually

Start the Person Identification Service (GPU-accelerated face recognition):

```bash
cd ../person-identification-service
docker build -t person-id-service .
docker run --gpus all -p 8100:8100 -v "$(pwd)/data:/app/data" person-id-service
```
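
Once the container is up, the same health endpoint from the Compose flow applies:

```bash
curl http://localhost:8100/health
```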

See the Person Identification Service README for enrollment instructions and API documentation.

Start the Backend:

```bash
# With Docker
docker build -t cognitive-companion .
docker run -p 8000:8000 \
  -v "$(pwd)/data:/app/data" \
  -v "$(pwd)/config:/app/config" \
  --env-file .env \
  cognitive-companion

# Or for local development (requires uv: https://docs.astral.sh/uv/)
cd backend && uv sync --extra gemini && cd ..
uv run --directory backend uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload
```

The gemini extra installs the google-genai package for voice companion support. Omit it if you don't need real-time voice.
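
Either way, confirm the backend responds before starting the frontend:

```bash
# Same health endpoint as in the Compose setup
curl http://localhost:8000/api/v1/health
```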

Start the Frontend:

```bash
cd frontend
npm install
npm run dev   # Development server at http://localhost:5173
```

For production, the frontend is containerized with nginx:

```bash
cd frontend
docker build -t cognitive-companion-ui .
docker run -p 80:80 cognitive-companion-ui
```
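
To confirm nginx is serving the built assets (port 80 as mapped above):

```bash
# Expect HTTP 200 with an HTML content type
curl -I http://localhost/
```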

Step 3: Initial Setup

  1. Open the admin console at http://localhost:5173/admin
  2. Set your admin API key in the settings
  3. Create rooms. Define the physical spaces in your home (kitchen, bedroom, etc.)
  4. Register sensors. Add cameras and presence sensors, assigning each to a room
  5. Enroll household members. Go to Members & Enrollment, register each person, then click the face-recognition icon to upload 5-10 reference photos per person
  6. Create rules. Use the visual pipeline builder to assemble step sequences

Your First Rule

A basic camera monitoring rule might look like:

```text
person_identification → llm_call (vision) → llm_call (reasoning) → notification
```

To build it:

  1. Go to Rules → New Rule, enter a name, and click Create; you'll land on the rule detail page
  2. On the Settings tab, set the trigger type to sensor_event and bind it to a camera sensor
  3. Switch to the Pipeline tab and add steps from the palette in order:
    • Person Identification: identify who is in the frame
    • LLM Call (vision): describe what is happening
    • LLM Call (reasoning): decide if a notification is warranted
    • Translation: translate the message to Tamil (or your target language)
    • Notification: send the alert to configured channels
  4. Configure each step's settings in its config dialog
  5. Enable the rule and save

The rule will now execute whenever the bound camera sends an event. You can monitor execution in the Workflows view and inspect pipeline data in the Events log.

What's Next

  • Configuration: full reference for config/settings.yaml and environment variables
  • Deployment: Docker Compose and Kubernetes reference