Person Identification and Tracking
Cognitive Companion uses GPU-accelerated face recognition to identify household members across camera feeds, then fuses those detections with Home Assistant presence sensors for whole-house location tracking.
Face Recognition
The person identification system runs as a companion microservice using InsightFace (buffalo_l model pack) with ArcFace 512-dimensional embeddings.
Enrollment
Upload 5-10 reference photos per person through the admin UI (Members & Enrollment page) or via the API. No model fine-tuning is needed because ArcFace generalizes from pretrained weights.
Enrolling from the Admin UI:
- Go to Members & Enrollment in the admin console
- Click the face-recognition icon next to a member
- Upload reference photos (drag-and-drop or file picker)
- The backend proxies the images to the person-ID service, which extracts face embeddings
Best practices for reference photos:
- Use photos with varied lighting conditions
- Include different angles (frontal, 3/4 profile)
- Ensure the face is clearly visible and unobstructed
- Avoid group photos; use one person per reference image
- Photos from deployment cameras give the best domain match
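Enrollment can also be scripted against the POST /persons/{id}/enroll endpoint listed under Face Enrollment below. The sketch that follows is a minimal example under assumptions: the base URL, the /api/v1 prefix, the multipart field name "files", and the absence of an auth header are deployment guesses, not documented details.

```python
"""Minimal enrollment sketch: upload reference photos for one member.

Assumptions (not confirmed by this page): base URL, /api/v1 prefix,
multipart field name "files", and no authentication.
"""
from pathlib import Path

import requests

BASE_URL = "http://localhost:8000/api/v1"    # assumed backend address
PERSON_ID = "grandma"                        # hypothetical member id
PHOTO_DIR = Path("enrollment_photos/grandma")


def enroll_member(person_id: str, photo_dir: Path) -> dict:
    # Collect up to 10 clear, single-person reference photos (see best practices above).
    photos = sorted(photo_dir.glob("*.jpg"))[:10]
    files = [("files", (p.name, p.read_bytes(), "image/jpeg")) for p in photos]
    # The backend proxies the images to the person-ID service,
    # which extracts ArcFace embeddings for later matching.
    resp = requests.post(f"{BASE_URL}/persons/{person_id}/enroll", files=files, timeout=60)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(enroll_member(PERSON_ID, PHOTO_DIR))
```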
Identification Pipeline
The v2 backend sends batched frames to the person-ID service:
- Camera uploads a frame via POST /api/v1/device/recamera
- The event aggregator batches frames (configurable size/window)
- When a person_identification pipeline step executes, it sends the batch to POST /api/v1/identify-batch (see the sketch after this list)
- The service returns per-frame face detections with:
- Identity: matched person name or "unknown"
- Confidence: similarity score (0.0 to 1.0)
- Bounding box: face location in the frame [x1, y1, x2, y2]
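A minimal client-side sketch of that identify-batch call is shown below. The endpoint path comes from this page; the payload and response keys ("images", "results", "detections", "identity", "confidence", "bbox") and the service address are assumptions for illustration, not the confirmed wire format.

```python
"""Illustrative call to the person-ID service's batch endpoint.

Only the /api/v1/identify-batch path is documented here; the JSON field
names and the service URL are assumptions.
"""
import base64
from pathlib import Path

import requests

PERSON_ID_URL = "http://localhost:9000"  # assumed person-ID service address


def identify_batch(frame_paths: list[Path]) -> dict:
    # Encode each frame as base64 so the whole batch travels in one JSON body.
    images = [base64.b64encode(p.read_bytes()).decode() for p in frame_paths]
    resp = requests.post(
        f"{PERSON_ID_URL}/api/v1/identify-batch",
        json={"images": images},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


def summarize(result: dict) -> None:
    # Each frame carries zero or more face detections.
    for frame in result.get("results", []):
        for det in frame.get("detections", []):
            name = det.get("identity", "unknown")   # "unknown" when below threshold
            conf = det.get("confidence", 0.0)       # similarity score, 0.0-1.0
            x1, y1, x2, y2 = det.get("bbox", (0, 0, 0, 0))
            print(f"{name} ({conf:.2f}) at [{x1}, {y1}, {x2}, {y2}]")
```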
Annotated Images
When include_annotated_image is enabled in a pipeline step's config, the person-ID service returns a copy of each frame with bounding boxes and name labels drawn over detected faces. These annotated images are:
- Stored in pipeline_data under the annotated_images key
- Available for forwarding to downstream notification steps
- Useful for visual confirmation in Telegram and WebSocket alerts
Confidence Thresholds
The recognition threshold is configured in the person-ID service's config/settings.yaml under recognition.threshold (default 0.4). Detections below this threshold are reported as "unknown". Override it at deployment time with the RECOGNITION_THRESHOLD environment variable.
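Conceptually, the threshold is applied to the similarity between a detected face's ArcFace embedding and the enrolled embeddings. The sketch below illustrates that decision with one reference vector per person and L2-normalized 512-dimensional embeddings; it is an illustration of the idea, not the service's actual code.

```python
"""Illustrative matching logic, not the person-ID service's implementation.

Assumes the query and enrolled embeddings are L2-normalized 512-d ArcFace
vectors, so cosine similarity reduces to a dot product. The real service may
keep several embeddings per person.
"""
import numpy as np

RECOGNITION_THRESHOLD = 0.4  # default for recognition.threshold in settings.yaml


def match_identity(query: np.ndarray, enrolled: dict[str, np.ndarray]) -> tuple[str, float]:
    best_name, best_score = "unknown", 0.0
    for name, reference in enrolled.items():
        score = float(np.dot(query, reference))  # cosine similarity for unit vectors
        if score > best_score:
            best_name, best_score = name, score
    # Detections below the threshold are reported as "unknown".
    if best_score < RECOGNITION_THRESHOLD:
        return "unknown", best_score
    return best_name, best_score
```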
Guest Image Saving
When the save_guest_images flag is set to true on an identification request, the person-ID service saves the full frame image to disk whenever unidentified guests are detected. Images are organized by date under data/guests/:
```
data/guests/
├── 2026-03-23/
│   ├── 143022-123456_f0_2guests.jpg
│   └── 143022-234567_f1_1guests.jpg
└── ...
```

This is useful for:
- Reviewing visitors: see who has been at the door
- Building enrollment datasets: identify frequent visitors and enroll them
- Auditing false negatives: check if known members were misclassified as guests
The flag defaults to false and can be enabled per-request on both the single (/identify) and batch (/identify-batch) endpoints.
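A per-request sketch is shown below for the single-frame endpoint. The /identify path and the save_guest_images flag are documented above; the /api/v1 prefix, the multipart field name "image", and passing the flag as form data are assumptions about the request shape.

```python
"""Illustrative single-frame identification request with guest saving enabled.

The endpoint and flag names come from this page; the field names, URL prefix,
and service address are assumptions.
"""
import requests

PERSON_ID_URL = "http://localhost:9000"  # assumed person-ID service address


def identify_frame(image_path: str) -> dict:
    with open(image_path, "rb") as f:
        resp = requests.post(
            f"{PERSON_ID_URL}/api/v1/identify",
            files={"image": (image_path, f, "image/jpeg")},
            # Save the full frame under data/guests/<date>/ when unknown faces appear.
            data={"save_guest_images": "true"},
            timeout=30,
        )
    resp.raise_for_status()
    return resp.json()
```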
Motion Direction Detection
Cross-frame centroid tracking classifies movement direction:
| Direction | Description |
|---|---|
| left-to-right | Moving across the frame from left to right |
| right-to-left | Moving across the frame from right to left |
| towards-camera | Face/body getting larger (approaching) |
| away-from-camera | Face/body getting smaller (leaving) |
| stationary | No significant movement between frames |
Use case: Door-mounted cameras can infer entering vs. leaving a room based on movement direction relative to camera placement.
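A simplified version of that classification is sketched below: compare the face centroid and bounding-box area across consecutive frames. The thresholds are arbitrary examples chosen for illustration, not the values used by the person-ID service.

```python
"""Illustrative cross-frame direction classifier, not the service's code.

A detection is a face bounding box [x1, y1, x2, y2] in pixels; the shift and
scale thresholds here are placeholder values.
"""


def classify_direction(prev_box, curr_box, frame_width: int,
                       min_shift: float = 0.05, min_scale: float = 0.15) -> str:
    # Horizontal centroid shift, as a fraction of frame width.
    prev_cx = (prev_box[0] + prev_box[2]) / 2
    curr_cx = (curr_box[0] + curr_box[2]) / 2
    dx = (curr_cx - prev_cx) / frame_width

    def area(box) -> float:
        return max(box[2] - box[0], 1) * max(box[3] - box[1], 1)

    # Relative change in box area approximates approach/retreat.
    growth = (area(curr_box) - area(prev_box)) / area(prev_box)

    if dx > min_shift:
        return "left-to-right"
    if dx < -min_shift:
        return "right-to-left"
    if growth > min_scale:
        return "towards-camera"
    if growth < -min_scale:
        return "away-from-camera"
    return "stationary"
```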
Camera Topology
Raw motion directions carry no room-level meaning on their own. Camera topology maps each raw direction to a semantic transition for a specific camera, so a rule can fire when someone enters the kitchen rather than just when motion is detected.
Configuration
Add a movement_map to a sensor's config_json in the admin UI or via PUT /api/v1/sensors/{id}:
```json
{
  "movement_map": {
    "left-to-right": "entering",
    "right-to-left": "exiting",
    "towards-camera": "approaching_exit",
    "away-from-camera": "entering_depth",
    "stationary": "stationary"
  }
}
```

Valid semantic values: entering, exiting, approaching_exit, entering_depth, stationary. Any raw direction not present in the map is ignored.
How it works
When a person_identification step runs and raw motion direction data is available from the person-ID service, infer_room_transition() in backend/services/camera_topology.py looks up the direction in the sensor's movement_map and returns a frozen RoomTransition dataclass:
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoomTransition:
    person_id: str
    person_name: str
    sensor_id: str
    direction_raw: str   # e.g. "left-to-right"
    semantic: str        # e.g. "entering"
    from_room_id: str | None
    from_room_name: str | None
    to_room_id: str | None
    to_room_name: str | None
    confidence: float
```

Each computed transition is written to PersonLocationHistory with the direction_semantic field populated. The person_identification step also writes transitions to pipeline_data["room_transitions"] for use in downstream steps and notification templates.
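The lookup itself is a small mapping step. The sketch below shows the general shape of what the movement_map resolution does, with simplified inputs and names; it is illustrative, not a copy of backend/services/camera_topology.py.

```python
"""Illustrative movement_map lookup, simplified from the real backend helper."""

# The documented semantic values a movement_map may produce.
VALID_SEMANTICS = {"entering", "exiting", "approaching_exit", "entering_depth", "stationary"}


def infer_semantic(direction_raw: str, movement_map: dict[str, str]) -> str | None:
    # Raw directions not present in the map are ignored (no transition emitted).
    semantic = movement_map.get(direction_raw)
    if semantic not in VALID_SEMANTICS:
        return None
    return semantic


# Example: a doorway camera mapped so left-to-right means entering the room.
movement_map = {"left-to-right": "entering", "right-to-left": "exiting"}
print(infer_semantic("left-to-right", movement_map))   # -> "entering"
print(infer_semantic("towards-camera", movement_map))  # -> None (not mapped)
```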
Room Transition filter
The room_transition context filter lets rules fire only when a person makes a specific type of transition. Configure it on a rule alongside other context filters:
| Field | Type | Description |
|---|---|---|
| person_id | str (required) | Person whose transitions to watch |
| semantic | str (optional) | Semantic transition type to match (e.g. "entering") |
| to_room_name | str (optional) | Destination room name (case-insensitive) |
| from_room_name | str (optional) | Origin room name (case-insensitive) |
| within_minutes | int (default 5) | Lookback window for recent transitions |
Example: Fire a reminder rule only when grandma enters the kitchen:
```yaml
context_type: room_transition
config:
  person_id: grandma
  semantic: entering
  to_room_name: Kitchen
  within_minutes: 2
```

This filter is evaluated against PersonLocationHistory records, so it works correctly across both direct camera detections and HA-sensor-inferred transitions.
Whole-House Location Tracking
The PersonTrackingService maintains a real-time location state for each household member by fusing two data sources:
Camera Detections
When a person is identified by a camera, their location is updated to the room where that camera is installed. This is the primary, high-confidence location signal.
Home Assistant Presence Sensors
For rooms without cameras (e.g., bathrooms), HA presence sensors (PIR/mmWave) provide occupancy data. The tracking service correlates presence sensor activations with the most recent camera sighting:
- A person is last seen by a camera in the hallway
- The bathroom presence sensor activates
- The tracking service infers that person is now in the bathroom
- When the bathroom sensor deactivates and a camera picks them up again, the location updates
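A minimal sketch of that correlation is shown below. The entity id, the recent-sighting window, and the return shape are invented stand-ins used only to show the inference order; the real PersonTrackingService also handles staleness, multiple people, and deactivation.

```python
"""Illustrative camera/HA-sensor fusion, not the actual PersonTrackingService.

The sensor-to-room mapping, time window, and dict shapes are assumptions.
"""
from datetime import datetime, timedelta, timezone

SENSOR_ROOMS = {"binary_sensor.bathroom_presence": "bathroom"}  # hypothetical entity id
RECENT_SIGHTING_WINDOW = timedelta(minutes=10)                  # assumed window


def on_presence_activated(entity_id: str, last_camera_sighting: dict) -> dict | None:
    """Return an updated location when a presence sensor turns on."""
    room = SENSOR_ROOMS.get(entity_id)
    if room is None:
        return None
    # Only attribute the activation to a person who was recently seen on camera.
    seen_at = last_camera_sighting["timestamp"]
    if datetime.now(timezone.utc) - seen_at > RECENT_SIGHTING_WINDOW:
        return None
    return {
        "person_id": last_camera_sighting["person_id"],
        "room_name": room,
        "source": "ha_sensor",  # lower-confidence than a direct camera detection
        "updated_at": datetime.now(timezone.utc),
    }
```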
Location State
Each person's current location is stored as a PersonLocationState record:
- Room name: current room
- Source: camera or ha_sensor (indicates confidence level)
- Last updated: timestamp of the most recent detection
- Stale timeout: locations older than the configured timeout (default from person_tracking.stale_timeout_minutes) are considered stale
Location History
Every location change creates a PersonLocationHistory entry, providing a full timeline of where a person has been throughout the day. Query via GET /api/v1/persons/{id}/history?hours=24.
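For example, the timeline for a single member can be fetched with a simple GET. The path and hours parameter are documented above; the base URL and response field layout are assumptions.

```python
"""Fetch a member's 24-hour location timeline from the backend."""
import requests

BASE_URL = "http://localhost:8000"  # assumed backend address


def location_history(person_id: str, hours: int = 24) -> list[dict]:
    resp = requests.get(
        f"{BASE_URL}/api/v1/persons/{person_id}/history",
        params={"hours": hours},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    for entry in location_history("grandma"):
        print(entry)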
Home Assistant Propagation
Person locations are pushed to Home Assistant input_text helpers:
```
input_text.cc_{person_id}_location = "kitchen"
```

This allows HA automations and dashboards to display person locations and use them in conditions.
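As an illustration, such a push could be performed with Home Assistant's standard REST service-call API, as sketched below. Whether Cognitive Companion uses the REST API or the websocket API is not specified here, and the URL and token are placeholders.

```python
"""Illustrative push of a person location into a Home Assistant input_text helper.

Uses HA's standard /api/services/input_text/set_value REST call; the address,
token, and transport choice are assumptions.
"""
import requests

HA_URL = "http://homeassistant.local:8123"  # assumed HA address
HA_TOKEN = "<long-lived-access-token>"       # create one in your HA profile


def push_location(person_id: str, room: str) -> None:
    entity_id = f"input_text.cc_{person_id}_location"
    resp = requests.post(
        f"{HA_URL}/api/services/input_text/set_value",
        headers={"Authorization": f"Bearer {HA_TOKEN}"},
        json={"entity_id": entity_id, "value": room},
        timeout=10,
    )
    resp.raise_for_status()


if __name__ == "__main__":
    push_location("grandma", "kitchen")  # sets input_text.cc_grandma_location
```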
API Endpoints
Member Management
| Method | Path | Description |
|---|---|---|
| GET | /persons | List all household members |
| POST | /persons | Register a new member |
| GET | /persons/{id} | Get member details |
| PATCH | /persons/{id} | Update a member |
| DELETE | /persons/{id} | Remove a member and their data |
Face Enrollment
| Method | Path | Description |
|---|---|---|
| GET | /persons/enrolled | List face enrollment status from person-ID service |
| POST | /persons/{id}/enroll | Upload reference photos to enroll a face (multipart) |
| GET | /persons/{id}/enrollment | Get enrollment details (embedding count, created date) |
| DELETE | /persons/{id}/enrollment | Remove face enrollment data |
Location Tracking
| Method | Path | Description |
|---|---|---|
| GET | /persons/locations | Current location of all tracked members |
| GET | /persons/{id}/location | Current location of a specific member |
| GET | /persons/{id}/history | Location timeline (?hours=24) |
| GET | /persons/{id}/sightings | Recent camera sightings (?limit=20) |
Activity Tracking
The activity_detection pipeline step records detected activities for tracked persons:
| Activity Type | Description |
|---|---|
| eating | Person detected eating a meal |
| sleeping | Person detected sleeping or resting |
| medication | Person detected taking medication |
Activities are recorded as PersonActivity records and can be used as context filters in downstream rules. For example, a lunch reminder rule can check whether an eating activity was recently recorded before sending a reminder.
Query activities via GET /api/v1/activities?person_id=...&activity_type=....
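A rule or script can check for a recent activity with a single query, sketched below. The path and query parameters are documented above; the base URL and response shape are assumptions.

```python
"""Check whether an eating activity was recorded for a member recently."""
import requests

BASE_URL = "http://localhost:8000"  # assumed backend address


def recent_activities(person_id: str, activity_type: str) -> list[dict]:
    resp = requests.get(
        f"{BASE_URL}/api/v1/activities",
        params={"person_id": person_id, "activity_type": activity_type},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    # A lunch-reminder rule might skip the reminder when this list is non-empty.
    print(recent_activities("grandma", "eating"))
```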
Person Tracking vs Activity Tracking
These two systems are independent. Person tracking identifies who is where using face recognition cameras and presence sensors. Activity tracking identifies what a person is doing using vision and logic LLMs. A rule can use both as separate context filters: for example, "only trigger when grandma is home in the kitchen AND no eating activity was recorded in the last 30 minutes."
Context Filters for Rules
Person tracking and activity data are available as rule context filters, allowing rules to fire only when specific presence or activity conditions are met.
Person Presence Filter
The person_presence filter checks whether a person is home, away, or in a specific room.
| Config Field | Type | Description |
|---|---|---|
| person_id | string (required) | The household member to check |
| status | string | home, away, or unknown (default: home) |
| room_name | string | Optional room name; only applies when status is home |
Examples:
- Person is home (any room): person_id: "grandma", status: "home"
- Person is in a specific room: person_id: "grandma", status: "home", room_name: "Kitchen"
- Person is away: person_id: "grandma", status: "away"
The filter checks the PersonLocationState table, which is continuously updated by camera detections and HA sensor correlation. Locations older than the configured stale timeout (30 minutes by default) are treated as away.
Person Activity Filter
The person_activity filter checks whether a person performed a specific activity within a time window.
| Config Field | Type | Description |
|---|---|---|
| person_id | string (required) | The household member to check |
| activity_type | string (required) | Activity to look for (e.g. eating, medication) |
| within_minutes | number | Time window to search (default: 30) |
Multi-Camera Room Mapping
Each camera sensor is associated with a room. When a face is detected on any camera, the person's location is updated to that camera's room. This enables room-level presence tracking across the house:
- Place face-level cameras in each room for identification
- Configure each camera sensor with the correct room assignment
- The PersonTrackingService fuses all camera detections into a single location state per person
- Cameras that cannot identify a person (top-down, rear-facing) can still be used for vision analysis; rules should reference the person's last known location from other cameras via the person_presence context filter
For doorway cameras that capture motion direction, the include_motion flag on the person_identification step provides left-to-right, right-to-left, towards-camera, and away-from-camera labels that downstream logic steps can use to infer room transitions.
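For reference, such a step might be configured as in the sketch below. Only the include_motion and include_annotated_image flags are documented on this page; the surrounding step structure is an assumption about the pipeline configuration schema.

```python
# Hypothetical person_identification step config for a doorway camera.
# The "step"/"config" wrapper is an assumed shape, not the documented schema.
person_identification_step = {
    "step": "person_identification",
    "config": {
        "include_motion": True,           # emit raw direction labels per detected person
        "include_annotated_image": True,  # attach frames with boxes and name labels
    },
}
```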