Mnemos AI
AI Architecture

How Mnemos reasons, without leaking.

An honest look at the agent orchestration, retrieval pipeline, memory graph, and permission enforcement that make Mnemos safe to put in front of your auditors.

Layer
Capture

Voice and text interviews flow through an agent that probes for edge cases. Diarized transcripts are written to durable storage and queued for extraction.

capture-service
extraction-service
Layer
Reasoning

Hybrid retrieval blends semantic, lexical, and graph hops. The agent loop is constrained to cite retrieved candidates and to declare uncertainty.

retrieval-service
graph-service
Layer
Memory

The memory graph is the durable substrate. Entities, edges, and freshness scores are projected from interview output and integration events.

graph-service
platform.event_outbox
Layer
Governance

ACLs, tenant isolation, audit lineage, and compliance controls live as first-class services. Every retrieval is filtered and logged.

identity-service
platform.audit_log
Agent orchestration

Constrained, observable, replayable.

Planner-executor split

The planner agent decomposes a question into retrieval steps. The executor runs them in parallel and merges results. Both run on a constrained tool surface.
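
As a rough illustration of the planner-executor split, the sketch below fans plan steps out in parallel and merges the results. It is not the Mnemos implementation: the tool names, the `Step` type, and the stub `plan` function are hypothetical stand-ins, and a real planner would be model-driven.

```python
import asyncio
from dataclasses import dataclass


# Hypothetical tools. A real deployment would call retrieval services here.
async def semantic_search(query: str) -> list[str]:
    return [f"semantic:{query}"]


async def lexical_search(query: str) -> list[str]:
    return [f"lexical:{query}"]


# The constrained tool surface: nothing outside this mapping can be called.
ALLOWED_TOOLS = {"semantic": semantic_search, "lexical": lexical_search}


@dataclass(frozen=True)
class Step:
    tool: str
    query: str


def plan(question: str) -> list[Step]:
    # Stand-in planner: one step per retrieval tool.
    return [Step("semantic", question), Step("lexical", question)]


async def execute(steps: list[Step]) -> list[str]:
    # Validate the whole plan before running anything.
    for step in steps:
        if step.tool not in ALLOWED_TOOLS:
            raise PermissionError(f"tool not on the allowed surface: {step.tool}")
    # Run every step concurrently; gather returns results in plan order.
    batches = await asyncio.gather(
        *(ALLOWED_TOOLS[s.tool](s.query) for s in steps)
    )
    # Merge, dropping duplicates while keeping the first occurrence.
    return list(dict.fromkeys(item for batch in batches for item in batch))
```

Calling `asyncio.run(execute(plan("refund policy")))` returns the merged list, and a plan that names an unregistered tool is rejected before any step runs.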

Persisted runs

Every plan, tool call, and intermediate result is written to platform.event_outbox. You can replay any session, step by step, for audit.
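
To make the replay idea concrete, here is a minimal sketch that uses an in-memory list in place of the outbox table. The `RunLog` class, the field names, and the event kinds are assumptions made for illustration, not the actual schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RunLog:
    """In-memory stand-in for an append-only outbox table."""

    rows: list[str] = field(default_factory=list)

    def append(self, session_id: str, kind: str, payload: dict) -> None:
        # Rows are only ever appended, so the position doubles as a sequence number.
        record = {
            "session_id": session_id,
            "seq": len(self.rows),
            "kind": kind,
            "payload": payload,
        }
        self.rows.append(json.dumps(record, sort_keys=True))

    def replay(self, session_id: str) -> list[dict]:
        # Return one session's events in the order they were written.
        events = (json.loads(row) for row in self.rows)
        return sorted(
            (e for e in events if e["session_id"] == session_id),
            key=lambda e: e["seq"],
        )
```

Because nothing is updated in place, replaying a session is a filter and a sort over the log, which is what makes an after-the-fact audit possible.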

Hot-swappable policy

Agent prompts and tools live in a versioned policy registry. A new policy is shadow-evaluated before going live.
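
One way to picture shadow evaluation: the candidate policy sees the same questions as the live policy, but only the live answer is ever served. The sketch below assumes a policy can be modelled as a plain function from question to answer, and `shadow_evaluate` and `ShadowReport` are hypothetical names. Exact string equality stands in for whatever comparison a real evaluation would use.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Policy = Callable[[str], str]


@dataclass
class ShadowReport:
    total: int
    disagreements: list[str]

    @property
    def agreement_rate(self) -> float:
        if self.total == 0:
            return 1.0
        return 1 - len(self.disagreements) / self.total


def shadow_evaluate(
    live: Policy, candidate: Policy, questions: list[str]
) -> tuple[list[str], ShadowReport]:
    served: list[str] = []
    disagreements: list[str] = []
    for question in questions:
        live_answer = live(question)
        served.append(live_answer)  # only the live policy's answer is served
        shadow_answer: Optional[str]
        try:
            shadow_answer = candidate(question)
        except Exception:
            shadow_answer = None  # a candidate failure must never affect serving
        if shadow_answer != live_answer:
            disagreements.append(question)
    return served, ShadowReport(len(questions), disagreements)
```

A promotion gate could then require `agreement_rate` to clear a threshold, with the disagreeing questions reviewed by hand before the new policy goes live.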

Retrieval pipeline

Hybrid, permission-aware, cited.

1. Plan
The planner decomposes the question into lexical keywords, semantic anchors, and graph traversal hints.
2. Parallel retrieval
pgvector ANN search, pg_trgm lexical search, and graph hops execute in parallel. Each returns a ranked top-K list with provenance.
3. Permission filter
Every candidate is evaluated against the asker's effective ACL: the combination of their Mnemos roles and the ACLs inherited from source systems. Restricted items are dropped before synthesis.
4. Re-rank
A small re-ranker scores the surviving set on query alignment, freshness, and authority signals from the graph.
5. Synthesize
The selected model receives only the retained candidates. The prompt forces citation by source span.
6. Lineage
The retrieval set, filter decisions, model used, and the rendered answer are persisted to the audit log under a single ID.
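
The ordering of steps 3 through 6 is the important part, so here is a hedged sketch of just that ordering. Everything in it is illustrative: `Candidate`, `answer`, the `is_allowed` callback, and the audit record fields are assumed names, and sorting by score stands in for the real re-ranker.

```python
import uuid
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    node_id: str
    text: str
    retriever: str  # provenance: which retriever produced this candidate
    score: float


def answer(
    question: str,
    candidates: list[Candidate],
    is_allowed: Callable[[Candidate], bool],
    synthesize: Callable[[str, list[Candidate]], str],
    model_name: str,
    audit_log: list[dict],
) -> tuple[str, str]:
    # Step 3: decide on every candidate before anything reaches the model.
    decisions = {c.node_id: is_allowed(c) for c in candidates}
    kept = [c for c in candidates if decisions[c.node_id]]
    # Step 4: stand-in re-rank, highest score first.
    ranked = sorted(kept, key=lambda c: c.score, reverse=True)
    # Step 5: the model only ever receives the retained candidates.
    rendered = synthesize(question, ranked)
    # Step 6: persist the full lineage under a single ID.
    lineage_id = str(uuid.uuid4())
    audit_log.append(
        {
            "lineage_id": lineage_id,
            "retrieval_set": [c.node_id for c in candidates],
            "filter_decisions": decisions,
            "model": model_name,
            "answer": rendered,
        }
    )
    return lineage_id, rendered
```

The design point is that `synthesize` is called with `ranked`, never with `candidates`, so a restricted item cannot appear in the prompt even if the model is asked for it directly.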
Memory graph

A graph backed by Postgres, ltree, and event sourcing.

Postgres-native

Entities and edges live in a relational schema with ltree for hierarchies, citext for case-insensitive identity, and pgvector for embedding columns.

Event-sourced

Every state change is appended to platform.event_outbox. The graph is a projection — rebuildable, auditable, time-traversable.
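
A projection in this sense is a fold over the event log. The sketch below rebuilds a tiny graph from a list of events and can stop at any sequence number to show the graph as it stood at that point. The event kinds and payload shapes are hypothetical, chosen only to make the idea runnable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    seq: int
    kind: str  # hypothetical kinds: entity_upserted, edge_added, entity_removed
    data: dict


def project(
    events: list[Event], up_to_seq: Optional[int] = None
) -> tuple[dict, set]:
    """Rebuild entities and edges by applying events in sequence order."""
    entities: dict[str, dict] = {}
    edges: set[tuple[str, str, str]] = set()
    for event in sorted(events, key=lambda e: e.seq):
        if up_to_seq is not None and event.seq > up_to_seq:
            break  # time travel: ignore everything after the cutoff
        if event.kind == "entity_upserted":
            entities[event.data["id"]] = event.data["attrs"]
        elif event.kind == "edge_added":
            edges.add((event.data["src"], event.data["rel"], event.data["dst"]))
        elif event.kind == "entity_removed":
            removed = event.data["id"]
            entities.pop(removed, None)
            # Drop edges that touch the removed entity.
            edges = {e for e in edges if removed not in (e[0], e[2])}
    return entities, edges
```

Calling `project(events)` gives the current graph, while `project(events, up_to_seq=n)` gives the graph as of event `n`, with no separate history tables needed.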

Hybrid index

Each entity carries a semantic embedding, a trigram index, and a relational shape. Queries blend all three to find the right node.
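
How the three signals might be blended can be sketched in plain Python. This is an assumption-heavy illustration: the weights, the `hybrid_score` function, and the entity dictionary shape are invented here, the trigram function is a simplified approximation of pg_trgm-style similarity rather than a reimplementation, and in production the scoring would presumably run inside Postgres.

```python
import math
from typing import Optional


def trigrams(text: str) -> set[str]:
    # Simplified: lowercase and pad the whole string, then take 3-character windows.
    padded = "  " + text.lower() + " "
    return {padded[i : i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hybrid_score(
    query_text: str,
    query_vec: list[float],
    entity: dict,
    wanted_type: Optional[str] = None,
    w_sem: float = 0.6,  # hypothetical weights
    w_lex: float = 0.4,
) -> float:
    # Relational shape acts as a hard filter, not a soft signal.
    if wanted_type is not None and entity["type"] != wanted_type:
        return 0.0
    semantic = cosine(query_vec, entity["embedding"])
    lexical = trigram_similarity(query_text, entity["name"])
    return w_sem * semantic + w_lex * lexical
```

Treating the relational constraint as a filter and the other two as weighted scores is one design choice among several; rank-based fusion would be a reasonable alternative.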

Permission model

Enforced at retrieval, not at render.

Access is governed by ACLs inherited from source systems, Mnemos roles layered on top of them, and Postgres row-level security at the tenant boundary. Three lines of defense, each enforced independently of the others.

Layer 1 — Tenancy

Row-level security is enabled on every tenant-scoped table, and the application role cannot bypass it. Every connection sets its tenant context with SET LOCAL, so the setting lasts only for the current transaction.
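
For readers who want to see what tenant-scoped row-level security can look like, here is a minimal sketch. The table name `nodes`, the setting name `app.tenant_id`, and the `run_in_tenant` helper are assumptions, not the Mnemos schema. The SQL uses standard Postgres features, and the `%s` placeholders assume a psycopg-style DB-API cursor.

```python
# Hypothetical DDL: table and setting names are illustrative only.
POLICY_DDL = """
ALTER TABLE nodes ENABLE ROW LEVEL SECURITY;
ALTER TABLE nodes FORCE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON nodes
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
"""
# The application role must also not be a superuser or hold BYPASSRLS,
# since either would skip the policy.


def run_in_tenant(cursor, tenant_id: str, sql: str, params: tuple = ()):
    """Run one query with the tenant context set for the current transaction.

    The caller is expected to have an open transaction. set_config with
    is_local = true behaves like SET LOCAL: the value is discarded at
    commit or rollback, so it cannot leak to the next user of a pooled
    connection.
    """
    cursor.execute("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,))
    cursor.execute(sql, params)
    return cursor.fetchall()
```

With this in place, a query such as `SELECT id FROM nodes` needs no `WHERE tenant_id = ...` clause of its own; the policy adds the restriction for every statement the application role runs.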

Layer 2 — ACL

ACLs are inherited from Google Drive, Notion, SharePoint, and Confluence. Each node stores its effective scope in denormalized form, refreshed whenever the source system sends a webhook.

Layer 3 — Role

Mnemos roles layer on top of source ACLs. A user must have both. Restricted content is dropped before the model receives the candidate list.
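
The "must have both" rule in Layers 2 and 3 amounts to a conjunction of two checks. The sketch below is a simplified model under stated assumptions: `Node`, `User`, `can_read`, and `on_source_webhook` are hypothetical names, and a single required role per node is a simplification.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    node_id: str
    source_principals: frozenset[str]  # denormalized scope copied from the source ACL
    required_role: str


@dataclass(frozen=True)
class User:
    principals: frozenset[str]  # the user's identities and groups in source systems
    roles: frozenset[str]  # roles granted inside Mnemos


def can_read(user: User, node: Node) -> bool:
    has_source_access = bool(user.principals & node.source_principals)
    has_role = node.required_role in user.roles
    # Both checks must pass. Neither one can grant access on its own.
    return has_source_access and has_role


def on_source_webhook(node: Node, new_principals: set[str]) -> Node:
    # Refresh the stored scope when the source system reports an ACL change.
    return replace(node, source_principals=frozenset(new_principals))
```

Storing the scope on the node keeps the check cheap at retrieval time, at the cost of depending on webhooks arriving promptly when a source ACL changes.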

Talk to the engineers who built it.

Architecture review with a Mnemos engineer is part of every enterprise evaluation. We come with diagrams and answers, not slides.