Research

mema, verifiable memory for AI agents.

Every retrieval comes with its receipt. Hash-chained, governance-checked, hard-erase capable. For stacks under FINMA, GDPR, and nFADP.

§01 what mema is

Every memory is a verifiable knowledge asset.

mema is built from seven composable layers over one markdown vault, with no graph database and no blockchain. It is inspired by Zep, Hindsight, Mem0, and OriginTrail, and ships without their dependencies. The substrate is human-readable: every record is a file your engineer can open with `cat` and your auditor can re-hash by hand.
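Re-hashing a record by hand can be sketched in a few lines. This is a minimal illustration, assuming the receipt hash is a SHA-256 over the file's raw bytes; the function names are hypothetical, and mema may canonicalize content before hashing.

```python
import hashlib
from pathlib import Path


def content_hash(path: Path) -> str:
    """SHA-256 of the file's raw bytes, prefixed like the receipt's hash field."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return f"sha256:{digest}"


def matches_receipt(path: Path, receipt_hash: str) -> bool:
    """True if the file on disk still hashes to the value in the receipt."""
    return content_hash(path) == receipt_hash
```

The same check works from a shell with any standard SHA-256 tool, which is the point of keeping records as plain files.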

§02 the problem

Why agents fail.

Agents don't fail because the model is bad. They fail because they don't have the right context.

Chat Memory

Chat-only memory. Blind to business data, blind to events, blind to what the user did yesterday.

Static RAG

Stale and incomplete. Doesn't reflect what just happened. Doesn't track how facts change.

No Audit Trail

Nobody can verify why the agent answered the way it did. No receipt, no reproducibility.

Context is scattered, and nobody can verify it.

§03 architecture

Seven layers. One vault.

Sources

Chat · Documents · Tool calls

The 7-layer vault

L1 Episodic

L2 Semantic

L3 Cognitive

L4 Governance

L5 Retrieval

L6 Audit

L7 Asset

Verifiable output

MEMA RECEIPT

kind: fact
hash: sha256:ab4f…d11e
ual: mema://owner/01KR…
anchored: ✓

VERIFIED

Seven composable layers, from the raw event to the verifiable asset. Each has a clear purpose and an endpoint.

L1 Episodic · raw events

Raw events: conversations, documents, tool calls, observations.

Immutable after write. The evidence base for everything higher. No claim without an episode.

POST /v2/observe
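One way to picture an immutable episode is a frozen record that carries a hash of its own content. This is a sketch only: the field names and the `observe` helper are assumptions for illustration, not the payload of the endpoint above.

```python
import hashlib
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Episode:
    episode_id: str
    kind: str         # e.g. "conversation", "document", "tool_call", "observation"
    content: str
    source_hash: str  # SHA-256 of the content at write time


def observe(episode_id: str, kind: str, content: str) -> Episode:
    """Create an episode and fix its hash at write time (hypothetical helper)."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return Episode(episode_id, kind, content, f"sha256:{digest}")
```

A frozen dataclass only prevents in-process mutation; on disk, immutability would rest on the hash and the audit chain.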

L2 Semantic · entities & facts

Entities + facts with bi-temporal validity.

valid_from / valid_to describe truth in the world. invalidated_at / superseded_by describe what we've learned. Zep-style.

POST /v2/fact
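The two time axes can be sketched as follows. The four timestamp fields are the ones named above; the `Fact` class, the `facts_as_of` helper, and the half-open interval convention are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    fact_id: str
    statement: str
    valid_from: datetime                       # when it became true in the world
    valid_to: Optional[datetime] = None        # when it stopped being true (None = still true)
    invalidated_at: Optional[datetime] = None  # when we learned the record was wrong or outdated
    superseded_by: Optional[str] = None        # id of the fact that replaced it


def facts_as_of(facts: List[Fact], world_time: datetime, known_at: datetime) -> List[Fact]:
    """Facts true at world_time, according to what had been learned by known_at."""
    result = []
    for f in facts:
        true_then = f.valid_from <= world_time and (f.valid_to is None or world_time < f.valid_to)
        believed_then = f.invalidated_at is None or known_at < f.invalidated_at
        if true_then and believed_then:
            result.append(f)
    return result
```

Asking the same question with a later `known_at` can return a different answer, which is what lets an auditor reproduce what the agent believed at the time it answered.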

L3 Cognitive · beliefs & supersession

Experiences, observations, beliefs, with confidence and supersession.

Reflection runs offline, rule-based. No LLM call on the write path. LLM-augmented reflection is opt-in (v2.1).

POST /v2/cognitive
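A rule-based reflection pass needs no model call. The sketch below uses one deliberately simple rule, chosen here for illustration and not taken from mema: per topic, the highest-confidence belief stays active and the others are marked superseded.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Belief:
    belief_id: str
    topic: str
    statement: str
    confidence: float
    superseded_by: Optional[str] = None


def reflect(beliefs: List[Belief]) -> None:
    """Deterministic reflection: keep the most confident belief per topic active."""
    best: Dict[str, Belief] = {}
    for b in beliefs:
        current = best.get(b.topic)
        # Strict comparison: on a tie, the earlier belief stays the winner.
        if current is None or b.confidence > current.confidence:
            best[b.topic] = b
    for b in beliefs:
        winner = best[b.topic]
        if b is not winner:
            b.superseded_by = winner.belief_id
```

Because the rule is deterministic, re-running it over the same beliefs yields the same supersession links.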

L4 Governance · purpose & retention

Purpose, retention, provenance, hard erasure.

Every record carries purpose, retention, jurisdiction, evidence. policyCheck() decides at recall time. Hard erase overwrites the file to honour GDPR Art. 17 / nFADP Art. 32 requests.

POST /v2/erase
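A recall-time policy decision can be sketched as a pure function over the record's governance fields. The signature, the field names, and the order of checks below are assumptions for illustration; they are not the actual policyCheck() contract.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass
class Governance:
    purpose: str
    retain_until: date
    jurisdiction: str


def policy_check(gov: Governance, request_purpose: str,
                 request_jurisdiction: str, today: date) -> Tuple[bool, str]:
    """Return (allowed, reason) so every denial can be logged with its cause."""
    if today > gov.retain_until:
        return False, "retention expired"
    if request_purpose != gov.purpose:
        return False, "purpose mismatch"
    if request_jurisdiction != gov.jurisdiction:
        return False, "jurisdiction mismatch"
    return True, "allowed"
```

Returning the reason alongside the decision is what makes the governance field of a receipt inspectable.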

L5 Retrieval · search pipeline

Keyword + vector + graph + temporal + policy in one pipeline.

Fused scoring, graph expansion via derived_from, fully audited in L6. Every hit returns score_components, governance, why_retrieved.

POST /v2/recall
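Fused scoring with inspectable components might look like the following. This sketch assumes a plain weighted sum; the weights, the `fuse` and `rank` helpers, and the hit shape are hypothetical, and the real pipeline may fuse differently.

```python
from typing import Dict, List


def fuse(components: Dict[str, float], weights: Dict[str, float]) -> dict:
    """Weighted sum of per-signal scores, returned together with its inputs."""
    score = sum(weights[name] * value for name, value in components.items())
    return {"score": score, "score_components": dict(components)}


def rank(hits: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Score every hit and sort best-first, keeping the components on each hit."""
    scored = [{"id": h["id"], **fuse(h["components"], weights)} for h in hits]
    return sorted(scored, key=lambda h: h["score"], reverse=True)
```

Keeping `score_components` on each hit lets a reviewer see which signal carried a result, not only where it ranked.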

L6 Audit · hash chain

Append-only SHA-256 hash chain, with external sealed witness.

Every operation is logged. verifyChain() detects any tampering. audit-witness.log defeats sqlite_sequence reset attacks.

GET /v2/audit/verify
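The hash-chain idea can be shown in miniature. This is a generic sketch, not mema's log format: the entry layout, the genesis value, and the JSON canonicalization are assumptions, and `verify_chain` here is an illustrative stand-in for verifyChain().

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # assumed starting value for the first entry's "prev"


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: List[dict], payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify_chain(chain: List[dict]) -> bool:
    """Recompute every link; any edited, reordered, or removed middle entry breaks it."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

A chain that verifies internally can still be truncated at the tail or rebuilt wholesale. Recording the latest hash somewhere the writer cannot rewrite, which appears to be the role of audit-witness.log, is one way to catch that.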

L7 Asset · verifiable asset

UAL + content_hash + metadata_hash + anchor lifecycle.

Every record can be wrapped as a verifiable asset. OriginTrail DKG Knowledge Asset pattern, without the blockchain. Pluggable anchor targets.

POST /v2/asset/wrap
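Wrapping a record as an asset can be sketched as computing two independent hashes under a stable identifier. The `wrap_asset` helper, the UAL layout beyond the `mema://` scheme, and the metadata canonicalization are assumptions for illustration.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def wrap_asset(owner: str, record_id: str, content: bytes, metadata: dict) -> dict:
    """Build an asset envelope: identifier, content hash, metadata hash, status."""
    # Sort keys so the metadata hash does not depend on dict ordering.
    canonical_meta = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {
        "ual": f"mema://{owner}/{record_id}",
        "content_hash": "sha256:" + sha256_hex(content),
        "metadata_hash": "sha256:" + sha256_hex(canonical_meta),
        "verification_status": "unverified",
    }
```

Hashing content and metadata separately means a metadata change, such as a new retention date, can be detected without touching the content hash.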

§04 every answer ships with its evidence

Every retrieval comes with its receipt.

Example receipt. Every field is verifiable in the audit log.

score_components: keyword + vector + graph fused, every component inspectable.
ual: stable resolvable identifier; re-hash the file, compare, done.
governance: policy decision with reason. Every denial is logged.
verification_status: unverified → verified → anchored lifecycle.
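The verification_status lifecycle reads as a small forward-only state machine. The sketch below assumes that no step can be skipped or reversed, which the page implies but does not state; the `advance` helper is hypothetical.

```python
ALLOWED = {
    "unverified": {"verified"},
    "verified": {"anchored"},
    "anchored": set(),  # terminal in this sketch
}


def advance(status: str, target: str) -> str:
    """Move to the next lifecycle state, rejecting skipped or backward transitions."""
    if target not in ALLOWED.get(status, set()):
        raise ValueError(f"illegal transition: {status} -> {target}")
    return target
```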

§05 the numbers

Measurably better.

M01 96.0% Precision@1 (25-query benchmark · 347-doc corpus)

M02 <50 ms median recall latency (single-shot retrieval, no agentic loop)

M03 L1–L7 verifiable memory layers (Episodic, Semantic, Cognitive, Governance, Retrieval, Audit, Asset)

M04 0 silent failures (every recall returns a verifiable receipt)

METHOD LoCoMo long-context memory benchmark. 25-query sample on the regulated 347-document corpus. Recommended: 15/5 for real-time chat, 30/30 for batch research.
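For scale, 96.0% over 25 queries corresponds to 24 queries whose top-ranked result was judged correct. The standard Precision@1 computation is sketched below; the data shapes are assumptions, and this is not the benchmark harness itself.

```python
from typing import Dict, List, Set


def precision_at_1(results: Dict[str, List[str]], relevant: Dict[str, Set[str]]) -> float:
    """Fraction of queries whose first-ranked document is in that query's relevant set."""
    hits = sum(1 for q, ranked in results.items() if ranked and ranked[0] in relevant[q])
    return hits / len(results)
```

With a 25-query sample, each query moves the metric by four percentage points, so the figure is best read alongside the sample size.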

§06 where mema sits

We don't try to out-intelligence Zep or Hindsight. We build the foundation underneath.

Bi-temporal facts. Hash-chained audit. Hard erasure. That's the foundation. Recall performance is downstream.

Property	Zep	Hindsight	Mem0	OriginTrail	mema
Inspectable substrate	–	–	–	·	✓
Bi-temporal facts	✓	·	–	–	✓
Epistemic separation	·	✓	–	–	✓
Online LLM extraction by design	–	·	✓	–	–
Multi-tenant isolation	·	·	·	✓	✓
Hash-chained audit (no blockchain)	–	–	–	✓	✓
Hard erasure	·	·	–	–	✓
Verifiable assets (UAL/hashes)	–	–	–	✓	✓
External anchoring (pluggable)	–	–	–	✓	✓
Local-first	–	–	–	–	✓
Vendor-neutral	–	–	–	–	✓

Legend: ✓ supported, – not supported, · partial

§07 what gets deployed

Three realities.

Financial services

Regulated assistant memory

A Swiss private bank deploys an internal assistant for relationship managers.

Every recall is audit-logged. Erasure requests are honoured: the file's content is overwritten, leaving a tombstone, and the audit entry is preserved.

Agentic workflows

Tool-call provenance

Every tool call becomes an L1 episode with a source hash.

When something goes wrong, the auditor walks derived_from from the answer back to the calls that justified it.
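Walking derived_from back to the evidence is a plain graph traversal. The sketch assumes each record exposes a `layer` label and a `derived_from` list of parent ids; only the derived_from name comes from the text, the rest is hypothetical.

```python
from typing import Dict, List


def trace_to_episodes(record_id: str, records: Dict[str, dict]) -> List[str]:
    """Follow derived_from links from a record down to the L1 episodes behind it."""
    seen, stack, episodes = set(), [record_id], []
    while stack:
        rid = stack.pop()
        if rid in seen:
            continue  # shared ancestors are visited once
        seen.add(rid)
        record = records[rid]
        if record["layer"] == "L1":
            episodes.append(rid)
        stack.extend(record.get("derived_from", []))
    return sorted(episodes)
```

An answer whose trace comes back empty would have no episode behind it, which is the condition the L1 rule "no claim without an episode" is meant to exclude.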

Pharma R&D

Compliance erase

Pharma R&D feeds protocols and decisions into mema.

GDPR Art. 17 / nFADP Art. 32 requests are honoured by hardErase: content overwritten, audit reference preserved.
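The erase-but-keep-the-reference pattern can be sketched as follows. This is an illustration under stated assumptions, not the hardErase implementation: the tombstone marker, the audit entry shape, and the `hard_erase` signature are invented here.

```python
from pathlib import Path
from typing import List

TOMBSTONE = b"[erased]\n"  # assumed marker; the real tombstone format is not specified


def hard_erase(path: Path, audit_log: List[dict], reason: str) -> None:
    """Overwrite the record's content and log that the erasure happened."""
    path.write_bytes(TOMBSTONE)
    # The audit entry references the record but carries none of its content.
    audit_log.append({"op": "erase", "record": path.name, "reason": reason})
```

Overwriting through the filesystem does not by itself reach backups or copy-on-write snapshots; those would need their own handling.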

§08 verified by adversarial review

Independently torn down three times, all findings fixed.

Three independent adversarial reviews.

All findings fixed and regression-tested.

97 automated assertions, all green.

v1 isolation, v2 smoke, v2 professional, v2 assets, security-hardening rounds 1 & 2.

MIT-licensed, Swiss-built, open core.

Repo at github.com/machtsinnch/mema.

mema is our open, ongoing research, the engine behind our context engineering. Not a product pitch; a place we keep learning.