I built Recall because I was tired of correcting coding agents about the same repo rules every week. That solved a local problem. The agent would remember “use pnpm here”, “run this gate before handoff”, “do not touch this folder”, and the next session would start with that context already loaded.
Then I hit the next problem.
The memory was useful, but it was trapped in one shape. Claude Code had one way to hold project rules. Codex had another. ChatGPT had another. Local agents had files. Some frameworks had vector stores. Some had graph stores. Some had nothing. Every tool was learning the same things about me and my projects, but none of them could share those memories in a clean way.
That is why I started Universal Memory Protocol, or UMP.
The simple version:
MCP lets agents use tools. A2A lets agents talk to agents. UMP lets agents carry memory.
It is not meant to be a new database. It is not meant to replace Recall, Mem0, Letta, Zep, Obsidian, Postgres, SQLite, Redis, or a vector index. It is the small contract between them.
The Shape
UMP is deliberately small:
- A portable memory record.
- Six core operations.
- MCP, HTTP, and file bindings.
- Conformance levels from simple export to full signed runtime.
The record is the main thing. A memory needs a type, a body, a scope, time fields, lifecycle hints, relations, provenance, consent, and integrity.
In normal words:
what is this memory, who owns it, where is it valid, when was it true, where did it come from, who may see it, and can I verify it?
That sounds obvious, but most agent memory today does not carry all of that. It might have text and an embedding. It might have a user id. It might have a timestamp. But the moment you try to move it from one agent or store to another, the missing fields matter.
If a memory says “use pnpm”, is that global, repo-specific, team-specific, or only true for one branch? If it says “the deployment target is staging”, was that true yesterday or is it true now? If an agent wrote it, did the user approve it? If it contains a secret or personal detail, should it be exported at all?
UMP puts those questions into the record instead of leaving them as product-specific behavior.
How I Came To It
The first version of the idea came from pain, not architecture.
I use multiple agents. I switch between tools. I also work across enough repos that the same mistake keeps coming back in different clothes. One agent learns a rule, another agent does not know it, and I become the bridge.
Recall proved the useful part: corrections can become repo memory, memory can be ranked, stale memories can be retired, and the agent gets better without a giant instruction file.
But Recall also made the protocol gap obvious. The useful abstraction was not “Recall”. It was:
an agent needs to ask for relevant memory, write new memory, revise old memory, forget unsafe memory, and prove where memory came from.
That maps cleanly to operations:
| Operation | Meaning |
|---|---|
capabilities |
What does this memory server support? |
recall |
Find relevant memories for this scope and query. |
remember |
Store a new memory. |
get |
Fetch a memory by id. |
revise |
Replace a memory without destroying its history. |
forget |
Tombstone or remove a memory with a reason. |
That is enough for a small client. It is also enough for adapters. A fancy graph memory system and a simple JSON file can speak the same verbs, while still behaving differently underneath.
What It Fixes
UMP fixes four practical problems.
First, memory lock-in. If Claude learns a project rule and Codex cannot read it, that is not memory. That is product state. UMP makes the memory record portable so it can move between hosts and stores.
Second, stale memory. A lot of memory systems overwrite facts. That is dangerous because old context disappears and the agent cannot reason about when something was true. UMP uses bi-temporal records and supersession. A memory can be replaced without pretending the old one never existed.
Third, trust. Recalled memory is not sacred. It can be wrong, stale, malicious, or out of scope. UMP treats memory as untrusted input and requires safe rehydration before it enters the model context. The context block says, in effect, “these are references, not instructions.”
Fourth, ownership. A useful memory record should carry provenance, consent, and eventually signatures. UMP uses existing standards where they fit, like W3C PROV for provenance, DID for owner identity, and RFC 8785 JSON canonicalization for deterministic signing.
The important point is what UMP does not try to fix. It does not decide which embedding model is best. It does not mandate graph search. It does not freeze a decay curve. It does not tell Recall, Zep, Letta, Mem0, SQLite, or Postgres how to rank memory.
It standardizes the parts that must match for memory to travel. The intelligence stays inside the engine.
How It Runs
The fastest way to use it is through MCP, because most agent hosts already know how to talk to MCP servers.
{
"mcpServers": {
"ump": {
"command": "npx",
"args": ["-y", "@universalmemoryprotocol/core", "ump-memory"]
}
}
}
That gives the host ump.recall, ump.remember, ump.get, ump.revise, ump.forget, and ump.capabilities.
By default, the reference server writes a portable file at:
~/.ump/memory.ump.json
Point another MCP host at the same store and it can use the same memories. That is the money shot: write in one agent, recall in another.
There is also an HTTP binding for apps that do not speak MCP, and a file binding for plain exports. That matters because adoption should not require the full runtime. A project can start at L0 with a *.ump.json or *.ump.md file, then move to an L1 or L2 server later.
How Fast It Is
For the default local file store, the answer is boring and good: around 5ms recall at 3,000 records in the current benchmarks. That is the right default for most project memory. It is local, portable, and fast enough that the agent does not feel like it is waiting on a separate brain.
The richer Recall-backed path costs more because it does actual semantic retrieval. In my current benchmark notes, it is roughly 100ms to write and roughly 200ms to recall after warming the local embedding model. The tradeoff is quality: paraphrase top-1 recall improves from 1 out of 8 with lexical matching to 5 out of 8 with Recall’s vector plus BM25 search.
So the rule is simple:
| Store | Best for | Rough speed |
|---|---|---|
| JSON file | portable local memory, repo rules, simple preferences | about 5ms recall at 3k records |
| Markdown files | human-editable memory | depends on folder size |
| Recall adapter | semantic search and stronger retrieval quality | about 100ms write, about 200ms recall |
| Vector store | scale and semantic recall | depends on embeddings and backend |
Fast memory is useful because agents ask for memory often. But retrieval quality matters too. UMP keeps that choice open. You can start with the file store, then swap the backend when the shape of the work needs it.
The Part I Care About
The part I care about is not that UMP exists as a spec. Specs are cheap.
The part I care about is the round trip:
an agent writes a memory, another agent recalls it, the record is still owned by the user, and the receiving agent treats it as scoped reference data instead of hidden instruction text.
That is the missing piece for serious agent work. Agents are getting better at tools through MCP. They are getting better at coordination through A2A. But they still forget, fork, and trap memory inside product-specific state.
I do not want every new agent to start from zero. I also do not want all my working memory locked inside one vendor.
UMP is my attempt to make the layer small enough to adopt and structured enough to trust.
One record. Six operations. Memory that moves.