PERSISTENT STATE NEWS

Cached Thoughts

Our thoughts and updates, cached for later retrieval.

All Posts

Announcements

Changelog

Changelog

Nov 10, 2025

Embedding Models Now Available in Model Library

We’re excited to announce that embedding models are now fully integrated into the Backboard Model Library. This update expands the power and flexibility of your AI stack—giving developers direct access to embeddings across multiple providers, right alongside LLMs.

What’s New

You can now:

  • Browse 12 embedding models from OpenAI, Google, and Cohere in the Model Library

  • Filter by provider, dimensions, and model type

  • Use embedding models directly when creating or configuring assistants

New API Endpoints

Developers can programmatically access embedding models using the following endpoints:

GET /api/models/embedding/all           # List all embedding models
GET /api/models/embedding/{model_name}  # Get details for a specific embedding model
GET /api/models/embedding/providers     # List available embedding providers
GET /api/models?model_type=embedding    # Filter models by type
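A minimal sketch of calling these endpoints from Python. The base URL and the bearer-token auth scheme are assumptions, not confirmed details of the Backboard API; check backboard.io/docs for the real values.

```python
# Sketch of the embedding-model endpoints listed above.
# BASE_URL and the Authorization header format are assumed.
import json
import urllib.request

BASE_URL = "https://api.backboard.io"  # assumed host

def embedding_url(path: str = "all") -> str:
    """Build the URL for one of the /api/models/embedding/* endpoints."""
    return f"{BASE_URL}/api/models/embedding/{path}"

def list_embedding_models(api_key: str) -> list:
    """GET /api/models/embedding/all and return the parsed JSON."""
    req = urllib.request.Request(
        embedding_url("all"),
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same `embedding_url` helper covers `{model_name}` and `providers` by substituting the path segment.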

Available Models

OpenAI (3 models)

  • text-embedding-3-large (3072 dims)

  • text-embedding-3-small (1536 dims)

  • text-embedding-ada-002 (1536 dims)

Google (3 models)

  • gemini-embedding-001-768

  • gemini-embedding-001-1536

  • gemini-embedding-001-3072

Cohere (6 models)

  • embed-v4.0 (256, 512, 1024, 1536 dims)

  • embed-english-v3.0

  • embed-multilingual-v3.0

How to Use

When creating an assistant, you can now specify embedding parameters directly:

{
  "name": "My Assistant",
  "embedding_provider": "openai",
  "embedding_model_name": "text-embedding-3-large"
}

The selected model must exist in the Model Library.
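Because the selected model must exist in the Model Library, one defensive pattern is to validate the config client-side before sending it. The catalog below is copied from the model list in this post; the `validate_assistant` helper is illustrative, not part of any Backboard SDK.

```python
# Sketch: check an assistant config against the published model list
# before calling the API. Catalog contents come from this post.
EMBEDDING_CATALOG = {
    "openai": {"text-embedding-3-large", "text-embedding-3-small",
               "text-embedding-ada-002"},
    "google": {"gemini-embedding-001-768", "gemini-embedding-001-1536",
               "gemini-embedding-001-3072"},
    "cohere": {"embed-v4.0", "embed-english-v3.0", "embed-multilingual-v3.0"},
}

def validate_assistant(config: dict) -> dict:
    """Raise ValueError if the embedding model is not in the Model Library."""
    provider = config["embedding_provider"]
    model = config["embedding_model_name"]
    if model not in EMBEDDING_CATALOG.get(provider, set()):
        raise ValueError(f"unknown embedding model {model!r} for {provider!r}")
    return config

assistant = validate_assistant({
    "name": "My Assistant",
    "embedding_provider": "openai",
    "embedding_model_name": "text-embedding-3-large",
})
```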

Compatibility

All updates are fully backward compatible—existing integrations and assistants continue to work without modification.

Embedding models open up new possibilities for retrieval, classification, search, and RAG workflows inside Backboard.

For support or questions, contact the Backboard team or visit backboard.io/docs.

Announcement

Nov 9, 2025

AI Memory vs RAG: What’s the Difference?

Every few weeks, someone on a dev forum asks, “Isn’t AI memory just RAG?”
It’s an understandable question. Both systems help models “remember” information, but they do it in completely different ways.

Understanding the difference isn’t just semantics. It determines how your AI behaves and scales, and how close it comes to real contextual intelligence.

The Short Answer

RAG (Retrieval-Augmented Generation) is about looking up information.
AI memory is about retaining and evolving it.

What RAG Actually Does

RAG connects a model to an external knowledge source like a vector database. When a user asks a question, the system retrieves the most relevant context chunks, passes them to the model, and generates a response.

RAG excels at:

  • Knowledge retrieval: Answering questions from documents, PDFs, or web sources.

  • Reducing hallucinations: Providing evidence-based grounding.

  • Scalability: You can update or swap the data store without retraining the model.

But RAG doesn’t remember anything. Each query is stateless. If you close the chat, the system forgets you ever existed.

What AI Memory Actually Does

AI memory adds a temporal layer—it lets the model accumulate and recall experiences across interactions. Instead of pulling static documents, it reconstructs contextual continuity:

  • User context: remembers preferences, facts, tone, and history.

  • Environmental context: understands where, when, and how interactions happen.

  • Evolving knowledge: stores model-generated insights over time.

Where RAG looks outward for static truth, AI memory looks inward for continuity and learning. It’s how systems begin to behave like never-ending databases of experience, not just retrieval machines.
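The contrast can be sketched in a few lines of toy Python. Substring matching stands in for real embedding search here, but the shape of the difference holds: the RAG function is a pure read over a fixed store, while the memory layer writes during the interaction and can recall it later.

```python
# Toy contrast: stateless RAG lookup vs. a stateful memory layer.
# Plain substring matching stands in for vector retrieval.

def rag_answer(query: str, documents: list[str]) -> list[str]:
    """Stateless: each call sees only the fixed document store."""
    return [d for d in documents if query.lower() in d.lower()]

class MemoryLayer:
    """Stateful: interactions are written back and retrievable later."""
    def __init__(self) -> None:
        self.events: list[str] = []

    def observe(self, fact: str) -> None:
        self.events.append(fact)  # memory WRITES, not just reads

    def recall(self, query: str) -> list[str]:
        return [e for e in self.events if query.lower() in e.lower()]

docs = ["Paris is the capital of France."]
memory = MemoryLayer()
memory.observe("User prefers metric units.")
memory.observe("User is planning a trip to Paris.")

print(rag_answer("paris", docs))  # facts from the static store
print(memory.recall("user"))      # continuity accumulated at runtime
```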

Why Developers Confuse Them

Many frameworks blur the line by calling vector stores “memory.” But a true contextual memory system does more than retrieve embeddings: it writes, organizes, and retrieves context dynamically.

In a proper architecture, RAG and memory work together:

  • RAG augments knowledge with external retrieval.

  • Memory preserves conversational or operational continuity.

You could think of it like this:

RAG finds facts.
Memory builds relationships.

When to Use Each

Use RAG when:

  • You need to access large, changing knowledge bases.

  • You’re answering factual or reference-heavy queries.

  • Your system doesn’t need long-term personalization.

Use AI memory when:

  • You need stateful or evolving context (like ongoing conversations or agents).

  • You’re building assistants that should “learn” over time.

  • You want models that retain user identity, task history, and preferences.

And in most real systems? You’ll want both.

How Backboard Approaches It

At Backboard, we treat memory as an architecture, not a feature.
Our API allows developers to configure both retrieval pipelines (RAG) and stateful, portable memory layers—working together or independently.

Developers can tune:

  • Memory mode: Auto, Read-only, or Off.

  • Embedding models and vector dimensions: to optimize for precision and latency.

  • Persistence and portability: move your memory between models or agents instantly.
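As a rough sketch, a request body combining these knobs might look like the following. The field names and allowed values are assumptions based on the list above, not a confirmed schema; consult the Backboard API docs for the real one.

```python
# Hypothetical assistant config exercising the tunable knobs above.
# Field names and enum values are illustrative assumptions.
assistant_config = {
    "name": "Support Agent",
    "memory_mode": "auto",  # assumed values: "auto" | "read_only" | "off"
    "embedding_provider": "openai",
    # 1536-dim model chosen to trade a little precision for latency:
    "embedding_model_name": "text-embedding-3-small",
}
```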

It’s how Backboard achieved record-setting contextual performance on the LoCoMo benchmark (90.1% accuracy), demonstrating that true memory systems can outperform retrieval-only setups.

The Takeaway

RAG and AI memory are complementary, not competitive.
Retrieval gives AI external knowledge.
Memory gives it personal history.
Combined, they create agents that are both informed and aware.

Announcement

Nov 3, 2025

Backboard Achieves Highest Score Ever on LoCoMo (90.1%)

New stateful memory architecture. Standard protocol. Fully reproducible.

A funny thing happened on the way to baselining our novel AI memory architecture against the industry-standard benchmark, LoCoMo: we broke the record. And we did it with no gaming and no adjustments, just pure, reproducible execution.

Backboard scored 90.1 percent overall accuracy using the standard task set and GPT-4.1 as the LLM judge. Full results, category breakdowns, and latency are available below, along with a one-click script and API so anyone can replicate the run. This is now LIVE in our API so anyone can plug in and start testing.

Full result set with replication scripts: https://github.com/Backboard-io/Backboard-Locomo-Benchmark

About the Benchmark

LoCoMo was designed to test memory across many sessions, long dialogues, and time-dependent questions. It is widely used to evaluate whether systems truly remember and reason over long horizons.

How we compare

Recent public writeups place leading memory libraries around 67 to 69 percent on LoCoMo, and a simple Letta filesystem baseline around 74 percent. Backboard’s 90.1 percent suggests a material step forward for long-term conversational memory. We will maintain a live comparison table on our results page.


Method                Single-Hop (%)   Multi-Hop (%)   Open Domain (%)   Temporal (%)   Overall (%)
Backboard             89.36            75               91.2              91.9           90
Mem0                  67.13            51.15            72.93             55.51          66.88
Mem0-Graph            65.71            47.19            75.71             58.13          68.44
LangMem               62.23            47.92            71.12             23.43          58.1
Zep                   74.11            66.04            67.71             79.79          75.14
OpenAI                63.79            42.92            62.29             21.71          52.9
Memobase (v0.0.32)    63.83            52.08            71.82             80.37          70.91
Memobase (v0.0.37)    70.92            46.88            77.17             85.05          75.78

Best in Class in Every Measure


Reproducibility and transparency

  • Same dataset and task set as LoCoMo

  • GPT-4.1 LLM judge with fixed prompts and seed

  • Logs, prompts, and verdicts published for every question

Run it yourself in minutes using our public script or by calling the evaluation API.
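For intuition on how the overall figure relates to the category columns above, here is a generic sketch of LLM-judge score aggregation. The verdict data is invented; the point is only the pooling logic: overall accuracy pools every question, rather than averaging category percentages.

```python
# Generic LLM-judge score aggregation, as used in LoCoMo-style evals.
# Verdict data below is made up for illustration.

def accuracy(verdicts: list[bool]) -> float:
    """Fraction of questions the judge marked correct, as a percentage."""
    return 100.0 * sum(verdicts) / len(verdicts)

def overall(per_category: dict[str, list[bool]]) -> float:
    """Pool every question across categories, so categories with more
    questions weigh more (not a mean of category scores)."""
    pooled = [v for verdicts in per_category.values() for v in verdicts]
    return accuracy(pooled)

sample = {
    "single_hop": [True, True, True, False],
    "temporal":   [True, True],
}
print(round(overall(sample), 2))  # 5 of 6 correct -> 83.33
```

This pooling is why an overall score can sit closer to the largest category than to the unweighted mean of the columns.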

If memory is the foundation of intelligence, transparency must be the foundation of benchmarks.

Get started

Build with Backboard today. Signing up takes under a minute.

References
LoCoMo benchmark overview and paper: snap-research.github.io; arxiv.org

Announcement

Nov 1, 2025

The Philosophy of Memory: Why AI Should Remember Everything (and Sometimes Forget)

Artificial intelligence today suffers from an old human flaw: forgetfulness.
Every chat, every insight, every piece of context disappears when a session ends. Most systems act as if memory is an afterthought, a feature that can be added later.

At Backboard, we see it differently. We believe that memory is the foundation of intelligence. Without it, AI performs tasks but never truly understands. With it, AI becomes capable of reasoning, continuity, and growth.

1. What We Believe

AI systems should behave like never-ending databases that can store, retrieve, and reason over every fact, every nuance, and every context they have ever seen.
They should remember the user, the task, the environment, and the intention behind each interaction.

Human memory fades. AI memory should not. It should be more accurate, more contextual, and more persistent than ours. It should exist to augment human cognition, not imitate its limits.

That is the standard we hold ourselves to. Based on independent benchmarks, Backboard’s memory systems are already leading the field. But “best” is not enough. We will not stop until even “perfect” memory becomes outdated and the benchmarks themselves have to evolve.

2. The Debate That Drives Us

Building perfect memory is not as simple as storing everything forever. The deeper question is this: What should an intelligent system remember, and for how long?

Three schools of thought shape our work:

A. Infinite Memory

One belief is that AI should remember everything—every input, every decision, permanently stored and retrievable. It is a compelling vision of total continuity with no loss or drift.
But infinite memory comes with tradeoffs in privacy, ethics, compute, and relevance. Systems that remember everything risk losing focus and clarity.

B. Parametric Memory

Another view holds that all useful memory should live inside the model’s parameters. The future, in this view, is a model that simply knows. No external databases, no retrieval calls, no connectors.
It is elegant but static. A model that cannot forget or update easily cannot adapt to new realities.

C. Contextual Ephemerality

A third perspective suggests that memory should be temporary, context-scoped, and self-regulating. Systems should remember only what remains useful.
This design is efficient and ethical, but it risks losing the long-term continuity that makes intelligence coherent and personal.

3. Our Path: Adaptive Memory

We believe the right answer is not one extreme but balance.
Backboard is building adaptive memory architectures that know what to remember, what to compress, and what to release.

  • Short-Term Memory: Fast, ephemeral context for immediate reasoning.

  • Mid-Term Memory: Personalized threads that evolve with the user.

  • Long-Term Memory: Compressed, auditable archives for persistent knowledge.

The system does not just store information. It learns what deserves to persist.
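A toy model of the tiering idea: items start short-term, are promoted when they keep being accessed, and decay away when they never are. The thresholds and structure are illustrative only; Backboard's actual architecture is not described at this level in this post.

```python
# Toy three-tier adaptive memory: access-driven promotion, decay of
# never-recalled short-term items. Thresholds are arbitrary.

class AdaptiveMemory:
    def __init__(self, promote_after: int = 2) -> None:
        self.tiers = {"short": {}, "mid": {}, "long": {}}
        self.promote_after = promote_after

    def store(self, key: str, value: str) -> None:
        self.tiers["short"][key] = {"value": value, "hits": 0}

    def recall(self, key: str):
        for tier in ("short", "mid", "long"):
            item = self.tiers[tier].get(key)
            if item is not None:
                item["hits"] += 1
                self._maybe_promote(tier, key, item)
                return item["value"]
        return None

    def _maybe_promote(self, tier: str, key: str, item: dict) -> None:
        nxt = {"short": "mid", "mid": "long"}.get(tier)
        if nxt and item["hits"] >= self.promote_after:
            self.tiers[tier].pop(key)   # move up a tier, reset the count
            item["hits"] = 0
            self.tiers[nxt][key] = item

    def decay(self) -> None:
        """Drop short-term items that were never recalled."""
        self.tiers["short"] = {
            k: v for k, v in self.tiers["short"].items() if v["hits"] > 0
        }

m = AdaptiveMemory()
m.store("units", "metric")
m.recall("units")
m.recall("units")                  # second hit triggers promotion
print("units" in m.tiers["mid"])   # True
```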

4. The Future We’re Building

AI memory should not only be persistent. It should be self-aware: able to reason about its own context and decide what matters.
When memory reaches that level, it will not simply match human cognition.
It will extend it.

That is the future we are building at Backboard.
And we are only getting started.


If you want to explore how Backboard’s memory systems work or test your own models against our benchmarks, sign up for free access!
