Beyond Vector Similarity: Neural-Symbolic Extraction for Agentic Memory

How Aletheia combines BERT-based Neural Extraction with Deterministic Logic to build a more reliable memory engine for AI agents.

Tags: Neural Extraction, Knowledge Graph, Metric Vault, AI Agents

Most RAG (Retrieval-Augmented Generation) systems rely on a single primitive: Vector Similarity. You embed a chunk of text, store it in a database, and pull it back when a similar query arrives.

But human memory doesn't just work by "vibe check." We extract structured facts, recognize people and places, and track numeric changes over time.

Today, we're introducing Aletheia's Neural-Symbolic Extraction Pipeline—a major upgrade to the memory engine that moves beyond simple text matching into structured knowledge management.

The Problem with Pure Neural Memory

If you tell an agent, "I spent $50 on dinner today," a standard vector database will retrieve that chunk when you ask about spending. But if you ask, "How much have I spent on dinner this week?" the LLM has to:

Retrieve every single chunk where you mentioned dinner.
Hope it neither missed a mention nor pulled in any distractors.
Perform the arithmetic correctly.

This is a brittle process. LLMs are notoriously bad at exhaustive retrieval and precise math in long-context windows.

The Aletheia Solution: Neural-Symbolic Fusion

Aletheia now implements a multi-stage extraction pipeline that runs during ingestion:

1. Neural Entity Extraction (BERT-NER)

We've integrated a local BERT-based Named Entity Recognition (NER) model. Every time a memory is ingested, Aletheia automatically identifies:

People (PER): Tracking who the user interacts with.
Organizations (ORG): Building a map of companies and teams.
Locations (LOC): Understanding the user's geographic context.
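As a rough illustration of the step after the model runs: a token-classification model typically emits one BIO tag per token (B-PER, I-PER, O, and so on), and those tags have to be merged into entity spans. The sketch below assumes that tag format. The names Entity and group_bio_tags are hypothetical, and the model call itself is left out; this is not Aletheia's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    text: str
    label: str  # "PER", "ORG", or "LOC"


def group_bio_tags(tokens: List[str], tags: List[str]) -> List[Entity]:
    """Merge token-level BIO tags into entity spans (hypothetical helper)."""
    entities: List[Entity] = []
    current_words: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the span that is currently open, if there is one.
        nonlocal current_words, current_label
        if current_words and current_label is not None:
            entities.append(Entity(" ".join(current_words), current_label))
        current_words, current_label = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_words, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_words.append(token)
        else:
            # "O" tag, or an I- tag that does not continue the open span.
            flush()
    flush()
    return entities


tokens = ["Alice", "joined", "Acme", "Corp", "in", "Berlin"]
tags = ["B-PER", "O", "B-ORG", "I-ORG", "O", "B-LOC"]
entities = group_bio_tags(tokens, tags)
```

Here entities holds three spans: Alice (PER), Acme Corp (ORG), and Berlin (LOC).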

2. Autonomous Relationship Discovery

Extraction is useless without connection. Aletheia uses relationship heuristics to build a Knowledge Graph automatically. If you mention a person and a company in the same breath, Aletheia creates an associated_with edge. If you express a preference ("I love jasmine tea"), it triggers Implicit Preference Detection, creating a has_preference link that is exempt from the standard time-decay policies.
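The two heuristics described above can be sketched in a few lines. This is an assumption-heavy illustration: it treats "in the same breath" as "in the same ingested memory", it uses a single hypothetical regex as the preference cue, and the Edge type, the decay_exempt flag, and discover_edges are invented names rather than Aletheia's API.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str
    decay_exempt: bool = False  # assumed flag marking edges that skip time decay


# Hypothetical preference cue; a real detector would cover far more phrasings.
PREFERENCE_PATTERN = re.compile(
    r"\bI\s+(?:love|like|prefer|enjoy)\s+([^.,;!?]+)", re.IGNORECASE
)


def discover_edges(
    text: str, entities: List[Tuple[str, str]], user: str = "user"
) -> List[Edge]:
    """Build graph edges from one memory: co-occurrence plus preference cues."""
    edges: List[Edge] = []
    people = [name for name, label in entities if label == "PER"]
    orgs = [name for name, label in entities if label == "ORG"]

    # Every person/organization pair mentioned in the same memory gets linked.
    for person in people:
        for org in orgs:
            edges.append(Edge(person, "associated_with", org))

    # Stated preferences become has_preference edges that never decay.
    for match in PREFERENCE_PATTERN.finditer(text):
        target = match.group(1).strip()
        edges.append(Edge(user, "has_preference", target, decay_exempt=True))
    return edges


work_edges = discover_edges(
    "Met Alice from Acme today.", [("Alice", "PER"), ("Acme", "ORG")]
)
tea_edges = discover_edges("I love jasmine tea", [])
```

Pure co-occurrence over-links by design: it cannot tell "Alice works at Acme" from "Alice complained about Acme", which is why the edge type here is the deliberately loose associated_with.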

3. The Metric Vault (Deterministic Logic)

For the "math" problem, we've added the Metric Vault. Alongside the neural embedding, Aletheia runs a suite of deterministic regex extractors to pull out:

Currency: $50, 100 EUR, etc.
Distance: 5 miles, 10km.
Counts: "3 times", "2 people".
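A minimal sketch of what such extractors might look like, covering only the examples listed above. The patterns, the Metric type, and the mapping of "$" to USD are illustrative assumptions, not the patterns Aletheia ships. Values are parsed as Decimal rather than float so that later sums stay exact.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class Metric:
    kind: str       # "currency", "distance", or "count"
    value: Decimal
    unit: str


NUMBER = r"\d+(?:\.\d+)?"

# Hypothetical patterns; each is tried independently against the text.
CURRENCY_SYMBOL = re.compile(rf"\$\s?({NUMBER})")
CURRENCY_CODE = re.compile(rf"\b({NUMBER})\s?(USD|EUR|GBP)\b")
DISTANCE = re.compile(rf"\b({NUMBER})\s?(miles?|km|kilometers?)\b", re.IGNORECASE)
COUNT = re.compile(r"\b(\d+)\s+(times|people)\b", re.IGNORECASE)


def extract_metrics(text: str) -> List[Metric]:
    """Pull currency, distance, and count values out of free text."""
    metrics: List[Metric] = []
    for m in CURRENCY_SYMBOL.finditer(text):
        # Assumption: a bare "$" means US dollars.
        metrics.append(Metric("currency", Decimal(m.group(1)), "USD"))
    for m in CURRENCY_CODE.finditer(text):
        metrics.append(Metric("currency", Decimal(m.group(1)), m.group(2)))
    for m in DISTANCE.finditer(text):
        metrics.append(Metric("distance", Decimal(m.group(1)), m.group(2).lower()))
    for m in COUNT.finditer(text):
        metrics.append(Metric("count", Decimal(m.group(1)), m.group(2).lower()))
    return metrics


dinner = extract_metrics("I spent $50 on dinner today")
run = extract_metrics("Ran 5 miles, then 10km, 3 times")
```

Because the extractors are plain regexes, the same input always yields the same metrics, and a missed pattern is a fixable bug rather than a sampling artifact.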

These values are stored in a specialized B-Tree index, so locating the relevant range costs O(log N) and the aggregation over it is deterministic. When you query the Analytics API, Aletheia doesn't "guess" the total; it computes it exactly from the extracted values.
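To make the aggregation step concrete, here is a small sketch that answers "how much have I spent on dinner this week?" without involving the LLM. It swaps the B-Tree for a sorted Python list searched with bisect, which gives the same O(log N) range seek but O(N) inserts. The MetricIndex class, the (category, day) key, and the integer day numbers are all assumptions made for illustration.

```python
import bisect
from decimal import Decimal
from typing import List, Tuple


class MetricIndex:
    """Sorted-list stand-in for a B-Tree keyed by (category, day)."""

    def __init__(self) -> None:
        self._keys: List[Tuple[str, int]] = []
        self._values: List[Decimal] = []

    def add(self, category: str, day: int, amount: Decimal) -> None:
        # Keep keys sorted so that range queries can use binary search.
        key = (category, day)
        pos = bisect.bisect_right(self._keys, key)
        self._keys.insert(pos, key)
        self._values.insert(pos, amount)

    def total(self, category: str, first_day: int, last_day: int) -> Decimal:
        # Two binary searches find the slice; the sum walks only that slice.
        lo = bisect.bisect_left(self._keys, (category, first_day))
        hi = bisect.bisect_right(self._keys, (category, last_day))
        return sum(self._values[lo:hi], Decimal("0"))


index = MetricIndex()
index.add("dinner", 1, Decimal("50"))
index.add("dinner", 3, Decimal("32.50"))
index.add("lunch", 2, Decimal("12"))
index.add("dinner", 9, Decimal("40"))

week_total = index.total("dinner", 1, 7)  # days 1 through 7 only
```

Finding the slice takes O(log N); summing it takes time proportional to the number of matching entries. The result is exact for whatever was extracted, so its correctness depends on the extractors having captured every mention.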

Why This Matters

By moving structure-extraction into the memory kernel, we give AI agents a "System 2" for their memory.

Exact Arithmetic: Totals are computed deterministically from extracted values, not estimated by the LLM.
Deep Context: A graph of relationships that survives model swaps.
Smarter Decay: Preferences stay fresh forever, while conversational noise fades away.
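The decay behavior in the last point could look something like the sketch below. The post does not say which decay function Aletheia uses, so the exponential half-life, the 30-day default, and the decayed_score name are all assumptions chosen for illustration.

```python
import math


def decayed_score(
    base_score: float,
    age_days: float,
    half_life_days: float = 30.0,  # assumed default, not a documented value
    decay_exempt: bool = False,
) -> float:
    """Halve a memory's relevance every half_life_days unless it is exempt."""
    if decay_exempt:
        # Preferences keep their full score regardless of age.
        return base_score
    return base_score * math.pow(0.5, age_days / half_life_days)


chatter = decayed_score(1.0, age_days=30.0)
preference = decayed_score(1.0, age_days=365.0, decay_exempt=True)
```

Under these assumptions, a month-old piece of small talk scores half of what it did on day one, while a year-old preference still scores in full.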

The complete memory picture

Neural-symbolic extraction is one layer of a broader memory stack. Hybrid retrieval ensures extracted facts can be found by both semantic and exact queries. Deterministic aggregation computes the numbers the Metric Vault stores. Fact supersession keeps the knowledge graph clean when relationships change.

Memory is not just a bag of words. It's a structured understanding of reality.

Check out the Analytics API and the knowledge graph documentation to start building math-perfect agents.
