Ingestion Pipeline

Ingestion transforms raw events into durable, queryable Aletheia memory, with deduplication and lineage tracking.

Pipeline stages

  1. Validate payload and normalize text.
  2. Expand companion memories when configured.
  3. Embed each memory candidate.
  4. Run dedup against content hash and entity scope.
  5. Persist source-of-truth records.
  6. Update vector, lexical, and graph indexes.
  7. Emit ingest result with accepted/skipped counters.
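The stages above can be sketched as a single pass over incoming events. This is a minimal in-memory illustration, not the actual implementation: the helper names, the event shape (`entity`, `text` keys), and the token-list secondary index are all assumptions made for the example; embedding and graph indexing are omitted.

```python
import hashlib

def ingest(events, store, index, seen_hashes):
    """Illustrative sketch of the ingest stages (names are hypothetical)."""
    accepted, deduplicated, invalid, memory_ids = 0, 0, 0, []
    for ev in events:
        text = (ev.get("text") or "").strip()  # stage 1: validate + normalize
        if not text:
            invalid += 1
            continue
        # stage 4: dedup against content hash scoped to the entity
        key = hashlib.sha256(f"{ev['entity']}::{text}".encode()).hexdigest()
        if key in seen_hashes:
            deduplicated += 1
            continue
        seen_hashes.add(key)
        mem_id = f"{ev['entity']}::{len(store)}"
        store[mem_id] = text                 # stage 5: durable write first
        index[mem_id] = text.lower().split() # stage 6: secondary index after
        memory_ids.append(mem_id)
        accepted += 1
    # stage 7: emit result with accepted/skipped counters
    return {"accepted": accepted, "deduplicated": deduplicated,
            "invalid": invalid, "memory_ids": memory_ids}
```

Note the ordering inside the loop: the durable write happens before the secondary index update, which is what makes the repair strategy below possible.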

Failure strategy

Index operations should be idempotent. If ingest crashes after the durable write but before index completion, a background repair pass should re-index the missing memory IDs.
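A repair pass of this kind can be as simple as diffing the source-of-truth store against the index. A sketch, assuming the same dict-backed store and index as above (real storage and index clients would differ):

```python
def repair_indexes(store, index):
    """Re-index memories that were durably written but never indexed.

    Idempotent by construction: re-indexing an already-indexed ID would
    just overwrite the same entry, so the pass is safe to re-run.
    """
    repaired = []
    for mem_id, text in store.items():
        if mem_id not in index:
            index[mem_id] = text.lower().split()
            repaired.append(mem_id)
    return repaired
```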

Never treat a secondary-index success as proof that the durable write succeeded; the source-of-truth store decides ground reality.
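Concretely, an existence check should consult the source-of-truth store, never the index alone. A hypothetical read-path guard, using the same in-memory shapes as the sketches above:

```python
def memory_exists(mem_id, store, index):
    """Answer existence from the source-of-truth store only."""
    if mem_id in index and mem_id not in store:
        # Stale index entry: the durable write never landed.
        # Drop it so the index converges back toward ground truth.
        del index[mem_id]
        return False
    return mem_id in store
```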

Recommended response shape

Ingest response
{
  "accepted": 12,
  "deduplicated": 3,
  "invalid": 0,
  "memory_ids": ["user-123::session-9::41", "user-123::session-9::42"]
}