Memory
The memory system provides persistent storage and semantic retrieval. Conversation history is stored as JSONL session files, while sessions, messages, and extracted facts are indexed in SQLite with vector embeddings for semantic search.
Overview
| Component | Location | Purpose |
|---|---|---|
| MemoryStore | memory/store.py | CRUD operations |
| Embeddings | memory/embeddings.py | Vector generation |
| Retrieval | memory/retrieval.py | Semantic search |
Configuration
```toml
[memory]
database_path = "~/.ash/memory.db"
max_context_messages = 20
context_token_budget = 100000
recency_window = 10
system_prompt_buffer = 8000
auto_gc = true
max_entries = null
```

Core Options
| Option | Type | Default | Description |
|---|---|---|---|
| database_path | path | "~/.ash/memory.db" | SQLite database path |
| max_context_messages | int | 20 | Maximum messages in context |
| context_token_budget | int | 100000 | Target context window size |
| recency_window | int | 10 | Always keep last N messages |
| system_prompt_buffer | int | 8000 | Reserved tokens for system prompt |
| auto_gc | bool | true | Run garbage collection on server startup |
| max_entries | int | null | Cap on active memories (null = unlimited) |
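As a quick illustration of how the defaults interact (illustrative arithmetic, not actual Ash source): the system prompt buffer is carved out of the overall budget before any history is fitted.

```python
# Illustrative arithmetic with the defaults above (not actual Ash source):
context_token_budget = 100_000
system_prompt_buffer = 8_000

# Tokens left for conversation history once the system prompt is reserved.
history_budget = context_token_budget - system_prompt_buffer
print(history_budget)  # 92000
```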
Compaction
When context grows too large, Ash summarizes old messages instead of dropping them:
```toml
[memory]
compaction_enabled = true
compaction_reserve_tokens = 16384
compaction_keep_recent_tokens = 20000
compaction_summary_max_tokens = 2000
```

| Option | Type | Default | Description |
|---|---|---|---|
| compaction_enabled | bool | true | Enable context compaction |
| compaction_reserve_tokens | int | 16384 | Buffer before triggering compaction |
| compaction_keep_recent_tokens | int | 20000 | Always keep recent context |
| compaction_summary_max_tokens | int | 2000 | Max tokens for summary |
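A minimal sketch of how these options plausibly fit together, assuming the token budget from the core options above (illustrative only, not the actual Ash implementation):

```python
def should_compact(history_tokens: int, cfg: dict) -> bool:
    """Illustrative trigger: compact once history encroaches on the
    reserve buffer kept ahead of the overall token budget."""
    if not cfg["compaction_enabled"]:
        return False
    limit = cfg["context_token_budget"] - cfg["compaction_reserve_tokens"]
    return history_tokens > limit
```

On that reading, messages older than the most recent compaction_keep_recent_tokens worth of context are summarized into at most compaction_summary_max_tokens tokens.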
Memory Extraction
Ash can automatically extract facts from conversations and store them as memories:
```toml
[memory]
extraction_enabled = true
extraction_model = null
extraction_min_message_length = 20
extraction_debounce_seconds = 30
extraction_confidence_threshold = 0.7
```

| Option | Type | Default | Description |
|---|---|---|---|
| extraction_enabled | bool | true | Enable auto memory extraction |
| extraction_model | string | null | Model for extraction (null = default) |
| extraction_min_message_length | int | 20 | Skip short messages |
| extraction_debounce_seconds | int | 30 | Min seconds between extractions |
| extraction_confidence_threshold | float | 0.7 | Minimum confidence for storing |
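Putting those gates together, the pre-extraction checks plausibly look like the following sketch (hypothetical helper names; not the actual Ash source):

```python
import time

def should_extract(message: str, last_extraction: float, cfg: dict) -> bool:
    """Illustrative gating for auto extraction: skip short messages
    and rate-limit runs via the debounce interval."""
    if not cfg["extraction_enabled"]:
        return False
    if len(message) < cfg["extraction_min_message_length"]:
        return False
    return time.time() - last_extraction >= cfg["extraction_debounce_seconds"]
```

After extraction, facts scoring below extraction_confidence_threshold are discarded rather than stored.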
Database Schema
Sessions
Conversations grouped by provider and chat:
```sql
CREATE TABLE sessions (
    id TEXT PRIMARY KEY,
    provider TEXT NOT NULL,
    chat_id TEXT NOT NULL,
    user_id TEXT NOT NULL,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    metadata JSON
);
```

Messages
Individual messages within sessions:
```sql
CREATE TABLE messages (
    id TEXT PRIMARY KEY,
    session_id TEXT REFERENCES sessions(id),
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP,
    token_count INTEGER,
    metadata JSON
);
```

Memories
Persistent knowledge entries:
```sql
CREATE TABLE memories (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT,
    created_at TIMESTAMP,
    expires_at TIMESTAMP,
    owner_user_id TEXT,
    metadata JSON
);
```

Vector Tables
Embeddings stored via sqlite-vec:
```sql
CREATE VIRTUAL TABLE message_embeddings USING vec0(
    message_id TEXT PRIMARY KEY,
    embedding FLOAT[1536]
);

CREATE VIRTUAL TABLE memory_embeddings USING vec0(
    memory_id TEXT PRIMARY KEY,
    embedding FLOAT[1536]
);
```

Context Management
Ash prunes conversation history to fit within token limits, applying three rules (sketched after the list):
- Recency window - Last N messages are always included
- Token budget - Older messages pruned to fit budget
- System prompt buffer - Space reserved for instructions
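A minimal sketch of that pruning order, assuming a token-counting helper and the option names from the configuration above (illustrative, not the actual Ash source):

```python
def prune(messages, count_tokens, cfg):
    """Illustrative pruning: keep the recency window, then fill what is
    left of the budget with older messages, newest first."""
    # The system prompt's reserved space comes off the top.
    budget = cfg["context_token_budget"] - cfg["system_prompt_buffer"]

    # Rule 1: the last N messages are always included.
    recent = cfg["recency_window"]
    kept = list(messages[-recent:])
    budget -= sum(count_tokens(m) for m in kept)

    # Rule 2: older messages are added newest-first while they fit.
    for msg in reversed(messages[:-recent]):
        cost = count_tokens(msg)
        if cost > budget or len(kept) >= cfg["max_context_messages"]:
            break
        kept.insert(0, msg)
        budget -= cost
    return kept
```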
During agent processing:
- Query embedding is generated
- Relevant memories are retrieved
- Memories are injected into system prompt
```python
memories = await retriever.retrieve(user_message, limit=5)
context = format_memories(memories)
system_prompt = f"{base_prompt}\n\nRelevant memories:\n{context}"
```

Components
Memory Store
Location: src/ash/memory/store.py
```python
class MemoryStore:
    async def add_memory(self, content: str, **metadata) -> Memory:
        """Store a new memory."""

    async def get_memory(self, memory_id: str) -> Memory | None:
        """Retrieve a memory by ID."""

    async def search_memories(
        self,
        query: str,
        limit: int = 10,
    ) -> list[Memory]:
        """Semantic search for relevant memories."""

    async def delete_memory(self, memory_id: str) -> bool:
        """Delete a memory."""
```
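For a feel of the interface, a hypothetical call sequence (the source keyword and the Memory.id attribute are assumptions based on the schema above, not confirmed API details):

```python
async def demo(store: MemoryStore) -> None:
    # Store a fact, find it again semantically, then remove it.
    memory = await store.add_memory("User prefers dark mode", source="chat")
    hits = await store.search_memories("UI preferences", limit=5)
    await store.delete_memory(memory.id)
```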
Embedding Generation
Location: src/ash/memory/embeddings.py
```python
class EmbeddingGenerator:
    async def embed(self, texts: list[str]) -> list[list[float]]:
        """Generate embeddings using configured model."""
```

Uses OpenAI’s embedding API via the LLM provider.
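A minimal usage sketch against this interface (hypothetical wrapper; assumes an already-constructed generator):

```python
async def embed_one(generator: EmbeddingGenerator, text: str) -> list[float]:
    # embed() takes a batch; wrap the single text and unwrap the result.
    [vector] = await generator.embed([text])
    return vector  # 1536 floats with text-embedding-3-small
```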
Semantic Search
Location: src/ash/memory/retrieval.py
```python
class MemoryRetriever:
    async def retrieve(
        self,
        query: str,
        limit: int = 5,
    ) -> list[Memory]:
        """Find memories similar to query."""
```

Uses sqlite-vec for vector similarity search:
```sql
SELECT m.*, vec_distance_cosine(e.embedding, ?) AS distance
FROM memories m
JOIN memory_embeddings e ON m.id = e.memory_id
ORDER BY distance ASC
LIMIT ?
```

CLI Commands
Managing Memories
```bash
# List stored memories
uv run ash memory list

# Search memories
uv run ash memory search -q "project ideas"

# Add a memory
uv run ash memory add -q "Remember to check logs daily"

# Remove a memory
uv run ash memory remove --id <uuid>

# View statistics
uv run ash memory stats

# Run garbage collection
uv run ash memory gc
```

Managing Sessions
```bash
# View sessions
uv run ash sessions list

# Search message history
uv run ash sessions search -q "keyword"

# Export a session
uv run ash sessions export -o backup.json
```

Database Operations
```bash
# Run migrations after updates
uv run ash db migrate

# Check migration status
uv run ash db status
```

Embeddings Configuration
Embeddings enable semantic search for memories and messages.
```toml
[embeddings]
provider = "openai"
model = "text-embedding-3-small"
```

| Option | Type | Default | Description |
|---|---|---|---|
| provider | string | "openai" | Embedding provider |
| model | string | "text-embedding-3-small" | Model name |
Supported Models
| Model | Dimensions | Notes |
|---|---|---|
| text-embedding-3-small | 1536 | Recommended, cost-effective |
| text-embedding-3-large | 3072 | Higher quality |
| text-embedding-ada-002 | 1536 | Legacy model |
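Note that the vec0 tables in the schema above declare FLOAT[1536], so the configured model's output dimension must match the embedding tables. A hedged sketch of such a guard (hypothetical helper, not actual Ash source):

```python
# Dimensions from the table above; the vec0 tables in the schema declare
# FLOAT[1536], so the model's output dimension has to match.
MODEL_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def check_dims(model: str, table_dims: int = 1536) -> None:
    """Fail fast if the configured model won't fit the embeddings schema."""
    if MODEL_DIMS.get(model, table_dims) != table_dims:
        raise ValueError(
            f"{model} produces {MODEL_DIMS[model]}-dim vectors, "
            f"but the embeddings tables expect {table_dims}"
        )
```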
Disabling Embeddings
Omit the [embeddings] section to disable semantic search:
```toml
# No [embeddings] section = disabled
```

Memory search will fall back to text matching.
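The doc does not specify the exact matching strategy; one plausible shape of that fallback, sketched with an aiosqlite-style connection (hypothetical, not actual Ash source):

```python
async def text_search(db, query: str, limit: int = 10):
    # Plain substring matching over memory content; no vectors involved.
    cursor = await db.execute(
        "SELECT * FROM memories WHERE content LIKE ? LIMIT ?",
        (f"%{query}%", limit),
    )
    return await cursor.fetchall()
```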
Full Example
```toml
[memory]
database_path = "~/.ash/memory.db"

# Context management
max_context_messages = 30
context_token_budget = 150000
recency_window = 15
system_prompt_buffer = 10000

# Compaction
compaction_enabled = true
compaction_reserve_tokens = 20000
compaction_keep_recent_tokens = 25000

# Extraction
extraction_enabled = true
extraction_confidence_threshold = 0.8
extraction_debounce_seconds = 60

# Maintenance
auto_gc = true
max_entries = 1000

[embeddings]
provider = "openai"
model = "text-embedding-3-small"
```