# Memory
Ash memory keeps conversation context usable across long chats and stores reusable facts when extraction is enabled.
## Memory in 30 Seconds
Memory in Ash has three layers:
- recent conversation context (always)
- structured memory entries in the graph store
- semantic retrieval via embeddings (optional)
Use defaults first. Tune only if context quality or extraction behavior needs adjustment.
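The three layers come together in a single assembly step; a minimal sketch of the idea (the function and layer names here are illustrative, not Ash's actual API):

```python
def assemble_context(recent_messages, retrieved_memories, summary=None):
    """Sketch of the three layers in prompt order: an optional compaction
    summary of older history, semantically retrieved memories, and the
    always-present recent messages."""
    parts = []
    if summary is not None:
        parts.append(("summary", summary))                        # compacted older history
    parts.extend(("memory", m) for m in retrieved_memories)       # layer 3 (optional)
    parts.extend(("message", m) for m in recent_messages)         # layer 1 (always)
    return parts
```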
## Quick Start
Start with this baseline:
```toml
[memory]
context_token_budget = 100000
recency_window = 10
compaction_enabled = true
extraction_enabled = true
extraction_confidence_threshold = 0.7
extraction_context_messages = 8
extraction_verification_enabled = true
```

Verify memory tools are working:
```shell
uv run ash memory stats
uv run ash memory list
uv run ash memory search "project ideas"
```

## Configure Memory
Common options:
```toml
[memory]
max_context_messages = 20       # Hard cap for recent message count
context_token_budget = 100000   # Target token budget for assembled context
recency_window = 10             # Always keep this many most-recent messages
system_prompt_buffer = 8000     # Reserved budget for prompt/tools metadata
compaction_enabled = true       # Summarize older history when over budget
auto_gc = true                  # Run cleanup on startup
max_entries = null              # Optional cap for total stored memories

# Background extraction
extraction_enabled = true
extraction_model = null         # Optional model alias override
extraction_min_message_length = 20
extraction_debounce_seconds = 30
extraction_context_messages = 8
extraction_confidence_threshold = 0.7

# Verification pass for extracted facts
extraction_verification_enabled = true
extraction_verification_model = null  # alias or provider model; null -> default
```

## Extraction Model Patterns
Use a cheaper extractor with stronger verification:
```toml
[models.default]
provider = "openai"
model = "gpt-5.2"

[models.fast]
provider = "openai"
model = "gpt-5.2-mini"

[memory]
extraction_model = "fast"
extraction_verification_enabled = true
extraction_verification_model = "default"
```

Use a provider-qualified verification model directly:
```toml
[memory]
extraction_verification_enabled = true
extraction_verification_model = "openai:gpt-5.2"
```

Use an unqualified provider model name (resolved against the default provider):
```toml
[memory]
extraction_verification_enabled = true
extraction_verification_model = "gpt-5.2"
```

## Enable Semantic Retrieval (Embeddings)
```toml
[embeddings]
provider = "openai"
model = "text-embedding-3-small"
```

## Troubleshooting
### Memory search returns nothing
```shell
uv run ash memory list
uv run ash memory search "your query"
uv run ash memory stats
```

If the store is empty, add a test fact and search again.
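If entries exist but searches still come back empty, keep in mind that semantic search ranks by embedding similarity, so the query has to be semantically close to a stored fact. A minimal sketch of similarity ranking (assuming cosine similarity over embedding vectors; Ash's internals may differ):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_memories(query_vec, entries, k=3):
    """entries: (memory_text, embedding_vector) pairs; returns the top-k texts."""
    ranked = sorted(entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```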
### Extraction seems too noisy
Raise confidence threshold or reduce extraction frequency:
```toml
[memory]
extraction_confidence_threshold = 0.85
extraction_debounce_seconds = 60
```

### Important context gets dropped
Increase budget and recency window:
```toml
[memory]
context_token_budget = 140000
recency_window = 15
```

### Memory quality regresses after model changes
Pin extraction and verification models explicitly:
```toml
[memory]
extraction_model = "fast"        # alias defined under [models.fast]
extraction_verification_model = "default"
```

## Reference (Advanced)
High-level runtime flow:
- Incoming message is written to context.
- History is selected with recency and token-budget rules.
- Semantic retrieval (if enabled) injects relevant stored memories.
- Background extraction can write new structured memories.
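The history-selection step above (recency plus token budget) can be sketched as follows; this is an illustration of the configured behavior, not Ash's actual implementation:

```python
def select_history(messages, token_counts, recency_window=10, token_budget=100000):
    """Keep the newest `recency_window` messages unconditionally, then add
    older messages (newest first) until the token budget is exhausted.
    Anything older than the cutoff is what compaction would summarize."""
    kept = list(messages[-recency_window:])
    used = sum(token_counts[-recency_window:])
    older = zip(reversed(messages[:-recency_window]),
                reversed(token_counts[:-recency_window]))
    for msg, tokens in older:
        if used + tokens > token_budget:
            break
        kept.insert(0, msg)
        used += tokens
    return kept
```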
CLI surface:
```shell
uv run ash memory list
uv run ash memory search "..."
uv run ash memory show <id>
uv run ash memory add -q "..."
uv run ash memory remove <id>
uv run ash memory gc
uv run ash memory stats
```
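The three model-reference forms shown under Extraction Model Patterns (alias, provider-qualified, bare name) could resolve along these lines; a sketch under assumed semantics, not Ash's actual resolver:

```python
def resolve_model(ref, aliases, default_provider):
    """Resolve a model reference to a (provider, model) pair.
    `aliases` maps names from [models.*] sections to (provider, model)."""
    if ref in aliases:                    # alias, e.g. "fast"
        return aliases[ref]
    if ":" in ref:                        # provider-qualified, e.g. "openai:gpt-5.2"
        provider, model = ref.split(":", 1)
        return provider, model
    return default_provider, ref          # bare name, resolved against the default provider
```

For example, with `aliases = {"fast": ("openai", "gpt-5.2-mini")}`, the references `"fast"`, `"openai:gpt-5.2"`, and `"gpt-5.2"` all resolve to concrete provider/model pairs.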