Memory
Clawdia memory is plain Markdown in the agent workspace. The files are the source of truth; the model only “remembers” what gets written to disk. Memory search tools are provided by the active memory plugin (default: `memory-core`). Disable memory plugins with `plugins.slots.memory = "none"`.
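For example, a minimal sketch of opting out entirely (the surrounding file layout is inferred from the dotted config paths used throughout this page):

```jsonc
{
  "plugins": {
    "slots": {
      // replaces the default memory-core plugin with nothing
      "memory": "none"
    }
  }
}
```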
Memory files (Markdown)
The default workspace layout uses two memory layers:

- `memory/YYYY-MM-DD.md`: daily log (append-only). Read today + yesterday at session start.
- `MEMORY.md` (optional): curated long-term memory. Only loaded in the main, private session (never in group contexts).

Memory files live in the agent workspace (`agents.defaults.workspace`, default `~/clawd`). See Agent workspace for the full layout.
When to write memory
- Decisions, preferences, and durable facts go to `MEMORY.md`.
- Day-to-day notes and running context go to `memory/YYYY-MM-DD.md`.
- If someone says “remember this,” write it down (do not keep it in RAM).
- This area is still evolving. It helps to remind the model to store memories; it will know what to do.
- If you want something to stick, ask the bot to write it into memory.
Automatic memory flush (pre-compaction ping)
When a session is close to auto-compaction, Clawdia triggers a silent, agentic turn that reminds the model to write durable memory before the context is compacted. The default prompts explicitly say the model may reply, but usually `NO_REPLY` is the correct response, so the user never sees this turn.

This is controlled by `agents.defaults.compaction.memoryFlush`:

- Soft threshold: the flush triggers when the session token estimate crosses `contextWindow - reserveTokensFloor - softThresholdTokens`.
- Silent by default: prompts include `NO_REPLY` so nothing is delivered.
- Two prompts: a user prompt plus a system-prompt append carry the reminder.
- One flush per compaction cycle (tracked in `sessions.json`).
- Workspace must be writable: if the session runs sandboxed with `workspaceAccess: "ro"` or `"none"`, the flush is skipped.
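A hedged sketch of where this lives in config; the key names inside `memoryFlush` (`enabled`, `softThresholdTokens`) are assumptions inferred from the threshold formula above:

```jsonc
{
  "agents": {
    "defaults": {
      "compaction": {
        "memoryFlush": {
          // assumed keys, named after the threshold formula above
          "enabled": true,
          "softThresholdTokens": 4000
        }
      }
    }
  }
}
```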
Vector memory search
Clawdia can build a small vector index over `MEMORY.md` and `memory/*.md` so semantic queries can find related notes even when wording differs.
Defaults:
- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings by default. If `memorySearch.provider` is not set, Clawdia auto-selects:
  - `local` if a `memorySearch.local.modelPath` is configured and the file exists.
  - `openai` if an OpenAI key can be resolved.
  - `gemini` if a Gemini key can be resolved.
  - Otherwise memory search stays disabled until configured.
- Local mode uses node-llama-cpp and may require `pnpm approve-builds`.
- Uses sqlite-vec (when available) to accelerate vector search inside SQLite.
API keys are resolved from `models.providers.*.apiKey` or environment variables. Codex OAuth only covers chat/completions and does not satisfy embeddings for memory search. For Gemini, use `GEMINI_API_KEY` or `models.providers.google.apiKey`. When using a custom OpenAI-compatible endpoint, set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
Gemini embeddings (native)
Set the provider to `gemini` to use the Gemini embeddings API directly:
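A minimal sketch (nesting inferred from the dotted paths in this doc; the `baseUrl` and header values are illustrative and optional):

```jsonc
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "gemini",
        "remote": {
          // optional; defaults to the Gemini API base URL
          "baseUrl": "https://generativelanguage.googleapis.com",
          // optional extra headers
          "headers": { "x-goog-user-project": "my-project" }
        }
      }
    }
  }
}
```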
- `remote.baseUrl` is optional (defaults to the Gemini API base URL).
- `remote.headers` lets you add extra headers if needed.
- Default model: `gemini-embedding-001`.
OpenAI embeddings
Example remote configuration with the OpenAI provider:
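A minimal sketch (the key value is illustrative; it can also be resolved from `models.providers.openai.apiKey` or the environment):

```jsonc
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "openai",
        "remote": {
          // illustrative; prefer env vars or models.providers.openai.apiKey
          "apiKey": "sk-..."
        }
      }
    }
  }
}
```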
If you don’t want remote OpenAI calls, set `memorySearch.provider = "local"` or set `memorySearch.fallback = "none"`.
Fallbacks:

- `memorySearch.fallback` can be `openai`, `gemini`, `local`, or `none`.
- The fallback provider is only used when the primary embedding provider fails.
Batch embeddings

- Enabled by default for OpenAI and Gemini embeddings. Set `agents.defaults.memorySearch.remote.batch.enabled = false` to disable.
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
- Batch mode applies when `memorySearch.provider = "openai"` or `"gemini"` and uses the corresponding API key.
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.
- OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously.
- See the OpenAI Batch API docs and pricing for details.
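A sketch of the batch knobs named above (nesting inferred from the dotted paths; values are illustrative apart from the documented defaults):

```jsonc
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "remote": {
          "batch": {
            "enabled": true,
            // wait for batch completion (default behavior)
            "wait": true,
            "pollIntervalMs": 30000,
            "timeoutMinutes": 60,
            // parallel batch job submissions (default: 2)
            "concurrency": 2
          }
        }
      }
    }
  }
}
```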
Memory tools

- `memory_search`: returns snippets with file + line ranges.
- `memory_get`: read memory file content by path.
Forcing local embeddings

- Set `agents.defaults.memorySearch.provider = "local"`.
- Provide `agents.defaults.memorySearch.local.modelPath` (GGUF or `hf:` URI).
- Optional: set `agents.defaults.memorySearch.fallback = "none"` to avoid remote fallback.
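Put together, a minimal sketch (the model path shown is the documented default):

```jsonc
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "local",
        "local": {
          // GGUF path or hf: URI
          "modelPath": "hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf"
        },
        // avoid remote fallback
        "fallback": "none"
      }
    }
  }
}
```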
How the memory tools work
- `memory_search` semantically searches Markdown chunks (~400-token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped ~700 chars), file path, line range, score, provider/model, and whether we fell back from local → remote embeddings. No full file payload is returned.
- `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md`/`memory/` are rejected.
- Both tools are enabled only when `memorySearch.enabled` resolves true for the agent.
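For illustration, a hypothetical `memory_get` call; the argument names are assumptions (the doc only specifies a path, a starting line, and a line count):

```jsonc
{
  "tool": "memory_get",
  // workspace-relative; paths outside MEMORY.md / memory/ are rejected
  "path": "memory/2026-01-10.md",
  // hypothetical names for "from a starting line, for N lines"
  "startLine": 40,
  "lineCount": 20
}
```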
What gets indexed (and when)
- File type: Markdown only (`MEMORY.md`, `memory/**/*.md`).
- Index storage: per-agent SQLite at `~/.clawdia/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports `{agentId}` token).
- Freshness: a watcher on `MEMORY.md` + `memory/` marks the index dirty (debounce 1.5s). Sync is scheduled on session start, on search, or on an interval, and runs asynchronously. Session transcripts use delta thresholds to trigger background sync.
- Reindex triggers: the index stores the embedding provider/model + endpoint fingerprint + chunking params. If any of those change, Clawdia automatically resets and reindexes the entire store.
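For example, to relocate the per-agent index (`{agentId}` is the documented token; the path itself is illustrative):

```jsonc
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "store": {
          // {agentId} expands to the agent's id
          "path": "/data/clawdia-memory/{agentId}.sqlite"
        }
      }
    }
  }
}
```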
Hybrid search (BM25 + vector)
When enabled, Clawdia combines:

- Vector similarity (semantic match, wording can differ)
- BM25 keyword relevance (exact tokens like IDs, env vars, code symbols)
Why hybrid?
Vector search is great at “this means the same thing”:

- “Mac Studio gateway host” vs “the machine running the gateway”
- “debounce file updates” vs “avoid indexing on every write”

BM25 keyword search is great at exact tokens:

- IDs (`a828e60`, `b3b9895a`…)
- code symbols (`memorySearch.query.hybrid`)
- error strings (“sqlite-vec unavailable”)
How we merge results (the current design)
Implementation sketch:

- Retrieve a candidate pool from both sides:
  - Vector: top `maxResults * candidateMultiplier` by cosine similarity.
  - BM25: top `maxResults * candidateMultiplier` by FTS5 BM25 rank (lower is better).
- Convert BM25 rank into a 0..1-ish score: `textScore = 1 / (1 + max(0, bm25Rank))`
- Union candidates by chunk id and compute a weighted score: `finalScore = vectorWeight * vectorScore + textWeight * textScore`
- `vectorWeight` + `textWeight` is normalized to 1.0 in config resolution, so weights behave as percentages.
- If embeddings are unavailable (or the provider returns a zero-vector), we still run BM25 and return keyword matches.
- If FTS5 can’t be created, we keep vector-only search (no hard failure).
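A hedged sketch of the tuning knobs; placing them under `memorySearch.query.hybrid` is an assumption based on the code symbol quoted earlier:

```jsonc
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "query": {
          "hybrid": {
            // normalized to sum to 1.0, so these behave as percentages
            "vectorWeight": 0.7,
            "textWeight": 0.3,
            // candidate pool = maxResults * candidateMultiplier per side
            "candidateMultiplier": 4
          }
        }
      }
    }
  }
}
```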
Embedding cache
Clawdia can cache chunk embeddings in SQLite so reindexing and frequent updates (especially session transcripts) don’t re-embed unchanged text.

Session memory search (experimental)
You can optionally index session transcripts and surface them via `memory_search`. This is gated behind an experimental flag.
- Session indexing is opt-in (off by default).
- Session updates are debounced and indexed asynchronously once they cross delta thresholds (best-effort).
- `memory_search` never blocks on indexing; results can be slightly stale until background sync finishes.
- Results still include snippets only; `memory_get` remains limited to memory files.
- Session indexing is isolated per agent (only that agent’s session logs are indexed).
- Session logs live on disk (`~/.clawdia/agents/<agentId>/sessions/*.jsonl`). Any process/user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts.
SQLite vector acceleration (sqlite-vec)
When the sqlite-vec extension is available, Clawdia stores embeddings in a SQLite virtual table (vec0) and performs vector distance queries in the
database. This keeps search fast without loading every embedding into JS.
Configuration (optional):
- `enabled` defaults to true; when disabled, search falls back to in-process cosine similarity over stored embeddings.
- If the sqlite-vec extension is missing or fails to load, Clawdia logs the error and continues with the JS fallback (no vector table).
- `extensionPath` overrides the bundled sqlite-vec path (useful for custom builds or non-standard install locations).
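A hedged sketch; the parent block for these two keys is an assumption (shown here under `memorySearch.store.vector`), and the path is illustrative:

```jsonc
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "store": {
          // assumed nesting for the sqlite-vec options named above
          "vector": {
            "enabled": true,
            // override the bundled sqlite-vec extension path
            "extensionPath": "/usr/local/lib/vec0.dylib"
          }
        }
      }
    }
  }
}
```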
Local embedding auto-download
- Default local embedding model: `hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf` (~0.6 GB).
- When `memorySearch.provider = "local"`, node-llama-cpp resolves `modelPath`; if the GGUF is missing it auto-downloads to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry.
- Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`.
- Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason.
Custom OpenAI-compatible endpoint example
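A sketch, assuming an OpenAI-compatible embeddings server (URL, key, and header values are illustrative):

```jsonc
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "openai",
        "remote": {
          // any OpenAI-compatible embeddings endpoint
          "baseUrl": "http://localhost:8080/v1",
          "apiKey": "sk-local-example",
          // merged with the OpenAI defaults; remote wins on conflicts
          "headers": { "x-example-tenant": "demo" }
        }
      }
    }
  }
}
```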
- `remote.*` takes precedence over `models.providers.openai.*`.
- `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.
