What Is Hybrid Memory Search?
OpenClaw agents can search their own memory notes using hybrid search, a combination of:
- Vector search (semantic): Finds notes with similar meaning using sentence embeddings
- Text search (lexical): Finds notes with matching keywords using full-text indexing
By blending both approaches, hybrid search retrieves results that are both semantically relevant and keyword-precise. This is critical for long-running agents that accumulate knowledge across many sessions.
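Conceptually, the blend is a weighted sum of the per-note scores from each path. Here is a minimal sketch with made-up scores (not OpenClaw's actual implementation), assuming both paths normalize scores to [0, 1]:

```python
def hybrid_rank(vector_scores, text_scores, vector_weight=0.7, text_weight=0.3):
    """Blend per-note scores from the vector and text paths into one ranking.

    A note missing from one path simply contributes 0 from that path.
    """
    notes = set(vector_scores) | set(text_scores)
    blended = {
        note: vector_weight * vector_scores.get(note, 0.0)
              + text_weight * text_scores.get(note, 0.0)
        for note in notes
    }
    return sorted(blended.items(), key=lambda kv: kv[1], reverse=True)

# Toy scores: note-A is semantically close, note-B is a keyword match
ranking = hybrid_rank(
    vector_scores={"note-A": 0.9, "note-B": 0.2},
    text_scores={"note-A": 0.1, "note-B": 0.8},
)
print(ranking)  # note-A wins: 0.7*0.9 + 0.3*0.1 = 0.66 vs 0.38 for note-B
```

With a 0.7/0.3 split, the semantically similar note outranks the pure keyword match; flipping the weights would reverse the order.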
Architecture Overview
```
Agent query → Hybrid search engine
├── Vector path (0.7 weight)
│   ├── all-MiniLM-L6-v2 embeddings
│   └── Cosine similarity ranking
└── Text path (0.3 weight)
    └── Full-text keyword matching
↓ Merged & re-ranked candidates
↓ Top results returned to agent
```

Configuration Walkthrough
All settings live under `agents.defaults.memorySearch`. Here's the complete setup from a real Azure deployment:
Step 1: Set the Provider to Local
```
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.provider local
```

The local provider runs embeddings on the gateway container itself: no external API calls, no data leaving your VM.
Step 2: Choose the Embedding Model
```
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.model all-MiniLM-L6-v2
```

`all-MiniLM-L6-v2` is a sentence-transformer model optimized for:
- Fast inference on CPU (important when running on an Azure B2s VM)
- 384-dimensional embeddings (compact vector size)
- Good quality for short-to-medium text passages
Step 3: Set the Model Path
```
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.local.modelPath \
  sentence-transformers/all-MiniLM-L6-v2
```

This tells the local provider where to find (or download) the model. On first run, the gateway downloads it from the Hugging Face model hub.
Step 4: Enable Hybrid Search
```
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.query.hybrid.enabled true
```

Step 5: Tune the Search Weights
The vector and text weights control how much each search path contributes to the final ranking:
```
# Semantic similarity gets 70% weight
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.query.hybrid.vectorWeight 0.7

# Keyword matching gets 30% weight
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.query.hybrid.textWeight 0.3
```

| Weight Split | Best For |
|---|---|
| 0.9 vector / 0.1 text | Conversational queries, fuzzy recall |
| 0.7 vector / 0.3 text | General-purpose (recommended) |
| 0.5 vector / 0.5 text | Balanced, when exact terms matter |
| 0.3 vector / 0.7 text | Technical docs with specific jargon |
Step 6: Set the Candidate Multiplier
```
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.query.hybrid.candidateMultiplier 4
```

The candidate multiplier controls how many raw candidates each search path retrieves before fusion. With a multiplier of 4, a request for 10 results makes each path fetch 40 candidates, giving the re-ranker more material to work with.
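The over-fetch-then-trim behavior can be sketched as follows (illustrative only; the function names here are made up, not OpenClaw internals):

```python
def fetch_candidates(requested, multiplier, vector_search, text_search, rerank):
    """Over-fetch from each path, merge the pools, re-rank, then trim."""
    pool_size = requested * multiplier          # e.g. 10 * 4 = 40 per path
    candidates = set(vector_search(pool_size)) | set(text_search(pool_size))
    return rerank(candidates)[:requested]

# Toy search paths that return note IDs; "re-ranking" here is plain sorting
vector_search = lambda n: [f"note-{i}" for i in range(n)]
text_search = lambda n: [f"note-{i}" for i in range(5, 5 + n)]

top = fetch_candidates(10, 4, vector_search, text_search, rerank=sorted)
print(len(top))  # 10
```

A larger multiplier widens the pool each path contributes before fusion, which improves recall at the cost of more work per query.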
Step 7: Enable the Embedding Cache
```
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.cache.enabled true
docker compose run --rm openclaw-cli config set \
  agents.defaults.memorySearch.cache.maxEntries 50000
```

The cache stores computed embeddings so identical queries don't need re-embedding. With 50,000 entries and 384-dimensional vectors, the cache uses roughly 50,000 × 384 × 4 bytes ≈ 73 MB. That's well within the 2 GB RAM of an Azure B2s VM.
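The estimate is simple arithmetic, assuming 4-byte float32 values (the typical storage format for sentence-transformer embeddings):

```python
entries = 50_000
dims = 384            # all-MiniLM-L6-v2 output size
bytes_per_value = 4   # float32

cache_bytes = entries * dims * bytes_per_value
print(cache_bytes)          # 76800000 bytes
print(cache_bytes / 2**20)  # ~73.2 MiB
```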
Step 8: Restart the Gateway
```
docker compose restart openclaw-gateway
```

Gateway Log Verification
After applying all settings, check the gateway logs for config reload confirmations:
```
docker logs openclaw-openclaw-gateway-1 | grep -i "memorySearch"
```

You should see sequential reload entries for each setting:
```
[reload] config change detected; evaluating reload
  (meta.lastTouchedAt, agents.defaults.memorySearch)
[reload] config change applied (dynamic reads:
  meta.lastTouchedAt, agents.defaults.memorySearch)
[reload] config change detected; evaluating reload
  (meta.lastTouchedAt, agents.defaults.memorySearch.model)
[reload] config change applied (dynamic reads:
  meta.lastTouchedAt, agents.defaults.memorySearch.model)
[reload] config change detected; evaluating reload
  (meta.lastTouchedAt, agents.defaults.memorySearch.local)
...
```

Each pair confirms the gateway detected and applied the change dynamically.
Full Config JSON
After completing all steps, the config section looks like:
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "local",
        "model": "all-MiniLM-L6-v2",
        "local": {
          "modelPath": "sentence-transformers/all-MiniLM-L6-v2"
        },
        "query": {
          "hybrid": {
            "enabled": true,
            "vectorWeight": 0.7,
            "textWeight": 0.3,
            "candidateMultiplier": 4
          }
        },
        "cache": {
          "enabled": true,
          "maxEntries": 50000
        }
      }
    }
  }
}
```

Understanding the all-MiniLM-L6-v2 Model
| Property | Value |
|---|---|
| Architecture | 6-layer MiniLM (distilled BERT) |
| Output dimensions | 384 |
| Max sequence length | 256 tokens |
| Model size | ~80 MB |
| Speed | ~14,000 sentences/sec on GPU (V100); lower but still practical on CPU |
| Quality | Competitive with larger models for retrieval tasks |
The model is downloaded once and cached inside the Docker container's filesystem. On an Azure B2s VM with 2 vCPUs, expect first-run download to take 30–60 seconds.
Performance Tuning Tips
For Small Memory Stores (< 1,000 notes)
- `candidateMultiplier: 2` is sufficient
- The cache may not provide significant benefit
- Text weight can be higher for precision
For Large Memory Stores (> 10,000 notes)
- `candidateMultiplier: 4`–`8` improves recall
- Enable the cache with a generous `maxEntries`
- Higher vector weight helps with semantic similarity
Memory Usage on Azure B2s
| Component | Approximate RAM |
|---|---|
| Gateway base | ~200 MB |
| MiniLM-L6-v2 model | ~80 MB |
| Embedding cache (50K entries) | ~73 MB |
| SQLite index | ~10β50 MB |
| Total | ~363β403 MB |
This leaves roughly 1.6 GB for the OS and other containers: comfortable, but monitor with `docker stats`.
Troubleshooting
Model download fails:
- Check internet connectivity from the container: `docker exec -it openclaw-openclaw-gateway-1 sh -lc 'wget -q --spider https://huggingface.co && echo OK'`
- Verify DNS resolution works inside the container
- Pre-download the model and mount it as a Docker volume
Embeddings seem wrong:
- Ensure `model` and `local.modelPath` match (both should reference `all-MiniLM-L6-v2`)
- Clear the cache and restart: `docker compose run --rm openclaw-cli config set agents.defaults.memorySearch.cache.enabled false`, then re-enable it
Search returns no results:
- Verify memory notes exist: `docker exec -it openclaw-openclaw-gateway-1 sh -lc 'ls -la /home/node/.openclaw/memory/notes'`
- Check that the SQLite database was created (see the memory store article)

