```bash
npm install thoughtlayer
```
ThoughtLayer is a memory layer that actually retrieves. Vector search, keyword search, freshness decay, importance scoring. Local SQLite, no cloud required. Works with any LLM.
```js
// Every session starts from zero
context = loadAllFiles()  // 847 files × ~200 tokens each
                          // = 169,400 tokens per query
                          // = $0.42 per question

// At 10K files: breaks entirely
agent.ask("what database are we using?")
// Timeout. Context too large.
```
```js
// Retrieves exactly what's needed
results = thoughtlayer.query("what database are we using?")
// Vector + keyword search
// 3 results × ~100 tokens = 300 tokens per query
// = $0.0006 per question
// Works at 10K, 100K, 1M entries

→ "Postgres with pgvector, chosen for JSON + embeddings"
  score: 0.89 (vec:0.72, fts:0.95)
```
Each retrieval stage (vector similarity, keyword match, freshness decay, importance scoring) contributes a different signal. The combination outperforms any single approach.
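As a rough sketch of how such signals can combine into one ranking score: the weights, the 90-day half-life, and the `hybridScore` helper below are illustrative assumptions for this example, not ThoughtLayer's actual internals.

```typescript
// Illustrative hybrid scoring: blend vector similarity and keyword (FTS)
// relevance, then modulate by freshness decay and importance.
// All weights here are assumptions, not ThoughtLayer's real formula.
type Entry = {
  vecScore: number;   // cosine similarity, 0..1
  ftsScore: number;   // normalized keyword relevance, 0..1
  ageDays: number;    // days since the entry was written
  importance: number; // curator-assigned weight, 0..1
};

function hybridScore(e: Entry): number {
  const freshness = Math.exp((-Math.LN2 * e.ageDays) / 90); // 90-day half-life
  const base = 0.5 * e.vecScore + 0.5 * e.ftsScore;         // blend both searches
  return base * (0.7 + 0.3 * freshness) * (0.5 + 0.5 * e.importance);
}

// A recent, important entry outranks a stale one with identical text scores.
const fresh = hybridScore({ vecScore: 0.72, ftsScore: 0.95, ageDays: 3, importance: 0.9 });
const stale = hybridScore({ vecScore: 0.72, ftsScore: 0.95, ageDays: 400, importance: 0.2 });
console.log(fresh > stale); // true
```

The multiplicative form means a perfect keyword hit can still be demoted if the entry is stale and low-importance, which is the behavior a memory layer generally wants.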
Public dataset. 50 entries, 40 queries, ground-truth labels. Reproduce it yourself.
Verified March 2026. ThoughtLayer runs on your machine; you only pay for embedding API calls.
| Provider | Cost | What you get |
|---|---|---|
| ThoughtLayer | ~$0.002/query | Unlimited entries. Runs locally. You own the data. |
| Mem0 Starter | $19/mo | 50K memories, 5K retrievals/mo, cloud-only |
| Mem0 Pro | $249/mo | Unlimited memories, 50K retrievals/mo |
| Zep Flex | $475/mo | 300K credits, cloud-only |
Three commands. Real results.
Three integration paths. Pick the one that fits your stack.
Persistent memory across agent sessions. Agents query ThoughtLayer for context before handling tasks, and auto-curate knowledge from conversations. Memory survives compaction, restarts, and session boundaries.
```bash
# In your agent's workspace
thoughtlayer init

thoughtlayer add --domain decisions \
  --title "Database Choice" \
  "Team decided on Postgres with pgvector \
for the v2 rewrite. No more MongoDB."

# Agent queries before responding
thoughtlayer query "what database are we using"
→ Database Choice (score: 0.91)
```
Drop bin/thoughtlayer-query into your agent workspace. Agents call it when they need context for a task, not on every turn. Saves tokens, returns only what's relevant.
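One way a Node-based agent might wire this in — a minimal sketch; the `getContext` helper and its prompt usage are illustrative, not part of ThoughtLayer's API:

```typescript
// Call the bundled query script only when the agent decides it needs
// context, rather than loading everything on every turn.
import { execFileSync } from "node:child_process";

function getContext(question: string): string {
  // bin/thoughtlayer-query prints the top-scoring entries to stdout
  return execFileSync("bin/thoughtlayer-query", [question], {
    encoding: "utf8",
  }).trim();
}

// e.g. before answering an infrastructure question:
// const context = getContext("what database are we using?");
// const prompt = `Context:\n${context}\n\nQuestion: ...`;
```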
Six tools exposed via the Model Context Protocol. Claude and Cursor can query, add, and curate knowledge directly. Your AI assistant remembers everything you've told it across sessions.
```jsonc
// claude_desktop_config.json
{
  "mcpServers": {
    "thoughtlayer": {
      "command": "thoughtlayer-mcp",
      "env": {
        "THOUGHTLAYER_PROJECT_ROOT": "/path/to/project",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```
Tools: thoughtlayer_query, thoughtlayer_add, thoughtlayer_curate, thoughtlayer_search, thoughtlayer_list, thoughtlayer_health. Resources expose all entries as browsable thoughtlayer:// URIs.
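Under MCP's JSON-RPC framing, a client invokes one of these tools with a `tools/call` request. The argument shape below is an assumption for illustration; the actual tool schema is whatever the server advertises via `tools/list`:

```jsonc
// Example request an MCP client might send (argument shape is illustrative)
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "thoughtlayer_query",
    "arguments": { "query": "what database are we using" }
  }
}
```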
Programmatic API for custom agent frameworks. LangChain, AutoGen, CrewAI, or your own. Import, initialise, query.
```typescript
import { ThoughtLayer } from 'thoughtlayer';

const memory = ThoughtLayer.load('.');

// Retrieve context for a task
const results = await memory.query(
  "current infrastructure setup"
);

// Auto-curate from conversation
await memory.curate(
  "Team decided we're using Postgres \
for the new API backend."
);
```
Also works as a CLI: thoughtlayer query, thoughtlayer add, thoughtlayer curate. Same engine, different interface.
The engine is free forever. Cloud adds team features and hosted infrastructure.