Authentication
All API requests except GET /v1/health require authentication with your secret API key. Include it in the Authorization header as a Bearer token:
Authorization: Bearer tb-your_api_key_here
Alternatively, you can use the x-api-key header:
x-api-key: tb-your_api_key_here
Base URL for all endpoints:
https://onboarding.tokyobrain.ai
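The two header styles above can be sketched in Python using only the standard library. This is a minimal illustration, not the SDK's implementation: `build_request` is a hypothetical helper name, and the request is built but never sent.

```python
import urllib.request

BASE_URL = "https://onboarding.tokyobrain.ai"


def build_request(path, api_key, use_x_api_key=False):
    """Build (but do not send) an authenticated request for the given path."""
    if use_x_api_key:
        # Alternative header style
        headers = {"x-api-key": api_key}
    else:
        # Default: Bearer token in the Authorization header
        headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(BASE_URL + path, headers=headers)


req = build_request("/v1/store", "tb-your_api_key_here")
```

Passing `req` to `urllib.request.urlopen` would send it; either header style should authenticate the same key.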
Core Endpoints
POST /v1/store
Saves a new memory or system state into the agent's dedicated namespace.
Request Body:
{
  "document": "Completed the integration of the Stripe payment module.",
  "track": "history",
  "metadata": {
    "project": "myapp",
    "priority": "high"
  }
}
| Parameter | Type | Required | Description |
|---|---|---|---|
| document | string | Yes | The memory text to store |
| track | string | No | "state" (always loaded at session start) or "history" (loaded via search). Default: "history" |
| metadata | object | No | Arbitrary key-value tags for filtering and boosting |
| sync | boolean | No | Set to true for immediate confirmation. Default: false (async) |
Response (202 Accepted):
{
  "ok": true,
  "id": "01JQXYZ123ABC",
  "status": "queued"
}
Writes are asynchronous by default for performance: the API queues the write and returns 202 with status "queued". Send "sync": true to wait for immediate confirmation.
cURL example:
curl -X POST https://onboarding.tokyobrain.ai/v1/store \
  -H "Authorization: Bearer tb-your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "document": "Project uses FastAPI + PostgreSQL",
    "track": "state",
    "metadata": {"project": "myapp", "priority": "high"}
  }'
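For callers who are not using cURL or the SDK, the same store request might be assembled in plain Python as follows. This is a sketch under assumptions: `build_store_request` is a hypothetical helper, and the validation of `track` simply mirrors the two values listed in the parameter table.

```python
import json
import urllib.request

BASE_URL = "https://onboarding.tokyobrain.ai"


def build_store_request(api_key, document, track="history", metadata=None, sync=False):
    """Build a POST /v1/store request; the caller decides when to send it."""
    if track not in ("state", "history"):
        raise ValueError('track must be "state" or "history"')
    body = {"document": document, "track": track, "sync": sync}
    if metadata:
        body["metadata"] = metadata
    return urllib.request.Request(
        BASE_URL + "/v1/store",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


store_req = build_store_request(
    "tb-your_api_key_here",
    "Project uses FastAPI + PostgreSQL",
    track="state",
    metadata={"project": "myapp", "priority": "high"},
    sync=True,  # ask for immediate confirmation instead of a queued write
)
```

Sending it with `urllib.request.urlopen(store_req)` and decoding the response with `json.load` should yield the response body shown above.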
POST /v1/recall
Retrieves semantically relevant memories based on the agent's current context. Queries pass through the full 10-layer recall pipeline.
Request Body:
{
  "query": "What payment gateways have we discussed?",
  "format": "structured",
  "topK": 5
}
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Natural language search query |
| format | string | No | "raw" (default JSON) or "structured" (human-readable with timestamps) |
| topK | number | No | Max results to return. Default: 15 |
| collections | string[] | No | Specific collections to search. Default: all |
Response -- format: "structured" (200 OK):
{
  "results": [
    "[2026-04-03 | myapp | high] Completed Stripe payment module integration.",
    "[2025-12-01 | myapp | medium] Discussed PayPal but decided to hold off."
  ],
  "_latency": 234
}
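If you need the fields of a structured result back as data, a small parser can split each line. The line shape used here, "[date | project | priority] text", is inferred from the sample response only and is an assumption, not a documented contract; `parse_structured` is a hypothetical helper.

```python
import re

# Assumed line shape, inferred from the sample response:
# "[YYYY-MM-DD | project | priority] text"
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2}) \| ([^|\]]+) \| ([^|\]]+)\] (.*)$")


def parse_structured(line):
    """Split one structured result into fields, or return None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    date, project, priority, text = match.groups()
    return {
        "date": date,
        "project": project.strip(),
        "priority": priority.strip(),
        "text": text,
    }


sample = "[2026-04-03 | myapp | high] Completed Stripe payment module integration."
parsed = parse_structured(sample)
```

Lines that do not fit the assumed shape return None, so callers can fall back to treating them as plain text. When you need stable fields, the "raw" format is likely the safer choice.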
Response -- format: "raw" (200 OK):
{
  "hot": [],
  "warm": [
    {
      "id": "01JQXYZ123ABC",
      "document": "Completed Stripe payment module integration.",
      "metadata": {
        "ts": 1743638400000,
        "project": "myapp",
        "priority": "high"
      },
      "distance": 0.12,
      "collection": "nexus_knowledge"
    }
  ],
  "_latency": 234
}
Memories tagged with quality: "curated" or priority: "highest" get a 45% distance reduction. Memories tagged with priority: "high" get a 30% reduction.
cURL example:
curl -X POST https://onboarding.tokyobrain.ai/v1/recall \
  -H "Authorization: Bearer tb-your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what framework does this project use?",
    "format": "structured",
    "topK": 3
  }'
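The metadata boosts noted for /v1/recall (45% for curated or highest-priority memories, 30% for high-priority ones) can be illustrated with a little arithmetic. The sketch below assumes the reduction scales the distance multiplicatively and that the larger boost wins when both apply; the documentation does not spell out either detail, and `boosted_distance` is a hypothetical name.

```python
def boosted_distance(distance, metadata):
    """Apply the documented reductions, assuming they scale the distance multiplicatively."""
    if metadata.get("quality") == "curated" or metadata.get("priority") == "highest":
        return distance * (1 - 0.45)
    if metadata.get("priority") == "high":
        return distance * (1 - 0.30)
    return distance


# A high-priority memory at distance 0.12 would rank as if it were at roughly 0.084.
adjusted = boosted_distance(0.12, {"project": "myapp", "priority": "high"})
```

Under this reading, a lower distance means a closer match, so boosted memories move up the result list.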
POST /v1/forget
Performs hard physical deletion across all three storage layers (Hot/Warm/Cold). GDPR compliant.
Method A: Delete by specific IDs
{
  "ids": ["01JQXYZ123ABC", "01JQXYZ456DEF"]
}
Method B: Delete by metadata filter
{
  "where": {
    "project": "old-project"
  }
}
Method C: Delete by collection + filter
{
  "collection": "nexus_daily",
  "where": {
    "ts": {"$lt": 1700000000000}
  }
}
Response (200 OK):
{
  "ok": true,
  "deleted": {
    "hot": 0,
    "warm": 12,
    "cold": 3
  }
}
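Method C filters on a numeric ts value. Judging by the 13-digit samples, ts appears to be a Unix timestamp in milliseconds; that is an assumption, as is the hypothetical helper `older_than_filter` below, which builds such a request body from a timezone-aware datetime.

```python
from datetime import datetime, timezone


def older_than_filter(cutoff, collection=None):
    """Build a /v1/forget body matching memories whose ts is earlier than cutoff."""
    if cutoff.tzinfo is None:
        raise ValueError("cutoff must be timezone-aware")
    # Assumption: ts is stored as Unix epoch milliseconds.
    body = {"where": {"ts": {"$lt": int(cutoff.timestamp() * 1000)}}}
    if collection is not None:
        body["collection"] = collection
    return body


forget_body = older_than_filter(
    datetime(2023, 11, 14, 22, 13, 20, tzinfo=timezone.utc),
    collection="nexus_daily",
)
```

Because deletion is permanent across all tiers, it may be worth printing the body and checking the cutoff before sending it.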
GET /v1/health
Returns system status and your usage for the current billing cycle. No authentication required.
Response (200 OK):
{
  "ok": true,
  "version": "3.0.0",
  "hot": "PONG",
  "warm": "OK",
  "cold": "OK",
  "usage": {
    "api_calls_this_month": 1450,
    "storage_count": 8234,
    "limit": {
      "recall": 10000,
      "store": 10000
    }
  }
}
cURL example:
curl https://onboarding.tokyobrain.ai/v1/health
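A monitoring script might reduce the health response to a list of tiers that need attention. The sketch below treats the values in the sample response ("PONG" for hot, "OK" for warm and cold) as the healthy ones, which is an assumption; `unhealthy_tiers` is a hypothetical helper.

```python
def unhealthy_tiers(health):
    """Return the storage tiers whose status differs from the sample's healthy values."""
    expected = {"hot": "PONG", "warm": "OK", "cold": "OK"}
    return [tier for tier, ok_value in expected.items() if health.get(tier) != ok_value]


sample_health = {"ok": True, "version": "3.0.0", "hot": "PONG", "warm": "OK", "cold": "OK"}
problems = unhealthy_tiers(sample_health)
```

An empty list means every tier reported its expected status; a missing tier key is counted as unhealthy.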
Python SDK
The official Python SDK wraps all endpoints and handles authentication, retries, and error handling.
Installation
pip install tokyo-brain
Quick Start
from tokyo_brain import TokyoBrain
brain = TokyoBrain(api_key="tb-your_api_key_here")
# Store a memory
brain.store("Oscar rode his bike for the first time today")
# Store with metadata and track
brain.store(
    "Project uses FastAPI + PostgreSQL",
    track="state",
    metadata={"project": "myapp", "priority": "high"}
)
# Recall with full 10-layer pipeline
results = brain.recall("What happened with Oscar recently?")
for r in results:
    print(r)
# Recall with options
results = brain.recall(
    "what framework?",
    format="structured",
    top_k=3
)
# Forget by metadata filter
brain.forget(where={"project": "old-project"})
# Forget by IDs
brain.forget(ids=["01JQXYZ123ABC"])
Environment Variable
You can also set your API key via environment variable instead of passing it directly:
export TOKYO_BRAIN_API_KEY=tb-your_api_key_here
from tokyo_brain import TokyoBrain
brain = TokyoBrain()  # reads from TOKYO_BRAIN_API_KEY
PyPI: tokyo-brain
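If you call the HTTP API directly rather than through the SDK, the same environment-variable fallback is easy to reproduce. The sketch below is not the SDK's code: `resolve_api_key` is a hypothetical helper, and the prefix check relies only on the note in the error message that keys start with tb-.

```python
import os


def resolve_api_key(explicit=None, env=os.environ):
    """Prefer an explicitly passed key, then fall back to TOKYO_BRAIN_API_KEY."""
    key = explicit or env.get("TOKYO_BRAIN_API_KEY")
    if not key:
        raise RuntimeError("No API key given and TOKYO_BRAIN_API_KEY is not set")
    if not key.startswith("tb-"):
        raise ValueError("API keys are expected to start with 'tb-'")
    return key


# The env mapping is injectable so the lookup can be exercised without touching os.environ.
key = resolve_api_key(env={"TOKYO_BRAIN_API_KEY": "tb-your_api_key_here"})
```

Failing early on a missing or malformed key avoids a round trip that would only return 401.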
LangChain Integration
Already using LangChain? Swap in Tokyo Brain memory with two lines. Your chain code stays exactly the same.
# Before (goldfish memory):
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
# After (10-layer brain with subconscious):
from tokyo_brain.langchain import TokyoBrainMemory
memory = TokyoBrainMemory(api_key="tb-your_api_key_here")
# That's it. Your chain code stays exactly the same.
As a Retriever (RAG chains)
from tokyo_brain.langchain import TokyoBrainRetriever
retriever = TokyoBrainRetriever(api_key="tb-your_api_key_here", top_k=5)
# Use in any LangChain RetrievalQA chain
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    retriever=retriever
)
As ChatMessageHistory (persistent sessions)
from tokyo_brain.langchain import TokyoBrainChatHistory
history = TokyoBrainChatHistory(api_key="tb-your_api_key_here")
# Works with RunnableWithMessageHistory, ConversationChain, etc.
Error Codes
| HTTP Status | Error | Description |
|---|---|---|
| 400 | bad_request | Missing required parameters |
| 401 | unauthorized | Invalid or missing API key |
| 403 | forbidden | Insufficient permissions for this operation |
| 404 | not_found | Endpoint or resource not found |
| 429 | rate_limited | Too many requests. Retry after the interval given in the Retry-After header |
| 500 | internal_error | Server error -- contact support |
Error response format:
{
  "ok": false,
  "error": "unauthorized",
  "message": "Invalid or missing API key. Keys start with tb-"
}
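Client code that talks to the HTTP API directly may want to turn this error body into an exception. The following is a minimal sketch: `TokyoBrainAPIError` and `raise_for_error` are hypothetical names, not part of the SDK, and the check relies only on the ok, error, and message fields shown above.

```python
import json


class TokyoBrainAPIError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, status, error, message):
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message


def raise_for_error(status, body_text):
    """Raise TokyoBrainAPIError when the body reports ok: false; otherwise return the parsed body."""
    body = json.loads(body_text)
    # Successful recall responses carry no "ok" field, so only an explicit false counts.
    if body.get("ok") is False:
        raise TokyoBrainAPIError(status, body.get("error", "unknown"), body.get("message", ""))
    return body
```

Callers can then branch on `exc.error` (for example "unauthorized" or "rate_limited") instead of matching message text.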
Rate Limits
| Plan | Price | Store / month | Recall / month |
|---|---|---|---|
| Free | $0 | 100 | 100 |
| Pro | $9/mo | 10,000 | 10,000 |
| Fleet | $49/mo | Unlimited | Unlimited |
When you hit your monthly limit, the API returns 429 with a Retry-After header. Upgrade your plan or wait for the next billing cycle.
Check your current usage at any time via the /v1/health endpoint.
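When a 429 arrives, a client can read the Retry-After header to decide how long to wait. The sketch below assumes the header carries a number of seconds; HTTP also allows a date in that header, and this hypothetical `retry_after_seconds` helper simply falls back to a default in that case.

```python
def retry_after_seconds(status, headers, default=1.0):
    """Return how long to wait before retrying, or None if the status is not 429."""
    if status != 429:
        return None
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Header missing or given as an HTTP-date: this sketch falls back to a default.
        return default


wait = retry_after_seconds(429, {"Retry-After": "30"})
```

If the limit hit is the monthly quota, the suggested wait may be long, so it is worth capping how long a client is willing to sleep before surfacing the error.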
Architecture
Tokyo Brain uses a three-tier memory architecture. All data is encrypted at rest with AES-256-GCM envelope encryption.
| Tier | Technology | Purpose |
|---|---|---|
| Hot | Redis 7.2 | Real-time state, session context (TTL-based) |
| Warm | ChromaDB (Vector DB) | Semantic search, long-term memory |
| Cold | Neo4j (Graph DB) | Entity relationships, knowledge graph |
Store Path
Input -> Sanitizer -> Emotional Salience -> Fact Extraction
-> BGE-m3 Embedding -> ChromaDB -> Entropy Monitor
Recall Path (10-Layer Pipeline)
Query -> Expansion -> Entity Link -> Temporal Parse
-> Multi-Collection Search -> Curated Boost -> Time Decay
-> Emotional Boost -> Temporal Filter -> Re-rank -> Dedup
Background Processes
For a deep dive into the architecture and benchmark results, read our engineering blog post.