[ Principles ]
Three rules, strictly kept.
- 01
The model lives here, not somewhere else.
Inference runs entirely on your device. No API key, no account, no silent telemetry. Airplane mode is a first-class runtime.
- 02
Many small minds, one shared draft.
A planner, a writer, a critic, a formatter. Each an isolated RWKV context. They pass the draft between them until it's done.
- 03
Long-form output with real components.
Tables, code blocks, fenced diagrams, collapsible sections, citations. Structured writing — not a stream of chat bubbles.
[ The stack ]
Built for pockets,
not datacenters.
Agents
A small crew, each with a job.
Every message routes through a director that spawns specialized agents — planning, drafting, reviewing, formatting — and reassembles their output into a single document.
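The director loop above can be sketched in a few lines. This is an illustrative model, not Crater's actual API: the class names, the four role labels, and the list-based "draft" are all assumptions standing in for real RWKV inference calls.

```python
# Hypothetical sketch of the director pattern: each agent is an isolated
# context that receives the current draft and returns a revised one.

class Agent:
    def __init__(self, role: str):
        self.role = role

    def run(self, draft: list) -> list:
        # A real agent would run RWKV inference in its own context;
        # here we just record the hand-off so the flow is visible.
        return draft + [self.role]

class Director:
    def __init__(self):
        # Planning, drafting, reviewing, formatting -- in order.
        self.pipeline = [Agent(r) for r in ("plan", "draft", "review", "format")]

    def handle(self, prompt: str) -> list:
        draft = [prompt]
        for agent in self.pipeline:
            draft = agent.run(draft)  # the document is handed agent to agent
        return draft

doc = Director().handle("user request")
```

The point of the shape: no agent talks to another directly; the director owns the order, and the draft is the only shared state.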
Local
No network required. Ever.
Crater uses rwkv_mobile_flutter to bridge to llama.cpp, MLX, CoreML, or QNN depending on the device. Weights live in your app sandbox. So do your chats.
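One way to picture the per-device routing is a small dispatch function. The selection rules below are a guess for illustration; Crater's real probe (and rwkv_mobile_flutter's bridge) may decide differently.

```python
# Illustrative backend routing. The backend names come from the app's own
# list; the platform/NPU conditions are assumptions, not Crater's logic.

def pick_backend(platform: str, has_qualcomm_npu: bool = False) -> str:
    if platform == "android" and has_qualcomm_npu:
        return "qnn"        # Qualcomm NPU via QNN
    if platform in ("ios", "macos"):
        return "coreml"     # Apple Neural Engine / GPU; MLX is another option here
    return "llama.cpp"      # portable CPU fallback
```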
Writing
Long-form is a first-class output.
Markdown with tables, fenced code, math, citations, and collapsible sections renders inline as it streams. Chat mode remains, but writing is the point.
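Rendering markdown while it streams has one classic wrinkle: a fenced code block may not be closed yet. A minimal sketch of one common workaround, assuming a renderer that accepts a patched-up partial document (this is an illustration, not necessarily how Crater's renderer works):

```python
# If the stream has an odd number of ``` fences, a code block is still open;
# temporarily close it so the partial document renders cleanly.

def renderable(partial: str) -> str:
    if partial.count("```") % 2 == 1:
        return partial + "\n```"
    return partial
```

Each new token re-runs the check, so the temporary fence disappears once the model emits the real one.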
[ Get it ]
Runs here.
Writes here.
A native Flutter app with the inference engine compiled in. Android only for now — other platforms are on the way.
Backends
- llama.cpp
- QNN
- MLX
- CoreML