Scroll
Local · 2026

[ Principles ]

Three rules, strictly kept.

  1. The model lives here, not somewhere else.

    Inference runs entirely on your device. No API key, no account, no silent telemetry. Airplane mode is a first-class runtime.

  2. Many small minds, one shared draft.

    A planner, a writer, a critic, a formatter. Each an isolated RWKV context. They hand the document between them until it's done.

  3. Long-form output with real components.

    Tables, code blocks, fenced diagrams, collapsible sections, citations. Structured writing — not a stream of chat bubbles.

[ The stack ]

Built for pockets, not datacenters.

Agents

A small crew, each with a job.

Every message routes through a director that spawns specialized agents — planning, drafting, reviewing, formatting — and reassembles their output into a single document.
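The hand-off described above can be sketched as a pipeline of functions over one shared draft. This is an illustrative sketch only; the function names and the fixed ordering are assumptions, not the app's actual API:

```python
# Hypothetical director pattern: each agent is an isolated pass over the
# shared draft, and the director chains them in order. Real agents would
# each run their own RWKV context; here they just tag the draft.

def plan(draft: str) -> str:
    return draft + "\n<!-- outline -->"

def write(draft: str) -> str:
    return draft + "\n<!-- prose -->"

def review(draft: str) -> str:
    return draft + "\n<!-- critique applied -->"

def fmt(draft: str) -> str:
    return draft + "\n<!-- formatted -->"

def director(prompt: str) -> str:
    # The director spawns the specialists in sequence and hands the
    # growing document from one to the next.
    draft = prompt
    for agent in (plan, write, review, fmt):
        draft = agent(draft)
    return draft
```

The point of the shape is that the document itself is the only shared state; each agent sees the draft, not the other agents.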

Local

No network required. Ever.

Crater uses rwkv_mobile_flutter to bridge to llama.cpp, MLX, CoreML, or QNN depending on the device. Weights live in your app sandbox. So do your chats.
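The per-device routing might look like the following sketch. The table and function here are hypothetical, written only to illustrate the idea; this is not the rwkv_mobile_flutter API:

```python
# Illustrative backend selection: map a device profile to an inference
# backend, falling back to llama.cpp as the portable CPU path.
# The platform keys are assumptions for the sake of the example.

def pick_backend(platform: str) -> str:
    table = {
        "android-qualcomm": "qnn",     # Qualcomm NPUs via QNN
        "ios": "coreml",               # Apple Neural Engine via CoreML
        "macos": "mlx",                # Apple Silicon via MLX
    }
    return table.get(platform, "llama.cpp")
```

Whatever the real mapping is, the property that matters is the fallback: every device resolves to some backend, so inference never requires the network.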

Writing

Long-form is a first-class output.

Markdown with tables, fenced code, math, citations, and collapsible sections renders inline as it streams. Chat mode remains, but writing is the point.
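Rendering while streaming hinges on knowing which blocks are finished. A minimal sketch of that idea, assuming blank-line-separated blocks and standard code fences (this is not the app's real renderer):

```python
# Split a partially streamed document into blocks that are safely complete
# and an unfinished tail. A block ends at a blank line, but blank lines
# inside a code fence do not count.

FENCE = chr(96) * 3  # three backticks, spelled out to keep this block self-contained

def complete_blocks(buffer: str):
    """Return (finished blocks, unfinished tail) for a partial stream."""
    done, block, in_fence = [], [], False
    for line in buffer.split("\n"):
        if line.startswith(FENCE):
            in_fence = not in_fence
        block.append(line)
        # A blank line outside a fence closes the current block.
        if not in_fence and line == "":
            done.append("\n".join(block))
            block = []
    return done, "\n".join(block)
```

The renderer can commit finished blocks to the page and keep re-drawing only the tail, which is what makes tables and fenced code usable before the stream ends.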

[ Get it ]

Runs here.
Writes here.

A native Flutter app with the inference engine compiled in. Android only for now — other platforms are on the way.

Backends

  • llama.cpp
  • QNN
  • MLX
  • CoreML