Runs 100% on your machine

Powerful AI that
never phones home.

Chaty loads any .gguf model — plus MLX models natively on Apple Silicon — and runs it locally: chat, a private knowledge base, multi-round Deep Research, and voice. No account, no cloud, no telemetry. Your conversations stay on your disk.

Download for Mac · Download for Windows

Free & open source · All releases ↗

Chaty Qwen3.5 · 14B

How does the retrieval pipeline rank chunks?

Thought for 3s

It runs hybrid retrieval: dense bge-m3 vectors plus BM25 keywords, fused with RRF, then MMR-diversified. [1]

Neighbor chunks are pulled in for context, and answers stay grounded in your sources. [2]

Sources: notes.pdf · p.4, spec.md

Knowledge base · Deep Research

Message Chaty…

100% offline-capable
0 accounts & logins
No telemetry, ever
Open source

Runs: Chaty · Llama 3 · Gemma 3 / 4 · Qwen 3 / 3.5 / 3.6 · any GGUF or MLX from Hugging Face

What's inside

A full AI workstation,
contained on your device.

Everything a cloud chatbot does — and several things it can't — without a single byte leaving your computer.

Local · Private · GPU

Any model. All local.
On your GPU.

Load any .gguf — or an MLX model folder on Apple Silicon — and talk to it; tokenizer and chat template come straight from the model. Nothing is uploaded, nothing is logged to a server. Cross-vendor Vulkan on Windows and Metal on Apple Silicon offload it to your GPU, auto-tuned to your VRAM with a graceful CPU fallback.

Persistent context + KV-cache reuse for fast multi-turn
Auto-fit context window with non-destructive summarization
Full sampling controls & saveable prompt presets
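
How an auto-tuner might pick an offload split can be sketched in a few lines. This is an illustrative Python sketch, not Chaty's actual tuner: the helper name `layers_to_offload`, the equal-size-per-layer estimate, and the 512 MiB reserve are all assumptions made for the example.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def layers_to_offload(n_layers, model_bytes, free_vram_bytes, reserve_bytes=512 * MIB):
    """Estimate how many layers fit in free VRAM; 0 means run on the CPU.

    Hypothetical helper: assumes every layer is the same size and keeps a
    fixed reserve for the KV cache and scratch buffers.
    """
    if n_layers <= 0 or model_bytes <= 0:
        raise ValueError("n_layers and model_bytes must be positive")
    budget = free_vram_bytes - reserve_bytes
    if budget <= 0:
        return 0  # nothing fits: fall back to the CPU
    # Integer math for budget / (model_bytes / n_layers), rounded down.
    return min(n_layers, budget * n_layers // model_bytes)

full = layers_to_offload(40, 8 * GIB, 16 * GIB)       # whole model fits
partial = layers_to_offload(40, 8 * GIB, 6 * GIB)     # only some layers fit
cpu_only = layers_to_offload(40, 8 * GIB, 256 * MIB)  # below the reserve
```

A real tuner would also have to size the KV cache from the chosen context length; this sketch folds that into the flat reserve.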

Code mode · Agent

A coding agent that
never leaves your machine.

Flip the Chat | Code switch, point Chaty at a folder, and describe the task. Your local model explores, edits, and verifies the project on its own (reading files, making precise edits, searching, running commands), with every step shown live, all file access confined to the workspace, and the shell sandboxed on macOS. Since 2.0 it's a platform too: connect any MCP tool server (or add a curated, live-certified entry in one click), teach it procedures as Markdown skills, and let it remember project facts across sessions. Every addition is sized so small local models still fit.

Live task plan, visible reasoning, per-action approval with real diffs
Skills (/init, /review, /fix …), @-file mentions, slash commands
Context ring with automatic compaction — long tasks don't overflow
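
Confining file access to a workspace usually comes down to resolving each requested path and checking that it still sits under the workspace root. The Python sketch below shows that check under stated assumptions: `resolve_in_workspace` is a hypothetical helper written for this example, and Chaty's own implementation is not shown on this page.

```python
import tempfile
from pathlib import Path

def resolve_in_workspace(workspace, requested):
    """Resolve `requested` against the workspace root, rejecting escapes.

    Hypothetical helper. Resolving first means "..", absolute paths, and
    symlinks are judged by where they really point.
    """
    root = Path(workspace).resolve()
    target = (root / requested).resolve()
    if not target.is_relative_to(root):  # requires Python 3.9+
        raise PermissionError(f"{requested!r} is outside the workspace")
    return target

with tempfile.TemporaryDirectory() as ws:
    inside = resolve_in_workspace(ws, "src/main.rs")
    stays_inside = inside.is_relative_to(Path(ws).resolve())
    try:
        resolve_in_workspace(ws, "../outside.txt")
        escaped = True
    except PermissionError:
        escaped = False
```

A path check like this covers file tools only; commands run through a shell need a separate sandbox, which is the role the page assigns to the macOS shell sandbox.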

Knowledge base · RAG

Answers grounded in your documents.

Drop in PDFs, Markdown, code, even scanned images — Chaty chunks and embeds them locally with multilingual bge-m3 vectors. When the knowledge base is on, the model answers only from what it retrieved, and says so plainly when something isn't covered.

Hybrid dense + keyword retrieval, fused and diversified
Inline citations with hover-preview of the source passage
Per-document scope — pick exactly which files are searched
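
The fusion step can be illustrated with Reciprocal Rank Fusion (RRF), which the chat example above names. This is a minimal Python sketch, not Chaty's code: the function name `rrf_fuse`, the chunk IDs, and the constant k = 60 (a commonly used default) are assumptions for the example, and the MMR diversification step is not shown.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each list adds 1 / (k + rank) to a chunk's score, so chunks ranked
    well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))

dense = ["c3", "c1", "c7"]  # hypothetical dense-vector ranking
bm25 = ["c1", "c9", "c3"]   # hypothetical BM25 keyword ranking
fused = rrf_fuse([dense, bm25])
```

RRF only looks at rank positions, so the dense and keyword scores never need to be put on a common scale.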

Deep Research

From a question to a cited report.

Hand Chaty a topic and watch it work: it plans queries, runs several rounds of web search interleaved with its own reasoning about what's still missing, then synthesizes a structured long-form report. The reference list includes only the sources it actually cited, and you can export the whole thing to PDF or Markdown.

Topic-anchored so results never drift off subject
Free, key-less search — resilient multi-provider chain
Live progress: planning → searching → reasoning → writing
One-click export to PDF or Markdown
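
Listing only the sources a report actually cites can be done by scanning the finished text for citation markers. The Python sketch below assumes numeric markers of the form `[n]`; the marker format, the helper name `cited_references`, and the placeholder source titles are all assumptions for illustration, not details taken from Chaty.

```python
import re

def cited_references(report, sources):
    """Return only the sources whose [n] marker appears in the report text."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", report)}
    # Keep numeric order and ignore markers with no matching source.
    return {n: sources[n] for n in sorted(cited) if n in sources}

# Placeholder data for the example only.
sources = {1: "Vendor documentation", 2: "Benchmark write-up", 3: "Forum thread"}
report = "Throughput doubled [1]. Latency stayed flat [3], as [1] predicted."
refs = cited_references(report, sources)
```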

Live mode

Talk to it. Hands-free.

Hold a real conversation out loud. An animated orb listens, thinks and answers — local speech-to-text in, local text-to-speech out. Silence auto-sends your turn; replies are read back sentence by sentence. Eleven voices, all on CPU, never touching the model's VRAM.

Continuous, hands-free voice conversation
Silence auto-send & sentence-by-sentence read-aloud
Runs fully offline — zero GPU cost
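
Sentence-by-sentence read-aloud implies buffering the streamed reply until a sentence is complete, then handing it to the speech engine. A minimal Python sketch of that buffering, assuming a hypothetical `SentenceBuffer` class and a simple punctuation rule (neither is taken from Chaty):

```python
import re

# A sentence ends at ., ! or ? followed by whitespace. This simple rule
# will also split after abbreviations such as "e.g.".
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

class SentenceBuffer:
    """Collect streamed text and release it one complete sentence at a time."""

    def __init__(self):
        self._buf = ""

    def feed(self, text):
        """Add a streamed fragment; return any sentences now complete."""
        self._buf += text
        parts = _SENTENCE_END.split(self._buf)
        self._buf = parts.pop()  # last part may still be growing
        return [p.strip() for p in parts if p.strip()]

    def flush(self):
        """Return whatever is left once the reply has finished."""
        rest, self._buf = self._buf.strip(), ""
        return [rest] if rest else []

buf = SentenceBuffer()
spoken = []
for fragment in ["Hello", " there.", " How are", " you? I", "'m fine"]:
    spoken.extend(buf.feed(fragment))  # each item would go to text-to-speech
spoken.extend(buf.flush())
```

Because a sentence is released only once the next one has started, speech can begin while the model is still generating the rest of the reply.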

Deep-dive podcast

Your documents, as a two-host show.

NotebookLM-style audio: turn a knowledge base into a natural English conversation between two hosts, written by the model and read aloud by alternating female and male voices. Watch progress and time remaining, lock the app while it works, then export the finished episode as a WAV.

Two distinct hosts, grounded in your own sources
Live progress with estimated time remaining
Cancel anytime · export the audio to WAV

Rich chat canvas

It renders what the model makes.

A foldable think panel that follows the reasoning as it streams. KaTeX math, Mermaid diagrams, syntax-highlighted code with per-block copy, and a live HTML preview that even runs single-file web games. Full-text search and Markdown / JSON export across every conversation.

Mermaid & KaTeX inline
Playable HTML preview
Light · dark · system
Tray, hotkey · EN / 简体中文

Gallery

See Chaty at work.

Real screenshots — every pixel rendered locally, on the machine that ran the model.

Screenshot: Chaty Code mode. The agent searches GitHub, fetches source, and edits the project.
Caption: A coding agent that searches the web and edits your repo.

Screenshot: Chaty chat rendering code, a table and a KaTeX formula with a foldable reasoning panel.
Caption: Streaming reasoning, code, tables and math, rendered live.

Screenshot: Design Canvas, with a live preview beside the actual source code in a split studio.
Caption: Design Canvas, a live preview beside the actual source.

Screenshot: Chaty local knowledge base with indexed documents, report and podcast generation.
Caption: Answers grounded in your own PDFs, notes and code.

Screenshot: Chaty Live mode. A glowing orb listens during a hands-free voice conversation.
Caption: Hands-free voice. Talk, and it talks back.

Screenshot: Chaty settings with a live data dashboard of conversations, models and knowledge base.
Caption: Four themes, code-highlight styles and a live data dashboard.


Privacy is the architecture

Your data never leaves your device.

This isn't a setting you toggle — it's how Chaty is built. The model, your chats, your documents, and your knowledge base all live in local storage on your computer. There's no account to create and no server to trust.

The only time the network is touched is when you ask for it: optional web search and Deep Research, or a one-time download of a model or the optional voice and embedding files. Turn those off and Chaty is fully offline.

Platforms

Native on the hardware you already own.

macOS

Chip: Apple Silicon (M-series)
GPU: Metal · offload-all on unified memory
Package: Ad-hoc signed .dmg (not notarized)

Windows

Arch: x64 · Windows 10 / 11
GPU: Vulkan · cross-vendor, auto-tuned
Package: Per-user installer · no admin

Under the hood

Engine: llama.cpp · Rust
Shell: Tauri 2 · React
Storage: Local SQLite + vector store

Bring the model home.

Free, open source, and yours to run forever. Pick your platform — the download tracks the latest release.

Download for Mac · Download for Windows

The macOS build is ad-hoc signed (not notarized). First launch needs one Gatekeeper step; see the FAQ.

Questions

Good to know.

Is Chaty really free?

Yes — Chaty is free and open source. You bring your own GGUF or MLX models (search, browse and download them in-app from the built-in Hugging Face store), and everything runs on your own machine. There's no subscription and no account.

Does it work fully offline?

Once you've downloaded a model, yes. The network is only used when you explicitly enable web search / Deep Research, or to fetch a model or the optional voice and embedding files. Disable those and Chaty never touches the internet.

Which models can I run?

Any .gguf file and, on Apple Silicon, any MLX model folder from mlx-community. Chaty even ships its own model: a Qwen3.5-4B fine-tune for leaner single-file web design, installable in one click from the first-launch setup. There's also first-class handling for Llama 3, Gemma 3 / 4, and Qwen 3 / 3.5 / 3.6 (including thinking control), plus a robust template fallback chain so unusual community models still chat.

What are the hardware requirements?

A 64-bit Windows 10/11 PC or an Apple Silicon Mac. Smaller quantized models run comfortably in 8 GB of RAM; larger models want more. Chaty auto-tunes GPU offload to your VRAM and refuses models that can't physically fit, so it won't freeze your system.

On macOS it says the app "can't be verified." What do I do?

Chaty is ad-hoc signed but not notarized (there's no paid Apple Developer account behind it). Clear the download quarantine once in Terminal:

xattr -dr com.apple.quarantine /Applications/Chaty.app

Then open it normally — or right-click the app, choose Open, and confirm via System Settings → Privacy & Security → Open Anyway.

Where are my conversations stored?

In a local SQLite database in your user app-data folder. Nothing is synced anywhere. Delete the app data and it's gone — it was only ever on your disk.
