I realized last week that WebGPU and WASM now make it possible to run computation-heavy work directly in the browser. I tried running smaller LLMs this way on a MacBook Air, and performance is reasonable.
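To make the "if the browser supports it" part concrete, here is a minimal feature-detection sketch. `navigator.gpu` is the real WebGPU entry point and `WebAssembly` is the standard global; the function name and the navigator-like parameter are hypothetical, just there to keep the logic testable outside a browser:

```typescript
// Sketch: pick the best available in-browser compute backend.
// Pass the real `navigator` in a browser; the parameter exists so the
// logic can be exercised outside one.
function detectBackend(nav?: { gpu?: unknown }): "webgpu" | "wasm" | "none" {
  if (nav && "gpu" in nav && nav.gpu) return "webgpu"; // hardware-accelerated path
  if (typeof WebAssembly !== "undefined") return "wasm"; // CPU fallback
  return "none"; // no viable in-browser backend
}
```

In a real page you would call `detectBackend(navigator)` and then request an adapter via `navigator.gpu.requestAdapter()` before committing to the WebGPU path, since `navigator.gpu` existing does not guarantee a usable adapter.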
This got me thinking about building an app that does the following:
- Stores all your chats, documents, memory, projects, etc. locally
- Runs inference locally when the hardware supports it (a GPU, Apple M-series, or similar)
- Runs private inference otherwise (open-weights model hosted on a rented server)
- Runs anonymous inference against frontier models on request (via API, hiding user data)
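The three tiers above could be selected with a small routing function. This is only a sketch of the decision logic; all names (`InferenceTier`, `pickTier`, the capability flags) are hypothetical, not an existing API:

```typescript
// Sketch of the three-tier routing idea from the list above.
type InferenceTier = "local" | "private-server" | "anonymous-frontier";

interface Capabilities {
  webgpu: boolean;               // e.g. "gpu" in navigator in a real browser
  enoughMemoryForModel: boolean; // device can hold the local model's weights
  userRequestedFrontier: boolean; // explicit opt-in to a frontier model
}

function pickTier(caps: Capabilities): InferenceTier {
  if (caps.userRequestedFrontier) return "anonymous-frontier"; // on request only
  if (caps.webgpu && caps.enoughMemoryForModel) return "local"; // private default
  return "private-server"; // open-weights model on a rented server
}
```

The key design choice is that the frontier tier is never the default: it is only reached by explicit user request, which keeps the privacy-by-default property of the other two tiers.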
In my head this would give us the best of both worlds: personal data stays private by default, and we keep the option open to reach for “more powerful AI” when we need or want it.
I’m fairly confident this can handle most day-to-day tasks locally: research (web search), brainstorming, and document processing (drafting an email, etc.). Image generation will likely be too much for now, but that’s why we can “run anonymously against frontier models” when we need or want to.
Right now I’m on a roller coaster between “this is so cool, I should build it right now” and “Ollama is good enough, and who wants this in a browser anyway?”. That’s why I decided to post here and ask:
Is this worth having? Or am I the only one excited about a “zero-setup, local-first LLM”?