Correct: remote attestation is the actual solution against bad actors (since when do bad actors follow rules?). I should have clarified that better.
It does scare away corporations that won’t touch AGPL code with a 10-foot pole, and for that reason I suppose it also deters those who would monetize that data. A heuristic, not a guarantee.
@MrCakeBoss did you see this? I sent it, and then the license can of worms got opened, so I wanted to make sure you didn’t miss it. It seemed close to what you were describing building, unless I misunderstood. Maybe there’s an implementation or feature mismatch or two, but I just wanted to know if this was what you had in mind.
@MrCakeBoss did you see this? I sent it, and then the license can of worms got opened, so I wanted to make sure you didn’t miss it.
Yes, I saw it; thank you for sharing.
Very mixed results for me: I like their feature set, but the implementation could be improved.
I tried running it on a MacBook Air (“weak device”), and every time I type a message with Qwen 3B, the application freezes for a few seconds before it starts responding. If I ask for tool calls (“what’s the latest news?”), the entire Mac freezes for a bit before the agent tries to GET google.com and, surprise surprise, gets blocked by bot protection. I couldn’t find a way to select a different search provider or coax it into not using Google. That said, having a search tool ticks the right box.
It is Ollama-based, i.e., it spins up Ollama under the hood and talks to it as the backend while serving a native UI as the frontend. I haven’t tested their mobile app yet. I have plans to, but got interrupted by my own stupidity and ended up having to restore some coding projects that got wiped off my workstation.
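To make the “Ollama as backend” architecture concrete, here’s a minimal sketch of the request a frontend would send to Ollama’s documented REST chat endpoint on its default port (11434). The model tag is a placeholder, and the request is only built, not sent, since it needs a running Ollama instance:

```python
# Sketch of a frontend talking to a local Ollama backend via its REST API.
# The model tag below is a placeholder; nothing is actually sent.
import json
import urllib.request

payload = {
    "model": "qwen2.5:3b",  # placeholder model tag
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,  # ask for a single JSON reply instead of a stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",  # Ollama's default chat endpoint
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would return the assistant reply as JSON;
# left out here because it requires a running Ollama instance.
print(req.full_url, len(req.data))
```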
Back to AnythingLLM, I like the large suite of model connectors; you can provide API keys for basically anything under the sun from OpenAI through OpenRouter to “esoteric” stuff like locally-hosted Lemonade.
I also really liked that they ship (vector) RAG out of the box; it’s a feature I’ve been missing from other apps I’ve tried. That said, I fear most people won’t have enough documents to get out of the “land of retrieval noise” and reap the benefits of vector-based retrieval. Vector RAG fetches multiple “relevant chunks” (often five), so you need at least that many relevant chunks per topic the user asks about, or you risk context poisoning.
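To illustrate the retrieval-noise point: here’s a toy top-k retrieval sketch (using Jaccard word overlap as a stand-in for a real embedding model and cosine similarity). With the usual k=5 but only two on-topic chunks in the store, three off-topic chunks still get pulled into the context:

```python
# Toy sketch of top-k vector retrieval, illustrating "retrieval noise":
# with k=5 but only 2 on-topic chunks, off-topic chunks pad the context.

def embed(text):
    # toy "embedding": bag-of-words set (stand-in for a real vector model)
    return set(text.lower().split())

def similarity(a, b):
    # Jaccard overlap as a stand-in for cosine similarity
    return len(a & b) / len(a | b) if a | b else 0.0

chunks = [
    "python packaging with pip and wheels",
    "python virtual environments and pip",
    "gardening tips for tomato plants",
    "slow cooker chili recipe",
    "annual report fiscal year summary",
    "notes on cat grooming",
]

query = embed("how do I install a python package with pip")
ranked = sorted(chunks, key=lambda c: similarity(query, embed(c)), reverse=True)
top_k = ranked[:5]  # typical default: retrieve 5 chunks

on_topic = [c for c in top_k if "pip" in c]
print(f"{len(on_topic)} of {len(top_k)} retrieved chunks are on-topic")
# → 2 of 5 retrieved chunks are on-topic
```

The other three chunks land in the prompt purely because something had to fill the top-5 slots, which is exactly the context-poisoning risk with small document sets.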
The fact that they can get a 3B model to consistently tool call is great. I wonder if that’s because the model is good or if they do some magic under the hood. Not very happy with their choice of out-of-the-box tools. There are some very easy wins here that could make the app 10x or 100x better by updating which tools exist and how they are defined. They do have MCP support though (through Ollama I suspect) so within reason this could be fixed by the user (if they are technical and motivated enough).
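On “how tools are defined”: OpenAI-compatible and Ollama tool-calling APIs both use JSON-schema-style function definitions, so the definition itself is one of the easy levers. Here’s a hypothetical example (the `web_search` name and its parameters are my invention) sketching how a search tool could expose a provider choice instead of hardcoding Google:

```python
# Hypothetical tool definition in the JSON-schema style used by
# OpenAI-compatible / Ollama tool-calling APIs. The tool name and
# parameters are invented for illustration.
import json

web_search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",  # hypothetical tool name
        "description": "Search the web and return top result snippets.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search terms."},
                "provider": {
                    "type": "string",
                    # assumed provider options, not anything the app ships
                    "enum": ["duckduckgo", "brave", "searxng"],
                    "description": "Which search backend to query.",
                },
            },
            "required": ["query"],
        },
    },
}

print(json.dumps(web_search_tool, indent=2))
```

A small 3B model is much more likely to call a tool correctly when the description and parameter names are this explicit, which is why tweaking definitions can be such an outsized win.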
Maybe there’s an implementation or feature mismatch or two, but I just wanted to know if this was what you had in mind.
Yes, this is the general direction I am thinking of walking down (assuming I can find people willing to test the app and give me feedback once I have a first prototype, that is).
I’m excited to see what you cook up. I am personally interested in distributed language models, with hopes that standard chips on a set of beefy x86 or ARM CPUs will suffice for running inference + RAG.
I do wonder if there is a way for you to pitch your ideas to the AnythingLLM guy and see if there’s potential to integrate your use cases into the same project.