Local AI question

Hello,

I’ve recently started playing about with hosting various AI systems locally, with the aim of getting a working but privacy-respecting setup. One of the things I’m trying is getting image generation working. I’ve considered the recommendations on PG, but I don’t like having to downgrade the security settings on my Mac to run Kobold, and I’m aware of a few vulnerabilities in Ollama. Given that, I’ve been looking for alternatives and have come across Draw Things. This would be ruled out by the PG criteria straight away, as it isn’t cross-platform, but it looks fairly OK to someone with very limited technical knowledge.

I’m after guidance from people who know what they’re doing on two things:

  • Is Draw Things in fact as safe and completely privacy respecting as it presents itself? I’m thinking primarily about privacy of prompts and outputs, but also wonder about other risks I might not immediately be thinking of. (I know they have cloud options - I’m talking about when those options are switched off/not utilised.)
  • Is it possible that, even if Draw Things (or whatever application one uses) is safe and privacy-respecting, one of the AI models one uses could in fact compromise one’s privacy or security? I’ve seen PG’s guidance on how to choose one’s models well, but I don’t know what the risks are here, and I don’t know whether I can assume that models downloaded either through Draw Things or from ‘verified’ companies via the Hugging Face repository (which PG recommends) will be privacy-respecting. I’m particularly led to ask because Stability AI have started asking for personal details before allowing people to download their latest model. Is there some way a model like this could have built-in telemetry despite being run locally through a privacy-respecting application?

Thank you very much indeed for any help.


In my understanding, as long as the application itself is not malicious and it works completely offline, I don’t think it can harm someone’s privacy in any way.

For security, it could become a risk (data being exfiltrated or modified) if you have other malicious applications installed on your device, but at that point the local LLM won’t be your primary concern.

As for the models themselves, always check their work and don’t trust it blindly, and you should be fine.

Many thanks for this. Does anyone have any thoughts about Draw Things itself? I’m not sure whether to be concerned about it as it is not on the PG recommendations; or whether it really is as telemetry-free and privacy-first as it claims.

Thanks in advance if anyone can advise.

To know if something is privacy respecting or not, the easiest thing to do is ask for the source code. Usually it’s on GitHub. If the application is not FOSS, then I’m afraid it has a bit of a smell to it. You can’t fully validate it’s not sending data unless you analyze network requests going to/from the device.

It seems like they do publish it, which increases trust, and it’s licensed under GPLv3, which is the best for local applications.

Not to say you shouldn’t pay attention to CVEs, but CVEs exist to document current and historical issues with a piece of software. Some CVEs may not be a big deal for you depending on your threat model (e.g. a denial of service on a locally hosted model only reachable on your LAN is probably a non-issue for you).

Ollama fits a different use case than the other app you mentioned. Ollama is a Swiss Army knife with lots of integrations. The other app is an iOS app just for drawing. The less an app does, the less attack surface it has and is generally less likely to be vulnerable.

PG recommendations aren’t the be-all and end-all either; they’re just the top recommendations for general use cases.


Is it possible that, even if Draw Things (or whatever application one uses) is safe and privacy-respecting, one of the AI models one uses could in fact compromise one’s privacy or security?

Well that’s not how it works. Downloading a model could compromise your security, but it wouldn’t be the model itself that is unsafe.

A model is a bunch of numbers (think of a CSV file or the contents of a spreadsheet). Those numbers won’t do anything to your privacy or security. However, it’s a lot of numbers, so we typically share them in a compressed format (basically a ZIP file), and to make the numbers do anything we need a program (like Excel).
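As a toy sketch of that idea (everything here is made up purely for illustration), the “model file” is just inert data, and only a separate program supplied by the app gives it any behaviour:

```python
# Toy illustration: a "model" is only numbers (data). It does nothing on
# its own; the app supplies the code that actually uses the numbers.
weights = [0.5, -1.2, 3.0]  # this is the kind of thing a model file holds


def run_model(inputs, weights):
    # The application (not the model file) provides code like this;
    # here it's just a harmless weighted sum.
    return sum(i * w for i, w in zip(inputs, weights))


print(run_model([1.0, 2.0, 3.0], weights))
```

The numbers can be wrong or low-quality, but they can’t phone home by themselves; only the surrounding program can do that.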

The compressed file (the “ZIP”) could contain malicious code that does bad things when you try to unpack it. The way you protect against this is by comparing hashes (MD5 or SHA-256) of what you have downloaded to what the provider (whom you must trust) says they should be. If they match, the data is unlikely to have been tampered with.
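A minimal Python sketch of that check, assuming the filename and the file contents are placeholders; in practice the published hash would be copied from the provider’s download page rather than computed locally as it is here:

```python
import hashlib


def sha256_of(path: str) -> str:
    # Hash the file in chunks so large model files don't need to fit in RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


# Stand-in for a downloaded model file (placeholder content).
with open("model.safetensors", "wb") as f:
    f.write(b"model weights go here")

# Normally this string comes from the provider's download page.
published_hash = sha256_of("model.safetensors")

if sha256_of("model.safetensors") == published_hash:
    print("hash matches - file likely untampered")
else:
    print("MISMATCH - do not use this file")
```

The same comparison can be done on the command line with tools like `sha256sum` or `shasum`; the important part is that the reference hash comes from a source you trust.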

The program that runs the model is typically built into the app you use. In your case, it is likely that Draw Things uses the same code to run any model. I have not read the code, so I can’t say with certainty, but that’s how the vast majority of AI apps work under the hood. If that code is “safe” for one model, it is safe for any model.

So yeah, the model itself is safe and won’t leak any data. The stuff around the model is probably safe, assuming you trust Draw Things and have basic hygiene around downloading things from the internet.


Thank you for this. It’s good to know that it looks like the source code is fully available. I could see there was something on GitHub but couldn’t quite tell how complete it was - I noted it spoke about the repository expanding over time and wondered whether this meant it wasn’t fully open source just yet.

You can’t fully validate it’s not sending data unless you analyze network requests going to/from the device.

I did have a look at what the program seems to do in terms of network traffic using a system monitor. It sends a few KB on startup, which seems like it might be consistent with what the creator says, but I’m in no way technically competent to assess what that traffic actually contains. Is there anything more someone without massive technical skills can do here?

Thanks again - really appreciate all advice here.

That’s really helpful, thank you.

The compressed file (the “ZIP”) could contain malicious code that does bad things when you try to unpack it. The way you protect against this is by comparing hashes (MD5 or SHA-256) of what you have downloaded to what the provider (whom you must trust) says they should be. If they match, the data is unlikely to have been tampered with.

Stupid follow-up question: could a company deliberately build something into that ‘ZIP’ which compromised user privacy when the models were actually used? I’m guessing the hashes would still match if that was the intention of the provider, but I may be misunderstanding.

Why not follow Qubes OS’s “security through isolation” concept?

Qubes allows you to run an offline AI qube completely separated from, for example, user data, email programs, and so on.

Correct. The idea behind checksums is to protect you from alterations by third parties. It does nothing to protect you from the provider doing “bad” things. That’s why you need to trust the provider of anything you download and execute.

The rabbit hole of “can I trust the provider” goes infinitely deep. It’s often better to think of a scale of trust (“how much do I trust them”) than to think in a binary good/bad, which changes the conversation into one of risk appetite, impact, and likelihood.

Without going too deep, and with the added disclosure that the following is my personal opinion, the sensible things to do as a less technical person are:

  1. Check the provider’s brand and scale. The bigger the name, the more they stand to lose from distributing bad code, so there is mutual interest in sticking to what was promised.
  2. Check the ToS and Privacy Policy. They contain details on how/where data is processed, collected, or shared, and AI is pretty good at reading them fast. They are legal documents, and most companies follow the law; if they report something there, that’s typically how it’s done.
  3. Use a virtual machine or sandbox. If you don’t trust what the provider says they do (and still don’t want to walk away) you can isolate the program from the rest of your system and limit access. This is inconvenient but gives you control over the impact side of your privacy risk.

There are also independent audits, reading source code, firewalls, network monitors, and a million other things that one could do. Trusting providers is an infinite rabbit hole, and you need to find the spot in the “shades of gray” where you are comfortable between “effort spent vetting/worrying about the provider” and “utility gained using their software or service”.


Makes total sense. But it makes me wonder: doesn’t this make the relative safety of the AI models themselves moot? As in, I get that the AI models are just numbers and so on, but if the files they are bundled with could still be running telemetry on behalf of their providers, does that make the distinction between models and the wider files unimportant in terms of their effects?

And this makes me wonder: if all of this is the case, why do people so often characterise local AI as a qualitatively better approach from a privacy point of view? It seems like one still ends up relying on a great deal of trust of often the same corporations we don’t want to submit our data to through cloud AI… Might be completely misunderstanding here, of course!

On a related note, does it matter what type of file we’re talking about? I’ve recently read that safetensors files should solve the potential problems I think are described above (about something other than the models being snuck inside the ‘ZIP’ file along with the model) - “CKPT vs SafeTensors 2026: Complete Security Guide & Conversion”. Would a safetensors file eliminate the risk that

The compressed file (the “ZIP”) could contain malicious code that does bad things when you try to unpack it.

or does this risk persist regardless of the file type?

That’s not how it works. The model itself doesn’t send any telemetry. What might send telemetry is the code that sends data to the model or code that sends data from the model back to the app.

Model providers for open-weights models (like Moonshot, Deepseek, or Facebook) can’t collect data here, because they don’t provide code for these steps.

Cloud model providers (OpenAI, Anthropic, OpenRouter, nano-gpt, …), on the other hand, very much provide the code for these steps so they may or may not collect telemetry here.

In the case of Draw Things (or any other local app), Draw Things controls the code for these steps, so they decide what data is collected and sent to whom.

In general, the entity running the model chooses the inference framework and how data is sent to/from the model. This is typically also the entity choosing where, how, and what is collected in terms of telemetry, logs, training data, etc.

if all of this is the case, why do people so often characterise local AI as a qualitatively better approach from a privacy point of view?

Because anything that runs on your computer and wants to collect telemetry needs to send that data through your network. This gives you (or someone technical enough) the option to inspect whether data is sent somewhere. If a provider claims they don’t collect telemetry and do everything locally, and you still see them sending data over the network, something smells funny.

With cloud-hosted AI, this method of verification is not possible because you send your data to them to have it processed. You can’t see what’s happening on their servers, so they may or may not collect telemetry.

or does this risk persist regardless the file type?

The (security) risk of files from the internet containing third-party malware or spyware exists regardless of file type.

The risk of providers adding telemetry/tracking into model code distributed on Hugging Face is file-type specific. That said, Hugging Face is very strict about not having this in the models on its platform, which is why you find initiatives like the one you linked (new/custom file formats) that make this requirement easier to enforce.
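To make the file-type point concrete, here’s a stdlib-only Python sketch of why pickle-based formats (which .ckpt files use under the hood) are risky: merely loading the file can execute arbitrary code. Safetensors avoids this by storing only raw tensor bytes plus a JSON header, so there is nothing executable to run. The “payload” below just prints a message instead of doing anything harmful:

```python
import pickle


class Payload:
    # __reduce__ tells pickle how to rebuild this object on load; an
    # attacker can abuse it to call any function (os.system, etc.).
    def __reduce__(self):
        return (print, ("payload ran just from loading the file!",))


# A "checkpoint" that looks like innocent weights plus a hidden payload.
checkpoint = pickle.dumps({"weights": [0.1, 0.2, 0.3], "extra": Payload()})

# Simply deserialising the bytes runs the payload - no "use" of the
# model is needed at all.
restored = pickle.loads(checkpoint)
print(restored["weights"])
```

Loading a safetensors file, by contrast, is just parsing a header and copying raw numbers into memory, so the equivalent trick has nowhere to hide.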

Thank you (all) very much for your advice. I’ve gotta say, it’s really reassuring to have your guidance and, tbh, great to just feel a bit less alone in caring about all of this.

One final question if that’s ok:

If a provider claims they don’t collect telemetry and do everything locally, and you still see them sending data over the network, something smells funny.

In the case of Draw Things, there is a small amount of network activity every time I use it - we’re talking about 10–25 KB according to the system monitor. The app creators say that some network activity will occur in order to fetch an updated list of available models from GitHub. However, I’ve no idea how much activity I ought to expect this to generate. So I’m wondering: does what I’m seeing seem consistent with what the creators say the app does? And/or how much network traffic would I expect to see if something more sinister were going on?