Layering Local LLM Over Commercial LLM for Privacy?

Yeehaw, y’all :cowboy_hat_face:

I’ve been mulling over an idea to use a commercial LLM for counseling in a private way.

This came to mind when I was exploring Claude.ai, which uses Claude-2 and, unlike ChatGPT, allows access via VPN without phone number registration. It’s an impressive LLM, but Anthropic’s privacy policy is a nightmare.

While the VPN hides my IP address, I believe my writing style and word choice can be used to fingerprint me. So, I’ve been contemplating a two-layered system to address this.

The idea is to run a less powerful open source LLM on my local machine. First, I’d write my query as I normally would, then this local LLM would sanitize my input, stripping away any unique writing styles or identifiable quirks. This sanitized version would then be passed to Claude-2.
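A rough sketch of that two-layer flow, under loud assumptions: `placeholder_sanitizer` is a crude regex stand-in for the local LLM (a real setup would call a locally hosted model to paraphrase the text), and `ask_remote` is a hypothetical callable wrapping whatever client talks to the commercial service. The only point is the shape: nothing leaves the machine until the local layer has rewritten it.

```python
import re
from typing import Callable


def placeholder_sanitizer(text: str) -> str:
    """Crude stand-in for the local LLM: flattens a few surface quirks only."""
    text = re.sub(r":[a-z0-9_]+:", "", text)    # drop emoji shortcodes like :smile:
    text = re.sub(r"([!?.])\1+", r"\1", text)   # collapse "!!!" or "..." to one mark
    text = re.sub(r"\s+", " ", text).strip()    # normalize whitespace
    return text


def ask_privately(
    query: str,
    sanitize: Callable[[str], str],
    ask_remote: Callable[[str], str],
) -> str:
    """Rewrite the query locally, then forward only the rewritten text."""
    neutral = sanitize(query)
    return ask_remote(neutral)


if __name__ == "__main__":
    # Hypothetical remote: echoes what it received instead of calling a real API.
    reply = ask_privately(
        "Help me!!!   Please :smile:",
        placeholder_sanitizer,
        lambda prompt: f"remote saw: {prompt}",
    )
    print(reply)
```

Passing both layers in as plain callables keeps the sketch independent of any particular model or vendor client. A regex pass like this does nothing about vocabulary or sentence rhythm, which is where most stylistic fingerprinting would come from, so the real work would sit in whatever local model replaces the placeholder.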

On paper, this seems like a solid plan, but I’m curious to hear your thoughts. Are there any potential pitfalls I’m overlooking? Could this be an effective method to reduce the risk of being identifiable by these companies?


The biggest drawback, I would say, is the hardware required to run something like that locally. From what I’ve seen (and this may already be outdated, so please correct me if I’m wrong), you need a pretty decent graphics card with enough VRAM to get even the most basic models to run.

You could try using a translation service like DeepL to translate your query into multiple languages and then back to your original language. You could even alternate between different languages as you switch VPNs to mask your location. I have no idea if any of this would be enough to prevent identification, but that’s the first thing that popped into my mind when I was reading your post.
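The round-trip idea can be sketched as a simple chain. This is only an illustration: `translate` is a hypothetical callable with the signature `(text, source_lang, target_lang)` that you would back with whichever translation service you use, and the language codes are placeholders.

```python
from typing import Callable, Sequence

# Assumed signature: (text, source_lang, target_lang) -> translated text
Translate = Callable[[str, str, str], str]


def round_trip(
    text: str,
    pivots: Sequence[str],
    translate: Translate,
    home: str = "EN",
) -> str:
    """Translate through each pivot language in order, then back to `home`."""
    current_lang = home
    for lang in list(pivots) + [home]:
        text = translate(text, current_lang, lang)
        current_lang = lang
    return text


if __name__ == "__main__":
    # Fake translator that just records the hops, so the chain is visible.
    def fake(t: str, src: str, dst: str) -> str:
        return f"{t}|{src}>{dst}"

    print(round_trip("hi", ["DE", "JA"], fake))
```

Note that the translation service itself sees the original, unsanitized text, so this moves the trust problem rather than removing it.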


Thanks for the feedback and suggestion!

You make a fair point about the hardware requirements for running an LLM locally. However, thanks to techniques like quantization, it’s becoming feasible to run some of the more basic models without exceptionally powerful hardware.
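To put rough numbers on why quantization helps, here is a back-of-the-envelope estimate of the memory needed just for the weights: parameters times bits per weight, divided by 8 to get bytes. It is a simplification that ignores the KV cache, activations, and the small overhead quantization formats add, so treat the results as lower bounds. The function name is made up for this sketch.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes); weights only."""
    # params_billion * 1e9 params * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for params in (7, 180):
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{params}B parameters at {bits}-bit: ~{gb:g} GB")
```

By this estimate a 7-billion-parameter model drops from about 14 GB at 16-bit to about 3.5 GB at 4-bit, which is within reach of a mid-range GPU, while a 180-billion-parameter model still needs around 90 GB at 4-bit.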

The translation idea you mentioned is clever too! Though after thinking more on this situation, I’m not fully confident in my ability to avoid identification by commercial entities.

So I’ve decided to avoid using their services for sensitive matters and wait 1-2 years until I can afford the hardware for an open source model like Falcon-180B. It just got released and is already about as capable as GPT-3.5, which is really impressive.

Since open source LLMs are advancing rapidly, I’ll hold off until I’m able to run one entirely locally. Thanks again for the insightful ideas! I appreciate you taking the time to share your perspective.


Thank you for sharing that info on Falcon-180B. I’m also hoping to be able to run something like that locally, but I haven’t been paying too much attention, as I thought it would still be some time before that’s feasible.

I just saw this news that Intel and Nvidia are going to be making dedicated chips to run artificial intelligence programs, so things are looking bright on that front, although I’m sure this will come with additional privacy concerns somehow…

https://www.spiceworks.com/tech/artificial-intelligence/news/nvidia-and-intel-to-leverage-ai-in-latest-offerings/#:~:text=Intel%20has%20confirmed%20that%20its,can%20manage%20AI-related%20tasks

For a counselling service, you’d need a fairly decent model, and that would be an issue. I think in the future (give it time) an SLM with specific domain knowledge (e.g. in psychology) could be useful.

I don’t think you can just use some general LLM and expect the right result, especially if it doesn’t have proper guardrails for what it should say. LLMs will often agree with you even when you’re wrong, and in complex human-to-human relationships this leads people to believe their behavior is in the right when it isn’t.

I also don’t believe you can sanitize the input to the cloud LLM; really, the only options are to run it locally or accept that it’s public. Sanitization might also introduce contextual changes, which means you’re not telling the cloud LLM what you really mean.

Ideally, hold off until DDR6 if you can. It’s going to be a major game changer for local LLMs. By then we will also have video cards with a decent amount of VRAM as standard.

Question is affordability. Look at how fast people are snapping up Mac Minis.

A Mac mini probably wouldn’t be sufficient, as the most you can get now is 48GB (and you pay a premium for that). You’d probably need a Mac Studio, and that gets very expensive.

Also, in terms of stack, a lot of these models and toolsets run on a more mature Linux stack (that’s also what’s going to be running in a datacenter), so if you’re self-hosting I wouldn’t buy a Mac for that. Essentially, Linux support on Macs sucks and won’t ever be good because of the distinctly different hardware, and you’re at Apple’s mercy for support for however long that lasts.

The benefits of unified memory will mostly be a lot more achievable with CAMM2 DDR6 RAM, and in the enterprise space with SOCAMM2.

One other thing to note is that NVIDIA has a clear advantage with CUDA, as a lot of models are built around it. AMD’s ROCm is getting there, but still has a ways to go. Whatever Apple has is never going to see the same kind of adoption, as Apple has zero enterprise footprint.