Ollama and Alpaca are still good options when compared with Copilot and the like.
I think we should stop calling Meta’s LLMs (i.e. Llama) open-source LLMs. It’s practically akin to saying Android (not AOSP) is open source. Meta has released the weights of the model, and you are allowed to fine-tune it, use it commercially, etc. But it is not open source. At least LLM researchers, and noobs like me, are not going to call it open source. It will be open source when we can see the training data and the code for the model.
A new (truly) open model has been released. AI2's Molmo shows open source can meet, and beat, closed multimodal models | TechCrunch
On another note, @dngray could you review the PR? It is basically ready now.
IMHO, we should not add anything AI related. As someone working in the field, here are my two cents on why:
- the landscape is evolving so quickly that we wouldn’t know what to recommend
- most of them are secretive about their RAG, continuous learning, and caching of interactions
- another big reason is that THESE MODELS HALLUCINATE, A LOT
- WE SIMPLY DO NOT KNOW WHETHER THE DATA THESE BOTS WERE TRAINED ON WASN’T SCRAPED FROM PEOPLE’S CONVERSATIONS
We can’t talk about privacy and then do a 180° turn and recommend Llama, which may have been trained on data including, but not limited to, the public Facebook photos of Australian users, including children if those photos were posted by adults. Did they also use messages from the era before Messenger was E2EE? Google Research, FAIR, MSAI, none of them say anything about their data or how it was collected, nor does OpenAI. Recommending chatbots on this platform amounts to: “Hey yeaaah, these companies might have harvested all your data to train these models, but because it’s hidden behind this black box of a model, we don’t think it’s a privacy violation or anything of that sort.”
There is also a worrying trend of using chatbots as personal companions, for venting, or for mental health questions, which is through the roof especially amongst young people, who may not really grasp the intricate details of hallucination, data leakage, and contamination, amongst other things that go on in these models.
Forgot to tag @jonah
But the point of PG is to help users preserve their own privacy while using modern tools with workable systems.
Whatever the limitations of AI, people can judge for themselves whether it is useful, but they need assistance determining how to use it in privacy-preserving ways.
Clearly, using Ollama locally is better for privacy than logging into OpenAI and using GPT, or using Microsoft Copilot with your account.
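As a rough illustration of what “locally” means here (a minimal sketch, assuming Ollama is installed and a model such as llama3 has already been pulled; the model name and prompt below are just placeholders), a prompt can be sent to the local instance without touching any external service:

```python
# Minimal sketch: query a locally running Ollama instance over its HTTP API.
# Assumes Ollama is installed and `ollama pull llama3` has been run;
# the model name and the prompt are only illustrative.
import json
import urllib.request

def ask_local_model(prompt: str, model: str = "llama3") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response instead of a stream
    }).encode("utf-8")

    # Ollama listens on localhost by default, so the prompt never leaves the machine.
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask_local_model("Explain what an open-weights model is in one sentence."))
```

Everything stays on localhost, which is exactly the property you can’t get when the assistant runs behind someone else’s account and servers.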
Again, the point isn’t just about preserving one’s own privacy while using tools. IMHO, it’s kind of the polar opposite of promoting privacy to publish a guide for tools created through gross violations of user privacy.
IMHO, we don’t need to have a guide for everything just because it’s the shiny new thing now.
It’s like owning the newest Nike kicks/iPhone and preaching about sweatshops in Bangladesh/China.
To whatever extent privacy was violated to make the models that run in Ollama, it has already happened. Using the tool doesn’t change that.
The damage is already done and lawsuits are underway. We can’t go back. In this context, PG should recommend what is best among what is available.
You are right. But users are going to use AI anyway. It would be good to offer them some alternatives that are less likely to invade their privacy. If we’re going to talk about externalities, we have guides on macOS and iOS, so should we then consider the child labor behind batteries, exploited labor in China, etc.? I know what you mean, and you are right. But practically, it’s not a good idea to say nothing about AI if you want to be an authoritative source on privacy.
So, being closed source, potentially trained on private data, making unverifiable claims, and facing repeated accusations of including the test set in training to inflate results isn’t enough of a dealbreaker to not have an official guide on them, or to just say “Hey, we are waiting this one out until we get some clarification from the devs/companies making them”?
@benm I agree with some of what you said, although it hasn’t “already happened.” Ollama is just an interface for running the actual models, like Gemma and Llama, amongst others. They will always release new versions, because that’s how fast the landscape changes in NLP, and they will keep changing their policies without notice. So it hasn’t already happened; it is happening and will continue to happen.
@win11.shading291 The lawsuits, IIRC, are about infringing the copyrights of big players like NYT and SMG; I don’t recall anything about the use of user data from their platforms, because we don’t know what the models were trained on.
P.S.: I am a bit biased about all of this, because researchers from my former department are the ones who published details of how these models are “cheating” on their evaluations by including the test set in their training.
AI is a tool, and I understand the concerns. However, we generally don’t evaluate ethical concerns. As you said, phones are made with exploitative processes, yet we still recommend buying one, the Google Pixel.
As for the license, Meta has never changed the license of a model after releasing it. They did, however, make the Llama 3.1 license freer by allowing the model to be used for AI training.
People are free to use the model they want, with varying degrees of openness. But open-weights models can’t be defined as closed source.
The training-on-private-data bit, yes, it is true. But on the other hand, anything not private is public. That what you say on forums or other privacy-invading platforms gets sold, used, etc. is nothing new.
Unless there are reliable data extraction methods, I don’t see this as a big enough concern not to list them.
For the hallucinations, feel free to add a warning in the GitHub PR.
I agree, but nonetheless, if this topic is avoided, users will be left uninformed and most people will simply use what’s available.
Some are literally using your personal data to train their crappy AI models. I’ll use an example from Microsoft’s privacy policy:
> As part of our efforts to improve and develop our products, we may use your data to develop and train our AI models.
This is just one of many examples.
Off-topic(ish)
Honestly, I can’t see how anyone can bear to use Copilot, especially with its crappy responses and weird hallucinations that are somewhat amusing yet horrid.
Open-weight models are closed-source models. The closest software analogy I can think of is: these are freeware with the option to add your own plugins (i.e. fine-tuning).
Unless you can simply take the data and code and reproduce the results they got, it’s not open source. Can’t believe people here are falling for FAIR’s bullshit marketing crap.
> But on the other hand, anything not private is public.
I would consider that a very wild take, especially on this website.
> Unless there are reliable data extraction methods
There are; they’re just expensive and would take time, making them lose their edge in a cutthroat field.
The privacy purpose of free open source is that code is open to audit. In other words, you can confirm there is no telemetry phoning home, there isn’t a secret keylogger, there isn’t something malicious that modifies system files, etc.
While there are plenty of other reasons one might want FOSS, these are the reasons related to privacy. Regardless of whether you categorize AI as “FOSS” because of details like the weights, the training data, etc., none of the privacy concerns normally associated with proprietary software apply.
Ultimately the main objections are:
- The boycott argument - these models are doing bad things to get their training data, or cheating on benchmarks, so we should not use them because we have a duty to boycott them
- The quality argument - these models produce hallucinations or other results that are not good.
Personally, I think the boycott argument doesn’t work, since PG already recommends tools for privacy-violating platforms anyway. Whether you boycott is a personal decision, not everyone is an activist, and it’s not even clear how boycotting would stop or slow down the practices you oppose. More likely it just leaves people behind in an emerging technology.
The quality argument also doesn’t work, because people can decide for themselves whether the quality is high enough for their use case.
There is another fully open model from AMD.
@brivacy Yeah, Molmo looks great.
I am a bit discouraged by the fact that Privacy Guides seems completely uninterested in approving the PR, even though this is a popular topic.
I appreciate your reply!
Is there an AI tool that’s connected to the Internet that protects user privacy?
I’ve been using Brave’s “Leo AI” since Brave has a great reputation in protecting user privacy.
duck.ai or HuggingChat, as discussed above.
Edit: forgot to add link.
I appreciate your reply!
How does Brave’s Leo AI compare for user privacy?