Another related article…
Isn’t DeepSeek = HuggingChat? I believe HuggingChat was recommended here somewhere? Maybe that should be removed.
Nope, these articles talk about the newish LLM models V3 & R1 hosted by the Chinese firm DeepSeek. The privacy issues stem from the cloud version. You can download the weights for V3 & R1 from Hugging Face for free and run them locally on your devices without the privacy implications.
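If you want to try that locally, here’s a rough sketch using the Hugging Face transformers library (assuming one of the smaller R1 distills rather than the full V3/R1 weights, and that your machine has enough memory for the checkpoint you pick):

```python
# Rough sketch: run a DeepSeek-R1 distill locally with Hugging Face transformers.
# The 32B distill below is listed on HuggingChat; swap in a smaller distill if
# your hardware can't hold it. device_map="auto" assumes the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain knowledge distillation in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```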
DeepSeek =/= HuggingChat.
But a DeepSeek “distill” is one of the options available on Hugging Face/HuggingChat.
List of currently available models on HuggingChat
Qwen/Qwen2.5-72B-Instruct
meta-llama/Llama-3.3-70B-Instruct
CohereForAI/c4ai-command-r-plus-08-2024
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
Qwen/QwQ-32B-Preview
Qwen/Qwen2.5-Coder-32B-Instruct
meta-llama/Llama-3.2-11B-Vision-Instruct
NousResearch/Hermes-3-Llama-3.1-8B
mistralai/Mistral-Nemo-Instruct-2407
microsoft/Phi-3.5-mini-instruct
(The media coverage and popular discourse around DeepSeek is fairly confusing at the moment, since most of it doesn’t do a good job of differentiating between DeepSeek the open-source model, DeepSeek the app and cloud-hosted service, and DeepSeek “distills”. [1])
A “distilled version” of a model refers to a process in machine learning called knowledge distillation. It involves taking a large, complex model (called the teacher model) and transferring its knowledge into a smaller, more efficient model (called the student model). The distilled model is trained to mimic the predictions of the larger model while maintaining much of its accuracy. ↩︎
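If it helps make the footnote concrete, here’s a toy sketch of the classic distillation loss in PyTorch (the teacher/student modules and the commented training step are hypothetical; the point is just that the student is trained against the teacher’s softened predictions as well as the real labels):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Classic knowledge-distillation objective: a weighted mix of
    (1) KL divergence between the teacher's and student's temperature-softened
        output distributions ("soft targets"), and
    (2) ordinary cross-entropy on the ground-truth labels ("hard targets")."""
    # Soft targets: push the student's softened distribution toward the teacher's.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # scale by T^2 to keep gradient magnitudes comparable
    # Hard targets: normal supervised loss on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Hypothetical use inside a training step:
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(batch)
# student_logits = student(batch)
# loss = distillation_loss(student_logits, teacher_logits, batch_labels)
# loss.backward()
```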