PocketPal (Mobile App that Runs AI Locally)

I just found GitHub - a-ghorbani/pocketpal-ai: An app that brings language models directly to your phone.
It is available on Android and iOS and allows you to run small LLMs from Meta, Google, Alibaba, etc.

On my Pixel 9, it worked really well. I am curious to see how it runs on other devices!

2 Likes

Check it out
PocketPal is a mobile app to run LLMs locally!

3 Likes

I just tried it. An awesome app.

1 Like

What device are you using?

iOS

1 Like

I am more interested in the specs, like RAM and processor.

I tried running it on both the Pixel 8 Pro and the iPhone 16 Pro. Although the iPhone has 8GB of memory, the largest model I was able to load was around 4.5GB. All models larger than that would not load at all.

In contrast, the Pixel has more memory than my iPhone, which allowed me to run larger models (around 9GB memory, 8B parameters). However, the inference time was painfully slow.
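
For anyone wondering why ~4.5GB was the ceiling, here is the rough arithmetic, as a minimal sketch. The bits-per-weight figures and the overhead factor are my assumptions, not PocketPal internals; iOS also caps how much of the total RAM a single app may allocate, which likely explains the hard limit.

```python
# Rough estimate of the RAM needed to load a quantized LLM.
# The bits-per-weight values and the ~20% overhead factor
# (KV cache, runtime buffers) are assumptions, not app internals.

BITS_PER_WEIGHT = {"Q4": 4.5, "Q5": 5.5, "Q8": 8.5, "F16": 16.0}

def model_size_gb(params_billions: float, quant: str, overhead: float = 1.2) -> float:
    """Approximate in-memory size of a quantized model in GB."""
    bytes_per_weight = BITS_PER_WEIGHT[quant] / 8
    # 1e9 params times bytes/weight, divided by 1e9 bytes/GB, cancels out
    return params_billions * bytes_per_weight * overhead

# An 8B model at Q4 lands around 5.4 GB, already past the ~4.5 GB
# the iPhone would load; at Q8 it is far out of reach.
for quant in ("Q4", "Q5", "Q8"):
    print(f"8B @ {quant}: ~{model_size_gb(8, quant):.1f} GB")
```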

1 Like

I prefer to use the Duck chat now, but the app still seems very cool.

Perhaps when mobile hardware improves or larger models become more efficient, this will become a thing.

iPhone 15 (A16).

The biggest Qwen model (Qwen2.5 3B, Q5 quantization) is sometimes a bit slow to load. But apart from that, the app works fine.

Interesting. How was the speed with 3B models?

I guess it’s better to stick with SLMs. When I have time, I will open a PR to add PocketPal and also explain what you can and can’t do with AI on a smartphone.

On Pixel 9, I found Llama 3 8B speed to be OK actually. It’s not super fast, but it’s basically as fast as you can read.

Probably better to stick with Q4. Is the generation speed OK then?

Yes. It takes approximately 10 seconds to load, and the generation speed is 8-10 tokens per second.
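
That lines up with the usual back-of-the-envelope model: single-user decoding is roughly memory-bandwidth bound, since each token streams the full set of weights. A minimal sketch below; the bandwidth and efficiency numbers are assumptions, not measured Pixel 9 specs.

```python
# Back-of-the-envelope decode speed: each generated token reads
# (roughly) every weight once, so the token rate is bounded by
# effective memory bandwidth divided by model size.

def est_tokens_per_sec(model_gb: float, bandwidth_gbps: float, efficiency: float = 0.5) -> float:
    """Rough token-rate ceiling for single-batch decoding."""
    return bandwidth_gbps * efficiency / model_gb

# Assumed: ~60 GB/s of LPDDR5X bandwidth at 50% efficiency, and a
# ~4.5 GB Llama 3 8B Q4 file -> roughly 7 tokens/s, in the same
# ballpark as the 8-10 tokens/s reported above.
print(f"~{est_tokens_per_sec(4.5, 60.0):.0f} tokens/s")
```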

1 Like

I am really enjoying it, thank you!
Though it remains to be seen whether I have a real use case for it.

I am using a Pixel 7 (8GB RAM) and it’s usable.

1 Like

It’s pretty handy, or just fun for passing the time, when paired with e.g. Gemma 3 1B or 4B if you have 8GB+ RAM.

I do not object to this.
I’m aware of PocketPal; it’s a great app. It allows any local model from Hugging Face to be downloaded within the limits of system memory (it does not check, so you’ll have to do your own due diligence). I’ll check the PR; we could explain which models are usable based on the amount of system memory.
Edit: I did not realize PocketPal was on iOS, good to know!
Do I have permission to improve upon your PR? @Encounter5729

Go ahead. There is already a table with model sizes and corresponding hardware, I think.
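
In case it’s useful for that table, here is a hedged sketch of the RAM-to-model-size mapping. The usable-RAM fraction and GB-per-parameter figures are my guesses, not values from the existing table.

```python
# Map device RAM to the largest model likely to fit at Q4.
# Assumed: ~0.68 GB per billion parameters at Q4 (including
# runtime overhead) and only ~60% of total RAM being available
# to a single app on mobile. Both numbers are guesses.

GB_PER_BILLION_PARAMS_Q4 = 0.68

def max_params_billions(total_ram_gb: float, usable_fraction: float = 0.6) -> float:
    """Largest Q4 model (in billions of params) likely to fit."""
    return total_ram_gb * usable_fraction / GB_PER_BILLION_PARAMS_Q4

for ram in (4, 6, 8, 12, 16):
    print(f"{ram} GB RAM -> up to ~{max_params_billions(ram):.0f}B params at Q4")
```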

I have suggested changes.

Some time ago it was mentioned on this forum that PocketPal connected to Firebase and Google Analytics on each start. Did this change?

If the Exodus scan is anything to go by, probably. I have never seen this connect to Firebase or anything like that, though; I don’t know what the basis for that claim is.

Exodus report:

Though if I had a secondary Android device, I could use App Manager to inspect it, but I hope someone else does this instead.

[And even then, PocketPal works completely offline after you get the LLM(s)]

No.

Firebase is used for the benchmark feature, but the app inadvertently phones home on each start regardless.

Is this just a ping, or is other info sent?