If you look at their site menu and go to security they claim to use something called “confidential computing”, but I don’t know or understand that technology.
Does anyone know what it is?
It sounds like Apple’s Private Cloud Compute, where, to my understanding, data processed on their servers is inaccessible. Basically, even though their computers are running the AI that you prompt, they can’t really see what you’re doing.
This Privatemode seems pretty interesting. Maybe we could open up a topic asking for it to be added to the AI chat section.
So I found a blog from Nvidia from July 2024 about this.
It seems that Privatemode used to be called Continuum AI (the link in this article goes to Privatemode’s website).
This paragraph was particularly interesting.
Continuum addresses the problem by running the AI code inside a sandbox on the confidential computing-protected AI worker. In general terms, a sandbox is an environment that prevents an application from interacting with the rest of a system. It runs the AI code inside an adapted version of Google’s gVisor sandbox. This ensures that the AI code has no means to leak prompts and responses in plaintext. The only thing the AI code can do is receive encrypted prompts, query the accelerator, and return encrypted responses.
This isn’t the same as E2EE. A remote server still processes your prompts in plaintext; it’s just that the part that processes them is sandboxed and has no network access.
Users interact directly with the AS [Attestation Service] and the workers or through a trusted web service. Users verify the deployment using the AS and set their inference secrets. Then they can send encrypted prompts to the service. The encryption proxy decrypts these prompts, forwards them to the sandbox, re-encrypts the responses, and sends them back to the user.
The main difference from E2EE is that a vulnerability anywhere along this chain could lead to leaks.
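To make that quoted flow concrete, here’s a toy sketch of the encryption proxy’s job (this is my illustration, not Privatemode’s actual code; the SHA-256 XOR stream cipher stands in for real authenticated encryption, and all names are made up). The point it shows: the proxy inside the confidential-computing boundary *does* see plaintext, which is exactly why this isn’t E2EE.

```python
import hashlib
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Deterministic keystream from a hash counter (toy construction, insecure).
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # XOR with the keystream: the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))

def sandboxed_model(plaintext_prompt: bytes) -> bytes:
    # Stand-in for the gVisor-sandboxed AI code: no network, no key access,
    # it only ever sees plaintext handed to it and returns plaintext.
    return b"response to: " + plaintext_prompt

def encryption_proxy(session_key: bytes, nonce: bytes, encrypted_prompt: bytes) -> bytes:
    # The proxy holds the session key, so plaintext exists here -- the
    # confidential-computing hardware is what keeps outsiders from reading it.
    plaintext = xor_crypt(session_key, nonce, encrypted_prompt)
    answer = sandboxed_model(plaintext)
    return xor_crypt(session_key, nonce + b"resp", answer)

# Client side of the round trip:
key = secrets.token_bytes(32)    # the "inference secret" set after attestation
nonce = secrets.token_bytes(12)
ct = xor_crypt(key, nonce, b"hello")
enc_answer = encryption_proxy(key, nonce, ct)
print(xor_crypt(key, nonce + b"resp", enc_answer))  # b'response to: hello'
```

Note the design consequence: anyone who compromises the proxy (or the hardware protection around it) gets plaintext, which is the gap compared to true E2EE.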
So then here is my very basic understanding so far…
Privatemode’s app has some sort of mechanism in place that verifies the hardware that’s going to process your prompt is properly set up for confidential computing.
I type a prompt and hit send.
My prompt is encrypted on my device locally.
My prompt arrives at the GPU encrypted.
The GPU decrypts my prompt to plaintext, processes it, and gives a plaintext response.
During this part, the plaintext data the GPU is processing can’t be accessed by anyone, including the service provider or even someone with physical access to the hardware. So this is the sandbox?
The plaintext response is encrypted and sent back to me.
The encrypted response is finally decrypted on my device locally.
Does this make sense? I just don’t understand how the GPU can decrypt a prompt if it was encrypted with a key that only I have.
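Your step 1 (verifying the hardware) is the attestation check, and here’s a hypothetical sketch of what that could look like (every name here is invented; real reports are signed by the CPU/GPU vendor’s keys, and an HMAC stands in for that signature). The idea: the client only trusts the deployment if the report is genuinely signed *and* the measured code hash matches a known-good value.

```python
import hashlib
import hmac

# The hash of the enclave software the client expects to be running
# (hypothetical value for illustration).
EXPECTED_MEASUREMENT = hashlib.sha256(b"worker image v1").hexdigest()

# Stand-in for the hardware vendor's report-signing key.
VENDOR_KEY = b"stand-in vendor signing key"

def verify_attestation(report: dict) -> bool:
    # Accept only if the report is genuinely signed AND the measured code
    # matches what we expect -- otherwise never release the inference secret.
    sig = hmac.new(VENDOR_KEY, report["measurement"].encode(), "sha256").hexdigest()
    return (hmac.compare_digest(sig, report["signature"])
            and report["measurement"] == EXPECTED_MEASUREMENT)

# A worker running the expected code:
good = {"measurement": EXPECTED_MEASUREMENT}
good["signature"] = hmac.new(VENDOR_KEY, good["measurement"].encode(), "sha256").hexdigest()

# A worker running tampered code -- validly signed, but the hash won't match:
evil = {"measurement": hashlib.sha256(b"tampered image").hexdigest()}
evil["signature"] = hmac.new(VENDOR_KEY, evil["measurement"].encode(), "sha256").hexdigest()

print(verify_attestation(good))  # True
print(verify_attestation(evil))  # False
```

So "verifying the deployment" in the quoted blog boils down to: check the signature proves real confidential-computing hardware, check the measurement proves it’s running the expected code, and only then hand over your key.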
That’s a puzzling part, but I guess it could work like Signal. The important thing to understand is that the encryption is between you and the sandboxed enclave.
What worries me more is the answer coming back, because by definition they have to send you the answer. So how do they encrypt it? How does the key exchange work? Etc.
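My guess at the answer (this is a general pattern, not Privatemode’s documented protocol): a key exchange with the attested enclave, so both sides end up with the same symmetric session key, and that one key protects traffic in *both* directions. Here’s a toy Diffie-Hellman sketch with deliberately simple parameters (real deployments would use a standardized scheme like X25519; the HMAC stands in for the vendor’s attestation signature):

```python
import hashlib
import hmac
import secrets

# Toy group parameters: p is the Mersenne prime 2^521 - 1. Fine for a demo
# of the math, NOT a secure choice of group.
P = 2**521 - 1
G = 5

VENDOR_KEY = b"stand-in for the hardware vendor's signing key"

# --- Enclave side: make a DH key pair and publish the public half inside
# --- a (simulated) signed attestation report.
enclave_priv = secrets.randbelow(P - 2) + 1
enclave_pub = pow(G, enclave_priv, P)
report = {"enclave_pub": enclave_pub}
report["signature"] = hmac.new(VENDOR_KEY, str(enclave_pub).encode(), "sha256").hexdigest()

# --- Client side: verify the report, then do its half of the exchange.
expected = hmac.new(VENDOR_KEY, str(report["enclave_pub"]).encode(), "sha256").hexdigest()
assert hmac.compare_digest(expected, report["signature"]), "attestation failed"

client_priv = secrets.randbelow(P - 2) + 1
client_pub = pow(G, client_priv, P)  # sent to the enclave in the clear

# --- Both sides derive the SAME session key; it never crosses the wire.
client_key = hashlib.sha256(str(pow(report["enclave_pub"], client_priv, P)).encode()).digest()
enclave_key = hashlib.sha256(str(pow(client_pub, enclave_priv, P)).encode()).digest()
assert client_key == enclave_key
```

That would resolve both puzzles at once: the enclave can decrypt your prompt because the key is a *shared* session key (not one only you have), and the response comes back encrypted under that same key, with attestation being what stops a man-in-the-middle from impersonating the enclave during the exchange.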