Hello, gravel here;
An upcoming change[1] to Session Messenger may undermine the encryption used for file attachments. The new encryption scheme is deterministic, meaning “the same user uploading an identical attachment results in an identical encrypted copy.”
I’d like to shed some light on this change, why I think it’s a problem, and suggest changes that preserve the qualities of deterministic encryption.
Disclaimer: I am not a cryptographer.
Before you proceed: if you’re aching to respond with “Session is trash anyway”, then this thread is not for you. Otherwise, feel free to continue reading.
To illustrate the issue with deterministic attachment encryption, imagine the following scenario:
Alice and Bob are talking, and Eve can capture encrypted file blobs [between Alice and Bob].
Let’s say Alice and Bob are doing insider trading (Eve has figured this out) and Alice covertly uses one image to signal [to Bob] buying stocks, and another one to signal selling stocks.
Under deterministic encryption, Eve can tell, for every new image sent by Alice, which of the two images it is (and predict stocks herself). This attack is passive and can also be performed offline on captured blobs to extract information from past conversations.
The fact exploited by this scenario is that Eve can tell when a file has been repeated. However, Alice and Bob assume the opposite when relying on their encryption.
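Here is a minimal sketch of the scenario above. It does not use Session's actual code: `toy_encrypt` is a made-up, insecure stand-in cipher built from the Python standard library, and deriving the nonce from the user's key and the file content is my assumption about what "deterministic" means here. The only point is that Eve recovers the pattern of repeats without decrypting anything.

```python
import hashlib
import hmac

def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Toy cipher for illustration only (NOT secure): XOR with a SHAKE-256 keystream."""
    keystream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, keystream))

def deterministic_encrypt(user_key: bytes, plaintext: bytes) -> bytes:
    # Hypothetical derivation: the nonce depends only on the user's key and the file.
    nonce = hmac.new(user_key, plaintext, hashlib.sha256).digest()[:24]
    return toy_encrypt(user_key, nonce, plaintext)

alice_key = b"alice-secret-key"
buy = b"<bytes of the 'buy' image>"
sell = b"<bytes of the 'sell' image>"

# Alice sends a sequence of signals over time; Eve captures the encrypted blobs.
captured = [deterministic_encrypt(alice_key, img) for img in (buy, sell, buy, buy, sell)]

# Eve never decrypts anything. She only labels blobs by order of first appearance.
labels = {}
pattern = [labels.setdefault(blob, len(labels)) for blob in captured]
print(pattern)  # [0, 1, 0, 0, 1]
```

Once Eve correlates label 0 and label 1 with what the market did afterwards, she knows which image means what.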
Is this exploit practical to pull off? Well, …
In practice, Eve is limited to intercepting blobs at the file server (only [Session’s] onion/transport encryption is stopping her). With a variation where the file server is a non-onion service that needs an exit node, Eve could also intercept blobs at the exit node.
Even if other users/dummy images are added to the scenario, Eve can always rely on the encrypted blobs to recognize the signature files being sent. The scenario can be expanded to any number of telltale images being sent (or even contrived examples like sending each letter as an image). Real scenarios are more complex, but will still resemble the original scenario in some ways.
… In other words, conversation secrecy is dependent on transport encryption and the number of users in the network.
So what’s the reason for this change?
The cited reason is “allowing deduplication in case of […] file server response timeouts or repeated uploads of the same profile picture.”
A bit of context: Session’s files are almost[2] exclusively handled by a centralized file server. This is a free service that doesn’t require any account to use (in the traditional sense)[3].
In this situation, de-duplicating files is a legitimate strategy to maintain usability [as opposed to expiring files early]. And if the same file (from the same user) always looks the same, de-duplicating becomes “easy”!
Can we achieve this without deterministic encryption?
The answer is yes!
Let’s revisit the commit description:
the encryption added here uses deterministic […] nonce and key generated from the user
What’s a “nonce”? It’s a number you should only use once!
Well, we’re clearly not doing that here… But the next best thing is to use a nonce as little as possible.
If the nonce also depends on the time (say, the current date), the same file no longer encrypts to the same blob on different days.
What about changing the nonce every hour?
[Suddenly,] the conversation can no longer be compromised by comparing samples from literally just ANY two points in time. Only samples from within the same 1-hour window can still be matched.
That sounds pretty good! And the file server can still de-duplicate files posted by the same user in the same hour (which is fine for most client glitches or denials of service).
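A rough sketch of the hourly nonce idea, under the same caveats as before: the cipher is a toy (not secure), and the function names, the one-hour `PERIOD_SECONDS`, and the exact nonce derivation are my own assumptions, not anything from libsession-util. The nonce is shipped with the blob, so the recipient does not need to know when the file was uploaded.

```python
import hashlib
import hmac

PERIOD_SECONDS = 3600  # assumed nonce period (one hour); not a value from Session

def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Toy cipher for illustration only (NOT secure): XOR with a SHAKE-256 keystream."""
    keystream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, keystream))

def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    # The 24-byte nonce travels with the blob, so the recipient needs no clock.
    nonce, body = blob[:24], blob[24:]
    keystream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ k for c, k in zip(body, keystream))

def encrypt_with_period_nonce(user_key: bytes, plaintext: bytes, unix_time: int) -> bytes:
    # Hypothetical derivation: the nonce depends on key, file AND the current period.
    period = unix_time // PERIOD_SECONDS
    tag = hmac.new(user_key, period.to_bytes(8, "big") + plaintext, hashlib.sha256)
    return toy_encrypt(user_key, tag.digest()[:24], plaintext)

key = b"alice-secret-key"
image = b"<bytes of the 'buy' image>"
hour_start = 500_000 * PERIOD_SECONDS  # start of some arbitrary hour

same_hour_a = encrypt_with_period_nonce(key, image, hour_start + 60)
same_hour_b = encrypt_with_period_nonce(key, image, hour_start + 3000)
next_hour = encrypt_with_period_nonce(key, image, hour_start + 3700)

print(same_hour_a == same_hour_b)            # True: the server can still de-duplicate
print(same_hour_a == next_hour)              # False: the two uploads no longer match
print(toy_decrypt(key, next_hour) == image)  # True
```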
What do the developers say?
In lieu of a more detailed explanation, see the following message excerpts:
KeeJef
Hmm, I’m not sure this example represents a major risk. Users don’t tag uploaded files with any kind of user identifier, and use ORs [onion requests] to upload and download images, so even if someone had access to the encrypted blobs on the file server, they wouldn’t know who uploaded or downloaded them. For the attack to work, the users would need to be sending the exact same images back and forth, and Eve would also have to understand what those blobs represent. That would effectively require her to have compromised one of the devices in the conversation, at which point she wouldn’t need to infer the content by analyzing file IDs. The main reason deterministic encryption is being introduced is to reduce load on the file server, particularly for profile images that are frequently reuploaded and otherwise create many duplicate copies of the same image.
my response
== On tagging uploads ==
In the worst case / undiluted scenario, Alice and Bob are the only ones using Eve’s file server. This should pose no problems for secrecy if the file server is trustless (metadata is, of course, only protected for a large userbase). Tagging uploads doesn’t make a difference since we know only Alice is uploading. Diluting this assumption would cause a lot of work for Eve, but even at small numbers this is still feasible. I admit I don’t have a mental model for the statistical attack, but time is on Eve’s side here.
== On Eve understanding the context ==
This does not require device compromise. Eve is allowed to affect the circumstances (affect the stock market) to increase the chance of Alice sending one image or the other and correlate the results. See: Chosen-plaintext attack and Gardening (cryptanalysis) on Wikipedia.
The starting assumption is that Eve suspects something — in cryptanalysis that is not a controversial starting point IMO. As before this relies on the user counts involved (but imagine e.g. a corporate Session subnet). I can’t offer advanced analysis, but for security guarantees, the burden of proof is on the cryptographer’s side, not the cryptanalyst’s.
===
Arguably any Eve would have trouble pulling this attack off on the current mainnet. But this is not about showing a PoC. This is about showing a weakness — showing that guarantees are limited and not adaptable to future environments.
== On file server load ==
This is why I suggested time-based nonces. If a profile picture will be reuploaded 10 times in a given period T, choose that as the nonce period. Files uploaded in the same period will still be de-duplicated (stored at most twice if the period equals the file expiry time).
Choosing any nonce period will always be better than deterministic encryption! Under DE, a file encrypted in 2025 DOES NOT DIFFER from the same file encrypted 50 years later. The file server cannot justify that!
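To sanity-check the "stored at most twice" claim, here is a small simulation. It assumes a hypothetical file server that de-duplicates identical blobs and refreshes a blob's expiry whenever it is re-uploaded. The values of `PERIOD` and `EXPIRY` are made up, and I don't know how Session's file server actually handles expiry.

```python
PERIOD = 24 * 3600  # assumed nonce period T, in seconds
EXPIRY = PERIOD     # assumed file expiry, set equal to the nonce period

def live_copies(upload_times, now):
    """Count distinct stored blobs of ONE file (from one user) at time `now`."""
    last_upload = {}  # period index -> time of the latest upload in that period
    for t in upload_times:
        if t <= now:
            bucket = t // PERIOD  # same bucket -> same nonce -> identical blob
            last_upload[bucket] = max(t, last_upload.get(bucket, t))
    # A blob is still stored if its latest upload is younger than EXPIRY.
    return sum(1 for t in last_upload.values() if now - t < EXPIRY)

# The same profile picture is re-uploaded 10 times per period, for 10 periods.
uploads = range(0, 10 * PERIOD, PERIOD // 10)

# Check the server every 10 minutes and record the worst case.
worst = max(live_copies(uploads, now) for now in range(0, 10 * PERIOD, 600))
print(worst)  # 2
```

With a fresh random nonce per upload, the same workload would leave up to 10 live copies on the server. With fully deterministic encryption it would leave 1. The period nonce sits at 2.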
Special Thanks
I would like to thank Session user UneliasHirvi for bringing this issue to my attention, as well as Soatok for inspiring me to bring these topics into a public forum!
[1] Add new attachment encrypt/decrypt functions · session-foundation/libsession-util@0f8a7f5 · GitHub
[2] The only exception being files uploaded to Session Communities, which manage files separately.
[3] Session has no notion of account registration. Unlike Signal, where you need a phone number to sign up, a Session account is just a local keypair. This makes rate-limiting and load management a high priority. In fact, in July of 2024, the slow processing of outbound requests arguably caused a sharp decline in active users: Session User Engagement Report · Issue #60 · oxen-io/oxen-improvement-proposals · GitHub