About the anonymity/pseudonymity of Signal

I believe Signal offers strong pseudonymity and (to some extent) anonymity. But many people, including the PG forum, say that Signal does not provide anonymity. I’d like to discuss this further.

1. What is the exact definition of (online) anonymity?

This YouTube video from PG defines anonymity as “The ability to act without a persistent identifier”.

This blogpost from PG says “The sender and/or recipient’s real ID is unknown”.

This post from Proton says “Keeping your identity private, but not your actions”.

OpenVPN’s blog claims that the purpose of anonymity is to protect an identity from being revealed.

Lastly, ExpressVPN’s blog says “Anonymity is allowing people to see what you’re doing, but not that you’re the one doing it”.

2. Signal’s identifiers

As we all know, Signal requires a valid phone number to register and all account information is tied to that phone number, thus allowing Signal to know what phone number is related to which account.

But I’m not sure why this directly leads to the conclusion that Signal is not anonymous. I believe anonymity is preserved if someone who has chatted with me on Signal cannot be sure that the other party is me, which is similar to the case of Tor, where the operator of a site I visited cannot trace the visit back to me.

According to this Signal transparency report, Signal cannot identify an account when given a profile name (which is obvious since profile names are end-to-end encrypted).
Also, Signal has the username feature as an alternative to phone number for initiating chats. As long as I set “Who can see my number” and “Who can find me by number” to “Nobody”, no one can find me on Signal or even determine whether I’m using Signal (unless they forensically examine my phone, but that’s something else).

About usernames, Signal cannot see or produce the username of a given account. If provided with the plaintext of a username known to be in use, Signal can connect that username to the Signal account that the username is currently associated with. However, once a username has been changed or deleted, it can no longer be associated with a Signal account. (Source: https://signal.org/blog/phone-number-privacy-usernames/)

Putting the above together, other users (including law enforcement) cannot identify a user they have chatted with. When given a phone number, Signal can confirm whether that number is associated with a Signal account and provide two timestamps, but this should be a matter of plausible deniability, not anonymity. Tor, too, does not hide the fact that you’re using Tor. It masks your real identity from others.

Yes, the phone number is the reason we don’t consider Signal anonymous.
Most people register with a number that’s tied to their identity, unless it’s a prepaid or VoIP number.
Signal provides great pseudonymity if that “unless…” applies. In my experience, some people are clever enough to keep their identity separate, which makes it somewhat anonymous, but it’s mostly pseudonymity.

Everything you do online leaks bits of identifying information that eventually deanonymize you. It’s a massive game of Guess-Who and every cookie, web-bug, LSO, list of available fonts, every canvas fingerprint and IP-address, every post you write with your language, unique writing style, vocabulary, punctuation can be used to rule out other people, and eventually reveal your identity.

So

The ability to act without a persistent identifier

is a bit misleading, as your actions again reveal information that can deanonymize you. You can act anonymously, but depending on the situation there’s a limit: the largest number of actions you can take, under the maximum amount of blending/masking, without being singled out.

Keeping your identity private, but not your actions

Again, same thing.

“Anonymity is allowing people to see what you’re doing, but not that you’re the one doing it”.

Again, same thing

The sender and/or recipient’s real ID is unknown

Describes anonymity in the context of secure communication, which is practical, but a rather narrow definition.

claims that the purpose of anonymity is to protect an identity from being revealed.

This is a bit circular, but sure, it defines what anonymity means in one way.


Merriam-Webster says:

the quality or state of being anonymous

As for Anonymous, it says

  1. of unknown authorship or origin
  2. not named or identified
  3. lacking individuality, distinction, or recognizability

So, to tie all of these into a practical context, anonymity means your real-life identity cannot be linked to any pseudonymous (shadow) profile under which every action and piece of partially identifying information that you and/or your devices have ever leaked about you has been collected.


I believe anonymity is preserved if someone who has chatted with me on Signal cannot be sure that the other party is me

It boils down to what they can infer from your writing style. If you’re the only one who, at the same time,

  • is interested in crypto anarchy,
  • does beekeeping,
  • thinks French fries come from France, and
  • writes license as liesense,

that may be enough for a buddy of yours to tell it’s you. So anonymity requires quite a bit of OPSEC.
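
To make the Guess-Who idea concrete, here is a minimal Python sketch of how each leaked trait shrinks the set of people you could be. The population, user names, and trait labels are all invented for illustration; real deanonymization works on far messier data.

```python
# Toy population: every name and trait below is made up for illustration.
POPULATION = {
    "user_a": {"crypto_anarchy", "beekeeping"},
    "user_b": {"crypto_anarchy", "fries_from_france"},
    "user_c": {"beekeeping", "spells_liesense"},
    "user_d": {"crypto_anarchy", "beekeeping", "fries_from_france", "spells_liesense"},
    "user_e": {"fries_from_france", "spells_liesense"},
}


def anonymity_set(population, observed):
    """Return the users whose traits include every observed trait."""
    return {name for name, traits in population.items() if observed <= traits}


def shrinking_sets(population, leaks):
    """Return the anonymity set size after each successive leaked trait."""
    observed = set()
    sizes = []
    for trait in leaks:
        observed.add(trait)
        sizes.append(len(anonymity_set(population, observed)))
    return sizes
```

Feeding the four traits from the list above into `shrinking_sets` one at a time shows the candidate pool dropping until only one user is left, at which point a pseudonym no longer helps.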

It doesn’t boil down to just the phone number.

The reason we say Signal is not anonymous is that the client does not take steps to protect your anonymity.

  1. Again, it requires your phone number, which it sends to the server. If the service were secretly run by an intelligence establishment, they could trivially cross-correlate your phone number with your IMEI, triangulate your position with IMSI catchers, and then look up your identity based on your location/address. This applies even to a burner phone with a pre-paid SIM.

  2. It does not actively protect your IP-address from the server. For that, you’ll want something

    • that connects to the server via a Tor exit node, or even better,
    • that uses a Tor Onion Service server, or even better,
    • that uses a Tor Onion Service based peer-to-peer architecture.

    This prevents a service run by an intelligence establishment from asking the ISP that holds the IP-address block which subscriber the IP was assigned to on a given date and time.

Also, Signal has the username feature as an alternative to phone number for initiating chats.

This allows hiding the phone number from contacts, and if Alice is just JaneDoe.012 then one could argue she’s anonymous from her contacts, at least until she leaks too much information about herself.

The good thing is she knows how much she’s leaking. The bad thing is she might not be able to stop: as peers can’t tie her next username to her old one, they can’t reach her once she changes her username.

Knowing this might make Alice stick to her username, which might also come back to bite her in another way: Bob can prove to Charlie he knows Charlie’s contact, and leak to him information about her.

EDIT: Signal apparently uses usernames in a more throwaway fashion. See discussion below.

Still, Cwtch handles it much better, as you have very granular control over which of your profiles are throwaway and which ones are persistent and reserved for IRL friends etc.


What Signal is doing is pretty good for a content-private messenger. But Signal doesn’t advertise itself as a metadata-private messenger.

End-to-end encryption is a technology used to protect content. It’s deployed on the Signal client, which is open source and which you can compile and run on devices you own and control. The mechanism features public key fingerprints in the form of safety numbers, which allow you to verify that the E2EE indeed happens between you and your contact, and that there is no man-in-the-middle attack by the server or some third party. Signal thus has content-privacy by design.

Cwtch’s Onion Service routing mechanism is a technology used to protect metadata. It’s deployed on the Tor client running as a subprocess. It’s open source, and you can compile and run it on devices you own and control. The Tor network is not run by you, but the odds of all 9000+ Tor nodes blindly running backdoored code are rather slim. Tor gives the best technical chances of hiding your IP-address out there, and the NSA has admitted on its top secret slides that it can’t deanonymize Tor users on demand. Cwtch thus has metadata-privacy by design.

Signal’s metadata protection is based on the company-wide decision to run their servers in a way that masks phone numbers and doesn’t collect metadata about which users are conversing. If Signal were malicious, or greedy like Meta/WhatsApp, or compelled by national security law, Signal’s service could collect as much metadata as WhatsApp does. Because users don’t get a say in this, we say Signal has metadata-privacy by policy.

So because the Signal client doesn’t take active steps to protect your identity from the Signal server with Tor, and because the Signal server has a policy of requiring a phone number for registration, it’s not anonymous in the same way it’s E2EE.

This is fine; Signal isn’t lying about it or boasting metadata-privacy it doesn’t provide. It’s enough that users know this and use Cwtch when they need those protections.

If this is referring to Signal and how it functions, a username change has no bearing on current connections, as the ACI has already been exchanged and is persistent. The username is just a mapping function for the ACI, and once you have the ACI, the username is discarded entirely client-side.
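
As a rough illustration of that "username is only a lookup key" idea, here is a toy Python model. It is not Signal's code or API: `ToyDirectory`, `ToyClient`, the SHA-256 hashing, and the lowercase normalization are all assumptions made up for this sketch.

```python
import hashlib


class ToyDirectory:
    """Hypothetical server-side table: username hash -> account ID (ACI)."""

    def __init__(self):
        self._by_hash = {}

    @staticmethod
    def _digest(username):
        # Assumption for the sketch: the table is keyed by a hash of the name.
        return hashlib.sha256(username.lower().encode("utf-8")).hexdigest()

    def set_username(self, aci, username):
        # One username per account: drop any earlier mapping for this ACI.
        self._by_hash = {h: a for h, a in self._by_hash.items() if a != aci}
        self._by_hash[self._digest(username)] = aci

    def lookup(self, username):
        return self._by_hash.get(self._digest(username))


class ToyClient:
    """Hypothetical client that remembers contacts by ACI only."""

    def __init__(self):
        self.contacts = set()

    def start_chat(self, directory, username):
        aci = directory.lookup(username)
        if aci is None:
            raise LookupError("no account currently uses that username")
        self.contacts.add(aci)  # the username itself is not stored
        return aci
```

In this model, once Bob has resolved `JaneDoe.012` to an ACI, Alice can change her username as often as she likes: Bob's contact entry keeps working, and the old name simply stops resolving for anyone new.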

Oh, interesting! So the username just disappears from the contact when the user changes it? Or does it just keep showing the previous one that no longer works?

Under the GDPR, anonymous information is described in Recital 26 as: “information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.” This means that once data is truly anonymized, it is no longer considered personal data and falls outside the scope of the GDPR.

On the other hand, pseudonymization is defined in Article 4(5) of the GDPR as: “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.” Unlike anonymization, pseudonymized data can still be linked back to an individual if the necessary additional information is available.

The key difference lies in reversibility: anonymized data is irreversibly altered, while pseudonymized data can be re-identified under certain conditions.

I would strongly urge using these legal definitions and no others.
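
The reversibility difference can be shown with a small Python sketch. It is an illustration under stated assumptions, not legal guidance: the function names, the choice of HMAC-SHA256, and the sample records are mine. Whether an aggregate really counts as anonymous under the GDPR also depends on whether individuals remain identifiable, for example in very small groups.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed HMAC-SHA256 token."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def reidentify(token: str, candidates, secret_key: bytes):
    """With the key (the 'additional information'), find who a token belongs to."""
    for candidate in candidates:
        if hmac.compare_digest(pseudonymize(candidate, secret_key), token):
            return candidate
    return None


def aggregate_by_country(records):
    """Drop identifiers entirely and keep only per-country counts."""
    counts = {}
    for record in records:
        counts[record["country"]] = counts.get(record["country"], 0) + 1
    return counts
```

Anyone holding `secret_key` can map a token back to a phone number, which is why the tokenized data is still personal data. The per-country counts carry no key that would undo them, which is the direction anonymization has to go.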

I often see both terms used misleadingly. It is very hard to do proper anonymization, and most of the time it is a false claim.

It’s interesting, because it’s actually discarded immediately after using it. So you hit search, and as soon as you open the chat it is discarded. It’s only shown as the display name until the other person accepts your message request, after which it’ll just show their set profile name.

Devs have indicated they purge it just about immediately because it’s meant to be rotatable and disposable.

Nice, thanks, TIL. I struck over the relevant stuff in my reply above.

Thanks for your explanation!

AFAIK, Signal is open source on both the client side and the server side. Can users be sure that Signal does not log IP addresses, etc.? If Signal’s servers are run as per the code they published, isn’t the IP address at least protected (not hidden/masked, but not logged and so impossible to disclose to third parties and LE), also by design?

I do agree that messengers like Cwtch and SimpleX provide more metadata privacy than Signal.

We have court evidence Signal doesn’t collect it. https://signal.org/bigbrother/ shows they only collect the UNIX timestamp of when you registered and the date when you were last seen online.

In general, you cannot be sure the server is running the exact same piece of code as the one that’s visible on GitHub. Thus, the most useful thing an open source server provides is continuity in case the service itself goes down. The next most useful thing is that it allows independent parties to check for code smells and bugs that might also be present in what’s running. The point being, any malicious code would not be pushed to GitHub, but that’s not the only class of problems in the code.

Now, Signal does provide something called remote attestation, which is a proprietary technology built on Intel’s Software Guard Extensions (SGX). I’m not sure if that accounts for all of Signal’s server-side code, but at least partially, the client can verify what the server is running, provided the attacker hasn’t lifted the private keys that sign the code from Intel CPUs. This DEF CON video looks into a similar hardware attack, and you’ll learn it’s not a trivial thing to pull off, but also probably not impossible for nation states. So it’s a pretty good defense against anything but FVEY.

Well, not quite. Sure, it’s a technical capability that is designed into the system, but it’s, again, an internal policy decision to keep that technical implementation in play, and since SGX is proprietary, your client can’t independently be sure of what’s happening on the server side. SGX (if it applies in this case) sort of protects you, but again, only if the signing key remains protected. With end-to-end encryption, you wouldn’t want Intel to have a centralized trove of keys that could decrypt every message you send. So privacy by design boils down to client-side protections, where open source lets you check the mechanisms that protect you, and where you don’t have to trust anyone.

SimpleX is a bit of a tough nut, in that the walk doesn’t currently match the talk. I wrote about this in this post in another thread.

The TL;DR is that SimpleX says it has no identifiers when they mean it doesn’t add identifiers; they don’t care about your router putting its IP-address into the IP header of every packet and leaking it to the server. And since the entire public server infrastructure runs under two VPS providers (Akamai and Flux), it’s fifty-fifty whether your peer connects to the network via the same VPS provider, which can then perform end-to-end correlation attacks and put together communication logs. SimpleX is not disclosing this properly in their threat model, and they’re not linking to the threat model on the front page.
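
The fifty-fifty figure follows from a simple counting argument, sketched below in Python. It assumes each peer ends up on one provider, chosen uniformly and independently of the other peer; real clients that spread traffic over several servers would change the numbers, so treat this as a back-of-the-envelope check only.

```python
from fractions import Fraction
from itertools import product


def same_provider_probability(providers):
    """Probability that two peers land on the same provider,
    assuming each picks one uniformly and independently."""
    outcomes = list(product(providers, repeat=2))
    same = sum(1 for mine, theirs in outcomes if mine == theirs)
    return Fraction(same, len(outcomes))
```

With two providers there are four equally likely combinations and two of them match, so the result is 1/2. Under the same assumptions, more independent hosting providers would lower the odds of both ends being visible to one company.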

The only good part is that SimpleX has a decent Tor Onion Service server mechanism. In the post above I said

So SimpleX is easy to configure to function like #2. You just install Tor, enable proxy, and set the client to enforce use of Onion Service servers. So two settings. Cwtch is better as it’s always using #3.

But since SimpleX by default is none of these three, I wouldn’t call it anonymous or metadata-private.

Thanks. I was also interested in the debate about SimpleX vs Cwtch. Though I read it from the beginning, it was too long and included many links to other sources.

So you’re saying that Cwtch is better than SimpleX in terms of anonymity, right? And if one is to use SimpleX, it is recommended to use it with Tor so that the SimpleX server operators (Akamai and Flux) cannot trace you by your IP address?

Yeah, exactly. Definitely use SimpleX with Tor if you need it to provide more metadata-privacy than Signal.

SimpleX does provide some conveniences Cwtch currently can’t, like servers caching ciphertexts, so you have offline messaging. I’m not sure what Cwtch’s current situation with that feature is.

The downside is that SimpleX infra can still amass metadata about which session tokens (used to validate access to the ciphertext buffers identified by “queue IDs”) are conversing. If you start fresh with SimpleX and set the proxy setting and Onion Service server enforcement immediately, before anything else, you will remain anonymous from the servers.

From what I tested, the Linux desktop client doesn’t currently give any indication that it’s anonymous, which is a problem in some cases. E.g. a partially restored reinstallation might lead to the setting not being properly enabled, and the client might connect to the server’s non-torified address.

But I think it’s better to continue the SimpleX discussion under some separate thread.

Got it. Thanks for your kind explanation❤️

Signal’s servers still ultimately need to know where to route your message. While Sealed Sender does provide one-way anonymity, that becomes useless once communication becomes two-way.

Say that you send a letter in the mail, just leaving the return address off. It will get routed from some post office to an address, then responding mail will get routed from some other post office to your address. If you’re the postal service (or someone else who could monitor the postal service in real-time), it wouldn’t take a genius to find out that the two addresses are pen pals. In fact, you wouldn’t even have to be pen pals if you’re both automatically sending receipts saying you’ve received (or read) their last letter.
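
The postal-service observation can be sketched in a few lines of Python. This is a toy model of what an observer with real-time visibility could do, not a description of what Signal's servers actually log; the function name, the 5-second window, and the threshold of three repeats are arbitrary assumptions.

```python
from collections import Counter


def likely_pairs(deliveries, window=5.0, min_hits=3):
    """Guess who is talking to whom from sender-less delivery events.

    deliveries: list of (timestamp_seconds, recipient) tuples.
    Counts how often a delivery to one recipient is followed, within
    `window` seconds, by a delivery to a different recipient (such as a
    receipt flowing back), and keeps pairs seen at least `min_hits` times.
    """
    events = sorted(deliveries)
    hits = Counter()
    for i, (t1, r1) in enumerate(events):
        for t2, r2 in events[i + 1:]:
            if t2 - t1 > window:
                break  # events are sorted, so later ones are even further away
            if r1 != r2:
                hits[frozenset((r1, r2))] += 1
    return {pair: n for pair, n in hits.items() if n >= min_hits}
```

No single event names a sender, yet a pair that keeps showing up back to back stands out from unrelated traffic. On a busy server a real observer would need more events or a tighter window to cut false positives, but the principle is the same.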

Given that people mostly only talk to their friends, family, and coworkers, one could learn interesting things about someone from knowing who they talk to, even if they don’t know what they’re saying. If someone works at some factory, say, and then one day they start talking to an investigative journalist, I similarly wouldn’t need to be a genius to guess what they were talking about.