In the Tor Overview (v3.18), in the reference link in the relevant section, and in other sources, it is emphasized that VPNs are vulnerable to website traffic fingerprinting. It is also emphasized that this type of attack gives false positives for various reasons, and as I understand it, these false positives include people accessing websites through a VPN using any web browser.
In the VPN+Tor scenario, website traffic fingerprinting is associated with the VPN. In the scenario where Tor Browser is used directly or through its own proxies, etc., it is understood that this type of attack will be ineffective, or that TB will provide resistance against it. If TB does provide resistance against this attack, and the trick is in TB, wouldn’t the VPN > Tor Browser/Tor nodes > privacyguides.org scenario also provide resistance against website traffic fingerprinting?
I think you might be getting confused by “fingerprinting”; there are two kinds.
Network-based fingerprinting is presumably done by a network operator, such as a carrier or internet provider, that is trying to determine whether the data moving “is Tor” or “is VPN”, etc.
Browser-based fingerprinting is typically done by the remote site and is about characteristics of the user’s behavior, browser, and device. That information is usually then passed to an advertiser who wants to identify the same user across multiple services.
The page you reference is about the first kind: using a VPN does not necessarily mean a network provider can’t identify that you’re also using Tor.
Web browsing privacy mechanisms, such as SSL, Tor, and encrypting tunnels, hide the content of the data transferred, but they do not obscure the size, direction, and timing of packets transmitted between clients and remote servers
As a result, researchers have proposed several defenses, primarily aimed at hiding packet size information. For example, Tor packs all data into 512-byte cells. Other mechanisms pad packets in a variety of ways (e.g. padding to 2k bytes, or padding all packets to the MTU). Wright, et al., proposed traffic morphing, which pads and fragments packets so that the resulting distribution of packet sizes appears to be from a different web page [26]. Dyer, et al. showed that all these schemes are broken [6].
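To make the padding idea in the quoted passage concrete, here is a minimal Python sketch, not Tor’s actual implementation. It assumes a full 512 bytes of payload per cell (real cells also carry headers), and the payload sizes are made up for illustration. It shows that fixed-size cells hide the exact size of each message, while the number of cells still differs between pages.

```python
import math

CELL_SIZE = 512  # cell size quoted in the passage above


def cells_needed(payload_len: int, cell_size: int = CELL_SIZE) -> int:
    """Number of fixed-size cells required to carry payload_len bytes.

    Simplifying assumption: the whole cell is payload (no header overhead).
    """
    return max(1, math.ceil(payload_len / cell_size))


def padded_trace(payload_lens, cell_size: int = CELL_SIZE):
    """Bytes on the wire for each payload once it is padded to whole cells."""
    return [cells_needed(n, cell_size) * cell_size for n in payload_lens]


# Hypothetical payload sizes for two different pages.
page_a = [300, 1400, 90]
page_b = [250, 9000, 40]

print(padded_trace(page_a))  # [512, 1536, 512]
print(padded_trace(page_b))  # [512, 9216, 512]
```

The small messages become indistinguishable, but the large response still stands out by cell count. That is the kind of residual signal the later discussion of attacks and padding is about.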
Another similar article I read years ago was this one: My Experience With the Great Firewall of China (14 Jan 2016). You can expect that these systems are a lot more advanced than they were back then. China also exports the GFW to other oppressive regimes, so it’s not just in use in China. This is why some VPN providers (Mullvad and IVPN) are now using v2ray to disguise their VPN tunnels.
I had understood that the website traffic fingerprinting attack was more than just detecting whether data movement was done via VPN or Tor, but also predicting which sites users accessed, and I tried to address this. This type of attack is only mentioned in the VPN/SSH connection scenario.
What should we understand from these two texts: that attackers or observers can guess whether the traffic is Tor or not, but can’t guess which websites are accessed?
VPN/SSH Fingerprinting
The Tor Project notes that theoretically using a VPN to hide Tor activities from your ISP may not be foolproof. VPNs have been found to be vulnerable to website traffic fingerprinting, where an adversary can still guess what website is being visited, because all websites have specific traffic patterns.
Therefore, it’s not unreasonable to believe that encrypted Tor traffic hidden by a VPN could also be detected via similar methods. There are no research papers on this subject, and we still consider the benefits of using a VPN to far outweigh these risks, but it is something to keep in mind.
If you still believe that pluggable transports (bridges) provide additional protection against website traffic fingerprinting that a VPN does not, you always have the option to use a bridge and a VPN in conjunction.
TorPlusVPN#vpnssh-fingerprinting
Using a VPN or SSH does not provide strong guarantees of hiding the fact you are using Tor from your ISP. VPN’s and SSH’s are vulnerable to an attack called Website traffic fingerprinting ^1^. Very briefly, it’s a passive eavesdropping attack: although the adversary only watches encrypted traffic from the VPN or SSH, the adversary can still guess what website is being visited, because all websites have specific traffic patterns. The content of the transmission is still hidden, but which website one connects to isn’t secret anymore. There are multiple research papers on that topic. ^2^ Once the premise is accepted that VPN’s and SSH’s can leak which website one is visiting with high accuracy, it’s not difficult to imagine that encrypted Tor traffic hidden by a VPN or SSH could also be classified. There are no research papers on that topic.
What about Proxy Fingerprinting? It has been said above already, that connections to proxies are not encrypted, therefore this attack isn’t even required against proxies, since proxies can not hide the fact, you’re using Tor anyway.
^1^ See Tor Browser Design for a general definition and introduction into Website traffic fingerprinting.
^2^ See slides for Touching from a Distance: Website Fingerprinting Attacks and Defenses. There is also a research paper from those authors. Unfortunately, it’s not free. However, you can find free ones using search engines. Good search terms include “Website Fingerprinting VPN”. You’ll find multiple research papers on that topic.
According to the paper mentioned above, they can figure out the websites too. It may not be 100% accurate, but it is still possible with enough data. See page 4 of 12 of that paper:
RECOGNIZING WEB SITES
As the evaluation results in Section 6 will show, the classifier described above is quite good at determining which of n web pages a user is visiting, assuming the user is visiting one of those n pages.
It further elaborates that some things might affect this, such as how a user browses a page, i.e. whether they navigate through menus or go directly to pages. It also suggests that cached materials might affect the result, such as when visiting the Facebook profile of a specific person.
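The “one of those n pages” assumption in the quoted passage can be sketched with a toy closed-world classifier. This is not the classifier from the paper: it is a plain nearest-neighbour lookup over made-up fingerprints (bytes down, bytes up, packet count), and the page names and numbers are hypothetical. It is only meant to show why such a classifier always answers with one of the pages it knows, which is one source of false positives.

```python
from math import dist

# Hypothetical reference fingerprints: (bytes_down, bytes_up, packet_count).
KNOWN_PAGES = {
    "page_a": (120_000, 4_000, 110),
    "page_b": (900_000, 12_000, 700),
    "page_c": (45_000, 2_500, 60),
}


def classify(trace, known=KNOWN_PAGES):
    """Return the known page whose fingerprint is closest to the observed trace.

    Features are left unscaled for brevity; a real classifier would
    normalise them and use far richer features.
    """
    return min(known, key=lambda name: dist(known[name], trace))


# A trace close to page_a is recognised as page_a.
print(classify((118_500, 4_200, 105)))  # page_a

# A trace from a page outside the known set is still forced onto a known label.
print(classify((5_000_000, 50_000, 4_000)))  # page_b
```

The second call is the interesting one: the observed page is none of the three, yet the classifier reports `page_b`. A different cache state or navigation path shifts the observed trace in the same way and can push it toward the wrong label.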
Website traffic fingerprinting is a highly effective attack type for predicting the websites that users visit, despite the possibility of false positives. The websites visited by VPN users can be predicted with this type of attack. If this is true, maybe some warning text(s) could be added to the VPN Overview and/or recommended VPN service providers.
Website traffic fingerprinting is an attack type that works equally well even on Tor Browser Bundle users. The websites visited by TBB users can be predicted by this type of attack, regardless of whether a proxy or a bridge is used.
Using Tor, TBB in combination with an active VPN connection (perhaps with a proxy or bridge configuration) does not inherently make users less vulnerable to this type of attack. If these two are true, maybe some warning text(s) could be added to the Tor Browser recommendation and/or the Tor Overview.
I am just trying to say that we need to better understand what risks we take when using VPN, TBB, VPN+Tor/TBB, etc. and perhaps recommendations should be made in this direction. Please understand that I am not trying to drag this out.
It’s more a case of VPN+Tor not making things better, as far as network fingerprinting goes. Of course, it does depend on how much effort an observer puts into watching your communications.
As far as browser fingerprinting goes, the idea of using the Tor Browser is to blend in with other Tor Browser users, who also originate from Tor IP addresses.
Maybe- For non-Tor VPN users, I don’t think we can definitively say that this attack is “highly effective”. There isn’t much evidence that this is feasible in the real world, see for example: A Critique of Website Traffic Fingerprinting Attacks | The Tor Project (the information in this post is not exclusive to fingerprinting on Tor specifically)
That being said, most VPN providers do not go out of their way to protect against this attack in the same way Tor does. Some do provide added obfuscation, like @dngray mentioned.
I still don’t agree that a warning on the VPN overview page is warranted, because I don’t know why anybody would be impacted by this threat but not require the other much-more-relevant protections that Tor provides.
A critique of this research article specifically is also covered in the article above.
No- Tor provides its own protections against this attack.
When we are talking about traffic fingerprinting in the context of using a VPN+Tor configuration, we are talking about an adversary using these network fingerprinting techniques to merely determine whether you are using Tor inside that VPN. There’s no evidence or reason to believe that an adversary can use these fingerprinting techniques to also see what websites you’re visiting specifically within a VPN+Tor tunnel.
No- Using a VPN may provide some benefit in this situation. It also may not… It’s like we say:
There are no research papers on this subject, and we still consider the benefits of using a VPN to far outweigh these risks, but it is something to keep in mind.
and also more importantly:
If you think that a bridge can aid in defending against fingerprinting or other advanced network analysis more than a VPN’s encrypted tunnel already can, you always have the option to use a bridge in conjunction with a VPN as well. That way you are still protected by the pluggable transport’s obfuscation techniques even if an adversary gains some level of visibility into your VPN tunnel.
Using a bridge with added obfuscation (like obfs4) certainly does make users less vulnerable to such attacks.
@username0990
Tor already inserts randomized periodic padding cells and padding packets with real traffic.
However, they explicitly state that it is only meant to mitigate local traffic analysis, NOT a global adversary.
CircuitPadding 0|1
If set to 0, Tor will not pad client circuits with additional cover traffic. Only clients may set this
option. This option should be offered via the UI to mobile users for use where bandwidth may be
expensive. If set to 1, padding will be negotiated as per the consensus and relay support (unlike
ConnectionPadding, CircuitPadding cannot be force-enabled). (Default: 1)
ConnectionPadding 0|1|auto
This option governs Tor’s use of padding to defend against some forms of traffic analysis. If it is set
to auto, Tor will send padding only if both the client and the relay support it. If it is set to 0, Tor
will not send any padding cells. If it is set to 1, Tor will still send padding for client connections
regardless of relay support. Only clients may set this option. This option should be offered via the UI
to mobile users for use where bandwidth may be expensive. (Default: auto)