The following post was not written by me. It was written by Julien Voisin and
posted on his blog in October 2018.
I am sharing it here, unedited except as noted below, according to the CC BY-SA license of the post.
Edits made:
- Add table of contents.
- Change local links to point to my copies of the paper and its figures, not Julien Voisin's copies.
The paper it talks about is old news at this point (from 2018), but I see someone stumble upon it every few months ... instances that are just spread out enough
I can never remember where this amazing post is on the web. Now I can't lose it.
Title: Debunking "OSINT Analysis of the TOR Foundation" and a few words about Tor's directory authorities
Date: 2018-10-04 15:00
I have spent years on Tails' IRC channel answering
questions from various users, amassing a pile of personal notes about the internals
of both Tor and Tails in the process.
A friend of mine linked me an "interesting" paper (local mirror) entitled
OSINT Analysis of the TOR Foundation, and was wondering how much
trust to put in it. I read it, and decided that it was so hilariously bad that
it deserved a blogpost. It's also a nice opportunity to explain a few things
about the directory authorities (dirauth).
The post is in two parts: first, a rough explanation of what the dirauth
are and how resilient the Tor network is with regard to them,
then a complete review of the paper.
Tor and the dirauth
The Tor network is mainly composed of relays run by volunteers, with various
attributes:
exit,
fast,
guard,
hsdir,
running,
stable,
valid,
badexit,
v2dir,
… but also authority.
Tor 0.0.2,
released in 2004, introduced Directory Authorities, servers that served
(duh.) cryptographically signed directory documents, containing a list of all
relays along with their associated metadata (capacity, version, uptime, …) and
status.
But the first version of the directory protocol didn't prevent a lying
authority from providing a distorted view to some clients. This is why the
second iteration implemented cryptographic signatures, to allow the client to only
trust the directory documents signed by strictly more than half of all the
dirauth.
The third version (which happens to be the current one) provides support for
offline storing of critical cryptographic material for the dirauth, so that
keys don't have to be stored in plain-text on the machines anymore.
Furthermore, it introduced a nicely constructed consensus for the dirauth,
instead of asking the client to aggregate all the separate data, to fight
partitioning attacks.
It's possible to take a look at what the consensus looks like here.
How are new relays registered?
When a new relay comes online, it uploads its relay descriptor to a dirauth, to
register itself with the tor network. Each dirauth takes its own view of the
network and, every hour, gossips its 'vote' (which is essentially all the relay
descriptors that it is aware of, plus the bwauth measurement results) to the
other directory authorities. All the votes are then merged into one 'consensus
document', which is the global view of the network for
a certain duration. That consensus document is fed to the fall-back
authorities, who are the frontlines to clients coming online and needing to
load the current state of the network.
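To make the merging step a bit more concrete, here is a toy model in Python. It is a sketch under loud assumptions, not the real algorithm: the authority and relay names are hypothetical, and the only rule it applies is "keep a relay if more than half of all authorities list it". The real consensus computation in the directory specification also merges flags, bandwidth weights and much more.

```python
from collections import Counter


def build_consensus(votes, total_authorities):
    """Toy merge: keep a relay only if a strict majority of ALL authorities list it.

    `votes` maps a (hypothetical) authority name to the set of relays it saw.
    """
    counts = Counter()
    for relays in votes.values():
        counts.update(set(relays))
    # n > total // 2 is the same as "strictly more than half" for integer n.
    threshold = total_authorities // 2
    return sorted(relay for relay, n in counts.items() if n > threshold)


# Hypothetical votes from three made-up authorities.
votes = {
    "auth_a": {"relay1", "relay2", "relay3"},
    "auth_b": {"relay1", "relay2"},
    "auth_c": {"relay1", "relay4"},
}
consensus = build_consensus(votes, total_authorities=3)
print(consensus)
```

A relay seen by a single authority (relay3 and relay4 here) doesn't make it into the merged view, which is the point of voting: no single dirauth decides what the network looks like.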
Who controls the dirauth?
There are currently 10 relays with this flag:
- bastet, in the USA, run by Stefani Banerian, hosted by riseup
- dannenberg, in Germany, run by Andreas Lehner, hosted by the CCC
- dizum, in the Netherlands, run by Alex de Joode from sabotage.org, hosted by XS4ALL AS
- faravahar, in the USA, run by Sina Rabbani, hosted on Rethem Hosting infrastructure
- gabelmoo, in Germany, hosted by Sebastian Hahn (you might have heard about him) at a university
- longclaw, in Canada, run by the birds from riseup, hosted on Koumbit
- maatuska, in Sweden, run by Linus Nordberg, hosted by the DFRI
- moria1, in the USA, run by Roger "arma" Dingledine at the MIT's AS: AS3
- tor26, in Austria, run by Peter Palfrader on Tele2 Telecommunication GmbH's AS: AS8437
Additionally, there is a bridge authority, that isn't a v3 directory one,
listed here only for completeness' sake:
All of them are either in North America or in Europe. I'm not in the business
of doxing people, but it's pretty
easy to find the social graph and nationality of all the admins in the
list, their relationship to the Tor project, and even to have a beer with some of them.
What happens to the network if the dirauth go down?
If the authorities were all shut down, clients would still be able to download
the list of relays: your client doesn't actually get the relay documents
directly from the authorities, but from caches from Tor nodes with the
V2dir flag.
Your tor client also keeps a local cache anyway.
As for compromised directory authorities, starting from version 2 of the
directory protocol, on top of downloading the actual relay documents, your
client is also getting hashes of the relay documents signed by other
authorities: relay documents will only be trusted if they are signed by
more than half of the authorities. If one or two or three authorities were to be
compromised, they wouldn't be able to force clients to accept a distorted version of the
consensus.
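The arithmetic behind this is simple enough to sketch in Python. This is a minimal illustration, assuming the nine v3 directory authorities listed earlier and the strict-majority rule mentioned in the first part; the function names are made up for the example and are not part of tor.

```python
def signatures_needed(total_authorities):
    """Smallest number of signatures that is strictly more than half."""
    return total_authorities // 2 + 1


def consensus_is_trusted(valid_signatures, total_authorities):
    """A client only accepts a consensus signed by a strict majority."""
    return valid_signatures >= signatures_needed(total_authorities)


TOTAL = 9  # assumption: the nine v3 directory authorities listed in this post

print("signatures needed:", signatures_needed(TOTAL))
for compromised in (1, 2, 3):
    # Could this many rogue authorities alone get a forged consensus accepted?
    print(compromised, consensus_is_trusted(compromised, TOTAL))
```

With nine authorities, an attacker needs to control five of them before clients would accept a forged consensus; one, two or three are not enough.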
Can't we replace the authorities with something more distributed?
It's a non-trivial
problem.
Attacks on the dirauth
I discussed a bit with nextgens and others about the dirauth,
and it's actually not that trivial to influence them. The go-to way would be to
simply pop them, but I do trust their respective maintainers to have deployed a
bunch of fancy mitigations and monitoring.
Another way would be to influence them, by taking control of the
pipes of the majority of the dirauth to influence their measurements.
Fortunately, the dirauth aren't really doing measurements on their own:
bandwidth authorities
(bwauth) are, and those are transmitting their calculations to the dirauth they
have a pre-established relationship with. Some bandwidth authorities are being
run by dirauths, but most of them are not being run on the dirauth machine
itself, but are 'hidden' elsewhere on the network.
The paper
Author and context
The paper was written by Maxence Delong, Eric Filiol, Clément Coddet, Olivier Fatou and Clément Suhard,
from the ESIEA, in Laval,
more specifically, from the Operational Cryptology and Virology Laboratory (C + V)O.
At this time, everyone but Eric Filiol was a student.
Eric Filiol is
known for pretending to have broken AES in 2002 (he didn't),
and in 2003 (he still didn't) and Tor
in 2011
(he didn't either),
and for being the architect and designer of DAVFI,
a French "new generation anti-malware solution",
known for being a phenomenal (and extraordinarily expensive)
source of fun.
The paper was presented at the
13th International Conference on Cyber Warfare and Security (ICCWS 2018),
and apparently underwent a "double-blind peer review process". The conference
is organised by Academic Conferences and Publishing International Limited,
organizers of a bunch of conferences.
The blog of the Operational Cryptology and Virology Laboratory (C + V)O
published a blogpost
entitled "OSINT on the TOR Foundation (Update)", by Eric Filiol,
containing two exaggerations (amongst, as usual, various typos):
As we shown on our paper “OSINT Analysis of the TORFoundation”, we worked on
the funds and proved that the US government is deeply involve with
arpproximatly 85% of the funds in 2015.
The paper states the following:
As we can see, at least 58.20%
of the total funds are coming from different departments of
the US government. The status of RFA (Radio Free Asia)
Contract is unclear and there are persistent allegations and
testimonies (Prados, 2017; Levine, 2015) or even suggestions
that it could be strongly connected to the CIA more than
expected (Levine, 2015). Would this suspicion be true, the
rate of funds from US government-related entities would
grow up to 85.24%.
There is a difference between a suspicion, and the blogpost's affirmation,
especially when it changes a number from 58.20% to 85.24%.
Secondly, we had some reasons to believe that the US government has strong
links with The TOR Project Inc. via Roger Dingledine who made an internship in
NSA and with some presentations in front of high authorities like the White
House and the FBI.
I don't think that doing a summer internship at the NSA qualifies as a "strong
link". As for the presentations, it's well known that R. Dingledine gives
a lot of them to law enforcement entities, to improve their view of the
network, and more broadly of the Tor ecosystem. A lot of his bios for various
conferences end with this:
In addition to all the hats he wears for Tor,
Roger organizes academic conferences on anonymity, speaks at a wide variety of
industry and hacker conferences, and also does tutorials on anonymity for
national and foreign law enforcement.
I don't think that this could be viewed as a credible connection to the US
government.
Form and sources
It's worth noting that while all the figures used in the paper are unreadable,
it's possible to extract them with pdfimages
(or to check the sources)
to see that they are in pretty high-resolution, and actually readable:
Figure 1, Figure 2, Figure 3 and Figure 4.
Figure 3 doesn't come with any legend indicating the currency
used, but since its point is to show a ratio, it doesn't matter much.
Despite a second revision to improve the English and remove the typos, the
paper is still full of typos, frenchisms, and oddly worded sentences.
Amusingly, this is the diff between the second and the third (and final at this
time) revision of the paper:
-\author{Maxence Delong$^{1}$, Eric Filiol$^{2}$, Clément Coddet$^{3}$, Olivier Fatou$^{4}$, Clément Suhard$^{5}$}% <-this % stops a space
+\author{Maxence Delong, Eric Filiol\thanks{Contact author: \url{filiol@esiea.fr}}, Clément Coddet, Olivier Fatou, Clément Suhard\\
+ ESIEA Laval, Operational Cryptology and Virology Laboratory $(C + V)^O$ \\ 38 rue des Drs Calmette et Gu\'erin 53000 Laval France}% <-this % stops a space
E. Filiol is the only one with an email address, and apparently the main author
of the paper.
As for the sources of the paper, almost a third (3/10) of them
are from "Filiol et al."
Insinuations
The paper is making several baseless/inflated insinuations,
also known as loaded questions,
a classic fallacy technique.
Officially, this foundation has no
link with US government (any other one) and is independent (Dingledine, 2017).
There is a growing feeling that this may not be the case.
Recurrent questions arise that put this apparent independency into question: what if
the US government was behind the TOR network and somehow controls it?
In fact, the TOR project is an implementation of a concept born in the US Naval Research
Laboratory (Goldschlag et al., 1996; Syverson et al., 1997). Paul Syverson is the designer of the routing protocol
and was part of the original development team of the TOR network. Hence the TOR infancy was clearly linked
with the US government and still is.
Furthermore, Roger Dingledine spent a summer in internship in the NSA, so we
can suppose that he has kept a few contacts in there
The owner is Roger Dingledine, one of the three creators of the TOR
Project (and a former NSA employee).
Roger only did a summer internship at the NSA; I wouldn't call him a "former NSA
employee".
Sloppy research
The title of the paper is "OSINT Analysis of the TOR Foundation", and the paper
refers to the "TOR foundation" or "foundation" at least 40 times,
as well as to a company and a firm, but there are no such things:
The Tor Project, Inc.
is a "Massachusetts-based 501(c)(3) research-education nonprofit organization".
Moreover, the proper capitalization is Tor to refer to the project, and tor to
refer to the client or the network.
The authors didn't do a proper job of finding the current Tor specification:
In this part, we will talk about the directory authorities (see
https://svn.torproject.org/svn/tor/tags/ tor-0_2_1_4_alpha/doc/spec/dir-spec.txt for details).
The canonical link for it is https://gitweb.torproject.org/torspec.git/plain/dir-spec.txt
The linked Tor 0.2.1.4-alpha
was released on 2008-08-04, ten years before the publication
of the paper.
The article doesn't understand the concept of pseudonymity:
It is a real problem for the network: do users can trust people
they do not know? Where do these people come from? What
is their background?
The Tails developers are all pseudonymous, and it doesn't prevent the project from
being used and trusted by thousands of people
around the world, and endorsed by many.
Some famous projects have (or used to have) pseudonymous contributors:
Bitcoin,
Truecrypt,
DOTA, …
most of Wikipedia's contributors are too,
and all of those projects are used and trusted.
I'm way more comfortable knowing that the directory authorities aren't all
managed by Tor employees. Moreover, only a single authority (two when the paper
was written) is managed by well known collectives/pseudonymous people.
All of them are well established entities, known and trusted by many. Saying
that they are unknown and with a mysterious background is a pretty bold
statement. Moreover, "where do these people come from" is a pretty irrelevant
question.
Peter Palfrader was the owner of tor26
(the first directory authority which does not belong to Roger
Dingledine). Released in the version tor-0.0.8.1 in October
2004, the directory authority is not working anymore.
This is a plain lie:
tor26 has been working continuously
for at least 5 years.
If a few people need to be on the Core
People page, it will be the founder of the TOR Foundation
and the people running a directory authority. With this
disappearance, the customers have less information about the
people who actually handle the network.
Although Paul Syverson worked with
Roger Dingledine and Nick Mathewson, he never was part of the Tor Project Inc.
He's still doing research on Tor, anonymity and onion-routing though.
On a side note, using the term "customers" instead of "users" is interesting:
Tor has nothing to sell, everyone can use the tor network for free.
There are at
least 25 research papers coming from Paul Syverson for the
TOR network. The last example in date was the 18th of
September 2017 for the version tor-0.3.2.1 which was imple-
mented by following a paper wrote by Paul Syverson and his
team from the US NRL only.
Saying that there are "at least 25" papers without naming a single one of them
is not a correct way to provide sources. Neither is referring to a paper by its date
of publication. The paper in question is likely
Never Been KIST: Tor’s Congestion Management Blossoms with Kernel-Informed
Socket Transport by
Rob Jansen, John Geddes, Chris Wacek, Micah Sherr and Paul Syverson, followed by
Tor's Been KIST: A Case Study of Transitioning Tor Research to Practice
by Rob Jansen and Matthew Traudt.
The first paper wasn't written by "Paul Syverson and his team from the US NRL
only": only Syverson and Jansen are from the U.S. Naval Research Laboratory;
Geddes is from the University of Minnesota while Wacek and Sherr are from
the Georgetown University.
Officially, TOR is not
developed anymore by the US government but a major part
of changes was designed and developed by Paul Syverson
through the US NRL and some people have work closely
for the US government (not only among founders).
This is a bold statement without any kind of proof. The Tor Project's code is
split across several projects, but a simple git shortlog on tor's source code
is enough to show that it is completely wrong:
$ git show | grep '^Date'
Date: Fri Sep 21 09:54:22 2018 -0400
$ git shortlog -s | sort -nr | head -n 25
16963 Nick Mathewson
6245 Roger Dingledine
715 Peter Palfrader
678 David Goulet
546 George Kadianakis
502 Sebastian Hahn
492 teor
417 Andrea Shepard
362 Karsten Loesing
322 Mike Perry
300 Andrew Lewman
268 teor (Tim Wilson-Brown)
234 Robert Ransom
221 rl1987
150 Alexander Færøy
145 Isis Lovecruft
137 cypherpunks
111 Linus Nordberg
87 Steven Murdoch
83 Taylor Yu
77 Yawning Angel
73 Cristian Toader
47 Neel Chauhan
47 Jacob Appelbaum
46 Paul Syverson
In this list, only Paul Syverson has (public) affiliations with the US government.
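To put a number on it, here is a small Python sketch that parses the shortlog output above and computes Paul Syverson's share of commits. Two caveats: it only counts the top 25 authors shown above, so the share is relative to those commits and not to the whole history, and commit counts are a rough proxy for "who designed and developed the changes". The helper name `parse_shortlog` is made up for this example.

```python
# Output of `git shortlog -s | sort -nr | head -n 25`, copied from above.
SHORTLOG = """\
16963 Nick Mathewson
6245 Roger Dingledine
715 Peter Palfrader
678 David Goulet
546 George Kadianakis
502 Sebastian Hahn
492 teor
417 Andrea Shepard
362 Karsten Loesing
322 Mike Perry
300 Andrew Lewman
268 teor (Tim Wilson-Brown)
234 Robert Ransom
221 rl1987
150 Alexander Færøy
145 Isis Lovecruft
137 cypherpunks
111 Linus Nordberg
87 Steven Murdoch
83 Taylor Yu
77 Yawning Angel
73 Cristian Toader
47 Neel Chauhan
47 Jacob Appelbaum
46 Paul Syverson
"""


def parse_shortlog(text):
    """Turn '<count> <author>' lines into a {author: commit_count} dict."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        number, name = line.split(None, 1)  # split on the first whitespace run
        counts[name] = counts.get(name, 0) + int(number)
    return counts


counts = parse_shortlog(SHORTLOG)
total = sum(counts.values())
syverson_share = counts["Paul Syverson"] / total
top_two_share = (counts["Nick Mathewson"] + counts["Roger Dingledine"]) / total

print(f"Paul Syverson: {syverson_share:.2%} of the commits listed")
print(f"Mathewson + Dingledine: {top_two_share:.2%} of the commits listed")
```

Among the commits listed, Syverson's share stays well below one percent, while the top two authors account for the large majority.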
We note that the Core People page is not containing infor-
mation about a few important people in the TOR Foundation.
This page is not sufficient to have an idea of who are the
true leaders of the foundation. We have explained who are the
leaders of the network (directory authorities) but not those
of the foundation.
The board of directors of the Tor Project is
public,
and apparently, the authors of the paper forgot to check the
Past Contributors page,
because it documents the role of every single significant past contributor to
the Tor Project.
Some contractors were hired, Pearl
Crescent for example (a developer), and were “hidden” by
the foundation. The TOR foundation asks indirectly a blind
trust on the source code (due to the huge amount of line) and
they give the development to people we do not even know.
Pearl Crescent isn't a developer at all, it's a
company, referred to as Pearl Crescent LLC. in
the report. Its activity was thoroughly documented on the
tor-reports
mailing list, and its patches were publicly reviewed, like any others.
We discover a few names that are not on the
Core People page. Rob Thomas, Meredith Dunn, Andrew
Lewman, Mike Perry and Andrea Shepard are still unknown.
Rob Thomas
is the founder and CEO of Team Cymru.
Meredith Hoban Dunn is an accountant, advisor, and banker. She's the one
who signed the financial audit reports,
and is designated as the treasurer of The Tor Project, Inc in them.
Andrew Lewman, as indicated on the
past contributors, is the
former Executive Director. He managed the business operations of The Tor
Project, Inc. Played roles of finance, advocacy, project management, strategy,
press, law enforcement liaison, and domestic violence advocacy.
He was (likely, I don't have many details) fired,
and is now running a shady company
that does darknet-related-intelligence-magic-stuff.
A quick glance at the
Which PGP keys sign which packages
page shows that Mike Perry is/used to be the Tor Browser's lead developer.
The financial report indicates that he's a developer, and a quick glance at
the commit history of tor quickly confirms this.
He was my mentor during my
Google Summer of Code, in 2011,
when I wrote the first iteration of MAT. I'm not surprised that he doesn't
want to appear on the "Core People" page: he's a very private person.
Andrea Shepard was a Tor
developer, as shown by a quick git shortlog, and as indicated in the
2015 financial report.
She was brought to the fore during the Jacob Appelbaum events.
The TOR Foundation is regularly claiming that the US
government is not funding anymore the TOR Project (Din-
gledine, 2017)
This is a plain lie: the document cited to back this claim is Roger's DEFCON 25
presentation,
in which Dingledine actually debunked the following
"myths", along
with several others listed in the paper:
- “I heard the Navy wrote Tor originally, so how can I trust it?”
- “I heard the NSA runs half the relays.”
- “I heard Tor gets most of its money from the US government.”
- “I heard 80% of Tor is bad people.”
Table 2 is right (notwithstanding the typos),
but since it's mostly copy/pasted data from the
financial report,
it's not surprising.
We will not develop most of the technical aspects that
could suggest or confirm that somehow the TOR network
has been designed or is managed in such a way that a few
“facilities” are possible and would enable to take control
over it. As a consequence, taking the control of a reduced
number of TOR relays (from 450 to 1400 only) would
enable to reduce the TOR traffic of at least 50 % and would
greatly ease correlation attacks (about 35 % of the traffic) or
eavesdropping (about 10 % of the traffic).
Yet another loaded question, and references to other papers from Filiol; I
might publish my lecture notes about them at some point in the future too.
As far as the relay bridges management is concerned, it has
been possible to extract slightly more than 2,500 such bridges
thus compromising the alleged ability to bypass censorship.
This has been debunked several times.
During our study, in September 2017, we were contacted
by a user of a custom TOR library. This library is the “node-
Tor” written in JavaScript and allows the user to create
and run a node or connect to the TOR network. Further
exchanges with this person have shown a lot of inconsistency
and irregularities.
The person here is actually Aymeric Vitte. I sent him an email, and he felt
that Filiol's paper deserved a public response on
tor talk.
I do recommend reading it.
At first, we talk about the way that his node was added
to the network. For this custom library, the user asked the
TOR foundation to add a node with this library and after
an exchange of a few mails the node was accepted and run.
The library is very different from the original source code.
To compare very simply those two codes, we just compared
the number of code lines. We know that the number of code
lines does not really reflect the effect of the code but between
the original source code (several hundreds of thousands code
lines) and the library (only fifteen hundred code lines), we
can assure that it is very likely that a number of options or
securities are missing.
They are comparing the number of lines in a minimal JavaScript
(a high-level language)
implementation of Tor with the official full-blown implementation, written in C
(a kind of low-level language):
this comparison metric doesn't make any sense.
Moreover, implying that node-Tor is only
1500 lines of code is a ludicrous claim, given how much it does.
Anyone can add a node to the network, there is no such thing as "ask the TOR
foundation (sic.)" to add one.
It is not the designer of this code who is responsible
but rather the TOR foundation for accepting a node on the
network with this kind of library. The first problem is that
no one is warned that this node is special and is not running
the official source code. This node owned by a user is not
controlled by the TOR foundation. So if the user is malicious,
he could modify his node and make every change he wants.
If a government wants to include this kind of node to log the
traffic and gather it, he can do it very simply and without
triggering any alert.
Having several implementations of tor relays running on the network is
actually a great idea: this improves the security of the network (a bug found
in one implementation might not be present in another one), and helps to find
bugs or specification issues, which is a great thing in my opinion.
For example, CVE-2018-17144
was likely found
due to implementation disparities between different bitcoin clients.
The tor network doesn't put much trust into relays themselves: any entity is
free to run whatever nodes it wants, this is how the network is designed to
work. Abuses might happen, though, which is why there are several
documented countermeasures
and monitoring projects: volunteers are running continuous checks to measure
the integrity and trustworthiness of exit-nodes: are they tampering with the
traffic or running active analysis? Malicious nodes are flagged and blacklisted
from the network on a continuous basis.
If the security of the network is ensured by the fact
that all the nodes run the same source code, with the same
security level, the same options and so on. . . this fact proves
that the TOR network is not so secure.
It's absolutely not the case, as explained in the previous paragraph. It seems
that Filiol et al. have no idea about the threat model nor implementation of
the Tor network at all.
We have discovered that with only few exchanges with the TOR
foundation, we can add a custom node (possibly malicious).
As for every node, no systematic control is possible by the
TOR foundation, once accepted within in the network, we
can do what we want with this node, log the traffic, insert
biases in the creation of circuits etc. . . In summary, we think
that the TOR project should not accept custom codes in
order to respect the uniformity of the network that ensures
“security”.
As previously explained, the only way to add a node to the network is to
register it with the authorities, there is no such thing as "few exchanges with
the TOR foundation", since the network isn't managed by it, nor by anyone,
except the authorities.
The very fact that anyone can run a relay ensures the security and
anonymity of the network: imagine if a single entity would approve or reject
who could join tor…
As far as confidence is concerned, nobody
(except state organization) has the courage/time to read the
source code and no one is paying attention to the designer of
the changes on the TOR source code
A quick glance at the git log
gives a rough estimate (there might be
duplicates) of the number of committers:
$ git log --format='%aN' | sort -u | wc -l
203
$
This is a conservative estimation of the people that not only bothered to read
the code, but even contributed to it.
As a comparison this is the same command run on
GnuPG's git repository,
the library that everyone uses to encrypt emails and sign software
in the Linux world:
$ git log --format='%aN' | sort -u | wc -l
57
$
Another indicator of the attention that Tor is getting might be
activity on Tor's bugtracker timeline,
where it's not uncommon to have more than 100 different actions per day,
by a lot of different people.
Paul Syverson (from the US NRL) is the original designer (not developer) of
most of implementation. The last version of TOR is the perfect example: all
major changes are coming from the US NRL.
We already debunked this by looking at the git commit history.
No official statement revels that the US government is
helping the TOR network but all the information gathered
during our study seems to confirm that the US government
is still deeply involved in the TOR project
The sponsors page is
public, and lists every major sponsor. The fact that the US government is
giving grants to researchers to study anonymity and resilience is pretty
healthy for the Tor Project, and doesn't mean, at all, that the US government
is "deeply involved" in the project. At least not significantly
more than the other major donors like the EFF, Human Rights Watch,
Google, the Freedom of the Press
Foundation, Reddit, …
This study is not claiming breaking the TOR network
or affirms that the US government is the real organization
behind the TOR project.
This blogpost is not claiming that E. Filiol is a clown,
nor affirms that he hasn't done any worthy contribution to computer science in
years.
However favoring such a network
would be a clear violation of the Wassenaar Agreement
(www.wassenaar.org) unless some sort of control is
in place in a way or another (Filiol, 2013).
The paper cited here (Filiol, 2013) is "The Control of Technology by Nation States –
Past, Present and Future – The Case of cryptology and Information
Security", Journal in Information Warfare, vol. 12, issue 3, pp. 1-10,
October 2013, published behind a paywall. Fortunately, it's possible to access it via
Google books.
In this paper, Filiol is speaking mostly about France,
while The Tor Project, Inc. is an American entity, but this doesn't matter much
in our case.
Since I'm not a lawyer, I asked a good friend of mine, who happens to be a
legal advisor specialised in international and French business law,
to help me with this part.
The List of Dual-Use Goods and Technologies and Munitions List states that "Controls do not apply to "technology" "in the public domain", to "basic scientific research" or to the minimum necessary information for patent applications."
A quick look at the definitions part of the document shows the following:
"In the public domain": This means "technology" or "software" which has been
made available without restrictions upon its further dissemination. Note:
Copyright restrictions do not remove "technology" or "software" from
being "in the public domain".
This is the case for Tor, and for other Free (as in freedom) software, which are thus
not subject to the Wassenaar Agreement, at all.
A quick glance at the
comprehensive FAQ from rapid7
about the Wassenaar Arrangement,
or the small blog post from GNU
confirms our interpretation.
This study aims at informing TOR users and to make them aware
of network like the TOR network and the possible reality
behind. Customers need to be informed before using any
network who claims to protect your privacy and anonymity.
This blogpost aims at informing the public and to make it aware of charlatans
like E. Filiol and the possible reality behind. People need to be informed
before citing any work from this person, inviting him at conferences, or asking
his opinion.
Conclusion
This is a botched paper in broken English, filled with approximations
and sheer inventions about Tor.
...