Version 1.0, 2022/07/21, Bernhard R. Fischer <firstname.lastname@example.org>. Original Source: https://github.com/rahra/onioncat/blob/master/doc/INTRO_TO_ONIONCAT4.txt
A lot has changed since the development of OnionCat began in 2008. The most important change is the replacement of Tor’s hidden services version 2 by the new hidden services version 3, which had a severe impact on the functionality of OnionCat.
This article gives an overview of why the newer hidden services almost broke OnionCat and explains the adaptations that have been made in OnionCat4 to make it work as smoothly as possible again, even with the newer version 3 of hidden services.
OnionCat4 comes with a distributed hosts database based on the DNS protocol which allows OnionCat nodes to dynamically exchange .onion hostnames between each other.
OnionCat4 refers to versions >= 0.4.x of OnionCat, which can be found on GitHub (https://github.com/rahra/onioncat). This article is written with respect to version 0.4.6.
ABOUT ONIONCAT IN GENERAL
OnionCat is a VPN adapter which lets you connect an arbitrary number of hosts together as if they were on the same network segment. This is basically what every VPN adapter (e.g. OpenVPN) does. The differences between OnionCat and any other VPN are that, firstly, OnionCat does not connect through the Internet but through Tor’s hidden services (or I2P’s), and, secondly, it uses a peer-to-peer approach. There is no central server or any other instance which manages the network.
Because OnionCat connects through Tor, any traffic passing through OnionCat is protected from surveillance.
Once a connection between two OnionCats is established, any network packets (e.g. TCP, UDP, and ICMP) can pass through this tunnel as long as they are IPv6-based. Since OnionCat assigns IPv6 addresses with the prefix fd87:d87e:eb43::/48 to the tunnel interface, any point-to-point connection between two nodes will work out of the box.
HIDDEN SERVICES AND THE GLUE BETWEEN ONIONCAT AND TOR
Tor offers hidden services, a feature for connecting to TCP-based services (e.g. HTTP servers) within the Tor network, i.e. connections never exit to the Internet. Hidden services are addressed by special hostnames ending with the top-level domain .onion.
If Tor gets a request for such a hostname, it does not exit to the Internet but looks up the service within the Tor network itself and connects the TCP session there.
OnionCat makes use of these hidden services. It has two interfaces. One is a local tunnel interface which enables it to receive IP packets from your host. The other interface is a TCP connection to the Tor proxy. If OnionCat receives an IP packet on the tunnel interface, it reads the destination IP address of the packet, translates it to a .onion hostname, requests a connection from Tor, and sends the packet(s) there as soon as the connection is established. OnionCat internally maintains a peer list which is the association of TCP sessions and OnionCat IPv6 addresses.
The “magic” is the translation between IPv6 addresses and .onion hostnames. The development of OnionCat started in 2008. At that time, .onion hostnames of hidden services version 2 were encoded 80-bit identifiers, e.g. 777myonionurl777.onion. Since IPv6 addresses have 128 bits, a translation back and forth into a specially chosen prefix was possible. For example, 777myonionurl777.onion would translate to fd87:d87e:eb43:fffe:cc39:a873:6915:ffff and back again.
Since then several things have changed, and it became necessary to strengthen the cryptography of Tor. This led to the development of hidden services version 3, which come with 280-bit identifiers. These obviously do not fit into an IPv6 address any more.
These newer hidden services almost broke OnionCat and its smart address translation technique, and it became necessary to implement some kind of lookup mechanism which lets OnionCat retrieve the now much longer .onion hostnames associated with the OnionCat IPv6 addresses.
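The version 2 translation can be sketched in a few lines of Python. This is only an illustration and not OnionCat's actual C code: the function names are made up here, and it assumes that the 16 characters of the hostname are plain base32 for the 80 bits that follow the 48-bit prefix.

```python
import base64
import ipaddress

# OnionCat's /48 prefix fd87:d87e:eb43 as 6 raw bytes.
PREFIX = bytes.fromhex("fd87d87eeb43")


def onion_v2_to_ipv6(hostname: str) -> str:
    """Map a 16-character version 2 .onion name to an OnionCat IPv6 address."""
    label = hostname.split(".")[0]
    if len(label) != 16:
        raise ValueError("expected a 16-character version 2 identifier")
    ident = base64.b32decode(label.upper())  # 16 * 5 bits = 80 bits = 10 bytes
    return str(ipaddress.IPv6Address(PREFIX + ident))


def ipv6_to_onion_v2(address: str) -> str:
    """Map an OnionCat IPv6 address back to its version 2 .onion name."""
    packed = ipaddress.IPv6Address(address).packed
    if packed[:6] != PREFIX:
        raise ValueError("not an OnionCat address")
    return base64.b32encode(packed[6:]).decode("ascii").lower() + ".onion"


print(onion_v2_to_ipv6("777myonionurl777.onion"))
# fd87:d87e:eb43:fffe:cc39:a873:6915:ffff
print(ipv6_to_onion_v2("fd87:d87e:eb43:fffe:cc39:a873:6915:ffff"))
# 777myonionurl777.onion
```

The mapping is lossless in both directions only because 48 prefix bits plus 80 identifier bits add up to exactly 128 bits.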
AN .ONION HOSTNAME LOOKUP MECHANISM
Actually, OnionCat has had a built-in lookup mechanism almost since the beginning because I2P always used much longer identifiers. This original lookup mechanism was just a flat hosts file with IPv6 address/hostname pairs in it. Although this basically works fine, it requires maintaining these hosts files manually and urging people to exchange their hostnames in advance. The latter is in some cases simply not feasible, e.g. if people use BitTorrent on top of OnionCat.
OnionCat4 comes with a completely new lookup mechanism which should bring back its old elegant way, at least partially. And with some community effort it may become pretty powerful again.
Generally, doing a lookup means that there has to be some kind of database which keeps those IP address/hostname records. In my opinion, it is totally unrealistic to run some kind of centralized databases in the anonymous hidden service world. Simply because who should/would reliably do this? Probably some intelligence agency might jump in here 😉
OnionCat strictly follows a peer-to-peer approach. So it was necessary to develop a distributed peer-to-peer lookup mechanism as well.
THE LOOKUP MECHANISM OF ONIONCAT4
The lookup mechanism of OnionCat4 is based on DNS. Three main building blocks were added to OnionCat:
- An internal hosts database.
- A lightweight DNS resolver.
- A lightweight name server.
OnionCat4 internally maintains a hosts database. Each time a new connection has to be opened, it looks up the hostname within this database. If the desired hostname is not found it selects some of the hosts within this database and sends DNS requests to them using its resolver. If at least one of these name servers responds, the new hostname is added to the internal hosts database and the next lookup will be successful.
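The lookup procedure described above can be sketched roughly as follows. This is a simplified model, not the real implementation: the hosts database is a plain dictionary, the `query` callback is a hypothetical stand-in for the DNS resolver, and the addresses and hostnames in the demo are made up.

```python
from typing import Callable, Dict, Optional

# Hypothetical in-memory hosts database: IPv6 address -> .onion hostname.
HostsDb = Dict[str, str]
# Hypothetical resolver hook: (name server address, wanted address) -> hostname or None.
QueryFn = Callable[[str, str], Optional[str]]


def lookup(db: HostsDb, wanted: str, query: QueryFn, max_servers: int = 5) -> Optional[str]:
    """Return the hostname for `wanted`, asking already known hosts on a miss."""
    if wanted in db:
        return db[wanted]
    # Miss: use some of the hosts we already know as name servers.
    for server in list(db)[:max_servers]:
        hostname = query(server, wanted)
        if hostname is not None:
            db[wanted] = hostname  # remember it, so the next lookup succeeds locally
            return hostname
    return None


def fake_query(server: str, wanted: str) -> Optional[str]:
    # Stand-in for a DNS query; in this demo only one peer knows the answer.
    known = {"fd87:d87e:eb43::1": {"fd87:d87e:eb43::2": "n1example.onion"}}
    return known.get(server, {}).get(wanted)


db = {"fd87:d87e:eb43::1": "s0example.onion"}
print(lookup(db, "fd87:d87e:eb43::2", fake_query))  # n1example.onion
print("fd87:d87e:eb43::2" in db)                    # True
```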
THE HOSTS DATABASE
The hosts database basically is a list of entries. Each entry consists of an IPv6 address (which is the key), the .onion hostname, a source identifier (see below), an age (which is the timestamp when the entry was added), a TTL, and some metric parameters. The latter are used and modified by the resolver.
The hosts database may be populated by six different sources, as follows. The first three are local sources, the last three are remote sources.
- (0) Its own hostname. There is always just a single entry.
- (1) Hostnames passed as command line arguments with option -A.
- (2) The hosts file. By default this is /usr/local/etc/tor/onioncat.hosts. It is pulled in at program startup and automatically re-read whenever the file is modified.
- (3) Keepalive packets. Every OnionCat sends at least one initial keepalive packet to the remote end. It contains its own hostname.
- (4) An authoritative answer of an OnionCat name server.
- (5) A non-authoritative answer of an OnionCat name server.
The order is important since it defines the priority of the source. The first in the list has the highest priority. This means that an entry cannot be overwritten by a source with a lower priority.
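The priority rule can be illustrated with a small sketch. The enum member names and the `update()` helper are invented for this example. Only the numbering (0 to 5) and the rule that a lower-priority source never overwrites an entry are taken from the text above; letting a source of equal priority replace an entry is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict


class Source(IntEnum):
    SELF = 0                  # own hostname
    CLI = 1                   # option -A
    HOSTS_FILE = 2            # onioncat.hosts
    KEEPALIVE = 3             # learned from a keepalive packet
    NS_AUTHORITATIVE = 4      # authoritative name server answer
    NS_NON_AUTHORITATIVE = 5  # non-authoritative name server answer


@dataclass
class HostEntry:
    hostname: str
    source: Source


def update(db: Dict[str, HostEntry], address: str, hostname: str, source: Source) -> bool:
    """Insert or replace an entry unless a higher-priority source already owns it."""
    current = db.get(address)
    if current is not None and source > current.source:
        return False  # lower priority (larger number) must not overwrite
    db[address] = HostEntry(hostname, source)
    return True


db: Dict[str, HostEntry] = {}
print(update(db, "fd87:d87e:eb43::2", "fromfile.onion", Source.HOSTS_FILE))  # True
print(update(db, "fd87:d87e:eb43::2", "fromwire.onion", Source.KEEPALIVE))   # False
print(db["fd87:d87e:eb43::2"].hostname)                                      # fromfile.onion
```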
The complete internal hosts database is saved to disk at regular intervals and at program exit (typically to /usr/local/var/onioncat/hosts.cached). The file will be pulled in again at the next program startup, so that remote entries are not lost across reboots. Entries of this file will never overwrite local entries of the hosts file (onioncat.hosts) or command line entries (option -A).
The hosts database can be viewed at runtime on the controller interface or by looking at the hosts.cached file. To get to the controller interface, telnet to port 8066 on localhost and issue the command `hosts`.
THE RESOLVER
The resolver is a simple lightweight resolver. Upon request it selects up to 5 hosts out of the hosts database and sends DNS requests to them. The name server selection is based on the metric value of each entry in the hosts database. Higher values are considered better. The metric algorithm may change in the future, but at the time of writing it is calculated as follows (see ocathosts.c:hosts_metric()):
    metric = 1000 / source + acnt * 100 / qcnt

    source: CLI = 1, hosts = 2, keepalive = 3, ...
    acnt:   answer count
    qcnt:   query count
Basically that means that hosts with a higher priority and a higher answer/query ratio are considered better.
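A small sketch of the metric and the name server selection follows. It is not a copy of hosts_metric(). Integer division, treating an entry without any queries as ratio 0, and skipping the node's own entry (source 0) are assumptions made here to keep the example well defined. The `Candidate` type and the peer names are invented.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    address: str
    source: int  # 1 = CLI, 2 = hosts file, 3 = keepalive, ...
    acnt: int    # answers received
    qcnt: int    # queries sent


def metric(c: Candidate) -> int:
    """metric = 1000 / source + acnt * 100 / qcnt (integer arithmetic assumed)."""
    ratio = c.acnt * 100 // c.qcnt if c.qcnt else 0  # assumption: no queries yet counts as 0
    return 1000 // c.source + ratio


def select_name_servers(candidates: List[Candidate], limit: int = 5) -> List[Candidate]:
    """Pick the `limit` best candidates; higher metric is better."""
    usable = [c for c in candidates if c.source > 0]  # assumption: never ask ourselves
    return sorted(usable, key=metric, reverse=True)[:limit]


peers = [
    Candidate("keepalive-peer", 3, 9, 10),  # 333 + 90 = 423
    Candidate("cli-peer", 1, 0, 0),         # 1000 + 0 = 1000
    Candidate("hosts-peer", 2, 1, 2),       # 500 + 50 = 550
    Candidate("self", 0, 0, 0),             # skipped
]
print([c.address for c in select_name_servers(peers)])
# ['cli-peer', 'hosts-peer', 'keepalive-peer']
```

The example shows that the source term dominates: a keepalive-learned peer with a 90% answer ratio still ranks below an unqueried entry from the command line.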
Technically the resolver does reverse lookups (PTR queries) because it tries to find a hostname for a given IPv6 address. Queries are sent using UDP on port 53.
Every time a query is sent, the query counter (qcnt) is increased. Successful responses will increase the answer counter (acnt) of the hosts entry of the name server. Responses will also update the TTL value of the hosts entry of the corresponding name in the query.
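To give an idea of what such a PTR query looks like on the wire, here is a sketch that builds one in the standard DNS message format (RFC 1035). It assumes the usual ip6.arpa reverse naming for IPv6 addresses and a query without the recursion-desired flag. Whether OnionCat's own resolver sets the header fields exactly like this is not verified here, and the function name and query ID are arbitrary.

```python
import ipaddress
import struct


def build_ptr_query(address: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS PTR query for an IPv6 address."""
    # Reverse name: the 32 nibbles of the address in reverse order under ip6.arpa.
    qname = ipaddress.IPv6Address(address).reverse_pointer
    # Header: ID, flags (0 = standard query, recursion not desired),
    # QDCOUNT = 1, ANCOUNT = NSCOUNT = ARCOUNT = 0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    encoded = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in qname.split(".")
    ) + b"\x00"
    # QTYPE 12 = PTR, QCLASS 1 = IN.
    return header + encoded + struct.pack("!HH", 12, 1)


packet = build_ptr_query("fd87:d87e:eb43:fffe:cc39:a873:6915:ffff")
print(len(packet))  # 90
```

Such a packet could then be sent with a UDP socket to port 53 of the IPv6 address of the selected name server.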
The resolver may be invoked manually on the controller interface of OnionCat. Telnet to port 8066 and use the built-in command `dig`.
THE NAME SERVER
The name server listens on UDP port 53 and replies to PTR queries. If a valid query is received, it looks up the desired address within the internal hosts database. If an entry exists, it sends back a proper response. If the source of the entry is local, the authoritative answer flag (AA) is set in the response.
If no entry is found the name server replies with NXDOMAIN. The server does not recursively resolve the request. To all other valid queries (e.g. A, MX, …) it will always respond with NXDOMAIN. Invalid queries may be answered with FORMERR if possible (see RFC1034 and RFC1035 for more details on the DNS protocol).
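The decision logic of the name server can be summarized in a short sketch. The types and the `answer()` function are hypothetical and parsing of the DNS message is left out: `parsed` is either `None` for a query that could not be decoded, or a pair of query type and lookup key. The response codes are the standard ones from RFC 1035.

```python
from typing import Dict, NamedTuple, Optional, Tuple

RCODE_NOERROR, RCODE_FORMERR, RCODE_NXDOMAIN = 0, 1, 3  # RFC 1035 response codes
QTYPE_PTR = 12
LOCAL_SOURCES = {0, 1, 2}  # own hostname, option -A, hosts file


class Entry(NamedTuple):
    hostname: str
    source: int


class Reply(NamedTuple):
    rcode: int
    authoritative: bool
    hostname: Optional[str]


def answer(db: Dict[str, Entry], parsed: Optional[Tuple[int, str]]) -> Reply:
    """Decide how to reply to a query; `parsed` is None for an invalid query."""
    if parsed is None:
        return Reply(RCODE_FORMERR, False, None)
    qtype, address = parsed
    if qtype != QTYPE_PTR:
        return Reply(RCODE_NXDOMAIN, False, None)  # A, MX, ... are never answered
    entry = db.get(address)
    if entry is None:
        return Reply(RCODE_NXDOMAIN, False, None)  # no recursive resolution
    # The AA flag is set only if the entry came from a local source.
    return Reply(RCODE_NOERROR, entry.source in LOCAL_SOURCES, entry.hostname)


db = {
    "fd87:d87e:eb43::1": Entry("fromhostsfile.onion", 2),
    "fd87:d87e:eb43::2": Entry("fromkeepalive.onion", 3),
}
print(answer(db, (QTYPE_PTR, "fd87:d87e:eb43::1")).authoritative)  # True
print(answer(db, (QTYPE_PTR, "fd87:d87e:eb43::2")).authoritative)  # False
```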
Since the name server implements the standard DNS protocol it may be queried with any standard DNS tool, e.g. `dig` or `nslookup`.
A background thread maintains the internal hosts database. At regular intervals the age and TTL values of the entries are checked. The default TTL is 2 hours. Once it has expired, the thread automatically initiates outgoing connections directly to those hosts to re-validate the entries. If a host is unreachable for more than 7 days, the entry is pruned.
With these new features built in, OnionCat now has the potential to be used as it was before and maybe to gain an even bigger user base.
Every single OnionCat instance now maintains its own hosts database and can connect to the hosts found in its database and learn new entries from them. Thereby the list of valid hosts will grow dynamically. Because the database is regularly saved to disk, it will survive reboots, even unplanned ones.
There are several possible scenarios. Let’s discuss a few of them. Of course, there is more than one valid way to implement the following examples.
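One pass of such a maintenance loop might look like the following sketch. It simplifies the bookkeeping to a single timestamp per entry (the time of the last successful contact), whereas the real database keeps separate age and TTL values. The names are invented, and the special handling that local entries presumably receive is ignored.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional

TTL_SECONDS = 2 * 60 * 60         # default TTL: 2 hours
PRUNE_SECONDS = 7 * 24 * 60 * 60  # unreachable for more than 7 days


@dataclass
class Entry:
    hostname: str
    last_validated: float  # timestamp of the last successful contact


def maintain(db: Dict[str, Entry], now: Optional[float] = None) -> List[str]:
    """Prune dead entries and return the addresses that should be re-validated."""
    now = time.time() if now is None else now
    revalidate = []
    for address in list(db):  # copy the keys because entries are deleted in the loop
        idle = now - db[address].last_validated
        if idle > PRUNE_SECONDS:
            del db[address]
        elif idle > TTL_SECONDS:
            revalidate.append(address)
    return revalidate


now = 1_000_000.0
db = {
    "fresh": Entry("a.onion", now - 3600),           # 1 hour: nothing to do
    "expired": Entry("b.onion", now - 3 * 3600),     # 3 hours: re-validate
    "dead": Entry("c.onion", now - 8 * 24 * 3600),   # 8 days: prune
}
print(maintain(db, now))  # ['expired']
print(sorted(db))         # ['expired', 'fresh']
```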
RUNNING ONIONCAT4 IN A GROUP OF KNOWN INSTANCES
A group of people uses a set of OnionCat nodes to connect to each other. Assume there are some notebooks (nodes N0 to N5) which are not permanently online and some servers (nodes S0 to S5) which are online all the time. All instances have Tor and OnionCat4 set up properly. Since hidden services version 3 require a hostname lookup, the OnionCats need additional configuration and will not just magically work out of the box. At least one entry in the hosts database (other than the node’s own address) is necessary.
Let’s choose one of these servers (e.g. S0) which is said to be the most reliable one. We let it collect all entries and define it as our master. Please note that this is just a personal definition. It’s a point of view. All OnionCat instances are technically equal and have the same capabilities.
Step 1: Start OnionCat4 on all instances with the additional command line argument “-A <ipv6_address_of_S0>/<hostname_of_S0.onion>”.
Step 2: On the command line of each instance (except S0) ping the IPv6 address of S0.
Step 3: Each instance can resolve S0 because the hostname was added to its hosts database by -A, so all of them will be able to open a connection to S0. Since OnionCat sends an initial keepalive packet, S0 will learn about all of those instances, and the internal hosts database of S0 will immediately be populated with all the entries (N0-N5 and S0-S5 in this example).
Now let’s assume N0 wants to connect to N1 (e.g. let’s ping the IPv6 address of N1 on the command line of N0). N0 has no entry for N1 in its hosts database. Thus it will try to resolve it and choose a name server out of its hosts database. This is S0 because at this moment it is the only entry in the database. N0 will send a DNS query to S0 requesting the name for N1. S0 will receive the request and reply with the hostname of N1, which it already learned in step 3.
N0 will receive the reply for the hostname of N1. It will add the name to its own hosts database and can immediately open a connection to N1. After the connection is established, N0 sends a keepalive to N1. As a result N1 now also knows about N0 and can connect back (e.g. to send an echo reply).
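The whole sequence can be replayed with a toy model. It does not involve Tor, DNS, or real addresses: every node is an object with a dictionary as its hosts database, a DNS query is a plain method call, and the address and hostname strings are placeholders. The class and method names are made up for this illustration.

```python
from typing import Dict, Optional


class Node:
    """Toy OnionCat node: `hosts` maps a peer's address to its .onion hostname."""

    def __init__(self, name: str):
        self.address = f"<ipv6-of-{name}>"         # placeholder, not a real address
        self.hostname = f"<{name.lower()}.onion>"  # placeholder hostname
        self.hosts: Dict[str, str] = {}

    def add_cli_entry(self, other: "Node") -> None:
        """Models starting OnionCat with option -A for `other`."""
        self.hosts[other.address] = other.hostname

    def answer(self, wanted: str) -> Optional[str]:
        """Name server side: reply from the local database only."""
        return self.hosts.get(wanted)

    def connect(self, target: "Node", network: Dict[str, "Node"]) -> bool:
        if target.address not in self.hosts:
            # Resolve by asking the known peers (the real resolver picks up to 5 by metric).
            for peer_address in list(self.hosts):
                name = network[peer_address].answer(target.address)
                if name is not None:
                    self.hosts[target.address] = name
                    break
            else:
                return False  # nobody we know could resolve the target
        # Connection established: our keepalive tells the target who we are.
        target.hosts.setdefault(self.address, self.hostname)
        return True


s0, n0, n1 = Node("S0"), Node("N0"), Node("N1")
network = {n.address: n for n in (s0, n0, n1)}
for n in (n0, n1):
    n.add_cli_entry(s0)         # step 1
    n.connect(s0, network)      # steps 2 and 3: S0 learns the node from the keepalive
print(n0.connect(n1, network))  # True: N0 resolved N1 by asking S0
print(n0.address in n1.hosts)   # True: N1 learned N0 from the keepalive
```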
RUNNING BITTORRENT ON TOP OF ONIONCAT4
In this scenario we have a large number of BitTorrent clients (seeders and leechers) and at least one tracker. The tracker is the software which keeps track of the IP addresses of the seeders and leechers. If a new leecher wants to download a file, it first connects to the tracker, which will in turn reply with a list of IP addresses from which portions of the desired file may be downloaded. The client will then connect directly to these hosts to download.
To make this work, the tracker has to run on a system which runs OnionCat4 and shall (only) be bound to the OnionCat IPv6 address. This can easily be done with e.g. opentracker (https://erdgeist.org/arts/software/opentracker/), which is a robust and lightweight open source BitTorrent tracker (compile it with -DWANT_V6). The OnionCat on the tracker is run without any special parameters (no -A needed).
Clients need 2 pieces of information. Both could be made public somewhere on the Internet.
- The IPv6 address and hostname pair of the tracker.
- A torrent file. It contains the information about the file to download and the URL of the tracker. In this case the URL is the OnionCat IPv6 address of the tracker, e.g. http://[fd87:d87e:eb43:<ip_of_tracker>]:6969/announce.
On the client run OnionCat4 with the option “-A <ipv6_of_tracker>/<hostname_of_tracker.onion>”. Then fire up your favorite BitTorrent client and open the torrent file.
Since somebody was the first one to seed the file, his BitTorrent client (S0) had to connect to the tracker. This happened through OnionCat4 and because of that the tracker’s OnionCat learned about the hostname of this initial seeder (because of the keepalive).
If a leecher (L0) now opens the torrent file, its BitTorrent client will connect to the tracker because of the URL within the torrent file. This happens through OnionCat. As a result, both the tracker’s OnionCat and the tracker itself will learn about this leecher: the OnionCat learns its hostname (through the keepalive) and the tracker learns its IP address. The tracker will then reply with a list of IP addresses, which at this moment contains just the single address of the first seeder (S0).
The BitTorrent client (L0) will now try to connect to S0. Since L0’s OnionCat does not yet have an entry for S0 in its database it will resolve it. It sends a DNS query to the tracker’s OnionCat because it is the only instance in the hosts database. This OnionCat (the tracker) already knows about S0 and will reply. As a result L0 can now connect directly to S0 and start leeching.
You can continue this game with additional clients resulting in a growing list of hosts in the hosts database of the tracker’s OnionCat. It will become a major OnionCat name server. But also the clients themselves build up a larger hostname list because they connect directly to the seeders. Thus, they can and will also be used as name servers. Hence, the distributed knowledge of IPv6 address/hostname pairs grows in general.
This article explained how OnionCat4 works and discussed in detail how the distributed hosts database grows through the interaction between OnionCat nodes. Of course, this solution is not as seamless as the address translation was with hidden services version 2. But things change, and hidden services version 3 offer stronger cryptographic protection, so OnionCat had to be adapted.
It cannot be predicted, but a critical mass of OnionCat nodes may eventually be reached, such that every new OnionCat can bootstrap with just a single name server entry in its hosts file. Maybe some nodes out there will advance to “public” OnionCat name servers in the hidden OnionCat world.
If this happens it is most likely that the metric algorithm has to be adapted.
Keep in mind that OnionCat is open source. So everybody is invited to use it, make suggestions, improve the code, add features, or take part in constructive discussions about its further development.