Planet Tor

@blog May 23, 2022 - 00:00 • 3 days ago
New Release: Tor Browser 11.0.13

Tor Browser 11.0.13 is now available from the Tor Browser download page and also from our distribution directory.

This version includes important security updates to Firefox.

We also updated Tor to 0.4.7.7 (the first stable Tor release with support for congestion control).

Note: the Android version 11.0.13 will be available later during the week.

The full changelog since Tor Browser 11.0.12 is:

...
@blog May 20, 2022 - 00:00 • 6 days ago
New Alpha Release: Tor Browser 11.5a11 (Windows/macOS/Linux)

Tor Browser 11.5a11 is now available from the Tor Browser download page and also from our distribution directory.

This version includes important security updates to Firefox.

Tor Browser 11.5a11 updates Firefox on Windows, macOS, and Linux to 91.9.0esr.

We also take the opportunity to update various other components of Tor Browser:

  • NoScript 11.4.5
  • Tor 0.4.7.7
  • OpenSSL 1.1.1o
  • Tor Launcher 0.2.35

The full changelog since Tor Browser 11.5a9 is:

...
@kushal May 16, 2022 - 15:46 • 9 days ago
OAuth Security Workshop 2022

Last week I attended the OAuth Security Workshop in Trondheim, Norway. It was a 3-day, single-track conference, where the first half of each day consisted of pre-selected talks and the second half of unconference talks and side meetings. This was also my first proper conference since COVID emerged in the world.

osw starting

Back to the starting line

After many years, I felt the excitement of being a total newbie at something and of suddenly being able to meet all the people behind the ideas. I reached the conference hotel on the afternoon of day 0 and met the organizers in the lobby area. That chat went on for a long time, and as more and more people kept checking into the hotel, I realized that it was a kind of reunion for many of the participants. Though a few of them had met at a conference in California just a week earlier, they were all excited to meet again.

To understand how welcoming any community is, just notice how it behaves towards new folks. I think the Python community stands high in this regard, and I am very happy to say the whole OAuth/OIDC/identity-related community is excellent in this regard too. Even though I kept introducing myself as the new person in this identity land, not once did I feel unwelcome. I attended OpenID-related working group meetings during the conference, had multiple hallway chats, and talked to people while walking around the beautiful city. Everyone was happy to explain things to me in detail, even though most of the people there have already spent 5-15+ years in the identity world.

The talks & meetings

What happens in Trondheim, stays in Trondheim.

I generally do not attend many talks at conferences, as they get recorded anyway. But here the conference was single-track, and there were no recordings.

The first talk was about formal verification, and this was the first time I saw that (scary, in my mind) math on the big screen. But full credit to the speakers: they explained things in such a way that even an average programmer like me understood each step. After this talk, we jumped into the world of OAuth/OpenID. One funny thing: whenever someone mentioned an RFC number, the authors turned out to be in the meeting room.

In the second half, we had the GNAP master class from Justin Richer. Once again, the speaker explained deep technical details in a straightforward way, so that everyone in the room could follow.

The evening before, people had mentioned a few times that many RFC numbers would be thrown around during heated technical discussions, but in the end it was not enough to scare me too much :)

rfc count

I also managed to meet Roland for the first time. We had long chats about the status of Python in the identity ecosystem and about Identity Python. I took some notes on how we can improve the use of Python in this space, and I will most probably start writing about that in the coming weeks.

In multiple talks, researchers and people from industry pointed out mistakes made in this space from a security point of view. Even though the specs give clear instructions for many things, there is no guarantee that implementers will follow them properly, and that causes security gaps.

At the end of day 1, we had a special Organ concert at the beautiful Trondheim Cathedral. On day 2, we had a special talk, “The Viking Kings of Norway”.

If you let me talk about my experience at the conference, I don't think I would stop before two hours. There was so much excitement and new information, and the whole feeling of going back to my starting days, when I knew next to nothing. Every discussion was full of learning opportunities (all discussions are, really, but being a newbie brings a different level of excitement). The only sad part was leaving Anwesha & Py back in Stockholm; this was the first time I stayed away from them after moving to Sweden.

surprise

Just before the conference ended, Aaron Parecki gave me a surprise gift. I spent time with it during the whole flight back to Stockholm.

This was the best food experience I have ever had at a conference: from breakfast to lunch, the big snack tables, the dinners and the restaurant meals. At least four people said to me during the conference, "oh, it feels like we are only eating and sometimes talking".

Another thing I really loved to see is that the two primary conference organizers are university roommates who are continuing their friendship and journey in a very beautiful way. Standing outside the hotel after midnight, talking about random things in life, and just seeing two longtime friends excited about similar things felt so nice.

Trondheim

I also want to thank the whole organizing team, including the local organizers and Steinar; the whole team did a superb job.

...
@anarcat May 13, 2022 - 20:19 • 12 days ago
NVMe/SSD disk failure

Yesterday, my workstation (curie) was hung when I came into the office. After a "skinny elephant", the box rebooted, but it couldn't find the primary disk (in the BIOS). Instead, it booted on the secondary HDD, still running an old Fedora 27 install which somehow survived to this day, possibly because BTRFS is incomprehensible.

Somehow, I blindly accepted the Fedora prompt asking me to upgrade to Fedora 28, not realizing that:

  1. Fedora is now at release 36, not 28
  2. major upgrades take about an hour...
  3. ... and happen at boot time, blocking the entire machine (I'll remember this next time I laugh at Windows and Mac OS users stuck on updates on boot)
  4. you can't skip more than one major upgrade

All of which means that upgrading to the latest release would take over 4 hours. Thankfully, it's mostly automated and seems to work pretty well (which is not exactly the case for Debian). It still seems like a lot of wasted time -- it would probably be better to just reinstall the machine at this point -- and not what I had planned to do that morning at all.

In any case, after waiting all that time, the machine booted (in Fedora) again, and now it could detect the SSD disk. The BIOS could find the disk too, so after I reinstalled grub (from Fedora) and fixed the boot order, it rebooted, but secureboot failed, so I turned that off (!?), and I was back in Debian.

I did an emergency backup with ddrescue from the running system, which probably doesn't really work as a backup (because the filesystem is likely to be corrupt), but it was fast enough (20 minutes) and gave me some peace of mind. My offsite backups have been down for a while, and since I treat my workstations as "cattle" (not "pets"), I don't have a solid recovery scenario for those situations other than "just reinstall and run Puppet", which takes a while.
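
For the record, a minimal ddrescue run looks something like this (device and file names here are illustrative, not the ones I actually used):

ddrescue -d -r3 /dev/nvme0n1 /mnt/backup/curie-ssd.img /mnt/backup/curie-ssd.map

The third argument is the map file, which lets you resume or retry the copy later.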

Now I'm wondering what the next step is: probably replace the disk anyway (the new one is bigger: 1TB instead of 500GB), or keep the new one as a hot backup somehow. Too bad I don't have a snapshotting filesystem on there... (Technically, I have LVM, but LVM snapshots are heavy and slow, and can't atomically cover the entire machine.)
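
To illustrate what I mean by heavy: an LVM snapshot needs its own preallocated space carved out of the volume group, something like this (with hypothetical names), and it slows down writes to the origin volume for as long as it exists:

lvcreate --snapshot --size 10G --name root-snap /dev/vg_curie/root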

It's kind of scary how this thing failed: totally dropped off the bus, just not in the BIOS at all. I prefer the way spinning rust fails: clickety sounds, tons of warnings beforehand, partial recovery possible. With this new flashy junk, you just lose everything all at once. Not fun.

...
@anarcat May 13, 2022 - 20:04 • 12 days ago
BTRFS notes

I'm not a fan of BTRFS. This page serves as a reminder of why, but also a cheat sheet to figure out basic tasks in a BTRFS environment because those are not obvious to me, even after repeatedly having to deal with them.

Content warning: there might be mentions of ZFS.

Stability concerns

I'm worried about BTRFS stability, which has been historically ... changing. RAID-5 and RAID-6 are still marked unstable, for example. It's kind of a lucky guess whether your current kernel will behave properly with your planned workload. For example, in Linux 4.9, RAID-1 and RAID-10 were marked as "mostly OK" with a note that says:

Needs to be able to create two copies always. Can get stuck in irreversible read-only mode if only one copy can be made.

Even as of now, RAID-1 and RAID-10 have this note:

The simple redundancy RAID levels utilize different mirrors in a way that does not achieve the maximum performance. The logic can be improved so the reads will spread over the mirrors evenly or based on device congestion.

Granted, that's not a stability concern anymore, just performance. A reviewer of a draft of this article actually claimed that BTRFS only reads from one of the drives, which hopefully is inaccurate, but goes to show how confusing all this is.

There are other warnings in the Debian wiki that are quite scary. Even the legendary Arch wiki has a warning on top of their BTRFS page, still.

Even if those issues are now fixed, it can be hard to tell when they were fixed. There is a changelog by feature but it explicitly warns that it doesn't know "which kernel version it is considered mature enough for production use", so it's also useless for this.

It would have been much better if BTRFS had been released into the world only once those bugs were completely fixed. Or if, at least, features had been announced only when they were stable, not just "we merged to mainline, good luck". Even now, we get mixed messages in the official BTRFS documentation, which says "The Btrfs code base is stable" (main page) while at the same time clearly listing unstable parts in the status page (currently RAID56).

There are much harsher BTRFS critics than me out there so I will stop here, but let's just say that I feel a little uncomfortable trusting server data with full RAID arrays to BTRFS. But surely, for a workstation, things should just work smoothly... Right? Well, let's see the snags I hit.

My BTRFS test setup

Before I go any further, I should probably clarify how I am testing BTRFS in the first place.

The reason I tried BTRFS is that I was ... let's just say "strongly encouraged" by the LWN editors to install Fedora for the terminal emulators series. That, in turn, meant the setup was done with BTRFS, because that was somewhat the default in Fedora 27 (or did I want to experiment? I don't remember, it's been too long already).

So Fedora was set up on my 1TB HDD and, with encryption, the partition table looks like this:

NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
├─sda1                   8:1    0   200M  0 part  /boot/efi
├─sda2                   8:2    0     1G  0 part  /boot
├─sda3                   8:3    0   7,8G  0 part  
│ └─fedora_swap        253:5    0   7.8G  0 crypt [SWAP]
└─sda4                   8:4    0 922,5G  0 part  
  └─fedora_crypt       253:4    0 922,5G  0 crypt /

(This might not entirely be accurate: I rebuilt this from the Debian side of things.)

This is pretty straightforward, except for the swap partition: normally, I just treat swap like any other logical volume and create it as a logical volume. This is now just speculation, but I bet it was set up this way because swap file support was only added to BTRFS in Linux 5.0.
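
For reference, the documented way to get a swap file onto a BTRFS filesystem these days is to create it with copy-on-write disabled; a sketch (not what the Fedora installer actually did) looks like this:

truncate -s 0 /swapfile
chattr +C /swapfile    # disable copy-on-write, required for swap on BTRFS
fallocate -l 8G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile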

I fully expect BTRFS experts to yell at me now because this is an old setup and BTRFS is so much better now, but that's exactly the point here. That setup is not that old (2018? old? really?), and migrating to a new partition scheme isn't exactly practical right now. But let's move on to more practical considerations.

No builtin encryption

BTRFS aims at replacing the entire mdadm, LVM, and ext4 stack with a single entity, and adding new features like deduplication, checksums and so on.

Yet there is one feature it is critically missing: encryption. See, my typical stack is actually mdadm, LUKS, and then LVM and ext4. This is convenient because I have only a single volume to decrypt.

If I were to use BTRFS on servers, I'd need to have one LUKS volume per disk. For a simple RAID-1 array, that's not too bad: one extra key. But for large RAID-10 arrays, this gets really unwieldy.
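
To illustrate, with hypothetical device names: each member disk needs its own LUKS container, and each container needs to be opened before BTRFS can assemble the filesystem:

cryptsetup luksFormat /dev/nvme0n1p3
cryptsetup luksFormat /dev/nvme1n1p3
cryptsetup open /dev/nvme0n1p3 crypt0
cryptsetup open /dev/nvme1n1p3 crypt1
mkfs.btrfs -m raid1 -d raid1 /dev/mapper/crypt0 /dev/mapper/crypt1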

The obvious BTRFS alternative, ZFS, supports encryption out of the box and mixes it above the disks so you only have one passphrase to enter. The main downside of ZFS encryption is that it happens above the "pool" level so you can typically see filesystem names (and possibly snapshots, depending on how it is built), which is not the case with a more traditional stack.
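
For comparison, here is roughly what native ZFS encryption looks like (pool and dataset names made up):

zfs create -o encryption=on -o keyformat=passphrase rpool/secret
# after a reboot or pool import, a single passphrase unlocks the whole tree
zfs load-key -r rpool
zfs mount -a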

Subvolumes, filesystems, and devices

I find BTRFS's architecture to be utterly confusing. In the traditional LVM stack (which is itself kind of confusing if you're new to that stuff), you have those layers:

  • disks: let's say /dev/nvme0n1 and nvme1n1
  • RAID arrays with mdadm: let's say the above disks are joined in a RAID-1 array in /dev/md1
  • volume groups or VG with LVM: the above RAID device (technically a "physical volume" or PV) is assigned into a VG, let's call it vg_tbbuild05 (multiple PVs can be added to a single VG which is why there is that abstraction)
  • LVM logical volumes: out of that volume group actually "virtual partitions" or "logical volumes" are created, that is where your filesystem lives
  • filesystem, typically with ext4: that's your normal filesystem, which treats the logical volume as just another block device

A typical server setup would look like this:

NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1                   259:0    0   1.7T  0 disk  
├─nvme0n1p1               259:1    0     8M  0 part  
├─nvme0n1p2               259:2    0   512M  0 part  
│ └─md0                     9:0    0   511M  0 raid1 /boot
├─nvme0n1p3               259:3    0   1.7T  0 part  
│ └─md1                     9:1    0   1.7T  0 raid1 
│   └─crypt_dev_md1       253:0    0   1.7T  0 crypt 
│     ├─vg_tbbuild05-root 253:1    0    30G  0 lvm   /
│     ├─vg_tbbuild05-swap 253:2    0 125.7G  0 lvm   [SWAP]
│     └─vg_tbbuild05-srv  253:3    0   1.5T  0 lvm   /srv
└─nvme0n1p4               259:4    0     1M  0 part

I stripped the other nvme1n1 disk because it's basically the same.

Now, if we look at my BTRFS-enabled workstation, which doesn't even have RAID, we have the following:

  • disk: /dev/sda with, again, /dev/sda4 being where BTRFS lives
  • filesystem: fedora_crypt, which is, confusingly, kind of like a volume group. It's where everything lives. I think.
  • subvolumes: home, root, /, etc. Those are actually the things that get mounted. You'd think you'd mount a filesystem, but no, you mount a subvolume. That is backwards.

It looks something like this to lsblk:

NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
├─sda1                   8:1    0   200M  0 part  /boot/efi
├─sda2                   8:2    0     1G  0 part  /boot
├─sda3                   8:3    0   7,8G  0 part  [SWAP]
└─sda4                   8:4    0 922,5G  0 part  
  └─fedora_crypt       253:4    0 922,5G  0 crypt /srv

Notice how we don't see all the BTRFS volumes here? Maybe it's because I'm mounting this from the Debian side, but lsblk definitely gets confused here. I frankly don't quite understand what's going on, even after repeatedly looking around the rather dismal documentation. But that's what I gather from the following commands:

root@curie:/home/anarcat# btrfs filesystem show
Label: 'fedora'  uuid: 5abb9def-c725-44ef-a45e-d72657803f37
    Total devices 1 FS bytes used 883.29GiB
    devid    1 size 922.47GiB used 916.47GiB path /dev/mapper/fedora_crypt

root@curie:/home/anarcat# btrfs subvolume list /srv
ID 257 gen 108092 top level 5 path home
ID 258 gen 108094 top level 5 path root
ID 263 gen 108020 top level 258 path root/var/lib/machines

I only got to that point through trial and error. Notice how I use an existing mountpoint to list the related subvolumes. If I try to use the filesystem path, the one that's listed in filesystem show, I fail:

root@curie:/home/anarcat# btrfs subvolume list /dev/mapper/fedora_crypt 
ERROR: not a btrfs filesystem: /dev/mapper/fedora_crypt
ERROR: can't access '/dev/mapper/fedora_crypt'

Maybe I just need to use the label? Nope:

root@curie:/home/anarcat# btrfs subvolume list fedora
ERROR: cannot access 'fedora': No such file or directory
ERROR: can't access 'fedora'

This is really confusing. I don't even know if I understand this right, and I've been staring at this all afternoon. Hopefully, the lazyweb will correct me eventually.

(As an aside, why are they called "subvolumes"? If something is a "sub" of "something else", that "something else" must exist, right? But no, BTRFS doesn't have "volumes", it only has "subvolumes". Go figure. Presumably the filesystem still holds "files", though; at least empirically it doesn't seem like it has lost anything so far.)

In any case, at least I can refer to this section in the future, the next time I fumble around the btrfs commandline, as I surely will. I will possibly even update this section as I get better at it, or based on my reader's judicious feedback.

Mounting BTRFS subvolumes

So how did I even get to that point? I have this in my /etc/fstab, on the Debian side of things:

UUID=5abb9def-c725-44ef-a45e-d72657803f37   /srv    btrfs  defaults 0   2

This thankfully ignores all the subvolume nonsense because it relies on the UUID. mount tells me that's actually the "root" (? /?) subvolume:

root@curie:/home/anarcat# mount | grep /srv
/dev/mapper/fedora_crypt on /srv type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)

Let's see if I can mount the other volumes I have on there. Remember that subvolume list showed I had home, root, and var/lib/machines. Let's try root:

mount -o subvol=root /dev/mapper/fedora_crypt /mnt

Interestingly, root is not the same as /, it's a different subvolume! It seems to be the Fedora root (/, really) filesystem. No idea what is happening here. I also have a home subvolume, let's mount it too, for good measure:

mount -o subvol=home /dev/mapper/fedora_crypt /mnt/home
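
If I wanted that mount to be permanent, I believe the subvol option would also work from fstab, something like this (untested):

UUID=5abb9def-c725-44ef-a45e-d72657803f37   /mnt/home    btrfs  defaults,subvol=home 0   2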

Note that lsblk doesn't notice those two new mountpoints, and that's normal: it only lists block devices and subvolumes (rather inconveniently, I'd say) do not show up as devices:

root@curie:/home/anarcat# lsblk 
NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
├─sda1                   8:1    0   200M  0 part  
├─sda2                   8:2    0     1G  0 part  
├─sda3                   8:3    0   7,8G  0 part  
└─sda4                   8:4    0 922,5G  0 part  
  └─fedora_crypt       253:4    0 922,5G  0 crypt /srv

This is really, really confusing. Maybe I did something wrong in the setup. Maybe it's because I'm mounting it from outside Fedora. Either way, it just doesn't feel right.

No disk usage per volume

If you want to see what's taking up space in one of those subvolumes, tough luck:

root@curie:/home/anarcat# df -h  /srv /mnt /mnt/home
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/fedora_crypt  923G  886G   31G  97% /srv
/dev/mapper/fedora_crypt  923G  886G   31G  97% /mnt
/dev/mapper/fedora_crypt  923G  886G   31G  97% /mnt/home

(Notice, in passing, that it looks like the same filesystem is mounted in different places. In that sense, you'd expect /srv and /mnt (and /mnt/home?!) to be exactly the same, but no: they are entirely different directory structures, which I will not call "filesystems" here because everyone's head will explode in sparks of confusion.)

Yes, disk space is shared (that's the Size and Avail columns, makes sense). But nope, no cookie for you: they all have the same Used columns, so you need to actually walk the entire filesystem to figure out what each subvolume takes.

(For future reference, that's basically:

root@curie:/home/anarcat# time du -schx /mnt/home /mnt /srv
124M    /mnt/home
7.5G    /mnt
875G    /srv
883G    total

real    2m49.080s
user    0m3.664s
sys 0m19.013s

And yes, that was painfully slow.)

ZFS actually has some oddities in that regard, but at least it tells me how much disk each volume (and snapshot) takes:

root@tubman:~# time df -t zfs -h
Filesystem         Size  Used Avail Use% Mounted on
rpool/ROOT/debian  3.5T  1.4G  3.5T   1% /
rpool/var/tmp      3.5T  384K  3.5T   1% /var/tmp
rpool/var/spool    3.5T  256K  3.5T   1% /var/spool
rpool/var/log      3.5T  2.0G  3.5T   1% /var/log
rpool/home/root    3.5T  2.2G  3.5T   1% /root
rpool/home         3.5T  256K  3.5T   1% /home
rpool/srv          3.5T   80G  3.5T   3% /srv
rpool/var/cache    3.5T  114M  3.5T   1% /var/cache
bpool/BOOT/debian  571M   90M  481M  16% /boot

real    0m0.003s
user    0m0.002s
sys 0m0.000s

That's 56360 times faster, by the way.

But yes, that's not fair: those in the know will know there's a different command to do what df does with BTRFS filesystems, the btrfs filesystem usage command:

root@curie:/home/anarcat# time btrfs filesystem usage /srv
Overall:
    Device size:         922.47GiB
    Device allocated:        916.47GiB
    Device unallocated:        6.00GiB
    Device missing:          0.00B
    Used:            884.97GiB
    Free (estimated):         30.84GiB  (min: 27.84GiB)
    Free (statfs, df):        30.84GiB
    Data ratio:               1.00
    Metadata ratio:           2.00
    Global reserve:      512.00MiB  (used: 0.00B)
    Multiple profiles:              no

Data,single: Size:906.45GiB, Used:881.61GiB (97.26%)
   /dev/mapper/fedora_crypt  906.45GiB

Metadata,DUP: Size:5.00GiB, Used:1.68GiB (33.58%)
   /dev/mapper/fedora_crypt   10.00GiB

System,DUP: Size:8.00MiB, Used:128.00KiB (1.56%)
   /dev/mapper/fedora_crypt   16.00MiB

Unallocated:
   /dev/mapper/fedora_crypt    6.00GiB

real    0m0,004s
user    0m0,000s
sys 0m0,004s

Almost as fast as ZFS's df! Good job. But wait. That doesn't actually tell me usage per subvolume. Notice it's filesystem usage, not subvolume usage, which unhelpfully refuses to exist. That command only shows that one "filesystem's" internal statistics, which are pretty opaque. You can also appreciate that it's wasting 6GB of "unallocated" disk space there: I probably did something Very Wrong and should be punished by Hacker News. I also wonder why it has 1.68GB of "metadata" used...

At this point, I just really want to throw that thing out of the window and restart from scratch. I don't really feel like learning the BTRFS internals, as they seem oblique and completely bizarre to me. It feels a little like the state of PHP now: it's actually pretty solid, but built upon so many layers of cruft that I still feel it corrupts my brain every time I have to deal with it (needle or haystack first? anyone?)...

Conclusion

I find BTRFS utterly confusing and I'm worried about its reliability. I think a lot of work is needed on usability and coherence before I even consider running this anywhere else than a lab, and that's really too bad, because there are really nice features in BTRFS that would greatly help my workflow. (I want to use filesystem snapshots as high-performance, high frequency backups.)

So now I'm experimenting with OpenZFS. It's so much simpler, it just works, and it's rock solid. After this 8-minute read, I had a good understanding of how ZFS worked. Here's the 30-second overview:

  • vdev: a RAID array
  • zpool: a pool made of one or more vdevs, roughly the equivalent of a volume group
  • datasets: normal filesystems (or block devices, if you want to use another filesystem on top of ZFS)

There are also other special volumes, like caches and logs, that you can use (really easily, compared to LVM caching) to tweak your setup. You might also want to look at recordsize or ashift to tune the filesystem to better fit your workload (or to deal with drives lying about their sector size, I'm looking at you Samsung), but that's it.
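
As a sketch, with invented device and dataset names, those tweaks look like this:

# force 4K sectors, even if the drive lies about its sector size
zpool create -o ashift=12 tank mirror /dev/nvme0n1 /dev/nvme1n1
# add separate log and cache devices
zpool add tank log /dev/sdc1
zpool add tank cache /dev/sdd1
# smaller records for a database-style workload
zfs set recordsize=16K tank/db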

Running ZFS on Linux currently involves building kernel modules from scratch on every host, which I think is pretty bad. But I was able to set up a ZFS-only server using this excellent documentation without too much trouble.

I'm hoping some day the copyright issues are resolved and we can at least ship binary packages, but the politics (e.g. convincing Debian that this is the right thing to do) and the logistics (e.g. DKMS auto-builders? is that even a thing? how about signed DKMS packages? fun-fun-fun!) seem really impractical. Who knows, maybe hell will freeze over (again) and Oracle will fix the CDDL. I personally think that we should just completely ignore this problem (which wasn't even supposed to be a problem) and ship binary packages directly, but I'm a pragmatist and do not always fit well with the free software fundamentalists.

All of this to say that, short term, we don't have a reliable, advanced filesystem/logical disk manager in Linux. And that's really too bad.

...
@ooni May 13, 2022 - 00:00 • 13 days ago
Job Opening: OONI Community Coordinator
Are you passionate about defending human rights on the internet? Are you extremely organized and enjoy supporting communities around the world? We have a job opening for you!

The OONI team (a non-profit fighting internet censorship) is looking for a dedicated community coordinator to help grow and support the OONI community around the world. The application deadline is Sunday, 12th June 2022.

Job description

By joining our team, you will support our partners and global community of human rights defenders to measure and fight internet censorship. ...
@blog May 12, 2022 - 00:00 • 14 days ago
New Release: Tor Browser 11.0.12 (Android)

Tor Browser 11.0.12 is now available from the Tor Browser download page and also from our distribution directory.

This version solves the startup crash that many users reported: https://gitlab.torproject.org/tpo/applications/fenix/-/issues/40212

Tor Browser 11.0.12 updates GeckoView to 96.0.3.

We also take the opportunity to update various other components of Tor Browser:

  • NoScript 11.4.5
  • OpenSSL 1.1.1o

We also switch to Go 1.17.9 for building our Go-related projects.

The full changelog since Tor Browser 11.0.8 is:

  • Android
  • Build System
    • Android
      • Update Go to 1.17.9
...
@anarcat May 6, 2022 - 16:26 • 19 days ago
Wallabako 1.4.0 released

I don't particularly like it when people announce their personal projects on their blog, but I'm making an exception for this one, because it's a little special for me.

You see, I have just released Wallabako 1.4.0 (and a quick, mostly irrelevant 1.4.1 hotfix) today. It's the first release of that project in almost 3 years (the previous was 1.3.1, before the pandemic).

The other reason I figured I would mention it is that I have almost never talked about Wallabako on this blog at all, so many of my readers probably don't even know I sometimes meddle in Golang, which surprises even me sometimes.

What's Wallabako

Wallabako is a weird little program I designed to read articles on my E-book reader. I use it to spend less time on the computer: I save articles in a read-it-later app named Wallabag (hosted by a generous friend), and then Wallabako connects to that app, downloads an EPUB version of the book, and then I can read it on the device directly.

When I'm done reading the book, Wallabako notices and sets the article as read in Wallabag. I also set it to delete the book locally, but you can actually configure it to keep those books around forever if you feel like it.

Wallabako supports syncing read status with the built-in Kobo interface (called "Nickel"), Koreader and Plato. I happen to use Koreader for everything nowadays, but it should work equally well on the others.

Wallabako is actually set up to be started by udev when a connection change is detected by the kernel, which is kind of a gross hack. It's clunky, but it actually works, and I thought for a while about switching to something else, but it's really the easiest way to go and requires the least interaction from the user.
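
For the curious, the kind of udev rule involved looks something like this (purely illustrative; the interface match and binary path are hypothetical, the real rules are the ones Wallabako installs on the device):

ACTION=="add", SUBSYSTEM=="net", KERNEL=="eth*|wlan*", RUN+="/usr/local/wallabako/wallabako"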

Why I'm (still) using it

I wrote Wallabako because I read a lot of articles on the internet. It's actually most of my reading. I read about 10 books a year (which I don't think is much), but I probably read more, in terms of time and pages, in Wallabag. I haven't actually done the math, but I estimate I spend at least double the time reading articles than I spend reading books.

If I didn't have Wallabag, I would have hundreds of tabs open in my web browser all the time. So at least that problem is easily solved: throw everything in Wallabag, sort and read later.

If I didn't have Wallabako, however, I would either spend that time reading on the computer -- time I prefer to spend working on free software or work -- or on my phone -- which is kind of better, but really cramped.

I had actually stopped using (and developing) Wallabako for a while. Around 2019, I got tired of always reading those technical articles (basically work stuff!) at home. I realized I was just not "reading" (as in books! fiction! fun stuff!) anymore, at least not as much as I wanted.

So I tried to make this separation: the ebook reader is for cool book stuff. The rest is work. But because I had the Wallabag Android app on my phone and tablet, I could still read those articles there, which I thought was pretty neat. But that meant that I was constantly looking at my phone, which is something I'm generally trying to avoid, as it sets a bad example for the kids (small and big) around me.

Then I realized there was one stray ebook reader lying around at home. I had recently bought a Kobo Aura HD to read books, and I like that device. And it's going to stay locked down to reading books. But there's still that old battered Kobo Glo HD reader lying around, and I figured I could just borrow it to read Wallabag articles.

What is this new release

But oh boy that was a lot of work. Wallabako was kind of a mess: it was using the deprecated go dep tool, which lost the battle with go mod. Cross-compilation was broken for older devices, and I had to implement support for Koreader.

go mod

So I had to learn go mod. I'm still not sure I got that part right: LSP is yelling at me because it can't find the imports, and I'm generally just "YOLO everything" every time I get anywhere close to it. That's not the way to do Go, in general, and not how I like to do it either.

But I guess that, given time, I'll figure it out and make it work for me. It certainly works now. I think.

Cross compilation

The hard part was different. You see, Nickel uses SQLite to store metadata about books, so Wallabako actually needs to tap into that SQLite database to propagate read status. Originally, I just linked against some sqlite3 library I found lying around. It's basically a wrapper around the C-based SQLite and generally works fine. But that means you actually link your Golang program against a C library. And that's when things get a little nutty.

If you were to just build Wallabako naively, it would fail when deployed on the Kobo Glo HD. That's because the device runs a really old kernel: the prehistoric Linux kobo 2.6.35.3-850-gbc67621+ #2049 PREEMPT Mon Jan 9 13:33:11 CST 2017 armv7l GNU/Linux. That was built in 2017, but the kernel was actually released in 2010, a whole 5 years before the Glo HD itself was released in 2015, which is kind of outrageous. And yes, that is with the latest firmware release.

My bet is they just don't upgrade the kernel on those things, as the Glo was probably bought around 2017...

In any case, the problem is we are cross-compiling here. And Golang is pretty good about cross-compiling, but because we have C in there, we're actually cross-compiling with "CGO" which is really just Golang with a GCC backend. And that's much, much harder to figure out because you need to pass down flags into GCC and so on. It was a nightmare.

That's until I found this outrageous "little" project called modernc.org/sqlite. What that thing does (with a hefty dose of dependencies that would make any Debian developer recoil in horror) is to transpile the SQLite C source code to Golang. You read that right: it rewrites SQLite in Go. On the fly. It's nuts.

But it works. And you end up with a "pure go" program, and that thing compiles much faster and runs fine on the older kernel.

I still wasn't sure I wanted to just stick with that forever, so I kept the old sqlite3 code around, behind a compile-time tag. At the top of the nickel_modernc.go file, there's this magic string:

//+build !sqlite3

And at the top of nickel_sqlite3.go file, there's this magic string:

//+build sqlite3

So now, by default, the modernc file gets included, but if I pass --tags sqlite3 to the Go compiler (to go install or whatever), it will actually switch to the other implementation. Pretty neat stuff.
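
Concretely, the two builds look something like this (a sketch; adjust the package path to taste):

go install ./...                  # default: pure-Go (modernc.org/sqlite) build
go install -tags sqlite3 ./...    # switch back to the CGO-based sqlite3 wrapper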

Koreader port

The last part was something I had been hesitant to do for a long time, but it turned out to be pretty easy. I have basically switched to using Koreader to read everything. Books, PDFs, everything goes through it. I really like that it stores its metadata in sidecar files: I synchronize all my books with Syncthing, which means I can carry my read status, annotations and all that stuff without having to think about it. (And yes, I installed Syncthing on my Kobo.)

The koreader.go port was less than 80 lines, and I could even make a nice little test suite so that I don't have to redeploy that thing to the ebook reader at every code iteration.

I had originally thought I should add some sort of graphical interface in Koreader for Wallabako as well, and had requested that feature upstream. Unfortunately (or fortunately?), they took my idea and just ran with it. Some courageous soul actually wrote a full Wallabag plugin for koreader, in Lua of course.

Compared to the Wallabako implementation, however, the koreader plugin is much slower, probably because it downloads articles serially instead of concurrently. It is, however, much more usable, as the user gets visible feedback about the various steps. I still had to enable full debugging to diagnose a problem (which was that I shouldn't have a trailing slash, and that some special characters don't work in passwords). It's also better to write the config file with a normal text editor, over SSH or with the Kobo mounted to your computer, instead of typing those really long strings on the Kobo itself.

There's no sample config file, which makes that harder, but a workaround is to save the configuration with dummy values and fix them up afterwards. Finally, I also found the default setting ("Remotely delete finished articles") really dangerous, as it can basically lead to data loss (the Wallabag article being deleted!) for an unsuspecting user...

So basically, I started working on Wallabako again because the koreader implementation of their Wallabag client was not up to spec for me. It might be good enough for you, but I guess if you like Wallabako, you should thank the koreader folks for their sloppy implementation, as I'm now working on Wallabako again.

Actual release notes

Those are the actual release notes for 1.4.0.

Ship a lot of fixes that have accumulated in the 3 years since the last release.

Features:

  • add timestamp and git version to build artifacts
  • cleanup and improve debugging output
  • switch to pure go sqlite implementation, which helps
  • update all module dependencies
  • port to wallabago v6
  • support Plato library changes from 0.8.5+
  • support reading koreader progress/read status
  • Allow containerized builds, use gomod and avoid GOPATH hell
  • overhaul Dockerfile
  • switch to go mod

Documentation changes:

  • remove instability warning: this works well enough
  • README: replace branch name master by main in links
  • tweak mention of libreoffice to clarify concern
  • replace "kobo" references by "nickel" where appropriate
  • make a section about related projects
  • mention NickelMenu
  • quick review of the koreader implementation

Bugfixes:

  • handle errors in http request creation
  • Use OutputDir configuration instead of hardcoded wallabako paths
  • do not noisily fail if there's no entry for book in plato
  • regression: properly detect read status again after koreader (or plato?) support was added

How do I use this?

This is amazing. I can't believe someone did something that awesome. I want to cover you with gold and Tesla cars and fresh water.

You're weird, please stop. But if you want to use Wallabako, head over to the README file, which has installation instructions. It basically uses a hack in Kobo e-readers that will happily overwrite their root filesystem as soon as you drop a file named KoboRoot.tgz in the .kobo directory of your e-reader.

Note that there is no uninstall procedure and it messes with the reader's udev configuration (to trigger runs on wifi connect). You'll also need to create a JSON configuration file and configure a client in Wallabag.

And if you're looking for Wallabag hosting, Wallabag.it offers a 14-day free trial. You can also, obviously, host it yourself. Which is not the case for Pocket, even years after Mozilla bought the company. All this wouldn't actually be necessary if Pocket was open-source because Nickel actually ships with a Pocket client.

Shame on you, Mozilla. But you still make an awesome browser, so keep doing that.

...
@blog May 6, 2022 - 00:00 • 20 days ago
Arti 0.3.0 is released: Robustness and API improvements

Arti is our ongoing project to create a working embeddable Tor client in Rust. It’s not ready to replace the main Tor implementation in C, but we believe that it’s the future.

Right now, our focus is on making Arti production-quality, by stress-testing the code, hunting for likely bugs and adding missing features that we know from experience that users will need. We're going to try not to break backward compatibility too much, but we'll do so when we think it's a good idea.

What's new in 0.3.0?

For a complete list of changes, have a look at the CHANGELOG.

This release has bugfixes for many robustness issues affecting failures to bootstrap. It includes automatic detection and reporting of clock skew, when that affects bootstrapping. (This is a feature we've wanted in Tor for a long time.)

We also add support for C tor-style "safe logging" (suppressing sensitive client information from persistent logs).

For developers, we've continued to improve our configuration API and make it more consistent. This involves a fair amount of breakage; we hope that the next release will only have minor API breaks in the configuration logic.

There are also a bunch of smaller features, bugfixes, and infrastructure improvements; again, see the CHANGELOG for a more complete list.

And what's next?

In the short term, there's some pending work that wasn't ready for this release. Our next release will likely contain improved filesystem permission checking, final improvements to the configuration logic, a refactored directory download implementation, and more.

Beyond that, between now and our 1.0.0 milestone in September, we're aiming to make Arti a production-quality Tor client for direct internet access. (Onion services aren't funded yet, but we hope to change that soon.)

To do so, we need to bring Arti up to par with the C tor implementation in terms of its network performance, CPU usage, resiliency, and security features. You can follow our progress on our 1.0.0 milestone.

We still plan to continue regular releases between now and then.

Here's how to try it out

We rely on users and volunteers to find problems in our software and suggest directions for its improvement. Although Arti isn't yet ready for production use, you can test it as a SOCKS proxy (if you're willing to compile from source) and as an embeddable library (if you don't mind a little API instability).

Assuming you've installed Arti (with cargo install arti, or directly from a cloned repository), you can use it to start a simple SOCKS proxy for making connections via Tor with:

$ arti proxy -p 9150

and use it more or less as you would use the C Tor implementation!

(It doesn't support onion services yet. If compilation doesn't work, make sure you have development files for libsqlite installed on your platform.)
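
For example, assuming the proxy above is listening on localhost port 9150, any SOCKS-capable client can be pointed at it; with curl, that might look like:

curl --socks5-hostname 127.0.0.1:9150 https://check.torproject.org/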

If you want to build a program with Arti, you probably want to start with the arti-client crate. Be sure to check out the examples too.

For more information, check out the README file. (For now, it assumes that you're comfortable building Rust programs from the command line). Our CONTRIBUTING file has more information on installing development tools, and on using Arti inside of Tor Browser. (If you want to try that, please be aware that Arti doesn't support onion services yet.)

When you find bugs, please report them on our bugtracker. You can request an account or report a bug anonymously.

And if this documentation doesn't make sense, please ask questions! The questions you ask today might help improve the documentation tomorrow.

Call for feedback

Our priority for the coming months is to make Arti a production-quality Tor client, for the purposes of direct connections to the internet. (Onion services will come later.) We know some of the steps we'll need to take to get there, but not all of them: we need to know what's missing for your use-cases.

Whether you're a user or a developer, please give Arti a try, and let us know what you think. The sooner we learn what you need, the better our chances of getting it into an early milestone.

Acknowledgments

Thanks to everybody who has contributed to this release, including Christian Grigis, Dimitris Apostolou, Samanta Navarro, and Trinity Pointard.

And thanks, of course, to Zcash Community Grants (formerly Zcash Open Major Grants (ZOMG)) for funding this project!

...
@blog May 4, 2022 - 00:00 • 22 days ago
Congestion Control Arrives in Tor 0.4.7-stable!

Tor has released 0.4.7.7, the first stable Tor release with support for congestion control. Congestion control will eliminate the speed limit of current Tor, as well as reduce latency by minimizing queue lengths at relays. It will result in significant performance improvements in Tor, as well as increased utilization of our network capacity. In order for users to experience these benefits, we need Exit relay operators to upgrade as soon as possible. This post covers a bit of congestion control history, describes technical details, and contains important information for all relay and onion service operators.

What is Congestion Control?

Congestion Control is an adaptive property of distributed networks, whereby a network and its endpoints operate such that utilization is maximized, while minimizing a constraint property, and ensuring fairness between connections. When this optimization problem is solved, the optimal outcome is that all connections transmit an equal fraction of the bandwidth of the slowest router in their shared path, for every path through the network.

TCP Congestion Control solves this optimization problem primarily by minimizing packet drops as the constraint property, effectively increasing speed until router queues overflow, and reducing speed in proportion to these drops. In TCP terminology, the congestion control optimization problem is solved by setting the Congestion Window equal to the Bandwidth-Delay Product of a path.

Some congestion control algorithms can make use of auxiliary information, such as latency, in order to anticipate congestion before the point at which queues overflow and packets drop. Notable examples are TCP Vegas, Bittorrent's LEDBAT, and Google's BBR.

Congestion Control Means a Faster Tor

While Tor uses TCP between relays, Tor was designed without any end-to-end congestion control through the network itself. Instead, it set a fixed window size of 1000 512-byte Tor cells on a circuit. In the early days of Tor, this resulted in unbearable latency caused by excessive queue delay, because these windows were much larger than each client's fair share of the Bandwidth-Delay Product on any given circuit. In the early Tor days, users could wait for up to a minute for a page load to respond. This also meant that relays used a huge amount of memory in these cases.

Once spare network capacity increased such that the spare Bandwidth-Delay Product of circuits exceeded this fixed window size of 1000 cells, overall latency improved due to lower queue delay, but throughput began to level off. Because the Bandwidth-Delay Product was artificially limited to 1000 cells, this fixed window size became a speed limit, with the property that lower-latency circuits had higher throughput than high-latency circuits, directly in proportion to their latency.
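
As a back-of-the-envelope illustration of that speed limit: a fixed window of 1000 cells of 512 bytes means at most about 512 KB can be in flight on a circuit. On a circuit with a 500 ms round-trip time, that caps throughput at roughly 512 KB / 0.5 s ≈ 1 MB/s, while a 250 ms circuit could reach about 2 MB/s, regardless of how much spare capacity the relays actually have.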

This turning point with respect to the window size happened around 2015:

Throughput and Latency from 2013-2016

When this capacity turning point was reached, congestion control became not only something that would improve latency, it would also significantly increase throughput.

This turning point made congestion control a top-priority improvement for the Tor network! Congestion control will remove this speed limit entirely, and will also reduce the impact of path latency on throughput.

History of Congestion Control Research on Tor

Unfortunately, because Tor's circuit cryptography cannot support packet drops or reordering, the research community struggled for nearly two decades to determine a way to provide congestion control on the Tor network.

Crucially, we rejected mechanisms to provide congestion control by allowing packet drops, due to the ability to introduce end-to-end side channels in the packet drop pattern.

This ultimately left only a very small class of candidate algorithms to consider: those that use Round-Trip Time to measure queue delay as a congestion signal, and those that directly measure the Bandwidth-Delay Product. The upshot is that this class of algorithms only requires clients, Exit relays, and onion services to upgrade; they do not require any changes to intermediate relays.

We ultimately specified three candidate algorithms informed by prior Tor and TCP research: Tor-Westwood, Tor-Vegas, and Tor-NOLA. These algorithms are detailed in Tor Proposal 324.

Tor-Westwood is based on the unnamed RTT threshold algorithm from the DefenestraTor Paper, in combination with Bandwidth-Delay Product estimation ideas from TCP Westwood.

Tor-Vegas is very closely based on TCP Vegas. TCP Vegas uses a much more fine-grained RTT ratio to directly estimate the total queue length on the path, and then targets a specific queue length as the constraint criteria. TCP Vegas is extremely efficient and effective, and is able to achieve fairness without any packet drops at all. However, it was never deployed on the Internet, because it was out-competed by the more aggressive and already deployed TCP Reno. Because Reno continues increasing speed until packet drops happen, TCP Reno would end up soaking up the capacity of less aggressive Vegas flows that did not drop packets.

The final algorithm, Tor-NOLA, was created to test the behavior of Bandwidth-Delay Product estimation used directly as the congestion window, without any adaptation.

An additional component, called Flow Control, is necessary to handle the case where an Internet destination or application is slower than Tor. We won't cover Flow Control in this post, but the interested reader can examine those details in Section 4 of Proposal 324.

Implementation, Simulation, and Deployment

We implemented all three algorithms (Tor-Westwood, Tor-Vegas, and Tor-NOLA) in Tor 0.4.7, and subjected them to extensive evaluation in the Shadow Simulator.

The end result was that Tor-Westwood and Tor-NOLA exhibited ack compression, which caused them to wildly overestimate the Bandwidth-Delay Product, which led to runaway congestion conditions. Standard mechanisms for dealing with ack compression, such as smoothing, probing, and long-term averaging, did little to address this, perhaps because of the lack of packet drops as a backstop on queue pressure. Tor-Westwood also exhibited runaway conditions due to the nature of its RTT threshold. (As an aside, Google's BBR algorithm also has these problems, and relies on packet drops as a backstop as well.)

Tor-Vegas performed beautifully, almost exactly as the theory predicted. Here's the Shadow Simulator's throughput graphs of clients with simulated locations in Germany and Hong Kong:

Simulated HK and German Clients

While there is still a difference in throughput between these two locations, the speed limit from 0.4.6 Tor is clearly gone. End-to-end latency was not affected at all, according to the simulator.

Additionally, Tor-Vegas was not out-competed by legacy Tor traffic, allowing us to enable it as soon as 0.4.7 came out. We also gain protection from rogue algorithms via the combination of KIST and Circuit-EWMA, which were previously deployed on Tor to address latency problems during the BDP bottleneck era.

Exit Relay Operators: Please Upgrade!

Users of Tor versions 0.4.7 and above will experience faster performance when using Exits or Onion Services that have upgraded to 0.4.7.

This means that in order for users to see the benefits of these improvements, we need our Exit relay operators to upgrade to the new Tor 0.4.7 stable series, asap!

Packages for Debian, Ubuntu, and Fedora/CentOS/RHEL are already available. Please follow those links for instructions on using our packaging repos for those distributions, and upgrade asap!
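
On Debian and Ubuntu, for example, once the Tor Project package repository from those instructions is configured, the upgrade itself is just:

sudo apt update
sudo apt install tor
tor --version    # should now report a 0.4.7.x version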

BSD users should be able to install this release from their flavor's ports system.

If you run into problems while upgrading your relay, you can ask your questions on the public tor-relays mailing list and Relay Operator sub-category on the Tor Forum. You can also get help by joining the channel #tor-relays.

All Relay Operators: Be Prepared to Set Bandwidth Limits

Non-exit relay operators do not need to upgrade for congestion control to work, but this also means they may be surprised by the network effects of congestion control traffic running through their relays.

The faster performance and increased utilization of congestion control means that we will soon be able to use the full capacity of the Tor network. This means that all relays will soon experience new bottlenecks. Congestion control should prevent these bottlenecks from overwhelming relays completely, but this behavior may come as a surprise to operators who were used to the last several years of low CPU and bandwidth utilization.

We are already seeing an increase in the Advertised Bandwidth of relays as a result of some higher-throughput congestion control circuit use, similar to our previous flooding experiments, even though most clients are not yet using congestion control:

Advertised Bandwidth Increase

This increase is because Advertised Bandwidth is computed from the highest 7-day burst of traffic seen, whereas Consumed Bandwidth is the average byte rate. As more clients upgrade, particularly after a Tor Browser Stable release with 0.4.7 is made, the Consumed Bandwidth of the network should also rise. We expect to make this Tor Browser Stable release on May 31st, 2022.

Once users migrate to this new release, relay operators who pay for bandwidth by the gigabyte may want to consider enabling hibernation, to avoid surprise cost increases.
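
Hibernation is configured with the accounting options in torrc; for example, something like this would limit the relay to roughly 1 TB per monthly accounting period (the numbers are arbitrary, check the manual for the exact semantics):

AccountingStart month 1 00:00
AccountingMax 1 TBytes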

This increased traffic may also cause your relay CPU usage to spike, due to increased cryptographic load of the additional traffic. In theory, Tor-Vegas congestion control should treat CPU throughput bottlenecks exactly the same as bandwidth bottlenecks, and back off once CPU bottleneck causes queue delay. However, if you also pay for CPU, you may want to rate limit your relay's bandwidth.
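
Likewise, rate limiting is a couple of torrc lines; for example (again, arbitrary numbers):

RelayBandwidthRate 10 MBytes
RelayBandwidthBurst 20 MBytes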

Relays may also experience overload on the Relay Search Portal. Here is an example of that:

Corona Overload

This overload indicator may appear for several reasons. If your relay has this overload indicator, follow the instructions on our overload support page, in order to diagnose the specific cause. If the cause is CPU overload, consider setting bandwidth limits, to reduce the traffic through your relay.

If you have issues diagnosing or eliminating the cause of overload, you can ask questions on the public tor-relays mailing list and Relay Operator sub-category on the Tor Forum. You can also get help by joining the channel #tor-relays.

Onion Service Operators Should Also Upgrade

Just like Exit relays, Onion Services also need to upgrade to 0.4.7 for users to be able to use congestion control with them.

Additionally, Tor 0.4.7 has a security improvement for short-lived onion services, called Vanguards-Lite. This system will reduce the risk of attacks that can discover the Guard relay of an onion service or onion client, so long as that onion service is around for a month or less. Longer lived onion services are still encouraged to use the vanguards addon.

Deployment Plan

The Tor Browser Alpha series already supports congestion control, but it won't experience improved performance unless an 0.4.7 Exit or Onion Service is used with it.

Because our network is roughly 25% utilized, we expect that throughput may be very high for the first few users who use 0.4.7 on fast circuits with fast 0.4.7 Exits, until the point where most clients have upgraded. At that point, a new equilibrium will be reached in terms of throughput and network utilization.

For this reason, we are holding back on releasing a Tor Browser Stable with congestion control, until enough Exits have upgraded to make the experience more uniform. We hope this will happen by May 31st.

Also for this reason, we won't be upgrading our Tor performance metrics sources to 0.4.7 until enough Exits have upgraded for those measurements to be an accurate reflection of congestion control. So these improvements will not be reflected in our performance metrics until we upgrade those onionperf instances.

The Future

The astute reader will note that we rejected datagram transports. However, this does not mean that Tor will never carry UDP traffic. On the contrary, congestion control deployment means that queue delay and latency will be much more stable and predictable. This will enable us to carry UDP without packet drops in the network, and only drop UDP at the edges, when the congestion window becomes full. We are hopeful that this new behavior will match what existing UDP protocols expect, allowing their use over Tor.

This still leaves the problem that very slow Tor relays may become a bottleneck, prohibiting the use of interactive voice and video over UDP while using them in a circuit. To address this problem, we will be examining our Guard and Fast relay bandwidth cutoffs, to avoid giving these flags to relays that are too slow to handle multiple clients at once.

Additionally, in Tor 0.4.8, we will be implementing a traffic splitting mechanism based on a previous Tor research paper called Conflux, with improvements from recent Multipath TCP research. This system is specified in Tor Proposal 329.

Conflux has the ability to rebalance traffic over multiple paths to an Exit relay, optimizing for either throughput, or latency.

With Conflux, Exit relays will become the new speed limit of Tor, making fast Exits more valuable than ever before!

...
@blog May 3, 2022 - 00:00 • 23 days ago
New Release: Tor Browser 11.0.11 (Windows, macOS, Linux)

Tor Browser 11.0.11 is now available from the Tor Browser download page and also from our distribution directory.

This version includes important security updates to Firefox:

Tor Browser 11.0.11 updates Firefox on Windows, macOS, and Linux to 91.9.0esr.

We also take the opportunity to update various other components of Tor Browser:

  • NoScript 11.4.5

The full changelog since Tor Browser 11.0.10 is:

  • Windows + OS X + Linux
  • Build System
    • Windows + OS X + Linux
      • Update Go to 1.17.9
...
@blog May 3, 2022 - 00:00 • 23 days ago
New Release: Tails 5.0

We are especially proud to present you Tails 5.0, the first version of Tails based on Debian 11 (Bullseye). It brings new versions of a lot of the software included in Tails and new OpenPGP tools.

New features

Kleopatra

We added Kleopatra to replace the OpenPGP Applet and the Password and Keys utility, also known as Seahorse.

The OpenPGP Applet was not actively developed anymore and was complicated for us to keep in Tails. The Password and Keys utility was also poorly maintained and Tails users suffered from too many of its issues until now, like #17183.

Kleopatra provides equivalent features in a single tool and is more actively developed.

Changes and updates

  • The Additional Software feature of the Persistent Storage is enabled by default to make it faster and more robust to configure your first additional software package.

  • You can now use the Activities overview to access your windows and applications. To access the Activities overview, you can either:

    • Click on the Activities button.
    • Throw your mouse pointer to the top-left hot corner.
    • Press the Super key on your keyboard.

    You can see your windows and applications in the overview. You can also start typing to search your applications, files, and folders.

Included software

Most included software has been upgraded in Debian 11, for example:

  • Update Tor Browser to 11.0.11.

  • Update GNOME from 3.30 to 3.38, with lots of small improvements to the desktop, the core GNOME utilities, and the locking screen.

  • Update MAT from 0.8 to 0.12, which adds support to clean metadata from SVG, WAV, EPUB, PPM, and Microsoft Office files.

  • Update Audacity from 2.2.2 to 2.4.2.

  • Update Disk Utility from 3.30 to 3.38.

  • Update GIMP from 2.10.8 to 2.10.22.

  • Update Inkscape from 0.92 to 1.0.

  • Update LibreOffice from 6.1 to 7.0.

Hardware support

  • The new support for driverless printing and scanning in Linux makes it easier to make recent printers and scanners work in Tails.

Fixed problems

  • Fix unlocking VeraCrypt volumes that have very long passphrases. (#17474)

For more details, read our changelog.

Known issues

  • Additional Software sometimes doesn't work when restarting for the first time right after creating a Persistent Storage. (#18839)

    To solve this, install the same additional software package again after restarting with the Persistent Storage for the first time.

  • Thunderbird displays a popup to choose an application when opening links. (#18913)

  • Tails Installer sometimes fails to clone. (#18844)

See the list of long-standing issues.

Get Tails 5.0

To upgrade your Tails USB stick and keep your persistent storage

  • Automatic upgrades are not available to 5.0.

    All users have to do a manual upgrade.

To install Tails on a new USB stick

Follow our installation instructions:

The Persistent Storage on the USB stick will be lost if you install instead of upgrading.

To download only

If you don't need installation or upgrade instructions, you can download Tails 5.0 directly:

What's coming up?

Tails 5.1 is scheduled for May 31.

Have a look at our roadmap to see where we are heading to.

Support and feedback

For support and feedback, visit the Support section on the Tails website.

...
@anarcat April 27, 2022 - 20:36 • 28 days ago
Using LSP in Emacs and Debian

The Language Server Protocol (LSP) is a neat mechanism that provides a common interface to what used to be language-specific lookup mechanisms (like, say, running a Python interpreter in the background to find function definitions).

There is also ctags shipped with UNIX since forever, but that doesn't support looking backwards ("who uses this function"), linting, or refactoring. In short, LSP rocks, and how do I use it right now in my editor of choice (Emacs, in my case) and OS (Debian) please?

Editor (emacs) setup

First, you need to setup your editor. The Emacs LSP mode has pretty good installation instructions which, for me, currently mean:

apt install elpa-lsp-mode

and this .emacs snippet:

(use-package lsp-mode
  :commands (lsp lsp-deferred)
  :hook ((python-mode go-mode) . lsp-deferred)
  :demand t
  :init
  (setq lsp-keymap-prefix "C-c l")
  ;; TODO: https://emacs-lsp.github.io/lsp-mode/page/performance/
  ;; also note re "native compilation": <+varemara> it's the
  ;; difference between lsp-mode being usable or not, for me
  :config
  (setq lsp-auto-configure t))

(use-package lsp-ui
  :config
  (setq lsp-ui-flycheck-enable t)
  (add-to-list 'lsp-ui-doc-frame-parameters '(no-accept-focus . t))
  (define-key lsp-ui-mode-map [remap xref-find-definitions] #'lsp-ui-peek-find-definitions)
  (define-key lsp-ui-mode-map [remap xref-find-references] #'lsp-ui-peek-find-references))

Note: this configuration might have changed since I wrote this, see my init.el configuration for the most recent config.

The main reason for choosing lsp-mode over eglot is that it's in Debian (and eglot is not). (Apparently, eglot has more chance of being upstreamed, "when it's done", but I guess I'll cross that bridge when I get there.)

I already had lsp-mode partially set up in Emacs so I only had to do this small tweak to switch and change the prefix key (because s-l or mod is used by my window manager). I also had to pin LSP packages to bookworm here so that it properly detects pylsp (the older lsp-mode in Debian bullseye only supports pyls; pylsp is not packaged there).
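
For the record, such a pin can look roughly like the following; this is only a sketch (the package list and priority are assumptions, and it presumes bookworm is already present in sources.list alongside bullseye):

cat <<'EOF' | sudo tee /etc/apt/preferences.d/lsp-bookworm
Package: elpa-lsp-mode elpa-lsp-ui python3-pylsp
Pin: release n=bookworm
Pin-Priority: 990
EOF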

This won't do anything by itself: Emacs will need something to talk to. Those are called "servers" and are basically different programs, one for each programming language, that provide the magic.

Servers setup

The Emacs package provides a way (M-x lsp-install-server) to install some of them, but I prefer to manage those tools through Debian packages if possible, just like lsp-mode itself. Those are the servers I currently know of in Debian:

package languages
ccls C, C++, ObjectiveC
clangd C, C++, ObjectiveC
elpa-lsp-haskell Haskell
fortran-language-server Fortran
gopls Golang
python3-pyls Python
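
Installing a couple of them is then a regular apt operation; for example, picking the Python and Go servers from the table above:

sudo apt install python3-pyls gopls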

There might be more such packages, but those are surprisingly hard to find. I found a few with apt search "Language Server Protocol", but that didn't find ccls, for example, because that one just says "Language Server" in its description (searching for the broader term also found a few more pyls plugins, e.g. black support).

Note that the Python packages, in particular, need to be upgraded to their bookworm releases to work properly (here). It seems like there are some interoperability problems there that I haven't quite figured out yet. See also my Puppet configuration for LSP.

Finally, note that I have now completely switched away from Elpy to pyls, and I'm quite happy with the results. lsp-mode feels slower than elpy but I haven't done any of the performance tuning and this will improve even more with native compilation. And lsp-mode is much more powerful. I particularly like the "rename symbol" functionality, which ... mostly works.

Remaining work

Puppet and Ruby

I still have to figure out how to actually use this: I mostly spend my time in Puppet these days. There is no server listed in the Emacs lsp-mode language list, but there is one listed over at the upstream language list, the puppet-editor-services server.

But it's not packaged in Debian, and seems somewhat... involved. It could still be a good productivity boost. The Voxpupuli team have vim install instructions which also suggest installing solargraph, the Ruby language server, also not packaged in Debian.

Bash

I guess I do a bit of shell scripting from time to time nowadays, even though I don't like it. So the bash-language-server may prove useful as well.

Other languages

Here are more language servers available:

...
@anarcat April 27, 2022 - 20:29 • 28 days ago
building Debian packages under qemu with sbuild

I've been using sbuild for a while to build my Debian packages, mainly because it's what is used by the Debian autobuilders, but also because it's pretty powerful and efficient. Configuring it just right, however, can be a challenge. In my quick Debian development guide, I had a few pointers on how to configure sbuild with the normal schroot setup, but today I finished a qemu based configuration.

Why

I want to use qemu mainly because it provides better isolation than a chroot. I sponsor packages sometimes and while I typically audit the source code before building, it still feels like the extra protection shouldn't hurt.

I also like the idea of unifying my existing virtual machine setup with my build setup. My current VM setup is kind of all over the place (libvirt, vagrant, GNOME Boxes, etc.). I've been slowly converging on libvirt, however, and most solutions I use right now rely on qemu under the hood, certainly not chroots...

I could also have decided to go with containers like LXC, LXD, Docker (with conbuilder, whalebuilder, docker-buildpackage), systemd-nspawn (with debspawn), unshare (with schroot --chroot-mode=unshare), or whatever: I didn't feel those offer the level of isolation that is provided by qemu.

The main downside of this approach is that it is (obviously) slower than native builds. But on modern hardware, that cost should be minimal.

How

Basically, you need this:

sudo mkdir -p /srv/sbuild/qemu/
sudo apt install sbuild-qemu
sudo sbuild-qemu-create -o /srv/sbuild/qemu/unstable.img unstable https://deb.debian.org/debian

Then to make this used by default, add this to ~/.sbuildrc:

# run autopkgtest inside the schroot
$run_autopkgtest = 1;
# tell sbuild to use autopkgtest as a chroot
$chroot_mode = 'autopkgtest';
# tell autopkgtest to use qemu
$autopkgtest_virt_server = 'qemu';
# tell autopkgtest-virt-qemu the path to the image
# use --debug there to show what autopkgtest is doing
$autopkgtest_virt_server_options = [ '--', '/srv/sbuild/qemu/%r-%a.img' ];
# tell plain autopkgtest to use qemu, and the right image
$autopkgtest_opts = [ '--', 'qemu', '/srv/sbuild/qemu/%r-%a.img' ];
# no need to cleanup the chroot after build, we run in a completely clean VM
$purge_build_deps = 'never';
# no need for sudo
$autopkgtest_root_args = '';

Note that the above will use the default autopkgtest (1GB, one core) and qemu (128MB, one core) configuration, which might be a little low on resources. You probably want to be explicit about this, with something like this:

# extra parameters to pass to qemu
# --enable-kvm is not necessary, detected on the fly by autopkgtest
my @_qemu_options = ('--ram-size=4096', '--cpus=2');
# tell autopkgtest-virt-qemu the path to the image
# use --debug there to show what autopkgtest is doing
$autopkgtest_virt_server_options = [ @_qemu_options, '--', '/srv/sbuild/qemu/%r-%a.img' ];
$autopkgtest_opts = [ '--', 'qemu', @_qemu_options, '/srv/sbuild/qemu/%r-%a.img' ];

This configuration will:

  1. create a virtual machine image in /srv/sbuild/qemu for unstable
  2. tell sbuild to use that image to create a temporary VM to build the packages
  3. tell sbuild to run autopkgtest (which should really be the default)
  4. tell autopkgtest to use qemu for builds and for tests
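
Once that is in place, a build is just a matter of running sbuild from an unpacked source package; a minimal sketch (the package name is hypothetical):

cd hello/   # an unpacked source package with a debian/ directory
sbuild      # builds in a throwaway qemu VM, then runs autopkgtest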

Note that the VM created by sbuild-qemu-create has an unlocked root account with an empty password.

Other useful tasks

  • enter the VM to run tests; changes will be discarded (thanks Nick Brown for the sbuild-qemu-boot tip!):

     sbuild-qemu-boot /srv/sbuild/qemu/unstable-amd64.img
    

    That program is only shipped with bookworm and later; an equivalent command is:

     qemu-system-x86_64 -snapshot -enable-kvm -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -m 2048 -nographic /srv/sbuild/qemu/unstable-amd64.img
    

    The key argument here is -snapshot.

  • enter the VM to make permanent changes, which will not be discarded:

     sudo sbuild-qemu-boot --readwrite /srv/sbuild/qemu/unstable-amd64.img
    

    Equivalent command:

     sudo qemu-system-x86_64 -enable-kvm -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -m 2048 -nographic /srv/sbuild/qemu/unstable-amd64.img
    
  • update the VM (thanks lavamind):

     sudo sbuild-qemu-update /srv/sbuild/qemu/unstable-amd64.img
    
  • build in a specific VM regardless of the suite specified in the changelog (e.g. UNRELEASED, bookworm-backports, bookworm-security, etc):

     sbuild --autopkgtest-virt-server-opts="-- qemu /var/lib/sbuild/qemu/bookworm-amd64.img"
    

    Note that you'd also need to pass --autopkgtest-opts if you want autopkgtest to run in the correct VM as well:

     sbuild --autopkgtest-opts="-- qemu /var/lib/sbuild/qemu/unstable.img" --autopkgtest-virt-server-opts="-- qemu /var/lib/sbuild/qemu/bookworm-amd64.img"
    

    You might also need parameters like --ram-size if you customized it above.

And yes, this is all quite complicated and could be streamlined a little, but that's what you get when you have years of legacy and just want to get stuff done. It seems to me autopkgtest-virt-qemu should have a magic flag that starts a shell for you, but it doesn't look like that's a thing. When that program starts, it just says ok and sits there.

Maybe because the authors consider the above to be simple enough (see also bug #911977 for a discussion of this problem).

Live access to a running test

When autopkgtest starts a VM, it uses this funky qemu commandline:

qemu-system-x86_64 -m 4096 -smp 2 -nographic -net nic,model=virtio -net user,hostfwd=tcp:127.0.0.1:10022-:22 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -monitor unix:/tmp/autopkgtest-qemu.w1mlh54b/monitor,server,nowait -serial unix:/tmp/autopkgtest-qemu.w1mlh54b/ttyS0,server,nowait -serial unix:/tmp/autopkgtest-qemu.w1mlh54b/ttyS1,server,nowait -virtfs local,id=autopkgtest,path=/tmp/autopkgtest-qemu.w1mlh54b/shared,security_model=none,mount_tag=autopkgtest -drive index=0,file=/tmp/autopkgtest-qemu.w1mlh54b/overlay.img,cache=unsafe,if=virtio,discard=unmap,format=qcow2 -enable-kvm -cpu kvm64,+vmx,+lahf_lm

... which is a typical qemu commandline, I'm sorry to say. That gives us a VM with those settings (paths are relative to a temporary directory, /tmp/autopkgtest-qemu.w1mlh54b/ in the above example):

  • the shared/ directory is, well, shared with the VM
  • port 10022 is forwarded to the VM's port 22, presumably for SSH, but no SSH server is started by default
  • the ttyS0 and ttyS1 UNIX sockets are mapped to the first two serial ports (use nc -U to talk with those)
  • the monitor UNIX socket is a qemu control socket (see the QEMU monitor documentation, also nc -U)

In other words, it's possible to access the VM with:

nc -U /tmp/autopkgtest-qemu.w1mlh54b/ttyS0

The nc socket interface is ... not great, but it works well enough. And you can probably fire up an SSHd to get a better shell if you feel like it.
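
For example, a hypothetical session could look like this, assuming you first start an SSH server inside the VM over the serial console (the port matches the hostfwd rule in the qemu commandline above):

# inside the VM, via nc -U .../ttyS0: apt install openssh-server
# then, from the host, use the forwarded port:
ssh -p 10022 root@127.0.0.1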

Nitty-gritty details no one cares about

Fixing hang in sbuild cleanup

I'm having a hard time making heads or tails of this, but please bear with me.

In sbuild + schroot, there's this notion that we don't really need to clean up after ourselves inside the schroot, as the schroot will just be deleted anyways. This behavior seems to be handled by the internal "Session Purged" parameter.

At least in lib/Sbuild/Build.pm, we can see this:

my $is_cloned_session = (defined ($session->get('Session Purged')) &&
    $session->get('Session Purged') == 1) ? 1 : 0;

[...]

if ($is_cloned_session) {
    $self->log("Not cleaning session: cloned chroot in use\n");
} else {
    if ($purge_build_deps) {
        # Removing dependencies
        $resolver->uninstall_deps();
    } else {
        $self->log("Not removing build depends: as requested\n");
    }
}

The schroot builder defines that parameter as:

    $self->set('Session Purged', $info->{'Session Purged'});

... which is ... a little confusing to me. $info is:

my $info = $self->get('Chroots')->get_info($schroot_session);

... so I presume that depends on whether the schroot was correctly cleaned up? I stopped digging there...

ChrootUnshare.pm is way more explicit:

$self->set('Session Purged', 1);

I wonder if we should do something like this with the autopkgtest backend. I guess people might technically use it with something other than qemu, but qemu is the typical use case of the autopkgtest backend, in my experience. Or at least certainly with things that clean up after themselves. Right?

For some reason, before I added this line to my configuration:

$purge_build_deps = 'never';

... the "Cleanup" step would just completely hang. It was quite bizarre.

Digression on the diversity of VM-like things

There are a lot of different virtualization solutions one can use (e.g. Xen, KVM, Docker or Virtualbox). I have also found libguestfs to be useful to operate on virtual images in various ways. Libvirt and Vagrant are also useful wrappers on top of the above systems.

There are particularly a lot of different tools which use Docker, Virtual machines or some sort of isolation stronger than chroot to build packages. Here are some of the alternatives I am aware of:

Take, for example, Whalebuilder, which uses Docker to build packages instead of pbuilder or sbuild. Docker provides more isolation than a simple chroot: in whalebuilder, packages are built without network access and inside a virtualized environment. Keep in mind there are limitations to Docker's security, and that pbuilder and sbuild do build under a different user, which limits the security issues with building untrusted packages.

On the upside, some of those things are being fixed: whalebuilder is now an official Debian package (whalebuilder) and has added the feature of passing custom arguments to dpkg-buildpackage.

None of those solutions (except the autopkgtest/qemu backend) are implemented as a sbuild plugin, which would greatly reduce their complexity.

I was previously using Qemu directly to run virtual machines, and had to create VMs by hand with various tools. This didn't work so well so I switched to using Vagrant as a de-facto standard to build development environment machines, but I'm returning to Qemu because it uses a similar backend as KVM and can be used to host longer-running virtual machines through libvirt.

The great thing now is that autopkgtest has good support for qemu and sbuild has bridged the gap and can use it as a build backend. I originally had found those bugs in that setup, but all of them are now fixed:

  • #911977: sbuild: how do we correctly guess the VM name in autopkgtest?
  • #911979: sbuild: fails on chown in autopkgtest-qemu backend
  • #911963: autopkgtest qemu build fails with proxy_cmd: parameter not set
  • #911981: autopkgtest: qemu server warns about missing CPU features

So we have unification! It's possible to run your virtual machines and Debian builds using a single VM image backend storage, which is no small feat, in my humble opinion. See the sbuild-qemu blog post for the announcement.

Now I just need to figure out how to merge Vagrant, GNOME Boxes, and libvirt together, which should be a matter of placing images in the right place... right? See also hosting.

pbuilder vs sbuild

I was previously using pbuilder and switched in 2017 to sbuild. AskUbuntu.com has a good comparison between pbuilder and sbuild that shows they are pretty similar. The big advantage of sbuild is that it is the tool in use on the buildds, and it's written in Perl instead of shell.

My concerns about switching were POLA (I'm used to pbuilder), the fact that pbuilder runs as a separate user (works with sbuild as well now, if the _apt user is present), and setting up COW semantics in sbuild (can't just plug cowbuilder there, need to configure overlayfs or aufs, which was non-trivial in Debian jessie).

Ubuntu folks, again, have more documentation there. Debian also has extensive documentation, especially about how to configure overlays.

I was ultimately convinced by stapelberg's post on the topic which shows how much simpler sbuild really is...

Who

Thanks lavamind for the introduction to the sbuild-qemu package.

...
@blog April 26, 2022 - 00:00 • 30 days ago
New Alpha Release: Tor Browser 11.5a9 (Windows/macOS/Linux)

Tor Browser 11.5a9 is now available from the Tor Browser download page and also from our distribution directory.

Tor Browser 11.5a9 updates Firefox on Windows, macOS, and Linux to 91.8.0esr.

We use the opportunity as well to update various other components of Tor Browser:

  • NoScript 11.4.4
  • Tor Launcher 0.2.34
  • Tor 0.4.7.5-alpha

The full changelog since Tor Browser 11.5a8 is:

...
@blog April 25, 2022 - 00:00 • 1 months ago
Malicious relays and the health of the Tor network

Running relays is a significant contribution to our project and we've designed that process so that the barrier of entry is low, making it possible for a variety of people with different backgrounds to participate. This openness is important as it makes our network (and the privacy guarantees it offers) more robust and resilient to attacks. However, that low threshold of contributing to our network also makes it easier for malicious operators to attack our users, e.g. via Man-in-the-Middle (MitM) attacks at exit nodes.

This blog post explains what we're doing to detect malicious actors (and remove their relays), how we developed these strategies, and what we're working on to make it harder for bad operators to run attacks. Additionally, we want to shine some light on this part of our day-to-day work at Tor. Because this is an arms race, we have to balance being transparent with effective detection of malicious actors. In this post we hope to offer more transparency about our approach without compromising the methods we use to keep our users safe.

What does bad-relay work look like?

Whether a relay is "bad" or "malicious" is often not as clear-cut as it might sound at first. Maybe the relay in question is just misconfigured and is, e.g., missing family settings (for the family configuration option see section 5 in our post-install instructions). Does that mean the operator has nefarious intentions? To help us react to those situations, we have developed a set of criteria and a process for dealing with potentially bad relays. Now, when we get a report or detect suspicious behavior, we follow that process. First, we try to fix the underlying issue together with the operator, and if that does not work or the behavior is outright malicious, we propose the relay for rejection (or maybe assigning the badexit flag).

There is an important point here to be made about the process of removing relays from the network: while we do have staff who are watching the network and reviewing reports from volunteers, it is not Tor staff who reject relays (and that is by design). Instead, rejecting relays from the network is the job of the directory authorities. Directory authorities are special-purpose relays that maintain a list of currently-running relays and periodically publish a consensus together with the other directory authorities. Directory authorities are run mostly by trusted volunteers from our community. They rely on recommendations from the Network Health team to evaluate and address malicious relays, but they only reject a potentially malicious relay if the majority of directory authorities agree to do so.

KAX17 and failures in the past

In December 2020, we noticed a weird pattern in our relay graphs. This pattern showed that dozens of relays would join the network at 00:00:00 on the first of a month, just to slowly vanish again. This had been ongoing for months before we saw that pattern, and when we noticed it, we speculated that we were witnessing a bug in Tor's hibernation code.

Then in early September 2021, based on our own monitoring and investigation, we saw some unusual relay additions to our network that caught our attention: relays that seemed to belong to a single operator were showing up, but they did not have a family declaration. Moreover, there was no way to contact the operator as the contact information field was empty. Based on the set of criteria for dealing with potentially bad relays we had established, we proposed that these relays be rejected from the network, and the directory authorities approved this rejection. As soon as these relays were rejected, though, similar relays kept showing up on the network which then in turn got rejected as well.

At the end of October 2021, we got a tip from an anonymous Tor user who helped us to eventually detect and remove a relay group that was later dubbed "KAX17". We don't know who was running those relays or what they were doing or trying to achieve (their behavior was quite different from the more well-known attack where malicious exit relay operators try to do MitM attacks on our users). However, after we kept a close eye on KAX17 relays, both the spikes at the beginning of each month and the relay flooding stopped, and it's fair to assume that the KAX17 operator was responsible for both of them as well.

The KAX17 operator was active for months, maybe years, on the network. How is that possible?

There are three important reasons for that failure on our side worth highlighting here:

Firstly, while it is always way easier to connect different dots like the ones above in hindsight, our failure to do so should remind us that we invested too little effort in our bad-relay tooling over the last several years. In particular, we did not monitor the network closely enough to understand which kind of tools we might still be missing or would be needed most.

Secondly, fighting and removing malicious relays requires trust between all involved parties (volunteers, Tor staff, and directory authorities) and that trust was missing from time to time, creating friction and frustration, which in turn made the work to detect and reject bad relays even harder.

Thirdly, and most importantly, we were not set up as an organization to work effectively on detecting and removing malicious relays: we lacked resources to broaden our detection mechanisms and to enhance our relay policies, network monitoring, and community outreach. Even though the previous two reasons were affected by this as well, the organizational shortcoming deserves to be a separate item on this list.

Lessons learned

In January 2020, we launched the Network Health team at Tor with the goal of getting our work related to monitoring the network and the health of our relays up to speed and organized. This included bad-relay work. While having dedicated staff working in that area is important, it is not, by itself, a remedy for the third item on the list above. We needed to set up policies for our day-to-day work that matched expectations related to the trust angle mentioned above. Additionally, in the past, the Tor Project teams worked in silos. We needed to increase coordination between teams to do this work effectively. In particular, working closely with the Community team has given us another very efficient tool in our fight against malicious relays.

Beyond increasing coordination and organization, we developed new tools that help in the fight against exit relays doing MitM attacks on our users and in proactively detecting groups of potentially malicious relays. We believe that this proactive stance has significantly raised the bar for malicious operators entering the network during the past months, which is an exciting development in this arms race.

Many of the improvements discussed above come from organizing our work in new ways, like opening lines of communication and coordination between teams. These improvements are also possible because of the investment of our donors and funders. We were able to increase capacity and stability with additional funding and the support of our community. Thank you to everyone who has made a donation in the last few years; you've made it possible for us to fight back and keep users safe.

What is coming up next?

We believe we have ramped up our bad-relay work significantly during the past year and we are moving in promising directions. Coordinating with the Tor Browser team resulted in Tor Browser 11.5 (due later this year) that will ship an HTTPS-Only mode enabled by default, which should help tremendously in the arms race with exit relays trying to MitM user connections: while we do have several defenses in place against those attacks, we believe that HTTPS-Only mode in Tor Browser will be a game-changer as it will strongly reduce incentives to spin up exit relays for MitM attacks in the first place. You can try out a recent desktop alpha release where this change is already getting some testing (and please report any bugs you find so we can improve the HTTPS-Only user experience where needed).

We are particularly excited about upcoming funding for work on more tools in our bad-relay toolbox, including new and improved ways of monitoring our network, but also building a stronger relay operator community, given that just relying on monitoring tools is not a sufficient strategy in the bad-relay area. A stronger relay community is an essential building block in our longer-term plan to limit large-scale attacks on the network, which we hope to make progress on in the upcoming months and years.

So, stay tuned!

Please remember: if you witness malicious relays in our network, please report them to bad-relays[@]lists[.]torproject[.]org for our users' safety. We need everyone to stay vigilant and help keep our users safe, as malicious operators are trying to adapt their strategies to get around our defenses.

...
@anarcat April 13, 2022 - 20:56 • 1 months ago
Tuning my wifi radios

After listening to an episode of the 2.5 admins podcast, I realized there was some sort of low-hanging fruit I could pick to better tune my WiFi at home. You see, I'm kind of a fraud in WiFi: I only started a WiFi mesh in Montreal (now defunct); I don't really know how any of that stuff works. So I was surprised to hear one of the podcast hosts say "it's all about airtime" and "you want to reduce the power on your access points" (APs). It seemed like sound advice: better bandwidth means less time on air, which means fewer collisions and less latency, and less power also means fewer collisions. Worth a try, right?

Frequency

So the first thing I looked at was WifiAnalyzer to see if I had any optimisation I could do there. Normally, I try to avoid having nearby APs on the same frequency to avoid collisions, but who knows, maybe I had messed that up. And turns out I did! Both APs were on "auto" for 5GHz, which typically means "do nothing or worse".

5GHz is really interesting, because, in theory, there are LOTS of channels to pick from, it goes up to 196!! And both my APs were on 36, what gives?

So the first thing I did was to set it to channel 100, as there was that long gap in WifiAnalyzer where no other AP was. But that just broke 5GHz on the AP. The OpenWRT GUI (luci) would just say "wireless not associated" and the ESSID wouldn't show up in a scan anymore.

At first, I thought this was a problem with OpenWRT or my hardware, but I could reproduce the problem with both my APs: a TP-Link Archer A7 v5 and a Turris Omnia (see also my review).

As it turns out, that's because that range of the WiFi band interferes with trivial things like satellites and radar, which make the actually very useful radar maps look like useless christmas trees. So those channels require DFS to operate. DFS works by first listening on the frequency for a certain amount of time (1-2 minutes, but could be as high as 10) to see if there's something else transmitting at all.

So typically, that means they just don't operate at all in those bands, especially if you're near any major city which generally means you are near a weather radar that will transmit on that band.

In the system logs, if you have such a problem, you might see this:

Apr  9 22:17:39 octavia hostapd: wlan0: DFS-CAC-START freq=5500 chan=100 sec_chan=1, width=0, seg0=102, seg1=0, cac_time=60s
Apr  9 22:17:39 octavia hostapd: DFS start_dfs_cac() failed, -1

... and/or this:

Sat Apr  9 18:05:03 2022 daemon.notice hostapd: Channel 100 (primary) not allowed for AP mode, flags: 0x10095b NO-IR RADAR
Sat Apr  9 18:05:03 2022 daemon.warn hostapd: wlan0: IEEE 802.11 Configured channel (100) not found from the channel list of current mode (2) IEEE 802.11a
Sat Apr  9 18:05:03 2022 daemon.warn hostapd: wlan0: IEEE 802.11 Hardware does not support configured channel

Here, it clearly says RADAR (in all caps too, which means it's really important). NO-IR is also important; I'm not sure what it means, but it could be that you're not allowed to transmit in that band because of other local regulations.

There might be a way to work around those by changing the "region" in the Luci GUI, but I didn't mess with that, because I figured that other devices will have that already configured. So using a forbidden channel might make it more difficult for clients to connect (although it's possible this is enforced only on the AP side).

In any case, 5GHz is promising, but in reality, you only get from channel 36 (5.170GHz) to 48 (5.250GHz), inclusively. Fast counters will notice that is exactly 80MHz, which means that if an AP is configured for that hungry, all-powerful 80MHz, it will effectively take up all 5GHz channels at once.

This, in other words, is as bad as 2.4GHz, where you also have only two 40MHz channels. (Really, what did you expect: this is an unregulated frequency controlled by commercial interests...)

So the first thing I did was to switch to 40MHz. This gives me two distinct channels in 5GHz at no noticeable bandwidth cost. (In fact, I couldn't find hard data on what the bandwidth ends up being on those frequencies, but I could still get 400Mbps which is fine for my use case.)
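
For the record, this is roughly what that change looks like with OpenWRT's uci command line instead of the Luci GUI; a sketch only, assuming the 5GHz radio is radio0 (check /etc/config/wireless for the actual section name):

uci set wireless.radio0.channel='36'
uci set wireless.radio0.htmode='VHT40'   # 40MHz instead of 80MHz
uci commit wireless
wifi reload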

Power

The next thing I did was to fiddle with power. By default, both radios were configured to transmit as much power as they needed to reach clients, which means that if a client gets farther away, the AP would boost its transmit power, which, in turn, would mean the client would still connect instead of failing and properly roaming to the other AP.

The higher power also means more interference with neighbors and other APs, although that matters less if they are on different channels.

On 5GHz, power was about 20dBm (100 mW) -- and more on the Turris! -- when I first looked, so I tried to lower it drastically to 5dBm (3mW) just for kicks. That didn't work so well, so I bumped it back up to 14 dBm (25 mW) and that seems to work well: clients hit about -80dBm when they get far enough from the AP, which gets close to the noise floor (and where the neighbor APs are), which is exactly what I want.
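
Again, the equivalent uci sketch (same caveat as above about the radio name; the value is in dBm):

uci set wireless.radio0.txpower='14'
uci commit wireless
wifi reload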

On 2.4GHz, I lowered it even further, to 10 dBm (10mW): since it's better at going through walls, I figured it would need less power. And anyways, I'd rather people use the 5GHz APs, so maybe that will act as an encouragement to switch. I was still able to connect correctly to the APs at that power as well.

Other tweaks

I disabled the "Allow legacy 802.11b rates" setting in the 5GHz configuration. According to this discussion:

Checking the "Allow b rates" affects what the AP will transmit. In particular it will send most overhead packets including beacons, probe responses, and authentication / authorization as the slow, noisy, 1 Mb DSSS signal. That is bad for you and your neighbors. Do not check that box. The default really should be unchecked.

This, in particular, "will make the AP unusable to distant clients, which again is a good thing for public wifi in general". So I just unchecked that box and I feel happier now. I didn't make tests to see the effect separately however, so this is mostly just a guess.

...
@ooni April 12, 2022 - 00:00 • 1 months ago
New Launch: OONI Measurement Aggregation Toolkit (MAT)
Today the Open Observatory of Network Interference (OONI) team is thrilled to announce the public launch of the OONI Measurement Aggregation Toolkit (MAT)! The MAT is a tool that enables you to create your own custom charts based on aggregate views of real-time OONI data collected from around the world. Use the MAT to track internet censorship worldwide based on real-time OONI data! About the MAT ...
@blog April 7, 2022 - 00:00 • 2 months ago
New Release: Tor Browser 11.0.10 (Windows, macOS, Linux)

Tor Browser 11.0.10 is now available from the Tor Browser download page and also from our distribution directory.

This version includes important security updates to Firefox:

Tor Browser 11.0.10 updates Firefox on Windows, macOS, and Linux to 91.8.0esr.

We use the opportunity as well to update various other components of Tor Browser:

  • NoScript 11.4.3

The full changelog since Tor Browser 11.0.9 is:

...
@blog April 4, 2022 - 00:00 • 2 months ago
Arti 0.2.0 is released: Your somewhat-stable API is here!

Arti is our ongoing project to create a working embeddable Tor client in Rust. It’s not ready to replace the main Tor implementation in C, but we believe that it’s the future.

Right now, our focus is on making Arti production-quality, by stress-testing the code, hunting for likely bugs and adding missing features that we know from experience that users will need. We're going to try not to break backward compatibility too much, but we'll do so when we think it's a good idea.

What's new in 0.2.0?

For a complete list of changes, have a look at the CHANGELOG.

From a user's point of view, most of the changes in this version of Arti are for improved performance and reliability. We've started doing experiments on different kinds of network issues, and have improved Arti's behavior on IPv6-only networks and many kinds of network failure. We also now use less memory for directory storage (on the order of several megabytes in a running client).

The flashiest new user-facing feature is the dns_port option (like Tor's DnsPort) that you can use to send some types of DNS requests over Tor.

For developers, the most important change to be aware of is the new configuration code. When we're done, we think it will be far easier to configure an application that uses Arti. (It's a work in progress, and will unfortunately probably lead to more API changes between now and 1.0.0.)

Other important developer-facing features are:

  • A new, far more flexible API for stream isolation. It allows applications to create sophisticated sets of isolation rules, beyond those currently possible with the C Tor implementation.
  • An API for a "dormant mode" for inactive clients. (When in dormant mode, scheduled events are suspended.)
  • Support for replacing the directory provider code with an arbitrary implementation, for applications that need specialized directory download and caching behavior.

There are also a bunch of smaller features, bugfixes, and infrastructure improvements; again, see the CHANGELOG for a more complete list.

And what's next?

Between now and our 1.0.0 milestone in September, we're aiming to make Arti a production-quality Tor client for direct internet access. (Onion services aren't funded yet, but we hope to change that soon.)

To do so, we need to bring Arti up to par with the C tor implementation in terms of its network performance, CPU usage, resiliency, and security features. You can follow our progress on our 1.0.0 milestone.

We still plan to continue regular releases between now and then.

Here's how to try it out

We rely on users and volunteers to find problems in our software and suggest directions for its improvement. Although Arti isn't yet ready for production use, you can test it as a SOCKS proxy (if you're willing to compile from source) and as an embeddable library (if you don't mind a little API instability).

Assuming you've installed Arti (with cargo install arti, or directly from a cloned repository), you can use it to start a simple SOCKS proxy for making connections via Tor with:

$ arti proxy -p 9150

and use it more or less as you would use the C Tor implementation!

(It doesn't support onion services yet. If compilation doesn't work, make sure you have development files for libsqlite installed on your platform.)
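
For a quick smoke test, you can point any SOCKS-aware client at that port; for example, with curl (shown here only as an illustration):

# asks the Tor check service whether the connection really went through Tor
$ curl --socks5-hostname 127.0.0.1:9150 https://check.torproject.org/api/ip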

If you want to build a program with Arti, you probably want to start with the arti-client crate. Be sure to check out the examples too.

For more information, check out the README file. (For now, it assumes that you're comfortable building Rust programs from the command line). Our CONTRIBUTING file has more information on installing development tools, and on using Arti inside of Tor Browser. (If you want to try that, please be aware that Arti doesn't support onion services yet.)

When you find bugs, please report them on our bugtracker. You can request an account or report a bug anonymously.

And if this documentation doesn't make sense, please ask questions! The questions you ask today might help improve the documentation tomorrow.

Call for feedback

Our priority for the coming months is to make Arti a production-quality Tor client, for the purposes of direct connections to the internet. (Onion services will come later.) We know some of the steps we'll need to take to get there, but not all of them: we need to know what's missing for your use-cases.

Whether you're a user or a developer, please give Arti a try, and let us know what you think. The sooner we learn what you need, the better our chances of getting it into an early milestone.

Acknowledgments

Thanks to everybody who has contributed to this release, including Christian Grigis, Dimitris Apostolou, Lennart Kloock, Michael, solanav, Steven Murdoch, and Trinity Pointard.

And thanks, of course, to Zcash Community Grants (formerly Zcash Open Major Grants (ZOMG)) for funding this project!

...
@kushal April 1, 2022 - 12:03 • 2 months ago
Securing verybad web application with only systemd

In my last blog post I talked about the verybad web application. It has multiple major security holes, which allow anyone to do remote code execution or read/write files on the server. Look at the source code to see what all you can do.

I am running one instance in public at http://verybad.kushaldas.in:8000/, and then I asked Twitter to see if anyone could get access. The only difference is that this service has some of the latest security mitigations from systemd on a Fedora 35 box.

The service has been up for a few days now, and a few people tried for hours. One person managed to read the verybad.service file after a few hours of different tries. This allowed me to look into other available options from systemd. The rest of the major protections come from the DynamicUser=yes configuration in systemd. This enables multiple other protections (which can not be turned off), like:

  • SUID/SGID files can not be created or executed
  • Temporary filesystem is private to the service
  • The entire file system hierarchy is mounted read-only except a few places
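
To give an idea of what this looks like in practice, here is a minimal sketch of such a unit; the unit name, binary path, and extra directives are my assumptions for illustration, not the actual verybad.service, and DynamicUser=yes is the key line:

sudo tee /etc/systemd/system/verybad.service > /dev/null <<'EOF'
[Unit]
Description=very bad web application (demo)

[Service]
# path to the compiled binary is an assumption
ExecStart=/usr/local/bin/verybad
# the option discussed above; implies the protections listed
DynamicUser=yes
# a few other mitigations (some are already implied by DynamicUser)
ProtectSystem=strict
ProtectHome=yes
NoNewPrivileges=yes

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl start verybad.service

Running systemd-analyze security verybad.service afterwards gives a quick score of how exposed the service still is.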

systemd can also block exec mapping of shared libraries or executables. This way we can block any random command execution, while still allowing, say, the date command to execute.

Please have a look at the man page and learn about the many options systemd now provides. I am finding this very useful as it takes such a small amount of time to learn and use. The credit goes to Lennart and the rest of the maintainers.

Oh, just in case you are wondering, for a real service you should enable this along with other existing mechanisms, like SELinux or AppArmor.

...
@anarcat April 1, 2022 - 01:50 • 2 months ago
Salvaged my first Debian package

I finally salvaged my first Debian package, python-invoke. As part of ITS 964718, I moved the package from the Openstack Team to the Python team. The Python team might not be super happy with it, because it's breaking some of its rules, but at least someone (i.e. me) is actively working on (and using) the package.

Wait what

People not familiar with Debian will not understand anything in that first paragraph, so let me expand. Know-it-all Debian developers (you know who you are) can skip to the next section.

Traditionally, the Debian project (my Linux-based operating system of choice) has prided itself on the self-managed, anarchistic organisation of its packaging. Each package maintainer is the lord of their little kingdom. Some maintainers like to accumulate lots of kingdoms to rule over.

(Yes, it really doesn't sound like anarchism when you put it like that. Yes, it's complicated: there's a constitution and voting involved. And yes, we're old.)

Therefore, it's really hard to make package maintainers do something they don't want. Typically, when things go south, someone makes a complaint to the Debian Technical Committee (CTTE) which is established by the Debian constitution to resolve such conflicts. The committee is appointed by the Debian Project leader, elected each year (and there's an election coming up if you haven't heard).

Typically, the CTTE will then vote and formulate a decision. But here's the trick: maintainers are still free to do whatever they want after that, in a sense. It's not like the CTTE can just break down doors and force maintainers to type code.

(I won't go into the details of the why of that, but it involves legal issues and, I think, something about the Turing halting problem. Or something like that.)

Anyways. The point is all that is super heavy and no one wants to go there...

(Know-it-all Debian developers, I know you are still reading this anyways and disagree with that statement, but please, please, make it true.)

... but sometimes, packages just get lost. Maintainers get distracted, or busy with something else. It's not that they want to abandon their packages. They love their little fiefdoms. It's just there was a famine or a war or something and everyone died, and they have better things to do than put up fences or whatever.

So clever people in Debian found a better way of handling such problems than waging war in the poor old CTTE's backyard. It's called the Package Salvaging process. Through that process, a maintainer can propose to take over an existing package from another maintainer, if a certain set of conditions is met and a specific process is followed.

Normally, taking over another maintainer's package is basically a war declaration, rarely seen in the history of Debian (yes, I do think it happened!), as rowdy as ours is. But through this process, it seems we have found a fair way of going forward.

The process is basically like this:

  1. file a bug proposing the change
  2. wait three weeks
  3. upload a package making the change, with another week delay
  4. you now have one more package to worry about

Easy right? It actually is! Process! It's magic! It will cure your babies and resurrect your cat!

So how did that go?

It went well!

The old maintainer was actually fine with the change because his team wasn't using the package anymore anyways. He asked to be kept as an uploader, which I was glad to oblige.

(He replied a few months after the deadline, but I wasn't in a rush anyways, so that doesn't matter. It was polite for him to answer, even if, technically, I was already allowed to take it over.)

What happened next is less shiny for me though. I totally forgot about the ITS, even after the maintainer reminded me of its existence. See, the thing is the ITS doesn't show up on my dashboard at all. So I totally forgot about it (yes, twice).

In fact, the only reason I remembered it was that I got into the process of formulating another ITS (1008753, trocla) and I was trying to figure out how to write the email. Then I remembered: "hey wait, I think I did this before!" followed by "oops, yes, I totally did this before and forgot for 9 months".

So, not great. Also, the package is still not in perfect shape. I was able to upload the pending upstream version (1.5.0) to clear out the ITS, basically. And then there are already two new upstream releases to upload, so I pushed 1.7.0 to experimental as well, for good measure.

Unfortunately, I still can't enable tests because everything is on fire, as usual.

But at least my kingdom is growing.

Appendix

Just in case someone didn't notice the hyperbole, I'm not a monarchist promoting feudalism as a practice to manage a community. I do not intend to really "grow my kingdom" and I think the culture around "property" of "packages" is kind of absurd in Debian. I kind of wish it would go away.

(Update: It has also been pointed out that I might have made Debian seem more confrontational than it actually is. And it's kind of true: most work and interactions in Debian actually go fine, it's only a minority of issues that degenerate into conflicts. It's just that they tend to take up a lot of space in the community, and I find that particularly draining. And I think our "package ownership" culture is part of at least some of those problems.)

Team maintenance, the LowNMU process, and low threshold adoption processes are all steps in the right direction, but they are all opt-in. At least the package salvaging process is somewhat more... uh... coercive? Or at least it allows the community to step in and do the right thing, in a sense.

We'll see what happens with the coming wars around the recent tech committee decision, which are bound to touch on that topic. (Hint: our next drama is called "usrmerge".) Hopefully, LWN will write a brilliant article to sum it up for us so that I don't have to go through the inevitable debian-devel flamewar to figure it out. I already wreaked havoc on the #debian-devel IRC channel asking newbie questions, so I won't stir that mud any further for now.

(Update: LWN, of course, did write an article about usrmerge in Debian. I will read it soon and can then tell you if it's brilliant, but they are typically spot on.)

...
@kushal March 29, 2022 - 07:15 • 2 months ago
Introducing Very Bad Web Application

I am planning to add a few chapters on securing services in my Linux Command Line book. But, to make it practical & hands on, I needed one real application which the readers can deploy and secure. I needed something simple, say one single binary so that it becomes easier to convert it into a proper systemd service.

I decided to write one in Rust :) This also helps to showcase that one can write totally insecure code even in Rust (or any other language). Let me introduce Very Bad Web application. The README contains the build instructions. The index page shows the available API.

Issues in the service

The service has the following 3 major issues:

  • Directory traversal
  • Arbitrary file read
  • Remote code execution

I am currently updating the systemd (services) chapter in my book to show how to secure the service using only the features provided by the latest systemd. In the future I will also have chapters on SELinux and AppArmor, where we will learn how to secure the service using those two options.

If you think I should add some other nice security holes in this application, please feel free to suggest :)

...
@anarcat March 28, 2022 - 14:11 • 2 months ago
What is going on with web servers

I stumbled upon this graph recently, which is w3techs.com's graph of "Historical yearly trends in the usage statistics of web servers". It seems I hadn't looked at it in a long while, because I was surprised on many levels:

  1. Apache is now second, behind Nginx, since ~2022 (so that's really new at least)

  2. Cloudflare "server" is third, ahead of the traditional third-place holder (Microsoft IIS) - I somewhat knew that Cloudflare was hosting a lot of stuff, but I somehow didn't expect to see it there at all for some reason

  3. I had to look up what LiteSpeed was (and it's not a bike company): it's a drop-in replacement (!?) for the Apache web server (not a fork, a bizarre idea, but one which seems to be gaining a lot of speed recently, possibly because of its support for QUIC and HTTP/3)

So there. I'm surprised. I guess the stats should be taken with a grain of salt because they only partially correlate with Netcraft's numbers which barely mention LiteSpeed at all. (But they do point at a rising share as well.)

Netcraft also puts Nginx's first place earlier in time, around April 2019, which is about when Microsoft's IIS took a massive plunge down. That is another thing that doesn't match w3techs' graphs at all.

So it's all lies and statistics, basically. Moving on.

Oh and of course, the two most popular web servers, regardless of the source, are packaged in Debian. So while we're working on statistics and just making stuff up, I'm going to go ahead and claim all of this stuff runs on Linux and that BSD is dead. Or something like that.

...