Untitled Blog.

Who is resolving my names?

2023-10-09

Let's say I have a Linux network namespace, with just a Wireguard interface moved from the main netns, as described in the Wireguard documentation; all traffic must go through Wireguard, as it's the only way out. This solution is fairly neat, since it lets you isolate some processes and force them to run through a Wireguard VPN, fully transparently to the application.

Now, onto name resolution. I use NetworkManager, and I don't use systemd-resolved yet, so in theory, name resolution should go through whatever nameserver is provided in /etc/resolv.conf, by glibc.

I want to make sure DNS goes to the correct nameserver, the one on the other side of the VPN, when I enter into this namespace. OK, easy enough: ip netns exec already has functionality for this. If you add a file at /etc/netns/[namespace]/resolv.conf, it will bind mount it in a new filesystem namespace to /etc/resolv.conf. So if we want to do this, it's as easy as using ip netns exec, or at least, doing the same steps manually.

Right?

Wrong

Using online DNS leak detection tools, I could tell from a browser running within the namespace that it was somehow calling out to the other, original nameserver, discovered via DHCP. At this point it occurred to me that maybe it wouldn't be a bad idea to just force all unencrypted DNS traffic within the netns to go to the desired nameserver, for good measure. So I tried to patch this leak using iptables, like you would.

ip netns exec "${NAMESPACE}" \
  iptables -t nat -A OUTPUT -p udp --dport 53 -j DNAT --to "${NAMESERVER}"
ip netns exec "${NAMESPACE}" \
  iptables -t nat -A OUTPUT -p tcp --dport 53 -j DNAT --to "${NAMESERVER}"

It's a pretty heavy-handed solution, but it's essentially guaranteed to work. Applications can of course use DoT or DoH or any other mechanism they choose over TCP/UDP to resolve names, if they want, but at the very least, our glibc resolver should definitely, 100%, absolutely for sure use the one that we want it to.

...

End of the blog post, right? ...Right?

Still wrong

I thought I knew how my machine's DNS was configured. After all, it's simple: glibc is configured via /etc/nsswitch.conf. I have some crap in there (like mdns and etc.) but mainly, the hosts: line is just dns. So in theory, that means an application using glibc is going to read /etc/resolv.conf, then go and call out to the nameserver using UDP or TCP and everything is good.

Except it's not. Because, when I run this command: sudo ip netns exec [namespace] curl example.com, I can see clearly via tcpdump port 53 that it is in fact, somehow, in a namespace with only Wireguard access, making a DNS request via the primary network interface in the default network namespace:

19:31:11.573718 IP xxxxx.xxxxx > xxxxx.local.domain: xxxxx+ [1au] A? example.com. (40)

Oh dear. This is not just a matter of using the wrong DNS server. This is straight-up leaking through the network namespace!

Finding out what's going on

At this point it feels like something impossible is happening. Clearly, the process doing the name resolution is not the one requesting the name resolution. However, using applications I know for sure are using glibc to resolve names, I can see it magically teleport outside of the namespace and resolve the names using the default interface. What exactly is going on here?

I decided the next plan of attack was to go in with strace and see what exactly its calling on a syscall level. I am, at this point, paranoid that this rabbit-hole will go on for weeks until someone just simply tells me the solution to the riddle.

However, much to my surprise, and frankly, relief, this immediately led me to the answer.

connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) =

...It was nscd.

What's Nscd?

Nscd is a name service caching daemon. It's kind of weird to see it here. Caching DNS requests on local systems has fallen out of favor for a variety of reasons. So, nscd is something you'd expect to see on older Linux machines with gpm for mouse support inside of virtual terminals and SysVinit and whatnot.

Nscd is a daemon that provides a cache for the most common name service requests. The default configuration file, /etc/nscd.conf, determines the behavior of the cache daemon. See nscd.conf(5).

That said, I'm pretty sure my DNS requests are, in fact, not cached, and I never set up nscd myself. What is it doing on my modern Linux machine?

I use NixOS, so most likely I can figure out what's going on by grepping for nscd in Nixpkgs. As it turns out, NixOS has a specific reason to use nscd:

Whether to enable the Name Service Cache Daemon. Disabling this is strongly discouraged, as this effectively disables NSS Lookups from all non-glibc NSS modules, including the ones provided by systemd.

Of course. Dynamically linked NSS modules constitute global state, so NixOS doesn't want to provide NSS modules that way. So instead, they rely on nscd: nscd can be provided all of the NSS modules, and everything else can just connect to it over a socket. At least I think that's the idea here.

This was surprisingly easy to figure out once I realized what was going on, but how do we fix it?

A workaround

For the most part, I don't really need a fancy solution to the NSS problem. I just want to be able to ensure that my namespaces stay isolated in their Wireguard VPN world, so that everything works as expected.

In my case, I'm not actually using ip netns exec, so I can control exactly what happens when we enter the namespace. So here's my solution: after unsharing the filesystem, I bind mount /var/empty to /var/run/nscd. It looks roughly like this in C (don't worry: the actual version of this does error checking and everything like that):

unshare(CLONE_NEWNS);
mount("none", "/" NULL, MS_REC | MS_PRIVATE, NULL);
mount("...", "/etc/resolv.conf", NULL, MS_BIND | MS_PRIVATE, NULL);
mount("/var/empty", "/var/run/nscd", NULL, MS_BIND | MS_PRIVATE, NULL);

Now, when I enter into this namespace, glibc can no longer utilize nscd to perform DNS requests, and will instead go through the intended DNS server.

Conclusion

It goes without saying that you should most likely not be ducktaping together Linux namespaces if you're in a life-or-death scenario. If you absolutely need security or privacy, you should consider an operating system like Qubes that has a layered approach, or at least use software set up to have a good configuration out of the box like Tor Browser or Mullvad Browser. Just because I patched this hole does not mean this is secure or that it doesn't leak. If you forget to disable WebRTC PeerConnection, you have another blatant hole potentially leaking information, and there are plenty more examples.

However, I still find the ability to "jail" processes such that they can only communicate through a VPN to be very useful; there are many practical applications. It's more ergonomic than wrangling virtual machines, more flexible and performant than messing with SOCKS proxies, and certainly less obtrusive than switching your entire machine to get routed through a VPN just so you can do some work using one. It can be a useful tool for using a VPN, so as long as you're not doing anything security critical with it. If you're doing some web scraping or bypassing region detection, it can be a fine solution.

With all of that having been said, I am a bit bothered about the way this played out. I think that even if you're aware that the system has nscd installed and configured, it might not be obvious the consequences that this can have. Someone once created an issue for “properly” providing NSS modules in NixOS, but it was mostly concluded that there was no particular problem with the approach of using nscd, whereas there are problems with providing NSS modules globally; that said, though, as far as I could see, nobody raised this particular issue. I do agree that nscd is a more elegant solution for the NSS module problem than many of the alternatives, but I may have stumbled across a fairly good argument against this approach that hadn't been factored in before.

With that said, I'm happy enough with my workaround, so I'm considering this case closed for me personally.