...

With this, it will be possible to deploy multiple VPN network instances with overlapping traffic selectors and still route/encrypt/decrypt unambiguously.



Network Namespaces

An alternative to VRFs is to isolate network instances at all network levels: instead of using a per-NI VRF and CT zone, each Linux bridge, its associated network interfaces and dnsmasq would run in a separate network namespace. For the most part this is very similar to the VRF proposal, in that both solutions use VETHs to route and NAT packets from/to apps twice. The PBR routes/rules and iptables rules are also very much the same.

The advantage of having multiple namespaces is stronger isolation and not having all routes and iptables rules crammed into one network stack. This solution is also completely transparent to processes (like dnsmasq, radvd, etc.). The major downside is a higher overhead (memory footprint in particular).

However, from the management-plane point of view this proposal is much more difficult to implement. Working with multiple namespaces from the same process (e.g. zedbox) is possible but quite challenging. While each process has its own "default" namespace in which it was started, individual threads can be switched between namespaces as needed. Frequent switching between namespaces adds some overhead, though, and makes development and debugging even harder than it already is. For this reason most network-related software products, including strongSwan for example, are intentionally designed not to manage multiple network namespaces from a single process instance.

In Go this is even more challenging, since the language provides goroutines instead of threads. Because a goroutine can move between threads as it executes, it can potentially change namespace mid-execution. It is possible to lock a goroutine to its current thread, but any goroutine spawned from inside it will run on a different thread and thus typically start back in the process default namespace. This gotcha is nicely described here: https://www.weave.works/blog/linux-namespaces-golang-followup
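The thread behaviour behind this gotcha can be observed directly. The following is a minimal sketch, assuming Linux and only the Go standard library; the function name `spawnFromLockedThread` is made up for illustration. It locks a goroutine to its thread, spawns a child goroutine and compares the OS thread IDs of the two.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
)

// spawnFromLockedThread (hypothetical helper) locks the calling goroutine to
// its OS thread, starts a child goroutine and returns the thread IDs on which
// the parent and the child executed.
func spawnFromLockedThread() (parentTID, childTID int) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	parentTID = syscall.Gettid()

	// The locked thread runs only the locked goroutine, so the runtime has to
	// schedule the child on some other thread.
	done := make(chan int)
	go func() { done <- syscall.Gettid() }()
	childTID = <-done
	return parentTID, childTID
}

func main() {
	parent, child := spawnFromLockedThread()
	fmt.Println("parent thread:", parent, "child thread:", child)
	fmt.Println("child ran on a different thread:", parent != child)
}
```

Since the network namespace is a per-thread attribute on Linux, a child goroutine that runs on a different thread does not inherit a namespace switch performed by its locked parent.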

And so, while locking the thread, switching to another namespace, doing something quick (not asynchronous, e.g. listing conntracks) and immediately switching back is safe, running a long asynchronous task (e.g. TCP server, packet capture) risks some child goroutines escaping the namespace, which leads to subtle, hard-to-trace bugs.

In general it is recommended to spawn a new child process for every network namespace that needs to be operated in. For this reason this proposal follows up on the “Bridge Manager” described here.

The main idea of Bridge Manager is to split an overloaded zedrouter and run management of every network instance in a separate goroutine. In this proposal we would go even further and suggest running Bridge Manager (here called "NetNS manager") as a child process of zedbox (or as a separate containerd process). EVE's main communication mechanism, pubsub, is already prepared for interprocess messaging.
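One way to start such a per-namespace child process is to let iproute2 place it into the namespace before it begins executing. The sketch below is only an illustration under several assumptions: the namespace is a named one managed by `ip netns`, and the namespace name `ni-blue`, the binary path `/usr/bin/netns-manager` and its `--ni` flag are hypothetical placeholders, not part of EVE.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// netnsManagerCmd (hypothetical helper) prepares a command that runs binary
// inside the named network namespace via "ip netns exec". The child is in the
// namespace from its very first instruction, so none of its threads or
// goroutines can end up in the parent's namespace.
func netnsManagerCmd(nsName, binary string, args ...string) *exec.Cmd {
	argv := append([]string{"netns", "exec", nsName, binary}, args...)
	cmd := exec.Command("ip", argv...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	// All names below are placeholders for illustration only.
	cmd := netnsManagerCmd("ni-blue", "/usr/bin/netns-manager", "--ni", "blue")
	fmt.Println("would run:", cmd.Args)
	// A parent process would call cmd.Start() here and supervise the child,
	// for example by restarting it when cmd.Wait() returns.
}
```

With this split the parent never has to switch namespaces itself, and all coordination with the child would go over interprocess messaging.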

The following diagram shows how network instances can be isolated from each other using network namespaces. As can be seen, not only is the network configuration spread across namespaces, but the management plane is also split into multiple processes.

TODO



Proof of Concept

To verify that the proposed network configuration actually works as intended in all scenarios, a PoC has been prepared based on Docker containers that represent the network stacks of (mock) apps, network instances and zedbox. The source code for the PoC, with diagrams and a description, can be found in this repository: https://github.com/milan-zededa/evenet

...