Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Background and problem statement

Currently the network traffic from the EVE microservices (zedagent, logmanager, downloader, etc) can operate with active/active handling of multiple management ports. However, traffic from an application ECO (using a network instance) can not reliably get such connective except when using the mesh overlay network instances. A local network instance can be specified to use the tag "uplink" which results in all of the management ports being used, but due to the constraints of NAT this attempt at active/active connectivity does not work reliably.

Furthermore, when there are multiple management ports with different preferences (for instance a free Ethernet plus a pay-per-use LTE connection) the level of policy control extends to the EVE microservices. This, subject to policy control, should also apply to application ECO traffic.

Proposal

Active/standby for network instances

Instead of using active/active for network instances, it is a lot simpler to use active/standby. This means that a given local (or cloud) network instance is using a single port at any given time, but as the EVE microservices detects failures the network instance will failover to using a different uplink.

...

Note that as part of these changes we can remove the maintenance of 500 routing table, which has been used for all NAT traffic.

Priority handling

When zedrouter notices that the currently used management port for a network instance has failed, then it needs to select an alternate. That should be performed the same way as the logic in zedcloud/send.go, which is to first look at the management ports marked as "free" and only if none of this are found to be working (i.e., lastSucceeded timestamp is more than lastFailed), then proceed to look at the management ports with free=false.

Refined priority handling

Currently the notion of a free management interface is specified in the API in PhyIOUsagePolicy, which is directly attached to the provisioning of the device ports. Longer term that might not be sufficient along two axes:

...