...

<Please fill out the Overview, Design and User Experience sections for an initial review of the proposed feature.>

Overview

<Briefly describe the problem being solved, not how the problem is solved, just focus on the problem. Think about why the feature is needed, and what is the relevant context to understand the problem.>

There are limited built-in metrics available about an edge node, its host, and its workloads.  Traditional approaches include aggregating data in the cloud or some other central data lake, streaming logs to a remote host, or surfacing the data at the origin so that a remote operator can log in to the edge node and query it.  Each of those approaches has drawbacks, and none is edge native.

Design

<Describe how the problem is fixed. Include all affected components. Include diagrams for clarity. This should be the longest section in the document. Use the sections below to call out specifics related to each aspect of the overall system, and refer back to this section for context. Provide links to any relevant external information.>

Ideally, an application installed on an edge node, preferably delivered by Open Horizon, would query the system for information, surface it, and make the data available in an efficient and edge-native manner.  This may mean updating the node properties, and it may mean making the information remotely queryable without the operator logging in to the edge node.

...

  • The platform functionality
  • The node functionality
  • The query and monitoring functionality

This Feature Design candidate will deliver a functioning end-to-end example and documentation demonstrating how to deliver and configure an EdgeLake network.  This code will deliver and connect a data collection and querying network consisting of a master node, a query node, and two or more operator nodes.

Future iterations may include a version using non-containerized node agents, and a script that installs and integrates EdgeLake beside an All-in-One Open Horizon deployment instance in a similar fashion to how FDO is integrated.

The Platform Functionality - Extending Open Horizon as a Platform:

AnyLog EdgeLake extends the Open Horizon functionality delivered to the edge as a platform:

...

The Node Functionality - Extending the functionalities of nodes deployed by Open Horizon:

  AnyLog EdgeLake extends the Open Horizon functionality delivered to the individual nodes by using the platform functionality such that:

...

KubeArmor running on the edge node provides visibility and protection for all the processes, files, or network operations in the containers as well as those running directly on the host.  See the KubeArmor integration repo.  In this feature, KubeArmor (when present) can transmit (define how) collected metrics to the AnyLog EdgeLake code running on the node.

...

NS1 will provide an API endpoint and help define when, how, and what information will be transmitted from the nodes over AnyLog into NS1 for node and network visibility and analytics.

Edge Node Deployment (Day 1)

On the target edge node, native (non-containerized) anax and EdgeLake agents

Working assumptions

  • We are setting a precedent for installation of optional third-party components
    • To simplify the installation process, keep each operation atomic, and allow components to be installed in any order, all component installations will be decoupled from the base anax installation.
    • The process should also function if a person is "bringing their own already-installed component" and we are just integrating anax with a pre-existing EdgeLake installation.
  • EdgeLake interactions with Open Horizon will be expressed as intents in a "data" policy
    • This data policy can be embedded within a node policy, service definition, and/or a deployment policy.
    • Deployment policies can override the node's default data policy due to greater specificity.
  • The default case will be based on native applications, not containerized versions, although both options or a mix thereof should work.
  • The integration should also be easily reversible.
  • A node may have more than one anax agent running, but every anax agent beyond the first must be containerized.
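
The precedence rule in the working assumptions above can be sketched as follows. This is only an illustration: the `data` key and the field names `store` and `retention_days` are hypothetical, since the data policy schema has not been defined yet. It assumes a simple key-by-key overlay in which the deployment policy wins.

```python
from copy import deepcopy


def effective_data_policy(node_policy, deployment_policy=None):
    """Return the data policy that applies, letting the deployment policy win.

    Both arguments are plain dicts with an optional, hypothetical "data" section.
    """
    # Start from the node's default data policy (may be absent).
    merged = deepcopy(node_policy.get("data", {}))
    # Overlay keys from the more specific deployment policy, if one is given.
    if deployment_policy:
        merged.update(deepcopy(deployment_policy.get("data", {})))
    return merged


# Hypothetical example values, not a proposed schema.
node_policy = {"data": {"store": "edgelake", "retention_days": 7}}
deployment_policy = {"data": {"retention_days": 30}}

print(effective_data_policy(node_policy, deployment_policy))
```

A key-level overlay keeps the node defaults for anything the deployment policy does not mention. Whether the real design should merge per key or replace the whole data policy is still an open question.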

anax agent installation

Today, installing the agent on the target device involves running the "agent-install.sh" script as documented at Automated agent installation and registration.  At this point, we are assuming that no signal needs to be sent to this installation script and process to notify it that EdgeLake should also be installed.  If such a signal were needed, we should consider a flag in the form of an installation argument or an environment variable.  Assuming that no signal is needed allows us to decouple the process of installing EdgeLake as an optional data component.

EdgeLake agent installation

Instead of altering the "agent-install.sh" script to trigger the EdgeLake installation process, we are proposing that a completely separate script be created that will install a native EdgeLake application, and then signal to anax that it has been installed and is ready to use.  This assumes that anax has already been installed and configured, but anax does not need to be registered with an exchange for EdgeLake to be installed.  In fact, if we are proposing to create or modify the node policy file, it is better if the anax agent is not currently registered.
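
One possible form of the node policy change is sketched below, assuming the node policy JSON keeps a top-level `properties` list of name/value pairs. The property name `edgelake.installed` is hypothetical, and how anax would actually be signalled remains undecided. The second function shows how the change could be undone, in line with the working assumption that the integration should be reversible.

```python
import json
from copy import deepcopy

# Hypothetical property name; the real signal to anax is not yet defined.
EDGELAKE_PROPERTY = "edgelake.installed"


def mark_edgelake_installed(policy):
    """Return a copy of the node policy with the EdgeLake property set."""
    updated = deepcopy(policy)
    # Drop any earlier copy of the property so repeated runs stay idempotent.
    props = [p for p in updated.get("properties", [])
             if p.get("name") != EDGELAKE_PROPERTY]
    props.append({"name": EDGELAKE_PROPERTY, "value": True})
    updated["properties"] = props
    return updated


def unmark_edgelake_installed(policy):
    """Return a copy of the node policy with the EdgeLake property removed."""
    updated = deepcopy(policy)
    updated["properties"] = [p for p in updated.get("properties", [])
                             if p.get("name") != EDGELAKE_PROPERTY]
    return updated


# Example policy content is illustrative only.
policy = {"properties": [{"name": "region", "value": "lab"}], "constraints": []}
marked = mark_edgelake_installed(policy)
print(json.dumps(marked, indent=2))
```

Both functions return new dicts rather than editing the input, so an installation script could write the result to the policy file only after the EdgeLake installation itself has succeeded.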

User Experience

<Describe which user roles are related to the problem AND the solution, e.g. admin, deployer, node owner, etc. If you need to define a new role in your design, make that very clear. Remember this is about what a user is thinking when interacting with the system before and after this design change. This section is not about a UI, it's more abstract than that. This section should explain all the aspects of the proposed feature that will surface to users.>

...

Are there any ways to optionally extend the CLI when components are installed?  If not, then we should avoid this.

The lower-level AnyLog EdgeLake functionality is exposed through a CLI, which can extend the hzn CLI.
The AnyLog EdgeLake CLI includes dynamic help with links to help pages on GitHub; all of that can be made available as an extension of the hzn CLI.

...

  • Nodes in the AnyLog network are configured such that commands and queries can be provided using REST. It is therefore simple to integrate with existing and new applications without dependencies on existing infrastructure or setups.
  • Because of the decentralized nature of the AnyLog network, any node or application can act as a point of access to the entire data set and the monitored status of all the member nodes.
  • AnyLog EdgeLake provides a web GUI that is optimized for the AnyLog API calls and data queries. It only requires a browser, can be installed on any node, and can serve as a monitoring tool for network managers and as a training tool for administrators and developers, showing how to interact with nodes in the network.
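
As a rough illustration of the REST access described above, the sketch below builds (but does not send) an HTTP request for a node. The header names `command` and `destination` and the `AnyLog/1.23` user agent reflect our reading of the AnyLog REST documentation and should be confirmed against the EdgeLake docs. The host, port, database, and table names are hypothetical.

```python
from urllib.request import Request


def build_node_request(host, port, command, network_wide=False):
    """Build a GET request carrying an AnyLog command in the HTTP headers."""
    headers = {
        "command": command,           # assumed header name, to be confirmed
        "User-Agent": "AnyLog/1.23",  # assumed user agent, to be confirmed
    }
    if network_wide:
        # Assumed header asking the receiving node to forward to the network.
        headers["destination"] = "network"
    return Request(f"http://{host}:{port}", headers=headers, method="GET")


# Hypothetical query node address and SQL statement.
req = build_node_request(
    "127.0.0.1", 32049,
    "sql metrics format=json select * from node_status limit 10",
    network_wide=True,
)
print(req.get_method(), req.full_url)

# To send it, one would call urllib.request.urlopen(req) against a live node.
```

Separating request construction from sending makes it easy to inspect exactly what would go over the wire before pointing the code at a real node.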

...

<Describe any new or changed interactions with components that are not the agent or the management hub.>

Installing the AnyLog EdgeLake agent on an edge node should provide metrics collection and surfacing.

...

<Describe any related security aspects of the solution. Think about security of components interacting with each other, users interacting with the system, components interacting with external systems, permissions of users or components>

The EdgeLake component does not need root-level access.  – That is correct.

The AnyLog EdgeLake component maintains its own P2P network. – That is correct.

An AnyLog EdgeLake node can be deployed with or without security layers. If they are enabled, the AnyLog protocol uses keys and the blockchain to authenticate users and their permissions. The network can issue certificates to third-party applications; these certificates authenticate the applications and users and determine their permissions.

...

<Describe any new/changed/deprecated APIs, including before and after snippets for clarity. Include which components or users will use the APIs.>

Link to AnyLog EdgeLake docs.

  • Each AnyLog EdgeLake instance includes a CLI option.
  • Monitored data can be generated by existing AnyLog EdgeLake functionality. For example, disk space, memory usage, networking status, CPU state, and running processes are built-in functions that can be leveraged on each node. Additional details are in the Monitor Nodes document.
  • Southbound connectors are detailed in the Adding Data document (including services to present a node as a broker for pub-sub of data, to subscribe to a third-party broker, and to receive data via REST).
  • Northbound connectors are based on SQL and AnyLog CLI commands that are transferred to the network using REST.
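
To make the kind of monitored data mentioned above concrete, the sketch below gathers a small host snapshot using only the Python standard library. It is a stand-in for illustration, not EdgeLake's built-in monitoring, and the field names are hypothetical. A record like this could in principle be handed to one of the southbound connectors.

```python
import json
import os
import shutil
import time


def collect_snapshot(path="/"):
    """Collect a minimal host snapshot for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return {
        "timestamp": time.time(),         # seconds since the epoch
        "cpu_count": os.cpu_count(),      # logical CPUs, may be None
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
    }


snapshot = collect_snapshot()
# Serialize as JSON, the form a REST-based ingest path would likely expect.
print(json.dumps(snapshot))
```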

AnyLog EdgeLake documentation:

...

Will be done using Open Horizon (we had a prototype Open Horizon + AnyLog EdgeLake working).

A detailed Docker-based deployment training is available at this link.

...

  • Document deployment with Open Horizon.
  • AnyLog EdgeLake CLI extending the Open Horizon CLI.

Rahul Jadhav : Can we please add the documentation for all the possible ways in which data can be ingested into AnyLog EdgeLake? CC: Moshe Shadmon

Test

<Summarize new automated tests that need to be added in support of this feature, and describe any special test requirements that you can foresee.>

...