
Submitted by: John Walicki

Affiliation(s): IBM

Date of Submission:  

Sponsor User: Mathis Moder 

<Please fill out the above fields, and the Overview, Design and User Experience sections below for an initial review of the proposed feature.>

Scope and Signoff: (to be filled out by Chair)

Status: In Progress

Overview

For specific edge nodes (e.g. moving cars or other critical equipment), a new safety-critical "Change Freeze" mode should be introduced. In this mode the Agent continues to run any active agreements and workloads, but does not download or start new services or cancel existing services.

Goal: prevent workloads from changing while the edge node is in a "safety critical state".

Design

An external invocation via CLI/API would set the state of the agent to "freeze".

No agreements would be negotiated until the agent is no longer frozen.

To avoid a situation where the agent stays frozen forever, the API could also accept a timeout parameter, after which the agent would resume normal operation.
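The freeze state, the negotiation gate, and the optional timeout described above could be sketched as follows. This is only an illustration under stated assumptions: the class `ChangeFreeze`, its method names, and the `timeout_seconds` parameter are hypothetical and are not part of the existing agent code or API.

```python
import time
from typing import Callable, Optional


class ChangeFreeze:
    """Hypothetical in-memory freeze flag with an optional timeout."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the timeout can be tested without sleeping.
        self._clock = clock
        self._frozen = False
        self._deadline: Optional[float] = None

    def freeze(self, timeout_seconds: Optional[float] = None) -> None:
        """Enter the frozen state; with a timeout, the freeze expires on its own."""
        self._frozen = True
        if timeout_seconds is None:
            self._deadline = None
        else:
            self._deadline = self._clock() + timeout_seconds

    def unfreeze(self) -> None:
        """Leave the frozen state on an explicit external request."""
        self._frozen = False
        self._deadline = None

    def is_frozen(self) -> bool:
        """Report the current state, clearing the freeze once the timeout has passed."""
        expired = self._deadline is not None and self._clock() >= self._deadline
        if self._frozen and expired:
            self.unfreeze()
        return self._frozen

    def may_negotiate(self) -> bool:
        """New agreements may only be negotiated while the agent is not frozen."""
        return not self.is_frozen()
```

A caller would check `may_negotiate()` before starting any new agreement; running agreements are untouched by this flag.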

It's not required that the node stops all communication with the hub.

For HA groups, it can be assumed that all nodes in the group are frozen at the same time (e.g. redundant nodes in a vehicle, where the external critical state applies to all of them).

Possibly send heartbeats, but do not accept node property updates or other changes.

Possibly allow geofencing information updates? Where the edge node is located might be important to know. For example, "the car is on the driveway, geofenced at home" is an important clue that might allow the agent to trigger changes to workloads. If the car is at the supermarket, changing workloads is not a good idea.

Governance should restart the agreement if it dies unexpectedly. Tricky?

Node health state?

HA node groups need to skip over nodes that are in ChangeFreeze state. This is orthogonal to the reason for an HA group. Unsupported configuration.


Let an external system drive the change. For example: "The car is in park and the GPS knows that the car is home, so call the API to change out of ChangeFreeze state".

The agent never decides for itself that it is out of ChangeFreeze state.
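As a rough illustration of the external decision described above, the check below combines the gear position with the geofence. The type `VehicleStatus`, the function `should_unfreeze`, and the zone names are hypothetical; the logic would live in the external system, which then calls the agent API, and not in the agent itself.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class VehicleStatus:
    """Hypothetical snapshot of what the external system knows about the vehicle."""
    gear: str      # e.g. "park" or "drive"
    geofence: str  # e.g. "home", "supermarket", "unknown"


def should_unfreeze(
    status: VehicleStatus,
    safe_zones: FrozenSet[str] = frozenset({"home"}),
) -> bool:
    """Return True only when the vehicle is parked inside a zone considered safe."""
    return status.gear == "park" and status.geofence in safe_zones
```

With these assumptions, a car parked at home qualifies for leaving ChangeFreeze state, while a car parked at the supermarket or driving near home does not.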

Build an "Agent Config State" API

If a secret changes, the agbot sends a message about the change. If the agent doesn't see or handle that message, what happens? Max? Would the agreement get cancelled if the agent doesn't reply?

MMS handling of agents in ChangeFreeze status  -

ESS should go into ChangeFreeze state as well. It should not look for model updates while the edge node is in ChangeFreeze state.

Node Management: behavior?


User Experience

As a node owner, I want to "freeze" the services of the node until I decide they can be changed again, or until a defined timeout expires.

As a service deployer, I want to have feedback about this "frozen" state of the node.

As an admin, I want to be able to unfreeze the node remotely via CLI.

Command Line Interface

<Describe any changes to the hzn CLI, including before and after command examples for clarity. Include which users will use the changed CLI. This section should flow very naturally from the User Experience section.>


External Components

<Describe any new or changed interactions with components that are not the agent or the management hub.>


Affected Components

<List all of the internal components (agent, MMS, Exchange, etc) which need to be updated to support the proposed feature. Include a link to the github epic for this feature (and the epic should contain the github issues for each component).>



AgBot

Security

<Describe any related security aspects of the solution. Think about security of components interacting with each other, users interacting with the system, components interacting with external systems, permissions of users or components>


APIs

<Describe any new/changed/deprecated APIs, including before and after snippets for clarity. Include which components or users will use the APIs.>


When the agent comes out of ChangeFreeze state, the agbot should enumerate through its list of BasicAgreementVerification() calls.
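One way to picture that enumeration is the loop below. It is a sketch only: `reverify_agreements` is a hypothetical helper, and the `verify` callback merely stands in for whatever BasicAgreementVerification() does for a single agreement; neither reflects the actual agbot implementation.

```python
from typing import Callable, Iterable, List


def reverify_agreements(
    agreement_ids: Iterable[str],
    verify: Callable[[str], bool],
) -> List[str]:
    """Run the verification callback for every agreement and collect the failures.

    An exception raised by the callback is treated as a failed verification,
    so one bad agreement does not stop the rest from being checked.
    """
    failed: List[str] = []
    for agreement_id in agreement_ids:
        try:
            ok = verify(agreement_id)
        except Exception:
            ok = False
        if not ok:
            failed.append(agreement_id)
    return failed
```

The returned list would then feed whatever follow-up the design settles on for agreements that did not survive the freeze.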

Build, Install, Packaging

<Describe any changes to the way any component of the system is built (e.g. agent packages, containers, etc), installed (operators, manual install, batch install, SDO), configured, and deployed (consider the hub and edge nodes).>


Documentation Notes

<Describe the aspects of documentation that will be new/changed/updated. Be sure to indicate if this is new or changed doc, the impacted artifacts (e.g. technical doc, website, etc) and links to the related doc issue(s) in github.>


Test

<Summarize new automated tests that need to be added in support of this feature, and describe any special test requirements that you can foresee.>
