Status: In Progress

Overview

For specific edge nodes (e.g. moving cars or other critical equipment), a new safety-critical "Change Freeze" mode should be introduced. The agent will continue to run any active agreements and workloads, but will not download or start new services or cancel existing services.

Goal: prevent workloads from changing while the edge node is in a "safety critical state".

Design

...

An external invocation via CLI/API would set the state of the agent to "freeze".

Any agreements would be negotiated as usual, but the download of the service would not be executed until the agent is no longer frozen.
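The deferral described above can be sketched as follows. This is only an illustration, not the agent's implementation: the class `FrozenAwareDeployer` and its method names are hypothetical, and the "download" is reduced to moving a service name from one list to another.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FrozenAwareDeployer:
    """Hypothetical helper: holds back service downloads while the node is frozen."""

    frozen: bool = False
    pending: List[str] = field(default_factory=list)     # agreed, not yet downloaded
    downloaded: List[str] = field(default_factory=list)  # stand-in for real downloads

    def on_agreement(self, service: str) -> None:
        # Negotiation has already succeeded at this point; only the download is gated.
        if self.frozen:
            self.pending.append(service)
        else:
            self.downloaded.append(service)

    def set_frozen(self, frozen: bool) -> None:
        self.frozen = frozen
        if not frozen:
            # Leaving the freeze: run every deferred download in arrival order.
            while self.pending:
                self.downloaded.append(self.pending.pop(0))
```

Whether deferred downloads should run immediately on unfreeze, or be re-validated against the agreement first, is left open here.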

To avoid any situation where the agent would be frozen forever, the API could also accept a timeout parameter, after which the agent would resume as normal.
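A minimal sketch of a freeze state with an optional timeout, assuming the timeout is given in seconds and measured against a monotonic clock. The class name `ChangeFreeze`, the parameter `timeout_s`, and the injectable `clock` are assumptions made for illustration only.

```python
import time
from typing import Callable, Optional


class ChangeFreeze:
    """Hypothetical freeze flag that expires on its own after an optional timeout."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._frozen = False
        self._deadline: Optional[float] = None

    def freeze(self, timeout_s: Optional[float] = None) -> None:
        # Without a timeout the freeze lasts until unfreeze() is called.
        self._frozen = True
        self._deadline = None if timeout_s is None else self._clock() + timeout_s

    def unfreeze(self) -> None:
        self._frozen = False
        self._deadline = None

    def is_frozen(self) -> bool:
        # Expire lazily: the deadline is checked whenever the state is queried.
        expired = self._deadline is not None and self._clock() >= self._deadline
        if self._frozen and expired:
            self.unfreeze()
        return self._frozen
```

The lazy expiry check avoids a background timer; a real agent might instead prefer a timer so that deferred work resumes as soon as the deadline passes.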

It's not required that the node stops all communication with the hub.

For HA groups, it can be assumed that all nodes in the group are frozen at the same time (e.g. redundant nodes in a vehicle, where the external critical state applies to all of them).




Initial comments from John

Possibly send heartbeats but not accept node property updates or changes

Possibly allow geofencing information updates? Where the edge node is located might be important to know. "The car is on the driveway, geofenced at home" is an important clue that might allow the agent to trigger changes to workloads. If the car is at the supermarket, changing workloads is not a good idea.

Governance should restart the agreement if it dies unexpectedly. Tricky?

Node health state?

HA node groups need to skip over nodes that are in ChangeFreeze state.  This is orthogonal to the reason for a HA group.  Unsupported configuration.


Let an external system make the change: "The car is in park and the GPS knows that the car is home, so call the API to leave the ChangeFreeze state".

The agent never decides for itself that it is out of the ChangeFreeze state.

Build an "Agent Config State" API.

If a secret changes, the agbot sends a change message. If the agent doesn't see or handle that message, what happens? Max? Would the agreement get cancelled if the agent doesn't reply?

MMS handling of agents in ChangeFreeze state?

ESS should also go into the ChangeFreeze state. It should not look for model updates while the edge node is in the ChangeFreeze state.

Node Management: behavior?
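The comments above suggest that the decision to leave the ChangeFreeze state is made outside the agent. A hedged sketch of such an external controller, using the "in park and at home" example: `VehicleStatus`, `should_unfreeze`, and the `unfreeze_api` callback are hypothetical names, and the callback stands in for whatever CLI/API call is eventually defined.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VehicleStatus:
    """Hypothetical snapshot of the signals an external system might use."""

    gear: str                   # e.g. "park" or "drive"
    inside_home_geofence: bool  # True when the GPS places the car at home


def should_unfreeze(status: VehicleStatus) -> bool:
    # Both conditions from the example must hold: parked and geofenced at home.
    return status.gear == "park" and status.inside_home_geofence


def controller_tick(status: VehicleStatus, unfreeze_api: Callable[[], None]) -> bool:
    """Runs outside the agent; the agent itself never makes this decision."""
    if should_unfreeze(status):
        unfreeze_api()
        return True
    return False
```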

Option B) per service: include a "change-constraint" in the deployment policy

This approach could be compared to a normal property/constraint negotiation, but would only be relevant for the actual (de)activation of the payload. In this case, _change.allow would be a reserved constraint parameter map, which avoids changes to the definitions on the first level. If this constraint resolves to true, the service can be installed or removed; if it resolves to false, the service will be neither installed nor uninstalled.

{
  "constraints": [
    "openhorizon.arch == arm64",
    {
      "_change.allow": [
        "property.example >= 1"
      ]
    }
  ]
}
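How the change-constraint might resolve to true or false can be sketched as below. This is a deliberately simplified evaluator, not the Open Horizon constraint language: it assumes each expression has the form "name operator value" separated by spaces, and the function names `eval_expr` and `change_allowed` are hypothetical.

```python
import operator
from typing import Any, Dict, List

# Comparison operators supported by this simplified sketch.
_OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}


def _parse_value(text: str) -> Any:
    # Try int, then float, and fall back to the raw string.
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def eval_expr(expr: str, props: Dict[str, Any]) -> bool:
    parts = expr.split()
    if len(parts) != 3 or parts[1] not in _OPS:
        raise ValueError(f"unsupported expression: {expr!r}")
    name, op, raw = parts
    if name not in props:
        return False  # an unknown property never allows a change
    try:
        return _OPS[op](props[name], _parse_value(raw))
    except TypeError:
        return False  # incomparable types, e.g. a string against a number


def change_allowed(expressions: List[str], props: Dict[str, Any]) -> bool:
    """True only if every expression under _change.allow holds for the node."""
    return all(eval_expr(e, props) for e in expressions)
```

For example, `change_allowed(["property.example >= 1"], {"property.example": 0})` evaluates to False, so the service would be neither installed nor removed. Note that an empty expression list resolves to True in this sketch.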

User Experience


As a node owner, I want to "freeze" the services of the node until I decide they can be changed again or until a defined timeout expires.

As a service deployer, I want to have feedback about this "frozen" state of the node.

As an admin, I want to be able to unfreeze the node remotely via the CLI.

Command Line Interface

<Describe any changes to the hzn CLI, including before and after command examples for clarity. Include which users will use the changed CLI. This section should flow very naturally from the User Experience section.>

...