...

In addition to HA/CA support for service software upgrades, agent upgrades also need to be performed in a rolling fashion across all the nodes in an HA/CA node group. Agents upgrade themselves according to the node management policy (NMP) defined by the administrator, so no central entity coordinates the agents within a group. The only entity in the system capable of assisting with that coordination is the Agbot. Before starting an upgrade, an agent in an HA Group asks the Agbot whether it may proceed. If the Agbot agrees, it records in the database that the calling node is performing an upgrade, including which NMP the node is processing. Because there can be multiple Agbot instances, the database is what ensures that concurrent calls from different agents receive the correct response (i.e. only one agent is allowed to proceed with the upgrade). While that upgrade is in progress, calls from other nodes in the group result in the calling agent being told to pause its upgrade. It is the agent's responsibility to poll the Agbot until the Agbot agrees that the upgrade may proceed. This ensures that only one agent in an HA Group is upgrading at any point in time. The Agbot uses NMP status to know when a node upgrade has completed, allowing another node in the group to proceed.
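
The coordination described above can be illustrated with a minimal sketch. It is not the Agbot implementation: the class and method names (UpgradeCoordinator, request_upgrade, report_nmp_status), the "completed" status string, and the choice to treat a repeated request from the node that already holds the claim as allowed are all assumptions made for illustration. An in-process lock stands in for the database transaction that would serialize concurrent calls across multiple Agbot instances.

```python
import threading
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class UpgradeClaim:
    """Record of which node in a group is upgrading, and for which NMP."""
    node_id: str
    nmp_name: str


class UpgradeCoordinator:
    """Hypothetical stand-in for the Agbot's upgrade bookkeeping."""

    def __init__(self) -> None:
        # The lock plays the role of a database transaction: it makes the
        # check-then-record step atomic for concurrent callers.
        self._lock = threading.Lock()
        self._claims: Dict[str, UpgradeClaim] = {}

    def request_upgrade(self, group: str, node_id: str, nmp_name: str) -> bool:
        """Return True if the node may upgrade now, False if it must pause."""
        wanted = UpgradeClaim(node_id, nmp_name)
        with self._lock:
            current = self._claims.get(group)
            if current is None:
                self._claims[group] = wanted
                return True
            # Assumption: the node already holding the claim may ask again.
            return current == wanted

    def report_nmp_status(self, group: str, node_id: str,
                          nmp_name: str, status: str) -> None:
        """Release the group's claim once the holder reports completion."""
        with self._lock:
            holder = self._claims.get(group)
            if status == "completed" and holder == UpgradeClaim(node_id, nmp_name):
                del self._claims[group]


coordinator = UpgradeCoordinator()
first = coordinator.request_upgrade("group1", "node-a", "nmp-1")
second = coordinator.request_upgrade("group1", "node-b", "nmp-1")
coordinator.report_nmp_status("group1", "node-a", "nmp-1", "completed")
third = coordinator.request_upgrade("group1", "node-b", "nmp-1")
```

In this sketch node-a is allowed to proceed, node-b is told to pause while node-a holds the claim, and node-b is allowed once node-a's NMP status is reported as completed. A real agent would repeat the request on a polling interval until it is allowed to proceed.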

User Experience

<Describe which user roles are related to the problem AND the solution, e.g. admin, deployer, node owner, etc. If you need to define a new role in your design, make that very clear. Remember this is about what a user is thinking when interacting with the system before and after this design change. This section is not about a UI, it's more abstract than that. This section should explain all the aspects of the proposed feature that will surface to users.>

...