...

Each agent's health is monitored through a hardware watchdog. If an agent does not touch its pid file within the watchdog time interval, the device is rebooted.

Reset Time:

...

For controller connectivity health in normal operation mode

On controller connectivity loss, the EVE node is rebooted after the reset time interval.

 Fallback Time:

...

For controller connectivity health during baseos upgrade validation

...

On controller connectivity loss, the EVE node reboots and falls back to the fallback image after the fallback time interval.

...

The reset and fallback time functionalities are currently part of the ZedAgent module.

...

Refactoring Details

The watchdog time functionality will remain as is. The reset and fallback time functionality will be moved into a new agent called devagent, together with the whole baseos upgrade validation orchestration. Devagent will be spawned along with ledmanager. To orchestrate baseos upgrade validation, devagent will listen to the ledmanager ledblinker config messages to determine controller connectivity status, and to the time stamps of successful configuration pulls published by zedagent. Devagent will be the owner of the zboot config and will publish it for use by baseosmanager. On successful baseos installation, and on reset/fallback timer expiry, the device reboot operations will be triggered through the "devagent status" pubsub topic.

...

Baseosmanager will listen to zboot config messages from the devagent module and update the zboot status accordingly, for baseos installation and upgrade validation orchestration.

...

Baseosmgr will subscribe to the following topic,


...

  • "zboot config" from devagent

            For baseos installation and upgrade validation

ZedAgent Module

Zedagent will subscribe to the following topic,


...

  • "devagent status", generated by devagent

         

...

 For executing device reboot command

         

...

 To publish the remaining test time to controller, for baseos upgrade validation

Zedagent will publish the following topic,


...

  • "zedagent status"

         

...

 Time stamp for last successful configuration pull from controller

DevAgent Module

The DevAgent module will subscribe to the following topics,


...

  • "ledBlinker config", generated by zedclient, zedagent, etc.

...

For EVE node registration, controller connectivity change events


...

  • "zboot status", generated by baseosmgr

       

...

For baseos installation and upgrade validation orchestration


...

  • "zedagent status", generated by zedagent

       

...

For the time stamp of the last successful config fetch from the controller

DevAgent will publish the following topics,


...

  • "zboot config"

       

...

Zboot partition information


...

  • "devagent status"

       

...

 For reboot commands, on baseos installation and reset/fallback timer expiry

       

...

 Remaining test time, for

...

publication to controller (consumed by zedagent)
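The publish/subscribe relationships listed in the module sections above can be summarized as a small table in code. This is only a sketch for checking the wiring: the Topic type and the subscriptionsOf helper are hypothetical, and the table lists only the relationships stated on this page (the "ledBlinker config" publisher list is explicitly incomplete in the source).

```go
package main

import (
	"fmt"
	"sort"
)

// Topic records who publishes and who subscribes to a pubsub topic.
type Topic struct {
	Name        string
	Publishers  []string
	Subscribers []string
}

// topics mirrors the relationships described in the text. Other
// publishers or subscribers may exist; they are not listed here.
var topics = []Topic{
	{Name: "ledBlinker config", Publishers: []string{"zedclient", "zedagent"}, Subscribers: []string{"devagent"}},
	{Name: "zboot status", Publishers: []string{"baseosmgr"}, Subscribers: []string{"devagent"}},
	{Name: "zedagent status", Publishers: []string{"zedagent"}, Subscribers: []string{"devagent"}},
	{Name: "zboot config", Publishers: []string{"devagent"}, Subscribers: []string{"baseosmgr"}},
	{Name: "devagent status", Publishers: []string{"devagent"}, Subscribers: []string{"zedagent"}},
}

// subscriptionsOf returns the sorted names of the topics an agent
// subscribes to.
func subscriptionsOf(agent string) []string {
	var names []string
	for _, t := range topics {
		for _, s := range t.Subscribers {
			if s == agent {
				names = append(names, t.Name)
				break
			}
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	for _, agent := range []string{"baseosmgr", "zedagent", "devagent"} {
		fmt.Println(agent, "subscribes to", subscriptionsOf(agent))
	}
}
```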


P.S.

For completeness and future work scope, the following EVE node health items are noted. This list is not exhaustive, and the necessary actions for each item still need to be defined.


...

  • cpu usage health

...

  • disk space usage health

...

  • network usage health

...

  • each agent's basic functionality check (on upgrade)