...

EVE has been built with security at the core of its design. One of its security principles is that EVE should be trustworthy: it should provide a deterministic way to measure its software layers, from firmware through bootloader and kernel to user-space applications. It should also provide a mechanism to report these measurements to a third party for attestation. This provides a verifiable software environment in which to launch user applications, i.e. the Edge Container Objects. The concept of measured boot is not new. For example, mobile phones use measured boot and attest to a verifier before initiating a payment transaction. Blockchain smart oracles at the cyber-physical edge have to prove that their software stack is trustworthy before injecting events into smart contracts. This is a common requirement for distributed systems in general, but it becomes even more important for geographically remote systems like IoT Edge gateways, as these gateways have no physical perimeter security. The following section describes the unique operational requirements of Edge gateways and the attack possibilities associated with them:

  1.  In some deployments, EVE devices operate in geographically remote locations, e.g. on top of wind turbines or in mid-ocean oil extraction zones, where manual access to the device is limited. In such deployments, the only way to get operational information about EVE is for EVE itself to ship its operational state to the controller. Providing the controller with visibility into the operational status of the device at all times is therefore key. Even if there is an attack and the software running on EVE has been modified, it is critical that EVE itself detects it and reports it to the controller.
  2.  In some deployments, Internet connectivity and power supply are not reliable. There can be intermittent loss of connectivity between EVE and the Controller, and EVE is expected to continue operating with the last-known configuration until it is able to reach the Controller again. There can be power outages as well, which means that once power is back, EVE is expected to boot up, get back to its last operational state and resume its Edge Container Instances (even if the connectivity is still down).
  3.  Some EVE devices are deployed in places where the device is exposed to physical attacks, e.g. smart poles, industrial factory floors etc. These EVE devices can easily be attacked by inserting USB drives, or the device itself can be physically moved to a different location for possible extraction of user data.
  4. EVE software upgrades are done remotely from the Controller. Ensuring the integrity of the upgrade in a zero-trust environment is a challenge (i.e. how do we know that the EVE device is indeed running the correct version?)

These challenging environmental conditions and deployment requirements bring their own set of attack possibilities:

  1.  There could be physical attacks, e.g. booting from a USB key with modified software, installing firmware rootkits, or altering the hardware configuration by adding an unauthorized PCI peripheral or removing an existing PCI device
  2.  There could be attacks over the network, e.g. modifying OS partitions to boot a different (potentially modified) version of EVE in order to run malicious programs
  3.  There could be an attempt to "steal" the device and operate it in a different location for off-line hacking, trying to decrypt the encrypted volumes on the disk by booting different software
  4.  There could be a malicious software version booting up and pretending to be the legitimate version expected by the Controller, thereby extracting all the latest configuration from the Controller, which might contain secret information like cloud storage credentials, or sensitive user data such as the cloud config for an Edge Container Instance

That is the problem to solve: EVE should maintain operational availability in these diverse conditions, and at the same time EVE should support a security framework to detect and mitigate these security challenges, i.e.

  1. Measure the boot chain of EVE
  2. Detect any discrepancies in the boot chain and disallow access to sensitive resources in EVC (like the credentials)
  3. Allow EVE to access encrypted volumes on disk, as long as the measurements don't change
  4. Have a self-locking mechanism to restrict access to encrypted volumes in case of a change in the boot chain, even during offline operation.

That is what this proposal is about: measuring the boot chain of EVE, and allowing access to select resources in the Controller only on the basis of attestation of these measurements. Since any software can potentially be modified, such measurement architectures typically use a hardware-based root of trust (HRoT) or a Trusted Execution Environment (TEE). Here we present a solution based on the Trusted Platform Module as the Hardware Root of Trust.

At a high level, the solution is to

  1. Implement recommendations of TCG Remote Attestation Protocol TAP - with EVE as the attester and EVC as the verifier
  2. Use TPM to measure the booting sequence using Platform Configuration Registers (PCR)
  3. Lock the encrypted volumes, with the decryption key sealed using PCRs (for self-locking during offline/tampered conditions)
  4. Allow access to secretive resources only if the PCR values haven't changed

The following sections describe the details.

Measured Boot vs Secure Boot 

...

Measured Boot is another security standard, where the measurements are recorded into TPM PCRs but the boot process is allowed to complete. The measurements can later be read back from the TPM, and the Audit Log of all the measurements can also be fetched. Thus measured boot does not itself make any judgement about the integrity of the boot stages, but gives an external verifier the opportunity to inspect the TPM Audit Log and PCR values in detail, see which components were measured and what their measurements are, and make a decision.
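
As a rough illustration of how measurements accumulate in a PCR, the sketch below models the extend operation in plain software as new value = SHA-256(old value || measurement digest). It is not a TPM API; the stage names and image bytes are hypothetical placeholders, and a single SHA-256 PCR is assumed.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR (assumed bank for this sketch)


def measure(component: bytes) -> bytes:
    """Measurement of a boot component: here simply its SHA-256 digest."""
    return hashlib.sha256(component).digest()


def pcr_extend(pcr: bytes, digest: bytes) -> bytes:
    """Software model of extend: SHA-256(old PCR value || measurement digest)."""
    return hashlib.sha256(pcr + digest).digest()


def boot(stages):
    """Measure each stage in order; return (final PCR value, event log)."""
    pcr = bytes(PCR_SIZE)  # a PCR starts at all zeros after reset
    event_log = []
    for name, image in stages:
        digest = measure(image)
        event_log.append((name, digest))  # the log records what was extended
        pcr = pcr_extend(pcr, digest)
    return pcr, event_log


# Hypothetical boot stages; the byte strings stand in for real images.
good = [("firmware", b"fw-1.0"), ("bootloader", b"grub-2.06"), ("kernel", b"kernel-5.10")]
tampered = [("firmware", b"fw-1.0"), ("bootloader", b"evil"), ("kernel", b"kernel-5.10")]

good_pcr, good_log = boot(good)
bad_pcr, _ = boot(tampered)
print("PCR changed after tampering:", good_pcr != bad_pcr)
```

Because each value depends on the previous one, changing any stage, or the order of the stages, yields a different final value. Nothing in the loop stops the boot, which matches the measure-and-report behaviour described above.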

...

| Secure Boot | Measured Boot |
| --- | --- |
| Core Root of Trust is a certificate embedded in firmware by the OEM | Core Root of Trust is the Trusted Platform Module (TPM) |
| Does not require connectivity to any remote verifier to establish the chain of trust. | A verifier needs to inspect the TPM PCR values before the booting process can be trusted |
| Any new software needs to be signed by a certificate rooted to OEM certificates | An independent trusted third party (a verifier) can match the PCR measurements against its expected values |
| Chain of trust is established by each software layer verifying the next software layer before handing over control (in the boot chain) | Chain of trust is established after going through the TPM Event Log and PCR values |
| Device will become inaccessible if one of the software layers fails validation, since the booting process will halt. Good for devices with human presence like smartphones, laptops etc., but poses difficulty for remotely managed systems like IoT Edge gateways, where manual access to the device is limited. Since IoT gateways are managed through remotely located controllers, losing connectivity will severely impact visibility into the system. | Works well for remotely managed systems. The device will be accessible, and the controller can still reach the system and get visibility into it. |

...

Trusted Platform Module as the Silicon RoT

The Trusted Platform Module (TPM) supports many crypto functions. Notably, the “PCR Extend” and “Seal” operations are used in popular measured boot architectures. Let's take a quick look at these commands.

...

Based on the above constructs, we present a solution to measure and attest the software integrity of an EVE node. As a recap, EVE is the open-source software from LF-Edge for Edge Virtualization, running on IoT Edge gateways. EVC is the controller for managing these EVE instances. Adam, under LF-Edge, is an open-source implementation of one such EVC. The APIs between EVE and EVC are specified in the EVE API specification. In the context of remote attestation, the EVC is the attesting authority and EVE reports its measurements for attestation.



Fig 1. Role of EVC as the verifier, and EVE as the attester

Introduction to Foundational Concepts

  1. A new device state, “Unknown Update Detected” (abbreviated as UUD), will be introduced in EVC, to indicate that the measurements of the software running on a given device didn't match the expected measurements.
  2. EVC maintains a central database of all the supported EVE software images and their hash values, indexed by the EVE image version tag
  3. EVC also maintains a central database of the most commonly used BIOS firmware images, their signatures, and the certificate provided by the BIOS vendor for validating the signatures, indexed by a combined tag of BIOS version string + Manufacturer
  4. On receiving an attestation request from EVE, the attestation service module checks the PCR Quote against the baseline value. When there is no software change, the PCR quote is expected to be the same across reboots, so an unexplained change causes the EVE node to be marked as “UUD”. After an EVE software upgrade, however, the PCR values are expected to differ from the baseline. In that case EVC compares the reported values with the expected values (since EVC knows about the new image version and its hash values) and makes a decision: if the reported values match, the baseline for the EVE node is updated. If they don’t, the EVE node is marked as UUD.
  5. An optional feature, "Location-Lock", may be introduced in EVC to additionally check the Geo-Location reported by the device, and flag if the location has changed since its last-seen location. This feature may be critical in some deployments, where a given EVE node is mounted permanently and any change in its physical location should be flagged, and optionally considered along with the software measurements to conclude whether the EVE node is trusted. In this regard, another new device flag, “Unknown Movement Detected” (abbreviated as UMD), will be introduced in EVC, to indicate that a change in the location of the device has been detected.
  6. If the EVE node is not in the “UUD" or "UMD" state, EVC includes all the latest configuration in its reply to a config request from EVE.
  7. If the EVE node is in the “UUD” or "UMD" state, any config request from EVE will be answered with response code 403 - Forbidden. This is to protect any new sensitive images/credentials from being exposed to the compromised device. The response from EVC in such cases will carry an error code to indicate that there was an attestation failure, and hence partial configuration is being sent. EVE, upon receiving such error codes, MUST schedule re-attestation immediately.
  8. If EVE reboots with a software image different from the configured version, EVC should be able to detect this as quickly as possible and force attestation. To this effect a new token is introduced, called the "Integrity Token". This token is a random nonce that will be stored in EVE under an encrypted folder. The master key for the encrypted volume holding this folder will be sealed against the TPM PCRs. Therefore, if EVE reboots with different software, unsealing of this key will fail, and the new software will not be able to recover the Integrity Token. The token is sent during attestation requests, and if the attestation is successful, the provided token value is sealed against the PCR values. This is usually implemented by storing the token inside an encrypted folder, with the master key for the encrypted volume protected by the TPM Seal operation.
  9. Every configuration request from EVE will include this Integrity Token. If there is a token mismatch, HTTP code 403 - Forbidden will be sent to the device as the response, indicating that the device should re-attest.
  10. Frequency of attestation: the EVE node will be required to periodically attest itself via attestation requests. It is up to the EVC to schedule the frequency of attestation. Whenever EVC replies with Error 403, EVE MUST re-attest. EVC can trigger this either periodically (say every few hours) or when it sees any discrepancy in the Integrity-Token.
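
The controller-side rules in the list above can be sketched roughly as follows. This is a minimal model, not Adam's or any real EVC's implementation: the class and function names are hypothetical, the PCR quote is reduced to a bare byte string (a real quote is signed by the TPM and bound to a nonce), and the UNATTESTED state is an assumption added here to represent a node that has not yet attested.

```python
import hmac
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NodeState(Enum):
    UNATTESTED = "unattested"          # assumption: state before first attestation
    TRUSTED = "trusted"
    UUD = "Unknown Update Detected"
    UMD = "Unknown Movement Detected"


@dataclass
class NodeRecord:
    baseline_pcr: bytes
    baseline_location: str
    location_lock: bool = False
    integrity_token: bytes = b""
    state: NodeState = NodeState.UNATTESTED


def handle_attestation(node: NodeRecord, quoted_pcr: bytes, location: str,
                       token: bytes, expected_pcr: Optional[bytes] = None) -> bool:
    """Approve or reject an attestation request and update the node record."""
    if quoted_pcr != node.baseline_pcr:
        # A changed quote is acceptable only if it matches the value the
        # controller expects for the reported image version.
        if expected_pcr is None or quoted_pcr != expected_pcr:
            node.state = NodeState.UUD
            return False
        node.baseline_pcr = quoted_pcr  # accepted upgrade: move the baseline
    if node.location_lock and location != node.baseline_location:
        node.state = NodeState.UMD
        return False
    node.state = NodeState.TRUSTED
    node.integrity_token = token  # remember the token sent with this request
    return True


def handle_config_request(node: NodeRecord, token: bytes) -> int:
    """Return an HTTP-like status: 200 with config, or 403 to force re-attestation."""
    if node.state is not NodeState.TRUSTED:
        return 403
    if not hmac.compare_digest(token, node.integrity_token):
        return 403
    return 200
```

For example, a node that attests with its baseline values and token b"tok" gets 200 for config requests carrying b"tok" and 403 for any other token. After a failed attestation the node moves to UUD or UMD and every config request returns 403 until a later attestation succeeds.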

...

All the ECO volumes would be inside the encrypted folder (/persist/vault), with the master key for the encrypted volume sealed inside the TPM against a set of PCR values. Without access to this master key, the encrypted folders cannot be accessed. Essentially this means that the contents of the user-provisioned volumes are sealed against PCR values.
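
The self-locking behaviour can be illustrated with a toy software model of sealing. This is only a sketch: the ToyTPM class is hypothetical, it keeps the secret in ordinary memory, and it compares a single PCR value instead of evaluating a real TPM policy. It shows only the property that matters here, namely that unseal releases the key only when the PCR value matches the one recorded at seal time.

```python
import hashlib


class UnsealError(Exception):
    """Raised when the current PCR value does not match the sealing policy."""


class ToyTPM:
    """Toy model of a TPM with one PCR and a Seal/Unseal pair (not a real TPM API)."""

    def __init__(self):
        self.pcr = bytes(32)
        self._sealed = {}

    def reset(self):
        """Model a reboot: the PCR returns to all zeros, sealed blobs persist."""
        self.pcr = bytes(32)

    def extend(self, image: bytes):
        digest = hashlib.sha256(image).digest()
        self.pcr = hashlib.sha256(self.pcr + digest).digest()

    def seal(self, name: str, secret: bytes):
        # Record the PCR value at seal time as the release policy.
        self._sealed[name] = (self.pcr, secret)

    def unseal(self, name: str) -> bytes:
        policy_pcr, secret = self._sealed[name]
        if policy_pcr != self.pcr:
            raise UnsealError("PCR value differs from the value at seal time")
        return secret


def boot(tpm: ToyTPM, images):
    """Reboot the toy TPM and measure the given boot images in order."""
    tpm.reset()
    for image in images:
        tpm.extend(image)


def vault_status(tpm: ToyTPM) -> str:
    """Return "unlocked" if the vault key can be unsealed, else "locked"."""
    try:
        tpm.unseal("vault-key")
        return "unlocked"
    except UnsealError:
        return "locked"


GOOD = [b"firmware", b"bootloader", b"kernel"]      # hypothetical images
TAMPERED = [b"firmware", b"bootloader", b"rootkit"]

tpm = ToyTPM()
boot(tpm, GOOD)
tpm.seal("vault-key", b"master key for /persist/vault")
boot(tpm, GOOD)
print(vault_status(tpm))   # same boot chain
boot(tpm, TAMPERED)
print(vault_status(tpm))   # modified boot chain
```

The check needs no network access, which is what lets the vault stay locked against modified software even while the device is offline.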

...

To implement this, ECO volumes would be inside the encrypted folder (/persist/vault), with the master key for the encrypted volume sealed inside the TPM.

To provide some flexibility to the EVE node administrator when deploying ECOs on the EVE platform, there will be 3 security modes for app deployment:

...

  1. Device-steps starts client.go (the provisioning client) which will check and do the following:
    1. If EVC is not reachable, uses certificates from the last download
    2. If EVC is not reachable, uses UUID already present in /config/uuid
  2. Device-steps starts Vault Mgr
    1. Vault Mgr retrieves the master decryption key from the TPM with the Unseal operation. Unseal succeeds because the PCR values are unchanged since the last reboot.
    2. Unlocks /persist/vault with the master key for the encrypted volume.
    3. Publishes vault status (unlocked)
  3. Device-steps starts TPM mgr
    1. TPM manager retrieves the attestation certificate and publishes it to Zedagent
    2. Waits for Quote requests on pubsub channel from Zedagent
  4. Device-steps starts Zed Manager, Domain Mgr microservices - these services are responsible for launching the ECOs
  5. Device-steps starts Zedagent (and Zedagent starts 3 concurrent tasks: attest, info and config)
    1. Configuration task, since EVC is not reachable, uses the configuration from the last download, and publishes to Zed Mgr and Domain Mgr
    2. Since the Vault is unlocked, Domain Mgr can access Edge App Images, and hence launches the ECOs.

...

  1. Device-steps starts client.go (the provisioning client) which will check and do the following:
    1. If certificates from EVC are not yet fetched, fetches them
    2. Retrieves UUID from EVC
  2. Device-steps starts Vault Mgr
    1. Vault Mgr tries to retrieve the master decryption key from the TPM with the Unseal operation, and the Unseal operation fails since the PCR values have changed
    2. Publishes vault status (as "locked")
    3. Waits for the Integrity-Token and/or the master key for the encrypted volume from EVC - it will block here indefinitely until they arrive
  3. Device-steps starts TPM mgr
    1. TPM manager retrieves the attestation certificate and publishes it to Zedagent
    2. Waits for Quote requests on pubsub channel from Zedagent
  4. Device-steps starts Edge App microservices - e.g. Domain Mgr
  5. Device-steps starts Zedagent (and Zedagent starts 3 concurrent tasks: attest, info and config)
    1. Attest task requests for a nonce from EVC (to prepare PCR quote)
    2. Attest task sends the nonce back to TPM Mgr and waits for PCR quote
    3. Attest task, once notified about the quote readiness, creates a random nonce value for Integrity-Token
    4. Attest task sends { Quote, Location, Event Log, Integrity-Token, Image Version } to EVC
    5. Since the quote is different, EVC tries to compare the Event Log entries with its known hashes for the reported version. Since the PCR values are different, there will be no match unless the Admin has uploaded measurements for this new version. Assuming that the Admin is yet to upload them, the attestation request will fail here.
  6. EVC sends an error back to EVE, to retry attestation.
  7. In the meantime, the configuration task keeps requesting config from EVC (expected to fail until attestation goes through).
    1. EVC replies to configuration request with HTTP Error Code 403 - Forbidden. Indicates attestation failure (due to no or invalid Integrity Token)
    2. Config task communicates to Attest Task to re-trigger attestation
  8. Admin uploads measurements for the new image version
  9. The next attestation request is matched successfully against the uploaded measurements
  10. EVC approves the attestation request, responds with the master key for the encrypted volume (encrypted with a TPM key) for the EVE node, and also saves the Integrity-Token supplied in the attestation request
  11. Vault Mgr
    1. picks up the master key for the encrypted volume
    2. unlocks the vault
    3. stores the Integrity-Token approved by EVC in /persist/vault/itoken
    4. publishes the new vault status (unlocked) to all the microservices
  12. The next configuration request picks up the new Integrity-Token and sends it along with the request
  13. EVC verifies that the Integrity-Token matches its copy, approves the configuration request and responds with the latest configuration.
  14. Domain Mgr notices the new vault status (unlocked) from Vault Mgr and the latest configuration from Zedagent, and starts the Edge Apps
  15. Info task reports the encrypted master key for the encrypted volume back to EVC, for backup purposes

...

  1. Device-steps starts client.go (the provisioning client) which will check and do the following:
    1. First fetches EVC Certificates
    2. Registers the device certificate
    3. Retrieves UUID from EVC
  2. Device-steps starts Vault Mgr
    1. Vault Mgr creates new vault: /persist/vault
    2. Seals the master key for the encrypted volume into the TPM with PCR values
    3. Publishes vault status (unlocked)
    4. Waits for Integrity-Token from zedAgent
    5. Once received, copies the new Integrity-Token to /persist/vault/itoken file
  3. Device-steps starts TPM mgr
    1. TPM manager retrieves the attestation certificate and publishes it to Zedagent
    2. Waits for Quote requests on pubsub channel from Zedagent
  4. Device-steps starts Zedagent (and Zedagent starts 3 concurrent tasks: attest, info and config)
  5. Attest task,
    1. picks up the attestation certificate and publishes to EVC
    2. requests for a nonce from EVC (to prepare PCR quote)
    3. sends the nonce back to TPM Mgr and waits for PCR quote
    4. Once notified about the quote readiness, creates a random nonce value for Integrity-Token
    5. Sends { Quote, Location, Event Log, Integrity-Token, Image Version } to EVC
  6. EVC, if no master key for the encrypted volume is found for the EVE node, approves the PCR quote as the baseline
    1. If Location-Lock is enabled, the location is also approved as the baseline (provided the EVE device is not in a manufacturing account)
  7. In the meantime, the configuration task keeps requesting config from EVC (expected to fail until attestation goes through)
  8. Attest task, once the attestation request is approved, gets the master key for the encrypted volume and the Integrity-Token and passes them to Vault Mgr
  9. Config task now picks up the right Integrity-Token, the config request is accepted by EVC, and the full configuration is sent back
  10. Info task reports the encrypted master key for the encrypted volume back to EVC, for backup purposes
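
The nonce exchange and the attestation message in the steps above can be sketched as follows. This is an illustration under stated assumptions: an HMAC with a shared secret stands in for the signature a real TPM would produce with its attestation key, and the function names and dictionary field names are hypothetical, not taken from the EVE API specification.

```python
import hashlib
import hmac
import secrets

# Stand-in for the TPM attestation key. A real quote is signed with an
# asymmetric key whose certificate the controller already holds.
AK_SECRET = secrets.token_bytes(32)


def make_quote(pcr_value: bytes, nonce: bytes, key: bytes = AK_SECRET) -> dict:
    """Attester side: bind the PCR value to the controller-supplied nonce."""
    sig = hmac.new(key, pcr_value + nonce, hashlib.sha256).digest()
    return {"pcr": pcr_value, "nonce": nonce, "sig": sig}


def verify_quote(quote: dict, expected_nonce: bytes, key: bytes = AK_SECRET) -> bool:
    """Verifier side: check the signature and that the nonce is the one issued."""
    expected_sig = hmac.new(key, quote["pcr"] + quote["nonce"], hashlib.sha256).digest()
    return (hmac.compare_digest(quote["sig"], expected_sig)
            and hmac.compare_digest(quote["nonce"], expected_nonce))


def build_attest_request(quote: dict, location: str, event_log: list,
                         image_version: str) -> dict:
    """Assemble { Quote, Location, Event Log, Integrity-Token, Image Version }."""
    return {
        "quote": quote,
        "location": location,
        "event_log": event_log,
        "integrity_token": secrets.token_hex(16),  # fresh random token per request
        "image_version": image_version,
    }


# Controller issues a nonce; the attester answers with a quote bound to it.
nonce = secrets.token_bytes(16)
quote = make_quote(bytes(32), nonce)
request = build_attest_request(quote, "site-1", [], "hypothetical-version-tag")
print(verify_quote(request["quote"], nonce))
```

Binding the quote to a controller-issued nonce means a quote captured earlier cannot be replayed, since it would carry a stale nonce and fail verification.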

...

When EVE goes through a software upgrade, the PCR values will change depending on what is new in the upgraded version. However, during the upgrade window, an attacker might boot a different software as well, and try to present the compromised software as the upgraded version of EVE. Therefore, it is important that we measure the boot sequence after the upgrade and validate that the new software is indeed the new image that was pushed by the EVC. This is done through the following steps:

  1. EVE encrypts the master key for the encrypted volume using a TPM-based key, and sends it to EVC. EVE can send this information when the master key for the encrypted volume is created for the first time, and can also include it in the periodic info message. Sending it in the periodic info message also makes it possible to rotate the key if required.
  2. After the upgrade, the EVE software presents the new Event Log, along with the PCR Quote. As mentioned earlier, the PCR Extend operations are recorded in a table called the TPM Event Log Table. This is sometimes also called the Boot Log.
  3. First, EVC replays the extend operations recorded in the Event Log, and the final PCR value computed from the Event Log should match the PCR Quote. If they don’t match, then the Event Log can’t be trusted (say it was manipulated), and the EVE node is marked as UUD. Please note: the Event Log is stored in system memory, whereas the PCR Quote is generated by the TPM. So the PCR quote is the source of truth, pinned to the HRoT.
  4. Secondly, if the PCR Quote matches the Event Log, the entries of the old Event Log and the new Event Log are compared, and the differing entries are extracted
  5. EVC maintains a central database of all the supported EVE software images and their hash values, indexed by the EVE image version tag
  6. EVC also maintains a central database of all the BIOS firmware images, their signatures, and the certificate provided by the BIOS vendor for validating the signatures, indexed by a combined tag of BIOS version string + Manufacturer
  7. The differing values are compared against the acceptable values stored for the given image version. If the admin has enabled the “Location-Lock” feature, the geo-location reported by the device is checked as well. The geo-location reported by the device can be trusted if the software state (established through the PCR quote) can be trusted.
  8. If the PCR quote and the optional Geo-Location check pass, the encrypted master key for the encrypted volume is given back to the EVE node. EVE decrypts the master key and unlocks the vault with it. The master key for the encrypted volume is then sealed into the TPM, against the current PCR values.
  9. The user will have the option to override the attestation decision taken by EVC, through a configuration setting in the EVC portal for the EVE node
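
The Event Log checks in the steps above (replay, then comparison of the differing entries against known-good values) can be sketched as follows. This is a simplified model with hypothetical names: it assumes a single SHA-256 PCR, an Event Log that is just a list of measurement digests, and a quote reduced to the final PCR value. Verifying the quote's signature is out of scope here.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def replay(event_log) -> bytes:
    """Recompute the final PCR value by replaying every logged extend."""
    pcr = bytes(32)
    for digest in event_log:
        pcr = hashlib.sha256(pcr + digest).digest()
    return pcr


def evaluate_upgrade(quoted_pcr: bytes, old_log, new_log, known_good) -> str:
    """Return "TRUSTED" or "UUD" for a node reporting new_log after an upgrade."""
    # Step 1: the log is only trustworthy if it reproduces the quoted PCR.
    if replay(new_log) != quoted_pcr:
        return "UUD"
    # Step 2: extract entries that were not present before the upgrade.
    previous = set(old_log)
    changed = [d for d in new_log if d not in previous]
    # Step 3: every new entry must be an accepted value for the image version.
    if all(d in known_good for d in changed):
        return "TRUSTED"
    return "UUD"


# Hypothetical measurements before and after an upgrade of the root filesystem.
old_log = [h(b"fw-1.0"), h(b"kernel-5.10"), h(b"rootfs-1.0")]
new_log = [h(b"fw-1.0"), h(b"kernel-5.10"), h(b"rootfs-2.0")]
known_good = {h(b"rootfs-2.0")}  # values stored against the new image version

print(evaluate_upgrade(replay(new_log), old_log, new_log, known_good))
```

A log that was edited after the fact fails the first check, because it no longer reproduces the quoted value. A log that is genuine but contains a measurement with no accepted value fails the last check.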

...