Blog

Introduction

Both Docker Compose and Open Horizon are tools for managing the deployment lifecycle of containerized applications, but there are significant differences.  This article will attempt to explain service software lifecycle management, and then compare and contrast the approaches used by these two tools.  This article assumes familiarity with containerized software, Dockerfiles, Docker Compose files, and Open Horizon components.

Service software lifecycle management

For the purposes of this article, we will decompose the concept of service deployment into the following lifecycle: Publishing, Execution, Operation (Updating, Monitoring, Restarting), and Removal.  Publishing is about specifying a container image in a registry (e.g. DockerHub, quay.io, IBM Cloud Container Registry), pulling the image, and storing it at the deployment location.  Execution covers optionally deploying secrets and any other configuration and then starting the service within a container engine.  Operation involves inspecting details about a running container, updating the image when new versions become available, and restarting a container if the host restarts or the container terminates unexpectedly.  Removal includes stopping a running container and optionally removing any resources it may be using from the deployment location.

Docker Compose and Open Horizon may then be compared using the above lifecycle stages in a table:

| Stage | Docker Compose | Open Horizon |
|---|---|---|
| Publishing | Manually run `docker-compose pull` on the destination. Not typically needed, since this is included in the Execution step below. | Automatically triggered on the destination by the Agent when an agreement is formed. |
| Execution | Manually run `docker-compose up` on the destination. | Automatically run on the destination by the Agent after publishing completes successfully. |
| Operation | Manually run `docker-compose ps` on the destination. Does not otherwise monitor or alert on runtime failures. Manually run `docker-compose up` to restart if a new service version is published. | The Agent automatically monitors running services and implements restarts and rollbacks as needed. If a new service version is published, the agreement is terminated and re-negotiated. |
| Removal | Manually run `docker-compose down` on the destination. | If an agreement is terminated, the Agent will automatically halt and remove running services. |

In summary, Docker Compose is a tool for an operator to manually administer the service software lifecycle directly on deployment hosts.  Open Horizon is a tool for an operator to remotely specify the conditions under which the service software lifecycle should be autonomously administered on each deployment location by the Open Horizon Agent.
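
To make the manual side of this comparison concrete, here is a minimal Python sketch that maps each lifecycle stage to the `docker-compose` subcommand named in the table.  The `COMPOSE_SUBCOMMANDS` table and `compose_command` helper are hypothetical names chosen for illustration; only the subcommands themselves come from the comparison.  The sketch only builds the command an operator would type and does not execute anything.

```python
from typing import List

# Lifecycle stage -> docker-compose subcommand, as listed in the comparison table.
COMPOSE_SUBCOMMANDS = {
    "publishing": "pull",
    "execution": "up",
    "operation": "ps",
    "removal": "down",
}


def compose_command(stage: str) -> List[str]:
    """Return the command an operator would run by hand for a lifecycle stage."""
    try:
        subcommand = COMPOSE_SUBCOMMANDS[stage.lower()]
    except KeyError:
        raise ValueError(f"unknown lifecycle stage: {stage!r}") from None
    return ["docker-compose", subcommand]


if __name__ == "__main__":
    # Print the manual command for every stage, in lifecycle order.
    for stage in COMPOSE_SUBCOMMANDS:
        print(stage, "->", " ".join(compose_command(stage)))
```

With Open Horizon there is no equivalent per-host script to write, because the Agent performs these steps on its own once an agreement is formed.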

Operating environment

Docker Compose is designed for Linux hosts and requires the Docker Engine runtime.  It is not compatible with other container runtimes.  It can operate on macOS and Windows hosts using Docker Desktop.  It cannot be used to deploy containers to a Kubernetes cluster.

Open Horizon is designed for both Kubernetes clusters and Linux hosts, and is compatible with both Docker and podman runtimes.  It can deploy to Linux and macOS hosts using the Device Agent and to Kubernetes clusters using the Cluster Agent.  It can both deploy container images to, and bi-directionally synchronize machine learning assets with, the destination device or cluster.

Dependency management at load-time versus at run-time

Concepts to understand:

Top-level service vs dependency (or required service): A top-level service is the functionality that you intend to deploy.  A dependency or required service is one that is only used because your intended top-level service needs it.

Stateful vs stateless service: A service is stateful when it retains or persists information from one invocation to the next.  A service is stateless when each invocation is independently sent all of the information it requires in each request.

Singleton vs Multiple (sharable property in Service Definition file): "The value of this field determines how many instances of the service’s containers will be running on a node when the service is deployed more than once to the same node."  You might use the `singleton` value if your environment is resource-constrained and you cannot run more than one instance of a service.  Another reason to use `singleton` is if a service is stateful.  If the service is stateless and you have the available resources to run more than one copy, you should choose `multiple` instead of `singleton`.
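
The guidance above can be written as a small decision rule.  This is only a sketch of the reasoning in this article, not part of any Open Horizon tooling: `choose_sharable` is a hypothetical helper, and it considers only the two values discussed here.

```python
def choose_sharable(stateful: bool, resource_constrained: bool) -> str:
    """Suggest a value for the `sharable` property of a Service Definition.

    Follows the rule of thumb from the text: prefer `singleton` when the
    service keeps state or the node cannot afford a second instance,
    otherwise prefer `multiple`.
    """
    if stateful or resource_constrained:
        return "singleton"
    return "multiple"


if __name__ == "__main__":
    # A stateless service on a node with spare capacity.
    print(choose_sharable(stateful=False, resource_constrained=False))
    # A stateful service, regardless of available resources.
    print(choose_sharable(stateful=True, resource_constrained=False))
```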

Service Definition and Deployment Policies vs Node (Deployment) Pattern: Use the former when you have one or more top-level services that have independent lifecycles.  This is the recommended approach.  Use the latter only when you have a single application composed of multiple top-level services that have interdependent lifecycles.

Docker Compose combines the Publishing and Execution phases of the service software lifecycle, collectively described as load-time.  It has no awareness of which services are top-level services because it doesn't describe the dependency relationship between services.  It does explicitly determine the sequence in which services should be started based on the order in which they appear in the Docker Compose file.  This means that two top-level services could be started before the first top-level service's dependencies if that is the sequence in which they are described in the file.

Open Horizon allows a developer to describe a top-level service and its dependencies in a standalone Service Definition file.  A deployer uses policy (or a pattern) to specify which services should be running on a deployment host.  It is not necessary to explicitly publish dependent services, since Open Horizon will do this autonomously.  Top-level services are started last, after all of their dependencies have started.
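
The "dependencies first, top-level service last" rule is a topological ordering of the dependency graph.  The sketch below illustrates the idea with Python's standard `graphlib` module; it is not how the Open Horizon Agent is implemented, and the service names and the `start_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set


def start_order(required_services: Dict[str, Set[str]]) -> List[str]:
    """Order services so that each one starts after everything it requires.

    `required_services` maps a service name to the set of service names it
    depends on. Services with no dependencies map to an empty set.
    """
    return list(TopologicalSorter(required_services).static_order())


if __name__ == "__main__":
    # Hypothetical example: "web-ui" is the top-level service.
    graph = {
        "web-ui": {"api"},
        "api": {"database"},
        "database": set(),
    }
    print(start_order(graph))
```

Because the ordering is derived from the declared dependencies rather than from the position of each service in a file, the top-level service always ends up last.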

A small group of companies (HP, IBM, Ingadi Flower Farm, Seeed, SoftServe Inc.) came together late in 2020 to demonstrate how edge computing solutions could improve agriculture.  One of the published core beliefs of the group is "with sufficient sensors and data, you should be able to apply the optimum resources to grow crops to their maximum yield".  This Special Interest Group (SIG) is hosted by the Open Horizon open-source software project that lives within LF Edge, part of the Linux Foundation.  All meetings are open to the public, and meeting recordings and presentation materials are published on the SIG's wiki page.

The initial effort of the group will be to grow a variety of hot-weather crops in two hoop houses on a plot at the Ingadi Flower Farm in Chelsea, Alabama.  Hoop houses are a row of curved support hoops covered with heavy plastic that can extend the growing season of crops in a location by weeks or months.  As an experiment, the same crops will be grown in the same layout in adjacent 10' x 20' hoop houses.  Both houses will contain identical sensors to collect soil and air data at one-minute intervals, and will be configured to send alerts when the soil moisture exceeds pre-configured thresholds.  One will be watered based on the farmer's best judgement, and the other will be watered based on the collected data.  When the crops are harvested, the yields will be compared to see which technique or approach was superior.

We intend to publish a Bill of Materials (BOM) for the hardware used, software developed, and data collected so that anyone can replicate the results should they wish to do so.  Additionally, the data can be analyzed to see what additional learnings can be derived as well as what changes should be made to data collection for future growing plans.  We also plan to publish detailed blog posts at the IBM Developer Blog for Open Horizon that will give step-by-step instructions on how to set up your own Smart Agriculture solution based on what we learn.

Please watch this space weekly for updates, and post any comments and questions below.

Introduction

This blog post is written for software developers who understand basic concepts about containerized applications (hereinafter referred to as containers), and are new to using Open Horizon to deliver edge computing services.

On a host machine, some tasks can only be performed by an account with root access. This means that the account you are currently logged in as is either the root account itself (generally not a good idea), or your account has acquired root-level privileges through `sudo`. Likewise, a container can run in privileged mode, meaning that it runs as the root user or otherwise has root-level access on the host computer, but containers generally do not need this.

In Open Horizon and all commercial distributions based on it, you have the ability to specify that a service should be deployed with privileged process execution enabled. By default, it is disabled. You must explicitly enable it in the respective Service Definition file for each container that needs to run in this mode. And further, any node on which you want to deploy that service must also explicitly allow privileged mode containers.

The node policy file must explicitly enable privileged mode because the node owner gets a say in what runs on the node. This is the whole purpose of the node policy: to give the node owner agency in the decision about what runs there.  If the service definition or one of its dependencies requires privileged mode, the node policy must also allow privileged mode, or else the services will not be deployed to the node.
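
The two-sided opt-in can be expressed as a short check.  This is a sketch of the rule as described above, not the Agent's actual logic: the dictionary shape (`privileged`, `requires`), the `catalog` lookup, and both function names are assumptions made for illustration and do not mirror the real Service Definition or node policy schemas.

```python
from typing import List, Mapping


def requires_privileged(service: Mapping, catalog: Mapping[str, Mapping]) -> bool:
    """True if the service or any of its (transitive) dependencies asks for privileged mode."""
    seen = set()
    stack: List[Mapping] = [service]
    while stack:
        current = stack.pop()
        if current.get("privileged", False):
            return True
        for name in current.get("requires", []):
            if name not in seen:  # guard against revisiting shared dependencies
                seen.add(name)
                stack.append(catalog[name])
    return False


def can_deploy(service: Mapping, catalog: Mapping[str, Mapping],
               node_allows_privileged: bool = False) -> bool:
    """A service may be deployed unless it needs privileged mode and the node forbids it."""
    return node_allows_privileged or not requires_privileged(service, catalog)


if __name__ == "__main__":
    # Hypothetical services: the top-level one depends on a privileged driver.
    catalog = {"mic-driver": {"privileged": True, "requires": []}}
    top_level = {"requires": ["mic-driver"]}
    print(can_deploy(top_level, catalog, node_allows_privileged=False))
    print(can_deploy(top_level, catalog, node_allows_privileged=True))
```

Note that both defaults are "off", which matches the behavior described here: nothing runs privileged unless the service author and the node owner each say so.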

How does privileged process execution impact security?

A major security principle the Open Horizon project follows is: "All parties are untrusted by default." As a result, Node Policies and Service Definitions do not allow privileged process execution by default. You must explicitly enable it in both the node and the service if you want to deploy and run a service that requires it.

However, a privileged container is a powerful and potentially dangerous tool and should not be used without considering alternatives. If you run a container with privileged access, it can access all resources on the host system as the root user. If a privileged container can be hacked by a third party, that third party could then gain access to all resources on the host computer.

Therefore, try not to use privileged containers. If you must, use the following guidelines to ensure that privileged containers:

  • are thoroughly and continuously vetted for vulnerabilities
  • have a narrow scope for their duties, meaning they should only perform a specific task
  • only mount necessary host directories and devices (Specified in the Service Definition file. See the example at the bottom of this blog post.)

Why do we use it (what is it good for)?

With all of the potential drawbacks, what situations require a container to run as privileged?

  1. Does your code require direct access to host hardware? For example, you may need to use a microphone in order to record and analyze sound waves. You might need to use the host GPU for model (re)training. You could potentially need to access a video stream directly from an attached camera.  In these situations, you should first try to bind mount the device to see if that approach is sufficient.  Another approach is to use `cap-add` to add only the kernel capabilities that you specifically need.  By way of contrast, privileged mode adds all of the kernel's `CAP_*` capabilities.
  2. Does your service need to spawn other containers? This is a common task in CI/CD pipelines. You may be able to spawn containers by bind mounting the Docker daemon socket (normally /var/run/docker.sock) in systems that use Docker. In systems that use podman, which has no daemon process, you may need elevated privileges. In any case, if your container can spawn other containers without restrictions, then it can effectively run any code as root on the host.
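
To illustrate the "least privilege first" order suggested in the first item, the sketch below builds a plain `docker run` command line.  It shows the Docker CLI flags `--device`, `--cap-add`, and `--privileged` rather than Open Horizon's Service Definition fields, and it assumes that exposing the device with `--device` is an acceptable stand-in for the bind mount described above.  The `run_args` helper and the image name are hypothetical.

```python
from typing import List, Sequence


def run_args(image: str,
             devices: Sequence[str] = (),
             capabilities: Sequence[str] = (),
             privileged: bool = False) -> List[str]:
    """Build a `docker run` argument list, granting only what is asked for.

    If `privileged` is set, the narrower options are redundant, because
    privileged mode already grants device access and all capabilities.
    """
    args = ["docker", "run"]
    if privileged:
        args.append("--privileged")
    else:
        for device in devices:
            args += ["--device", device]
        for capability in capabilities:
            args += ["--cap-add", capability]
    args.append(image)
    return args


if __name__ == "__main__":
    # Preferred: expose only the sound device to a hypothetical image.
    print(" ".join(run_args("example/audio-recorder", devices=["/dev/snd"])))
    # Last resort: full privileged mode.
    print(" ".join(run_args("example/audio-recorder", privileged=True)))
```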

When should you not use it?

If you need to access a file on the host with elevated permissions, try mounting the file into the container and ensuring that the container is running as a user that belongs to the same group as the file owner, with group read privileges enabled. Do not run the container as privileged just to read from or write to a file.
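
Before reaching for privileged mode, you can check whether the group-based approach would work for a given file.  The following is a minimal sketch using only the Python standard library; `group_can_read` is a hypothetical helper, and it assumes a POSIX host where you already know which group IDs the container's user belongs to.

```python
import os
import stat
from typing import Iterable


def group_can_read(path: str, container_gids: Iterable[int]) -> bool:
    """True if the file's group is one of `container_gids` and has read permission."""
    info = os.stat(path)
    in_group = info.st_gid in set(container_gids)
    group_readable = bool(info.st_mode & stat.S_IRGRP)
    return in_group and group_readable


if __name__ == "__main__":
    # Check this script's own file against the current process's groups.
    print(group_can_read(__file__, os.getgroups() + [os.getgid()]))
```

If the check fails, adjusting the file's group or group permission bits is a much smaller change than granting the container root-level access to the whole host.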

Can you show me an example?

The `audio2text` example service requires access to the microphone to record audio, and thus needs privileged access to the host hardware. View the Service Definition's deployment section to see how it is enabled.