Motivation

To support network metadata we have to rely on services installed in the user's VM. The most common tool for instance initialization is cloud-init, and a number of images support it out of the box: https://docs.openstack.org/image-guide/obtain-images.html. Cloud-init supports several datasources (one per cloud platform), and we can use the OpenStack one. It is open source and documented.

Cloud-init OpenStack DataSource requirements

Before it starts talking to the OpenStack datasource, cloud-init runs several checks on the environment:

  • Maybe OpenStack if
    • non-x86 cpu architecture: because DMI data is buggy on some arches
  • Is OpenStack if x86 architecture and ANY of the following
    • /proc/1/environ: under Nova-lxd it contains product_name=OpenStack Nova
    • DMI product_name: either OpenStack Nova or OpenStack Compute
    • DMI chassis_asset_tag is OpenTelekomCloud, SAP CCloud VM, OpenStack Nova (since 19.2) or OpenStack Compute (since 19.2)

We can set product_name in the SMBIOS data of our VMs so that cloud-init detects OpenStack and starts querying the metadata endpoints.
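The x86 DMI checks listed above can be sketched in Python as follows. This is an illustrative reimplementation, not cloud-init's own code: the function names are hypothetical, the sysfs path /sys/class/dmi/id is the usual Linux location for DMI data, and the /proc/1/environ and non-x86 cases are deliberately left out.

```python
from pathlib import Path

# Values taken from the list above (illustrative, not cloud-init's source).
PRODUCT_NAMES = {"OpenStack Nova", "OpenStack Compute"}
CHASSIS_ASSET_TAGS = {
    "OpenTelekomCloud",
    "SAP CCloud VM",
    "OpenStack Nova",
    "OpenStack Compute",
}

# Usual Linux sysfs directory that exposes DMI/SMBIOS fields as files.
DMI_DIR = Path("/sys/class/dmi/id")


def read_dmi(field: str, dmi_dir: Path = DMI_DIR) -> str:
    """Return the stripped content of a DMI field, or "" if it cannot be read."""
    try:
        return (dmi_dir / field).read_text().strip()
    except OSError:
        return ""


def looks_like_openstack(product_name: str, chassis_asset_tag: str) -> bool:
    """True if ANY of the DMI conditions from the list above holds."""
    return product_name in PRODUCT_NAMES or chassis_asset_tag in CHASSIS_ASSET_TAGS


# Example: a VM whose SMBIOS product_name was set to "OpenStack Nova".
print(looks_like_openstack("OpenStack Nova", ""))  # True
```

Inside a guest, looks_like_openstack(read_dmi("product_name"), read_dmi("chassis_asset_tag")) gives a quick way to confirm that the SMBIOS setting reached the VM.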


We should also take into account that cloud-init probes datasources in a fixed order. By default NoCloud (the drive we use now) takes priority over OpenStack (the order is here).

So, with both datasources active, NoCloud is selected:

root@1a831fa7-c50b-4693-a16e-fb8171f1b69e:~# grep Datasource /var/log/cloud-init-output.log
Cloud-init v. 20.4-0ubuntu1~20.10.1 finished at Tue, 09 Mar 2021 07:10:44 +0000. Datasource DataSourceNoCloud [seed=/dev/sr0][dsmode=net].  Up 22.97 seconds

With the NoCloud drive manually removed:

ubuntu@niceshamir:~$ grep Datasource /var/log/cloud-init-output.log
Cloud-init v. 20.4-0ubuntu1~20.10.1 finished at Tue, 09 Mar 2021 07:25:26 +0000. Datasource DataSourceOpenStack [net,ver=2].  Up 23.16 seconds
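To repeat this check across several test VMs, the selected datasource can be extracted from the log text with a small helper. This is a minimal sketch: the function name is hypothetical and the pattern assumes the "Datasource DataSource<Name>" wording seen in the log lines above.

```python
import re
from typing import Optional

# Matches the "Datasource DataSource<Name>" part of the final cloud-init log line.
DATASOURCE_RE = re.compile(r"Datasource (DataSource\w+)")


def datasource_from_log(text: str) -> Optional[str]:
    """Return the last datasource name reported in the log text, or None."""
    matches = DATASOURCE_RE.findall(text)
    return matches[-1] if matches else None


sample = (
    "Cloud-init v. 20.4-0ubuntu1~20.10.1 finished at Tue, 09 Mar 2021 07:25:26 +0000. "
    "Datasource DataSourceOpenStack [net,ver=2].  Up 23.16 seconds"
)
print(datasource_from_log(sample))  # DataSourceOpenStack
```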

Cloud-init OpenStack DataSource endpoints

The OpenStack metadata service exposes several endpoints (https://docs.openstack.org/nova/latest/user/metadata.html#metadata-openstack-format):

  • http://169.254.169.254/openstack/{version}/meta_data.json - contains (among other fields) public_keys, hostname, devices (disk, nic)
  • http://169.254.169.254/openstack/{version}/network_data.json - contains information about networks, DNS services and links (which will be configured inside the VM)
  • http://169.254.169.254/openstack/{version}/user_data - contains the script to run inside the VM
  • http://169.254.169.254/openstack/{version}/vendor_data2.json - vendor data that is independent of individual VM deployments (we can omit it for now)
  • http://169.254.169.254/openstack - lists the available versions of the OpenStack metadata

These endpoints must be reachable from inside the VM and must return data specific to the VM that makes the request.
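A quick way to exercise these endpoints from inside a VM is sketched below. The helper names are hypothetical, and the default version "latest" is an assumption about what our service will accept; any version string listed by http://169.254.169.254/openstack can be passed instead.

```python
import json
import urllib.request

METADATA_BASE = "http://169.254.169.254/openstack"
ENDPOINTS = ("meta_data.json", "network_data.json", "user_data", "vendor_data2.json")


def endpoint_url(name: str, version: str = "latest", base: str = METADATA_BASE) -> str:
    """Build the URL of one metadata endpoint for the given version."""
    return f"{base}/{version}/{name}"


def fetch(name: str, version: str = "latest", timeout: float = 5.0) -> bytes:
    """Fetch the raw body of one endpoint; only works from inside a VM."""
    with urllib.request.urlopen(endpoint_url(name, version), timeout=timeout) as resp:
        return resp.read()


def describe_vm() -> dict:
    """Return the hostname and public key names this VM sees in meta_data.json."""
    meta = json.loads(fetch("meta_data.json"))
    return {
        "hostname": meta.get("hostname"),
        "public_keys": sorted(meta.get("public_keys") or {}),
    }


# URL construction can be checked without network access.
print(endpoint_url("network_data.json"))
```

Calling describe_vm() from two different VMs should return different hostnames, which is a simple check that the service answers per VM.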


Cloud-init EC2 DataSource

We can also try to implement the EC2-compatible datasource described here: https://docs.openstack.org/nova/latest/user/metadata.html#ec2-compatible-metadata. It will be used when the image has no OpenStack datasource and is forced to skip the environment check (the CirrOS image, for example).


3 Comments

  1. Why would we want to use the OpenStack schema? We don't do most of what is in its meta-data, such as having EVE (or Nova) generate and provide an SSH public key.

    Do all of the cloud-init clients (I understand there are different versions used in different Linux distros) support the same set of datasources and associated schemas? Or are some more commonly supported?

    I realize that what we do now with NoCloud is mostly user-data (with only two attributes in meta-data: instance-id and local-hostname), but we need to understand whether providing the OpenStack or EC2 API endpoints means that we must provide more meta-data attributes for the clients to work correctly.

    1. The choice of scheme is open for discussion. I chose and proposed it because a large number of images are built for it. Of course, images that have OpenStack in their name imply that cloud-init is installed along with the set of datasources it supports. However, I suspect that they are tested against this particular platform.

      Without config modifications, cloud-init supports the whole set of datasources. But of course there are other options for supporting the metadata service. For example, CirrOS comes with EC2-only support for obtaining public keys.

      Unused fields can be omitted and defined in further development (network_data.json looks promising for defining logic inside the VM).

      1. The functionality we are currently missing, where network_data could be useful, is when there are multiple network interfaces. Today we set those up in /etc/network/interfaces.d/, which means a different image for 1 vs 2 vs 3 interfaces. Can we do that with network_data.json? If so, adding that makes sense.

        But we need to make sure that the various flavors which work now with cloud-init (Ubuntu, CentOS, etc.) do not get upset by the subset of meta-data that we will provide.