Availability Zones and Host Aggregates in OpenStack Compute (Nova)

UPDATE 2014-06-18: There was a talk at the last OpenStack Summit in Atlanta on this topic, Divide and Conquer: Resource Segregation in the OpenStack Cloud.

Confusion around Host Aggregates and Availability Zones in Nova seems to be very common. In this post I’ll attempt to show how each is used. All information in this post is based on the way things work in the Grizzly version of Nova.

First, go ahead and forget everything you know about things called Availability Zones in other systems. They are not the same thing, and trying to map Nova’s concept of Availability Zones to what something else calls Availability Zones will only cause confusion.

The high level view is this: A host aggregate is a grouping of hosts with associated metadata.  A host can be in more than one host aggregate.  The concept of host aggregates is only exposed to cloud administrators.

A host aggregate may be exposed to users in the form of an availability zone. When you create a host aggregate, you have the option of providing an availability zone name. If specified, the host aggregate you have created is now available as an availability zone that can be requested.
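To make the relationship concrete, here is a minimal Python sketch of the idea. It is a toy model, not Nova's actual code: the `Aggregate` class and both helper functions are hypothetical names, and placing `devstack` in both aggregates is only there to illustrate that a host can belong to more than one.

```python
from dataclasses import dataclass, field


@dataclass
class Aggregate:
    """Toy stand-in for a host aggregate: a named group of hosts plus metadata."""
    name: str
    hosts: set = field(default_factory=set)
    metadata: dict = field(default_factory=dict)


def aggregates_for_host(aggregates, host):
    """Return the names of every aggregate that contains the given host."""
    return sorted(a.name for a in aggregates if host in a.hosts)


def visible_zones(aggregates):
    """Return the availability zone names that users could request."""
    return sorted({a.metadata["availability_zone"]
                   for a in aggregates
                   if "availability_zone" in a.metadata})


# Illustrative data only: one plain aggregate, one exposed as a zone.
aggregates = [
    Aggregate("test-aggregate1", hosts={"devstack"}),
    Aggregate("test-aggregate2", hosts={"devstack"},
              metadata={"availability_zone": "test-az"}),
]

print(aggregates_for_host(aggregates, "devstack"))
print(visible_zones(aggregates))
```

In this model, users only ever see the zone names; the aggregates themselves stay an administrator-level detail.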

Here is a tour of some commands.

Create a host aggregate:

$ nova aggregate-create test-aggregate1
| Id | Name            | Availability Zone | Hosts | Metadata |
| 1  | test-aggregate1 | None              |       |          |

Create a host aggregate that is exposed to users as an availability zone. (This is not creating a host aggregate within an availability zone! It is creating a host aggregate that is the availability zone!)

$ nova aggregate-create test-aggregate2 test-az
| Id | Name            | Availability Zone | Hosts | Metadata |
| 2  | test-aggregate2 | test-az           |       |          |

Add a host to a host aggregate, test-aggregate2. Since this host aggregate defines the availability zone test-az, adding a host to this aggregate makes it a part of the test-az availability zone.

$ nova aggregate-add-host 2 devstack
Aggregate 2 has been successfully updated.
| Id | Name            | Availability Zone | Hosts         | Metadata                           |
| 2  | test-aggregate2 | test-az           | [u'devstack'] | {u'availability_zone': u'test-az'} |

Note that the novaclient output shows the availability zone twice. The data model on the backend only stores the availability zone in the metadata. There is not a separate column for it. The API returns the availability zone separately from the general list of metadata, though, since it’s a special piece of metadata.
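The duplication is easier to see in a small sketch. This is only an illustration in Python of the idea described above, assuming a stored record that has nothing but a metadata dictionary; `aggregate_api_view` is a hypothetical helper, not a function from Nova.

```python
def aggregate_api_view(aggregate_id, name, hosts, metadata):
    """Build an API-style view that surfaces availability_zone on its own.

    Hypothetical helper: the stored record only carries `metadata`, and
    the top-level `availability_zone` field is derived from it.
    """
    return {
        "id": aggregate_id,
        "name": name,
        "availability_zone": metadata.get("availability_zone"),
        "hosts": list(hosts),
        "metadata": dict(metadata),
    }


view = aggregate_api_view(2, "test-aggregate2", ["devstack"],
                          {"availability_zone": "test-az"})

# The same value appears twice: once derived, once in the raw metadata.
print(view["availability_zone"])
print(view["metadata"])
```

An aggregate created without a zone name would simply report `None` for the derived field.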

Now that the test-az availability zone has been defined and contains one host, a user can boot an instance and request this availability zone.

$ nova boot --flavor 84 --image 64d985ba-2cfa-434d-b789-06eac141c260 \
> --availability-zone test-az testinstance
$ nova show testinstance
| Property                            | Value                                                          |
| status                              | BUILD                                                          |
| updated                             | 2013-05-21T19:46:06Z                                           |
| OS-EXT-STS:task_state               | spawning                                                       |
| OS-EXT-SRV-ATTR:host                | devstack                                                       |
| key_name                            | None                                                           |
| image                               | cirros-0.3.1-x86_64-uec (64d985ba-2cfa-434d-b789-06eac141c260) |
| private network                     |                                                                |
| hostId                              | f038bdf5ff35e90f0a47e08954938b16f731261da344e87ca7172d3b       |
| OS-EXT-STS:vm_state                 | building                                                       |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000002                                              |
| OS-EXT-SRV-ATTR:hypervisor_hostname | devstack                                                       |
| flavor                              | m1.micro (84)                                                  |
| id                                  | 107d332a-a351-451e-9cd8-aa251ce56006                           |
| security_groups                     | [{u'name': u'default'}]                                        |
| user_id                             | d0089a5a8f5440b587606bc9c5b2448d                               |
| name                                | testinstance                                                   |
| created                             | 2013-05-21T19:45:48Z                                           |
| tenant_id                           | 6c9cfd6c838d4c29b58049625efad798                               |
| OS-DCF:diskConfig                   | MANUAL                                                         |
| metadata                            | {}                                                             |
| accessIPv4                          |                                                                |
| accessIPv6                          |                                                                |
| progress                            | 0                                                              |
| OS-EXT-STS:power_state              | 0                                                              |
| OS-EXT-AZ:availability_zone         | test-az                                                        |
| config_drive                        |                                                                |
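Conceptually, requesting an availability zone narrows the set of hosts the scheduler may pick from. The Python sketch below is a simplified model of that narrowing, not Nova's scheduler code. The function names and the second host, `otherhost`, are assumptions made for illustration, and the sketch ignores details such as a default zone for hosts that are not in any zone.

```python
def hosts_in_zone(aggregates, zone):
    """Collect hosts from every aggregate whose metadata names the zone."""
    hosts = set()
    for agg in aggregates:
        if agg["metadata"].get("availability_zone") == zone:
            hosts.update(agg["hosts"])
    return hosts


def filter_by_zone(candidate_hosts, aggregates, requested_zone=None):
    """Keep only candidates in the requested zone; no request keeps them all."""
    if requested_zone is None:
        return sorted(candidate_hosts)
    allowed = hosts_in_zone(aggregates, requested_zone)
    return sorted(h for h in candidate_hosts if h in allowed)


# Illustrative data: "otherhost" is a made-up host outside test-az.
aggregates = [
    {"name": "test-aggregate1", "hosts": ["otherhost"], "metadata": {}},
    {"name": "test-aggregate2", "hosts": ["devstack"],
     "metadata": {"availability_zone": "test-az"}},
]
candidates = ["devstack", "otherhost"]

print(filter_by_zone(candidates, aggregates, "test-az"))
print(filter_by_zone(candidates, aggregates))
```

With `test-az` requested, only `devstack` remains as a candidate, which matches the host shown in the `nova show` output.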

All of the examples so far show how host aggregates provide an API-driven mechanism for cloud administrators to define availability zones. The other use case for host aggregates is tagging a group of hosts with a type of capability. When creating custom flavors, you can set a requirement for a capability. When a request is made to boot an instance of that flavor, the scheduler will only consider hosts in host aggregates tagged with this capability in their metadata.

We can add some metadata to the original host aggregate we created that was *not* also an availability zone, test-aggregate1.

$ nova aggregate-set-metadata 1 coolhardware=true
Aggregate 1 has been successfully updated.
| Id | Name            | Availability Zone | Hosts | Metadata                   |
| 1  | test-aggregate1 | None              | []    | {u'coolhardware': u'true'} |

A flavor can include a set of key/value pairs called extra_specs. Here’s an example of creating a flavor that will only run on hosts in an aggregate with the coolhardware=true metadata.

$ nova flavor-create --is-public true m1.coolhardware 100 2048 20 2
| ID  | Name            | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
| 100 | m1.coolhardware | 2048      | 20   | 0         |      | 2     | 1.0         | True      |
$ nova flavor-key 100 set coolhardware=true
$ nova flavor-show 100
| Property                   | Value                      |
| name                       | m1.coolhardware            |
| ram                        | 2048                       |
| OS-FLV-DISABLED:disabled   | False                      |
| vcpus                      | 2                          |
| extra_specs                | {u'coolhardware': u'true'} |
| swap                       |                            |
| os-flavor-access:is_public | True                       |
| rxtx_factor                | 1.0                        |
| OS-FLV-EXT-DATA:ephemeral  | 0                          |
| disk                       | 20                         |
| id                         | 100                        |
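The matching between flavor extra_specs and aggregate metadata can be sketched in a few lines of Python. This is a rough model of the idea rather than the real scheduler filter: `host_passes` is a hypothetical function, and `coolhost` is a made-up host added to test-aggregate1 so there is something to match.

```python
def host_passes(host, aggregates, extra_specs):
    """Return True if the host's aggregates satisfy every flavor extra spec."""
    # Merge the metadata of every aggregate that contains this host.
    merged = {}
    for agg in aggregates:
        if host in agg["hosts"]:
            for key, value in agg["metadata"].items():
                merged.setdefault(key, set()).add(value)
    # Every key/value pair required by the flavor must be present.
    return all(value in merged.get(key, set())
               for key, value in extra_specs.items())


# Illustrative data: "coolhost" is an assumed member of test-aggregate1.
aggregates = [
    {"name": "test-aggregate1", "hosts": ["coolhost"],
     "metadata": {"coolhardware": "true"}},
    {"name": "test-aggregate2", "hosts": ["devstack"],
     "metadata": {"availability_zone": "test-az"}},
]
cool_flavor_specs = {"coolhardware": "true"}

print(host_passes("coolhost", aggregates, cool_flavor_specs))
print(host_passes("devstack", aggregates, cool_flavor_specs))
print(host_passes("coolhost", aggregates, {}))
```

Note the last call: a flavor with no extra_specs passes on every host in this model, so tagging an aggregate does not by itself reserve its hosts for the tagged flavor.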

Hopefully this provides some useful information on what host aggregates and availability zones are, and how they are used.

25 thoughts on “Availability Zones and Host Aggregates in OpenStack Compute (Nova)”

    • Good question.

      Cells sit at a layer above host aggregates and availability zones. You can think of a basic nova deployment as a single cell. The deployment may have one or more host aggregates and/or availability zones defined.

      If you move into a deployment using multiple cells, you typically have one parent API cell, and one or more child compute cells. Each compute cell may contain one or more host aggregates and/or availability zones.

      • Note that what I said above is actually not correct. One of the major feature gaps with cells (current as of the Icehouse release, the latest release when writing this comment) is that cells do not support host aggregates, and thus also do not support availability zones at all.

      • Our NeCTAR node is one of multiple cells in the NeCTAR federation, and uses host aggregates to separate those nodes with GPGPUs from those without. So at least on NeCTAR’s OpenStack fork, the two can co-exist. Currently, the majority (all?) of the NeCTAR federation runs Juno.

  1. Nice article!
    I got one question following up your last example. How can we make sure that the hosts in test-aggregate1 are not preempted by instances using flavors without the ‘coolhardware’ extra spec?
    As a cloud operator, I’d probably prefer to schedule “normal” instances on “cool-hw” hosts only when all my other “not-so-cool-hw” hosts are full. Is there an easy way to achieve that?

    • Good question.

      You can create a host aggregate for not-so-cool-hw for all of your other hosts, and have all of the other flavors set to schedule there. However, that would not allow not-so-cool-hw instances to then get scheduled on cool-hw hosts when the not-so-cool-hw hosts are full. Doing that would require some additional scheduler customization, I think.

      • Thanks for the reply, it matches what I thought. An approach that would work is a dedicated weigher to prioritize the hosts accordingly…

  2. Hi Russell,
    Is there API access to host aggregates for non-admin users? How would non-admin users be able to discover AZs before they can launch VMs there? thx

    • All API policy is configurable. However, the default policy does allow discovery of availability zones. You just can’t see the host aggregate details that define the availability zone.

      • I would appreciate it if you could point to the specific API that allows you to discover availability zones. The only ones I see are the Extended Availability Zone APIs, which basically allow you to see the AZ of an instance if you know the instance ID. They are not about discovering which AZs are available to launch my instance in.


  3. Hi Russell
    How do availability zones work for cinder? Do host aggregates allow you to specify whether it’s an AZ for nova or cinder?

  4. You say “This is not creating a host aggregate within an availability zone! It is creating a host aggregate that is the availability zone!” However, nova will happily create multiple aggregates with the same availability zone name. This seems like a mismatch since the end-user can only specify an availability zone when creating an instance, and there could be multiple aggregates (with different hosts) mapping to that zone. It seems like maybe nova should prevent the creation of an aggregate with the same availability zone name as an existing zone.

  5. It seems like there are a few holes open in the aggregate concept:

    1) Would it make sense to allow a host to be in multiple availability zones simultaneously (and allow an instance to specify multiple availability zones)? I could have a host that matches both “has_ssd” and “has_10g_network”, and maybe my specific instance (not necessarily my flavor) wants both of those attributes while others might only care about one or the other.
    2) What should happen if I try to remove all hosts from an aggregate that has running instances in it? Currently it’ll happily do this. (What if I later try to migrate/evacuate and there are no hosts in the aggregate?)
    3) What should happen if I try to delete an aggregate but there is an instance still in the corresponding availability zone? Currently I can delete it but the instance still shows the availability zone even though it no longer exists.
    4) What should happen if I try to evacuate/migrate an instance currently in an availability zone to a host not in that availability zone?
    5) What should happen if I try to evacuate/migrate an instance to a host that is not in an aggregate that matches the flavor metadata?

  6. Hi,

    I set the same metadata for a host aggregate and a flavor, but my ComputeCapabilityFilter is returning 0 hosts. The metadata of the host aggregate and the extra specs for the flavor are exactly the same.

  7. Pingback: Openstack Havana Dashboard测试和使用 » 陈沙克日志

  8. Pingback: Collocating Ceph volumes and instances in a multi-datacenter setup | Loïc Dachary

    • 1) In the first use case, creating host aggregates with an availability zone name applies to the logical isolation and redundancy of host groups to provide fault tolerance. For example, each host aggregate/AZ would have its own network equipment and power supply.
      2) In the second use case, a host aggregate (a group of hosts) is set with specific metadata (a capability), and a flavor is then created or updated with the same key/value pair as an extra spec. Later, when a user launches an instance with that flavor, the scheduler will select hosts only from the matching host aggregate.

  9. Hi, I just tried this but it made the nova.scheduler unable to find any hosts to put the VMs on? It kept saying unable to find a suitable host.

  10. Pingback: تجربیات ابری – قسمت اول | My Mostly Linux Stuff

  11. Pingback: Une zone OpenStack dédiée au nested KVM - InCloudUs - if [ "$UNIX" == "true" ]; then echo "Youpi o/"; fi

  12. Extra_specs for the flavor should be created with the aggregate_instance_extra_specs prefix. Example: aggregate_instance_extra_specs:coolhardware=true
