
fping Support in OpenStack

OpenStack is very good at launching virtual machines – that’s its purpose, isn’t it? But usually you want to monitor the state of your machines somehow, and there are several reasonable ways to do it.

  1. You can test the daemons running on the machine, e.g., check open ports or poll known services. Of course, this approach means that you know exactly which services should be running – and this is the most precise way to test system health.
  2. You can ask the hypervisor if the machine is OK. That’s a very rough check, since the hypervisor will likely report that the VM is active even while its operating system kernel is encountering problems.
  3. A compromise may be pinging the machine. It’s a general solution, since most VMs respond to ping normally. Sure, a VM can ignore ping, or its daemons can have problems while the host still responds to ping, but this solution is far easier to implement than checking each machine according to an individual plan.

Let’s concentrate on the last two approaches. I would like to launch a machine and check it.

[root@a001 ~]# nova image-list
+--------------------------------------+--------------+--------+--------+
| ID                                   | Name         | Status | Server |
+--------------------------------------+--------------+--------+--------+
| 960dc70a-3e0e-496a-b8da-0e9cd91d3a44 | selenium-img | ACTIVE |        |
+--------------------------------------+--------------+--------+--------+
[root@a001 ~]# nova boot --flavor m1.small --image 960dc70a-3e0e-496a-b8da-0e9cd91d3a44 selenium-0
...
[root@a001 ~]# nova list
+--------------------------------------+-------------------+--------+-------------------------+
| ID                                   | Name              | Status | Networks                |
+--------------------------------------+-------------------+--------+-------------------------+
| a9060a07-d32a-4dcf-8387-1c7d69f897dc | selenium-0        | ACTIVE | selenium-net=10.109.0.4 |
+--------------------------------------+-------------------+--------+-------------------------+
[root@a001 ~]# fping 10.109.0.4
10.109.0.4 is unreachable

As you can see, the VM status is reported as active, but the machine has not actually booted. Moreover, consider a damaged image (I use a text file for this purpose):

[root@a001 ~]# glance index 
ID                                   Name                           Disk Format          Container Format     Size          
------------------------------------ ------------------------------ -------------------- -------------------- --------------
7d8007fe-a63c-4d02-8edf-a6cc19fa1d73 text                           qcow2                ovf                           17043
[root@a001 ~]# nova boot --flavor m1.small --image 7d8007fe-a63c-4d02-8edf-a6cc19fa1d73 text-0
[root@a001 ~]# nova list
+--------------------------------------+-------------------+--------+-------------------------+
| ID                                   | Name              | Status | Networks                |
+--------------------------------------+-------------------+--------+-------------------------+
| a9060a07-d32a-4dcf-8387-1c7d69f897dc | selenium-0        | ACTIVE | selenium-net=10.109.0.4 |
| 461e73e4-7f88-4c8f-bb1f-49df9ec18d84 | text-0            | ACTIVE | selenium-net=10.109.0.5 |
+--------------------------------------+-------------------+--------+-------------------------+

Nova bravely reports that the new instance is active, but it obviously is not functioning: a text file is not a disk image with an operating system. And fping reveals that the VM is ill:

[root@a001 ~]# fping 10.109.0.5
10.109.0.5 is unreachable

We can extend the Nova API by adding this fping feature. Nova will run fping for the requested instances and report which ones seem to be truly alive. I have developed this extension, and it was accepted into Grizzly on November 16, 2012 (https://github.com/openstack/nova/commit/a220aa15b056914df1b9debc95322d01a0e408e8).

The fping API is simple and straightforward. We can ask it to check all instances or a single one. In fact, there are two API calls.

  1. GET /os-fping/<uuid> – check a single instance.
  2. GET /os-fping?[all_tenants=1][&include=uuid[,uuid...]][&exclude=uuid[,uuid...]] – check all VMs in the current project. If all_tenants is requested, data for all projects is returned (by default, this option is allowed only to admins). include and exclude are parameters specifying VM masks. These parameters are mutually exclusive, and exclude is ignored if include is specified. If the full VM list is VM_all, then when an include list is set, only VM_all ∩ VM_to_include (set intersection) will be tested – thus we can check several instances in a single API call. If an exclude list is provided, VM_all − VM_to_exclude (set difference) will be polled – thus we can skip testing instances that are not supposed to respond to ping (see the sketch after this list).
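
To make the include/exclude semantics concrete, here is a minimal Python sketch of the selection logic described above (the function name and set-based signature are illustrative, not the actual extension code):

def select_instances(vm_all, include=None, exclude=None):
    """Return the set of instance UUIDs to poll (illustrative only)."""
    if include is not None:
        # include takes priority: poll only the listed VMs that exist
        return vm_all & include
    if exclude is not None:
        # skip VMs that are not supposed to respond to ping
        return vm_all - exclude
    return vm_all

# e.g. select_instances(set(["a", "b"]), include=set(["b", "z"]))
# returns set(["b"]), while exclude=set(["b"]) would return set(["a"])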

fping increases the I/O load on the nova-api node, so, by default, the fping API is limited to 12 calls per hour (regardless of whether a single instance or several instances are polled per call).

I have added nova fping support to python-novaclient (https://github.com/openstack/python-novaclient/commit/ff69e4d3830f463afa48ca432600224f29a2c238), making it easy to write a daemon in Python that will periodically check instance states and send notifications about detected problems. This daemon is available in Grid Dynamics Altai Private Cloud For Developers and is called instance-notifier (https://github.com/altai/instance-notifier). The daemon is installed and configured by the Altai installer automatically. Although Altai 1.0.2 runs Essex, not Grizzly, I have added nova-fping as an additional extension package.
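
For illustration, here is a minimal sketch of such a periodic checker (the credentials are the same placeholders used in the session below; the notify() function is hypothetical and stands in for whatever alerting you use):

import time
from novaclient.v1_1 import Client

def notify(instance_id):
    # Placeholder: send an e-mail, write to syslog, etc.
    print "Instance %s does not respond to ping" % instance_id

cl = Client("admin", "topsecret", "systenant", "http://localhost:5000/v2.0")
while True:
    # fping.list() polls all instances in the current project
    for result in cl.fping.list():
        if not result.alive:
            notify(result.id)
    time.sleep(300)  # poll every five minutes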

Let’s see how to use fping from the client side. We have three instances: selenium-0 (shut off), selenium-1 (up and running), and text-0 (invalid image). Nova reports that they are all active:

[root@a001 /]# nova list
+--------------------------------------+-------------------+--------+-------------------------+
| ID                                   | Name              | Status | Networks                |
+--------------------------------------+-------------------+--------+-------------------------+
| a9060a07-d32a-4dcf-8387-1c7d69f897dc | selenium-0        | ACTIVE | selenium-net=10.109.0.4 |
| 20325b87-6858-49df-ab30-795a189dd2ac | selenium-1        | ACTIVE | selenium-net=10.109.0.3 |
| 461e73e4-7f88-4c8f-bb1f-49df9ec18d84 | text-0            | ACTIVE | selenium-net=10.109.0.5 |
+--------------------------------------+-------------------+--------+-------------------------+

Check them with nova fping!

[root@a001 /]# python
Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from novaclient.v1_1 import Client
>>> cl = Client("admin", "topsecret", "systenant", "http://localhost:5000/v2.0")
>>> ping_list = cl.fping.list()
>>> ping_list
[<Fping: 461e73e4-7f88-4c8f-bb1f-49df9ec18d84>, <Fping: a9060a07-d32a-4dcf-8387-1c7d69f897dc>, <Fping: 20325b87-6858-49df-ab30-795a189dd2ac>]
>>> import json
>>> print json.dumps([p._info for p in ping_list], indent=4)
[
    {
        "project_id": "4fd17bd4ac834dcf8ba1236368f79986", 
        "id": "461e73e4-7f88-4c8f-bb1f-49df9ec18d84", 
        "alive": false
    }, 
    {
        "project_id": "4fd17bd4ac834dcf8ba1236368f79986", 
        "id": "a9060a07-d32a-4dcf-8387-1c7d69f897dc", 
        "alive": false
    }, 
    {
        "project_id": "4fd17bd4ac834dcf8ba1236368f79986", 
        "id": "20325b87-6858-49df-ab30-795a189dd2ac", 
        "alive": true
    }
]

As expected, nova fping reported that only selenium-1 (id=20325b87-6858-49df-ab30-795a189dd2ac) is really alive.

So, fping in Nova is a fast and quite reliable way to check instance health. Like a phonendoscope, it cannot provide full information, but if a patient isn’t breathing, they are likely to be dead.



Add Support for Local Volumes in OpenStack

Local volumes are functionality similar to regular Nova volumes, but the volume data is always stored on a local disk. This helps when you need a resizable disk for a VM but do not want to attach a volume over the network.

This functionality was implemented as a Nova API extension (gd-local-volumes). You can manage local volumes from the command line with the nova2ools-local-volumes command. Local volumes are stored in the same format as VM disks (see Using LVM as Disk Storage for OpenStack). You can mix nova-volumes and local volumes for the same instance.

To create a local volume using nova2ools, just use the “nova2ools-local-volumes create” command:

$ nova2ools-local-volumes create --vm <Id or name of VM> --size <Size> --device <Path to device inside Guest OS (example: /dev/vdb)>

This command will create a local volume of the specified size for the given VM. There are some caveats, though. Due to libvirt behavior (https://bugzilla.redhat.com/show_bug.cgi?id=693372), the device name in the guest OS can differ from what you specified in the --device option. For instance, device names can simply be chosen in lexicographical order (vda, vdb, vdc, and so on), as illustrated below. Another caveat is that the device may not be visible in the guest OS until you reboot the VM. Each local volume is identified by its VM id and device name.
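
A small illustrative sketch of that lexicographical naming (a model of the observed behavior, not libvirt code):

import string

def likely_guest_name(attach_index):
    # 0 -> /dev/vda, 1 -> /dev/vdb, 2 -> /dev/vdc, ...,
    # no matter which --device value was requested
    return "/dev/vd" + string.ascii_lowercase[attach_index]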

You can create a new local volume from an existing local volume snapshot using the --snapshot option like this:

$ nova2ools-local-volumes create --vm <Id or name of VM> --snapshot <Id of local volume snapshot> --size <Size> --device <Path to device inside Guest OS (example: /dev/vdb)>

You can omit the --size option when creating a local volume from a snapshot; the local volume will then have the actual snapshot size. If you specify the --size option, the local volume will be resized to this size.

NO UNDERLYING FILE SYSTEM RESIZE WILL BE PERFORMED, so be careful with that!
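
If you need the extra space to be usable, grow the filesystem from inside the guest yourself. A minimal sketch, assuming an ext3/ext4 filesystem on /dev/vdb (the device path is just an example):

import subprocess

# Run inside the guest after the local volume has been enlarged;
# resize2fs grows an ext2/3/4 filesystem to fill its device.
subprocess.check_call(["resize2fs", "/dev/vdb"])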

To see the list of all created local volumes, simply write:

$ nova2ools-local-volumes list

After you create a local volume for a particular VM, you can perform these operations on it:

  • Take a snapshot of the local volume
  • Resize the local volume
  • Delete the local volume

To create a snapshot, use:

$ nova2ools-local-volumes snapshot --id <Id of local volume> --name <Name of snapshot>

You can see your snapshot in the list provided by the

$ nova2ools-images list

command.

A local volume can be resized (with no underlying filesystem resize) with the following command:

$ nova2ools-local-volumes resize --id <Volume Id> --size <New size>

Finally, you can delete local volumes that are no longer needed:

$ nova2ools-local-volumes delete --id <Volume Id>

In conclusion, the essential facts about using local volumes:

  • They are allocated on the same host where the VM runs
  • They are identified by the id of the VM and the device name
  • They use the same storage type as VM disks (raw, qcow2, or LVM)
  • Snapshot functionality depends on the backend: snapshots are allowed for qcow2 and LVM (on a suspended instance) but cannot be made on raw
  • Quota is enabled for local volumes too, just like for ordinary volumes
  • They are linked to a particular VM forever and can only be deleted, not detached and attached again
  • They are attached to the VM just like usual VM disks, without any iSCSI
  • Resizing a local volume does not resize the underlying filesystem
  • They will be deleted if you delete the VM that uses them

Finally, some notes if you are using local volumes with the LVM storage type. As mentioned in Using LVM as Disk Storage for OpenStack, LVM snapshots can be taken either on a running instance with the force_snapshot flag specified or on a suspended instance. The same applies when you snapshot a local volume with the LVM backend. However, we noticed that the suspend/resume Nova API calls had been disabled for some reason, so we turned them back on, and you can use suspend/resume to take snapshots of LVM local volumes.
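
For example, a snapshot of an LVM-backed local volume could be taken like this (a hedged sketch: the instance name, credentials, and volume id are placeholders, and nova2ools is simply invoked as an external command):

import subprocess
from novaclient.v1_1 import Client

cl = Client("admin", "topsecret", "systenant", "http://localhost:5000/v2.0")
server = cl.servers.find(name="selenium-1")

# Suspend the instance so the LVM snapshot is consistent
cl.servers.suspend(server)
try:
    subprocess.check_call(
        ["nova2ools-local-volumes", "snapshot",
         "--id", "<Volume Id>", "--name", "my-snapshot"])
finally:
    # Bring the instance back regardless of the snapshot outcome
    cl.servers.resume(server)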

Hopefully, you will find this feature useful when you need to allocate just a local disk for a particular VM, rather than a volume somewhere in the cluster.

Sources are available on GitHub:

https://github.com/griddynamics/nova/commit/f354279158ca702baf9226e19c3584de58541ccc

REST API Documentation:

http://www.griddynamics.com/openstack/docs/local-volumes-api.html