I've had a few persistent complaints in my four and a half years of working on OpenStack, but two that stand out are:
- The use of RPC, with large complicated objects being passed around on a message bus, to make things happen. It's fragile, noisy, over-complicated, hard to manage, hard to debug, easy to get wrong, and leads to workarounds ("raise the timeout") that don't fix the core problem.
- It's hard, because of the many and diverse things to do in such a large community, to spend adequate time reflecting, learning how things work, and learning new stuff.
So I decided to try a little project to address both and talk about it before it is anything worth bragging about. I reasoned that if I use the placement service to manage resources and etcd to share state, I could model a scheduler talking to one or more compute nodes. Not to do something so huge as replace nova (which has so much complexity because it does many complex things), but to explore the problem space.
Most of the initial work involved getting some simple clients speaking to etcd and placement, and mocking out the creation of fake VMs. After that I dropped the work because of the many and diverse things to do, leaving a note to myself to investigate using virt-install.
It took nine months to come back to it, but over the course of a couple of hours on two or three days I had it booting VMs on multiple compute nodes.
In my little environment a compute node starts up, learns about its environment, and creates a resource provider and associated inventories representing the virtual cpus, disk, and memory it has available. It then sits in a loop, watching an etcd key associated with itself.
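As a rough illustration of that startup step, here is a minimal sketch of the inventory body a compute node might send to placement. The function names, the way local resources are discovered, and the choice of free disk space as the disk total are all my assumptions, not necessarily what etcd-compute does. The body shape follows the placement API's `PUT /resource_providers/{uuid}/inventories` call.

```python
import os
import shutil


def build_inventories(vcpus, memory_mb, disk_gb, generation=0):
    """Build the JSON body for PUT /resource_providers/{uuid}/inventories.

    A freshly created resource provider has generation 0.
    """
    return {
        "resource_provider_generation": generation,
        "inventories": {
            "VCPU": {"total": vcpus},
            "MEMORY_MB": {"total": memory_mb},
            "DISK_GB": {"total": disk_gb},
        },
    }


def discover_local_resources(path="/"):
    """Hypothetical helper: a crude, Linux-oriented guess at host capacity."""
    vcpus = os.cpu_count() or 1
    try:
        total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        memory_mb = total_bytes // (1024 * 1024)
    except (ValueError, OSError, AttributeError):
        memory_mb = 0  # platform does not expose these sysconf names
    disk_gb = shutil.disk_usage(path).free // (1024 ** 3)
    return vcpus, memory_mb, disk_gb
```

Calling `build_inventories(*discover_local_resources())` gives a dict ready to be serialized and sent with whatever HTTP client is handy.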
Beside the compute process there's a faked out metadata server running.
A scheduler takes a resource request and asks placement for a list of allocation candidates. The first candidate is selected, an allocation is made for the resources, and the allocations and an image URL are put to the etcd key that the compute node is watching.
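The selection logic is small enough to sketch. The following assumes the dict-style response that `GET /allocation_candidates` returns at placement microversion 1.12 and later, and a single provider per candidate. The function names and the JSON layout of the value written to etcd are hypothetical, chosen only for illustration.

```python
import json


def resources_query(vcpus, memory_mb, disk_gb):
    """Value for the `resources` query parameter of GET /allocation_candidates."""
    return f"VCPU:{vcpus},MEMORY_MB:{memory_mb},DISK_GB:{disk_gb}"


def pick_first_candidate(candidates):
    """Return (provider_uuid, allocations) for the first candidate, or None."""
    requests = candidates.get("allocation_requests", [])
    if not requests:
        return None
    allocations = requests[0]["allocations"]
    # Assumption: one resource provider (the compute node) per candidate.
    provider_uuid = next(iter(allocations))
    return provider_uuid, allocations


def build_etcd_value(allocations, provider_uuid, image_url):
    """Hypothetical message format for the key the compute node watches."""
    return json.dumps(
        {
            "resources": allocations[provider_uuid]["resources"],
            "image": image_url,
        }
    )
```

The same `allocations` dict can then be used as the body of the `PUT /allocations/{consumer_uuid}` request that claims the resources, before the value is written to the chosen node's key.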
The compute sees the change on the watched key, fetches the image, resizes it to the allocated disk size, then boots it with virt-install using the allocated vcpus and memory. When the VM is up another key is set in etcd containing the IP of the created instance.
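To make the resize and boot steps concrete, here is a sketch of the command lines a compute node could build from an allocation. Using `qemu-img resize` for the resize is my assumption (the post only says the image is resized), and the function names and the `generic` OS variant are illustrative. The virt-install flags shown are standard ones.

```python
def resize_command(image_path, disk_gb):
    """Command to grow a disk image to the allocated size (assumes qemu-img)."""
    return ["qemu-img", "resize", image_path, f"{disk_gb}G"]


def virt_install_command(name, image_path, vcpus, memory_mb, os_variant="generic"):
    """Command to boot an existing disk image as a new VM."""
    return [
        "virt-install",
        "--name", name,
        "--vcpus", str(vcpus),
        "--memory", str(memory_mb),  # virt-install takes memory in MiB
        "--disk", f"path={image_path}",
        "--import",  # skip the installer, boot the supplied disk directly
        "--os-variant", os_variant,  # newer virt-install releases expect an OS hint
        "--noautoconsole",  # return immediately instead of attaching a console
    ]
```

Each list can be handed to `subprocess.run(cmd, check=True)`. Building argument lists rather than shell strings avoids quoting problems with image paths.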
If the metadata server has been configured with an ssh public key, and the booted image looks for the metadata server, you can ssh into the booted VM using that key. For now this only works from the same host as the compute node. Real networking is left as an exercise to the reader.
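A faked-out metadata server does not need to be much. The sketch below serves a single ssh public key at an EC2-style path, which is the kind of location cloud-init-enabled images typically query. The exact path, the function names, and the overall shape are assumptions for illustration, not a description of the server in etcd-compute.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: EC2-style location for the first public key.
KEY_PATH = "/latest/meta-data/public-keys/0/openssh-key"


def lookup(path, public_key):
    """Map a request path to (status, body); only the key path is known."""
    if public_key and path.rstrip("/") == KEY_PATH:
        return 200, public_key
    return 404, "not found"


def make_handler(public_key):
    """Build a request handler class bound to one configured public key."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, body = lookup(self.path, public_key)
            data = body.encode()
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return Handler


def serve(public_key, host="127.0.0.1", port=8775):
    """Run the fake metadata server until interrupted (host and port are arbitrary)."""
    HTTPServer((host, port), make_handler(public_key)).serve_forever()
```

Guests normally expect the metadata service at 169.254.169.254 on port 80, so something on the host still has to route or redirect that address to wherever `serve` is listening.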
The work described in those ↑ short paragraphs involved more learning about some of the fundamentals of creating a virtual machine than a few years of reading and reviewing inscrutable nova code. I should have done this much sooner.
The really messy code is in etcd-compute on GitHub.