When evaluating APIs (HTTP and otherwise) I am predisposed to simplicity and comprehensibility in implementation and use. That means the interface should:
- have a limited number of resources
- have simple representations of those resources with minimal
- thus have a correspondingly limited number of operations that can be performed across the resources
- be easy to talk about independent of the details of any implementation
- be unified in concept (which is basically a recapitulation of all of the above)
I've recently been suggesting, in a review of qualitative attributes or capabilities in the OpenStack placement API, that we should represent the attributes as infinite inventories of classes of resources associated with resource providers. (See this spec for background on resource providers, inventories and classes.) This would allow some of the goals above to be satisfied while still allowing the desired functionality: being able to make placement (resource consumption) decisions based on both quantitative inventories and (traditionally) qualitative attributes of the resources.
The solution proposed in the spec under review is to represent the qualitative attributes as tag-like capability strings in an additional database table. That is, in addition to having inventories of consumable classes of resources, a resource provider will also have lists of capabilities. When making a request, a client asks for a mix of quantities of inventories and zero or more capability strings.
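To make the contrast concrete, here is a minimal Python sketch of what matching might look like under the capability-string proposal. The `Provider` class, its fields, the `satisfies` function and the `SOME_OTHER_CAPABILITY` string are all hypothetical names for illustration, not part of the placement API. The point is only that two different kinds of check are needed.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical resource provider holding two kinds of data."""
    inventories: dict                              # consumable quantities
    capabilities: set = field(default_factory=set)  # tag-like strings


def satisfies(provider, resources, capabilities=()):
    """Return True if the provider can serve the request."""
    # Check one: numeric comparison against each inventory.
    for resource_class, amount in resources.items():
        if provider.inventories.get(resource_class, 0) < amount:
            return False
    # Check two: set membership for each capability string.
    return set(capabilities) <= provider.capabilities


disk_farm = Provider({"DISK_GB": 20000}, {"STORAGE_SSD"})
print(satisfies(disk_farm, {"DISK_GB": 1024}, ["STORAGE_SSD"]))
print(satisfies(disk_farm, {"DISK_GB": 1024}, ["SOME_OTHER_CAPABILITY"]))
```

The numeric loop and the set comparison are separate code paths, and an allocation would likewise have to update one kind of data but not the other.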
My argument basically comes down to "let's model everything as inventory". If we do that then:
- every representation is the same
- every calculation is simple math
- there are fewer moving parts
- the implementation is easier to translate to another form
Inventory of quantitative stuff (e.g., disk) is subtracted from real capacity. Inventory of qualitative stuff (e.g., type of disk, ssd) is subtracted from an infinite capacity.
This can work because the following is universally true:
    x = float('inf')
    assert x - 1 == x
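The identity above holds for IEEE 754 floating point, which is what Python's `float` uses. The following short sketch spells out the behaviours that matter for this idea, plus one caveat worth keeping in mind: the amount being subtracted has to stay finite.

```python
import math

inf = float("inf")

# Subtracting any finite amount leaves infinity unchanged.
assert inf - 1 == inf
assert inf - 10**9 == inf

# Infinity compares greater than every finite number, so an
# "is there enough left?" check always passes.
assert inf >= 1
assert inf > 2**62

# Caveat: infinity minus infinity is not a number (NaN), so the
# amount requested must itself be finite.
assert math.isnan(inf - inf)
```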
Say, for example, that I have a placement request where I want:
    2 GB of DISK
    1024 MB of RAM
    4 VCPU
    1 of ssd
I find a set of resource providers that can satisfy those requirements. I then record an allocation of 2, 1024, 4 and 1 against the resource providers. For ssd the inventory of infinity - 1 will still be infinity.
The upshot of this is that (in the theoretical implementation) for any type of requirement I wish to make in a request spec or claim in an allocation I only need one type of operation to process it. The algorithm requires fewer conditionals and edge cases and the results are correct.
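As a rough illustration of that single operation, here is a minimal sketch. The `allocate` function, the dict layout and the inventory totals are assumptions made up for the example, and for brevity it uses one inventory rather than a set of resource providers.

```python
INF = float("inf")


def allocate(inventory, request):
    """Subtract a request from an inventory, or raise if it does not fit.

    The same comparison and subtraction serve every resource class;
    classes with infinite inventory need no special case.
    """
    for resource_class, amount in request.items():
        if inventory.get(resource_class, 0) < amount:
            raise ValueError(f"not enough {resource_class}")
    return {
        resource_class: total - request.get(resource_class, 0)
        for resource_class, total in inventory.items()
    }


# Inventory totals here are invented for illustration.
inventory = {"DISK": 100, "RAM": 4096, "VCPU": 8, "ssd": INF}
request = {"DISK": 2, "RAM": 1024, "VCPU": 4, "ssd": 1}

remaining = allocate(inventory, request)
print(remaining)
# {'DISK': 98, 'RAM': 3072, 'VCPU': 4, 'ssd': inf}
```

A provider without `ssd` simply has no entry for it, so the same comparison rejects the request without any attribute-specific branch.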
Doing this strikes people as dirty because it is conflating consumable quantitative data (5 MEMORY_MB) with an attribute which is not consumable and thus considered as solely qualitative. However, if there is some disk and it is SSD, then if you consume some of that there isn't any less of it. I think that explanation actually helps make it clear: you can have some of it, and when you do, it's all still there; there is still an infinite amount of it.
Even if that argument doesn't convince, the efficacy of the perceived hack is potentially sufficient to warrant its use: if I'm someone who manages some resource providers, or someone who writes code that manages resource providers, I don't want to manage a provider's inventory and its attributes as separate procedures. I just want to manage the resource provider in as few different steps as possible. Here's a fleshed-out example in gabbi format:
    defaults:
      request_headers:
        accept: application/json
        content-type: application/json
        x-auth-token: $ENVIRON['TOKEN']

    tests:

    - name: create resource provider
      POST: /resource_providers
      data:
        name: my disk farm
        uuid: 9109b19b-0ea0-4e60-a162-9cd6cfc5a41b
      status: 201

    - name: set inventory
      PUT: $LOCATION/inventories
      data:
        resource_provider_generation: 1
        inventories:
          - resource_class: DISK_GB
            total: 20000
          - resource_class: STORAGE_SSD
            total: ∞
      status: 204

    - name: check usages
      GET: /resource_providers/9109b19b-0ea0-4e60-a162-9cd6cfc5a41b/usages
      response_json_paths:
        $.usages.DISK_GB: 20000
        $.usages.STORAGE_SSD: ∞

    # NOTE: this select endpoint is not yet defined so this part is
    # pure speculation
    - name: request resources
      POST: /select_destinations
      data:
        requires:
          resources:
            VCPU: 2
            DISK_GB: 1024
            STORAGE_SSD: 1
      response_json_paths:
        # ordered list of lists?
        resource_providers:
          - [9109b19b-0ea0-4e60-a162-9cd6cfc5a41b, 47c16c8d-cb80-426e-83ef-cb0ac81ae45C]

    - name: consume resources
      PUT: /allocations/2966de89-12af-4328-bf6c-d541a3abfe72
      data:
        allocations:
          - resource_provider:
              uuid: 9109b19b-0ea0-4e60-a162-9cd6cfc5a41b
            resources:
              DISK_GB: 1024
              STORAGE_SSD: 1
          - resource_provider:
              uuid: 47c16c8d-cb80-426e-83ef-cb0ac81ae45C
            resources:
              VCPU: 2
      status: 204

    - name: check usages on disk farm
      GET: /resource_providers/9109b19b-0ea0-4e60-a162-9cd6cfc5a41b/usages
      response_json_paths:
        $.usages.DISK_GB: 18976
        $.usages.STORAGE_SSD: ∞
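One practical wrinkle the example above glosses over is how `∞` would travel in a JSON body: strict JSON has no infinity literal, and Python's `json` module emits the non-standard token `Infinity` unless told otherwise. The sketch below assumes a purely hypothetical convention of sending the string `"∞"` on the wire; `INFINITY_TOKEN`, `encode_total` and `decode_total` are names invented for this example and nothing here is defined by the placement API.

```python
import json
import math

# Hypothetical wire representation, not a placement API convention.
INFINITY_TOKEN = "∞"


def encode_total(total):
    """Turn an in-memory total into a strict-JSON-safe value."""
    return INFINITY_TOKEN if math.isinf(total) else total


def decode_total(value):
    """Turn a wire value back into a number usable in arithmetic."""
    return float("inf") if value == INFINITY_TOKEN else value


inventories = {"DISK_GB": 20000, "STORAGE_SSD": float("inf")}

body = json.dumps(
    {rc: encode_total(total) for rc, total in inventories.items()},
    allow_nan=False,      # refuse the non-standard Infinity token
    ensure_ascii=False,   # keep the ∞ character readable
)
print(body)

# Round trip: the decoded totals can go straight back into the math.
decoded = {rc: decode_total(v) for rc, v in json.loads(body).items()}
assert decoded == inventories
```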
Is this perfect? No, but it is relatively straightforward. Do I think we should do it this way? I don't know, but I do wish we could keep things as simple as possible, even if that means instituting some constraints.