Welcome back to the placement update. If I've read the signs correctly, I should now be back to this as a regular thing. Apologies for the gap; I had to attend to some other responsibilities.
A lot has changed in the past few months, so it's hard to single out one most important thing. It will depend on who is reading. Review what's changed for a summary of important stuff.
Placement is now its own official project. Until elections are held (it looks like nominations start this coming Tuesday), Mel is the PTL.
Deleting placement code from nova has been put on hold until Train to make it easier for certain types of upgrades to happen. New installs should prefer the extracted code, as the nova-side is frozen, but the placement side is not.
A large stack of code to remove oslo.versionedobjects from placement has merged. This has resulted in a significant change in performance on the perfload test that runs in the gate. While not a complete representation of the entire system, it's enough to say "yeah, that was worth it": A request for allocation candidates that used to take around 2.5 seconds now takes 1.2. That refactoring continues (see below), seeking additional simplifications.
Microversion 1.31 adds in_treeN query parameters to GET /allocation_candidates. This is useful in a variety of nested resource provider scenarios, including the big bandwidth QoS changes that are in progress in nova and neutron.
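As a rough illustration of what such a request might look like, here is a minimal sketch that only builds the URL and headers. The endpoint, the provider UUIDs, and the helper name allocation_candidates_request are made up for the example; the path, the in_tree and in_treeN parameter names, and the microversion are the ones mentioned above.

```python
from urllib.parse import urlencode

# Assumption: a made-up endpoint; substitute the real placement URL.
PLACEMENT_URL = "http://placement.example.com"


def allocation_candidates_request(unnumbered=None, numbered=None):
    """Build the URL and headers for GET /allocation_candidates.

    unnumbered: dict of parameters for the unnumbered request group,
        e.g. {"resources": "VCPU:1", "in_tree": "<uuid>"}.
    numbered: dict mapping a group number N to that group's parameters;
        N is appended to each key (resources -> resources1, and so on).
    """
    params = list((unnumbered or {}).items())
    for suffix, group in sorted((numbered or {}).items()):
        params.extend((f"{key}{suffix}", value) for key, value in group.items())
    # Keep ':' and ',' readable; they are part of the resources syntax.
    query = urlencode(params, safe=":,")
    headers = {
        "Accept": "application/json",
        # Ask for the microversion that introduced in_tree.
        "OpenStack-API-Version": "placement 1.31",
    }
    return f"{PLACEMENT_URL}/allocation_candidates?{query}", headers


# Hypothetical provider UUIDs, for illustration only.
COMPUTE_RP = "11111111-2222-3333-4444-555555555555"
NIC_RP = "66666666-7777-8888-9999-000000000000"

# Unnumbered group limited to one provider tree.
plain_url, headers = allocation_candidates_request(
    unnumbered={"resources": "VCPU:1", "in_tree": COMPUTE_RP}
)

# Numbered group 1 limited to the tree that contains NIC_RP.
numbered_url, _ = allocation_candidates_request(
    numbered={1: {"resources": "NET_BW_EGR_KILOBIT_PER_SEC:1000",
                  "in_tree": NIC_RP}}
)

print(plain_url)
print(numbered_url)
```

The sketch stops short of sending the request, since authentication and endpoint discovery vary by deployment.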
Placement is now publishing install docs but it is important to note that those docs have not been validated (as far as I'm aware) by the packagers. That's a thing that needs to happen, presumably by the packagers.
os-resource-classes 0.3.0 has been released with a
There are some pending specs from nova which are primarily placement feature specs. We'll continue with those as is (see below), but come the next cycle the plan is to manage specs in the placement repo, not have a separate repo, and not have separate spec cores.
Near to Done
- Filter Allocation Candidates by Provider Tree has been mostly completed by Tetsuro, but there's a pending update to the spec.
Not yet Done
- Support filtering by forbidden aggregate membership
- Support any traits in allocation_candidates query
- Support mixing required traits with any traits
Not yet Approved
Update alloc-candidates-in-tree updates the in-tree spec above to reflect what was learned while doing the actual implementation. Notably, how numbered in_tree parameters impact results.
Resource provider - request group mapping in allocation candidate has had a recent resurgence in attention.
osc-placement is currently behind by 14 microversions.
Code for 1.18 is under review.
This section now overlaps a bit with the Specs/Features bit above. This will settle out with a bit more clarity as we move along.
Reshaper handling in nova keeps exposing additional things that need to be remembered on the nova-side, so there are a few patches remaining related to vgpu reshaping, but it is mostly ready.
The bandwidth-resource-provider topic has merged a vast amount of code but there is still plenty left.
Related to all this nested stuff: The complex hardware models that drove the development of the nested resource provider system are challenging to test. The cloud hardware provided to OpenStack infrastructure does not expose the hardware that would allow real integration tests. If anyone reading this is in a position to provide third party CI with fancy hardware for NUMA, NFV, FPGA, and GPU related integration testing with nova, there's a significant need for that.
(I think refactoring should be a constant theme. To reflect that, I'm going to have a section here. Editorial privilege or something.)
There's a collection of patches in progress, currently under the topic scrub-Lists, that is a follow-up to the patches that removed oslo versioned objects. That work pointed out some opportunities to DRY-up the List classes (e.g., UsageList) to remove some duplication and simplify. Then, after looking at that, it became clear that entirely removing the List classes, in favor of using native Python lists, would further simplify the code.
Apart from the previously mentioned performance and simplicity benefits of these changes, the work has also managed to expose and fix a few bugs, simply because we were looking at things and moving them around. If you pick up rocks, you can see the bugs and squash them. If you don't, they breed.
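To make the List-class removal a little more concrete, here is a simplified sketch of the general shape of that kind of change. It is not the actual placement code: the Usage fields, the rows argument, and the function names are assumptions chosen for illustration, and only the UsageList name comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    # Hypothetical fields; the real object may differ.
    resource_class: str
    usage: int


# Before: a wrapper class that mostly re-implements list behaviour.
class UsageList:
    def __init__(self, objects=None):
        self.objects = list(objects or [])

    def __len__(self):
        return len(self.objects)

    def __iter__(self):
        return iter(self.objects)

    def __getitem__(self, index):
        return self.objects[index]

    @classmethod
    def get_all(cls, rows):
        # rows stands in for the result of a database query.
        return cls(Usage(rc, used) for rc, used in rows)


# After: a module-level function that returns a native list.
def get_all_usages(rows):
    return [Usage(rc, used) for rc, used in rows]


rows = [("VCPU", 2), ("MEMORY_MB", 1024)]
before = UsageList.get_all(rows)
after = get_all_usages(rows)

# Callers iterate and index the same way, with one less class to maintain.
print(list(before) == after)
```

The wrapper adds no behaviour a list does not already have, which is the kind of duplication that looking at the List classes side by side tends to reveal.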
https://review.openstack.org/#/q/topic:improve-debug-log A series of improvements leading to a better debug log when retrieving allocation candidates.
https://review.openstack.org/#/c/639628/ Docs: extract testing info to own sub-page
https://review.openstack.org/#/q/topic:cd/gabbi-tempest-job Gabbi-based integration tests of placement. These recently found a bug that none of the functional, grenade, nor tempest tests did.
https://review.openstack.org/#/c/619050/ Optionally migrate database at service startup (so you don't have to run placement-manage db sync if you don't want to).
https://review.openstack.org/#/c/630216/ Add a vision-reflection (of the Technical Vision doc).
Other Service Users
See also the several links above for more nova changes. Also, I'm a bit behind on my tracking in this area, so there is likely plenty of other stuff too. This will improve over time.
https://review.openstack.org/538498 Convert driver supported capabilities to compute node provider traits
https://review.openstack.org/621494 Add descriptions of numbered resource classes and traits
https://review.openstack.org/636412 Make move_allocations handle empty source allocations (Part of a series on cross-cell resize)
https://review.openstack.org/#/q/topic:bp/count-quota-usage-from-placement Using placement (from nova) for counting (some of) quota.
https://review.openstack.org/#/q/topic:minimum-bandwidth-allocation-placement-api Neutron side of minimum bandwidth.
https://review.openstack.org/#/q/bp/no-affinity-instance-reservation Blazar reservation handling, including some manipulation of inventory in placement.
https://review.openstack.org/633204 Blazar: Retry on inventory update conflict
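For readers unfamiliar with why an inventory update can conflict: placement tracks a generation on each resource provider and rejects a write made against a stale generation. The following is a minimal sketch of the general retry pattern, not Blazar's actual code. FakeProvider, GenerationConflict, and update_inventory are hypothetical stand-ins, with the exception playing the role of an HTTP 409 response.

```python
class GenerationConflict(Exception):
    """Stand-in for an HTTP 409 response from placement."""


class FakeProvider:
    """In-memory stand-in for one resource provider (illustration only)."""

    def __init__(self):
        self.generation = 0
        self.inventory = {}

    def get(self):
        # Return the current generation along with a copy of the inventory.
        return self.generation, dict(self.inventory)

    def put(self, generation, inventory):
        # Reject writes based on a stale view of the provider.
        if generation != self.generation:
            raise GenerationConflict()
        self.inventory = dict(inventory)
        self.generation += 1


def update_inventory(provider, resource_class, total, max_attempts=3):
    """Set one inventory record, re-reading and retrying on conflict."""
    for _ in range(max_attempts):
        generation, inventory = provider.get()
        inventory[resource_class] = {"total": total}
        try:
            provider.put(generation, inventory)
            return True
        except GenerationConflict:
            # Someone else changed the provider; fetch fresh state and retry.
            continue
    return False


provider = FakeProvider()
# CUSTOM_RESERVATION is a made-up resource class name.
print(update_inventory(provider, "CUSTOM_RESERVATION", 1))
```

The important detail is that each attempt re-reads the provider before writing, so the retry works from current state rather than resending the stale request.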
Though this is long, it doesn't really bring us fully up to date. If something is missing that you think is important, please let me know. Once I'm back in the flow it should become increasingly complete.