Placement update 19-20. Lots of cleanups are in progress, laying the groundwork for the nested magic work (see themes below).
The poll to determine what to do with the weekly meeting will close at the end of today. Thus far the leader is office hours. Whatever the outcome, the meeting that would happen this coming Monday is cancelled because many people will be having a holiday.
Most Important
The spec for nested magic is ready for more robust review. Since most of the work happening in placement this cycle is described by that spec, getting it reviewed well and quickly is important.
Generally speaking: review things. This is, and always will be, the most important thing to do.
What's Changed
-
os-resource-classes 0.4.0 was released, promptly breaking the placement gate (the tests are broken, not os-resource-classes). Fixes are underway.
-
Null root provider protections have been removed and a blocker migration and status check added. This removes a few now-redundant joins in the SQL queries, which should help with our ongoing efforts to speed up and simplify getting allocation candidates.
-
I had suggested an additional core group for os-traits and os-resource-classes but after discussion with various people it was decided it's easier/better to be aware of the right subject matter experts and call them in to the reviews when required.
Specs/Features
-
https://review.opendev.org/654799 Support Consumer Types. This is very close, with a few details to work out on what we're willing and able to query on. A week later, it still only has reviews from me.
-
https://review.opendev.org/658510 Spec for Nested Magic. Un-wipped.
-
https://review.opendev.org/657582 Resource provider - request group mapping in allocation candidate. This spec was copied over from nova. It is a requirement of the overall nested magic theme. While it has a well-defined and refined design, there's currently no one on the hook to implement it.
These and other features being considered can be found on the feature worklist.
Some non-placement specs are listed in the Other section below.
Stories/Bugs
(Numbers in () are the change since the last pupdate.)
There are 20 (-3) stories in the placement group. 0 are untagged. 2 (-2) are bugs. 5 are cleanups. 11 (-1) are rfes. 2 are docs.
If you're interested in helping out with placement, those stories are good places to look.
On launchpad:
-
Placement related nova bugs not yet in progress on launchpad: 16 (0).
-
Placement related nova in progress bugs on launchpad: 7 (+1).
osc-placement
osc-placement is currently behind by 11 microversions. No change since the last report.
Pending changes:
-
https://review.openstack.org/#/c/640898/ Add 'resource provider inventory update' command (that helps with aggregate allocation ratios).
-
https://review.openstack.org/#/c/651783/ Add support for 1.22 microversion
-
https://review.openstack.org/586056 Provide a useful message in the case of 500-error
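For anyone unsure what "behind by N microversions" means in practice, here is a minimal sketch, not code from osc-placement. A client asks for a placement microversion with the `OpenStack-API-Version` header, and versions compare as pairs of integers rather than as decimals. The helper names and the example version numbers are assumptions made for illustration.

```python
def parse_microversion(text):
    """Turn a string like "1.22" into a comparable (major, minor) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def microversion_header(version):
    """Build the request header that asks placement for a microversion."""
    major, minor = version
    return {"OpenStack-API-Version": "placement %d.%d" % (major, minor)}


def microversions_behind(client_max, server_max):
    """Count how many minor versions the client lags the server by.

    Assumption: both versions share the same major number.
    """
    if client_max[0] != server_max[0]:
        raise ValueError("major versions differ")
    return max(0, server_max[1] - client_max[1])


# Example values only; they are not taken from this report.
client = parse_microversion("1.10")
server = parse_microversion("1.14")
print(microversion_header(client))
print(microversions_behind(client, server))
```

Comparing tuples matters because 1.9 is older than 1.22, which a float or string comparison would get wrong.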
Main Themes
Nested Magic
At the PTG we decided that it was worth the effort, in both Nova and Placement, to push for better use of nested providers (things like NUMA layouts, multiple devices, networks) while keeping the "simple" case working well. The general ideas for this are described in a story and an evolving spec.
Some code has started, mostly to reveal issues:
-
https://review.opendev.org/657419 Changing request group suffix to string
-
https://review.opendev.org/657510 WIP: Allow RequestGroups without resources
-
https://review.opendev.org/657463 Add NUMANetworkFixture for gabbits
-
https://review.opendev.org/658192 Gabbi test cases for can_split
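To make the first two changes in the list above a little more concrete: request groups in an allocation candidates query are told apart by a suffix on keys such as `resources` and `required`. What follows is a rough sketch, not placement's actual parser, of collecting such keys into groups when the suffix is an arbitrary string rather than a number. The function name `group_by_suffix`, the set of recognised keys, the suffix pattern, and the example suffixes and custom trait names are all assumptions for illustration; the real syntax is whatever the changes under review settle on.

```python
import re
from collections import defaultdict
from urllib.parse import parse_qsl

# Hypothetical set of per-group keys; the real API defines its own.
GROUP_KEYS = ("resources", "required", "member_of")
# Assumption: a suffix is any run of letters, digits, "_" or "-".
KEY_RE = re.compile(r"^(%s)([A-Za-z0-9_-]*)$" % "|".join(GROUP_KEYS))


def group_by_suffix(query_string):
    """Collect query parameters into request groups keyed by suffix.

    The empty suffix "" stands for the unsuffixed group.
    """
    groups = defaultdict(dict)
    for key, value in parse_qsl(query_string):
        match = KEY_RE.match(key)
        if match is None:
            continue  # not a per-group parameter, e.g. "limit"
        name, suffix = match.groups()
        groups[suffix][name] = value
    return dict(groups)


query = (
    "resources=VCPU:1"
    "&resources_NET1=SRIOV_NET_VF:1&required_NET1=CUSTOM_PHYSNET_A"
    "&required_NUMA0=CUSTOM_FAST"
    "&limit=5"
)
for suffix, params in group_by_suffix(query).items():
    print(repr(suffix), params)
```

Note that the `_NUMA0` group in the example carries only `required` and no `resources`, which is the kind of request the "RequestGroups without resources" change is about.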
Consumer Types
Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A spec has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound.
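As a rough illustration of the quota accounting use case, here is a sketch of summing usage per consumer type. It is not taken from the spec: the record layout, the field names (`consumer_type`, `resource_class`, `used`), and the type names `INSTANCE` and `MIGRATION` are hypothetical, and the real request and response formats are among the details still being worked out.

```python
from collections import defaultdict


def usage_by_consumer_type(allocations):
    """Sum allocated amounts per (consumer_type, resource_class) pair."""
    totals = defaultdict(int)
    for alloc in allocations:
        key = (alloc["consumer_type"], alloc["resource_class"])
        totals[key] += alloc["used"]
    return dict(totals)


# Hypothetical allocation records, invented for this example.
allocations = [
    {"consumer_type": "INSTANCE", "resource_class": "VCPU", "used": 2},
    {"consumer_type": "INSTANCE", "resource_class": "VCPU", "used": 4},
    {"consumer_type": "MIGRATION", "resource_class": "VCPU", "used": 2},
    {"consumer_type": "INSTANCE", "resource_class": "MEMORY_MB", "used": 2048},
]

for (ctype, rclass), used in sorted(usage_by_consumer_type(allocations).items()):
    print(ctype, rclass, used)
```

With a type on each consumer, a quota check could count only the groups it cares about, for example instances but not in-flight migrations.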
Cleanup
As we explore and extend nested functionality we'll need to do some work to make sure that the code is maintainable and has suitable performance. There's some work in progress for this that's important enough to call out as a theme:
-
https://storyboard.openstack.org/#!/story/2005712 Some work from Tetsuro exploring ways to remove redundancies in the code. There's a stack of good improvements.
-
https://review.opendev.org/643269 WIP: Optionally run a wsgi profiler when asked. This was used to find some of the above issues. Should we make it generally available, or is it better kept as something to build on when exploring?
-
https://review.opendev.org/660691 Avoid traversing summaries in _check_traits_for_alloc_request
Ed Leafe has also been doing some intriguing work on using graph databases with placement. It's not yet clear if or how it could be integrated with mainline placement, but there are likely many things to be learned from the experiment.
Other Placement
Miscellaneous changes can be found in the usual place.
There are several os-traits changes being discussed.
Other Service Users
New discoveries are added to the end. Merged stuff is removed. Starting with the next pupdate I'll also be removing anything that has had no reviews and no activity from the author in 4 weeks. Otherwise these lists get too long and uselessly noisy.
-
https://review.openstack.org/552924 Nova: Spec: Proposes NUMA topology with RPs
-
https://review.openstack.org/622893 Nova: Spec: Virtual persistent memory libvirt driver implementation
-
https://review.openstack.org/641899 Nova: Check compute_node existence in when nova-compute reports info to placement
-
https://review.openstack.org/601596 Nova: spec: support virtual persistent memory
-
https://review.openstack.org/#/q/topic:bug/1790204 Workaround doubling allocations on resize
-
https://review.openstack.org/645316 Nova: Pre-filter hosts based on multiattach volume support
-
https://review.openstack.org/647396 Nova: Add flavor to requested_resources in RequestSpec
-
https://review.openstack.org/633204 Blazar: Retry on inventory update conflict
-
https://review.openstack.org/#/q/topic:bp/count-quota-usage-from-placement Nova: count quota usage from placement
-
https://review.openstack.org/#/q/topic:bug/1819923 Nova: nova-manage: heal port allocations
-
https://review.openstack.org/648665 Nova: Spec for a new nova virt driver to manage an RSD
-
https://review.openstack.org/625284 Cyborg: Initial readme for nova pilot
-
https://review.openstack.org/629142 Tempest: Add QoS policies and minimum bandwidth rule client
-
https://review.openstack.org/648687 Nova-spec: Add PENDING vm state
-
https://review.openstack.org/650188 nova-spec: Allow compute nodes to use DISK_GB from shared storage RP
-
https://review.openstack.org/651024 nova-spec: RMD Plugin: Energy Efficiency using CPU Core P-State control
-
https://review.openstack.org/650963 nova-spec: Proposes NUMA affinity for vGPUs. This describes a legacy way of doing things because affinity in placement may be a ways off. But it also may not be.
-
https://review.openstack.org/#/q/topic:heal_allocations_dry_run Nova: heal allocations, --dry-run
-
https://review.opendev.org/656448 Watcher spec: Add Placement helper
-
https://review.opendev.org/659233 Cyborg: Placement report
-
https://review.opendev.org/657884 Nova: Spec to pre-filter disabled computes with placement
-
https://review.opendev.org/657801 rpm-packaging: placement service
-
https://review.opendev.org/657016 Delete resource providers for all nodes when deleting compute service
-
https://review.opendev.org/654066 nova fix for: Drop source node allocations if finish_resize fails
-
https://review.opendev.org/660924 neutron: Add devstack plugin for placement service plugin
-
https://review.opendev.org/661179 ansible: Add playbook to test placement
-
https://review.opendev.org/656885 nova: WIP: Hey let's support routed networks y'all!
End
As indicated above, I'm going to tune these pupdates to make sure they are reporting only active links. This doesn't mean stalled out stuff will be ignored, just that it won't come back on the lists until someone does some work related to it.