
Placement Update 18-24

This is placement update 18-24, a weekly update of ongoing development related to the OpenStack placement service.

It's been quite a while since the last one, mostly because of travel, but also because coming to grips with the placement universe takes some time. Catching up will mean that this update is likely to be a bit long. Bear with it. This is obviously an expand style update (where we add new stuff). Next week will be a contract.

One thing I'd like to highlight is that with the merge of change 560459 we've hit a long-promised milestone with placement. Thanks to an initial hit by Eric Fried and considerable followups by Bhagyashri Shewale, we now have rudimentary support in nova for libvirt-using compute nodes that use shared disk to accurately report and claim that disk. Using it currently requires some manual setup: creating the resource provider associated with the disk, and creating the aggregate that joins that disk with the compute nodes that use it. But: this is one of the earliest promises provided by the placement concept, in the works for more than two years by many different people, finally showing up. Open the bubbly or something, a light celebration is in order.
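To make that manual setup a bit more concrete, here is a rough sketch of the sequence of placement API requests it might involve. It only builds the request bodies; nothing is sent. The function name, the provider name, and the way generations are passed in are all assumptions made for illustration, and the hard-coded generations (0, 1, 2) are a guess at how a fresh provider progresses: a real client should use whatever generation the previous response returned. The object-style aggregates body assumes microversion 1.19 or later.

```python
import uuid


def shared_disk_requests(rp_name, total_gb, compute_nodes):
    """Build (method, path, body) tuples for a hypothetical shared disk setup.

    compute_nodes maps a compute node provider uuid to a tuple of
    (current generation, list of aggregates it is already in).
    """
    rp_uuid = str(uuid.uuid4())
    agg_uuid = str(uuid.uuid4())
    base = f"/resource_providers/{rp_uuid}"
    requests = [
        # 1. Create the resource provider that represents the shared disk.
        ("POST", "/resource_providers", {"name": rp_name, "uuid": rp_uuid}),
        # 2. Give it DISK_GB inventory (generation 0 is an assumption).
        ("PUT", f"{base}/inventories", {
            "resource_provider_generation": 0,
            "inventories": {"DISK_GB": {"total": total_gb}},
        }),
        # 3. Mark it as sharing its inventory with members of its aggregates.
        ("PUT", f"{base}/traits", {
            "resource_provider_generation": 1,
            "traits": ["MISC_SHARES_VIA_AGGREGATE"],
        }),
        # 4. Put the disk provider in the new aggregate.
        ("PUT", f"{base}/aggregates", {
            "resource_provider_generation": 2,
            "aggregates": [agg_uuid],
        }),
    ]
    # 5. Add each compute node to the same aggregate. PUT replaces the whole
    #    list, so any aggregates the node already has must be repeated.
    for cn_uuid, (generation, existing) in compute_nodes.items():
        requests.append(("PUT", f"/resource_providers/{cn_uuid}/aggregates", {
            "resource_provider_generation": generation,
            "aggregates": list(existing) + [agg_uuid],
        }))
    return requests
```

Each tuple could then be handed to whatever HTTP client is in use, with the placement microversion header set.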

The flip side of this is that it highlights that we have a growing documentation debt with the many features provided by placement and how to make best use of them in nova (and other services that might like to use placement). Before the end of the cycle we will need to be sure that we set aside a considerable chunk of time to address this gap.

Most Important

Getting nested providers and consumer generations working is still the key work. See the links in the themes below.

A lot of complicated work is in progress or recently merged and we are getting deeper into the cycle. There are going to be bugs. The sooner we get stuff merged so it has time to interact and we have time to experiment with it the better. And there's also that documentation gap mentioned above.

Also a reminder that for blueprints that have code that is ready for wide review, put it on the runway.

What's Changed

(This is rather long because of the gap since the last report, but also because we've hit a point where lots of stuff can merge.)

Discussion revealed an issue with allocations and inventory that exist on a top-level resource provider which we'd later like to move to a nested provider. An example is VGPU inventory which, until sometime very soon, has been represented as inventory on the compute node (I think). Fixing this should be an atomic operation so a spec is in progress for Handling Reshaped Provider Trees. This suggests a new /migrator URI in the placement service, and for the sake of fast-forward-upgrades, a way to reach that URI from a within-process placement service (rather than over HTTP). The PlacementDirect tool has been created to allow this and has merged. Quite a lot of work will need to be done to implement that spec, so I'm going to add it as a theme (below).

Nova now requires the 1.25 placement microversion. It will go up again soon.
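For anyone talking to placement directly, requiring a microversion means sending the OpenStack-API-Version header and checking that the service can actually provide it. A minimal sketch, with helper names that are made up for this example:

```python
def parse_microversion(value):
    """Turn a string like '1.25' into a comparable (1, 25) tuple."""
    major, minor = value.split(".")
    return int(major), int(minor)


def supports(server_max_version, required="1.25"):
    """True if the server's max microversion is at least the required one.

    Compare as integer tuples: as floats or strings, 1.9 would wrongly
    sort after 1.25.
    """
    return parse_microversion(server_max_version) >= parse_microversion(required)


def placement_headers(microversion="1.25"):
    """Headers for a request that asks for a specific placement microversion."""
    return {
        "Accept": "application/json",
        "OpenStack-API-Version": f"placement {microversion}",
    }
```

The value passed to supports() would typically be the max_version reported by the version discovery document at the root of the service.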

The groundwork for consumer generations (including requiring some form of project and user on all allocations) has merged. What remains is exposing it all at the API layer.

The placement version discovery document was incomplete, causing trouble for certain ways of using the openstacksdk. This has been fixed.

Placement now supports granular policy (policy per URI) in-code, with customization possible via a policy file.

A potential 500 when listing usage information has been fixed.

There is now a heal allocations CLI which is designed to help people migrate away from the CachingScheduler (which doesn't use placement).

Nova host aggregates are now magically mirrored as placement aggregates and, amongst other things, this is used to honor the availability_zone hint via placement.
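As an illustration of what honoring that hint can look like on the wire, the sketch below builds an allocation candidates query that is limited to providers in a set of aggregates using the member_of parameter. The function name and the example values are hypothetical, and how nova maps an availability zone to aggregate uuids is not shown.

```python
from urllib.parse import urlencode


def allocation_candidates_url(resources, aggregate_uuids):
    """Build a GET /allocation_candidates URL filtered by aggregates.

    resources maps resource class names to amounts. aggregate_uuids is the
    list of placement aggregates that mirror the wanted host aggregates.
    """
    query = {
        "resources": ",".join(
            f"{rc}:{amount}" for rc, amount in sorted(resources.items())
        ),
    }
    if aggregate_uuids:
        # "in:" asks for providers that are in any of the listed aggregates.
        query["member_of"] = "in:" + ",".join(aggregate_uuids)
    return "/allocation_candidates?" + urlencode(query)
```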

Bugs

Specs

Total four weeks ago: 13. Now: 13

Spec-freeze has passed, so presumably exceptions will be required for these. There's already a notional exception for "Reshaped Provider Trees".

Main Themes

"Mirror nova host aggregates to placement" and "Granular" are done, so no longer listed as a theme. "Reshaped Provider Trees" is added because we're stuck if we don't do it.

Nested providers in allocation candidates

Quite a bit of the work related to nested providers in allocation candidates has merged. What remains is on this topic:

Eric noticed that in this process we've injected some changes in behavior in Rocky in the response to /allocation_candidates without guarding them with microversion changes. There's some discussion about it in IRC, first with me and then later with Jay. The gist is that it's unfortunate that happened, but it's not a disaster and the best outcome is that the diff between Queens and Rocky demonstrates the right behavior.

Consumer Generations

This allows multiple agents to "safely" update allocations for a single consumer. The code is in progress:

As noted above, much of this is merged. Most of what is left is exposing the functionality at the API level.
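The idea behind "safely" is optimistic concurrency: a writer says which generation of the consumer it last saw, and the write is refused if someone else got there first. The toy in-memory model below illustrates just that idea. It is not placement code, the class and method names are invented, and the eventual API may number generations differently.

```python
class GenerationConflict(Exception):
    """Raised when a writer's view of the consumer is out of date."""


class ConsumerStore:
    """Toy store of allocations keyed by consumer, guarded by a generation."""

    def __init__(self):
        self._consumers = {}  # consumer id -> (generation, allocations)

    def get(self, consumer_id):
        # An unknown consumer is reported with a generation of None.
        generation, allocations = self._consumers.get(consumer_id, (None, {}))
        return generation, dict(allocations)

    def put(self, consumer_id, allocations, consumer_generation):
        current, _ = self._consumers.get(consumer_id, (None, {}))
        if consumer_generation != current:
            raise GenerationConflict(
                f"expected generation {current}, got {consumer_generation}"
            )
        new_generation = 0 if current is None else current + 1
        self._consumers[consumer_id] = (new_generation, dict(allocations))
        return new_generation
```

If two agents both read generation None and then both write, the first write succeeds and the second raises GenerationConflict, which is the cue to re-read and retry.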

Reshaped Provider Trees

This allows moving inventory and allocations that were on resource provider A to resource provider B in an atomic fashion. Right now this is a spec on the following topic:

A glance at the spec will reveal that this is a multi-faceted and multi-party effort. Nine people are listed in the Assignee section.

The placement direct part merged today.
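To show what "atomic" means for a reshape, here is a small in-memory sketch that moves one resource class of inventory, and every allocation against it, from one provider to another. It works on a copy and returns the new state, so a failure part way through leaves the original untouched. The data layout and function name are assumptions for illustration and do not reflect what the spec proposes.

```python
import copy


def reshape(state, source, target, resource_class):
    """Return a new state with resource_class moved from source to target.

    state has two keys:
      "inventories": provider -> {resource class: total}
      "allocations": consumer -> {provider: {resource class: amount}}
    """
    new_state = copy.deepcopy(state)
    inventories = new_state["inventories"]
    if resource_class not in inventories.get(source, {}):
        raise KeyError(f"{source} has no {resource_class} inventory")
    # Move the inventory itself.
    total = inventories[source].pop(resource_class)
    inventories.setdefault(target, {})[resource_class] = total
    # Move every allocation made against that inventory.
    for by_provider in new_state["allocations"].values():
        amount = by_provider.get(source, {}).pop(resource_class, None)
        if amount is not None:
            by_provider.setdefault(target, {})[resource_class] = amount
        if source in by_provider and not by_provider[source]:
            del by_provider[source]
    return new_state
```

The caller replaces the old state with the returned one in a single step, which stands in for what would be a single database transaction in the real service.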

Extraction

The placement db connection change was previously +W but has since had a few merge conflicts. It presumably will merge soon. This will allow installations to optionally use a separate database for placement data. When that merges, a zuul change will adjust the nova-next job to use it. The changes required to devstack are already in place.

A stack of changes to placement unit tests to make them not rely on nova.test has merged. There are functional tests remaining which still use that. If you are looking for extraction-related work, finding ways in which nova code is imported but isn't really needed is a good way to make progress.

A while back, Jay made a first pass at an os-resource-classes library, which needs some additional eyes on it. I personally thought it might be heavier than required. If you have ideas please share them.

The placement extraction forum session went well. There was pretty good consensus from the people in the room and we got some useful feedback from some operators on how things ought to work.

An area we will need to prepare for is dealing with the various infra and co-gating issues that will come up once placement is extracted.

Other

19 entries four weeks ago. 23 now.

Some of the older items in this list are not getting much attention. That's a shame. The list is ordered (oldest first) the way it is on purpose.

End

Yow. That was long. Thanks for reading. Review some code please.

© Chris Dent. Built using Pelican. Theme by Giulio Fidente on github.