I've spent the last couple of days at OpenDev. Here are some quick and dirty thoughts from the two days.
The event described itself as:
OpenDev is an annual event focused at the intersection of composable open infrastructure and modern applications. The 2017 gathering will focus on edge computing, bringing together practitioners along with the brightest minds in the industry to collaborate around use cases, reference architectures and gap analysis.
The gist of that is "get some people who may care about edge computing and edge cloud in the same room and see what happens".
As is often the case with this sort of thing: when talking about something relatively new there is very little agreement about what is even being talked about. People came to the event with their own ideas of what "edge" means, ideas that didn't always match everyone else's. This can lead to a bit of friction, but it is a productive friction; areas of commonality are revealed, and presumed areas of overlap that turn out not to exist are exposed.
There were several different edges floating around the meetup. The two standouts I heard (not necessarily using these names):
- Edge Cloud: A cloud (sometimes in a single piece of hardware, sometimes not) that is closer (in terms of network latency or partitioning risk) to where the applications and/or data that could use it are located. In this case a cloud is some compute, network, or storage infrastructure, VM or container agnostic. Orchestration of workloads on this thing happens on it.
- Edge Compute: A piece or suite of hardware, often but not always small, which is closer to the consumer of the compute resources available on that hardware, and where the applications on the hardware are managed in a composable fashion (for example a home networking device running its services as VMs). Orchestration of workloads on this thing happens from something else.
The locality of orchestration represents a sort of spectrum, but there are other dimensions too. One is the type of application. One can imagine an edge service that manages a bunch of virtualized network functions (VNFs) for a customer of a large telco at the edge of the telco's network (between the customer and the telco). Or an edge service that does deep learning analysis of images for a mobile device doing augmented reality: the device doesn't have the compute capacity itself, but needs very low latency processing to provide a good experience.
The edge compute model is currently a poor match with Nova. High latency (or partitioning) between a nova-compute node and the rest of nova is not something nova handles well. It would be interesting to explore the idea of a remote-resilient (and maybe "mini") nova-compute.
In typical OpenStack fashion, the event resulted in a bunch of etherpads, a summary of which can be found.
My interest for the two days was to see to what extent the Placement service can help with these use cases or will need to be extended to cope with them. It's already quite well known that the NFV world is driving a lot of the work to make placement provide some of the pieces of the pie for what's called Enhanced Platform Awareness (aka give me NUMA, SR-IOV and huge pages). Until DPDK is able to provide a suitable alternative, VNFs at the edge will demand these sorts of things. Certain high performance edge applications (perhaps using GPUs) will have similar requirements.
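To make that a bit more concrete, here's a minimal sketch of the sort of request something might make to placement to find candidate hosts with enough inventory for a workload. The endpoint and token are placeholders (a real client would use keystoneauth), and later microversions add ways to require traits, which is where things like SR-IOV capability would come in.

```python
# Sketch: ask placement for allocation candidates for a workload.
# PLACEMENT_URL and TOKEN are placeholders; a real client would get
# both from keystoneauth.
import requests

PLACEMENT_URL = 'http://placement.example.com'  # hypothetical endpoint
TOKEN = 'REDACTED'                              # keystone token

resp = requests.get(
    PLACEMENT_URL + '/allocation_candidates',
    headers={
        'x-auth-token': TOKEN,
        # Allocation candidates arrived in microversion 1.10; later
        # microversions add things like required traits.
        'openstack-api-version': 'placement 1.10',
    },
    params={
        # A host with room for 4 VCPU and 8G of RAM.
        'resources': 'VCPU:4,MEMORY_MB:8192',
    },
)
resp.raise_for_status()
for rp_uuid, provider in resp.json()['provider_summaries'].items():
    print(rp_uuid, provider['resources'])
```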
Another concern for both styles of use case, but especially for edge applications (cloudlets, as Satya calls them) on limited hardware, may be scheduling needs related to the time dimension and to dynamic evaluation of the state of the hardware. Being able to reserve capacity for a short time, or to use otherwise unreserved capacity during periods of intense activity, will be important.
The Blazar project hopes to address some of the issues with time, something neither nova nor placement wishes to take on.
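As a rough illustration of what the time dimension might look like in practice, here's a sketch of creating a lease-style reservation. The shape loosely follows Blazar's lease API, but the endpoint, field names and values are assumptions for illustration, not a reference.

```python
# Sketch: reserve a host for a window of time. Loosely modelled on
# Blazar's lease API; field names and values are illustrative only.
import requests

BLAZAR_URL = 'http://blazar.example.com/v1'  # hypothetical endpoint
TOKEN = 'REDACTED'

lease = {
    'name': 'edge-burst-capacity',
    'start_date': '2017-09-20 18:00',
    'end_date': '2017-09-20 20:00',
    'reservations': [{
        'resource_type': 'physical:host',
        'min': 1,
        'max': 1,
        'hypervisor_properties': '',
        'resource_properties': '',
    }],
    'events': [],
}

resp = requests.post(
    BLAZAR_URL + '/leases',
    headers={'x-auth-token': TOKEN},
    json=lease,
)
resp.raise_for_status()
print(resp.json())
```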
The nova scheduler has always had the capacity to represent dynamic aspects of host state ("Is my compute node on fire?") but it is perhaps not as consistently represented and available across deployments as it could be. Nor is it fully developed, so if it is something that people truly want there is an opportunity. The placement service made a design decision up front that it would not track data that may vary independently of individual workload requirements (e.g. the temperature of a CPU), but scheduling has become a two-step process: get some candidates from placement, then further refine them in the nova-scheduler.
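Here's a small sketch of what the second step of that two-step process can look like: given the provider summaries returned by an allocation candidates query (as in the earlier sketch), drop any provider whose dynamic state disqualifies it. The cpu_temperature lookup is entirely hypothetical, a stand-in for whatever monitoring a deployment happens to have.

```python
# Sketch of step two of the two-step flow: placement has already
# narrowed the field by inventory; now drop providers whose dynamic
# state (which placement deliberately does not track) rules them out.


def cpu_temperature(rp_uuid):
    """Hypothetical lookup against an external monitoring source."""
    return {'rp-1': 52.0, 'rp-2': 95.0}.get(rp_uuid, 40.0)


def refine(provider_summaries, max_temp=80.0):
    """Keep only providers whose CPUs are below max_temp."""
    return [uuid for uuid in provider_summaries
            if cpu_temperature(uuid) < max_temp]


# provider_summaries as returned by GET /allocation_candidates,
# trimmed to the bits that matter here.
provider_summaries = {
    'rp-1': {'resources': {'VCPU': {'capacity': 8, 'used': 2}}},
    'rp-2': {'resources': {'VCPU': {'capacity': 8, 'used': 0}}},
}

print(refine(provider_summaries))  # ['rp-1'] -- rp-2 is too hot
```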
Edge looks like it is going to grow to become a major driver in the cloud environment. Hopefully continued events like OpenDev will ensure that people create interoperable implementations.