A version of this article was originally published on TM Forum's Inform.
However, every time we choose to use cloud we’re making some necessary trade-offs – for example, we trade the convenience of instantaneous, elastic scaling across massive cloud infrastructure against the latency that comes from having that compute infrastructure located far away from where we’re using it. In many cases, that’s not an important distinction – if the cloud workload is serving web pages to users across the country or across the globe, it doesn’t matter that it’s running a few hundred kilometres away. But for some use cases, the location of the workload is an important consideration.
When we’re thinking about adding intelligence to an existing task, the response time often becomes critical – if it’s text on a webpage for a human to read, or a voice response to an Alexa question, then a few hundred milliseconds of delay is neither here nor there. But for augmented reality overlays on a live video stream, or accurately tracking a gamer’s movements in a virtual world, anything more than a few tens of milliseconds becomes unacceptable.
When the response to external events becomes important, we need to think about the complete end-to-end round trips involved in the system. Recently I worked on a manufacturing use case which involves processing machine sensor data to respond to a critical event – shutting down a machine tool before it can damage itself on failure. From the instant that the event happens in the real world, the system must gather data through sensors, process it locally by digitizing it, and then package the data to be sent over a communications network to the system which can make sense of it. In the case of a system running in the public cloud it could take 50-60ms for the data to reach the application, longer if the volume of data is large. At the cloud-hosted system the data must be received, unpacked, and processed – adding intelligence involves making decisions, and the more complex the intelligence the longer that might take. Finally, the response must be formatted, packaged for transfer, and returned to the point of origin so that it can be used to implement the decision (in this case to halt the machine safely).
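To make that budget concrete, here is a back-of-the-envelope sketch of the round-trip stages described above. Only the 50-60 ms network-transit figure comes from the scenario itself; the other stage timings are illustrative assumptions, not measurements from the actual system.

```python
# Illustrative end-to-end latency budget for the machine-shutdown use case.
# Stage timings are assumptions for illustration; only the network transit
# figure (50-60 ms one way to a public cloud) is taken from the scenario.
budget_ms = {
    "sensor capture and digitization": 5,
    "packaging and transit to public cloud": 55,
    "cloud-side unpack, processing, decision": 20,
    "format response and return transit": 55,
}

total = sum(budget_ms.values())
for stage, ms in budget_ms.items():
    print(f"{stage:40s} {ms:4d} ms")
print(f"{'total round trip':40s} {total:4d} ms")
```

Even with these rough numbers, network transit dominates the budget – which is exactly why moving the compute closer to the machine is attractive.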
It’s clear that the time taken in moving data to and from the cloud can be a significant factor in the feasibility of the use case. Public cloud data centers are typically large and highly centralized for economy of scale, with only a small number of physical locations serving whole continents – so the round-trip travel time alone can be upwards of 100ms. Sometimes this distance latency is unacceptable, ruling out the use of a centralized cloud.
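The distance component alone is easy to estimate: light in optical fibre travels at roughly two-thirds of its speed in vacuum, about 200 km per millisecond, so every 100 km of fibre path adds roughly 1 ms to the round trip before any routing, queuing or processing delay. A minimal sketch (the distances are illustrative assumptions):

```python
# Rough propagation-only latency estimate. Signals in optical fibre cover
# roughly 200 km per millisecond (about two-thirds of c); real networks add
# routing, queuing and processing delay on top of this floor.
FIBRE_KM_PER_MS = 200.0

def round_trip_ms(distance_km: float) -> float:
    """Propagation delay alone for a there-and-back trip over fibre."""
    return 2 * distance_km / FIBRE_KM_PER_MS

# Illustrative distances, not measured routes:
print(round_trip_ms(1500))  # distant continental data center
print(round_trip_ms(20))    # nearby metro edge site
```

The point is that propagation delay is set by physics: no amount of faster hardware in a distant data center can recover it, but a nearby edge site makes it negligible.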
An increasing number of companies are investigating ways to retain the benefits of the centralized cloud (things like the pay-per-use consumption model and the ability to scale out elastically) while also getting the benefits of localized compute hardware nearer to the point of consumption, i.e. the ‘edge cloud’. There are a number of active standards bodies and open-source projects in this space, e.g. OpenStack StarlingX, the Kinetic Edge Alliance and LF Edge at the Linux Foundation. Even the major public cloud providers have initiatives that in various ways blend local and centralized cloud – for example, deploying local compute hardware on the customer’s premises but controlling it as an extension of the public cloud.
The telecoms industry is well-positioned to offer a consistent edge cloud experience, since existing network infrastructure already has many of the characteristics required for localized cloud. Telecoms networks have a hierarchical distribution of physical points-of-presence spread fairly evenly throughout a region, virtually all of which have already been upgraded for at least some level of compute hardware hosting. At a minimum they have the essentials: network connectivity, mains electricity, weatherproof containment and physical security provisions. Many network sites only have ambient cooling, which may be a drawback, but there are technological solutions on the way to resolve that too, such as oil-immersion cooling. The telecoms industry has coalesced around proposals for multi-access edge computing (MEC) as a way to think about how to service edge cloud requirements from within the communications network. It’s likely that with MEC we can deliver differentiated cloud services, offering much lower round-trip latency and higher overall availability, blended seamlessly with centralized cloud delivered from hyperscale data centers.
Thinking about the ways we can use edge cloud hosted just inside the provider network, the sort of use cases that come to mind are those where overall latency is critical – for example gaming, augmented reality processing, and IoT analysis use cases in medical and industrial applications generating real-time interventions. Since avoiding latency is the key characteristic, it makes no sense to avoid distance latency only to end up replacing it with processing latency – so it’s likely that the MEC compute nodes will need to have very high-performance hardware such as GPU and FPGA arrays, supporting high-performance computation for analytic and AI use cases.
It’s also clear that the network connecting the end-user to the edge cloud is critically important, and the characteristics of 5G make it an ideal choice. The combination of ultra-low latency, very high bandwidth and support for massive numbers of devices makes 5G well suited to use cases adding intelligence at the edge – and the flexibility and security arising from virtualization and network slicing are game changers.
The TM Forum Catalyst Project entitled ‘Manufacturing Predictive Maintenance using 5G’ has explored how 5G can enable data- and compute-intensive use cases to successfully exploit edge cloud.
This article was provided by a contributing writer to TM Forum.
TM Forum is a global industry association for service providers and their suppliers in the telecommunications industry.