At Tetrate’s Service Mesh Day, ONF CTO Larry Peterson illustrated his session on the history and future of networking with a diagram depicting the evolution of the “narrow waist of the internet.” Peterson’s full slide deck is available online.
We’ve come a long way from the minimalism Peterson shared in the form of the IP datagram, created around 1980, a historical artifact and a “beautiful piece of computer science.”
In broad strokes, said Peterson, we’re moving toward a phase where we are concerned only with services, orchestrated and managed by what we might think of as service chains, running not on a world of networks but on a world full of clouds. Our task now is to figure out what we need to add to that narrow waist.
Peterson offered, as an example, a variation on protocol buffers that ONF uses as a kind of “hammer” for assembling disaggregated parts and for generating large amounts of code.
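Peterson didn’t walk through the syntax in his talk, but a minimal sketch of what such a protobuf-style model might look like, with code-generation hints carried in custom options, is shown below. The message and option names here are hypothetical illustrations, not ONF’s actual xproto definitions.

```proto
// Hypothetical sketch of an xproto-style model: standard protobuf
// syntax, with extra semantics tucked into custom options so a
// generative toolchain can emit APIs and glue code from the model.
syntax = "proto3";

import "google/protobuf/descriptor.proto";

extend google.protobuf.FieldOptions {
  bool key = 50001;         // marks the field that uniquely identifies an object
  string relation = 50002;  // names another model this field refers to
}

// An abstract Service: no implementation details, just the declaration
// the control layer needs in order to manage it as part of a larger system.
message Service {
  string name = 1 [(key) = true];
  string kind = 2;  // e.g. "microservice", "switch pipeline", "logical"
}
```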
With its centerpiece CORD (Central Office Re-architected as a Datacenter) project, the ONF is working to rebuild the telco central office as a cloud data center into which component projects can plug to provide individual pieces of functionality. Once that exists, Peterson suggests, we’ll need something like a “service chain” to make it possible to construct functionality on behalf of individual users, subscribers, applications, and devices, stretching across the multicloud as well as on premises and wherever you go in your mobile world. This, said Peterson, is where the service mesh is going to end up.
One of the challenges CORD is addressing is how to automate the process of operationalizing a system built from disaggregated components. ONF’s approach is to declare, as intents, what is wanted from the system, and to centralize the control and operationalization of that system rather than managing each individual component independently.
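As a loose illustration of what declaring intent might mean in practice, the operator states only that one service depends on another, and the control layer generates and maintains whatever wiring realizes that dependency. The model below is a hypothetical sketch, not ONF’s actual schema.

```proto
// Hypothetical sketch: intent is declared as a dependency between two
// abstract services; the control layer, not the operator, works out
// and keeps in sync the concrete wiring between their implementations.
syntax = "proto3";

message ServiceDependency {
  string subscriber_service = 1;  // the service that consumes
  string provider_service = 2;    // the service it depends on
  string connection = 3;          // how they connect, e.g. "public" or "private"
}
```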
“You want to manage it as a complete system, as a complete service mesh, and not have to manage the individual parts,” said Peterson. “Disaggregation was good for innovation, but it’s what you’re working against as you try to integrate and operationalize.”
Another goal of the system is to enable the programming of services in the abstract, building services out of services rather than out of concrete implementations. “The service chains are basically combinations of instances that have been allocated to a particular use case,” said Peterson, “which could be a subscriber, an application, or an end device.”
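A rough way to picture that allocation, again using hypothetical model names rather than ONF’s actual core models: an instance records which abstract service it implements, and a chain records which instances have been allocated to one subscriber, application, or device.

```proto
// Hypothetical sketch (not ONF's actual core models): a service chain
// as an ordered allocation of service instances to a single tenant.
syntax = "proto3";

// A concrete instance of an abstract service, wherever it happens
// to be implemented (container, VM, or switch pipeline).
message ServiceInstance {
  string id = 1;
  string service = 2;  // name of the abstract Service it instantiates
}

// One tenant's slice through the mesh: the use case (subscriber,
// application, or device) plus the instances allocated to it, in order.
message ServiceChain {
  string tenant = 1;
  repeated string instance_ids = 2;
}
```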
But as we look to the future and all of these technologies come together, control and observability will be key features to pay attention to.
Some key takeaways:
- The access network is being cloudified.
- Operationalizing that edge is a really challenging problem.
- The access edge will be a critical part of the multi-cloud.
- We don’t know what the narrow waist is going to look like, but we can envision that everything will be a service, with meshes of functionality and service chains, the functionality allocated to individual subscribers.
- Fine-grained runtime control is going to be important.
- Finally, there’s a rich opportunity now to get out of our silos and think about problems that are exactly the same as we move up and down the stack.
“Whether your functionality is implemented in the Envoy proxy or in the switching pipeline, it’s still functionality that gets you from point A to point B, and it needs to be managed and operated,” said Peterson.
Complete transcript
I’m going to give you a little historical perspective; it’s always a good place to start. And I’m going to talk about what’s next, and I’m actually going to shoot just beyond service mesh, so stay with me on that one. I’m not going to talk about service mesh narrowly; I’m going to talk about it very generally, which is where I think things are going. For historical perspective, I thought I’d start with a historical artifact. Hopefully you’re familiar with it: one of the most beautiful pieces of computer science. It is so minimalistic. It’s not aesthetically pleasing to look at, obviously, but it’s that universal thing that, if we could all agree to those few bits on the screen and buy into it, we could build a global thing. But you have to keep it minimal. So this is where it all started.
And in fact, I’m sure you’re familiar with the notion of the internet having a narrow waist: IP. It’s manifested by that ASCII diagram, which was the narrow waist of the Internet circa 1980. It was minimal, and it got things started. It wasn’t the end of the road, of course. I would argue, and I think we all recognize, that the narrow waist has moved up; the narrow waist of the Internet is now HTTP. And again, it’s the universality that made it work: you can do GETs and PUTs on identifiers for anything in the world, URIs. Of course it had a little help from security and some congestion control management underneath it. But that’s basically the narrow waist of the Internet as we know it today, and that’s what allowed for the explosion of cloud services. So of course that just begs the question: what’s next?
And I’m going to draw it in very broad strokes. I’m going to use this phrase as kind of a catchall: everything as a service. That is in fact where things are going. Everything we’re concerned about is services. We used to be concerned with URIs and objects, resources at kind of a low level. We can now build services. So how do we orchestrate, manage, and build new things out of services? You might have expected me to say service mesh at the top there; I said service chain. That’s part of my intent, to try to stretch you just a little bit beyond what we’re immediately trying to do. And below, it’s not that we have this world full of networks; we now have a world full of clouds. And I don’t know exactly what’s going to go into that narrow waist.
I’m going to label it “everything as a service,” but that’s what we’re all about for the next few years, trying to figure that out. So if the manifestation of the architecture, almost 40 years ago now, was that ASCII diagram, what is it going to look like in the future? This is just one possible example: I think it’s going to be a model definition. You can look at this particular model definition and you’re probably going to recognize that it looks a little bit like protocol buffers, and syntactically it is. It’s what we’re using, in a system we’re building at ONF that I’m going to use as an example as I go through this, to specify the behavior we want out of the system. And it’s our hammer as we try to figure out how to build these more complicated things from all the disaggregated parts that we’ve created.
With this particular variation of protocol buffers, we don’t just use it to generate code to marshal arguments; we use it to generate a lot of code, and we’ve extended it slightly. It all fits in the option field, but we call it xproto. It’s basically, like I said, just an example of a specification language for models that define the architecture. I’m going to use this as a running example as I go through a few slides here. So, I said it’s a multicloud world; this is how I’m going to draw the multicloud world, and everyone understands that the edge cloud is now the big deal. We’re going to get applications out near where the end users are, where the devices are: low latency, high bandwidth, support for mobility, and so on. Except this doesn’t actually exist quite yet, and what is in fact there at the edge today is another historical artifact.
If you go into the access networks of all of the network operators, you will find what they traditionally call the central office. It is a network museum. There you might find 300 different pieces of equipment, bundled, proprietary. It’s still hooked up to the billing system, so it doesn’t get thrown away, but it’s not the cloud that we all hope it to be. So this is basically where the ONF is working: we’re trying to do to that piece of the infrastructure what everyone else in the cloud has been doing all along, which is to virtualize, commoditize, and disaggregate what used to be there. And when we’re done with that, we hope we have an edge cloud that we can take advantage of. Just to put a name on this, if you’ve heard of it, this is the CORD project: the Central Office Re-architected as a Datacenter.
So let’s try to bring this technology to that environment. Now, having done that, what do you end up trying to do? This is where I’m going to use the word service chain. We’re trying to make it possible to construct functionality on behalf of individual users, subscribers, applications, and devices that stretches across that multicloud. Let me call it a service chain; you’re going to see it’s an instance of a service mesh, but I’m going to call it a service chain. And just so I’m clear, I’m going to focus a little bit here on the access edge that you might find in a central office, but don’t let yourself think that it’s limited to that. It also stretches on-prem. The telco infrastructure is one place this technology will go, but it’s not the only place.
And these in fact stretch all the way on-premises, into your house, into your enterprise. But the important thing to take away from this picture is that this is now a mobile world, so as you leave home and get in your car, this potentially follows you. That is the thing we’re after, and that is where I believe the service mesh is going to end up. Now, it turns out I’ve been in networking for 30 years or so and I knew practically nothing about the access network, and I understand that is the case for almost everyone except those who live in the telco world. So I want to take a minute to talk about what we’re doing there so you can appreciate the problem. What we’re doing is the same thing that’s been happening in the web services space for quite a while now, which is bringing disaggregation to the game, except we’re bringing both microservices and SDN.
So you take these very large bundled devices that you used to buy from a traditional vendor. The first thing you do is break apart the control plane and the data plane, and we use commodity white-box switches there. The second obvious thing is to break the big functions into a bunch of microservices and build meshes out of those. And the third thing you do is move some functionality that you had in a microservice back down into the switching fabric, because these are now programmable devices. So when you think of the applications we’re building, don’t think of those applications as only running in some form on a server; they’re running in the switches as well. Those pipelines are increasingly powerful, and there’s certain work that you really ought to be doing there rather than having to hit a server.
Let me draw a comparison for a moment, and this is one of those things to lock away and come back to later. Those little functions that got dropped down into the switches: imagine that’s Envoy. It’s functionality that I can put into the data plane, customized for some purpose, not just yesterday’s standard packet forwarding. So this is just another example of where functionality might live. I could walk through examples of how we’ve disaggregated the passive optical networks and the RAN, the radio cellular networks, but I’m not going to do that. Just assume for a moment that we have broken these things into pieces. That leads us to the hard problem we’re here about today, which is putting the pieces back together again. That’s the integration problem, and it’s really kind of interesting.
It seems that the obvious thing to do is to break things into pieces. That’s what actually spurs innovation; you become much more agile and improve feature velocity and all of that. But if you think about this from the telco’s point of view, and the same is true for cloud providers, they have to take those parts and put them back together again. And if you don’t put them back together the right way, you’ve just thrown away all the advantages you got from disaggregating those parts in the first place. You get a brittle system that was integrated as a one-off solution. But you have to do that to operationalize it.
So the central challenge is: how do you automate the process of operationalizing a system built from disaggregated components? That’s a challenge we’ve been facing in CORD, and it’s the challenge of service mesh; it’s a shared challenge. The approach we’re taking is, first, we would like to declare with intents what we want from the system. Once we put it back together, we don’t want to have to hardwire a bunch of stuff in such that we can’t change it and can’t keep pace with change. And second, we want to centralize the control and the operationalization of that system. The last thing you want to do is have to manage each of the individual components independently. You want to manage it as a complete system, as a complete service mesh, and not have to manage the individual parts. Disaggregation was good for innovation; disaggregation is what you’re working against as you try to integrate and operationalize. So this is the shared problem that we’re all here about.
It exists in many different forms. All right, so let me try to put a little bit of meat on the bone with some details about a layer that we have built, which I think at the end of the day has an interesting interplay with, and is complementary to, what Istio is doing. I hope to draw out that comparison, but it shares a lot of problems as well. It’s a modeling language, a generative toolchain, which I won’t really talk too much about, and some core models. I think of this system as an operating system; that’s just the way we approached it, and that’s why it’s got the name it has, but it’s basically a service control plane. And when I say service, do not think Kubernetes service. A Kubernetes service is a particular implementation of a service, but when you start implementing functionality in switches, and in serverless platforms, and so on, the implementation might be different.
So I’m talking about an abstract service. I’ve got a whole bunch of disaggregated components. They might be legacy, what the telco world calls VNFs, virtual network functions, running in VMs. They might be horizontally scalable microservices, or control applications running on top of an SDN fabric, or whatever. Those are the disaggregated components; they live in the data plane, the service data plane, and we’re basically trying to glue those together and manage them as a compound unit in a control layer. If you look internally, what you’ll find is that there’s a data model, and that’s the key here: once we have the model, we can generate the northbound interfaces, and we can generate a whole bunch of glue code that makes this thing behave as a single unit. There’s a synchronization framework that ties the data model to all those backend services.
We load this with some core models, and I’m not going to go into detail, but the core models are services, service dependencies, service instances, and so on. Then each of the individual services that we want to integrate into that mesh loads its model into the system, and there are all kinds of cross-references, a lot of interconnections, and a lot of richness. In fact, if you look at that poor orphan service over at the right, there’s nothing behind it because it’s a purely logical service. And that’s a really important point about where we’re trying to get to: you actually start programming services in the abstract, as opposed to programming concrete implementations of services. You can build services out of services, and in fact we’ve done that; that’s what that example on the right is. Without going into a whole lot of detail, I want to call attention to what happens once you have those models: basically, you build a service mesh.
I’ve color-coded this just to remind you that, while we’re at the abstract layer here, some are blue services because they’re implemented in the switching fabric, some are red services because they’re implemented as microservices, and some are gray services because they’re purely logical, implemented out of their parts. That’s the service mesh. It’s exactly the same idea at the abstract level; just remember there’s an implementation below it. And then the service chains we implement are basically combinations of instances that have been allocated to a particular use case. It could be a subscriber, it could be an application, it could be an end device; that’s completely up to the definitions you want to invent. So this is the service chain I was showing you earlier. The services in the service mesh are multi-tenant, and these are the instances of that multi-tenancy.
One of the things to call attention to here, as we look to the future and where these technologies all come together, is that having control and visibility at fine grain is one of the difficult problems you have to pay attention to. Whether you’re tracing for debugging purposes or providing fine-grained visibility through the service mesh because you want to monetize a customized path for someone is up to you, but it’s basically the same problem. Now, so far this has been slideware, but we have built a bunch of these. In fact, this is a multicloud service mesh that we’ve been working on at ONF with Tetrate. It happens to be an implementation of the access network for cellular, for 5G. Again, I won’t go into all the details, but the color coding is important here: for the blue services, the functionality is implemented in the switches; for the red services, the functionality is implemented in containers. And it’s in fact multicloud: some of these services live at the edge, and some live in the Google Cloud. While I’m showing a CDN and a video archive as one example for the purposes of a demo, there could be many edge services and obviously many cloud services.
Okay, so this is something that’s real. If you walk into the details of this, you start to see interesting interplay between all the levels of the network stack. A lot of what we’ve been doing at the edge is at L2 and L3, and obviously what’s happening with service meshes is at L4 through L7, but the problems are exactly the same as you go up and down the stack, and I think that’s where there’s an interesting opportunity as we go forward. So, a quick summary of some of the takeaways. The access network is being cloudified; be aware of that, it’s going to be really important. Operationalizing that edge is a really challenging problem, and it’s exactly the same challenge we’re all facing as we talk about service meshes. The access edge will be a critical part of the multicloud, and don’t think it’s limited to the telco.
The telcos happen to have a monopoly on it today, but I believe that’s not the way it’s going to play out in the future. This technology, 5G and the mobility it supports in unlicensed bands, is going to be critical for low-latency mobile edge applications across enterprises as well as in the telco networks, and observability is obviously important. And finally, as we think back to the narrow waist: I don’t know exactly what it’s going to look like, but it is going to be some notion that everything is a service. There are going to be meshes of functionality broadly, and there are going to be service chains, the functionality allocated to individual subscribers. That’s basically where we’re moving, and fine-grained runtime control is going to be important. Finally, there is a really rich opportunity here to get out of our silos and stop thinking, “I only work at this level in the protocol stack.” These problems are exactly the same as you move up and down: whether your functionality is implemented in an Envoy proxy or in the switching pipeline, it is still functionality that gets you from point A to point B, and it needs to be managed and operated. These problems are common across the spectrum. So with that, I will stop and pass it on to Eric. Thanks. Thank you.