Data-aware infrastructure: It’s about time!
- 16 September, 2015 20:43
2015 has been my year of nonstop ranting about the need for us to get out of the IT Stone Age and into the 19th century. Dumb infrastructure will be our demise.
But it’s not too late. There are signs of life starting to appear. All hope is not lost.
A highway is dumb infrastructure. It’s great — it lets us humans make decisions and plot a course to take us where we want to go. Point A to Point B, pull over to relieve ourselves at Rest Stop 6, then continue on to our destination via Point C. If we didn’t have that dumb infrastructure, getting where we want to go would be horrifically difficult.
A luggage conveyor system at an airport is smart infrastructure. Our bag doesn’t have to make decisions. It has a destination encoded on it, and the infrastructure makes all the decisions to optimize where it goes and how it gets there. Yes, I know that things still go horribly wrong — but 99.9% of the time, the system works.
In this example, both the car and the bag are stand-ins for data. They are the payload, the thing that needs to get from where it is currently to where it needs to be. The difference is that dumb infrastructure just lies there and waits for you to make the decisions and force the process. Smart infrastructure doesn’t rely on you to do anything other than identify yourself.
The concept of data-aware infrastructure is getting some (deservedly) positive buzz lately. Infrastructure that optimizes itself and its pathways (and potentially other infrastructure) based on the data payload itself — which identifies itself overtly or covertly in such a way that the infrastructure knows what to do with it — is the way of the future.
Why do we need this? Let me go back to the previous example.
Imagine when you combine the dumb highway with the smart luggage conveyor. That’s what the 101 is going to be in 10 years. You aren’t going to drive your own car. Your car is going to drive itself (way better than you can drive it, regardless of what your overblown self-assessment tells you). Your car will be armed to the teeth with a million sensors, knowing what is going on in front of you, behind you, on the side of you, a mile away, etc. Your car will have a destination — your home, for example. It will automatically calculate the optimum path home based on sensor data collected from the other million cars on the 101, and everyone else’s cars and yours will all go at exactly the optimum speed along your route. The only thing that can mess it up is the one person who thinks he’s smarter than the system and screws up 400,000 cars behind him. Just like today.
When an emergency vehicle needs to get through, your car will be told, and the infrastructure will automatically migrate you from Lane 3 to Lane 2 — without you doing a thing. It will put you back on the optimum path once the ambulance goes by. The person waiting for the ambulance won’t be delayed because Joey in the Mustang refused to get out of the way. Joey will be vaporized. (OK, I’m making that up, but we all would like to see it.) This is the Internet of Things, or IoT. And this is a very simple example. Your car will also know that you have run out of diapers, which diapers you prefer, where they are on sale and where you should stop to pick some up. But that’s for next time.
Coming back to IT — we have spent 50 years building dumb infrastructure. We just make our highways bigger (networking bandwidth) and our buckets fatter (storage). But we don’t know what those highways or buckets carry at any more than a rudimentary level. We don’t truly know what’s important (the ambulance) or not (Joey). We are only marginally smart for “predictable” things — but when something comes at us ad hoc, we are deaf, dumb and blind.
Well, I hate to be the one to break it to all of you, but nothing is predictable in the age of virtual computing. You have no idea what is going to be running where or when at any given time. What was fine an hour ago causes nightmare slowdowns 20 minutes from now.
You can’t predict workload performance in a virtual infrastructure because you can’t predict every event. You are not smart enough. No one is. Heck, you can barely fix something broken in a virtual infrastructure because you can’t find anything. How could you possibly predict an outcome when everything is now interrelated?
Thus, there is only one plausible way out. Smarter infrastructure. We need the emphasis to be placed on the data itself, and we need that data to be responsible for identifying itself in some way so that the infrastructure can find it and act accordingly. Because the one constant in a virtual world is the data: it doesn’t change. What you do with it might, but the data itself does not. It simply is.
We’re starting to see it. The concept of “software defined” is on the path to intelligent infrastructure. Initially, it’s all about just using commodity, cheapo physical stuff to perform heretofore high-end functions, but it allows for much more. It can make decisions. Some storage players are no longer content to build big, fast, dumb buckets to pour data into. Instead they are grasping what the data itself is and making decisions based on your rules. The same is going to happen in networking.
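To make "decisions based on your rules" a little more concrete, here is a minimal sketch of the idea, not anyone's actual product. Everything in it is hypothetical and invented for illustration: the `Payload` and `Rule` types, the tag names (`priority`, `access`) and the tier names (`flash`, `archive`, `standard`). The only point is that the data carries its own description, and the infrastructure, not a human, picks the destination.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Payload:
    """A piece of data that describes itself through tags (hypothetical)."""
    name: str
    tags: dict = field(default_factory=dict)


@dataclass
class Rule:
    """One of 'your rules': a test on the payload and the tier it maps to."""
    description: str
    matches: Callable[[Payload], bool]
    tier: str


# Hypothetical rule set; the first matching rule wins.
RULES = [
    Rule("the ambulance goes first",
         lambda p: p.tags.get("priority") == "critical", "flash"),
    Rule("rarely touched data goes to cheap capacity",
         lambda p: p.tags.get("access") == "rare", "archive"),
]
DEFAULT_TIER = "standard"


def place(payload: Payload, rules=RULES) -> str:
    """Return the tier for a payload based only on what it says about itself."""
    for rule in rules:
        if rule.matches(payload):
            return rule.tier
    # Data that does not identify itself gets the dumb-bucket treatment.
    return DEFAULT_TIER


if __name__ == "__main__":
    for item in (
        Payload("patient-records.db", {"priority": "critical"}),
        Payload("mustang-photos.zip", {"access": "rare"}),
        Payload("scratch.tmp"),
    ):
        print(item.name, "->", place(item))
```

Note the last case: a payload with no tags falls through to the default tier, which is exactly the dumb-infrastructure behavior we have today. A real system would need to derive those tags from the data itself rather than trust someone to type them in.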
Remember: Storage array technology and IP networking have been essentially unchanged architecturally for 40-plus years. Sure, they have gotten bigger and faster and have added functionality, but they haven’t been designed to be “smart.” It takes a new way of doing things, and a new way of thinking. “Data out” thinking, not data in. Forty years of engineering how to put more data in an array, or over a network, without thinking about designing the infrastructure from the data-out perspective, is how we got here. Adapting the infrastructure in real time to accommodate the needs of the data itself is entirely different thinking. But it’s where we have to go.