This Q&A podcast is part one of two.
As data centers evolve toward cloud computing, so must workload automation tools. Cloud computing in particular has spurred growth in both the volume and complexity of workloads.
SearchDataCenter.com site editor Tom Walat talks with Robert Stinnett, a data center automation architect with CARFAX Inc., about the future of workload management and how it can help companies move into the cloud.
What has driven workload management to a point where it needs to change?
Robert Stinnett: Well, traditionally we would take a workload -- in the olden days, 10 years ago, we used to call this batch -- and at night the online regions came down, the batch processing started and the next morning we brought everything back up to be ready for the next business day. So the only thing that's changed over the years is that we went to a 24/7 global economy, driven by the Internet and mobile technologies.
When eBay Inc., for example, first started many years ago, there was a time every night when eBay was down for maintenance, and chances are they were doing some batch processing during that time. Now eBay, like any other e-commerce company, doesn't want to go down. They want to be available so you can buy your baseball cards and Pez dispensers any time you wish. A lot of companies now are like that. They have to be able to serve their customers 24/7, but most IT processing is still done in some sort of batch form.
There's very little real-time processing that goes on out there. When you go to a grocery store, for example, the only real-time processing happening there is your credit card transaction at the end, the payment part. Everything else -- inventory control, replenishment, staffing, etc. -- that's all been done in some sort of batch process usually after hours. And that's true of any company nowadays. All these factors are really driving us away from the "batch" and toward "automated" because we're moving away from time-based, which batch was, to event-driven. And those events can be anything from the time of day to "we just received an order from a customer" to "we just received a file from Bank of America" -- things of that nature.
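The shift Stinnett describes, from time-based batch windows to event-driven triggers, can be sketched as a small dispatcher that maps events (a file arriving, an order being placed) to jobs. This is a minimal illustration with hypothetical names, not any particular scheduler's API:

```python
# Minimal sketch of an event-driven job dispatcher, in contrast to a
# fixed nightly batch window. Names here are illustrative only.
from dataclasses import dataclass, field
from typing import Callable, Any


@dataclass
class EventScheduler:
    # Maps an event type to the list of jobs that should run when it fires.
    handlers: dict = field(default_factory=dict)

    def on(self, event_type: str, handler: Callable[[Any], Any]) -> None:
        """Register a job to run whenever this event type occurs."""
        self.handlers.setdefault(event_type, []).append(handler)

    def fire(self, event_type: str, payload: Any = None) -> list:
        """Run every job registered for this event and collect results."""
        return [h(payload) for h in self.handlers.get(event_type, [])]


sched = EventScheduler()
sched.on("file_received", lambda p: f"processing file {p}")
sched.on("order_placed", lambda p: f"fulfilling order {p}")

# A file arriving from a bank triggers processing immediately,
# rather than waiting for a nightly batch window.
print(sched.fire("file_received", "bank_settlement.txt"))
# → ['processing file bank_settlement.txt']
```

A production workload automation tool adds persistence, retries and monitoring on top of this basic pattern, but the core idea is the same: jobs are bound to events, not to the clock.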
It's a fascinating time in IT because this is part of a bigger trend you see called data center automation. Gartner reports put it as one of the hot topics going on in IT right now. And that, of course, encompasses everything from workload automation to server automation, network, database, etc. It's really a piece of a bigger puzzle -- a very important piece -- because, without data, most organizations would be nothing more than a bunch of people and a pile of computers. All these factors are driving more companies to adopt workload automation and get it going in their environment.
Why would a business decide to move its processes to the cloud and why would it not?
Stinnett: Many times in IT, you have peaks and valleys of demand that we can see easily if we stand in front of a department store. If you stand in front of Sears, they don't have a lot of customers shopping for refrigerators at three in the morning, but they have quite a few customers probably shopping between noon and five. So based on their processing demands, do they plan for a noon-to-five rush every day or do they plan for the three-in-the-morning silence? Most businesses do not want to go to either extreme, so they typically decide to look at some sort of cloud solution or augmentation to their existing IT systems, especially for workload management, when they find out they're dealing with peaks and valleys.
Most companies cannot afford to always scale for the peaks. For a typical retail organization, of course, its peak is going to be Black Friday, when its transactions are scaling quite heavily even on its e-commerce site. But, typically, they're not going to buy servers to sit around 364 days just to be used one day a year. That's quite a waste of resources and IT expenditures.
What many companies moving to the cloud are starting to do is find out when peak processing takes place. It either takes place after hours or during the day; it may be very dynamic in nature. And we may process two records or we may process two million records. We don't know that, but we do know how many records or transactions we process on average.
So, what do we do when all of a sudden we jump from our average up to that two million number? A lot of companies are finding that it makes sense at that point to engage cloud resources via public, private or hybrid cloud. There are many things you have to consider. You might say, "Hey, I need to offload some of my workload management to the cloud for processing."
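The decision Stinnett describes, handling average volume in-house and bursting to the cloud only when volume spikes past what local capacity can absorb, reduces to a simple routing rule. The capacity figure below is an illustrative assumption, not a recommendation:

```python
# Sketch of a burst-to-cloud routing decision. The local capacity
# threshold is an assumed, illustrative number; in practice it would
# come from measured in-house throughput.
LOCAL_CAPACITY_RECORDS = 100_000


def choose_target(record_count: int, local_capacity: int = LOCAL_CAPACITY_RECORDS) -> str:
    """Route a batch locally when it fits within in-house capacity;
    otherwise engage cloud resources for the overflow."""
    return "local" if record_count <= local_capacity else "cloud"


print(choose_target(2))          # → local
print(choose_target(2_000_000))  # → cloud
```

Real implementations weigh more factors (data sensitivity, transfer cost, provider availability), as the discussion below makes clear, but the threshold check is the starting point.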
Of course the biggest thing is security. What type of data are you trying to process? For all the things the cloud is, there are still security concerns. It's relatively new territory, even though the cloud and its predecessors have been around for a few years now; it's new for many people and the technology is changing quite rapidly. There's also this question you have to answer: "What's my uptime?"
Amazon, one of the bigger cloud providers, has had some pretty significant outages in the past year or two. So if I'm trying to move some of my nightly processing -- and it's not even just nightly; some of my processing happens during the day now, too -- and a cloud provider's uptime doesn't match my requirements, well, now we have a problem, because that is going to affect my business.
Lastly, I think of bandwidth issues. When you process in-house, going back to our two-million-transaction file we're trying to process, that data is usually local, and I can copy it over a gigabit link from server to server to database, and so forth. If I go to the cloud, I have to worry about how I'm going to get the data out there and how I'm going to pull all that data back when I'm done, because it's not just I/O and CPU cycles that you're paying for. You are also paying to get it back and forth to you. Companies have to take all these things into consideration when they're [thinking about moving] some of their workloads and processing out to the cloud.
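The bandwidth concern above is easy to quantify with back-of-the-envelope arithmetic: transfer time is data size divided by link speed, and egress fees are charged per gigabyte returned. The link speed and per-gigabyte rate below are assumed placeholder values, not any provider's actual pricing:

```python
# Rough estimate of the cost of moving a dataset to and from a cloud
# provider. Both rates are assumed, illustrative values.
def transfer_estimate(data_gb: float,
                      link_gbps: float = 1.0,
                      egress_per_gb: float = 0.09) -> tuple:
    """Return (seconds to move the data one way, dollars to pull it back).

    A gigabit link moves ~1 Gb/s, so a gigabyte (8 gigabits) of data
    takes about 8 seconds per GB at full utilization.
    """
    seconds_one_way = data_gb * 8 / link_gbps
    egress_cost = data_gb * egress_per_gb
    return seconds_one_way, egress_cost


seconds, cost = transfer_estimate(50)  # a 50 GB nightly dataset
print(f"{seconds:.0f} s one way, ${cost:.2f} to bring results back")
```

Even this crude model shows why large nightly files can make in-house processing cheaper than bursting to the cloud: the round trip is paid in both time and money.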
Typically, we break it down really into two categories: First, there are business-critical applications. So far, [those are] staying in-house. If I'm going to process anything to do with financial data or if it's my company's bread and butter or secret sauce, we're probably going to keep that in-house because we need to make sure that is highly available, secure and reliable, and the company is probably willing to spend the money to plan for those peaks.
Then there is noncritical stuff for a company. Going back to our fictitious retailer, it's sales, e-commerce sales, stock, replenishment and so on. They may be doing marketing studies. Say I buy a box of Tide detergent, a case of Mountain Dew and some Pledge cleaning supplies. What can we infer from this? Anybody who has walked into a store or ordered from an online retailer lately knows that marketing is one of the big things these places are doing because they want to target us for different things. But a lot of that stuff, especially when you're looking at marketing data, is aggregated; it's usually not personally identifiable except through some sort of hashed identifier. And that is not business critical. If I can't run my shopper loyalty rewards marketing program tonight because their systems are overloaded with financial stuff, that's OK. It can probably wait. It'd be nice if I could run it, but stuff like that is why we're seeing different industries start moving to the cloud.
They could put a lot of data in the cloud to run and bring it back in-house later, where there are no service-level agreements (SLAs) to worry about or deadlines to miss. We'd rather free up our internal resources to use on business-critical applications and let the external part of the cloud take care of some of the other stuff. As time goes on, we're going to see a re-evaluation of this mix. But for now, that seems to be what most people are using cloud to do: offload some of the workload management.
When we talk about cloud in terms of a private cloud or the cloud that is hosted internally, then the whole dynamic changes. But that's understandable. Now, you're looking at resources that your company does control versus the public cloud, which they do not control.
The big question on many IT directors' minds is: What do I need to process? What is business critical and what is nice to process and I should process but is not critical to running the company on a day-to-day basis?
How do workload automation tools factor into workload management in the cloud?
Stinnett: You cannot have workload management and you can't even have cloud, I believe, unless you have some sort of automation in place. The whole idea behind workload automation is that keyword: automation. It means people can stop doing these manual processes. You don't have to have people hitting buttons, typing in information and monitoring it. Take that out to the cloud. Nobody wants to sit here and manually say, "OK, well I need some resources from my cloud provider. Let me log in to their site, spin up a cloud and put the right OS and the right package out there." At that point you're wasting money, plain and simple.
What companies realize is that in order for this to be successful, and in order to offload some of the workload processing to a cloud, deciding what should go out to the cloud and what's available in terms of resources has to be a fairly automated process. How can I provision that automatically? How can I get the necessary packages and data out there? How can I get it back into the company fully automated?
Again, a lot of these processes are running 24/7 at times when people aren't around, and you certainly don't want to have to wake somebody up at three in the morning just because you need to key in an access code to get to Amazon's cloud. That defeats the whole purpose of workload automation.
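The hands-off provisioning Stinnett argues for is usually written as an idempotent "ensure capacity" step: check what's running, launch only the deficit, and never require a human at three in the morning. The client class below is a hypothetical stand-in for a provider SDK, not a real API:

```python
# Sketch of automated, idempotent capacity provisioning. CloudClient is
# a hypothetical stand-in; real provider SDKs expose comparable calls.
class CloudClient:
    def __init__(self):
        self.instances: list[str] = []

    def launch(self, image: str, count: int) -> list[str]:
        """Start `count` instances of the given image; return their IDs."""
        ids = [f"vm-{len(self.instances) + i}" for i in range(count)]
        self.instances.extend(ids)
        return ids

    def terminate(self, ids: list[str]) -> None:
        """Shut instances down once the burst is over."""
        self.instances = [i for i in self.instances if i not in ids]


def ensure_capacity(client: CloudClient, needed: int,
                    image: str = "worker-image") -> list[str]:
    """Launch only the shortfall; safe to call repeatedly from a scheduler."""
    deficit = needed - len(client.instances)
    if deficit > 0:
        client.launch(image, deficit)
    return client.instances


client = CloudClient()
ensure_capacity(client, 3)   # scales up to 3 workers
ensure_capacity(client, 3)   # no-op: capacity already met, nobody paged
print(client.instances)      # → ['vm-0', 'vm-1', 'vm-2']
```

Because the function is idempotent, a workload automation tool can run it on every burst event without risking duplicate instances or manual intervention.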
One thing that many companies need to do before they even think about offloading anything to the cloud is have a very stable internal workload automation process. Rome wasn't built in a day -- you sort of follow the Agile manifesto here and take this in small chunks. Many organizations, especially startups, don't have a lot of automation in place. It gets going as the company grows.
The first step is getting data processing and other workload management automated so they can process the transactions and the files. Then they can move the data, and then take a step back and see that the workload is automated pretty well. Now, they want to automate their resources -- their internal servers, external cloud providers, etc.
Most companies, in my experience, start with internal first. So, I'm going to be able to provision my servers automatically, provision my blades automatically, get my network resources set up.... And at that point, [companies] usually make the jump to the cloud.
Many cloud providers have standardized a lot of the tools. You are not reinventing the wheel because the cloud provider has already done that for you. They usually provide application programming interfaces [APIs] and other access points, with plenty of documentation available. It's easier to build on top of somebody else's work than to start from scratch. You have to get the full picture to get that vision down the road. People have to say: "What do we want to achieve here?" And when it comes to automation: "Where are we going to start?" And you have to start small.
When they start this kind of process, [companies need employees] to understand [that] the goal is not to automate people out of a job. The goal is to automate the routine and boring -- the stuff that wakes IT guys up at three in the morning on emergency service calls. Automate that, and the rest will take care of itself. Then [the IT staff] can be working on the next great thing and working [to make the] company better.