
Applying the right incident management process averts trouble

This free book excerpt will show you how to prevent server failure with monitoring, change, problem and incident management processes.

To avoid finding out about an emergency after it's too late, implement monitoring, change, problem and incident management processes for your IT team.

Abdul A. Jaludi

The Command Center Handbook: Proactive IT Monitoring from Abdul A. Jaludi offers tips to optimize IT operational environments and presents approaches to increase efficiency. Jaludi, who heads operations services company TAG-MC, is an expert in IT infrastructure and command centers.

When it comes to the command center, IT pros often undervalue its primary function: monitoring the environment proactively.

"They allow the command center to become a dumping ground for other departments," Jaludi said. "These departments justify their actions as a cost-saving measure, since most command centers operate 24/7." Over time, this extraneous work becomes the command center's primary focus. The result is an increase in customer-visible outages, he added.

Chapter Five, Command Center Interactions, stresses the importance and necessity of monitoring processes in command centers. When implemented properly, an alert will sound in the command center to warn of a potential failure. The alarm signals the correct units that a failure may occur if action is not taken. Jaludi provides an example:

... An alert would be generated if the hard drive on a server with 500 MB of space gets down to 75 MB free, in other words if 85% of the available space is used. The first alert would be sent to the command center (and hopefully the support staff) when available space on the drive reaches 15%. When the first alert shows up, someone should be dispatched to investigate and correct the issue. If the incident is not corrected, a second alert would be generated and sent when available space reaches 10%. A third alert would be sent when the available space reaches 5%. If the condition is not fixed, when the available space reaches zero, the server will crash and your customers will be unable to perform the functions that generate your income ...
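The escalation in the excerpt can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the 15%, 10%, 5% and zero thresholds come from the passage above, while the function names (`free_percent`, `alert_level`) and the alert labels are assumptions made for the example.

```python
from typing import Optional

# Free-space thresholds (percent) taken from the excerpt, checked from most
# to least severe. The labels are illustrative, not the book's terminology.
THRESHOLDS = [
    (0.0, "outage"),
    (5.0, "third alert"),
    (10.0, "second alert"),
    (15.0, "first alert"),
]


def free_percent(free_mb: float, total_mb: float) -> float:
    """Return free space as a percentage of total capacity."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return 100.0 * free_mb / total_mb


def alert_level(free_mb: float, total_mb: float) -> Optional[str]:
    """Return the most severe alert that applies, or None if space is healthy."""
    pct = free_percent(free_mb, total_mb)
    for limit, label in THRESHOLDS:
        if pct <= limit:
            return label
    return None


if __name__ == "__main__":
    # The 500 MB drive from the excerpt at several fill levels.
    for free in (200, 75, 50, 25, 0):
        print(f"{free} MB free of 500 MB: {alert_level(free, 500)}")
```

With the excerpt's numbers, 75 MB free on a 500 MB drive is exactly 15% free, so it triggers the first alert. In practice the free and total figures would come from the monitoring tool or from a call such as Python's `shutil.disk_usage`, and each returned level would be routed to the command center and the support staff.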

As seen in the example from Chapter Five, acknowledging alerts lets the IT team detect problems such as a degraded server as soon as possible. All problems must be well documented, publicized to the affected departments and resolved quickly with a permanent fix, which is the domain of the problem management process team.

Preview the dos and don'ts of the command center in the book's fifth chapter for free.


Editor's note: This excerpt is from Command Center Handbook: Proactive IT Monitoring, authored by Abdul A. Jaludi, published with CreateSpace Independent Publishing Platform in 2014.


Join the conversation



What's causing problems in your IT command center?
Management uses alerting to shift blame to support staff, whom it fails to provide with adequate resources. Reliability always ends up being the fault of the person who responds to the alert, rather than a question of why little or nothing is ever done to avert issues beforehand (so the same issues recur over and over).