Hardware

Keeping apps up to speed

Are you paying employees to sit around watching an hourglass? Losing customers because that order page on your Web site takes forever to generate and download? Then it's time to focus on application management.



Tying IT to business

Organising the IT function appropriately can assist the alignment of IT and business, Ross suggests: "We believe that a CIO needs to cross-functionalise within the IT departments." A traditionally organised IT operation puts developers in one silo, networks in another, and so on. This leads to finger pointing when things go wrong, so he suggests organising on business process lines, eg, an ERP (SAP) group, a CRM (Siebel) group, and so on. "IT is there to support these critical business applications," he says.

Business people don't really care what fails; they just want their application running again as soon as possible, according to Belger. Consequently, Tivoli takes a whole-of-infrastructure view and works with application vendors to provide monitoring and management.

Similarly, Mercury offers Topaz application-specific monitors so customers can collect information from within various .NET and J2EE application servers. "It's an area we've put a lot of R&D into," says Lilley; otherwise, the application server is a black box with a lot happening inside it. Topaz lets IT staff look inside EJBs, database connections, and so on. This is good for the rapid resolution of problems, he says, as it identifies the area where things are going wrong.
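
Topaz's internals are, of course, proprietary, but the general technique is easy to picture: wrap each layer of a transaction in a timing probe, so a slow database call stands out from a slow EJB. A minimal sketch in Python (all names here are hypothetical, not Mercury's API):

    import time
    from collections import defaultdict

    # Illustrative sketch only -- hypothetical names, not Mercury's API.
    latencies = defaultdict(list)  # component name -> call durations (ms)

    def probe(component):
        """Record how long each call to the wrapped function takes."""
        def decorator(fn):
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return fn(*args, **kwargs)
                finally:
                    elapsed_ms = (time.perf_counter() - start) * 1000
                    latencies[component].append(elapsed_ms)
            return wrapper
        return decorator

    @probe("database")
    def fetch_order(order_id):
        time.sleep(0.05)  # stands in for a real query
        return {"id": order_id}

    fetch_order(42)
    for name, times in latencies.items():
        print(f"{name}: worst {max(times):.1f}ms over {len(times)} calls")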

Where do we go from here?
Application management is about the three Ms—monitor, manage, and maintain—says Ciardulli. "You pick the technology, and we'll be able to find something [to monitor and manage it]." Research commissioned by Mercury, however, showed that while most organisations claim to monitor performance from an end-user perspective, they actually do it subjectively and retrospectively, eg, through surveys and analysing help desk calls. Only a minority do real-time monitoring, says Lilley. "A lot of application management is bottom-up", but you should monitor transactions by looking at real and synthetic users "to get that end-user perspective".
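
Synthetic monitoring is less exotic than it sounds: at its simplest, it means replaying a key transaction on a schedule and timing it from the outside, as a customer would experience it. A minimal sketch (the URL and response-time target are placeholders):

    import time
    import urllib.request

    # Synthetic-transaction probe: time a key page end to end, as a user
    # would see it. URL and response-time target are placeholders.
    URL = "https://example.com/order"
    SLA_SECONDS = 4.0

    start = time.perf_counter()
    try:
        with urllib.request.urlopen(URL, timeout=30) as response:
            response.read()  # include the download, not just the headers
        elapsed = time.perf_counter() - start
        verdict = "OK" if elapsed <= SLA_SECONDS else "SLA BREACH"
        print(f"{verdict}: {elapsed:.2f}s")
    except OSError as exc:  # timeouts and connection failures land here
        print(f"FAILED: {exc}")

Run from a scheduler every few minutes, even a probe this crude gives the real-time, user's-eye view the research found lacking.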

Capturing all the data that monitoring packages gather can give some remarkable insights. One Compuware customer has collected 1.5TB of performance data over two years. Data mining revealed that whenever there was a problem with a particular router in Hobart, a router in Broome rebooted. Further investigation showed this was the result of both routers being given the same IP address, so they were both responding to the remote reboot command.
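
That discovery is essentially event correlation: mining the logs for events that keep occurring close together in time. A toy version of the idea, using invented log entries:

    from collections import Counter
    from itertools import combinations

    # Hypothetical log entries: (timestamp in seconds, event name)
    events = [
        (100, "hobart_router_fault"), (103, "broome_router_reboot"),
        (500, "sydney_disk_full"),
        (900, "hobart_router_fault"), (904, "broome_router_reboot"),
    ]
    WINDOW = 10  # only pair events this close together (seconds)

    pairs = Counter()
    for (t1, e1), (t2, e2) in combinations(sorted(events), 2):
        if e1 != e2 and t2 - t1 <= WINDOW:
            pairs[(e1, e2)] += 1

    for (first, second), count in pairs.most_common():
        print(f"{first} -> {second}: seen together {count} times")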

"Application management is about the three Ms—monitor, manage, and maintain."
—Sal Ciardulli, systems consulting manager, Quest Software, Asia-Pacific

Then there is "Neugent" (neural agent) technology, developed by Computer Associates and already used in the Unicenter suite. Neugents "learn" the normal state of a system—despite a rule of thumb that CPU utilisation should not exceed 90 percent, a neugent can learn that a particular system is OK at 98 percent. If the data feed to that system fails, utilisation may fall to two percent, which the neugent will recognise as an error and raise an alert. It will also correlate the drop in CPU utilisation with the drop in network traffic leaving the system. Once the cause has been determined, the set of circumstances can be labelled in a meaningful way to help recognition if they recur. A neugent can take into account the circumstances leading up to the event and, if they arise again, can predict the failure before it happens and estimate its probability. These circumstances can be spread across multiple systems, and the analysis can identify trends that people can't spot. Collecting the data in this way also lets you replay an event to see how it occurred.
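
CA has not published how neugents work internally, but the underlying idea, alerting on departures from a learned normal rather than a fixed rule of thumb, can be sketched with nothing more than a mean and a standard deviation:

    import statistics

    # Learn this system's own normal from history (invented figures);
    # not CA's actual algorithm, just the underlying idea.
    history = [97, 98, 96, 98, 99, 97, 98, 96, 98, 97]  # this box runs hot
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)

    def check(reading, max_deviations=3.0):
        """Alert when a reading strays too far from the learned normal."""
        z = abs(reading - mean) / stdev
        return f"ALERT (z={z:.1f})" if z > max_deviations else "normal"

    print(check(98))  # high by the 90 percent rule of thumb, normal here
    print(check(2))   # the failed data feed: far below learned normal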

Ross also stresses the importance of predicting incidents. Algorithms exist to forecast shortages of disk space, transaction times exceeding thresholds, and other SLA exceptions. "This is where the whole optimisation thing comes in," he explains—if you can see you'll need more bandwidth in 10 days and more server space in 90 days, you know which is most urgent. Such forecasts do not necessarily mean that resources must be added, as they may result from errors (eg, temporary files not being deleted or users submitting badly-formed SQL queries) rather than genuine shortages.
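
Forecasts like these typically come from fitting a trend to historical usage. A least-squares sketch (the sample figures are invented) estimating how long before a disk fills:

    # Least-squares trend on disk usage (invented samples); real tools
    # fit far richer models, but the principle is the same.
    days = [0, 7, 14, 21, 28]
    used_gb = [400, 412, 425, 436, 450]
    CAPACITY_GB = 600

    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(used_gb) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, used_gb)) \
            / sum((x - mean_x) ** 2 for x in days)

    days_left = (CAPACITY_GB - used_gb[-1]) / slope
    print(f"Growing {slope:.2f}GB/day; full in roughly {days_left:.0f} days")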

Belger says IBM's autonomic computing push also involves trends and predictive analysis—if you can catch the problem before it becomes noticeable, that's effective monitoring.

Executive summary

  • Design for reliability, and ensure your chosen architecture can carry the load you will put on it.
  • Consider bringing in experts to ensure packaged software is correctly configured—but check their expertise first.
  • There are many tools to help manage your applications. You may be inclined to seek a one-stop solution, but it may be more pragmatic to find a product that meets most of your needs and then augment it with other vendors' offerings.
  • Test early and test often. Don't leave testing until just before deployment.
  • Change management and the corresponding documentation can be a key to application reliability.
  • Establish performance baselines, and then you can look for trends and spot abnormal situations. Forecasting may help you head off a problem before it occurs. Some monitoring tools embody best-practice metrics.
  • Look at availability from a user's perspective—the server might be up, but that doesn't mean transactions are being processed promptly.
  • Work with line-of-business managers to determine which systems should get priority in the event of multiple failures.
  • Self-managing systems are coming, but current functionality is limited. However, software-driven analysis can detect patterns that humans would probably miss.
