As a system architect, you must consider many aspects of new development within your organization. Neglecting the core metrics of a new system can cause it to fail, because nobody will know how it is performing or how it contributes to the overall organizational goals. By following a few simple considerations, you can develop a set of metrics suitable for both measuring your newly developed system and contributing positive feedback to the organization’s goals.

Metrics track development efforts
If you hail from a development background, you’re probably familiar with a variety of metrics used to gauge the performance of a development effort. These metrics are the standards by which the development effort is tracked. Common examples are lines of code, size of source files, or number of defects. These development metrics are tracked by a variety of methods, and many organizations consider them the basis for gauging the effectiveness of a development effort. What must be given equal importance is the development of metrics that will measure the performance and effectiveness of the system under development. In addition to assisting with the development of these metrics, you must organize the development of the software that allows you to gather the metrics and report them so that they can supply valuable feedback to your organization.

Consider operations first
Development of your metrics list can start with the operations group that will have to support the system once it is released from development. The operations group will need timely data to measure the system and monitor the performance of the day-to-day operation of your software. If you take a few moments to consider the system you are developing, you can easily come up with an initial list of metrics that can benefit the operations personnel. For instance, in a data collection system you might measure the performance of the system by measuring items similar to the list in Table A.
Table A: Measuring system performance

| Metric | Description |
| --- | --- |
| System Started | Date and time the system started, in UTC. This can be used to return a running time in any unit, such as days, hours, or minutes. |
| Running Time | Total time the system has been running, in milliseconds. |
| Attempted Connections | Count of the total number of connection attempts made since the system started. |
| Connections Completed | Count of connections that successfully completed their transfer and confirmation. |
| Bytes Received | Total number of bytes received since the system started. |
| Messages Received | Total number of messages received by the system. |
| Bytes Sent | Total number of bytes sent out to connections. |
| Commands Sent | Count of commands sent. |
| Active Receiving Connections | Count of the currently active connections that are open and sending information. |
| Connections in Queue | Number of connections currently waiting in the queue to be processed. |
| Connections Failed | Number of connections that have failed to complete successfully. |
| Average Connection Time | The average time a connection spends communicating with the system. |

To formulate a good list of metrics for your operations group, examine other programs, even those that have only one thing in common with your system. Looking at a variety of examples will give you a broad list of potential operational metrics to choose from.
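To make the operational list more concrete, here is a minimal Java sketch of how a few of the Table A metrics might be held in memory. The class name `OperationalMetrics` and all of its methods are hypothetical, chosen only for illustration, and the sketch assumes that the data collection system calls `connectionAttempted` and `connectionClosed` at the appropriate points.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory holder for a few of the Table A metrics. */
final class OperationalMetrics {
    private final Instant systemStarted; // "System Started"; an Instant is always on the UTC timeline
    private final AtomicLong attemptedConnections = new AtomicLong();
    private final AtomicLong connectionsCompleted = new AtomicLong();
    private final AtomicLong connectionsFailed = new AtomicLong();
    private final AtomicLong bytesReceived = new AtomicLong();
    private final AtomicLong totalConnectionMillis = new AtomicLong();
    private final AtomicInteger activeConnections = new AtomicInteger();

    OperationalMetrics(Instant systemStarted) {
        this.systemStarted = systemStarted;
    }

    /** Assumed to be called whenever a client opens a connection. */
    void connectionAttempted() {
        attemptedConnections.incrementAndGet();
        activeConnections.incrementAndGet();
    }

    /** Assumed to be called whenever a connection ends, successfully or not. */
    void connectionClosed(boolean success, long bytes, Duration timeConnected) {
        activeConnections.decrementAndGet();
        bytesReceived.addAndGet(bytes);
        totalConnectionMillis.addAndGet(timeConnected.toMillis());
        if (success) {
            connectionsCompleted.incrementAndGet();
        } else {
            connectionsFailed.incrementAndGet();
        }
    }

    /** "Running Time" in milliseconds, derived from the start time. */
    long runningTimeMillis(Instant now) {
        return Duration.between(systemStarted, now).toMillis();
    }

    /** "Average Connection Time" in milliseconds over all closed connections. */
    double averageConnectionMillis() {
        long closed = connectionsCompleted.get() + connectionsFailed.get();
        return closed == 0 ? 0.0 : (double) totalConnectionMillis.get() / closed;
    }

    long attemptedConnections() { return attemptedConnections.get(); }
    long connectionsCompleted() { return connectionsCompleted.get(); }
    long connectionsFailed() { return connectionsFailed.get(); }
    long bytesReceived() { return bytesReceived.get(); }
    int activeConnections() { return activeConnections.get(); }
}
```

Atomic counters are used here on the assumption that connections are handled on several threads. Storing only the start time and deriving the running time on request avoids keeping two values in sync.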

Strategic metrics track organizational goals
Operational metrics are the types of measurements we’re accustomed to dealing with as computer professionals. Less obvious are strategic metrics. Strategic metrics are those measurements that relate directly to the goals and vision of the organization. The difficulty with strategic metrics is that they are not always directly measured by the system under development, but may get data from that system. The challenge is to recognize those items that the organization will require and build metrics into the system to help support the organizational goals. To help figure out these strategic metrics, consider financial, customer, and internal process issues. Take a look at the sample list in Table B using our data collection concept.
Table B: Data collection sample metrics

| Strategic metric | Description |
| --- | --- |
| Connections per customer | Number of connections per customer, to track customer use of the system. Ideally, compare this to prior methods of customer interaction, or to those same methods if they are still in use. |
| Service level agreement | Using connection data and system running time, you can supply metrics to support your organization’s service level agreement (e.g., uptime, number of failed customer queries). |
| Cost per byte | Measure the throughput of the system to determine the best model. If you lease bandwidth, the number of bytes you use is charged according to the plan you use to pay for bandwidth. If you don’t lease bandwidth, you still must make the most of the bandwidth you have. |
| Memory footprint per connection | Measure the amount of memory required for managing a connection session. This helps with internal processes for managing machine resources. |
| Response speed | How much time does this system save compared to previous methods of performing the work or fulfilling the requests? |
| Accuracy | How accurate is this system compared to the prior system (if one existed)? |

As you can see, the strategic metrics are geared more toward measuring the overall benefit of the system to the organization, not just how fast it performs or how many bytes it transfers. These metrics may not be directly read from the system, but as the architect, you must consider these factors and make sure that the data you do collect in the new system can support the organization’s strategic goals.
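As a rough illustration of how strategic metrics can be derived from data the system already collects, the following Java sketch computes three of the Table B items. The class `StrategicMetrics`, its method names, and the idea of passing in a bandwidth cost figure are all assumptions made for this example. In practice the cost and reporting period would come from outside the system, such as a billing plan or a service level agreement.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper that derives a few Table B metrics from operational data. */
final class StrategicMetrics {
    // Connections per customer, keyed by an assumed customer identifier.
    private final Map<String, LongAdder> connectionsByCustomer = new ConcurrentHashMap<>();

    /** Assumed to be called once per connection, after the customer is identified. */
    void recordConnection(String customerId) {
        connectionsByCustomer.computeIfAbsent(customerId, id -> new LongAdder()).increment();
    }

    long connectionsFor(String customerId) {
        LongAdder count = connectionsByCustomer.get(customerId);
        return count == null ? 0L : count.sum();
    }

    /** Service level agreement: uptime as a percentage of the reporting period. */
    static double uptimePercent(long runningMillis, long periodMillis) {
        if (periodMillis <= 0) {
            throw new IllegalArgumentException("periodMillis must be positive");
        }
        return 100.0 * runningMillis / periodMillis;
    }

    /** Service level agreement: failed connections as a percentage of attempts. */
    static double failureRatePercent(long failed, long attempted) {
        return attempted == 0 ? 0.0 : 100.0 * failed / attempted;
    }

    /** Cost per byte for a period, given the bandwidth cost charged for that period. */
    static double costPerByte(double bandwidthCost, long bytesTransferred) {
        if (bytesTransferred <= 0) {
            throw new IllegalArgumentException("bytesTransferred must be positive");
        }
        return bandwidthCost / bytesTransferred;
    }
}
```

The inputs are the operational counts (running time, attempted and failed connections, bytes transferred), which is why the operational data has to be collected with these calculations in mind.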

How to collect metrics
How do you collect metrics within your systems? Drop a comment into the discussion area and share your ideas and practices for collecting metrics. In a previous article, I presented a simple set of classes for collecting metrics within a Java-based application.