Systems Management: Application Response Measurement (ARM) API
Copyright © 1998 The Open Group

Introduction

Scope and Purpose

The applications that are used to run businesses have changed dramatically over the past few years. In the early 1980s, large applications generally executed on large computers, and were accessed from "dumb" terminals. Non-networked applications executing on personal computers were just beginning to be widely used. Since then, these two application models have moved steadily towards each other, fusing together to form distributed (networked) applications.

The most common programming model for distributed applications is the client/server model. In a client/server application, the application is split into two or more parts. One part is the user or "client" part, and this part generally executes on a personal computer or workstation. The "server" parts execute on computers that provide functions for the client part, that is, they serve the client application. The client and server can run on the same system, but generally they are on different systems. The client part of an application may invoke one or more functions on one or more servers, and it may do a significant amount of processing itself, combining, manipulating, or analyzing the data provided by the servers.

An example of a client/server application might be processing a sales order by retrieving inventory information from one database, sales information from another database, and pricing information from a third. The client part of the application determines if there is sufficient inventory to accept the order, calculates the price based on current market conditions, factors in price discounts for this particular customer, and then invokes more server functions to complete processing of the order.

By contrast, host-centric applications contain all the application logic in one computer system, and users connect through "dumb" terminals to use the application. Examples of the protocols used by these applications are 3270, Telnet, and the X Window System. The response time as seen by a user for a transaction can generally be broken down into two components: the time to process the transaction on the host, and the transit time for the input message and the output response. Processing time at the terminal is usually trivial.

Measuring Service Levels

A monitoring product running at the host is able to measure the service levels of host-centric applications. The monitor observes the input request message that starts the transaction, and then observes the outbound response back to the terminal. The difference between the two times is the amount of time to process the transaction on the host. The monitor generally also measures the time for the outbound response to be sent to the terminal and an acknowledgment to be received, using this as an approximation of the transit time. The combination of the host and transit times is an approximation of the service level seen by the user.
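The arithmetic described above can be sketched as follows. This is only an illustration, assuming a monitor that records three timestamps per transaction; the class and field names are hypothetical and are not taken from any monitoring product.

```python
from dataclasses import dataclass


@dataclass
class HostObservation:
    """Timestamps (in seconds) that a host-side monitor can observe.

    All names here are hypothetical, chosen for illustration only.
    """
    request_received: float   # input message arrives at the host
    response_sent: float      # outbound response leaves the host
    ack_received: float       # terminal acknowledges the response


def host_time(obs: HostObservation) -> float:
    """Time spent processing the transaction on the host."""
    return obs.response_sent - obs.request_received


def transit_time(obs: HostObservation) -> float:
    """Response-to-acknowledgment interval, used as the transit estimate."""
    return obs.ack_received - obs.response_sent


def service_level(obs: HostObservation) -> float:
    """Approximate response time seen by the user: host plus transit time."""
    return host_time(obs) + transit_time(obs)


obs = HostObservation(request_received=10.0, response_sent=10.5, ack_received=10.75)
print(host_time(obs), transit_time(obs), service_level(obs))  # prints: 0.5 0.25 0.75
```

Every input to this calculation is visible from the host, which is why the method works for host-centric applications and breaks down once processing moves to the client.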

Monitoring the performance and availability of distributed applications has proved difficult. Some of the fundamental assumptions on which the host-centric methods depend do not hold for distributed applications. The following examples show why:

The user is typically running an application on a multitasking PC or workstation. When the user presses a key or the mouse button, the specified transaction starts, but the user may be able to continue doing other operations. Put another way, there is no reliable way to correlate keyboard or mouse input operations with business transactions.
One user transaction (which would be classified as a business transaction) may spawn several other component transactions, some of which may execute locally and some remotely. Any measurement agents that exist only in the network layer or in a host (server) will not see the entire picture.
The data may be sent through the network using various protocols, not just one, making the task of packet decoding and correlation much more difficult.
Client/server applications can be complex, taking different execution paths and spawning different component transactions, depending on the results of previous component transactions. Every permutation could take a different form when it goes across the communication link, making it that much harder to reliably correlate network or host (server) observations with what the user sees.

In spite of these difficulties, the need to monitor distributed applications has never been greater. They are increasingly being used in mission-critical roles. An approach that solves the problems listed above is to let the application itself participate in the process. A developer knows unambiguously when transactions begin and end, both those that are visible to the user, and the component transactions that invoke transactions on remote servers.

ARMing Your Applications

With the Application Response Measurement (ARM) API, sections of an application can be marked to define business transactions. By invoking ARM API function calls at the beginning and end of each transaction, the application can be monitored by any of the measurement agents that use data generated by the ARM API. Programs executing on client or server systems can be instrumented.

By instrumenting an application to call the ARM API, that application can be managed by any of the measurement agents that implement ARM. The advantage of this approach is that the user of the application can choose the measurement agent that best meets their needs, without needing to change the application.
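A minimal sketch of this idea follows, assuming a much-simplified agent interface with one call at the start of a transaction and one at the end. The class and method names are hypothetical stand-ins chosen for illustration; they are not the function names or signatures defined by the ARM API.

```python
import time


class RecordingAgent:
    """Hypothetical measurement agent that keeps response times in memory."""

    def __init__(self):
        self.records = []      # completed transactions: (name, status, seconds)
        self._active = {}      # handle -> (name, start time)
        self._next_handle = 1

    def start(self, name):
        """Mark the beginning of a transaction and return a handle for it."""
        handle = self._next_handle
        self._next_handle += 1
        self._active[handle] = (name, time.perf_counter())
        return handle

    def stop(self, handle, status="good"):
        """Mark the end of the transaction identified by handle."""
        name, started = self._active.pop(handle)
        self.records.append((name, status, time.perf_counter() - started))


class NullAgent:
    """Hypothetical agent that ignores every call (no monitoring installed)."""

    def start(self, name):
        return 0

    def stop(self, handle, status="good"):
        pass


def process_order(agent, quantity, in_stock):
    """Business transaction; the instrumentation is the start/stop pair."""
    handle = agent.start("process_order")
    accepted = quantity <= in_stock          # stand-in for the real work
    agent.stop(handle, "good" if accepted else "failed")
    return accepted


agent = RecordingAgent()
process_order(agent, quantity=3, in_stock=10)
process_order(agent, quantity=20, in_stock=10)
process_order(NullAgent(), quantity=1, in_stock=1)   # same code, different agent
for name, status, seconds in agent.records:
    print(name, status, f"{seconds:.6f}s")
```

The application code in `process_order` is identical whichever agent is passed in, which is the property the paragraph above describes: the choice of agent can change without changing the application.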

Using ARM, system administrators will be able to answer key questions such as:

Is the application working correctly (available)?
How is the application performing? What is the response time? What is the workload throughput? You will be measuring the actual service levels experienced by your users.
Why is an application not available or performing poorly? What operation was the application performing when the problem occurred? If a remote server/application was being invoked when the problem occurred, which one?
Who is using the application, how much are they using it, and what kind of operations are being performed? Which servers are providing the services? This information is useful for capacity planning and for charge-back accounting.
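As a rough illustration of how such questions might be answered from collected data, the sketch below summarizes a handful of completed-transaction records. The record layout, the sample values, and the definition of availability as the fraction of successful transactions are all assumptions made for this example; a real measurement agent may define and report these quantities differently.

```python
from collections import Counter

# Hypothetical completed-transaction records: (user, transaction, status, seconds).
records = [
    ("alice", "query_inventory", "good", 0.25),
    ("alice", "place_order", "good", 0.75),
    ("bob", "query_inventory", "failed", 2.0),
    ("bob", "query_inventory", "good", 0.5),
]
interval_seconds = 60.0  # assumed length of the measurement interval


def summarize(records, interval_seconds):
    """Reduce raw records to availability, response time, and usage figures."""
    if not records:
        raise ValueError("no transactions were recorded in this interval")
    good = [r for r in records if r[2] == "good"]
    return {
        # Assumed definition: share of transactions that completed successfully.
        "availability": len(good) / len(records),
        # Mean response time of successful transactions (None if there were none).
        "mean_response": sum(r[3] for r in good) / len(good) if good else None,
        # Workload throughput, in transactions per minute.
        "throughput_per_min": len(records) * 60.0 / interval_seconds,
        # Who is using the application, and which operations they perform.
        "by_user": Counter(r[0] for r in records),
        "by_transaction": Counter(r[1] for r in records),
    }


summary = summarize(records, interval_seconds)
for key, value in summary.items():
    print(key, value)
```

The per-user and per-transaction counts are the kind of figures that capacity planning and charge-back accounting would build on.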

Figure: ARM in the Enterprise

This diagram shows how enterprise management applications, measurement agents that implement the ARM API, and business applications that call the ARM API, work together to provide a robust way to monitor application response.

ARM Version 1.0 and Version 2.0

The ARM version 1.0 API was not adopted or published by The Open Group. Nevertheless, because the ARM version 1.0 API was released by the ARM working group of the Computer Measurement Group (CMG), it is appropriate to position this ARM version 2.0 API in the context of its predecessor.

Several additional features in the ARM version 2.0 API improve the ways applications can be managed, compared to the ARM version 1.0 API:

You can indicate that a transaction is a component of another transaction. Also, you can do transaction correlation within one system or across multiple systems. This permits a better understanding of the overall transaction, how much time each part of the transaction is taking, and where problems are occurring.
You can provide additional information about the transaction, such as the number of bytes or records being processed, or about the state of the application at the moment that the transaction is being processed, such as the length of a work queue. This information (called application-defined metrics) is useful to better understand response times, and how the application can be tuned to perform better.
You can use the new logging agent to do simple verification of your instrumentation. It allows you to determine if the correct parameters are being passed on each call, but it does not function as a measurement agent.

ARM version 2.0 API is backward compatible with ARM version 1.0. Applications instrumented to the ARM 1.0 API will continue to function correctly with agents that implement the additional features of the ARM 2.0 API. Applications instrumented with ARM 2.0 will function correctly with agents that implement the features of ARM 1.0.
