Universal Measurement Architecture Guide
Copyright © 1997 The Open Group
Features and Benefits of the UMA Interfaces
This chapter further describes the interfaces defined
in the two UMA interface specifications:
- the UMA Data Capture Layer Interface (DCI) specification (see reference DCI)
- the UMA Measurement Layer Interface (MLI) specification (see reference MLI).
It also describes how they relate to one another.
Features and Benefits of the DCI
The Data Capture Interface (DCI) is the lowest architectural layer in the Universal
Measurement Architecture (UMA). This section describes the DCI and the services it
provides, and explains which problems this layer solves and which problems it was not
meant to solve.
The DCI is a collection of programming interfaces. The DCI specification defines the set
of DCI interfaces and the arguments and return values for those interfaces. The DCI
specification also defines the service provided by these programming interfaces.
Performance Management and the DCI
The DCI addresses several important problems in the performance management arena:
- it provides a consistent interface between the system functions that provide
  metrics and the functions that consume them
- it allows any system entity (an application, a daemon, or the operating system)
  to provide metrics
- it separates the metric source from the method for acquiring the metrics. This
  allows metric consumers to use a uniform acquisition method regardless of
  source.
One of the problems the DCI was not meant to solve is the transmission of data across the
network. The DCI interfaces explicitly limit their scope to metric transmission between
providers and consumers on the same system. The reason for this scope limitation is that
the intersystem metric transmission problem is already addressed both by the higher UMA
architectural levels (MLI and Data Services Layer) and by existing solutions, such as
SNMP.
In summary, the DCI is a relatively simple collection of interfaces that provides a uniform
mechanism for transmitting and collecting performance information from any system
entity, from the operating system to applications. Its primary benefits are the
standardisation of the collection interface, the elimination of the need for prior
knowledge of the metrics being collected, and the use of a uniform access mechanism
regardless of metric source.
DCI Service
The service provided by this API (Application Programming Interface) is twofold. First,
the DCI acts as a connection broker between the system components that produce metrics
(metric providers) and the system components that consume metrics (metric consumers).
Second, the DCI provides a repository, called the DCI name space, in which metric
providers store information about the set of available metrics. Metric consumers can
traverse and interrogate the DCI name space to find out about the available metric set.
The name space does not store the metrics themselves, only information about them; the
metrics are managed and supplied by the individual metric providers. The DCI
structure and the client/server relationships are illustrated in
DCI Structure and Client/Server Relationships.
Figure: DCI Structure and Client/Server Relationships
To relate this figure to the earlier figure showing the full UMA architecture, note that
when an implementation includes the MLI, the UMA Data Services and Measurement Control
Layers (which are the MLI service layers) play the role of a DCI metrics consumer.
Name Space
Through use of the DCI, performance applications and MLI service consumers can
traverse the name space and find out what types of metrics are available, the units and
data types of individual metrics, the number and type of available measured objects,
and human-readable descriptive labels for both the metrics and measured objects.
Through the use of wildcards in the description of a metric,
a metric consumer can request multiple metrics in one call.
This reduces the cost of delivering the metrics and
minimises the skew between them.
In the terminology used by the DCI specification, the metrics name space contains names
for metric classes and instances of those classes. A metric class is a grouping of metrics
together with the information used to describe that metric set. These metric attributes
include such things as units and data types. A metric instance is a representation of a
measured object, such as a disk. Thus there can be a metric class which describes disk I/O
metrics, and that class can have five instances, one for each of the system's disk drives.
A metrics consumer can find out about the metrics by reading the metric class attributes.
The consumer can then read the disk performance data for one or all class instances. It is
at this point that the DCI's connection broker service comes into play. The metrics
consumer does not require prior knowledge of which provider supports the desired metric
set, nor does it need to know the mechanics of how those metrics are delivered. This is
all handled by the DCI service and the relationship it maintains with the set of system
providers.
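The class, instance and attribute terminology above can be pictured with a small
data-structure sketch. This is an illustration only and not the DCI programming
interface: the names MetricClass, NameSpace, describe and match, the "disk_io" class and
its attribute values are all hypothetical, and shell-style pattern matching stands in for
whatever wildcard syntax the DCI specification actually defines.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class MetricClass:
    """Describes a group of metrics; holds attributes, not metric values."""
    name: str
    attributes: dict                                # metric name -> units and data type
    instances: list = field(default_factory=list)   # names of measured objects
    provider: str = ""                              # known to the broker, not to consumers


class NameSpace:
    """Toy repository of metric class descriptions (hypothetical, not the DCI API)."""

    def __init__(self):
        self._classes = {}

    def register(self, metric_class):
        """A provider stores information about the metrics it can supply."""
        self._classes[metric_class.name] = metric_class

    def describe(self, class_name):
        """Return the attributes a consumer reads before requesting data."""
        return self._classes[class_name].attributes

    def match(self, class_name, instance_pattern="*"):
        """Expand a wildcard into (class, instance) pairs for a single request."""
        cls = self._classes[class_name]
        return [(cls.name, inst) for inst in cls.instances
                if fnmatchcase(inst, instance_pattern)]


# One disk I/O class with five instances, one per disk drive (made-up names).
ns = NameSpace()
ns.register(MetricClass(
    name="disk_io",
    attributes={"reads": {"units": "operations", "type": "uint64"},
                "writes": {"units": "operations", "type": "uint64"}},
    instances=["disk0", "disk1", "disk2", "disk3", "disk4"],
    provider="kernel",
))
```

In this sketch a consumer calls ns.describe("disk_io") to learn units and data types, then
ns.match("disk_io") to address every instance in one request, without ever referring to
the provider field.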
Polled Metric and Event Support
There are two types of data supported by the DCI: polled metrics and events. The
distinction between the two is whether the consumer or the provider is primarily
responsible for metric delivery. In the case of polled metrics, the consumer gets metrics
at whatever rate is convenient. In the case of events, consumers must wait for providers
to deliver events as they occur.
Traces in UMA are implemented as high-frequency events and
are normally directed to a file by the DCI consumer.
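The difference in who drives delivery can be sketched as follows. This is a minimal
illustration under stated assumptions, not the DCI interface: the Provider class and its
read, subscribe and raise_event methods are hypothetical names chosen for the example.

```python
class Provider:
    """Toy metric provider supporting both delivery styles (hypothetical)."""

    def __init__(self):
        self._counter = 0
        self._subscribers = []

    def do_work(self):
        """Simulate activity that advances a counter."""
        self._counter += 1

    def read(self):
        """Polled metric: the consumer decides when to ask."""
        return self._counter

    def subscribe(self, callback):
        """Event: the consumer registers and then waits for delivery."""
        self._subscribers.append(callback)

    def raise_event(self, name):
        """The provider decides when an event is delivered."""
        for callback in self._subscribers:
            callback(name, self._counter)


provider = Provider()
received = []
provider.subscribe(lambda name, value: received.append((name, value)))

provider.do_work()
provider.do_work()
polled = provider.read()            # consumer-driven, at a rate the consumer chooses
provider.raise_event("threshold")   # provider-driven, when the event occurs
```

A trace, in these terms, would simply be a subscriber that appends each high-frequency
event to a file.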
Features and Benefits of the MLI
The Measurement Layer Interface (MLI) is an application programming
interface (API) and a set of services
(the UMA Data Services Layer) that simplify the implementation of
measurement application programs (MAPs) in a distributed environment.
Note: In the following discussion we refer to both polled data and events as data.
The MLI implements the following aspects of the UMA architecture:
- allows simple specification of polled data and event collection parameters
- establishes a consistent message architecture for UMA data, making data
  available in simply parsed structures
- manages the distributed collection, reporting and recording of current and
  historical data
- provides synchronised capture of data
- implements filtering of data based on selection criteria and thresholds to
  minimise network traffic
- implements seamless switching between current and historical data.
Data Collection, Reporting and Recording
Through the MLI, a MAP can specify the types and characteristics of data to be reported
to a MAP. UMA distinguishes between the reporting of data to a MAP and data
collection. A MAP requests data from a specified source to be reported to a specified
destination.
The UMA services act on behalf of a MAP to perform the actual data collection
through the Data Capture Interface (DCI). Performance overhead is
minimised by making use of existing collections in progress for other MAPs that have
requested the same performance measurement data or events.
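One way to picture the reuse of collections already in progress is a registry that starts
a collection only for the first requester and stops it when the last requester leaves.
This is a sketch of the idea only; the CollectionManager class, its methods and the metric
names are hypothetical and do not reflect how an actual UMA implementation is structured.

```python
class CollectionManager:
    """Toy model of sharing one collection among several requesters."""

    def __init__(self):
        self._subscribers = {}        # metric name -> set of MAP identifiers
        self.collections_started = 0  # how many collections were actually begun

    def request(self, map_id, metric):
        """Register interest; start a collection only if none is in progress."""
        if metric not in self._subscribers:
            self._subscribers[metric] = set()
            self.collections_started += 1
        self._subscribers[metric].add(map_id)

    def release(self, map_id, metric):
        """Drop interest; stop the collection when no MAP needs it any more."""
        maps = self._subscribers.get(metric, set())
        maps.discard(map_id)
        if not maps:
            self._subscribers.pop(metric, None)

    def active(self):
        """Return the metrics currently being collected."""
        return sorted(self._subscribers)


mgr = CollectionManager()
mgr.request("map_a", "cpu")
mgr.request("map_b", "cpu")    # reuses the collection started for map_a
mgr.request("map_b", "disk")
```

Here three requests result in only two collections, because the second request for "cpu"
attaches to the one already running.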
UMA Messages
UMA messages provide the basis for transmitting existing notifications
and data from UMA to a MAP. In addition, they are the default basis
for transmitting requests between Data Services Layers on
distributed nodes.
The data in UMA messages is identified by classes and subclasses;
this is defined in detail in UMA Data Pool.
Control Messages
UMA control messages include MAP requests to the UMA facility, UMA condition
notifications from UMA to a MAP, and, in distributed environments, request and
acknowledgment messages between remote and local Data Services Layers.
Data Messages
Data messages contain either interval data, event data or configuration data:
- interval data is requested by a MAP for capture at the end of a specified time
  interval. The data reported through the MLI is either the difference in value of
  the requested metrics over the interval or the absolute counter values.
- event data consists of notification messages to the MAP indicating that some
  predefined set of system events has occurred. The system events include UMA
  configuration (for example, the availability of metrics), system configuration
  (for example, hardware information such as the number and type of processors),
  and process-end summaries.
- configuration data informs an MLI-based application of the data classes
  and subclasses available for each provider registered with the DCI.
Certain message subclasses have both interval and event forms. This permits the MAP to
select whether data is to be reported at each interval end, at an event (for example, the
termination of a process), or both.
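The two forms of interval data described above (differences over the interval, or
absolute counter values) amount to simple arithmetic on two samples. The sketch below is
illustrative only: the function name interval_values, its absolute flag and the sample
counters are hypothetical, and it assumes monotonically increasing counters that do not
wrap during the interval.

```python
def interval_values(previous, current, absolute=False):
    """Return per-metric values for one interval.

    previous and current are dicts of cumulative counter samples taken at the
    start and end of the interval. With absolute=True the end-of-interval
    counters are returned unchanged; otherwise the difference is returned.
    """
    if absolute:
        return dict(current)
    return {name: current[name] - previous[name] for name in current}


# Made-up samples for a single disk at the start and end of an interval.
start = {"reads": 100, "writes": 40}
end = {"reads": 160, "writes": 45}

deltas = interval_values(start, end)
totals = interval_values(start, end, absolute=True)
```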
Depending on the specified destination, a data message may be directed to the MAP itself,
to UMADS (a common UMA data storage facility), or to a private file for later
processing.
Screening and Filtering of Data
UMA provides two means by which the message traffic to a MAP (and possibly the
associated network traffic) can be reduced:
- establish threshold settings, thereby preventing the transmission of data
  messages unless the threshold conditions are satisfied (for example, when the
  run queue length reaches a particular value)
- adjust the granularity of the collected data (for example, by restricting
  reporting to a particular process or user id).
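Both reduction techniques can be thought of as predicates applied before a message is
sent. The following sketch assumes messages are plain dictionaries; the function names
above_threshold and restrict, the field names and the sample values are hypothetical and
are not taken from the MLI specification.

```python
def above_threshold(messages, metric, threshold):
    """Keep only messages whose metric has reached the threshold value."""
    return [m for m in messages if m[metric] >= threshold]


def restrict(messages, **criteria):
    """Keep only messages matching every selection criterion (for example a user id)."""
    return [m for m in messages
            if all(m.get(key) == value for key, value in criteria.items())]


# Made-up global samples: only two of the four reach a run queue length of 5.
system_samples = [{"runqueue": 2}, {"runqueue": 7}, {"runqueue": 3}, {"runqueue": 9}]
to_send = above_threshold(system_samples, "runqueue", 5)

# Made-up per-process records, restricted to a single user id.
process_records = [
    {"pid": 101, "uid": 500, "cpu": 12},
    {"pid": 102, "uid": 501, "cpu": 30},
    {"pid": 103, "uid": 500, "cpu": 4},
]
for_one_user = restrict(process_records, uid=500)
```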
Constructed Workloads and Summarisation
The UMA MLI supports requests for workload construction by permitting
workloads to be labelled. These constructed workloads typically represent
the result of a request for filtering and/or summarisation of workload metric
subclasses. For example, one could request the selection of all commands
starting with the letters "abc", and one could additionally request that a
specific per-work-unit metric subclass report its process metrics over
the sum of all processes whose command names start with these same
letters.
A constructed workload is assigned an identifier by the caller,
which can then be used to tag this workload for later reference.
A special constructed workload that is the complement of a
specified workload is also available. The complement workload metrics
are derived by subtracting the selected per-work-unit workload data
values from the available global equivalents. For example, for
reporting at the process level, if the selection criterion is "User
Name: Albert", then the CPU utilisation metric for the complement
workload would consist of the global CPU utilisation minus the usage
for all processes running under the user name "Albert".
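The summarisation and complement calculations above reduce to a sum and a subtraction.
The sketch below works through both examples with made-up numbers; the function names
summarise and complement, the record layout and the figures are hypothetical and serve
only to show the arithmetic.

```python
def summarise(processes, selector):
    """Sum the CPU usage of every process selected for the workload."""
    return sum(p["cpu"] for p in processes if selector(p))


def complement(global_cpu, processes, selector):
    """Global value minus the selected workload's value."""
    return global_cpu - summarise(processes, selector)


# Made-up per-process CPU utilisation figures, in percent.
processes = [
    {"user": "Albert", "command": "abcd", "cpu": 10},
    {"user": "Albert", "command": "vi", "cpu": 15},
    {"user": "Beth", "command": "abc", "cpu": 20},
]
global_cpu = 60  # includes usage not attributed to any listed process


def by_albert(p):
    return p["user"] == "Albert"


def starts_abc(p):
    return p["command"].startswith("abc")


albert_cpu = summarise(processes, by_albert)                    # 10 + 15
abc_cpu = summarise(processes, starts_abc)                      # 10 + 20
not_albert_cpu = complement(global_cpu, processes, by_albert)   # 60 - 25
```

Note that the complement is taken against the global figure rather than against the sum
of the listed processes, so it also accounts for usage that no selected process reports.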
UMA Data Storage
UMA provides for the reading and writing of messages to and from conventional (private)
files. In addition, UMA provides UMADS, a common facility for access and maintenance
of historical data.
UMADS maintains individually accessible collections of data by host, but there is no
requirement that data for a specific host be kept on that host. Instead, a systems
administrator can arrange to have UMADS collections for any number of hosts stored on
performance data servers.
Seamless Access
UMA provides seamless access between historical and recent (live) data. This means that
a MAP may be receiving UMADS historical data until the time reaches the present, at
which time UMA automatically switches its source to provide live data (see
Seamless Switch - Historical to Recent Data).
Figure: Seamless Switch - Historical to Recent Data
UMA also provides a seek mechanism, so that a MAP can navigate through time and can
seamlessly move from the present time to UMADS data (or the reverse). A seek from
RECENT (current data) to UMADS (historical data) is illustrated in
Backwards Seek - Recent to Historical Data.
Figure: Backwards Seek - Recent to Historical Data
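The historical-to-live switch described in this section can be modelled as a single
stream that drains stored records first and then continues from the live source. This is
a conceptual sketch only: the seamless function, the (timestamp, value) record layout and
the start_time parameter (standing in for a seek position) are assumptions for the
example, not part of the MLI.

```python
def seamless(historical, live, start_time):
    """Yield (timestamp, value) records from start_time onwards.

    Stored records are delivered first. Once they are exhausted the stream
    continues from the live source, skipping anything already delivered.
    """
    last = None
    for ts, value in historical:
        if ts >= start_time:
            last = ts
            yield ts, value
    for ts, value in live:
        if ts >= start_time and (last is None or ts > last):
            last = ts
            yield ts, value


# Made-up data: the live source overlaps the stored history at timestamp 3.
stored = [(1, "a"), (2, "b"), (3, "c")]
live_feed = iter([(3, "c"), (4, "d"), (5, "e")])

records = list(seamless(stored, live_feed, start_time=2))
```

The consumer iterates over one stream and never needs to know at which record the source
changed.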
Data Capture Synchronisation
The UMA Data Services and Measurement Control Layers enable better synchronised
data capture in two ways:
- first, the UMA Data Services can utilise global time synchronisation
  facilities, if they are available, to ensure that polled data collections
  on different platforms occur at the same time
- second, on each individual platform, the UMA Measurement Control Layer merges all
  measurement requests for polled data so that they may be requested at one time by a
  single process. This reduces the time skew of the data to the length of the collection
  time itself. UMA provides an optional additional check on the time skew at the UMA
  subclass collection level. If the collection time duration for the subclass is
  inordinately long, the capture can be re-attempted immediately.