Governance, Risk, Security, and Compliance

Area Description

Operating at scale requires a different mindset. When the practitioner was starting out, the horizon seemed bounded only by imagination, will, and talent. At enterprise scale, it is a different world. The practitioner finds themselves constrained by indifferent forces and hostile adversaries, some of them competing fairly, and others seeking to attack by any means. Whether or not the organization is a for-profit, publicly traded company, it is now large enough that audits are required; it is also likely to have directors of some nature. The concept of “controls” has entered the practitioner’s awareness.

As a team of teams, the practitioner needs to understand resource management, finance, the basics of managing and coordinating multiple products, and cross-functional processes. At the enterprise level, they also need to consider questions of corporate governance. Stakeholders have become more numerous, and their demands have multiplied, so the long-established practice of forming a governing body is applied.

Security threats increase proportionally to the company’s size. The talent and persistence of these adversaries are remarkable. Other challenging players are, on paper, “on the same side”; auditors, for example, are never to be taken for granted. Why are they investigating IT systems? What are their motivations and responsibilities? Finally, what laws and regulations are relevant to IT?

As with other Competency Areas in the later part of this document, we are going to some degree introduce this topic “on its own terms”. We will then add additional context and critique in subsequent sections.

Your organization has been concerned with security as a technical practice since it started; otherwise, it would not have gotten this big. But now it has a Chief Information Security Officer, formal risk management processes, a standing director-level security steering committee, auditors, and compliance specialists. That kind of formalization does not usually happen until an organization grows to a certain size.

More than for any other Competency Area, the placement of this material, and especially of its Security subsection, draws attention. Again, any topic in any Competency Area may be a matter of concern at any stage in an organization’s evolution. Security technical practices were introduced in Context I.

This document needed the content in Context III to get this far. It was critical to understand structure, how the organization organizes its strategic investments, and how it engages in operational activities. In particular, it is difficult for an organization to govern itself without some ability to define and execute processes, as processes often support governance controls and security protocols.

This Competency Area covers “Governance, Risk, Security, and Compliance” because there are clear relationships between these concerns. They have important dimensions of independence as well. It is interesting that Shon Harris' popular CISSP Exam Guide starts its discussion of security with a chapter titled “Information Security Governance and Risk Management”. Governance leads to a concern for risk, and security specializes in certain important classes of risk. Security requires grounding in governance and risk management.

Compliance is also related but again distinct: it is the concern for adherence to laws and regulations and, secondarily, to internal policy.

Competency Area 10 "Governance, Risk, Security, and Compliance" High-Level Dimensions

  • Define governance versus management

  • Describe key objectives of governance according to major frameworks

  • Define risk management and its components

  • Describe and distinguish assurance and audit, and describe their importance to digital operations

  • Discuss digital security concerns and practices

  • Identify common regulatory compliance issues

  • Describe how governance is retaining its core concerns while evolving in light of Digital Transformation

  • Describe automation techniques relevant to supporting governance objectives throughout the digital delivery pipeline

Governance

Description

What Is Governance?

The system by which organizations are directed and controlled.
— Cadbury Report

To talk about governing digital or IT capabilities, we must talk about governance in general. Governance is a challenging and often misunderstood concept. First and foremost, it must be distinguished from “management”. This is not always easy but remains essential. The ISACA COBIT framework, across its various versions, has made a clear distinction between governance and management, which encompass different types of activities, organizational structures, and purposes. In most enterprises, governance is the responsibility of the Board of Directors under the leadership of the chairperson, while management is the responsibility of the executive management under the leadership of the CEO.

A Governance Example

Here is a simple explanation of governance:

Suppose you own a small retail store. For years, you were the primary operator. You might have hired an occasional cashier, but that person had limited authority; they had the keys to the store and cash register, but not the safe combination, nor was their name on the bank account. They did not talk to your suppliers. They received an hourly wage, and you gave them direct and ongoing supervision.[1] In this case, you were a manager. Governance was not part of the relationship.

Now, you wish to go on an extended vacation — perhaps a cruise around the world, or a trek in the Himalayas. You need someone who can count the cash and deposit it, and place orders with and pay your suppliers. You need to hire a professional manager.

They will likely draw a salary, perhaps some percentage of your proceeds, and you will not supervise them in detail as you did the cashier. Instead, you will set overall guidance and expectations for the results they produce. How do you do this? And perhaps even more importantly, how do you trust this person?

Now, you need governance.

As noted above, one of the most firmly reinforced concepts in the COBIT guidance (more on this and ISACA in the next section) is the need to distinguish governance from management. Governance is by definition a Board-level concern. Management is the CEO’s concern. In this distinction, we can still see the shop owner and his or her delegate.

Theory of Governance

In political science and economics, the need for governance is seen as an example of the principal-agent problem [93]. Our shopkeeper example illustrates this. The hired manager is the “agent”, acting on behalf of the shop owner, who is the “principal”.

In principal-agent theory, the agent may have different interests than the principal. The agent also has much more information (think of the manager running the shop day-to-day, versus the owner off climbing mountains). The agent is in a position to do economic harm to the principal; to shirk duty, to steal, to take kickbacks from suppliers. Mitigating such conflicts of interest is a part of governance.

In larger organizations (such as yours is now), it is not just a simple matter of one clear owner vesting power in one clear agent. The corporation may be publicly owned, or in the case of a non-profit, it may be seeking to represent a diffuse set of interests (e.g., environmental issues). In such cases, a group of individuals (directors) is formed, often termed a “Board”, with ultimate authority to speak for the organization.

The principal-agent problem can be seen at a smaller scale within the organization. Any manager encounters it to some degree, in specifying activities or outcomes for subordinates. But this does not mean that the manager is doing “governance”, as governance is by definition an organization-level concern.

The fundamental purpose of a Board of Directors and similar bodies is to take the side of the principal. This is easier said than done; Boards can become overly close to an organization’s senior management — the senior managers are real people, while the “principal” may be an amorphous, distant body of shareholders and/or stakeholders.

Because governance is the principal’s concern, and because the directors represent the principal, governance, including IT governance, is a Board-level concern.

There are various principles of corporate governance we will not go into here, such as shareholder rights, stakeholder interests, transparency, and so forth. However, as we turn to our focus on digital and IT-related governance, there are a few final insights from principal-agent theory that are helpful to understanding governance. Consider:

the heart of principal-agent theory is the trade-off between (a) the cost of measuring behavior and (b) the cost of measuring outcomes and transferring risk to the agent. [93]

What does this mean? Suppose the shopkeeper tells the manager, “I will pay you a salary of $50,000 while I am gone, assuming you can show me you have faithfully executed your daily duties.”

The daily duties are specified in a number of checklists, and the manager is expected to fill these out daily and weekly, and for certain tasks, provide evidence they were performed (e.g., bank deposit slips, checks written to pay bills, photos of cleaning performed, etc.). That is a behavior-driven approach to governance. The manager need not worry if business falls off; they will get their money. The owner has a higher level of uncertainty; the manager might falsify records, or engage in poor customer service so that business is driven away. A fundamental conflict of interest is present; the owner wants their business sustained, while the manager just wants to put in the minimum effort to collect the $50,000. When agent responsibilities can be well specified in this manner, it is said they are highly programmable.

Now, consider the alternative. Instead of this very scripted set of expectations, the shopkeeper might tell the manager, “I will pay you 50% of the shop’s gross earnings, whatever they may be. I’ll leave you to follow my processes however you see fit. I expect no customer or vendor complaints when I get back.”

In this case, the manager’s behavior is more aligned with the owner’s goals. If they serve customers well, they will likely earn more. There are any number of hard-to-specify behaviors (less programmable) that might be highly beneficial.

For example, suppose the store manager learns of an upcoming street festival, a new one that the owner did not know of or plan for. If the agent is managed in terms of their behavior, they may do nothing — it’s just extra work. If they are measured in terms of their outcomes, however, they may well make the extra effort to order merchandise desirable to the street fair participants, and perhaps hire a temporary cashier to staff an outdoor booth, as this will boost store revenue and therefore their pay.

(Note that we have considered similar themes in our discussion of Agile and contract management, in terms of risk sharing.)

In general, it may seem that an outcome-based relationship would always be preferable. There is, however, an important downside. It transfers risk to the agent (e.g., the manager). And because the agent is assuming more risk, they will (in a fair market) demand more compensation. The owner may find themselves paying $60,000 for the manager’s services, for the same level of sales, because the manager also had to “price in” the possibility of poor sales and the risk that they would only make $35,000.
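One way to make the arithmetic concrete (the even split between a good and a bad year below is an assumed illustration, not a figure given in the example): if a bad year would pay the manager only $35,000 and a good year $85,000, the expected payout of the outcome-based contract is:

```latex
% Illustrative assumption: equal chance of a good or a bad year
E[\text{pay}] = 0.5 \times \$85{,}000 + 0.5 \times \$35{,}000 = \$60{,}000
```

The $10,000 above the riskless $50,000 salary is, in effect, the premium the owner pays for transferring outcome risk to the agent.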

Finally, there is a way to align interests around outcomes without going fully to performance-based pay. If the manager for cultural reasons sees their interests as aligned, this may mitigate the principal-agent problem. In our example, suppose the store is in a small, tight-knit community with a strong sense of civic pride and familial ties.

Even if the manager is being managed in terms of their behavior, their cultural ties to the community or clan may lead them to see their interests as well aligned with those of the principal. As noted in [93], “Clan control implies goal congruence between people and, therefore, the reduced need to monitor behavior or outcomes. Motivation issues disappear.” We have discussed this kind of motivation in Coordination and Process, especially in our discussion of control culture and insights drawn from the military.

COSO and Control
Internal control is a process, effected by an entity’s Board of Directors, management, and other personnel, designed to provide reasonable assurance regarding the achievement of objectives relating to operations, reporting, and compliance.
— Committee of Sponsoring Organizations of the Treadway Commission
Internal Control — Integrated Framework

An important discussion of governance is found in the statements of COSO on the general topic of "control".

Control is a term with broader and narrower meanings in the context of governance. In the area of risk management, “controls” are specific approaches to mitigating risk. However, “control” is also used by COSO in a more general sense to clarify governance.

Control activities, according to COSO, are:

the actions established through policies and procedures that help ensure that management’s directives to mitigate risks to the achievement of objectives are carried out. Control activities are performed at all levels of the entity, at various stages within business processes, and over the technology environment. They may be preventive or detective in nature and may encompass a range of manual and automated activities such as authorizations and approvals, verifications, reconciliations, and business performance reviews.

…​ Ongoing evaluations, built into business processes at different levels of the entity, provide timely information. Separate evaluations, conducted periodically, will vary in scope and frequency depending on assessment of risks, effectiveness of ongoing evaluations, and other management considerations. Findings are evaluated against criteria established by regulators, recognized standard-setting bodies or management, and the Board of Directors, and deficiencies are communicated to management and the Board of Directors as appropriate. [76]

Analyzing Governance

Governance Context
Figure 113. Governance in Context

Governance is also a practical concern for you because, at your scale, you have a complex set of environmental forces to cope with (see Governance in Context). You started with a focus on the customer, and the market they represented. Sooner or later, you encountered regulators and adversaries: competitors and cybercriminals.

These external parties intersect with your reality via various channels:

  • Your brand, which represents a sort of general promise to the market (see [272], p.16)

  • Contracts, which represent more specific promises to suppliers and customers

  • Laws, regulations, and standards, which can be seen as promises you must make and keep in order to function in civil society, or in order to obtain certain contracts

  • Threats, which may be of various kinds:

    • Legal

    • Operational

    • Intentional

    • Unintentional

    • Illegal

    • Environmental

We will return to the role of external forces in our discussion of assurance. For now, we will turn to how digital governance, within an overall system of digital delivery, reflects our emergence model.

Governance and the Emergence Model

In terms of our emergence model, one of the most important distinctions between a “team of teams” and an “enterprise” is the existence of formalized organizational governance.

Figure 114. Governance Emerges at the Enterprise Level

As illustrated in Governance Emerges at the Enterprise Level, formalized governance is represented by the establishment of a governing body, responsive to some stakeholders who seek to recognize value from the organization or “entity” — in this case, a digital delivery organization.

Corporate governance is a broad and deep topic, essential to the functioning of society and its organized participants. These include for-profit, non-profit, and even governmental organizations. Any legally organized entity of significant scope has governance needs.

One well-known structure for organizational governance is seen in the regulated, publicly owned company (such as those listed on stock exchanges). In this model, shareholders elect a governing body (usually termed the Board of Directors), and this group provides the essential direction for the enterprise as a whole.

However, organizational governance takes other forms. Public institutions of higher education may have a Board of Regents or Board of Governors, perhaps appointed by elected officials. Non-profits and incorporated private companies still require some form of governance as well. One of the less well-known but very influential forms of governance is the venture capital portfolio approach, very different from that of a public, mission-driven company. We will talk more about this in the digital governance section.

These are well-known topics in law, finance, and social organization, and there are many sources you can turn to if you have further interest. If you are taking any courses in Finance or Accounting, you will likely cover governance objectives and processes.

Figure 115. Governance and Management with Interface

Illustrated in Governance and Management with Interface[2] is a more detailed visual representation of the relationship between governance and management in a digital context. Reading from the top down:

Value recognition is the fundamental objective of the stakeholder. We discussed in Product Management the value objectives of effectiveness, efficiency, and risk (aka top line, bottom line, and risk). These are useful final targets for impact mapping, to demonstrate that lower-level, perhaps more “technical”, product capabilities do ultimately contribute to organizational outcomes.

The term “value recognition” as the stakeholder goal is chosen over “value creation” as “creation” requires the entire system. Stakeholders do not “create” without the assistance of management, delivery teams, and the individual.

Here, we see them from the stakeholder perspective of:

  • Benefits realization

  • Cost optimization

  • Risk optimization

(Adapted from [146 p. 23])

Both ISO/IEC 38500 [155] and COBIT [146, 152] specify that the fundamental activities of governance are:

  • Direct

  • Evaluate

  • Monitor

Evaluation is the analysis of current state, including current proposals and plans. Directing is the establishment of organizational intent as well as the authorization of resources. Monitoring is the ongoing attention to organizational status, as an input to evaluation and direction.

Direct, Evaluate, and Monitor may also be ordered as Evaluate, Direct, and Monitor (EDM). These are highly general concepts that in reality are performed simultaneously, not as any sort of strict sequence.

The governance/management interface is an essential component. The information flows across this interface are typically some form of the following:

From the governing side

  • Goals (e.g., product and go-to-market strategies)

  • Resource authorizations (e.g., organizational budget approvals)

  • Principles and policies (e.g., personnel and expense policies)

From the governed side

  • Plans and proposals (at a high level; e.g., budget requests)

  • Performance reports (e.g., sales figures)

  • Conformance/compliance indicators (e.g., via audit and assurance)

Notice also the circular arrow at the center of the governance/management interface. Governance is not a one-way street. Its principles may be stable, but approaches, tools, practices, processes, and so forth (what we will discuss below as "governance elements") are variable and require ongoing evolution.

We often hear of “bureaucratic” governance processes. But the problem may not be “governance” per se; it is more often a failure to correctly manage the governance/management interface. Of course, if the Board is micro-managing, demanding many different kinds of information and intervening in operations, then governance and the management response to it become much the same thing. In reality, however, burdensome organizational “governance” processes may be an overdone, bottom-up management response to perceived Board-level mandates.

Or they may be point-in-time requirements no longer needed. The policies of 1960 are unsuited to the realities of 2020. But if policies are always dictated top-down, they may not be promptly corrected or retired when no longer applicable. Hence, the scope and approach of governance in terms of its elements must always be a topic of ongoing, iterative negotiation between the governed and the governing.

Finally, the lowermost digital delivery chevron (aka value chain) represents most of what we have discussed in Contexts I, II, and III:

  • The individual working to create value using digital infrastructure and lifecycle pipelines

  • The team collaborating to discover and deliver valuable digital products

  • The team of teams coordinating to deliver higher-order value while balancing effectiveness with efficiency and consistency

Ultimately, governance is about managing results and risk. It is about objectives and outcomes. It is about “what”, not “how”. In terms of practical usage, it is advisable to limit the “governance” domain, including the use of the term, to a narrow scope of Board- or Director-level concerns, and the existence of certain capabilities, including:

  • Organizational policy management

  • External and internal assurance and audit

  • Risk management, including security aspects

  • Compliance

We turn now to a more in-depth discussion of how governance plays out across its boundary with management.

Evidence of Notability

Corporate governance is a central concern for organizations as they start to scale. Understanding its fundamentals, and especially distinguishing it from management, is critical. There is substantial evidence for this, including the very existence of ISACA as well as COSO and related organizations.

Limitations

Governance is an abstract concept that people in earlier career stages find difficult to understand. The tendency is either to lump it in with “management” in general, or to equate it simply with “security”.

Related Topics

Implementing Governance

To govern, organizations must rely on a variety of mechanisms, which are documented in this Competency Category.

Elements of a Governance Structure

Organizations leverage a variety of mechanisms to implement governance. The following draws on COBIT 2019 and other sources [147, 152]:

  • Mission, Principles, Policies, and Frameworks

  • Organizational Structures

  • Culture, Ethics, and Behavior

  • People, Skills, and Competencies

  • Processes and Procedures

  • Information

  • Services, Infrastructure, and Applications

COBIT terminology has changed over the years, from "control objectives" (v4 and before) to "enablers" (v5) to the current choice of "components". This document uses the term "elements", and retains frameworks as a key governance element (which COBIT 2019 dropped).

In Elements Across the Governance Interface, the varying lengths of the elements are deliberate. The further upward a bar extends, the more it is a direct concern for governance. All of the elements are discussed elsewhere in this document.

Figure 116. Elements Across the Governance Interface
Table 21. Governance element cross-references

  • Principles, Policies, and Procedures: principles and policies are covered in this Competency Category; frameworks are covered in (PMBOK), (CMMI, ITIL, COBIT, TOGAF), (DMBOK)

  • Processes: Competency Area 7

  • Organizational Structures: Competency Area 9, on organizational structure

  • Culture, Ethics, and Behavior: Competency Area 9, on culture

  • Information: Competency Area 11

  • Services, Infrastructure, and Applications: Context I

  • People, Skills, and Competencies: Competency Area 9, on workforce

Here, we are concerned with their aspect as presented to the governance interface. Some notes follow.

Mission, Principles, Policies, and Frameworks

Carefully drafted and standardized policies and procedures save the company countless hours of management time. The consistent use and interpretation of such policies, in an evenhanded and fair manner, reduces management’s concern about legal issues becoming legal problems.
— Michael Griffin
“How To Write a Policy Manual”

Principles are the most general statement of organizational vision and values. Policies will be discussed in detail in the next section; in general, they are binding organizational mandates or regulations, sometimes grounded in external laws. Frameworks were defined in Industry Frameworks. We discuss all of these further in this Competency Category.

Some companies may need to institute formal policies quite early. Even a startup may need written policies if it is concerned with regulations such as HIPAA. However, this may be done on an ad hoc basis, perhaps outsourced to a consultant. (A startup cannot afford a dedicated VP of Policy and Compliance.) This topic is covered in detail in this section because, at enterprise scale, ongoing policy management and compliance must be formalized. Recall that formalization is the basis of our emergence model.
Vision Hierarchy
Figure 117. Vision/Mission/Policy Hierarchy

Illustrated in Vision/Mission/Policy Hierarchy is one way to think about policy in the context of our overall governance objective of value recognition.

The organization’s vision and mission should be terse and high level, perhaps something that could fit on a business card. It should express the organization’s reason for being in straightforward terms. Mission is the reason for being; vision is a “picture” of the future, preferably inspirational.

The principles and codes should also be brief. (“Codes” can include codes of ethics or codes of conduct.) For example, Nordstrom’s code of conduct runs to about 8,000 words, perhaps ten pages.

Policies are more extensive. There are various kinds of policies:

In a non-IT example, a compliance policy might identify the Foreign Corrupt Practices Act and make it clear that bribery of foreign officials is unacceptable. Similarly, a human resources policy might spell out acceptable and unacceptable side jobs (e.g., someone in the banking industry might be forbidden from also being a mortgage broker on their own account).

Policies are often independently maintained documents, perhaps organized along lines similar to:

  • Employment and human resources policies

  • Whistleblower policy (non-retaliation)

  • Records retention

  • Privacy

  • Workplace guidelines

  • Travel and expense

  • Purchasing and vendor relationships

  • Use of enterprise resources

  • Information security

  • Conflicts of interest

  • Regulatory

(This is not a comprehensive list.)

Policies, even though more detailed than codes of ethics/conduct, still should be written fairly broadly. In many organizations, they must be approved by the Board of Directors. They should, therefore, be independent of technology specifics. An information security policy may state that the hardening guidelines must be followed, but the hardening guidelines (stipulating, for example, what services and open ports are allowable on Debian® Linux) are not policy. There may be various levels or classes of policy.

Finally, policies reference standards and processes and other governance elements as appropriate. This is the management level, where documentation is specific and actionable. Guidance here may include:

  • Standards

  • Baselines

  • Guidelines

  • Processes and procedures

These concepts may vary according to the organization and can become quite detailed; hardening guidelines, for example, achieve greater detail. A behavioral baseline might be “Guests are expected to sign in and be accompanied when on the data center floor.” We will discuss technical baselines further in the Competency Category on security, and also in our discussion of the technology product lifecycle in Architecture. See also Shon Harris' excellent CISSP Exam Guide [124] for much more detail on these topics.

The ideal end state is a policy that is completely traceable to various automation characteristics, such as approved “Infrastructure as Code” settings applied automatically by configuration management software (as discussed in “The DevOps Audit Toolkit” [85]; more on this to come). However, there will always be organizational concerns that cannot be fully automated in this manner.
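As a purely illustrative sketch (not drawn from the cited frameworks), the following Python fragment shows how one clause of a hypothetical hardening baseline, say “only ports 22 and 443 may listen on internet-facing hosts”, might be expressed as an automated check that traces back to a policy clause; the policy identifier, approved port list, and scan inputs are assumptions for the example.

```python
# Hypothetical policy-as-code check: one clause of a hardening baseline,
# "only approved ports may listen", expressed as an automated test.
# The policy reference, approved port list, and scan results are illustrative assumptions.

APPROVED_PORTS = {22, 443}          # from the hypothetical hardening baseline
POLICY_REFERENCE = "SEC-POL-007"    # the information security policy clause this traces to


def check_listening_ports(host: str, observed_ports: set[int]) -> list[str]:
    """Return findings for any port listening outside the approved baseline."""
    violations = sorted(observed_ports - APPROVED_PORTS)
    return [
        f"{host}: port {port} is listening but not in the baseline "
        f"(traces to {POLICY_REFERENCE})"
        for port in violations
    ]


if __name__ == "__main__":
    # In practice, observed_ports would come from a scanner or configuration
    # management tooling; here the values are hard-coded for illustration.
    for finding in check_listening_ports("web-01", {22, 443, 8080}):
        print(finding)   # a pipeline gate could fail the build on any finding
```

The point is the traceability: each automated finding carries a reference back to the policy clause it enforces.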

Policies (and their implementation as processes, standards, and the like) must be enforced. As Steve Schlarman notes “Policy without a corresponding compliance measurement and monitoring strategy will be looked at as unrealistic, ignored dogma.” [247]

Policies and their derivative guidance are developed, just like systems, via a lifecycle. They require some initial vision and an understanding of what the requirements are. Again, Schlarman: “policy must define the why, what, who, where, and how of the IT process” [247]. User stories have been used effectively to understand policy needs.

Finally, an important point to bear in mind:

Company policies can breed and multiply to a point where they can hinder innovation and risk-taking. Things can get out of hand as people generate policies to respond to one-time infractions or out of the ordinary situations [116 p. 17].

It is advisable to institute sunset dates or some other mechanism that forces their periodic review, with the understanding that any such approach generates demand on the organization that must be funded. We will discuss this more in the Competency Category on digital governance.

Standards, Frameworks, Methods, and the Innovation Cycle

We used the term “standards” above without fully defining it. We have discussed a variety of industry influences throughout this document: PMBOK, ITIL, COBIT, Scrum, Kanban, ISO/IEC 38500, and so on. We need to clarify their roles and positioning further. All of these can be considered various forms of “guidance” and as such may serve as governance elements. However, their origins, stakeholders, format, content, and usage vary greatly.

First, the term “standard” in particular has multiple meanings. A “standard” in the policy sense may be a set of compulsory rules. Also, “standard” or “baseline” may refer to some intended or documented state that the organization uses as a reference point. An example might be “we run Debian Linux 16.10 as a standard unless there is a compelling business reason to do otherwise”.

This last usage shades into a third meaning of uniform, de jure standards such as are produced by the IEEE, IETF, and ISO/IEC.

  • ISO/IEC: International Organization for Standardization/International Electrotechnical Commission

  • IETF: Internet Engineering Task Force

  • IEEE: Institute of Electrical and Electronics Engineers

The International Organization for Standardization (ISO) occupies a central place in this ecosystem. It possesses “general consultative status” with the United Nations, and has over 250 technical committees that develop the actual standards.

The IEEE standardizes such matters as wireless networking (e.g., Wi-Fi). The IETF standardizes lower-level Internet protocols such as TCP/IP and HTTP. The W3C (World Wide Web Consortium) standardizes higher-level web technologies such as HTML. Sometimes standards are first developed by a group such as the IEEE and then given further authority through publication by ISO/IEC. The ISO/IEC in particular, in addition to its technical standards, also develops higher-order management/“best practice” standards. One well-known example of such an ISO standard is the ISO 9000 series on quality management.

Some of these standards may have a great effect on the digital organization. We will discuss this further in the Competency Category on compliance.

Frameworks were discussed in Industry Frameworks. Frameworks have two major meanings. First, software frameworks are created to make software development easier. Examples include Struts, AngularJS, and many others. This is a highly volatile area of technology, with new frameworks appearing every year and older ones gradually losing favor.

In general, we are not concerned with these kinds of specific frameworks in this document, except governing them as part of the technology product lifecycle. We are concerned with “process” frameworks such as ITIL, PMBOK, COBIT, CMMI, and the TOGAF framework. These frameworks are not “standards” in and of themselves. However, they often have corresponding ISO standards:

Table 22. Frameworks and Corresponding Standards

  • ITIL: ISO/IEC 20000

  • COBIT: ISO/IEC 38500

  • PMBOK: ISO/IEC 21500

  • CMMI: ISO/IEC 15504

  • TOGAF: ISO/IEC 42010

Frameworks tend to be lengthy and verbose. The ISO/IEC standards are brief by comparison, perhaps on average 10% of the corresponding framework. Methods (aka methodologies) in general are more action-oriented and prescriptive. Scrum and XP are methods. It is at least arguable that PMBOK is a method as well as a framework.

There is little industry consensus on some of these definitional issues, and the student is advised not to be overly concerned about such abstract debates. If you need to comply with something to win a contract, it doesn’t matter whether it’s a “standard”, “framework”, “guidance”, “method”, or whatever.

Finally, there are terms that indicate technology cycles, movements, communities of interest, or cultural trends: Agile and DevOps being two of the most current and notable. These are neither frameworks, standards, nor methods. However, commercial interests often attempt to build frameworks and methods representing these organic trends. Examples include SAFe, Disciplined Agile Delivery, the DevOps Institute, the DevOps Agile Skills Association, and many others.

Figure 118. Innovation Cycle

As illustrated in Innovation Cycle, innovations produce value, but innovation presents change management challenges, such as cost and complexity. The natural response is to standardize for efficiency, and standardization taken to its end state results in commodification, where costs are optimized as far as possible and the remaining concern is managing the risk of the commodity (as either consumer or producer). While efficient, commoditized environments offer little competitive value, and so the innovation cycle starts again.

Note that the innovation cycle corresponds to the elements of value recognition:

  • Innovation corresponds to Benefits Realization

  • Standardization corresponds to Cost Optimization

  • Commoditization corresponds to Risk Optimization

Organizational Structures

We discussed basic organizational structure in Organization and Culture. However, governance also may make use of some of the scaling approaches discussed in IT Human Resources Management. Cross-organization coordination techniques (similar to those discussed in Boundary Spanning) are frequently used in governance (e.g., cross-organizational coordinating committees, such as an enterprise security council).

Culture, Ethics, and Behavior

Culture, ethics, and behavior as a governance element can drive revenue as well as risk and cost. See also Culture and Hiring.

People, Skills, and Competencies

People and their skills and competencies (covered in IT Human Resources Management) are the governance element upon which all the others rest. “People are our #1 asset” may seem to be a cliché, but it is ultimately true. Formal approaches to understanding and managing this base of skills are therefore needed. A basic “human resources” capability is a start, but sophisticated and ambitious organizations institute formal organizational learning capabilities to ensure that talent remains a critical focus.

Processes

Process is defined in Definitions. We will discuss processes as controls in the upcoming Competency Category on risk management. A control is a role that a governance element may play. Processes are the primary form of governance element used in this sense.

Information

Information is a general term; in the sense of a governance element, it is based on data in its various forms, with overlays of concepts (such as syntax and semantics) that transform raw “data” into a resource that is useful and valuable for given purposes. From a governance perspective, information carries governance direction to the governed system, and the fed back monitoring is also transmitted as information. Information resource management and related topics such as data governance and data quality are covered in Information Management; it is helpful to understand governance at an overall level before going into these more specific domains.

Services, Infrastructure, and Applications

Services, infrastructure, and applications are, of course, the critical foundation of digital value. These fundamental topics were covered in Context I. In the sense of governance elements, they have a recursive or self-reflexive quality. Digital technology automates business objectives; at scale, the digital pipeline becomes a non-trivial business concern in and of itself, requiring considerable automation [31], [278]. Applications that serve as digital governance elements might include the following (a brief illustrative sketch follows the list):

  • Source control

  • Build management

  • Package management

  • Deployment and configuration management

  • Monitoring

  • Portfolio management
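As a brief illustrative sketch (the event fields, log location, and identifiers are assumptions, not any particular tool’s format), the following Python fragment shows the recursive quality in miniature: a deployment step in the pipeline appends an audit-trail record, so that the pipeline itself produces governance evidence.

```python
# Hypothetical example: a deployment step appending an audit-trail record, so the
# delivery pipeline itself produces evidence usable for later assurance work.
# The event fields, file path, and change identifier are illustrative assumptions.
import json
import time


def record_deployment(package: str, environment: str, change_id: str,
                      actor: str, succeeded: bool) -> None:
    """Append one evidence record per deployment to an append-only log."""
    event = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "change_id": change_id,          # ties the deployment to its approved change
        "package": package,
        "environment": environment,
        "actor": actor,
        "result": "success" if succeeded else "failure",
    }
    with open("deployment_audit_log.jsonl", "a") as log:
        log.write(json.dumps(event) + "\n")


if __name__ == "__main__":
    # In a real pipeline these values would come from the deployment tooling itself.
    record_deployment("webstore-2.3.1", "production", "CHG-1042", "deploy-bot", True)
```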

Evidence of Notability

Governance requires mechanisms in order to function.

Limitations

Elaborate structures of governance elements (especially processes) can slow down digital delivery and, ironically, contribute greater risk than they reduce.

Related Topics

Risk and Compliance Management

Description

Risk and compliance are grouped together in this Competency Category as they are often managed by unified functional areas or capabilities (e.g., "Corporate Risk and Compliance"). Note, however, that they are two distinct concerns, as outlined below.

Risk Management Fundamentals

Risk is defined as the possibility that an event will occur and adversely affect the achievement of objectives.
— Committee of Sponsoring Organizations of the Treadway Commission
Internal Control — Integrated Framework

Risk is a fundamental concern of governance. Management (as we have defined it in this Competency Category) may focus on effectiveness and efficiency well enough, but too often disregards risk.

As we noted above, the shop manager may have incentives to maximize income but, usually, does not stand to lose their life savings. The owner, however, does. Fire, theft, disaster — without risk management, the owner does not sleep well.

For this reason, risk management is a large element of governance, as indicated by the popular GRC acronym: Governance, Risk Management, and Compliance.

Defining “Risk”

The definition of “risk” is surprisingly controversial. The ISO 31000 standard [156] and the Project Management Institute® PMBOK [223] both define risk as including positive outcomes (benefits). This definition has been strongly criticized by (among others) Douglas Hubbard in The Failure of Risk Management [133]. Hubbard points out that, traditionally, risk has meant the chance and consequences of loss.

As this is an overview text, we will use the more pragmatic, historical definition. Practically speaking, operational risk management as a function focuses on loss. The possibility (“risk”) of benefits is eagerly sought by the organization as a whole and does not need “management” by a dedicated function.

“Loss”, however, can also equate to “failure to achieve anticipated gains”. This form of risk applies (for example) to product and project investments.

Risk management can be seen as both a function and a process (see Risk Management Context). As a function, it may be managed by a dedicated organization (perhaps called Enterprise Risk Management or Operational Risk Management). As a process, it conducts the following activities:

  • Identifying risks

  • Assessing and prioritizing them

  • Coordinating effective responses to risks

  • Ongoing monitoring and reporting of risk management

Risk impacts some asset. Examples in the digital and IT context would include:

  • Operational IT systems

  • Hardware (e.g., computers) and facilities (e.g., data centers)

  • Information (customer or patient records)

It is commonly said that organizations have an “appetite” for risk [149 p. 79], in terms of the amount of risk the organization is willing to accept. This is a strategic decision, usually reserved for organizational governance.

Figure 119. Risk Management Context

Risk management typically has strong relationships with the following organizational capabilities:

  • Enterprise governance (e.g., Board-level committees)

  • Security

  • Compliance

  • Audit

  • Policy management

For example, security requires risk assessment as an input, so that security resources focus on the correct priorities. Risk additionally may interact with:

  • Project management

  • Asset management

  • Processes such as change management

and other digital activities. More detail on core risk management activities follows, largely adapted from the COBIT for Risk publication [149].

Risk Identification

There are a wide variety of potential risks, and many accounts and anecdotes are constantly circulating. It is critical that risk identification begins with a firm understanding of the organization’s objectives and context.

Risk identification can occur both in a “top-down” and “bottom-up” manner. Industry guidance can assist the operational risk management function in identifying typical risks. For example, the COBIT for Risk publication includes a useful eight-page “Generic Risk Scenarios” section [149 pp. 67-74] identifying risks such as:

  • “Wrong programs are selected for implementation and are misaligned with corporate strategy and priorities”

  • “There is an earthquake”

  • “Sensitive data is lost/disclosed through logical attacks”

These are only three of dozens of scenarios. Risks, of course, extend over a wide variety of areas:

  • Investment

  • Sourcing

  • Operations

  • Availability

  • Continuity

  • Security

and so forth. The same guidance also strongly cautions against over-reliance on these generic scenarios.

Risk Assessment

Risk management has a variety of concepts and techniques, both qualitative and quantitative. Risk is often assumed to be the product of probability and impact. For example, if the chance of a fire in a facility is 5% over a given year, and the damage from the fire is estimated at $100,000, the annual risk is $5,000. An enterprise risk management function may attempt to quantify all such risks into an overall portfolio.

Where quantitative approaches are perceived to be difficult, risk may be assessed using simple ordinal scales (e.g., 1 to 5, where 1 is low risk and 5 is high risk). COBIT for Risk expresses concern regarding “the use of ordinal scales for expressing risk in different categories, and the mathematical difficulties or dangers of using these numbers to do any sort of calculation” [149 p. 75]. Such approaches are criticized by Hubbard in The Failure of Risk Management as misleading and potentially more harmful than not managing risk at all [133].

Hubbard instead suggests that quantitative techniques such as Monte Carlo analysis are rarely infeasible, and recommends their application instead of subjective scales.
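As a minimal sketch of the quantitative approach (the 5% event probability and the lognormal loss distribution below are illustrative assumptions, not figures from Hubbard or COBIT), a Monte Carlo estimate of annualized loss for the fire scenario above might look like this in Python:

```python
# Minimal Monte Carlo sketch of annualized loss for a single risk scenario.
# The event probability and loss distribution parameters are illustrative assumptions.
import random

TRIALS = 100_000
EVENT_PROBABILITY = 0.05          # assumed 5% chance of a fire in a given year
LOSS_MU, LOSS_SIGMA = 11.5, 0.4   # assumed lognormal parameters (median loss near $100,000)

losses = []
for _ in range(TRIALS):
    if random.random() < EVENT_PROBABILITY:
        losses.append(random.lognormvariate(LOSS_MU, LOSS_SIGMA))
    else:
        losses.append(0.0)

losses.sort()
expected_annual_loss = sum(losses) / TRIALS
p95_loss = losses[int(0.95 * TRIALS)]   # a tail figure, useful for risk appetite discussions

print(f"Expected annual loss: ${expected_annual_loss:,.0f}")
print(f"95th percentile annual loss: ${p95_loss:,.0f}")
```

Unlike a single probability-times-impact figure, the simulation yields a distribution, so tail outcomes can be compared against the organization’s stated risk appetite.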

The enterprise can also consider evaluating scenarios that have a chance of occurring simultaneously. This is frequently referred to as "stress" testing.

Risk Response
He who fights and runs away lives to fight another day.
— Menander
342 BC — 291 BC

Risk response includes several approaches:

  • Avoidance

  • Acceptance

  • Transference

  • Mitigation

Avoidance means ending the activities or conditions causing the risk; e.g., not engaging in a given initiative or moving operations away from risk factors.

Acceptance means no action is taken. Typically, such “acceptance” must reside with an executive.

Transference means that some sharing arrangement, usually involving financial consideration, is established. Common transfer mechanisms include outsourcing and insurance. (Recall our discussion of Agile approaches to contract management and risk sharing.)

Mitigation means that some compensating mechanism (one or more “controls”) is established. This topic is covered in the next section and comprises the remainder of the material on risk management.

(The above discussion was largely derived from the ISACA COBIT 5 for Risk publication [150].)

Controls
The term "control objective" is no longer a mainstream term used in COBIT 5, and the word "control" is used only rarely. Instead, COBIT 5 uses the concepts of process practices and process activities.
— ISACA
COBIT 5 for Assurance

The term “control” is problematic.

It has distasteful connotations to those who casually encounter it, evoking images of “command and control” management, or “controlling” personalities. COBIT, which once stood for Control Objectives for Information and Related Technology, now deprecates the term control (see the above quote). Yet the term retains a prominent role in many discussions of enterprise governance and risk management, as we saw at the start of this Competency Category in the discussion of COSO’s general concept of control. Also (as discussed in our coverage of Scrum’s origins) it is a technical term of art in systems engineering. As such, it represents principles essential to understanding large-scale digital organizations.

In this section, we are concerned with controls in a narrower sense, as risk mitigators. Governance elements such as policies, procedures, organizational structures, and the rest are used and intended to ensure that:

  • Investments achieve their intended outcomes

  • Resources are used responsibly, and protected from fraud, theft, abuse, waste, and mismanagement

  • Laws and regulations are adhered to

  • Timely and reliable information is employed for decision-making

But what are examples of “controls"? Take a risk, such as the risk of a service (e.g., e-commerce website) outage resulting in loss of critical revenues. There are a number of ways we might attempt to mitigate this risk:

  • Configuration management (a preventative control)

  • Effective monitoring of system alerts (a detective control)

  • Documented operational responses to detected issues (a corrective control)

  • Clear recovery protocols that are practiced and well understood (a recovery control)

  • System redundancy of key components where appropriate (a compensating control)

and so forth. Another kind of control, appropriate to other risks, is a deterrent control (e.g., an armed guard at a bank).

Other types of frequently seen controls include:

  • Separation of duties

  • Audit trails

  • Documentation

  • Standards and guidelines

A control type such as "separation of duties” is very general and might be specified by activity type; for example:

  • Purchasing

  • System development and release

  • Sales revenue recognition

Each of these would require distinct approaches to separation of duties. Some of this may be explicitly defined; if there is no policy or control specific to a given activity, an auditor may identify this as a deficiency.
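As an illustrative sketch only (the change-record structure and field names are assumptions, not drawn from any cited framework), a separation-of-duties control for system development and release might be automated as a pipeline check that no one approves or deploys their own change:

```python
# Hypothetical separation-of-duties check for the system development and release activity.
# The change-record structure and field names are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class ChangeRecord:
    change_id: str
    developed_by: str
    approved_by: str
    deployed_by: str


def sod_violations(record: ChangeRecord) -> list[str]:
    """Flag cases where one person holds conflicting duties for the same change."""
    findings = []
    if record.approved_by == record.developed_by:
        findings.append(f"{record.change_id}: developer approved their own change")
    if record.deployed_by == record.approved_by:
        findings.append(f"{record.change_id}: approver also performed the deployment")
    return findings


if __name__ == "__main__":
    record = ChangeRecord("CHG-1042", developed_by="alice",
                          approved_by="bob", deployed_by="bob")
    for finding in sod_violations(record):
        print(finding)   # an auditor, or a pipeline gate, would treat this as a deficiency
```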

Policies and processes in their aspect as controls are often what auditors test. In the case of the website above, an auditor might test the configuration management approach, the operational processes, inspect the system redundancy, and so forth. And risk management would maintain an ongoing interest in the system in between audits.

As with most topics in this document, risk management (in and of itself, as well as applied to IT and digital) is an extensive and complex domain, and this discussion was necessarily brief.

Business Continuity

Business continuity is an applied domain of digital and IT risk, like security. Continuity is concerned with large-scale disruptions to organizational operations, such as:

  • Floods

  • Earthquakes

  • Tornadoes

  • Terrorism

  • Hurricanes

  • Industrial catastrophes (e.g., large-scale chemical spills)

A distinction is commonly made between:

  • Business continuity planning

  • Disaster recovery

Disaster recovery is more tactical, covering the specific actions taken during a disaster to mitigate damage and restore operations, often with an IT-specific emphasis.

Business continuity planning takes a longer-term view of matters such as the availability of replacement space and computing capacity.

There are a variety of standards covering business continuity planning, including:

  • NIST Special Publication 800-34

  • ISO/IEC 27031:2011

  • ISO 22301

In general, continuity planning starts with understanding the business impact of various disaster scenarios and developing plans to counter them. Traditional guidance suggests that this should be achieved in a centralized fashion; however, large, centralized efforts of this nature tend to struggle for funding.

While automation alone cannot solve problems such as “where do we put the people if our main call center is destroyed?”, it can help considerably in recovering from disasters. If a company has been diligent in applying Infrastructure as Code techniques and loses its data center, it can theoretically re-establish its system configurations readily; doing so otherwise can be a very challenging process, especially under time constraints. (Data still needs to have been backed up to multiple locations.)
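A minimal sketch of the Infrastructure as Code idea (the resource model and the provision stub below are simplified assumptions, not a real tool’s API): because the desired configuration is declared as version-controlled data rather than held in people’s heads, the same declaration can be re-applied against replacement capacity after a disaster.

```python
# Simplified illustration of Infrastructure as Code for recovery purposes.
# The resource model and provision() stub are assumptions, not any real tool's API.
DESIRED_STATE = [
    {"name": "web-01", "role": "web", "cpu": 2, "memory_gb": 4},
    {"name": "web-02", "role": "web", "cpu": 2, "memory_gb": 4},
    {"name": "db-01", "role": "database", "cpu": 4, "memory_gb": 16},
]


def provision(spec: dict) -> None:
    """Stand-in for calls to a cloud or configuration management API."""
    print(f"provisioning {spec['name']} ({spec['role']})")


def apply_state(desired: list[dict], existing: set[str]) -> None:
    """Create whatever the declared environment needs that does not yet exist."""
    for spec in desired:
        if spec["name"] not in existing:
            provision(spec)


if __name__ == "__main__":
    # After losing a data center, re-apply the same declaration to empty capacity.
    apply_state(DESIRED_STATE, existing=set())
```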

Compliance

Compliance is a very general term meaning conformity or adherence to:

  • Laws

  • Regulations

  • Policies

  • Contracts

  • Standards

and the like. Corporate compliance functions may first be attentive to legal and regulatory compliance, but the other forms of compliance are matters of concern as well.

A corporate compliance office may be responsible for the maintenance of organizational policies and related training and education, perhaps in partnership with the human resources department. They also may monitor and report on the state of organizational compliance. Compliance offices may also be responsible for codes of ethics. Finally, they may manage channels for anonymous reporting of ethics and policy violations by whistleblowers (individuals who become aware of and wish to report violations while receiving guarantees of protection from retaliation).

Compliance uses techniques similar to risk management and, in fact, non-compliance can be managed as a form of risk, and prioritized and handled much the same way. However, compliance is an information problem as well as a risk problem. There is an ongoing stream of regulations to track, which keeps compliance professionals very busy. In the US alone, these include:

  • HIPAA

  • SOX

  • FERPA

  • PCI DSS

  • GLBA PII

and in the European Union, the GDPR (General Data Protection Regulation) and various data sovereignty regulations (e.g., German human resources data must remain in Germany).

Some of these regulations specifically call for policy management, and therefore companies that are subject to them may need to institute formal governance earlier than other companies, in terms of the emergence model. Many of them provide penalties for the mismanagement of data, which we will discuss further in this Competency Area and in Competency Area 11. Compliance also includes compliance with the courts (e.g., civil and criminal actions). This will be discussed in the Information Management section on cyberlaw.

Evidence of Notability

Risk is a fundamental business concern, along with sales and net profits. The entire insurance industry is founded on the premise of risk management. Digital systems are subject to certain risks, and in turn present risks to the organizations that depend on them.

Limitations

Risk is difficult, though not impossible, to quantify. When risk is not correctly quantified, it can lead to dysfunctional organizational behavior; e.g., spending disproportionate resources to mitigate a given risk. The irony (poorly understood by many risk professionals) is that overly elaborate risk mitigation approaches can themselves be a source of risk, in their imposition of delays and other burdens on the delivery of value.

Related Topics

Assurance and Audit

Assurance

Trust, but verify.
— Russian proverb

Assurance is a broad term. In this document, it is associated with governance. It represents a form of additional confirmation that management is performing its tasks adequately. Go back to the example that started this Competency Area, of the shop owner hiring a manager. Suppose that this relationship has continued for some time, and while things seem to be working well, the owner has doubts about the arrangement. Over time, things have gone well enough that the owner does not worry about the shop being opened on time, having sufficient stock, or paying suppliers. But there are any number of doubts the owner might retain:

  • Is money being accounted for honestly and accurately?

  • Is the shop clean? Is it following local regulations? For example, fire, health and safety codes?

  • If the manager selects a new supplier, are they trustworthy? Or is the shop at risk of selling counterfeit or tainted merchandise?

  • Are the shop’s prices competitive? Is it still well regarded in the community? Or has its reputation declined with the new manager?

  • Is the shop protected from theft and disaster?

Figure 120. Assurance in Context

These kinds of concerns remain with the owner, by and large, even with a reliable and trustworthy manager. If not handled correctly, the owner’s entire investment is at risk. The manager may only have a salary (and perhaps a profit share) to worry about, but if the shop is closed due to violations, or lawsuit, or lost to a fire, the owner’s entire life investment may be lost. These concerns give rise to the general concept of assurance, which applies to digital business just as it does to small retail shops. The following diagram, derived from previous illustrations, shows how this document views assurance: as a set of practices overlaid across governance elements, and in particular concerned with external forces (see Assurance in Context).

In terms of the governance-management interface, assurance is fundamentally distinct from the information provided by management and must travel through distinct communication channels. This is why auditors (for example) forward their reports directly to the Audit Committee and do not route them through the executives who have been audited.

Technologists, especially those with a background in networking, may have heard of the concept of “out-of-band control”. With regard to out-of-band management or control of IT resources, the channel over which management commands travel is distinct from the channel over which the system provides its services. This channel separation is done to increase security, confidence, and reliability, and is analogous to assurance.

Figure 121. Assurance is an Objective, External Mechanism

As ISACA stipulates:

The information systems audit and assurance function shall be independent of the area or activity being reviewed to permit objective completion of the audit and assurance engagement. [151 p. 9]

Assurance can be seen as an external, additional mechanism of control and feedback. This independent, out-of-band aspect is essential to the concept of assurance (see Assurance is an Objective, External Mechanism).

Three-Party Foundation
Assurance means that pursuant to an accountability relationship between two or more parties, an IT audit and assurance professional may be engaged to issue a written communication expressing a conclusion about the subject matters to the accountable party.
— COBIT 5 for Assurance

There are broader and narrower definitions of assurance. But all reflect some kind of three-party arrangement (see Assurance is Based on a Three-Party Model, which reflects concepts from [148, 142]).

Figure 122. Assurance is Based on a Three-Party Model

The above diagram is one common scenario:

  • The stakeholder (e.g., the Audit Committee of the Board of Directors) engages an assurance professional (e.g., an audit firm); the scope and approach of the engagement are determined by the engaging party, although the accountable party in practice often has input as well

  • The accountable party, at the engaging party’s direction, responds to the assurance professional’s inquiries on the audit topic

  • The assurance professional provides the assessment back to the engaging party, and/or other users of the report (potentially including the accountable party)

This is a simplified view of what can be a more complex process and set of relationships. The ISAE3000 standard states that there must be at least three parties to any assurance engagement:

  • The responsible (accountable) party

  • The practitioner

  • The intended users (of the assurance report)

But there may be additional parties:

  • The engaging party

  • The measuring/evaluating party (sometimes not the practitioner, who may be called on to render an opinion on someone else’s measurement)

ISAE3000 goes on to stipulate a complex set of business rules for the allowable relationships between these parties [142 pp. 95-96]. Perhaps the most important rule is that the practitioner cannot be the same as either the responsible party or the intended users. There must be some level of professional objectivity.
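
To make the independence rule concrete, here is a minimal sketch (in Python, with party names and the data structure invented for illustration; nothing below is prescribed by ISAE3000) of checking an engagement against the constraint that the practitioner may not also be the responsible party or one of the intended users:

    # Minimal sketch of the independence rule described above.
    # Party names and the data structure are illustrative assumptions.
    from dataclasses import dataclass, field

    @dataclass
    class AssuranceEngagement:
        responsible_party: str               # the accountable party under review
        practitioner: str                    # the assurance professional
        intended_users: set = field(default_factory=set)
        engaging_party: str = ""             # optional additional party

        def violates_independence(self) -> bool:
            """True if the practitioner is also the responsible party
            or one of the intended users of the report."""
            return (self.practitioner == self.responsible_party
                    or self.practitioner in self.intended_users)

    engagement = AssuranceEngagement(
        responsible_party="CIO organization",
        practitioner="External audit firm",
        intended_users={"Audit Committee"},
        engaging_party="Audit Committee")
    assert not engagement.violates_independence()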

What’s the difference between assurance and simple consulting? There are two major factors:

  • Consulting can be simply a two-party relationship — a manager hires someone for advice

  • Consultants do not necessarily apply strong assessment criteria

    • Indeed, with complex problems, there may not be any such criteria. Assurance, in general, presupposes some existing standard of practice, or at least some benchmark external to the organization being assessed.

Finally, the concept of assurance criteria is key. Some assurance is executed against the responsible party’s own criteria. In this form of assurance, the primary questions are “are you documenting what you do, and doing what you document?”. That is, for example, do you have formal process management documentation? And are you following it?

Other forms of assurance use external criteria. A good example is the Uptime Institute's data center tier certification criteria, discussed below. If criteria are weak or non-existent, the assurance engagement may be more correctly termed an advisory effort. Assurance requires clarity on this topic.

Types of Assurance
Exercise caution in your business affairs; for the world is full of trickery.
— Max Ehrmann
“Desiderata”

The general topic of “assurance” implies a spectrum of activities. In the strictest definitions, assurance is provided by licensed professionals under highly formalized arrangements. However, while all audit is assurance, not all assurance is audit. As noted in COBIT for Assurance, “assurance also covers evaluation activities not governed by internal and/or external audit standards” [150 p. 15].

This is a blurry boundary in practice, as an assurance engagement may be undertaken by auditors, and then might be casually called an “audit” by the parties involved. And there is a spectrum of organizational activities that seem at least to be related to formal assurance:

  • Brand assurance

  • Quality assurance

  • Vendor assurance

  • Capability assessments

  • Attestation services

  • Certification services

  • Compliance

  • Risk management

  • Benchmarking

  • Other forms of “due diligence”

Some of these activities may be managed primarily internally, but even in the case of internally managed activities, there is usually some sense of governance, some desire for objectivity.

From a purist perspective, internally directed assurance is a contradiction in terms: in the three-party model above, the accountable party and the practitioner belong to the same organization, creating a conflict of interest.

However, it may well be less expensive for an organization to fund and sustain internal assurance capabilities and obtain much the same benefit as from external parties, provided that sufficient organizational safeguards are instituted. Internal auditors typically report directly to the Board-level Audit Committee and, generally, are not seen as having a conflict of interest.

In another example, an internal compliance function might report to the corporate general counsel (chief lawyer), and not to any executive whose performance is judged based on their organization’s compliance — this would be a conflict of interest. However, because the internal compliance function is ultimately under the CEO, their concerns can be overruled.

The various ways that internal and external assurance arrangements can work, and can go wrong, have a long history. If you are interested in the topic, review the histories of Enron, WorldCom, the 2008 mortgage crisis, and other such failures.

Assurance and Risk Management

Risk management (discussed in the previous section) may be seen as part of a broader assurance ecosystem. For evidence of this, consider that the Institute of Internal Auditors offers a certificate in Risk Management Assurance; The Open Group also operates the Open FAIR™ Certification Program. Assurance in practice may seem to be biased towards risk management, but (as with governance in general) assurance as a whole relates to all aspects of IT and digital governance, including effectiveness and efficiency.

Audit practices may be informed of known risks and particularly concerned with their mitigation, but risk management remains a distinct practice. Audits may have scope beyond risks, and audits are only one tool used by risk management (see Assurance and Risk Management).

Figure 123. Assurance and Risk Management

In short, and as shown in the above diagram, assurance plays a role across value recognition, while risk management specifically targets the value recognition objective of risk optimization.

Non-Audit Assurance Examples
Businesses must find a level of trust between each other …​ third-party reports provide that confidence. Those issuing the reports stake their name and liability with each issuance.
— James DeLuccia
“Successfully Establishing and Representing DevOps in an Audit”

Before we turn to a more detailed discussion of the audit, we will discuss some specifically non-audit examples of assurance seen in IT and digital management.

Example 1: Due Diligence on a Cloud Provider

Your company is considering a major move to cloud infrastructure for its systems. The agility value proposition — the ability to minimize cost of delay — is compelling, and there may be some cost structure advantages as well.

But you are aware of some cloud failures:

  • In 2013, UK cloud provider 2e2 went bankrupt, and customers were given “24 to 48 hours to get … data and systems out and into a new environment” [90]; subsequently, the provider demanded nearly 1 million pounds (roughly $1.5 million) from its customers in order for them to retain uninterrupted access to services (i.e., their data) [292]

  • Also in 2013, cloud storage provider Nirvanix went bankrupt, and its customers also had a limited time to remove their data; MegaCloud went out of business with no warning two months later, and all customers lost all data [51], [52]

  • In mid-2014, online source code repository Code Spaces (an early GitHub competitor) was taken over by hackers and destroyed; all data was lost [291], [188]

The question is, how do you manage the risks of trusting your data and organizational operations to a cloud provider? This is not a new question, as computing has been outsourced to specialist firms for many years. You want to be sure that their operations meet certain standards such as:

  • Financial standards

  • Operational standards

  • Security standards

Data center evaluations of cloud providers are a form of assurance. Two well-known approaches are:

  • The Uptime Institute’s Tier Certification

  • The American Institute of Certified Public Accountants' (AICPA) SOC 3 “Trust Services Report” certifying “Service Organizations” (based in turn on the SSAE-18 standard)

The Uptime Institute provides the well-known “Tier” concept for certifying data centers, from Tier I to Tier IV. In their words: “certification provides assurances that there are not shortfalls or weak links anywhere in the data center infrastructure” [290]. The Tiers progress as follows [289]:

  • Tier I: Basic Capacity

  • Tier II: Redundant Capacity Components

  • Tier III: Concurrently Maintainable

  • Tier IV: Fault Tolerance

Uptime Institute certification is a generic form of assurance in terms of the three-party model; the data center operator must work with the Uptime Institute, which provides an independent opinion based on its criteria as to the data center’s tier (and therefore effectiveness).

The SOC 3 report is considered an “assurance” standard as well. However, as mentioned above, this is the kind of “assurance” done in general by licensed auditors, and which might casually be called an “audit” by the participants. A qualified professional, again in the three-party model, examines the data center in terms of the SSAE 18 reporting standard.

Your internal risk management organization might look to both Uptime Institute and SOC 3 certification as indicators that your cloud provider risk is mitigated. (More on this in Competency Category on Risk Management.)

Example 2: Internal Process Assessment

You may also have concerns about your internal operations. Perhaps your process for selecting technology vendors is unsatisfactory in general; it takes too long and yet vendors with critical weaknesses have been selected. More generally, the actual practices of various areas in your organization may be assessed by external consultants using the related guidance:

  • Enterprise Architecture with the TOGAF Architecture Development Method (ADM)

  • Project Management with PMBOK

  • IT processes such as incident management, change management, and release management with ITIL or CMMI-SVC

These assessments may be performed using a maturity scale; e.g., CMM-derived. The CMM-influenced ISO/IEC 15504 standard may be used as a general process assessment framework. (Remember that we have discussed the problems with the fundamental CMM assumptions on which such assessments are based.)

According to [28]: “In our own experience, we have seen that the maturity models have their limitations.” They warn that maturity assessments of Enterprise Architecture at least are prone to being:

  • Subjective

  • Academic

  • Easily manipulated

  • Bureaucratic

  • Superfluous

  • Misleading

Those issues may well apply to all forms of maturity assessments. Let the buyer beware. At least, the concept of maturity should be very carefully defined in a manner relevant to the organization being assessed.

Example 3: Competitive Benchmarking

Finally, you may wonder, “how does my digital operation compare to other companies?” Now, it is difficult to go to a competitor and ask this. It is also not especially practical to go and find some non-competing company in a different industry you don’t understand well. An entire industry has emerged to assist with this question.

We talked about the role of industry analysts in Investment and Portfolio. Benchmarking firms play a similar role and, in fact, some analyst firms provide benchmarking services.

There are a variety of ways benchmarking is conducted, but it is similar to assurance in that it often follows the three-party model. Some stakeholder directs an accountable party to be benchmarked within some defined scope. For example, the number of staff required to manage a given quantity of servers (aka the admin:server ratio) has been a popular benchmark. (Note that with cloud, virtualization, and containers, the usefulness of this metric is increasingly in question.)

An independent authority is retained. The benchmarker collects, or has collected, information on similar operations; for example, they may have collected data from 50 organizations of similar size on admin:server ratios. This data is aggregated and/or anonymized so that competitive concerns are reduced. Wells Fargo will not be told: “JP Morgan Chase has an overall ratio of 1:300”; they will be told: “The average for financial services is 1:250”.
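
A minimal sketch of that aggregation step follows; the industries, firms, and ratios are invented, and the data structure is an assumption for illustration only. The point is that the benchmarker discloses only the peer-group aggregate, never an individual participant’s figure:

    # Hypothetical benchmark data: admin:server ratios expressed as
    # servers-per-administrator for participating firms (names invented).
    ratios_by_industry = {
        "financial services": {"Bank A": 300, "Bank B": 220, "Bank C": 230},
        "retail": {"Retailer X": 180, "Retailer Y": 260},
    }

    def industry_benchmark(industry: str) -> float:
        """Return the anonymized peer average; individual firms are never disclosed."""
        peers = ratios_by_industry[industry]
        return sum(peers.values()) / len(peers)

    print(f"Average for financial services: 1:{industry_benchmark('financial services'):.0f}")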

In terms of formal assurance principles, the benchmark data becomes the assessment criteria. A single engagement might consider dozens of different metrics, and where simple quantitative ratios do not apply, the benchmarker may have a continuously maintained library of case studies for more qualitative analysis. This starts to shade into the kind of work also performed by industry analysts. As the work becomes more qualitative, it also becomes more advisory and less about “assurance” per se.

Audit

The Committee, therefore, recommends that all listed companies should establish an Audit Committee.
— Cadbury Report
Agile or not, a team ultimately has to meet legal and essential organizational needs and audits help to ensure this.
— Scott Ambler
Disciplined Agile Delivery

If you look up “audit” online or in a dictionary, you will see it mainly defined in terms of finance: an audit is a formal examination of an organization’s finances (sometimes termed “books”). Auditors look for fraud and error so that investors (like our shop owner) have confidence that accountable parties (e.g., the shop manager) are conducting business honestly and accurately.

Audit is critically important to the functioning of the modern economy because there are great incentives for theft and fraud, and owners (in the form of shareholders) are remote from the business operations.

But what does all this have to do with IT and Digital Transformation?

Digital organizations, of course, have budgets and must account for how they spend money. Since financial accounting and its associated audit practices are well established, we won’t discuss them here. (We discussed IT financial management and service accounting in Financial Management of Digital and IT.)

Money represents a form of information - that of value. Money once was stored as a precious metal. When carrying large amounts of precious metal became impractical, it was stored in banks and managed through paper record-keeping. Paper record-keeping migrated onto computing machines, which now represent the value once associated with gold and silver. Bank deposits (our digital user’s bank account balance from Digital Fundamentals) are now no more than a computer record — digital bits in memory — made meaningful by tradition and law, and secured through multiple layers of protection and assurance.

Because of the increasing importance of computers to financial fundamentals, auditors became increasingly interested in IT. Clearly, these new electronic computers could be used to commit fraud in new and powerful ways. Auditors had to start asking: “How do you know the data in the computer is correct?”. This led to the formation in 1967 of the Electronic Data Processing Auditors Association (EDPAA), which eventually became ISACA (developer of COBIT).

It also became clear that computers and their associated operation were a notable source of cost and risk for the organization, even if they were not directly used for financial accounting. This has led to the direct auditing of IT practices and processes, as part of the broader assurance ecosystem we are discussing in this Competency Category.

A wide variety of IT practices and processes may be audited. Auditors may take a general interest in whether the IT organization is “documenting what it does and doing what it documents”, and therefore nearly every IT process has, at one time or another, been subject to audit.

IT auditors may audit projects, checking that the expected project methodology is being followed. They may audit IT performance reporting, such as claims of meeting SLAs. And they audit the organization’s security approach — both its definition of security policies and controls, as well as their effectiveness.

IT processes supporting financially relevant applications and systems will, in general, be under the spotlight from an auditor. This is where Information Technology General Controls (ITGCs) play a key role in assuring that the data is secured and reliable [153] for financial reporting. The IT Governance Institute [158] further articulates how the IT control and technology environment relates to the Public Company Accounting Oversight Board (PCAOB) guidance and to COBIT.

Figure 124. Audit Environment

Adapted from IT Control environment [158 p. 11], Figure 1.

Logical access to data, changes to applications, application development, and computer operations will be the key areas of focus. Each of these areas is further divided into sub-areas for controlling the environment effectively.

Areas of control, generally, fall into these categories: SDLC, change management, logical access, and operations. A brief note on each of these areas follows.

SDLC: If changes to applications involve data conversion, management approval should be documented before moving the code to production, irrespective of the development methodology chosen. Controls could be called:

  • Data conversion testing

  • Go-live approval

Change Management: Changes to applications need to be tested with the appropriate level of documentation and approved through a Change Control Board. Management needs to ensure that developers do not have privileged access to the production environment, including code migration privileges. Controls could be called:

  • Change testing

  • Change approval

  • No developer privileged access to production

  • Developer/migrator access

Logical Access: Access to applications needs to be controlled through a variety of controls, such as provisioning and deprovisioning of access; password complexity and generic identities; password management; and access reviews of privileged/generic accounts and user access. Though this is not an exhaustive list, management oversight across the board is needed wherever users are involved, to ensure the appropriate level of access is given and monitored. Controls could be called:

  • Password settings

  • User access reviews

  • Role reviews

  • Generic account password changes

  • Access review – privileged account

  • Access review – generic accounts

  • Access provisioning

  • Access removal

Operations: In IT organizations, another key activity is to maintain and manage the infrastructure supporting the data. Who has access to the data center, how jobs are scheduled, what is done when jobs fail, and how backups are taken and restored when needed form key sub-areas to manage. Controls could be called:

  • Data center access

  • Job scheduling and resolution

  • Data backups

  • Data restoration

Controls driven through policies and processes target the application landscape that supports financial reporting. Care needs to be taken to ensure that preventive and detective controls are executed in a timely manner. As an example, Logical Access – Access Provisioning is deemed a preventive control (preventing unnecessary access to data), while Logical Access – User Access Review is deemed a detective control (detecting whether access already provisioned is still appropriate). Each control needs to be evaluated on its own merits, and mitigating controls identified where needed.
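
As an illustration (the catalogue structure below is an assumption, and the control names simply echo the examples above), representing the control set as data makes it easy to check, for instance, whether each control area has both preventive and detective coverage:

    # Hypothetical ITGC catalogue entries; names echo the examples above.
    from collections import defaultdict

    controls = [
        {"area": "Logical Access", "name": "Access provisioning", "type": "preventive"},
        {"area": "Logical Access", "name": "User access reviews", "type": "detective"},
        {"area": "Change Management", "name": "Change approval", "type": "preventive"},
        {"area": "Operations", "name": "Job scheduling and resolution", "type": "detective"},
    ]

    # Simple coverage check: does each area have both preventive and detective controls?
    coverage = defaultdict(set)
    for control in controls:
        coverage[control["area"]].add(control["type"])

    for area, types in coverage.items():
        missing = {"preventive", "detective"} - types
        if missing:
            print(f"{area}: no {', '.join(sorted(missing))} control defined")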

If you are in a public company, IT controls relate directly to the SEC 10-K report, an annual filing of the company’s financial status. This report is the culmination of the various audits conducted by internal and external auditors. The Public Company Accounting Oversight Board’s guidance on internal control over financial reporting [218] clearly notes the implications of the control environment.

Control failure is an ongoing theme in The Phoenix Project, the novel that helped launch DevOps as a movement [165].

If controls start to fail, the scale of failure ranges from an individual application failing a single control, to failures of control design, to operational failure of the control itself. The PCAOB standard [218] grades severity: a control deficiency; a combination of deficiencies amounting to a significant deficiency; and a combination of deficiencies with the potential for misstatement of financial reporting, which constitutes a material weakness. Each of these control failures warrants management attention, potentially involving the application owner, the control owner, the CIO, and even the Audit Committee of the Board of Directors.

Organizations in the digital economy survive by making choices with a limited pool of budget. If the IT organization cannot control its environment, the cost of compliance increases significantly, reducing the investment available for innovation.

It is better to stay compliant, leaving more room to innovate!

External versus Internal Audit

There are two major kinds of auditors of interest to us:

  • External auditors

  • Internal auditors

Here is a definition of external auditor:

An external auditor is chartered by a regulatory authority to visit an enterprise or entity and to review and independently report the results of that review. [198 p. 319].

Many accounting firms offer external audit services, and the largest accounting firms (such as PricewaterhouseCoopers and Ernst & Young) provide audit services to the largest organizations (corporations, non-profits, and governmental entities). External auditors are usually certified public accountants, licensed by their state and following industry standards (e.g., from the American Institute of Certified Public Accountants).

By contrast, internal auditing is housed internally to the organization, as defined by the Institute of Internal Auditors:

Internal auditing is an independent appraisal function established within an organization to examine and evaluate its activities as a service to the organization. [198], p.320.

Internal audit is considered a distinct but complementary function to external audit [70], 4_39. The internal audit function usually reports to the Audit Committee. As with assurance in general, independence is critical — auditors must have organizational distance from those they are auditing, and must not be restricted in any way that could limit the effectiveness of their findings.

Audit Practices

As with other forms of assurance, audit follows the three-party model. There is a stakeholder, an accountable party, and an independent practitioner. The typical internal audit lifecycle consists of (derived from [150]):

  • Planning/scoping

  • Performing

  • Communicating

In the scoping phase, the parties are identified (e.g., the Board Audit Committee, the accountable and responsible parties, the auditors, and other stakeholders).

The scope of the audit is very specifically established, including objectives, controls, and various governance elements (e.g., processes) to be tested. Appropriate frameworks may be utilized as a basis for the audit, and/or the organization’s own process documentation.

The audit is then performed. A variety of techniques may be used by the auditors:

  • Performance of processes or their steps

  • Inspection of previous process cycles and their evidence (e.g., documents, recorded transactions, reports, logs, etc.)

  • Interviews with staff

  • Physical inspection or walkthroughs of facilities

  • Direct inspection of system configurations and validation against expected guidelines

  • Attempting what should be prevented (e.g., trying to access a secured system or view data over the authorization level)

A fundamental principle is “expected versus actual”. There must be some expected result to a process step, calculation, etc., to which the actual result can be compared.
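
The principle is simple enough to sketch. Assuming a hypothetical control whose expected result is a documented approval for every production change, the auditor’s test reduces to comparing expected against actual and recording exceptions:

    # Minimal "expected versus actual" audit test (hypothetical data).
    expected = {"change-1041": "approved", "change-1042": "approved"}
    actual   = {"change-1041": "approved", "change-1042": "no approval on file"}

    exceptions = {change_id: (expected[change_id], observed)
                  for change_id, observed in actual.items()
                  if observed != expected[change_id]}

    for change_id, (exp, act) in exceptions.items():
        print(f"Exception for {change_id}: expected '{exp}', actual '{act}'")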

Finally, the audit results are reported to the agreed users (often with a preliminary “heads up” cycle so that people are not surprised by the results). Deficiencies are identified in various ways and typically are taken into system and process improvement projects.

Evidence of Notability

Assurance as a broad topic, and audit as a narrower instance, are, like governance itself, fundamental to the functioning of the economy. ISACA has published significant guidance on both topics, as have other, broader organizations such as the International Auditing and Assurance Standards Board (IAASB).

Limitations

Assurance and audit in general require some clear standard or expectation against which to assess. In terms of process, they are suitable for lower-variability problem domains. They are less applicable to complex or chaotic domains. However, even within a chaotic situation, there will still be basic assumptions around (for example) how money and security are to be handled. Audit, in the IT systems context, has been important enough to drive the creation of one of the major IT/Digital Practitioner organizations, the Information Systems Audit and Control Association (ISACA, founded in 1967 as the Electronic Data Processing Auditors Association, or EDPAA).

Related Topics

Security

Description

A measure of a system’s ability to resist unauthorized attempts at usage or behavior modification, while still providing service to legitimate users.
— Sandy Bacik
Enernex

You have been practicing security since you first selected your initial choice of infrastructure and platform in Digital Infrastructure. But by now, your security capability is a well-established organization with processes spanning the enterprise, and a cross-functional steering committee composed of senior executives and with direct access to Board governance channels.

Security is a significant and well-known domain in and of itself. Ultimately, however, security is an application of the governance and risk management principles discussed in the previous sections. Deriving ultimately from the stakeholder’s desire to sustain and protect the enterprise, security relies on the foundations shown in Security Context:

Figure 125. Security Context

So what distinguishes security from more general concepts of risk? The definition at the top of this section is a good start, with its mention of “unauthorized attempts” to access or modify systems. The figure Security Context uses the term “sanctionable”, meaning violations might lead to legal or at least organizational penalties.

Many risks might involve carelessness, or incompetence, or random technical failure, or accidents of nature. Such concerns are sometimes termed "safety" to distinguish them from security. Security focuses on violations (primarily intentional, but also unintentional) of policies protecting organizational assets. In fact, "assets protection” is a common alternate name for corporate security.

“Authorization” is a key concept. Given some valuable resource, is access restricted to those who ought to have it? Are they who they say they are? Do they have the right to access what they claim is theirs? Are they conducting themselves in an expected and approved manner?

The security mentality is very different from the mentality found in a startup. A military analogy might be helpful. Being in a startup is like engaging in offensive missions: search, extract, destroy, etc. It involves travelling to a destination and operating with a single-point focus on completion.

Security, on the other hand, is like defending a perimeter. You have to think broadly across a large area, assessing weaknesses and distributing limited resources where they will have the greatest effect.

Security Taxonomy

An accepted set of terminology is key. The CISSP (Certified Information Systems Security Professional) Guide proposes the following taxonomy:

  • Vulnerability

  • Threat agent

  • Threat

  • Risk

  • Control

  • Exposure

  • Safeguard (e.g., control)

These terms are best understood in terms of their relationships, which are graphically illustrated in Security Taxonomy (similar to CISSP, [124]).

Figure 126. Security Taxonomy

In implementing controls, the primary principles are:

  • Availability: the asset must be accessible to those entitled to it

  • Integrity: the asset must be protected from destruction or corruption

  • Confidentiality: the asset must not be accessible to those not entitled to it

Information Classification

At a time when the significance of information and related technologies is increasing in every aspect of business and public life, the need to mitigate information risk, which includes protecting information and related IT assets from ever-changing threats is constantly intensifying.
— ISACA
COBIT 5 for Security

Before we turn to security engineering and security operations, we need to understand the business context of security. The assets at risk are an important factor, and risk management gives us a good general framework. One additional technique associated with security is information classification. A basic hierarchy is often used, such as:

  • Public

  • Internal

  • Confidential

  • Restricted

The military uses the well-known levels of:

  • Unclassified

  • Confidential

  • Secret

  • Top Secret

These classifications assist in establishing the security risk and necessary controls for a given digital system and/or process.
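
A classification scheme becomes actionable when each level is tied to minimum handling requirements. The mapping below is purely illustrative; the specific controls attached to each level are assumptions that an organization would define for itself:

    # Illustrative mapping of classification level to minimum handling controls.
    handling_requirements = {
        "Public":       {"encryption_at_rest": False, "access_review": "none"},
        "Internal":     {"encryption_at_rest": False, "access_review": "annual"},
        "Confidential": {"encryption_at_rest": True,  "access_review": "quarterly"},
        "Restricted":   {"encryption_at_rest": True,  "access_review": "monthly"},
    }

    def minimum_controls(classification: str) -> dict:
        """Look up the handling requirements for a system's data classification."""
        return handling_requirements[classification]

    print(minimum_controls("Confidential"))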

Information also can be categorized by subject area. This becomes important from a compliance point of view. This will be discussed in Competency Area 11, in the section on records management.

Security Engineering

For the next two sections, we will adopt a “dual-axis” view, first proposed in [31].

In this model, the systems lifecycle is considered along the horizontal axis, and the user experience is considered along the vertical axis (which also maps to the “stack”). In the following picture, we see the distinct concerns of the various stakeholders in the dual-axis model (see Security and the Dual-Axis Value Chain).

Figure 127. Security and the Dual-Axis Value Chain
Consumer versus Sponsor Perspective

The consumer of the digital service has different concerns from the sponsor/customer (in our three-party model). The consumer (our woman checking her bank balance) is concerned with immediate aspects of confidentiality, integrity, and availability:

  • Is this communication private?

  • Is my money secure?

  • Can I view my balance and do other operations with it (e.g., transfer it) confident of no interference?

The sponsor, on the other hand, has derivative concerns:

  • Are we safe from the bad publicity that would result from a breach?

  • Are we compliant with laws and regulations or are we risking penalties for non-compliance (as well as risking security issues)?

  • Are our security activities as cost-efficient as possible, given our risk appetite?

Security Architecture and Engineering

Security engineering is concerned with the fundamental security capabilities of the system, as well as ensuring that any initial principles established for the system are adhered to as development proceeds, and/or as vendors are selected and perhaps replaced over time.

There are multitudes of books written on security from an engineering, architecture, and development perspective. The tools, techniques, and capabilities evolve quickly every year, which is why having a fundamental business understanding based on a stable framework of risk and control is essential.

This is a book on management, so we are not covering technical security practices and principles, just as we are not covering specific programming languages or distributed systems engineering specifics. Studying for the CISSP exam will provide both an understanding of security management, as well as current technical topics. A glance at the CISSP Guide shows how involved such topics can be:

  • The Harrison-Rizzo-Ullman security model

  • The Diffie-Hellman Asymmetrical Encryption Algorithm

  • Functions and Protocols in the OSI Model

Again, the issue is mapping such technical topics to the fundamentals of risk and control. Key topics we note here include:

  • Authentication and authorization

  • Network security

  • Cryptography

Authentication and authorization are the cornerstones of access; i.e., the gateway to the asset. Authentication confirms that a person is who they say they are. Authorization is the management of their access rights (can they see the payroll? reset others' passwords?).
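
The distinction can be sketched in a few lines of Python. Everything here (the user record, roles, and permission names) is hypothetical, and the hashing shown is illustrative only, not a production password scheme; the point is simply that authentication answers “who are you?” while authorization answers “what may you do?”:

    import hashlib
    import hmac

    # Hypothetical user store: salted password hashes and role assignments.
    # (SHA-256 is used for brevity; real systems use dedicated password hashing.)
    USERS = {
        "alice": {"salt": b"s1",
                  "pw_hash": hashlib.sha256(b"s1" + b"secret").hexdigest(),
                  "roles": {"payroll_viewer"}},
    }
    ROLE_PERMISSIONS = {"payroll_viewer": {"view_payroll"},
                        "helpdesk_admin": {"reset_password"}}

    def authenticate(username: str, password: str) -> bool:
        """Authentication: is this person who they say they are?"""
        user = USERS.get(username)
        if not user:
            return False
        candidate = hashlib.sha256(user["salt"] + password.encode()).hexdigest()
        return hmac.compare_digest(candidate, user["pw_hash"])

    def authorize(username: str, permission: str) -> bool:
        """Authorization: does this (already authenticated) person hold the right?"""
        roles = USERS.get(username, {}).get("roles", set())
        return any(permission in ROLE_PERMISSIONS.get(r, set()) for r in roles)

    assert authenticate("alice", "secret")
    assert authorize("alice", "view_payroll")
    assert not authorize("alice", "reset_password")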

Network security is a complex sub-domain in and of itself. Because attacks typically transpire over the Internet and/or internal organizational networks, the structure and capabilities of networks are of critical concern, including topics such as:

  • Routing

  • Firewalls

  • The Domain Name Service

Finally, cryptography is the “storage and transmission of data in a form that only those it is intended for can read and process” [124].

All of these topics require in-depth study and staff development. As of the time of writing, there is a notable shortage of skilled security professionals. Therefore, a critical risk is that your organization might not be able to hire people with the needed skills (consider our section on resource management).

Security and the Systems Lifecycle

Security is a concern throughout the systems lifecycle. You already know this. Otherwise, you would not have reached enterprise scale. But now you need to formalize it with some consistency, as that is what regulators and auditors expect, and it also makes it easier for your staff to work on various systems.

Security should be considered throughout the SDLC, including systems design, but this is easier said than done. Organizations will always be more interested in a system’s functionality than its security. However, a security breach can ruin a company.

The CISSP recommends (among other topics) consideration of the following throughout the systems lifecycle:

  • The role of environmental (e.g., OS-level) safeguards versus internal application controls

  • The challenges of testing security functionality

  • Default implementation issues

  • Ongoing monitoring

Increasingly important controls during the construction process in particular are:

  • Code reviews

  • Automated code analysis

Finally, we previously discussed the Netflix Simian Army, which can serve as a form of security control.

Sourcing and Security

Vendors come and go in the digital marketplace, offering thousands of software-based products across every domain of interest (we call this the technology product lifecycle). Inevitably, these products have security errors. A vendor may issue a “patch” for such an error, which must be applied to all instances of the running software. Such patches are not without risk, and may break existing systems; they, therefore, require testing under conditions of urgency.

Increasingly, software is offered as a service, in which case it is the vendor’s responsibility to patch their own code. But what if they are slow to do this? Any customer relying on their service is running a risk, and other controls may be required to mitigate the exposure.

One important source of vulnerability information is the National Vulnerability Database (NVD), maintained by the US National Institute of Standards and Technology (NIST). In this database, you can look up various products and see if they have known security holes. Using the NVD is complex and not something that can be simply and easily “implemented” in a given environment, but it does represent an important, free, taxpayer-supported resource of use to security managers.
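
As a hedged sketch of how the NVD might be queried programmatically: the endpoint, parameters, and response fields below are assumptions based on the NVD REST API as documented at the time of writing, so verify them (along with rate limits and API-key requirements) against the current NVD documentation before relying on this:

    # Hedged sketch: querying the NVD for CVEs mentioning a product keyword.
    # Endpoint, parameters, and response fields are assumptions; check the
    # current NVD documentation before use.
    import requests

    NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"

    def search_cves(keyword: str, limit: int = 5):
        response = requests.get(NVD_URL,
                                params={"keywordSearch": keyword,
                                        "resultsPerPage": limit},
                                timeout=30)
        response.raise_for_status()
        for item in response.json().get("vulnerabilities", []):
            cve = item.get("cve", {})
            print(cve.get("id"))

    search_cves("openssl")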

An important type of vulnerability is the “zero-day” vulnerability. With this kind of vulnerability, knowledge of a security “hole” becomes widespread before any patches are available (i.e., the software’s author and users have “zero days” to create and deploy a fix). Zero-day exploits require the fast and aggressive application of alternate controls, which leads us to the topic of security operations.

Integration of Security and Safety

The digital enterprise is becoming increasingly complex. Technological artifacts are embedded into organizations and social systems to form complex socio-technical systems. Unanticipated combinations of events can create losses that can become catastrophic. For example, autonomous vehicles are involved in deadly accidents, the power grid is threatened by viruses, or cybercriminals take hospitals hostage.

Traditional safety and security analysis techniques are increasingly less effective because they were created for a simpler world where linear event chains and statistical tools were sufficient. New safety and security analysis techniques that are based on systems theory are a better fit to address the digital enterprise challenges.

In a digital world, we recommend modeling the enterprise as a complex socio-technical system and taking an integrated approach to both security and safety. Young and Leveson [311] advocate this approach.

The integrated approach addresses losses due to intentional and unintentional actions: “Safety experts see their role as preventing losses due to unintentional actions by benevolent actors. Security experts see their role as preventing losses due to intentional actions by malevolent actors. The key difference is the intent of the actor that produced the loss event.”

This approach makes more efficient use of resources. It also provides a high-level strategic analysis that prevents getting stuck in tactical problems before the big picture is established. The Systems Theoretic Process Analysis (STPA) Handbook (published in March 2018), authored by Nancy G. Leveson and John P. Thomas, develops a systemic and integrated method to manage safety and security hazards.

STPA advocates the modeling of a controller to enforce constraints on the behavior of the system. Controls are not limited to the technology or process sphere; they can also include social controls.

Security Operations

Networks and computing environments are evolving entities; just because they are secure one week does not mean they are secure three weeks later.
— Shon Harris
Guide to the CISSP

Security requires ongoing operational attention. Security operations is first and foremost a form of operations, as discussed in Operations Management. It requires on-duty and on-call personnel, and some physical or virtual point of shared awareness (for example, a physical Security Operations Center, perhaps co-located with a Network Operations Center). Beyond the visible presence of a Security Operations Center, various activities must be sustained. These can be categorized into four major areas:

  • Prevention

  • Detection

  • Response

  • Forensics

Prevention

An organization’s understanding of what constitutes a “secure” system is continually evolving. New threats continually emerge, and the alert security administrator has an ongoing firehose of bulletins, alerts, patch notifications, and the like to keep abreast of.

These inputs must be synthesized by an organization’s security team into a set of security standards for what constitutes a satisfactorily configured (“hardened”) system. Ideally, such standards are automated into policy-driven systems configuration approaches; in less ideal situations, manual configuration — and double-checking — is required.
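
As a sketch of what “policy-driven” can mean in practice, a hardening baseline can be expressed as data and checked (or enforced) automatically. The setting names and expected values below are invented for illustration and are not drawn from any particular hardening standard:

    # Hypothetical hardening baseline expressed as data, checked against an
    # observed system configuration. Setting names and values are illustrative.
    BASELINE = {
        "ssh_password_authentication": "no",
        "min_password_length": 14,
        "audit_logging_enabled": True,
    }

    observed = {
        "ssh_password_authentication": "yes",
        "min_password_length": 14,
        "audit_logging_enabled": True,
    }

    deviations = {setting: (expected, observed.get(setting))
                  for setting, expected in BASELINE.items()
                  if observed.get(setting) != expected}

    for setting, (expected, actual) in deviations.items():
        print(f"Hardening deviation: {setting} expected {expected!r}, found {actual!r}")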

Prevention activities include:

  • Maintaining signatures for intrusion detection and anti-virus systems

  • Software patching (e.g., driven by the technology product lifecycle and updates to the National Vulnerability Database)

  • Ongoing maintenance of user authorizations and authentication levels

  • Ongoing testing of security controls (e.g., firewalls, configurations, etc.)

  • Updating security controls appropriately for new or changed systems

Detection

There are many kinds of events that might indicate some security issue; systems exposed to the open Internet are continually scanned by a wide variety of often-hostile actors. Internal events, such as unscheduled/unexplained system restarts, may also indicate security issues. The challenge with security monitoring is identifying patterns indicating more advanced or persistent threats. When formalized, such patterns are called “signatures”.

One particular form of event that can be identified for systems under management is configuration state change.

For example, if a core OS file — one that is well known and not expected to change — changes in size one day with no explanation, this might be indicative of a security exploit. Perhaps an attacker has substituted this file with one containing a “backdoor” allowing access. Tools such as Tripwire are deployed to scan and inventory such files and key information about them (“metadata”) and raise alerts if unexpected changes occur. Infrastructure managers such as Chef and Puppet may also serve as inputs into security event management systems; for example, they may detect attempts to alter critical configuration files, and in re-converging the managed resource back to its desired state they can be a source of valuable information about potential exploits. Such tools also may be cited as controls for various kinds of security risks.
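
The core of such file-integrity checking can be sketched in a few lines; this illustrates the general technique, not how Tripwire itself is implemented. A cryptographic hash of each watched file is recorded as a baseline, and the files are periodically re-hashed so differences can be flagged:

    import hashlib
    from pathlib import Path

    WATCHED = ["/bin/ls", "/etc/passwd"]      # illustrative paths

    def fingerprint(path: str) -> str:
        """SHA-256 of file contents plus its size; other metadata could be added."""
        data = Path(path).read_bytes()
        return f"{hashlib.sha256(data).hexdigest()}:{len(data)}"

    def snapshot(paths):
        return {p: fingerprint(p) for p in paths}

    baseline = snapshot(WATCHED)              # taken when the system is known-good
    # ... later, on a schedule ...
    for path, current in snapshot(WATCHED).items():
        if current != baseline[path]:
            print(f"ALERT: unexpected change to {path}")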

We have discussed the importance of configuration management in both Digital Infrastructure and Operations Management. In Digital Infrastructure, we discussed the important concept of Infrastructure as Code and policy-driven configuration management; we revisited the importance of configuration management from an operational perspective in State and Configuration. Configuration management also will re-appear in Information Management and Architecture.

It should be clear by now that configuration management is one of the most critical enabling capabilities for digital management, regardless of whether you look to traditional ITSM practices or modern DevOps approaches.

Detection activities include:

  • Monitoring events and alerts from intrusion detection and related operational systems

  • Analyzing logs and other artifacts for evidence of exploits

Response

Security incidents require responses. Activities include:

  • Declaring security incidents

  • Marshaling resources (staff, consultants, law enforcement) to combat the incident

  • Developing immediate tactical understanding of the situation

  • Developing a response plan, under time constraints

  • Executing the plan, including ongoing monitoring of effectiveness and tactical correction as needed

  • Keeping stakeholders informed as to the situation

Forensics

Finally, security incidents require careful after-the-fact analysis:

  • Analyzing logs and other artifacts for evidence of exploits

  • Researching security incidents to establish causal factors and develop new preventative approaches (thus closing the loop)

Relationship to Other Processes

As with operations as a whole, there is ongoing monitoring and reporting to various stakeholders, and interaction with other processes.

One of the most important operational processes from a security perspective is change management. Configuration state changes (potentially indicating an exploit in progress) should be reconciled first to change management records. Security response may also require emergency change processes. ITSM event and incident management may be leveraged as well.
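
A minimal sketch of that reconciliation follows. The record formats are assumptions; the logic is the point: any detected configuration change that cannot be matched to an approved change record should be raised for investigation as a potential security event:

    from datetime import datetime

    # Hypothetical approved change records and detected configuration changes.
    approved_changes = [
        {"id": "CHG-2001", "item": "web-server-01",
         "start": datetime(2019, 5, 1, 22, 0), "end": datetime(2019, 5, 1, 23, 0)},
    ]
    detected_changes = [
        {"item": "web-server-01", "at": datetime(2019, 5, 1, 22, 15)},
        {"item": "db-server-02",  "at": datetime(2019, 5, 2, 3, 40)},
    ]

    def reconciled(change):
        """True if the detected change falls within an approved change window."""
        return any(c["item"] == change["item"] and c["start"] <= change["at"] <= c["end"]
                   for c in approved_changes)

    for change in detected_changes:
        if not reconciled(change):
            print(f"Unreconciled change on {change['item']} at {change['at']} "
                  "- escalate to the security incident process")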

The particular concerns of security may interfere with cross-process coordination. This is a topic beyond the scope of this document.

Security and Assurance

Quis custodiet ipsos custodes?
— Latin for “Who watches the watchers?”

Given the critical importance of security in digital organizations, it is an essential matter for governance attention at the highest levels.

Security management professionals are accountable to governance concerns just as any other manager in the digital organization. Security policies, processes, and standards are frequently audited, by both internal auditors as well as external assurance professionals (not only auditors but other forms of assurance as well).

The idea that an "assets protection" group might itself be audited may be hard to understand, but security organizations such as police forces have Internal Affairs units for just such purposes.

Security auditors might review the security processes mentioned above, or system configuration baselines, or log files, or any number of other artifacts, depending on the goals and scope of a security audit. Actual penetration testing is a frequently used approach: the hiring of skilled “white-hat” hackers to probe an organization’s defenses. Such hackers might be given license to probe as far as they possibly can and return with comprehensive evidence of what they were able to access (customer records, payrolls, account numbers, and balances, etc.).

Emerging Topics

As AI plays an increasing role in the digital enterprise, it creates new hazards and can at the same time help fight security threats. Forbes warned its readers that good bots can go bad and articles raise chatbot security concerns.

At the same time, Machine Learning techniques applied to cybersecurity are being developed; for example, “Machine Learning and Security: Protecting Systems with Data and Algorithms”, published by O’Reilly Media, February 16, 2018.

Will this end the cat-and-mouse game between attackers and defenders? Or fuel a new security arms race? The jury is still out. That is why a holistic approach to security and safety problems is required. It provides the context and guidance to better control a digital world where automation and AI change the game.

Evidence of Notability

Securing valuable assets and capabilities from unintentional and intentional harm has been a concern of the human race since time immemorial. Digital systems provide attractive targets and their security has been of interest since their first uses.

Limitations

Like risk more broadly, security must be balanced against value at risk; otherwise it can degenerate into theater.

Related Topics

Digital Governance

Description

The legacy of IT governance is wide, deep, and often wasteful. Approaches based on mis-applied Taylorism and misguided, CMM-inspired statistical process control have resulted in the creation of ineffective, large-scale IT bureaucracies whose sole mission seems to be the creation and exchange of non-value-add secondary artifacts, while lacking any clear concept of an execution model.

What is to be done? Governance will not disappear any time soon. Simply arguing against governance is unlikely to succeed. Instead, this document argues the most effective answer lies in a re-examination of the true concerns of governance:

  • Sustaining innovation and effective value delivery

  • Maintaining efficiency

  • Optimizing risk

These fundamental principles (“top-line”, “bottom-line”, “risk”) define value for organizations around the world, whether for-profit, non-profit, or governmental. After considering the failings of IT governance, we will re-examine it in light of these objectives and come up with some new answers for the digital era.

From the perspective of Digital Transformation, there are many issues with traditional IT governance and the assumptions and practices characterizing it.

The New Digital Operating Model

Consider the idea of “programmability” mentioned at the start of this Competency Category. A highly “programmable” position is one where the responsibilities can be specified in terms of their activities. And what is the fundamental reality of Digital Transformation? It is no accident that such positions are called “programmable”. In fact, they are being “programmed away” or “eaten by software" — leaving only higher-skill positions that are best managed by objective, and which are more sensitive to cultural dynamics.

Preoccupation with “efficiency” fades as a result of the decreasingly “programmable” component of work. The term “efficiency” signals a process that has been well defined (is “programmable”) to the point where it is repeatable and scalable. Such processes are ripe for automation, commoditization, and outsourcing, and this is in fact happening.

If the repetitive process is retained within an organization, the drive for efficiency leads to automation and, eventually, efficiency is expressed through concerns for capacity management and the consumption of computing resources. And when such repetitive concerns are not retained by the organization, but instead become a matter of sourcing rather than execution, the emphasis shifts to risk management and governance of the supplier.

The remaining uncertain and creative processes should not be managed merely for "efficiency"; they need to be managed for effectiveness, including fast feedback, collaboration, culture, and so forth.

Project versus Operations as Operating Model

Figure 128. Governance Based on Project versus Operations

As we can see illustrated in Governance Based on Project versus Operations (similar to [155]), ISO/IEC 38500 assumes a specific IT operating model, one in which projects are distinct from operations. We have discussed the growing influence of product-centric digital management throughout this document, but as of the time of writing the major IT governance standard still does not recognize it. The ripple effects are seen throughout other guidance and commentary. In particular, the project-centric mindset is closely aligned with the idea of IT and the CIO as primarily order-takers.

CIO as Order-Taker

Throughout much of the IT governance guidance, certain assumptions become evident:

  • There is an entity that can be called “the Business”

  • There is a distinct entity called “IT” (for “Information Technology”)

  • It is the job of “IT” to take direction (i.e., orders) from “the Business” and to fulfill them

  • There is a significant risk that “IT” activities (and by extension, dollars spent on them) may not be correctly allocated to the preferred priorities of “the Business”; IT may spend money unwisely, on technology for its own sake, and this risk needs to be controlled

  • The needs of “the Business” can be precisely defined and it is possible for “IT” to support those needs with a high degree of predictability as to time and cost; this predictability is assumed even when those activities imply multi-million dollar investments and months or years of complex implementation activities

  • When such activities do not proceed according to initial assumptions, this is labeled an "IT failure”; it is assumed that the failure could have been prevented through more competent management, “rigorous” process, or diligent governance, especially on the IT side

There may be organizations where these are reasonable assumptions. (This document does not claim they do not exist.) But there is substantial evidence for the existence of organizations for whom these assumptions are untrue.

The Fallacies of “Rigor” and Repeatability

One of the most critical yet poorly understood facts of software development and by extension complex digital system innovation is the impossibility of “rigor”. Software engineers are taught early that “completely” testing software is impossible [161], yet it seems that this simple fact (grounded in fundamentals of computer science and information theory) is not understood by many managers.

A corollary fallacy is that of repeatable processes, when complexity is involved. We may be able to understand repeatability at a higher level, through approaches like case management or the Checklist Manifesto's submittal schedules, but process control in the formal sense within complex, R&D-centric digital systems delivery is simply impossible, and the quest for it is essentially cargo cult management.

And yet IT governance discussions often call for "repeatability" (e.g., [198 p. 6]), despite the fact that value that can be delivered “repeatably”, “year after year” is for the most part commodity production, not innovative product development. Strategy is notably difficult to commoditize.

Another way to view this is in terms of the decline of traditional IT. As you review those diagrams, understand that much of IT governance has emerged from the arguably futile effort to deliver product innovation in a low-risk, “efficient” manner. This desire has led, as Ambler and Lines note at the top of this Competency Category, to the creation of layers and layers of bureaucracy and secondary artifacts.

The cynical term for this is “theater”, as in an act that is essentially unreal but presented for the entertainment and distraction of an audience.

As we noted above, a central reality of Digital Transformation is that commoditized, predictable, programmable, repeatable, “efficient” activities are being quickly automated, leaving governance to focus more on the effectiveness of innovation (e.g., product development) and management of supplier risk. Elaborate IT operating models specifying hundreds of interactions and deliverables, in a futile quest for "rigor” and “predictability”, are increasingly relics of another time.

Digital Effectiveness

Let’s return to the first value objective: effectiveness. We define effectiveness as “top-line” benefits: new revenues and preserved revenues. New revenues may come from product innovation, as well as successful marketing and sales of existing products to new markets (which itself is a form of innovation).

Traditionally, “back-office” IT was rarely seen as something contributing to effectiveness, innovation, and top-line revenue. Instead, the first computers were used to increase efficiency, through automating clerical work. The same processes and objectives could be executed for less money, but they were still the same back-office processes.

With Digital Transformation, product innovation and effectiveness is now a much more important driver. Yet product-centric management is still poorly addressed by traditional IT governance, with its emphasis on distinguishing projects from operations.

One tool that becomes increasingly important is a portfolio view. While PMOs may use a concept of “portfolio” to describe temporary initiatives, such project portfolios rarely extend to tracking ongoing operational benefits. Alternative approaches also should be considered such as the idea of an options approach.

Digital Efficiency

Efficiency is a specific, technical term, and although often inappropriately prioritized, is always an important concern. Even a digitally transforming, product-centric organization can still have governance objectives of optimizing efficiency. Here are some thoughts on how to re-interpret the concept of efficiency.

Consolidate the Pipelines

One way in which digital organizations can become more efficient is to consolidate development as much as possible into common pipelines. Traditionally, application teams have owned their own development and deployment pipelines, at the cost of great, non-value add variability. Even centralizing source control has been difficult.

This is challenging for organizations with large legacy environments, but full lifecycle pipeline automation is becoming well understood across various environments (including the mainframe).

Reduce Internal Service Friction

Another way of increasing efficiency is to standardize integration protocols across internal services, as Amazon has done. This reduces the need for detailed analysis of system interaction approaches every time two systems need to exchange data. This is a form of reducing transaction costs and therefore consistent with Coase’s theory of the firm [63].

Within the boundary of a firm, a collaboration between internal services should be easier because of reduced transaction costs. It is not hard to see that this would be the case for digital organizations: security, accounting, and CRM would all be more challenging and expensive for externally-facing services.

However, since a firm is a system, a service within the boundaries of a firm will have more constraints than a service constrained only by the market. The internal service may be essential to other, larger-scoped services, and may derive its identity primarily from that context.

Because the need for the service is well understood, the engineering risk associated with the service may also be reduced. It may be more of a component than a product. See the parable of the Flower and the Cog. Reducing service redundancy is a key efficiency goal within the bounds of a system — more to come on this in Architecture.

Manage the Process Portfolio

Processes require ongoing scrutiny. The term “organizational scar tissue” is used when specific situations result in new processes and policies that in turn increase transactional friction and reduce efficiency throughout the organization.

Processes can be consolidated, especially if specific procedural detail can be removed in favor of larger-grained case management or Checklist Manifesto concepts including the submittal schedule. As part of eventual automation and Digital Transformation, processes can be ranked as to how “heavyweight” they are. A typical hierarchy, from “heaviest” to “lightest”, might be:

  • Project

  • Release

  • Change

  • Service request

  • Automated self-service

The organization might ask itself:

  • Do we need to manage this as a project? Why not just a release?

  • Do we need to manage this as a release? Why not just a change?

  • Do we need to manage this as a change? Why not just a service request?

  • Do we need to manage this as a service request? Why is it not fully automated self-service?

There may be a good reason to retain some formality; the point is to keep asking the question. Do we really need a separate process, or can the objectives be achieved as part of an existing process or another element? A minimal sketch of such a decision cascade follows.
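The following minimal Python sketch illustrates that questioning cascade as a routing rule; the criteria (crosses_teams, fulfillable_by_catalog, and so on) are hypothetical fields for illustration, and a real organization would substitute its own policy.

  # Minimal sketch (illustrative criteria, not a standard): route each piece of
  # work to the lightest-weight process that still satisfies its constraints,
  # asking the "why not lighter?" question from heaviest to lightest.
  from dataclasses import dataclass


  @dataclass
  class WorkItem:
      description: str
      crosses_teams: bool = False           # needs coordinated, multi-team delivery
      customer_facing_change: bool = False
      alters_production: bool = False
      fulfillable_by_catalog: bool = False  # a predefined, repeatable request
      fully_scripted: bool = False          # no human judgment required


  def lightest_process(item: WorkItem) -> str:
      """Return the lightest process weight that still fits the work item."""
      if item.fully_scripted:
          return "automated self-service"
      if item.fulfillable_by_catalog:
          return "service request"
      if item.alters_production and not item.customer_facing_change:
          return "change"
      if item.customer_facing_change and not item.crosses_teams:
          return "release"
      return "project"


  if __name__ == "__main__":
      print(lightest_process(WorkItem("reset a password", fully_scripted=True)))
      print(lightest_process(WorkItem("new pricing feature",
                                      customer_facing_change=True,
                                      alters_production=True)))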

Treat Governance as Demand

A steam engine’s governor imposes some load, some resistance, on the engine. In the same way, governance activities and objectives, unless fully executed by the directing body (e.g., the Board), themselves impose a demand on the organization.

This points to the importance of having a clear demand/execution framework in place to manage governance demand. The organization does not have an unlimited capacity for audit response, reporting, and the like. In order to understand the organization as a system, governance demand needs to be tracked, accounted for, and challenged for efficiency just like any other kind of demand.
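Here is a minimal sketch of what that might look like, assuming a hypothetical backlog shape in which governance-driven work is tagged alongside feature and operational work so that its share of capacity can be measured:

  # Minimal sketch (assumed backlog shape): tag governance-driven work in the
  # same demand intake as feature and operational work, so its share of team
  # capacity is visible and can be challenged like any other demand.
  from dataclasses import dataclass


  @dataclass
  class DemandItem:
      title: str
      demand_type: str   # e.g., "feature", "operational", "governance"
      estimate_days: float


  def capacity_share(backlog: list[DemandItem], demand_type: str) -> float:
      """Fraction of total estimated effort consumed by one demand type."""
      total = sum(i.estimate_days for i in backlog)
      typed = sum(i.estimate_days for i in backlog if i.demand_type == demand_type)
      return typed / total if total else 0.0


  if __name__ == "__main__":
      backlog = [
          DemandItem("checkout redesign", "feature", 8),
          DemandItem("quarterly access recertification evidence", "governance", 3),
          DemandItem("audit sample extraction for external testing", "governance", 2),
          DemandItem("patch TLS libraries", "operational", 1),
      ]
      print(f"governance share: {capacity_share(backlog, 'governance'):.0%}")  # 36%

Such a measure does not imply that governance work is waste; it simply makes its cost visible so it can be challenged for efficiency.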

Leverage the Digital Pipeline

Finally, efficiency asks: can we leverage the digital pipeline itself to achieve governance objectives? This is not a new idea. The governance/management interface must be realized by specific governance elements, such as processes. Processes can (and often should) be automated. Automation is the raison d’être of the digital pipeline; if a process can be expressed as user stories, behavior-driven development scenarios, or other forms of requirement, it is simply one more state change moving from development to operations.

In some cases, the governance stories must be applied to the pipeline itself. This is perhaps more challenging, but there is no reason the pipeline itself cannot be represented as code and managed using the same techniques. The automated elements can then report their status up to the monitoring activity of governance, closing the loop. Auditors should periodically re-assess the effectiveness of these automated controls.
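As a hedged illustration, the following Python sketch shows a governance requirement expressed as an automated pipeline check whose result can be published to governance monitoring; the control identifier GOV-STORAGE-001 and the deployment configuration shape are hypothetical.

  # Minimal sketch (hypothetical control and config shape): a governance
  # requirement expressed as an automated check that runs in the delivery
  # pipeline and emits a result the governance monitoring layer can consume.
  import json


  def check_no_public_buckets(deployment_config: dict) -> dict:
      """Control: object storage in this deployment must not be world-readable."""
      violations = [
          b["name"] for b in deployment_config.get("buckets", [])
          if b.get("public_read", False)
      ]
      return {
          "control": "GOV-STORAGE-001",          # illustrative control identifier
          "status": "fail" if violations else "pass",
          "violations": violations,
      }


  if __name__ == "__main__":
      config = {"buckets": [{"name": "logs", "public_read": False},
                            {"name": "exports", "public_read": True}]}
      # In a real pipeline this JSON would be published to the governance
      # monitoring system; here we simply print it.
      print(json.dumps(check_no_public_buckets(config), indent=2))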

Digital Risk Management

Finally, from an IT governance perspective, what is the role of IT risk management in the new digital world? It’s not that risk management goes away. Many risks that are well understood today will remain risks for the foreseeable future. But there are significant new classes of risk that need to be better understood and managed:

  • Unmanaged demand and disorganized execution models leading to multi-tasking, which is destructive of value and results

  • High queue wait states, resulting in uncompetitive speed to deliver value

  • Slow feedback due to large batch sizes, reducing effectiveness of product discovery

  • New forms of supplier risk, as services become complex composites spanning the Internet ecosystem

  • Toxic cultural dynamics destructive of high team performance

  • Failure to incorporate cost of delay in resource allocation and work prioritization decisions

All of these conditions can reduce or destroy revenues, erode organizational effectiveness, and worse. It is hard to see them as other than risks, yet there is little attention to such factors in the current “best practice” guidance on risk.

Cost of Delay as Risk

In today’s digital governance, there must be a greater concern for outcome and effectiveness, especially in terms of time-to-market (minimizing cost of delay). Previously, concerns for efficiency might have led a company to overburden its staff, resulting in queuing gridlock, too much work-in-process, destructive multi-tasking, and ultimately failure to deliver timely results (or to deliver at all).

Such failure to deliver was tolerated because it seemed to be a common problem across most IT departments. Also relevant is the fact that Digital Transformation had not taken hold yet. IT systems were often a back office, and delays in delivering them (or significant issues in their operation) were not quite as damaging.

Now, the effectiveness of delivery is essential. The interesting, and to some degree unexpected, result is that both efficiency and risk management seem to benefit as well. Cross-functional, focused teams are both more effective and more efficient, and better able to manage risk. Systems are being built with both increased rapidity and improved stability, and the automation enabling this provides robust audit support.
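To make cost of delay concrete as a prioritization input, here is a minimal worked sketch using the common “cost of delay divided by duration” (CD3) heuristic; the initiatives and figures are invented for illustration.

  # Minimal worked sketch (illustrative figures): comparing two initiatives by
  # Cost of Delay Divided by Duration (CD3), a weighted-shortest-job-first
  # style prioritization heuristic.
  from dataclasses import dataclass


  @dataclass
  class Initiative:
      name: str
      cost_of_delay_per_week: float  # value forgone for each week of delay
      duration_weeks: float

      @property
      def cd3(self) -> float:
          return self.cost_of_delay_per_week / self.duration_weeks


  if __name__ == "__main__":
      candidates = [
          Initiative("regulatory reporting fix", 50_000, 2),   # CD3 = 25,000
          Initiative("new product feature", 120_000, 8),       # CD3 = 15,000
      ]
      for i in sorted(candidates, key=lambda x: x.cd3, reverse=True):
          print(f"{i.name}: CD3 = {i.cd3:,.0f} per week")

Failing to account for cost of delay in this way is itself a risk: the larger initiative may consume capacity while the more urgent, shorter item waits in queue.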

Team Dynamics as Risk

We have covered culture in some depth in Coordination and Process. Briefly, from a governance perspective:

The importance of organizational culture has been discussed by management thinkers since at least W. Edwards Deming. As a quote often attributed to Peter Drucker puts it: “culture eats strategy for breakfast”. But it has been difficult at best to quantify what we mean by culture.

Quantify? Some might even say quantification is impossible. But Google and the State of DevOps research have done so. Google has established the importance of psychological safety in forming effective, high-performing teams [242]. And the State of DevOps research, applying the Westrum typology, has similarly confirmed that pathological, controlling cultures are destructive of digital value [224].

These facts should be taken seriously in digital governance discussions. So-called “toxic” leadership (an increasing concern in the military itself [293]) is destructive of organizational goals and stakeholder value. It can be measured and managed and should be a matter of attention at the highest levels of organizational governance.

Sourcing Risk

We have already covered contracting in terms of software and cloud. In terms of the emergence model, however, it is typical for companies to enter into contracts before they have a fully mature sourcing and contract management capability informed by the GRC perspective.

We have touched on the issues of cloud due diligence, sourcing, and security in this Competency Area. The 2e2 case discussed previously is interesting; it seems that due diligence had actually been performed. Additional controls, in particular business continuity planning, could have made a key difference.

There are a wide variety of supplier-side risks that must be managed in cloud contracts:

  • Access

  • Compliance

  • Data location

  • Multi-tenancy

  • Recovery

  • Investigation

  • Viability (assurance)

  • Escrow

We have emphasized throughout this document the dynamic nature of digital services. This presents a challenge for risk management of digital suppliers. This year’s audit is only a point-in-time snapshot; how can assurance be maintained with a fast-evolving supplier? This leading edge of cloud sourcing is represented in discussions such as “Dynamic certification of cloud services: Trust, but verify!”:

The on-demand, automated, location-independent, elastic, and multi-tenant nature of cloud computing systems is in contradiction with the static, manual, and human process-oriented evaluation and certification process designed for traditional IT systems …

Common certificates are a backward look at the fulfillment of technical and organizational measures at the time of issue and therefore represent a snapshot. This creates a gap between the common certification of one to three years and the high dynamics of the market for cloud services and providers.

The proposed dynamic certification approach adopts the common certification process to the increased flexibility and dynamics of cloud computing environments through using automation potential of security controls and continuous proof of the certification status [180].

It seems likely that such ongoing dynamic evaluation of cloud suppliers would require something akin to Simian Army techniques, discussed below.
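As a sketch of what such “continuous proof” might imply in practice, the following Python fragment evaluates machine-readable control evidence from a supplier and flags anything failing or stale; the evidence shape, control identifiers, and seven-day freshness window are assumptions for illustration, not part of any certification scheme.

  # Minimal sketch (hypothetical evidence feed): instead of relying on a
  # point-in-time certificate, periodically evaluate machine-readable control
  # evidence from a supplier and flag anything stale or failing.
  from dataclasses import dataclass
  from datetime import datetime, timedelta, timezone


  @dataclass
  class ControlEvidence:
      control_id: str
      status: str              # "pass" or "fail"
      collected_at: datetime


  def assess_supplier(evidence: list[ControlEvidence],
                      max_age: timedelta = timedelta(days=7)) -> list[str]:
      """Return findings: failed controls and evidence older than max_age."""
      now = datetime.now(timezone.utc)
      findings = []
      for e in evidence:
          if e.status != "pass":
              findings.append(f"{e.control_id}: failing")
          elif now - e.collected_at > max_age:
              findings.append(f"{e.control_id}: evidence stale")
      return findings


  if __name__ == "__main__":
      sample = [
          ControlEvidence("ENCRYPTION-AT-REST", "pass",
                          datetime.now(timezone.utc) - timedelta(days=2)),
          ControlEvidence("BACKUP-RESTORE-TEST", "pass",
                          datetime.now(timezone.utc) - timedelta(days=30)),
      ]
      print(assess_supplier(sample))  # ['BACKUP-RESTORE-TEST: evidence stale']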

Beyond increasing supply-side dynamism, risk management in a full Service Integration and Management (SIAM) sense is compounded by the complex interdependencies of the services involved. All of the cloud contracting risks need to be covered, as well as further questions such as:

  • If a given service depends on two sub-services (“underpinning contracts”), what are the risks for the failure of either or both of the underpinning services?

  • What are the controls?

Automating Digital Governance

Digital Exhaust

One governance principle we will suggest here is to develop a governance architecture as an inherent part of the delivery system, not as an additional set of activities. We use the concept of “digital exhaust” to reinforce this.

Digital exhaust, for the purposes of this document, consists of the extraneous data originating from the development and delivery of IT services, and the information that can be gleaned from that data.

Consider an automobile’s exhaust. It does not help you get to where you are going, but it is an inevitable aspect of having an internal combustion engine. Since you have it, you can monitor it and gain certain insights as to whether your engine is running efficiently and effectively. You might even be able to identify if you are at risk of an engine failure.

The term “digital exhaust” is also applied to the data generated from the IoT. This usage is conceptually aligned to our usage here, but somewhat different in emphasis.

To leverage digital exhaust, focus on the critical, always-present systems that enable digital delivery.

These systems constitute a core digital pipeline, one that can be viewed as an engine producing digital exhaust. This is in contrast to fragmented, poorly automated pipelines, or organizations with little concept of pipeline at all. Such organizations end up relying on secondary artifacts and manual processes to deliver digital value.

Figure 129. Governance Based on Activities and Artifacts

The illustration in Governance Based on Activities and Artifacts represents fragmented delivery pipelines, with many manual processes, activities, and secondary artifacts (waterfall stage approvals, designs, plans, manual ticketing, and so forth). Much IT governance assumes this model, and also assumes that governance must often rely on aggregating and monitoring the secondary artifacts.

In contrast (see Governance of Digital Exhaust), with a rationalized continuous delivery pipeline, governance increasingly can focus on monitoring the digital exhaust.

Figure 130. Governance of Digital Exhaust

What can we monitor with digital exhaust for the purposes of governance?

  • Development team progress against backlog

  • Configuration management

  • Conformance to architectural standards (through inspection of source and package managers, code static analysis, and other techniques)

  • Complexity and technical debt

  • Performance and resource consumption of services

  • Performance against standards as exercised by automated hardening activities (e.g., Simian Army)

As noted above, certain governance objectives may require the pipeline itself to be adapted; e.g., the addition of static code analysis, or implementation of hardening tools such as Simian Army.
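As an illustration, the following Python sketch aggregates hypothetical pipeline events into a few governance-relevant signals; the event shape and metric names are assumptions, not a standard schema.

  # Minimal sketch (assumed event shape): aggregate "digital exhaust" events
  # emitted by the pipeline into a simple governance summary, rather than
  # collecting secondary artifacts by hand.
  def summarize_exhaust(events: list[dict]) -> dict:
      """Roll up pipeline events into a few governance-relevant signals."""
      builds = [e for e in events if e["type"] == "build"]
      deploys = [e for e in events if e["type"] == "deploy"]
      scans = [e for e in events if e["type"] == "static_analysis"]
      return {
          "build_failure_rate": (
              sum(1 for b in builds if not b["success"]) / len(builds)
              if builds else 0.0
          ),
          "deploys": len(deploys),
          "open_critical_findings": sum(e["critical_findings"] for e in scans),
          "teams_reporting": len({e["team"] for e in events}),
      }


  if __name__ == "__main__":
      events = [
          {"type": "build", "team": "payments", "success": True},
          {"type": "build", "team": "payments", "success": False},
          {"type": "deploy", "team": "payments"},
          {"type": "static_analysis", "team": "catalog", "critical_findings": 3},
      ]
      print(summarize_exhaust(events))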

Additional Automation

The DevOps Audit Defense Toolkit provides an audit perspective on pipeline automation [85]. It presents an important set of examples demonstrating how modern DevOps toolchain automation can fulfill audit objectives as well as or better than “traditional” approaches. This includes a discussion of alternative approaches to the traditional control of “separation of duties” for building and deploying code, such as automated code analysis and peer review as required controls.
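Here is a minimal sketch of such a compensating control, assuming a hypothetical record of merged changes and their approvers; the point is that independent approval can be verified automatically rather than enforced through manual separation of duties.

  # Minimal sketch (assumed review-record shape): an automated check that every
  # change merged to the mainline was approved by at least one reviewer who is
  # not its author, as a compensating control for classic separation of duties.
  from dataclasses import dataclass, field


  @dataclass
  class Change:
      change_id: str
      author: str
      approvers: list[str] = field(default_factory=list)


  def separation_of_duties_violations(changes: list[Change]) -> list[str]:
      """Return IDs of changes lacking an independent approval."""
      return [
          c.change_id for c in changes
          if not any(a != c.author for a in c.approvers)
      ]


  if __name__ == "__main__":
      merged = [
          Change("PR-101", author="alice", approvers=["bob"]),
          Change("PR-102", author="carol", approvers=["carol"]),  # self-approved
          Change("PR-103", author="dave", approvers=[]),          # no approval
      ]
      print(separation_of_duties_violations(merged))  # ['PR-102', 'PR-103']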

There are a variety of ways the IT pipeline can be automated. Many components are seen in real-world pipelines:

  • Source repositories

  • Build managers

  • Static code analyzers

  • Automated user interface testing

  • Load testing

  • Package managers

  • More sophisticated continuous deployment infrastructure

and much more.

Additionally, there may still be a need for systems that are secondary to the core pipeline:

  • Service or product portfolio

  • Workflow and Kanban-based systems (one notable example is workflow to ensure peer review of code)

  • Document management

There may also be a risk repository if the case can’t be made to track risks using some other system. The important thing to remember when automating risk management is that risks are always with respect to some thing.

A risk repository needs to be integrated with subject inventories, such as the Service Portfolio, relevant source repositories, and entries in the package manager. Otherwise, risk management will remain an inefficient, highly manual process; a minimal sketch of such linkage follows the list of risk subjects below.

What are the things that may present risks?

  • Products/services

    • Their ongoing delivery

    • Their changes and transformations (releases)

    • Their revenues

  • Customers and their data

  • Employees and their positions

  • Assets

  • Vendors

  • Other critical information
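As a hedged sketch of the integration argued for above, the following Python fragment models risk entries that must reference a subject held in a managed inventory; the Subject and Risk structures are illustrative, not a reference data model.

  # Minimal sketch (assumed inventory shapes): a risk repository entry that must
  # reference a subject in an existing inventory (service portfolio, vendor
  # register, and so on), so risks are never tracked "about nothing".
  from dataclasses import dataclass


  @dataclass(frozen=True)
  class Subject:
      subject_type: str   # e.g., "service", "vendor", "asset"
      subject_id: str


  @dataclass
  class Risk:
      risk_id: str
      description: str
      subject: Subject
      severity: str       # e.g., "low", "medium", "high"


  def validate_risks(risks: list[Risk], inventories: set[Subject]) -> list[str]:
      """Flag risks whose subject is not found in any managed inventory."""
      return [r.risk_id for r in risks if r.subject not in inventories]


  if __name__ == "__main__":
      inventories = {
          Subject("service", "online-checkout"),
          Subject("vendor", "acme-cloud"),
      }
      risks = [
          Risk("R-1", "checkout dependency on a single region",
               Subject("service", "online-checkout"), "high"),
          Risk("R-2", "orphan risk with no inventoried subject",
               Subject("service", "unknown-app"), "medium"),
      ]
      print(validate_risks(risks, inventories))  # ['R-2']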

Evidence of Notability

There is substantial friction between classical governance concerns and new digital operating models. See Jez Humble’s Lean Enterprise for one discussion [138]. The DevOps Audit Defense Toolkit provides another window into this topic [85].

Limitations

The Lean and Agile challenge to governance arises primarily in R&D-centric environments. Classic governance practices are well suited for purely operational environments, where they have been honed over decades.

Related Topics


1. Credit: Brian Barnier used this analogy at an ISACA meeting around 2011.
2. Synthesized from various sources including ISO 38500 and COBIT.