10. Phase C: Information Systems Architectures - Data Architecture
This chapter describes the Data Architecture part of Phase C.
10.1 Objectives
The objectives of the Data Architecture part of Phase C are to:
- Develop the Target Data Architecture that enables the Business Architecture and the Architecture Vision, while addressing the Request for Architecture Work and stakeholder concerns
- Identify candidate Architecture Roadmap components based upon gaps between the Baseline and Target Data Architectures
10.2 Approach
10.2.1 Key Considerations for Data Architecture
10.2.1.1 Data Management
When an enterprise has chosen to undertake large-scale architectural transformation, it is important to understand and address data management issues. A structured and comprehensive approach to data management enables the effective use of data to capitalize on its competitive advantages.
Considerations include:
- A clear definition of which application components in the landscape will serve as the system of record or reference for enterprise master data
- Will there be an enterprise-wide standard that all application components, including software packages, need to adopt? (In the main, packages can be prescriptive about their data models and may not be flexible.)
- Clearly understand how data entities are utilized by business functions, processes, and services
- Clearly understand how and where enterprise data entities are created, stored, transported, and reported
- What is the level and complexity of data transformations required to support the information exchange needs between applications?
- What will be the requirement for software in supporting data integration with the enterprise's customers and suppliers (e.g., use of ETL tools during the data migration, data profiling tools to evaluate data quality, etc.)?
10.2.1.2 Data Migration
When an existing application is replaced, there will be a critical need to migrate data (master, transactional, and reference) to the new application. The Data Architecture should identify data migration requirements and also provide indicators as to the level of transformation, weeding, and cleansing that will be required to present data in a format that meets the requirements and constraints of the target application. The objective is for the target application to hold quality data once it is populated. Another key consideration is to ensure that an enterprise-wide common data definition is established to support the transformation.
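As an illustration only, the transformation, weeding, and cleansing described above might be sketched as follows. The `CustomerRecord` definition, the legacy field names (`id`, `name`, `country`), and the rejection rules are all assumptions made for this example; a real migration would take them from the enterprise-wide common data definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    """Hypothetical common data definition for a customer entity."""
    customer_id: str
    name: str
    country: str


def cleanse(raw_rows):
    """Transform legacy rows into the common definition.

    Returns (clean_records, rejected_rows). Rows are rejected (weeded)
    when the key or name is missing, or when the key was already seen.
    """
    clean, rejected, seen = [], [], set()
    for row in raw_rows:
        # Cleansing: trim whitespace, collapse inner spaces, normalize case.
        cust_id = (row.get("id") or "").strip()
        name = " ".join((row.get("name") or "").split())
        country = (row.get("country") or "").strip().upper()
        # Weeding: drop incomplete rows and duplicate keys.
        if not cust_id or not name or cust_id in seen:
            rejected.append(row)
            continue
        seen.add(cust_id)
        clean.append(CustomerRecord(cust_id, name, country))
    return clean, rejected


# Illustrative legacy extract (made-up data).
legacy_rows = [
    {"id": " C1 ", "name": "  Acme   Ltd ", "country": "gb"},
    {"id": "C1", "name": "Acme Ltd", "country": "GB"},  # duplicate key
    {"id": "", "name": "No Key", "country": "US"},      # missing key
    {"id": "C2", "name": "Globex", "country": None},
]
clean_records, rejected_rows = cleanse(legacy_rows)
```

The ratio of rejected to clean rows is one possible indicator of the level of cleansing that the migration will require.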
10.2.1.3 Data Governance
Data governance considerations ensure that the enterprise has the necessary dimensions in place to enable the transformation, as follows:
- Structure: This dimension pertains to whether the enterprise has the necessary organizational structure and the standards bodies to manage data entity aspects of the transformation.
- Management System: Here enterprises should have the necessary management system and data-related programs to manage the governance aspects of data entities throughout their lifecycle.
- People: This dimension addresses what data-related skills and roles the enterprise requires for the transformation. If the enterprise lacks such resources and skills, the enterprise should consider either acquiring those critical skills or training existing internal resources to meet the requirements through a well-defined learning program.
10.2.2 Architecture Repository
As part of this phase, the architecture team will need to consider what relevant Data Architecture resources are available in the organization's Architecture Repository (see Part V, 41. Architecture Repository), in particular, generic data models relevant to the organization's industry "vertical" sector. For example:
- ARTS has defined a data model for the Retail industry.
- Energistics has defined a data model for the Petrotechnical industry.
10.3 Inputs
This section defines the inputs to Phase C (Data Architecture).
10.3.1 Reference Materials External to the Enterprise
- Architecture reference materials (see Part IV, 36.2.5 Architecture Repository)
10.3.2 Non-Architectural Inputs
- Request for Architecture Work (see Part IV, 36.2.17 Request for Architecture Work)
- Capability Assessment (see Part IV, 36.2.10 Capability Assessment)
- Communications Plan (see Part IV, 36.2.12 Communications Plan)
10.3.3 Architectural Inputs
- Organizational Model for Enterprise Architecture (see Part IV, 36.2.16 Organizational Model for Enterprise Architecture), including:
- Scope of organizations impacted
- Maturity assessment, gaps, and resolution approach
- Roles and responsibilities for architecture team(s)
- Constraints on architecture work
- Budget requirements
- Governance and support strategy
- Tailored Architecture Framework (see Part IV, 36.2.21 Tailored Architecture Framework), including:
- Tailored architecture method
- Tailored architecture content (deliverables and artifacts)
- Configured and deployed tools
- Data principles (see Part III, 23.6.2 Data Principles), if existing
- Statement of Architecture Work (see Part IV, 36.2.20 Statement of Architecture Work)
- Architecture Vision (see Part IV, 36.2.8 Architecture Vision)
- Architecture Repository (see Part IV, 36.2.5 Architecture Repository), including:
- Re-usable building blocks (in particular, definitions of current data)
- Publicly available reference models
- Organization-specific reference models
- Organization standards
- Draft Architecture Definition Document (see Part IV, 36.2.3 Architecture Definition Document), including:
- Baseline Business Architecture, Version 1.0 (detailed), if appropriate
- Target Business Architecture, Version 1.0 (detailed)
- Baseline Data Architecture, Version 0.1, if available
- Target Data Architecture, Version 0.1, if available
- Baseline Application Architecture, Version 1.0 (detailed) or Version 0.1 (Vision)
- Target Application Architecture, Version 1.0 (detailed) or Version 0.1 (Vision)
- Baseline Technology Architecture, Version 0.1 (Vision)
- Target Technology Architecture, Version 0.1 (Vision)
- Draft Architecture Requirements Specification (see Part IV, 36.2.6 Architecture Requirements Specification), including:
- Gap analysis results (from Business Architecture)
- Relevant technical requirements that will apply to this phase
- Business Architecture components of an Architecture Roadmap (see Part IV, 36.2.7 Architecture Roadmap)
10.4 Steps
The level of detail addressed in Phase C will depend on the scope and goals of the overall architecture effort.
New data building blocks being introduced as part of this effort will need to be defined in detail during Phase C. Existing data building blocks to be carried over and supported in the target environment may already have been adequately defined in previous architectural work; but, if not, they too will need to be defined in Phase C.
The order of the steps in this phase (see below) as well as the time at which they are formally started and completed should be adapted to the situation at hand in accordance with the established architecture governance. In particular, determine whether in this situation it is appropriate to conduct Baseline Description or Target Architecture development first, as described in Part III, 19. Applying Iteration to the ADM.
All activities that have been initiated in these steps must be closed during the Finalize the Data Architecture step (see 10.4.8 Finalize the Data Architecture). The documentation generated from these steps must be formally published in the Create Architecture Definition Document step (see 10.4.9 Create Architecture Definition Document).
The steps in Phase C (Data Architecture) are as follows:
- 10.4.1 Select Reference Models, Viewpoints, and Tools
- 10.4.2 Develop Baseline Data Architecture Description
- 10.4.3 Develop Target Data Architecture Description
- 10.4.4 Perform Gap Analysis
- 10.4.5 Define Candidate Roadmap Components
- 10.4.6 Resolve Impacts Across the Architecture Landscape
- 10.4.7 Conduct Formal Stakeholder Review
- 10.4.8 Finalize the Data Architecture
- 10.4.9 Create Architecture Definition Document
10.4.1 Select Reference Models, Viewpoints, and Tools
Review and validate (or generate, if necessary) the set of data principles. These will normally form part of an overarching set of architecture principles. Guidelines for developing and applying principles, and a sample set of data principles, are given in Part III, 23. Architecture Principles.
Select relevant Data Architecture resources (reference models, patterns, etc.) on the basis of the business drivers, stakeholders, concerns, and Business Architecture.
Select relevant Data Architecture viewpoints (for example, stakeholders of the data - regulatory bodies, users, generators, subjects, auditors, etc.; various time dimensions - real-time, reporting period, event-driven, etc.; locations; business processes); i.e., those that will enable the architect to demonstrate how the stakeholder concerns are being addressed in the Data Architecture.
Identify appropriate tools and techniques (including forms) to be used for data capture, modeling, and analysis, in association with the selected viewpoints. Depending on the degree of sophistication warranted, these may comprise simple documents or spreadsheets, or more sophisticated modeling tools and techniques such as data management models, data models, etc. Examples of data modeling techniques are:
- Entity-relationship diagrams
- Class diagrams
10.4.1.1 Determine Overall Modeling Process
For each viewpoint, select the models needed to support the specific view required, using the selected tool or method.
Ensure that all stakeholder concerns are covered. If they are not, create new models to address concerns not covered, or augment existing models (see above).
The recommended process for developing a Data Architecture is as follows:
- Collect data-related models from existing Business Architecture and Application Architecture materials
- Rationalize data requirements and align with any existing enterprise data catalogs and models; this allows the development of a data inventory and an entity-relationship model
- Update and develop matrices across the architecture by relating data to business service, business function, access rights, and application
- Elaborate Data Architecture views by examining how data is created, distributed, migrated, secured, and archived
10.4.1.2 Identify Required Catalogs of Data Building Blocks
The organization's data inventory is captured as a catalog within the Architecture Repository. Catalogs are hierarchical in nature and capture a decomposition of a metamodel entity and also decompositions across related model entities (e.g., logical data component -> physical data component -> data entity).
Catalogs form the raw material for development of matrices and diagrams and also act as a key resource for portfolio managing business and IT capability.
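The hierarchical nature of a catalog can be pictured with a small data structure. The following is a minimal sketch, assuming a nested mapping that follows the decomposition given above (logical data component, then physical data component, then data entity). The component and entity names are invented for the example and are not part of the metamodel.

```python
# Hypothetical catalog: logical data component -> physical data component
# -> list of data entities. All names are illustrative.
catalog = {
    "Customer Management": {
        "CRM Database": ["Customer", "Contact"],
        "Marketing Data Mart": ["Customer"],
    },
    "Finance": {
        "Ledger Database": ["Invoice"],
    },
}


def flatten_catalog(catalog):
    """Return one (logical, physical, entity) row per catalog entry."""
    rows = []
    for logical, physicals in catalog.items():
        for physical, entities in physicals.items():
            for entity in entities:
                rows.append((logical, physical, entity))
    return rows


rows = flatten_catalog(catalog)
```

Flat rows of this kind are a convenient starting point for building matrices, since each row already relates one entity to the components that hold it.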
During the Business Architecture phase, a Business Service/Information diagram was created showing the key data entities required by the main business services. This is a prerequisite to successful Data Architecture activities.
Using the traceability from application to business function to data entity inherent in the content framework, it is possible to create an inventory of the data needed to be in place to support the Architecture Vision.
Once the data requirements are consolidated in a single location, it is possible to refine the data inventory to achieve semantic consistency and to remove gaps and overlaps.
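One way to picture this consolidation is to walk the traceability links and collect, for each data entity, every application and business function that relies on it. The sketch below assumes two simple mappings with hypothetical names; the content framework does not prescribe this representation.

```python
# Hypothetical traceability links (illustrative names only).
app_to_functions = {
    "CRM": ["Sales", "Customer Service"],
    "Billing": ["Invoicing"],
}
function_to_entities = {
    "Sales": ["Customer", "Order"],
    "Customer Service": ["Customer", "Case"],
    "Invoicing": ["Customer", "Invoice", "Order"],
}


def build_data_inventory(app_to_functions, function_to_entities):
    """Map each data entity to the (application, function) pairs that need it."""
    inventory = {}
    for app, functions in app_to_functions.items():
        for function in functions:
            for entity in function_to_entities.get(function, []):
                inventory.setdefault(entity, set()).add((app, function))
    return inventory


inventory = build_data_inventory(app_to_functions, function_to_entities)
```

Because each entity appears once as a key, overlapping requirements from different applications end up in a single entry, which makes overlaps and inconsistent naming easier to spot.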
The following catalogs should be considered for development within a Data Architecture:
- Data Entity/Data Component catalog
The structure of catalogs is based on the attributes of metamodel entities, as defined in Part IV, 34. Content Metamodel.
10.4.1.3 Identify Required Matrices
Matrices show the core relationships between related model entities.
Matrices form the raw material for development of diagrams and also act as a key resource for impact assessment.
At this stage, an entity-to-applications matrix could be produced to validate this mapping. How data is created, maintained, transformed, and passed to other applications, or used by other applications, will now start to be understood. Obvious gaps, such as entities that never seem to be created by an application or data that is created but never used, need to be noted for later gap analysis.
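The check for obvious gaps can be sketched in a few lines. This example assumes the matrix records CRUD-style operations (C, R, U, D) per entity and application, and it treats "used" as "read by at least one application"; both conventions are assumptions for illustration rather than requirements of the method.

```python
# Hypothetical entity-to-applications matrix: entity -> {application: operations}.
# Operations use CRUD letters: C(reate), R(ead), U(pdate), D(elete).
matrix = {
    "Customer": {"CRM": "CRU", "Billing": "R"},
    "Invoice": {"Billing": "CU"},
    "Product": {"CRM": "R"},
}


def find_gaps(matrix):
    """Return (never_created, created_but_never_used) entity lists."""
    never_created, never_used = [], []
    for entity, usage in matrix.items():
        ops = "".join(usage.values())
        if "C" not in ops:
            never_created.append(entity)   # no application creates it
        elif "R" not in ops:
            never_used.append(entity)      # created, but nobody reads it
    return sorted(never_created), sorted(never_used)


never_created, never_used = find_gaps(matrix)
```

With the sample data, `Product` is reported as never created and `Invoice` as created but never used; both would be carried forward as candidates for the later gap analysis.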
The rationalized data inventory can be used to update and refine the architectural diagrams of how data relates to other aspects of the architecture.
Once these updates have been made, it may be appropriate to drop into a short iteration of Application Architecture to resolve the changes identified.
The following matrices should be considered for development within a Data Architecture:
- Data Entity/Business Function (showing which data supports which functions and which business function owns which data)
- Business Service/Information (developed during the Business Architecture phase)
- Application/Data (developed across the Application Architecture and Data Architecture phases)
The structure of matrices is based on the attributes of metamodel entities, as defined in Part IV, 34. Content Metamodel.
10.4.1.4 Identify Required Diagrams
Diagrams present the Data Architecture information from a set of different perspectives (viewpoints) according to the requirements of the stakeholders.
Once the data entities have been refined, a diagram of the relationships between entities and their attributes can be produced.
It is important to note at this stage that information may be a mixture of enterprise-level data (from system service providers and package vendor information) and local-level data held in personal databases and spreadsheets.
The level of detail modeled needs to be carefully assessed. Some physical system data models will exist down to a very detailed level; others will only have core entities modeled. Not all data models will have been kept up-to-date as applications were modified and extended over time. It is important to achieve a balance in the level of detail provided; reproducing existing detailed system physical data schemas and presenting only high-level process maps and data requirements are the two extremes.
The following diagrams should be considered for development within a Data Architecture:
- Conceptual Data diagram
- Logical Data diagram
- Data Dissemination diagram
- Data Lifecycle diagram
- Data Security diagram
- Data Migration diagram
10.4.1.5 Identify Types of Requirement to be Collected
Once the Data Architecture catalogs, matrices, and diagrams have been developed, architecture modeling is completed by formalizing the data-focused requirements for implementing the Target Architecture.
These requirements may:
- Relate to the data domain
- Provide requirements input into the Application and Technology Architectures
- Provide detailed guidance to be reflected during design and implementation to ensure that the solution addresses the original architecture requirements
Within this step, the architect should identify requirements that should be met by the architecture (see 17.2.2 Requirements Development).
10.4.2 Develop Baseline Data Architecture Description
Develop a Baseline Description of the existing Data Architecture, to the extent necessary to support the Target Data Architecture. The scope and level of detail to be defined will depend on the extent to which existing data elements are likely to be carried over into the Target Data Architecture, and on whether architectural descriptions exist, as described in 10.2 Approach. To the extent possible, identify the relevant Data Architecture building blocks, drawing on the Architecture Repository (see Part V, 41. Architecture Repository).
Where new architecture models need to be developed to satisfy stakeholder concerns, use the models identified within Step 1 as a guideline for creating new architecture content to describe the Baseline Architecture.
10.4.3 Develop Target Data Architecture Description
Develop a Target Description for the Data Architecture, to the extent necessary to support the Architecture Vision and Target Business Architecture. The scope and level of detail to be defined will depend on the relevance of the data elements to attaining the Target Architecture, and on whether architectural descriptions exist. To the extent possible, identify the relevant Data Architecture building blocks, drawing on the Architecture Repository (see Part V, 41. Architecture Repository).
Where new architecture models need to be developed to satisfy stakeholder concerns, use the models identified within Step 1 as a guideline for creating new architecture content to describe the Target Architecture.
10.4.4 Perform Gap Analysis
Verify the architecture models for internal consistency and accuracy:
- Perform trade-off analysis to resolve conflicts (if any) among the different views
- Validate that the models support the principles, objectives, and constraints
- Note changes to the viewpoint represented in the selected models from the Architecture Repository, and document
- Test architecture models for completeness against requirements
Identify gaps between the baseline and target, using the Gap Analysis technique as described in Part III, 27. Gap Analysis.
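As a simplified sketch of the idea (not a substitute for the Gap Analysis technique referenced above), the comparison can be expressed as set operations over the names of data building blocks. The block names are hypothetical, and the sketch assumes a block counts as "the same" when its name matches in both architectures.

```python
def gap_analysis(baseline, target):
    """Classify building blocks as carried over, eliminated, or new."""
    baseline, target = set(baseline), set(target)
    return {
        "carried_over": sorted(baseline & target),  # present in both
        "eliminated": sorted(baseline - target),    # dropped from the target
        "new": sorted(target - baseline),           # to be developed or procured
    }


# Illustrative building blocks (made-up names).
baseline_blocks = ["Customer Master", "Legacy Order Store", "Product Catalog"]
target_blocks = ["Customer Master", "Order Hub", "Product Catalog", "Consent Register"]

gaps = gap_analysis(baseline_blocks, target_blocks)
```

In practice, each eliminated block should be reviewed to confirm that the omission is intentional, and each new block becomes an input to the candidate roadmap components.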
10.4.5 Define Candidate Roadmap Components
Following the creation of a Baseline Architecture, Target Architecture, and gap analysis, a data roadmap is required to prioritize activities over the coming phases.
This initial Data Architecture roadmap will be used as raw material to support more detailed definition of a consolidated, cross-discipline roadmap within the Opportunities & Solutions phase.
10.4.6 Resolve Impacts Across the Architecture Landscape
Once the Data Architecture is finalized, it is necessary to understand any wider impacts or implications.
At this stage, other architecture artifacts in the Architecture Landscape should be examined to answer the following questions:
- Does this Data Architecture create an impact on any pre-existing architectures?
- Have recent changes been made that impact the Data Architecture?
- Are there any opportunities to leverage work from this Data Architecture in other areas of the organization?
- Does this Data Architecture impact other projects (including those planned as well as those currently in progress)?
- Will this Data Architecture be impacted by other projects (including those planned as well as those currently in progress)?
10.4.7 Conduct Formal Stakeholder Review
Check the original motivation for the architecture project and the Statement of Architecture Work against the proposed Data Architecture. Conduct an impact analysis to identify any areas where the Business and Application Architectures (e.g., business practices) may need to change to cater for changes in the Data Architecture (for example, changes to forms or procedures, applications, or database systems).
If the impact is significant, this may warrant the Business and Application Architectures being revisited.
Identify any areas where the Application Architecture (if generated at this point) may need to change to cater for changes in the Data Architecture (or to identify constraints on the Application Architecture about to be designed).
If the impact is significant, it may be appropriate to drop into a short iteration of the Application Architecture at this point.
Identify any constraints on the Technology Architecture about to be designed, refining the proposed Data Architecture only if necessary.
10.4.8 Finalize the Data Architecture
- Select standards for each of the building blocks, re-using as much as possible from the reference models selected from the Architecture Repository
- Fully document each building block
- Conduct final cross-check of overall architecture against business requirements; document rationale for building block decisions in the architecture document
- Document final requirements traceability report
- Document final mapping of the architecture within the Architecture Repository; from the selected building blocks, identify those that might be re-used, and publish via the Architecture Repository
- Finalize all the work products, such as gap analysis
10.4.9 Create Architecture Definition Document
Document rationale for building block decisions in the Architecture Definition Document.
Prepare Data Architecture sections of the Architecture Definition Document, comprising some or all of:
- Business data model
- Logical data model
- Data management process model
- Data Entity/Business Function matrix
- Data interoperability requirements (e.g., XML schema, security policies)
- If appropriate, use reports and/or graphics generated by modeling tools to demonstrate key views of the architecture; route the document for review by relevant stakeholders, and incorporate feedback
10.5 Outputs
The outputs of Phase C (Data Architecture) may include, but are not restricted to:
- Refined and updated versions of the Architecture Vision phase deliverables, where applicable:
- Statement of Architecture Work (see Part IV, 36.2.20 Statement of Architecture Work), updated if necessary
- Validated data principles (see Part III, 23.6.2 Data Principles), or new data principles (if generated here)
- Draft Architecture Definition Document (see Part IV, 36.2.3 Architecture Definition Document), including:
- Baseline Data Architecture, Version 1.0, if appropriate
- Target Data Architecture, Version 1.0
- Business data model
- Logical data model
- Data management process models
- Data Entity/Business Function matrix
- Views corresponding to the selected viewpoints addressing key stakeholder concerns
- Draft Architecture Requirements Specification (see Part IV, 36.2.6 Architecture Requirements Specification), including such Data Architecture requirements as:
- Gap analysis results
- Data interoperability requirements
- Relevant technical requirements that will apply to this evolution of the architecture development cycle
- Constraints on the Technology Architecture about to be designed
- Updated business requirements, if appropriate
- Updated application requirements, if appropriate
- Data Architecture components of an Architecture Roadmap (see Part IV, 36.2.7 Architecture Roadmap)
The outputs may include some or all of the following:
- Catalogs:
- Data Entity/Data Component catalog
- Matrices:
- Data Entity/Business Function matrix
- Application/Data matrix
- Diagrams:
- Conceptual Data diagram
- Logical Data diagram
- Data Dissemination diagram
- Data Security diagram
- Data Migration diagram
- Data Lifecycle diagram