This article covers –
- Overall understanding of the domain
- Important concepts to focus on from exam point of view
The article is split into 10 parts as below:
- Part 1 – Information Systems operations, Management of IS operations, ITSM
- Part 2 – Service Level Agreements, Operational Level Agreements, Incident and problem Management process
- Part 3 – Roles and responsibilities of support/help desk, Change management, Patch management and release management.
- Part 4 – Quality Assurance (QA) and Overview of DBMS and DBMS architecture
- Part 5 – Data dictionary/Directory system, Database structure, OSI Architecture
- Part 6 – Application of OSI Model in Network Architecture, LAN topology, LAN components
- Part 7 – WAN components, WAN topology, Network performance metrics
- Part 8 – Network Management issues, Network Management tool and Overview of Disaster Recovery Planning (DRP)
- Part 9 – Overview of Recovery Point Objective (RPO) and Recovery Time Objective (RTO), additional parameters in defining recovery strategies and various types of recovery strategies
- Part 10 – Different recovery/Continuity/response teams and their responsibilities, overview on back-up and restoration and the various disaster recovery testing methods
|PART 1 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- Overall understanding of Domain 4
- What is information Systems operations?
- What are the ways of managing IS operations?
- What is IT service Management Framework (ITSM)?
Overall understanding of the domain:
Weightage – This domain constitutes 20 percent of the CISA exam (approximately 30 questions)
The domain covers 23 knowledge statements:
- Knowledge of service management frameworks
- Knowledge of service management practices and service level management
- Knowledge of techniques for monitoring third-party performance and compliance with service agreements and regulatory requirements
- Knowledge of enterprise architecture (EA)
- Knowledge of the functionality of fundamental technology (e.g., hardware and network components, system software, middleware, database management systems)
- Knowledge of system resiliency tools and techniques (e.g., fault tolerant hardware, elimination of single point of failure, clustering)
- Knowledge of IT asset management, software licensing, source code management and inventory practices
- Knowledge of job scheduling practices, including exception handling
- Knowledge of control techniques that ensure the integrity of system interfaces
- Knowledge of capacity planning and related monitoring tools and techniques
- Knowledge of systems performance monitoring processes, tools and techniques (e.g., network analyzers, system utilization reports, load balancing)
- Knowledge of data backup, storage, maintenance and restoration practices
- Knowledge of database management and optimization practices
- Knowledge of data quality (completeness, accuracy, integrity) and life cycle management (aging, retention)
- Knowledge of problem and incident management practices
- Knowledge of change management, configuration management, release management and patch management practices
- Knowledge of operational risks and controls related to end-user computing
- Knowledge of regulatory, legal, contractual and insurance issues related to disaster recovery
- Knowledge of business impact analysis (BIA) related to disaster recovery planning
- Knowledge of the development and maintenance of disaster recovery plans (DRPs)
- Knowledge of benefits and drawbacks of alternate processing sites (e.g., hot sites, warm sites, cold sites)
- Knowledge of disaster recovery testing methods.
- Knowledge of processes used to invoke the disaster recovery plans (DRPs)
Important concepts from exam point of view:
1.Information Systems operations:
- Responsible for ongoing support of an organization's computer and IS environment
- Plays a critical role in ensuring that computer operations processing requirements are met, end users are satisfied and information is processed securely
2.Management of IS operations:
The COBIT 5 framework makes a clear distinction between governance and management:
- Governance ensures that stakeholder needs, conditions and options are evaluated to determine balanced, agreed-on enterprise objectives; sets direction through prioritization and decision making; and monitors performance and compliance against the agreed-on direction and objectives.
- Overall governance is the responsibility of the board of directors under the leadership of the chairperson.
- Specific governance responsibilities may be delegated to special organizational structures at an appropriate level, particularly in larger, complex enterprises.
- Management plans, builds, runs and monitors activities in alignment with the direction set by the governance body to achieve the enterprise objectives
- Management is the responsibility of the executive management under the leadership of the chief executive officer (CEO).
- IS management has the overall responsibility for all operations within the IT department
3.IT Service Management framework (ITSM):
Refers to the implementation and management of IT services (people, process and information technology) to meet business needs
Two frameworks for ITSM:
- IT Infrastructure Library (ITIL):
- a reference body of knowledge for service delivery good practices
- a comprehensive framework detailed over five volumes – Service strategy, Service design, Service transition, Service operation, Continual service improvement
- The main objective of ITIL is to improve service quality to the business.
- ISO 20000-1:2011 Information technology – Service management
- Requires service providers to implement the plan-do-check-act (PDCA) methodology
- The main objective is to improve service quality. Achieving the standard certifies that an organization has passed an audit of its ITSM practices and processes.
|PART 2 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What are Service Level Agreements (SLAs) and Operational Level Agreements (OLAs)?
- What are the tools to monitor efficiency and effectiveness of services provided?
- Exception reports
- Operator problem reports
- System and application logs
- Operator work schedule
- What is incident management and problem management?
4.Service Level Agreement and Operational Level Agreement:
- Service Level Agreement:
- The Service Level agreement is a contract between service provider and customer
- SLAs can also be supported by operational level agreements (OLAs)
- Operational Level Agreement:
- OLA is an agreement between the internal support groups of an institution that supports SLA
- The OLA clearly depicts the performance and relationship of the internal service groups.
- The main objective of an OLA is to ensure that all the support groups together deliver the service levels agreed in the SLA
5.Tools to monitor efficiency and effectiveness of services provided:
- Exception reports:
- These automated reports identify all applications that did not successfully complete or otherwise malfunctioned.
- An excessive number of exceptions may indicate:
- Poor understanding of business requirements
- Poor application design, development or testing
- Inadequate operation instructions
- Inadequate operations support
- Inadequate operator training or performance monitoring
- Inadequate sequencing of tasks
- Inadequate system configuration
- Inadequate capacity management
- System and application logs:
- Refers to logs generated from various systems and applications
- Using these logs, the auditor can carry out tests to ensure that:
- Only approved programs access sensitive data
- Only authorized IT personnel access sensitive data
- Software utilities that can alter data files and program libraries are used only for authorized purposes
- Approved programs are run only when scheduled and, conversely, that unauthorized runs do not take place
- The correct data file generation is accessed for production purposes
- Data files are adequately protected
- Operator problem reports – Manual report used by helpdesk to log computer operations problems & resolutions
- Operator work schedules – Report maintained manually by IS management to assist in human resource planning to ensure proper staffing of operation support
|Points to remember:
o Availability reports – The report that an IS auditor uses to check compliance with the service level agreement (SLA) requirement for uptime
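As a rough illustration of how an availability report relates to an SLA uptime requirement, the sketch below computes availability as a percentage of the reporting period and compares it with a target. The function names, the 30-day period and the 99.9 percent target are assumptions chosen for illustration, not figures from any particular SLA.

```python
def availability_percent(period_minutes: float, downtime_minutes: float) -> float:
    """Return the percentage of the reporting period during which the service was up."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    if not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime_minutes must be between 0 and period_minutes")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


def meets_sla(period_minutes: float, downtime_minutes: float, target_percent: float) -> bool:
    """Return True when measured availability is at least the SLA target."""
    return availability_percent(period_minutes, downtime_minutes) >= target_percent


if __name__ == "__main__":
    period = 30 * 24 * 60  # hypothetical 30-day reporting period, in minutes
    # 60 minutes of downtime against a hypothetical 99.9 percent target
    print(round(availability_percent(period, 60), 3))
    print(meets_sla(period, 60, 99.9))
```

With these assumed numbers, one hour of downtime in a 30-day period falls just short of a 99.9 percent target, while 30 minutes would meet it.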
6.Incident management and problem management:
- Incident management:
- An Incident is an event that could lead to loss of, or disruption to, an organization’s operations, services or functions.
- Incident management is a term describing the activities of an organization to identify, analyze, and correct hazards to prevent a future re-occurrence.
- Within a structured organization, these incidents are normally dealt with by either an incident response team (IRT) or an incident management team (IMT)
- Incident management is reactive and its objective is to respond to and resolve issues restoring normal service (as defined by the SLA) as quickly as possible.
- Problem management:
- Problem management is the process responsible for managing the lifecycle of all problems that happen or could happen in an IT service.
- The primary objectives of problem management are to prevent problems and resulting incidents from happening, to eliminate recurring incidents, and to minimize the impact of incidents that cannot be prevented.
|PART 3 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What are the roles and responsibilities of Support/help desk?
- What is change management and patch management process?
- What is release management – Major, Minor and emergency releases?
7.Support/Help desk – Roles and responsibilities:
- The responsibility of the technical support function is to provide specialist knowledge of production systems to identify and assist in system change/development and problem resolution.
- The basic function of the help desk is to be the first, single and central point of contact for users and to follow the incident management process
- The help desk personnel must ensure that all hardware and software incidents that arise are fully documented and escalated based on the priorities established by management
8.Change management and patch management process:
- Change management:
- used when changing hardware, installing or upgrading to new releases of off-the-shelf applications, installing a software patch and configuring various network devices
- Changes are classified into three types:
- Emergency changes
- Major changes
- Minor changes
- Patch Management:
- an area of systems management that involves acquiring, testing and installing multiple patches (code changes) to an administered computer system in order to maintain up-to-date software and often to address security risk
- Patch management tasks include the following:
- Maintaining current knowledge of available patches
- Deciding what patches are appropriate for particular systems
- Ensuring that patches are installed properly; testing systems after installation
- Documenting all associated procedures, such as specific configurations required
|Points to remember:
o Patch Management – The BEST method for preventing exploitation of system vulnerabilities
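The patch management tasks listed above (knowing which patches exist, deciding which apply to which system, and confirming installation) can be sketched as a simple gap check. This is a minimal illustration only: the system names, patch identifiers and the `missing_patches` helper are hypothetical, and a real environment would take this data from its own inventory and patch tooling.

```python
def missing_patches(
    installed: dict[str, set[str]],
    approved: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each system to the approved patches it has not yet installed."""
    gaps: dict[str, list[str]] = {}
    for system, required in approved.items():
        # A system absent from the inventory is treated as having no patches.
        outstanding = sorted(required - installed.get(system, set()))
        if outstanding:
            gaps[system] = outstanding
    return gaps


if __name__ == "__main__":
    # Hypothetical inventory and approval lists
    installed = {"web01": {"P-1"}, "db01": {"P-1", "P-2"}}
    approved = {"web01": {"P-1", "P-2"}, "db01": {"P-1", "P-2"}, "app01": {"P-3"}}
    print(missing_patches(installed, approved))
```

Systems that are fully patched do not appear in the result, so an empty dictionary means no outstanding approved patches.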
9.Release management:
- Software release management is the process through which software is made available to users.
- The term “release” is used to describe a collection of authorized changes.
- The release will typically consist of a number of problem fixes and enhancements to the service.
- The release can be of three types:
- Major releases: Normally contain a significant change or addition to new functionality. A major upgrade or release usually supersedes all preceding minor upgrades.
- Minor releases: Upgrades, normally containing small enhancements and fixes. A minor upgrade or release usually supersedes all preceding emergency fixes. Minor releases are generally used to fix small reliability or functionality problems that cannot wait until the next major release.
- Emergency releases: Normally containing the corrections to a small number of known problems. Emergency releases are fixes that require implementation as quickly as possible to prevent significant user downtime to business-critical functions
- While change management is the process whereby all changes go through a robust testing and approval process, release management is the process of actually putting the software changes into production.
|PART 4 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What is Quality Assurance (QA)?
- What is Database Management Systems (DBMS)?
- What is DBMS Architecture?
10.Quality Assurance (QA):
- QA personnel verify that system changes are authorized, tested and implemented in a controlled manner prior to being introduced into the production environment according to a company’s change and release management policies
11. Database management systems (DBMS):
- aids in organizing, controlling and using the data needed by application programs.
- A DBMS provides the facility to create and maintain a well-organized database.
- Primary functions include:
- Reduced data redundancy,
- Decreased access time and
- Basic security over sensitive data.
12.DBMS architecture:
- Database architecture focuses on the design, development, implementation and maintenance of computer programs that store and organize information for businesses, agencies and institutions.
- A database architect develops and implements software to meet the needs of users. The design of a DBMS depends on its architecture
- Metadata is the data (details/schema) that describes other data (i.e., data about data)
- The prefix ‘meta’ generally denotes something self-referential. In other words, metadata is summary data that describes the context of other data.
- There are three types of metadata:
- Conceptual schema,
- External schema and
- Internal schema
|PART 5 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What is Data Dictionary / Directory system?
- What is Database structure?
- What are the database types?
- Hierarchical database model
- Network database model
- Relational database model
- What is OSI Architecture?
13.Data Dictionary/Directory System (DD/DS):
- The data dictionary contains an index and descriptions of all the data stored in the database. The directory describes the locations of the data and the access method
- Some of the benefits of using DD/DS include:
- Enhancing documentation
- Providing common validation criteria
- Facilitating programming by reducing the needs for data definition
- Standardizing programming methods
14.Database structure:
- The database structure is the collection of record type and field type definitions that make up the database.
- There are three major types of database structure:
- Hierarchical database model,
- Network database model, and
- Relational database model
- Hierarchical database model:
- In this model there is a hierarchy of parent and child data segments. To create links between them, this model uses parent-child relationships.
- These are 1:N (one-to-many) mappings between record types represented by logical trees
- Network database model:
- In the network model, the basic data modeling construct is called a set.
- A set is formed by an owner record type, a member record type and a name.
- A member record type can have that role in more than one set, so a multi-owner relationship is allowed.
- An owner record type can also be a member or owner in another set. Usually, a set defines a 1:N relationship, although one-to-one (1:1) is permitted
- Disadvantages of Network database model:
- Structures can be extremely complex and difficult to comprehend, modify or reconstruct in case of failure.
- This model is rarely used in current environments.
- The hierarchical and network models do not support high-level queries. The user programs have to navigate the data structures.
- Relational database model
- In Relational database model, the data and relationships among these data are organized in tables.
- A table is a collection of rows, also known as tuples, and each tuple in a table contains the same columns. Columns, called domains or attributes, correspond to fields.
- Relational database has the following properties:
- Values are atomic.
- Each row is unique.
- Column values are of the same kind.
- The sequence of columns is insignificant.
- The sequence of rows is insignificant.
- Each column has a unique name
- The relational model is independent from the physical implementation of the data structure, and has many advantages over the hierarchical and network database models. With relational databases, it is easier:
- For users to understand and implement a physical database system
- To convert from other database structures
- To implement projection and join operations
- To create new relations for applications
- To implement access control over sensitive data
- To modify the database
- A key feature of relational databases is the use of “normalization”
- a technique of organizing the data in the database
- a systematic approach of decomposing tables to eliminate data redundancy(repetition) and undesirable characteristics like Insertion, Update and Deletion Anomalies
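To make the idea of decomposing tables concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and sample rows are hypothetical, and the sketch assumes for simplicity that a customer's name identifies the customer. It splits a flat order list, where the customer's city is repeated on every row, into a `customers` table and an `orders` table.

```python
import sqlite3


def build_normalized_db(flat_orders):
    """Load (order_id, customer_name, customer_city) rows into two related tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT UNIQUE, city TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(id))"
    )
    for order_id, name, city in flat_orders:
        # Store each customer once; later orders reuse the existing row.
        conn.execute(
            "INSERT OR IGNORE INTO customers (name, city) VALUES (?, ?)", (name, city)
        )
        (customer_id,) = conn.execute(
            "SELECT id FROM customers WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT INTO orders (id, customer_id) VALUES (?, ?)", (order_id, customer_id)
        )
    return conn


if __name__ == "__main__":
    conn = build_normalized_db([(1, "Asha", "Pune"), (2, "Asha", "Pune"), (3, "Ben", "Leeds")])
    # The city now lives in one row, so one UPDATE changes it for every order.
    conn.execute("UPDATE customers SET city = ? WHERE name = ?", ("Mumbai", "Asha"))
    query = (
        "SELECT o.id, c.name, c.city FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
    )
    for row in conn.execute(query):
        print(row)
```

Because the city is stored once per customer, the update anomaly of the flat layout (two copies of the same city drifting apart) cannot occur, and a join reassembles the original view when it is needed.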
15.OSI Architecture:
- The OSI model was developed by the International Organization for Standardization (ISO) in 1984, and it is now considered an architectural model for inter-computer communications
- The OSI model is a reference model that describes how information from a software application in one computer moves through a physical medium to the software application in another computer.
- The OSI (Open Systems Interconnection) model is a proof-of-concept model composed of seven layers, each specifying particular specialized tasks or functions.
- The OSI model was defined in ISO/IEC 7498, which has the following parts:
- ISO/IEC 7498-1 The Basic Model
- ISO/IEC 7498-2 Security Architecture
- ISO/IEC 7498-3 Naming and addressing
- ISO/IEC 7498-4 Management framework
- Each layer is self-contained and relatively independent of the other layers in terms of its particular function
- There are seven OSI layers. Each layer has different functions. They are:
- Physical Layer
- Data-Link Layer
- Network Layer
- Transport Layer
- Session Layer
- Presentation Layer
- Application Layer
|Points to remember:
o The CISA candidate will not be tested on the specifics of this standard in the exam
- The functions of each layer are as follows:
- Physical Layer – The physical layer provides the hardware that transmits and receives the bit stream as electrical, optical or radio signals over an appropriate medium or carrier.
- Data-Link Layer – The data link layer is used for the encoding, decoding and logical organization of data bits. Data packets are framed and addressed by this layer, which has two sublayers: logical link control (LLC) and media access control (MAC)
- Network Layer – This layer handles logical addressing (IP addresses) and is responsible for routing and forwarding. It prepares the packets for the data link layer
- Transport Layer – The transport layer provides reliable and transparent transfer of data between end points, end-to-end error recovery and flow control.
- Session Layer – The session layer controls the dialogs (sessions) between computers. It establishes, manages and terminates the connections between the local and remote application layers
- Presentation Layer – The presentation layer converts the outgoing data into a format acceptable by the network standard and then passes the data to the session layer (It is responsible for translation, compression and encryption)
- Application Layer – provides a standard interface for applications that must communicate with devices on the network (e.g., print files on a network-connected printer, send an email or store data on a file server)
|Points to remember:
o The OSI layer that performs error detection and encryption – Data Link layer
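The way data passes down the layers on the sending side and back up on the receiving side can be pictured with a toy encapsulation sketch. The bracketed "headers" below are purely illustrative placeholders, not real protocol headers, and the function names are assumptions made for this example.

```python
# Layers 1 to 7, in the order listed in the text
OSI_LAYERS = [
    "Physical", "Data-Link", "Network", "Transport",
    "Session", "Presentation", "Application",
]


def encapsulate(payload: str) -> str:
    """Wrap the payload with one placeholder header per layer, from layer 7 down to layer 2.

    The physical layer adds no header here; it only transmits the resulting bits.
    """
    unit = payload
    for name in reversed(OSI_LAYERS[1:]):  # Application first, Data-Link last
        unit = f"[{name}]{unit}"
    return unit


def decapsulate(unit: str) -> str:
    """Strip the placeholder headers in the order a receiver would, layer 2 up to layer 7."""
    for name in OSI_LAYERS[1:]:  # Data-Link first, Application last
        prefix = f"[{name}]"
        if not unit.startswith(prefix):
            raise ValueError(f"missing {name} header")
        unit = unit[len(prefix):]
    return unit


if __name__ == "__main__":
    frame = encapsulate("hello")
    print(frame)
    print(decapsulate(frame))
```

The outermost wrapper belongs to the lowest layer that adds one, which is why the receiver removes the data-link wrapper first and hands the remainder upward.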
|PART 6 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What is the application of OSI model in Network Architecture?
- What is LAN topology?
- What are the LAN components?
16.Application of the OSI model in Network Architectures:
- The concepts of the OSI model are used in the design and development of organizations’ network architectures. This includes LANs, WANs, MANs and use of the public Transmission Control Protocol/Internet Protocol (TCP/IP)-based global Internet.
- The discussion will focus on:
- Wireless networks
- Public global internet infrastructure
- Network administration and control
- Applications in a networked environment
- On-demand computing
- Local Area Network (LAN):
- a computer network that interconnects computers within a limited area such as a residence, school, laboratory, university campus or office building
- Media used in LAN:
- Copper (twisted-pairs) circuit:
- Twisted pairs are of two types:
(1) Shielded twisted pair (STP) – Less cross talk and less interference
(2) Unshielded twisted pair (UTP) – More attenuation, more cross talk and more interference
– Two insulated wires are twisted around each other, with current flowing through them in opposite directions.
Advantages:
a. Twisting reduces the opportunity for cross talk
c. Readily available
d. Simple to modify
Disadvantages:
a. Easy to tap
b. Easy to splice
c. Interference and noise
- Fiber-optics systems:
- It refers to the technology and medium used to transmit data as pulses of light through a strand of glass or plastic fiber.
- Fiber-optic systems have a low transmission loss as compared to twisted-pair circuits.
- Optical fiber is smaller and lighter than metallic cables of the same capacity.
- Fiber is the preferred choice for high-volume, longer-distance runs
- Radio systems (wireless):
- Data are communicated between devices using low-powered systems that broadcast (or radiate) and receive electromagnetic signals representing data
|Points to remember:
o The method of routing traffic through split-cable facilities or duplicate-cable facilities is called “Diverse routing”
o The type of line media that provides the BEST security for a telecommunication network is “Dedicated lines”
- LAN topologies:
- Star topology
- Bus topology
- Ring topology
- LAN components:
- Repeaters – physical layer devices that extend the range of a network or connect two separate network segments together
- Hubs – physical layer devices that serve as the center of a star-topology network or a network concentrator
- Bridges – data link layer devices that were developed to connect LANs or create two separate LAN or WAN network segments from a single segment to reduce collision domains
- Switches – data link layer devices that can divide and interconnect network segments and help to reduce collision domains in Ethernet-based networks
- Routers – operate at the OSI network layer by examining network addresses (i.e., routing information encoded in an IP packet).
- Gateways – devices that are protocol converters. Typically, they connect and convert between LANs and the mainframe, or between LANs and the Internet, at the application layer of the OSI model
|PART 7 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What are the WAN components?
- WAN switches
- What are WAN technologies?
- Point-to-point protocol
- Integrated services digital network (ISDN)
- Asynchronous transfer mode
- Frame Relay
- Multiprotocol label switching
- Digital subscriber lines
- Virtual Private Network
- What are the network performance metrics?
- WAN switches – Data link layer devices used for implementing various WAN technologies such as ATM, point-to-point frame relay and ISDN
- Routers – devices that operate at the network layer of the OSI reference model and provide an interface between different network segments on an internal network or connect the internal network to an external network
- Modems (modulator/demodulator)
- Converts computer digital signals into analog data signals and analog data back to digital.
- A main task of the modems at both ends is to maintain their synchronization so the receiving device knows when each byte starts and ends. Two methods can be used for this purpose:
- Synchronous transmission – a data transfer method in which a continuous stream of data signals is accompanied by timing signals (generated by an electronic clock) to ensure that the transmitter and the receiver are in step (synchronized) with one another. The data is sent in blocks (called frames or packets) spaced by fixed time intervals
- Asynchronous transmission – The term asynchronous is used to describe the process where transmitted data is encoded with start and stop bits, specifying the beginning and end of each character. Asynchronous transmission works in spurts and must insert a start bit before each data character and a stop bit at its termination to inform the receiver where it begins and ends.
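The start/stop-bit framing described for asynchronous transmission can be sketched as follows. This is a minimal illustration that assumes a common serial convention not stated in the text: one start bit (0), eight data bits sent least significant bit first, no parity, and one stop bit (1). The function names are hypothetical.

```python
def frame_async(data: bytes) -> list[int]:
    """Frame each byte as: start bit 0, eight data bits (LSB first), stop bit 1."""
    bits: list[int] = []
    for byte in data:
        bits.append(0)  # start bit marks the beginning of the character
        bits.extend((byte >> i) & 1 for i in range(8))
        bits.append(1)  # stop bit marks the end of the character
    return bits


def deframe_async(bits: list[int]) -> bytes:
    """Recover the bytes, checking the start and stop bit of every 10-bit frame."""
    if len(bits) % 10 != 0:
        raise ValueError("bit stream is not a whole number of 10-bit frames")
    out = bytearray()
    for offset in range(0, len(bits), 10):
        frame = bits[offset:offset + 10]
        if frame[0] != 0 or frame[9] != 1:
            raise ValueError("framing error: bad start or stop bit")
        out.append(sum(bit << i for i, bit in enumerate(frame[1:9])))
    return bytes(out)


if __name__ == "__main__":
    stream = frame_async(b"Hi")
    print(len(stream))          # 10 bits on the wire per 8-bit character
    print(deframe_async(stream))
```

Under these assumptions every character costs 10 bits on the line for 8 bits of data, which is the overhead that synchronous transmission avoids by sending clocked blocks instead.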
- Point to point protocol – (PPP) is a data link layer communications protocol used to establish a direct connection between two nodes. PPP is a widely available remote access solution that supports asynchronous and synchronous links, and operates over a wide range of media.
- X.25 – is a standard suite of protocols used for packet-switched communications over a wide area network
- Frame Relay – Frame relay is a packet-switching telecommunication service designed for cost-efficient data transmission for intermittent traffic between LANs and between endpoints in a WAN
- Integrated services digital network (ISDN) – It is a set of communication standards for simultaneous digital transmission of voice, video, data, and other network services over the traditional circuits of the public switched telephone network
- Asynchronous transfer mode – ATM is a dedicated-connection switching technology that organizes digital data into 53-byte cell units and transmits them over a physical medium using digital signal technology
- Multiprotocol label switching – Multiprotocol label switching (MPLS) is a mechanism used within computer network infrastructures to speed up the time it takes a data packet to flow from one node to another. It enables computer networks to be faster and easier to manage by using short path labels instead of long network addresses for routing network packets.
- Digital subscriber lines – Digital subscriber line (DSL) is a technology that transports high-bandwidth data over a simple telephone line that is directly connected to a modem. This allows for file-sharing, and the transmission of pictures and graphics, multimedia data, audio and video conferencing and much more
- Virtual Private Network (VPN):
- extends a private network across a public network and enables users to send and receive data across shared or public networks as if their computing devices were directly connected to the private network. Applications running on an end system (PC, smartphone etc.) across a VPN may therefore benefit from the functionality, security, and management of the private network
- VPN technology was developed to allow remote users and branch offices to access corporate applications and resources. To ensure security, the private network connection is established using an encrypted layered tunneling protocol, and VPN users use authentication methods, including passwords or certificates, to gain access to the VPN.
- There are three types of VPNs:
1. Remote-access VPN – Used to connect telecommuters and mobile users to the enterprise WAN in a secure manner; it lowers the barrier to telecommuting by ensuring that information is reasonably protected on the open Internet.
2. Intranet VPN – Used to connect branch offices within an enterprise WAN
3. Extranet VPN – Used to give business partners limited access to each other’s corporate network; an example is an automotive manufacturer with its suppliers
21. Network Performance Metrics:
- Latency: The delay that a message or packet will experience on its way from source to destination. A very easy way to measure latency in a TCP/IP network is to use the ping command.
- Throughput: The quantity of useful work done by the system per unit of time. In telecommunications, it is the number of bytes per second passing through a channel.
|Points to remember:
o Ping command is used to measure the latency
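The two metrics reduce to simple arithmetic once measurements are available. The sketch below assumes round-trip times have already been collected (for example, copied from ping output) and that a transfer size and duration are known; the function names and sample values are hypothetical.

```python
def average_latency_ms(rtt_samples_ms: list[float]) -> float:
    """Mean of the measured round-trip times, in milliseconds."""
    if not rtt_samples_ms:
        raise ValueError("at least one sample is required")
    return sum(rtt_samples_ms) / len(rtt_samples_ms)


def throughput_bytes_per_second(bytes_transferred: int, seconds: float) -> float:
    """Bytes that passed through the channel per second of elapsed time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_transferred / seconds


if __name__ == "__main__":
    print(average_latency_ms([10.0, 20.0, 30.0]))        # hypothetical ping results
    print(throughput_bytes_per_second(1_000_000, 4.0))   # hypothetical 1 MB transfer in 4 s
```

Note that ping reports round-trip time, so one-way latency is often approximated as half of that figure; this is an estimate, since the two directions need not be symmetric.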
|PART 8 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What are the Network Management issues?
- Fault Management
- Performance management
- Configuration management
- Security management
- Accounting resources
- What are the Network Management tools?
- Response time
- Network monitors
- Downtime reports
- Simple Network Management Protocol (SNMP)
- Online monitors
- Help desk reports
- Protocol analyzers
- What is Disaster Recovery Planning (DRP)?
22.Network Management Issues:
A WAN needs to be monitored and managed similarly to a LAN. ISO, as part of its communications modeling effort (ISO/IEC 10040), has defined five basic tasks related to network management:
- Fault management – Detects the devices that present some kind of technical fault
- Configuration management – Allows users to know, define and change, remotely, the configuration of any device
- Accounting resources – Holds the records of the resource usage in the WAN (who uses what)
- Performance management – Monitors usage levels and sets alarms when a threshold has been surpassed
- Security management – Detects suspicious traffic or users, and generates alarms accordingly
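As a small illustration of the performance management task (monitoring usage levels and raising an alarm when a threshold is surpassed), consider the sketch below. The link names, utilization figures and the 80 percent threshold are hypothetical values chosen for the example.

```python
def threshold_alarms(utilization: dict[str, float], threshold: float) -> list[str]:
    """Return the names of the links whose utilization exceeds the threshold."""
    return sorted(link for link, value in utilization.items() if value > threshold)


if __name__ == "__main__":
    # Hypothetical utilization per WAN link, in percent
    samples = {"hq-branch1": 91.5, "hq-branch2": 42.0, "hq-dc": 80.0}
    print(threshold_alarms(samples, 80.0))
```

The comparison is strict, so a link sitting exactly at the threshold has not yet surpassed it and raises no alarm.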
23.Network Management tools:
- Response Time – Identify the time necessary for a command entered by users at a terminal to be answered by the host system.
- Downtime Reports – Track the availability of telecommunications lines and circuits. Interruptions due to power line failure, traffic overload, operator error or other anomalous conditions are identified in a downtime report
- Online Monitors – Check data transmission accuracy and errors. Monitoring can be performed by echo checking and status checking all transmissions, ensuring that messages are not lost or transmitted more than once.
- Network Monitors – Real time display of network nodes and status.
- Protocol Analyzers – It is a diagnostic tool used for monitoring packets flowing within the network.
- Simple Network Management Protocol (SNMP) – It is a TCP/IP-based protocol that monitors and controls different variables throughout the network, manages configurations, and collects statistics on performance and security
- Help desk reports – It is prepared by the help desk, which is staffed or supported by IT technicians trained to handle problems occurring during normal IS usage.
24.Disaster Recovery Planning (DRP):
- DRP is an element of an internal control system established to manage availability and restore critical processes/IT services in the event of interruption.
- The purpose of this continuous planning process is to ensure that cost-effective controls are in place
- to prevent possible IT disruptions and
- to recover the IT capacity of the organization in the event of a disruption
- DRP is a continuous process. Once the criticality of business processes and supporting IT services, systems and data are defined, they are periodically reviewed and revisited
- The ultimate goal of the DRP process is to respond to incidents that may impact people and the ability of operations to deliver goods and services to the marketplace and to comply with regulatory requirements
- The difference between BCP and DRP is as follows:
- BCP is focused on keeping the business operations running, perhaps in a different location or by using different tools or processes, after the disaster has happened. DRP is focused on restoring business operations after the disaster has taken place.
- BCP often includes Non-IT aspects of the business. DRP often focuses on IT systems
|Points to remember:
o The prerequisite for developing a disaster recovery plan is – to have management commitment.
o The PRIMARY GOAL of Disaster Recovery planning and Business continuity planning should always be – Safety of Personnel (Human safety first)
o Occupant Emergency Plan (OEP) provides the response procedures for occupants of a facility in the event a situation poses a threat to the health and safety of personnel
o The critical first step in disaster recovery and contingency planning is – to complete a business impact analysis
o The term “Disaster Recovery” refers to recovery of technological environment
o The BCP is ultimate responsibility of Board of Directors
o Single points of failure and vulnerability to a common disaster are mitigated by geographically dispersing resources.
o Disaster Recovery planning addresses the technological aspect of business continuity planning
o A disaster recovery plan for an organization should focus on reducing the length of recovery time and the cost of recovery.
o The results of tests and drills are the BEST evidence of an organization’s disaster recovery readiness.
o Fault-tolerant hardware is the only technology that provides continuous and uninterrupted support in the event of a disaster or disruption
|PART 9 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What is Recovery Point Objective (RPO) and Recovery Time Objective (RTO)?
- What are the additional parameters in defining the recovery strategy?
- Interruption window
- Service delivery objective (SDO)
- Maximum tolerable outages
- What are the recovery strategies?
- Hot site
- Cold site
- Warm site
- Reciprocal arrangements
25.Recovery Point Objective (RPO) and Recovery Time Objective (RTO):
|Points to remember:
o The CISA candidate should be familiar with which recovery strategies would be best with different RTO and RPO parameters.
- Recovery Point objective:
- RPO is determined based on the acceptable data loss in case of disruption of operations.
- RPO indicates the earliest point in time in which it is acceptable to recover the data. For example, if the process can afford to lose the data up to four hours before disaster, then the latest backup available should be no more than four hours old at the time of the disaster or interruption. The transactions that occurred between the RPO point and the interruption need to be entered after recovery (known as catch-up data).
- RPO effectively quantifies the permissible amount of data loss in case of disruption.
- Recovery Time Objective:
- The RTO is determined based on the acceptable downtime in case of a disruption of operations.
- It indicates the earliest point in time at which the business operations (and supporting IT systems) must resume after disaster
- Both of these concepts are based on time parameters.
- The nearer the time requirements are to the center (0-1 hours), the higher the cost of the recovery strategies.
- If the RPO is in minutes (lowest possible acceptable data loss), then data mirroring or real-time replication should be implemented as the recovery strategy.
- If the RTO is in minutes (lowest acceptable time down), then a hot site, dedicated spare servers (and other equipment) and clustering must be used.
- The table below relates the RTO and RPO time windows to recovery techniques:
|Time window|Recovery Time Objective|Recovery Point Objective|
|0 to 1 hour| |Mirroring (real-time replication)|
|1 to 4 hours|Active-passive clustering (hot standby)|Disk-based back-ups, snapshots, delayed replication, log shipping|
|4 to 24 hours| |Tape backups, log shipping|
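The Recovery Point Objective column of the table above can be read as a simple lookup from an RPO window to candidate techniques. The sketch below encodes only the RPO entries listed in the table. The function name is hypothetical, and treating each upper bound as inclusive is an assumption, since the table does not say which window a boundary value such as exactly 4 hours belongs to.

```python
# RPO windows (upper bound in hours) and the techniques the table lists for them.
# Inclusive upper bounds are an assumption of this sketch.
RPO_TECHNIQUES = [
    (1, ["mirroring (real-time replication)"]),
    (4, ["disk-based backups", "snapshots", "delayed replication", "log shipping"]),
    (24, ["tape backups", "log shipping"]),
]


def techniques_for_rpo(rpo_hours):
    """Return the techniques listed for the window containing `rpo_hours`.

    Returns None when the RPO is beyond 24 hours, which the table does not cover.
    """
    if rpo_hours < 0:
        raise ValueError("RPO cannot be negative")
    for upper_bound, techniques in RPO_TECHNIQUES:
        if rpo_hours <= upper_bound:
            return techniques
    return None
```

For example, `techniques_for_rpo(0.25)` falls in the 0 to 1 hour window and returns the mirroring entry, which matches the point that an RPO measured in minutes calls for real-time replication.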
|Points to remember:
o Recovery Point Objective (RPO) will be deemed critical if it is small
o If the Recovery point objective (RPO) is close to zero, then it means that the activity is critical and hence the cost of maintaining the environment would be higher
o The LOWEST expenditure in terms of recovery arrangement can be through Reciprocal agreement
o A hot site is maintained and data mirroring is implemented, where Recovery Point Objective (RPO) is low
o The BEST option to support 24/7 availability is – Data Mirroring
o The metric that describes how long it will take to recover a failed system is – Mean time to Repair (MTTR)
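The four-hour RPO example in this section can be checked with simple arithmetic: the data at risk is everything written between the last usable backup and the disruption, and that window must not exceed the RPO. The sketch below is illustrative only; the function name and the timestamps are assumptions, not part of any CISA material.

```python
from datetime import datetime, timedelta


def rpo_exposure(last_backup, disruption, rpo):
    """Return (exposure, within_rpo) for a disruption.

    `exposure` is the span of transactions that would have to be re-entered
    as catch-up data; `within_rpo` is True when it does not exceed the RPO.
    """
    if disruption < last_backup:
        raise ValueError("disruption cannot precede the last backup")
    exposure = disruption - last_backup
    return exposure, exposure <= rpo


# Hypothetical timestamps: backup at 09:00, disruption at 12:30, RPO of 4 hours.
rpo = timedelta(hours=4)
exposure, ok = rpo_exposure(
    datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 12, 30), rpo
)
```

With these values the exposure is 3 hours 30 minutes, so the four-hour RPO is met. Had the last backup been taken at 06:00, the exposure would be 6 hours 30 minutes and the RPO would be breached.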
26.Additional parameters in defining recovery strategy:
- Interruption window – The maximum period of time the organization can wait from the point of failure to the critical services/applications restoration. After this time, the progressive losses caused by the interruption are unaffordable.
- Service delivery objective (SDO) – Level of services to be reached during the alternate process mode until the normal situation is restored. This is directly related to the business needs.
- Maximum tolerable outages – Maximum time the organization can support processing in alternate mode. After this point, different problems may arise, especially if the alternate SDO is lower than the usual SDO, and the information pending to be updated can become unmanageable.
- A recovery strategy identifies the best way to recover a system (one or many) in case of interruption, including disaster, and provides guidance based on which detailed recovery procedures can be developed
- The selection of a recovery strategy would depend on:
- The criticality of the business process and the applications supporting the processes
- Time required to recover
- Recovery strategies based on the risk level identified for recovery are as follows:
- Hot sites – facilities with space and basic infrastructure and all of the IT and communications equipment required to support the critical applications, along with office furniture and equipment for use by the staff.
- Warm sites – are complete infrastructures but are partially configured in terms of IT, usually with network connections and essential peripheral equipment such as disk drives, tape drives and controllers.
- Cold sites – are facilities with the space and basic infrastructure adequate to support resumption of operations, but lacking any IT or communications equipment, programs, data or office support.
- Duplicate information processing facilities
- Mobile sites – are packaged, modular processing facilities mounted on transportable vehicles and kept ready to be delivered and set up at a location that may be specified upon activation
- Reciprocal agreements – are agreements between separate, but similar, companies to temporarily share their IT facilities in the event that one company loses processing capability. Reciprocal agreements are not considered a viable option due to the constraining burden of maintaining hardware and software compatibility between the companies, the complications of maintaining security and privacy compliance during shared operations, and the difficulty of enforcing the agreements should a disagreement arise at the time the plan is activated.
- Reciprocal arrangements with other organisations – are agreements between two or more organizations with unique equipment or applications. Under the typical agreement, participants promise to provide assistance to each other when an emergency arises.
|Points to remember:
o The CISA candidate should know these recovery strategies and when to use them
o An offsite information processing facility having electrical wiring, air conditioning and flooring, but no computer or communications equipment, is a cold site
o The type of offsite information processing facility that is often an acceptable solution for preparing for recovery of non-critical systems and data is a cold site
o Data mirroring and parallel processing are both used to provide near-immediate recoverability for time-sensitive systems and transaction processing
o Organizations should use off-site storage facilities to maintain redundancy of current and critical information within backup files.
o An off-site processing facility should not be easily identifiable externally because easy identification would create an additional vulnerability for sabotage
o The GREATEST concern when an organization’s backup facility is at a warm site is – Timely availability of hardware.
o The GREATEST risk created by a reciprocal agreement for disaster recovery made between two companies is – Developments may result in hardware and software incompatibility.
|PART 10 – CISA Domain 4 – Information Systems operations, Maintenance and Service Management
- What are the different Recovery/Continuity/response teams and their responsibilities?
- What is back-up and restoration?
- Full back-up
- Incremental back-up
- Differential back-up
- What are the disaster recovery testing methods?
- Checklist review
- Parallel test
- Structured walk-through
- Full interruption test
- Simulation test
28. Different Recovery/continuity/response teams and their responsibilities:
- Incident response team
- Emergency action team
- Information security team
- Damage assessment team
- Offsite storage team
- Software team
- Applications team
- Administrative support team
- Salvage team
- Emergency operations team
- Network recovery team
- Communications team
- Transportation team
- User hardware team
- Relocation team
- Legal affairs team
- Recovery test team
- Training team
|Points to remember:
o The responsibility of the disaster recovery relocation team is to co-ordinate the process of moving from the hot site to a new location or to the restored original location.
o The responsibility of the offsite storage team is to obtain, pack and ship media and records to the recovery facilities, and to establish and oversee an offsite storage schedule.
o The responsibility of the transportation team is to locate a recovery site, if one has not been predetermined, and to coordinate the transport of company employees to the recovery site.
o The responsibility of the salvage team is to manage the relocation project and to conduct a more detailed assessment of the damage to the facilities and equipment.
29.Back-up and restoration:
- Back-up schemes:
There are three main schemes for backup:
- Full back-up – This type of backup scheme copies all files and folders to the backup media, creating one backup set (with one or more media, depending on media capacity)
- Incremental back-up – An incremental backup copies the files and folders that changed or are new since the last incremental or full backup
- Differential back-up – A differential backup will copy all files and folders that have been added or changed since a full backup was performed. This type of backup is faster and requires less media capacity than a full backup and requires only the last full and differential backup sets to make a full restoration
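The practical difference between the schemes shows up at restore time. The sketch below picks the backup sets needed for a full restoration from a chronological list of backups, following the definitions above: an incremental scheme needs the last full backup plus every incremental taken since, while a differential scheme needs only the last full backup plus the latest differential. The function name, labels and tuple layout are assumptions for illustration, and schedules that mix incremental and differential backups after the same full backup are deliberately not handled.

```python
def restore_set(backups):
    """Return the labels of the backup sets needed for a full restoration.

    `backups` is a chronological list of (label, kind) tuples, where kind is
    "full", "incremental" or "differential".
    """
    # Locate the most recent full backup; everything before it is irrelevant.
    last_full = None
    for index, (_, kind) in enumerate(backups):
        if kind == "full":
            last_full = index
    if last_full is None:
        raise ValueError("no full backup available")

    full_label = backups[last_full][0]
    later = backups[last_full + 1:]
    kinds = {kind for _, kind in later}

    if kinds <= {"incremental"}:
        # Every incremental since the full backup must be applied, in order.
        return [full_label] + [label for label, _ in later]
    if kinds <= {"differential"}:
        # Each differential holds all changes since the full backup,
        # so only the latest one is needed.
        return [full_label, later[-1][0]]
    raise ValueError("mixed or unsupported schedule is not handled in this sketch")


incremental_week = [("sun", "full"), ("mon", "incremental"),
                    ("tue", "incremental"), ("wed", "incremental")]
differential_week = [("sun", "full"), ("mon", "differential"),
                     ("tue", "differential"), ("wed", "differential")]
```

With these hypothetical schedules, restoring on Wednesday needs four sets under the incremental scheme (`sun`, `mon`, `tue`, `wed`) but only two under the differential scheme (`sun`, `wed`). The trade-off is that each differential backup grows larger as the week goes on.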
|Points to remember:
o The BEST backup strategy for a large database with data supporting online sales is – Weekly full back-up with daily incremental back-up
30.Disaster Recovery testing methods:
- Checklist review – This is a preliminary step to a real test. Recovery checklists are distributed to all members of a recovery team to review and ensure that the checklist is current.
- Structured walk-through – Team members physically implement the plans on paper and review each step to assess its effectiveness, identify enhancements, constraints and deficiencies.
- Simulation test – The recovery team role plays a prepared disaster scenario without activating processing at the recovery site.
- Parallel test – The recovery site is brought to a state of operational readiness, but operations at the primary site continue normally.
- Full interruption test – Operations are shut down at the primary site and shifted to the recovery site in accordance with the recovery plan; this is the most rigorous form of testing but is expensive and potentially disruptive.
|Points to remember:
o A continuity plan test that uses actual resources to simulate a system crash and cost-effectively obtain evidence about the plan’s effectiveness is a preparedness test
o The most effective test of DRP for organisations having a number of offices across a wide geographical area is a preparedness test
o The type of BCP test that requires only representatives from each operational area to meet to review the plan is a walk-through test