Session 1 — Incident Management Lifecycle (Day 24, Mon Apr 20)
How CISM Defines an Information Security Incident
Domain 4 carries the second-highest exam weight at 30% — roughly 45 questions. That weight reflects ISACA's view that incident management is where governance and strategy become operational. You can design a flawless security program and still have your organization defined by how it responds when something goes wrong.
CISM defines an information security incident as an event that results in or has the potential to result in unauthorized access, use, disclosure, modification, or destruction of information, or interference with information systems — and critically, one that has a demonstrable business impact. This two-part definition matters on the exam: deviation from expected state alone is not sufficient. The business impact criterion is what separates an incident from routine noise.
That framing is deliberate. CISM positions the information security manager as a business leader, not a technical responder. The incident management program is a business function — it exists to protect the organization's ability to operate, not just to remediate compromised endpoints.
Incident vs. Event vs. Alert — A Critical Exam Distinction
CISM exam questions frequently test whether candidates understand these three terms precisely. Using them interchangeably is a common error that signals incomplete understanding.
| Term | CISM Definition | Example | Response Required? |
|---|---|---|---|
| Event | Any observable occurrence in a system or network. Events are neutral — they happen continuously. The vast majority are benign. | A user logs in successfully. A firewall blocks a connection. A file is accessed. | No — events are logged and monitored, not individually actioned. |
| Alert | A notification generated by a detection system that a specific pattern or threshold has been met. Alerts indicate potential problems but require human analysis to confirm. | SIEM triggers on 10 failed logins in 60 seconds. An AV tool flags a file as suspicious. | Requires triage — most alerts are false positives. Confirmed alerts may become incidents. |
| Incident | A confirmed violation of security policy or acceptable use that has a business impact. An alert that has been investigated, confirmed, and assessed to have actual or potential harm. | The 10 failed logins are confirmed as a brute-force attack on a privileged account. The flagged file is confirmed ransomware. | Yes — incident management process is activated. |
CISM scenarios often describe a situation and ask whether it constitutes an incident. The correct answer almost always depends on whether business impact has been confirmed — not just whether a detection system fired. An alert without confirmed impact is a potential incident. An event without analysis is not yet an incident at all. The information security manager's role at this stage is to establish the triage and escalation structure that correctly classifies incoming alerts.
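The event/alert/incident progression above can be reduced to a single triage rule. The sketch below is illustrative (the `Status` enum and `triage` helper are invented names, not ISACA terminology): a detection firing never produces an incident by itself; confirmed business impact does.

```python
from enum import Enum

class Status(Enum):
    EVENT = "event"        # observable occurrence; logged, not actioned
    ALERT = "alert"        # detection fired; requires human triage
    INCIDENT = "incident"  # confirmed violation with business impact

def triage(detection_fired: bool, analyzed: bool,
           business_impact_confirmed: bool) -> Status:
    """Classify an observation per the CISM event/alert/incident distinction.

    Hypothetical helper for illustration: a detection alone never makes an
    incident -- confirmed actual or potential business impact does.
    """
    if not detection_fired:
        return Status.EVENT
    if not analyzed or not business_impact_confirmed:
        return Status.ALERT     # a potential incident, pending triage
    return Status.INCIDENT      # activate the incident management process

# A SIEM firing on failed logins is only an alert until impact is confirmed.
print(triage(True, False, False).value)  # alert
print(triage(True, True, True).value)    # incident
```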
The Incident Management Lifecycle
CISM aligns with the NIST SP 800-61 framework for incident response, organizing the lifecycle into distinct phases. Know these phases, their sequence, and what the information security manager's responsibilities are at each stage — not the technical actions of the IR team.
- Prepare: Build the capability before an incident occurs. Establish the incident response plan, train the team, define escalation paths, implement detection tools, and conduct exercises. This is the phase the information security manager owns most fully — it is a governance and program management function.
- Detect: Identify that an incident has occurred or is occurring. Detection relies on monitoring, logging, user reporting, and threat intelligence. Effectiveness here is directly proportional to the quality of Prepare-phase investments.
- Analyze: Determine scope, impact, and nature of the incident. Triage and classification occur here. The information security manager must ensure this phase produces a clear picture of business impact — not just a technical one.
- Contain: Limit the spread or damage of the incident. May involve isolating systems, disabling accounts, blocking network paths. Containment strategy is a business decision — it may require accepting some ongoing risk to preserve forensic evidence or maintain operations.
- Eradicate: Remove the root cause. Eliminate the attacker's access, remove malware, close exploited vulnerabilities. Do not recover before eradicating — restoring systems that still contain the threat restarts the incident.
- Recover: Restore normal operations. Validate that systems are clean, test functionality, confirm monitoring is active before returning systems to production.
- Post-Incident Review: Analyze what happened, why, and what improvements are needed. This phase closes the loop from reactive response back to the Prepare phase, improving the program for next time.
NIST SP 800-61 uses four phases: Preparation; Detection and Analysis; Containment, Eradication, and Recovery; and Post-Incident Activity. CISM expands this into seven steps for exam purposes, but the underlying framework is the same. If an exam question references NIST 800-61, know that it is the foundational U.S. government guidance for computer security incident handling and that CISM's approach is consistent with it. You will not be tested on NIST 800-61 chapter and verse — you will be tested on whether you understand that incident response requires a structured, repeatable process aligned with documented standards.
The Incident Response Plan
The incident response plan (IRP) is the documented framework that governs how the organization detects, responds to, and recovers from information security incidents. It is a governance artifact — not a technical runbook. CISM expects you to understand what the IRP contains, who has authority over it, and when it must be tested.
A mature incident response plan includes:
- Scope and purpose: What types of incidents does this plan cover? What are the objectives of the program?
- Roles and responsibilities: Who does what during an incident. Clearly assigned, not implied.
- Incident classification schema: Severity levels, definitions, and criteria for escalation.
- Escalation procedures: Who is notified at each severity level, within what time frame, through what channel.
- Communication protocols: Internal and external communication — who speaks, what is said, to whom.
- Evidence handling procedures: Chain of custody, forensic preservation, legal hold instructions.
- Containment and recovery guidance: General strategies appropriate to major incident categories.
- Testing schedule and maintenance cadence: How often the plan is exercised and reviewed.
- External contact list: Law enforcement, regulators, legal counsel, IR retainers, cyber insurance.
The IRP must be approved by senior management — not just the CISO. Approval at the executive level signals that the organization has accepted the business commitments embedded in the plan (notification timelines, resource commitments, escalation thresholds). It must be tested at least annually and after any significant incident or major organizational change.
Incident Management Program vs. Incident Response Capability
CISM draws a meaningful distinction between the incident management program and the incident response capability. The exam tests this framing:
- Incident management program: The ongoing, proactive organizational function. It includes policy, governance, training, exercising, metrics, continuous improvement, and integration with risk management. The program exists 365 days a year regardless of whether an incident is occurring.
- Incident response capability: The reactive ability to respond when an incident occurs. It is activated when needed and is the product of the program's investments in people, process, and technology.
The information security manager is responsible for the program. The IR team (or IR lead) is responsible for executing the capability when an incident occurs. This distinction matters on the exam because CISM questions will ask what the information security manager should do — and the answer is almost always about governance, oversight, and program management, not technical response actions.
Integration with BCP and DR
Incident management does not operate in isolation. Major incidents may trigger BCP or DR activation — a ransomware attack that takes down critical systems is both an information security incident and a business continuity event. The information security manager must ensure that:
- The IRP identifies the trigger conditions under which BCP/DR activation is required
- The incident management team has pre-established handoff protocols with the BCP/DR team
- Recovery priorities from the BIA inform which systems are prioritized during incident recovery
- BCP/DR plans are tested in scenarios that include security-caused disruptions, not just natural disasters or infrastructure failures
Session 2 — Incident Classification, Escalation & Response (Day 25, Tue Apr 21)
Incident Classification Frameworks
Consistent incident classification is the foundation of effective escalation and resource allocation. Without a defined classification schema, every incident is either treated as a catastrophe or minimized to the point of inadequate response. The information security manager designs and governs the classification framework; the IR team applies it during response.
Most organizations use a severity tier model. Common approaches include four-tier (P1 through P4) or three-tier (High/Medium/Low) frameworks. What matters is that the criteria are defined in advance, consistently applied, and tied to specific escalation and response time requirements.
| Severity | Common Label | Criteria | Response Time |
|---|---|---|---|
| P1 / Critical | High / Critical | Confirmed breach of sensitive data; significant operational disruption; active attacker in environment; regulatory notification likely; executive or board notification required. | Immediate — within 1 hour |
| P2 / High | High | Significant threat with contained impact; single system compromised; potential data exposure under investigation; limited business disruption. | Within 4 hours |
| P3 / Medium | Medium | Confirmed policy violation without major impact; detected intrusion attempt that was blocked; limited system affected; no sensitive data at risk. | Within 24 hours |
| P4 / Low | Low | Security event with minimal impact; single user account issue; policy violation without malicious intent; informational. | Within 72 hours |
Classification Criteria — What CISM Tests
Severity classification is not just a technical judgment. CISM emphasizes four business-oriented criteria:
- Business impact: How much does this disrupt the organization's ability to conduct its business? Revenue impact, operational impact, reputational impact.
- Data sensitivity: What categories of data are at risk? Regulated data (PII, PHI, cardholder data), trade secrets, internal confidential information. Regulated data almost always escalates severity.
- System criticality: Is the affected system classified as critical in the BIA? Critical systems have shorter RTO/RPO requirements and warrant higher severity classification.
- Public exposure: Is this incident known or likely to become known externally — to customers, regulators, media? Public exposure immediately escalates the communications and legal dimensions of response.
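The four criteria above can be sketched as a scoring function. The weights, thresholds, and P1–P4 cutoffs below are illustrative assumptions, not ISACA-prescribed values; the point is that severity is a pre-defined, repeatable mapping rather than an ad hoc judgment made mid-incident.

```python
def classify_severity(business_impact: int, regulated_data: bool,
                      critical_system: bool, public_exposure: bool) -> str:
    """Map the four CISM classification criteria to a P1-P4 tier.

    business_impact is a 0-3 judgment (none/low/moderate/severe).
    Weights and cutoffs are illustrative, not ISACA-prescribed.
    """
    score = business_impact
    if regulated_data:
        score += 2   # regulated data almost always escalates severity
    if critical_system:
        score += 1   # short RTO/RPO per the BIA warrants higher severity
    if public_exposure:
        score += 2   # escalates the legal and communications dimensions
    if score >= 5:
        return "P1"
    if score >= 3:
        return "P2"
    if score >= 1:
        return "P3"
    return "P4"

# Confirmed breach of regulated data on a critical, publicly known system:
print(classify_severity(3, True, True, True))    # P1
# Blocked intrusion attempt, nothing sensitive at risk:
print(classify_severity(1, False, False, False)) # P3
```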
Escalation Procedures
Escalation defines who is notified, at what severity level, through what channel, and within what time frame. An escalation matrix is a governance artifact — it removes ambiguity during the high-pressure, time-compressed environment of an active incident. The information security manager is responsible for ensuring it exists, is tested, and is current.
| Severity | Who Is Notified | Time Frame | Channel |
|---|---|---|---|
| P1 / Critical | CISO, CIO, CEO, Legal, Board Chair (if data breach), PR/Comms, IR Lead, relevant regulators | Within 1 hour of classification | Phone — not email |
| P2 / High | CISO, IR Lead, Legal, Business unit head of affected area | Within 4 hours | Phone + written brief |
| P3 / Medium | IR Lead, Security manager, affected system owner | Within 24 hours | Email + ticketing system |
| P4 / Low | Security team, system owner | Within 72 hours | Ticketing system |
A common exam trap: escalation should go up the chain quickly and step down as the incident is contained — not wait until the incident is fully understood. Senior leadership should be notified early with incomplete information ("we have a potential P1 breach under investigation") rather than late with complete information. The cost of over-notifying a P1 that turns out to be a P2 is a brief interruption to an executive's schedule. The cost of under-notifying a genuine P1 is organizational, legal, and reputational damage.
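An escalation matrix works best when it is encoded as data rather than tribal knowledge. In this sketch the contacts, windows, and channels mirror the table above but are placeholders; a real matrix comes from the approved IRP, and `escalation_deadline` is an invented helper name.

```python
from datetime import datetime, timedelta

# Escalation matrix as data. Values are illustrative placeholders.
ESCALATION = {
    "P1": {"notify": ["CISO", "CIO", "CEO", "Legal", "PR", "IR Lead"],
           "within_hours": 1, "channel": "phone"},
    "P2": {"notify": ["CISO", "IR Lead", "Legal", "BU head"],
           "within_hours": 4, "channel": "phone + written brief"},
    "P3": {"notify": ["IR Lead", "Security manager", "System owner"],
           "within_hours": 24, "channel": "email + ticket"},
    "P4": {"notify": ["Security team", "System owner"],
           "within_hours": 72, "channel": "ticket"},
}

def escalation_deadline(severity: str, classified_at: datetime) -> datetime:
    """Latest time by which the notification must have occurred."""
    return classified_at + timedelta(hours=ESCALATION[severity]["within_hours"])

classified = datetime(2024, 4, 21, 9, 0)
print(escalation_deadline("P1", classified))  # 2024-04-21 10:00:00
```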
Incident Response Team Roles
CISM tests the organizational structure of an incident response capability — specifically, who is involved and what their function is. The answer is not just technical staff. Effective IR requires a cross-functional team activated and coordinated by the information security manager.
- CISO / Information Security Manager: Program owner. Coordinates the overall response, ensures escalation occurs, interfaces with executive leadership, makes resource allocation decisions.
- IR Lead / Incident Commander: Manages the technical investigation and response. Coordinates the technical team, tracks investigation status, directs containment and eradication actions.
- Legal Counsel: Advises on regulatory notification obligations, protects attorney-client privilege during investigation, coordinates with law enforcement if required, manages evidence preservation.
- PR / Communications: Manages external messaging. Coordinates with legal on what can and cannot be disclosed. Prepares customer notifications, media statements, and public disclosures.
- HR: Involved when the incident involves an employee, whether as victim, suspect, or witness. Manages personnel actions, disciplinary procedures, employee communications.
- Technical Responders: Forensic analysts, network engineers, system administrators who perform the actual investigation, containment, and remediation work.
- Executive Sponsor: C-suite or board-level sponsor who provides organizational authority and resources. May be required to authorize extraordinary measures (taking systems offline, engaging outside counsel, public disclosure).
- Business Unit Leaders: Represent the business units affected. Make business decisions about system availability trade-offs during containment.
First Responder Actions — Preserve Before You Remediate
The first technical responder to an incident faces competing pressures: restore service quickly, investigate thoroughly, preserve evidence, and prevent further harm. CISM tests whether candidates understand the priority ordering from a governance perspective.
The cardinal rule: preserve evidence before remediating. Actions that remediate an incident may destroy forensic evidence required for legal proceedings, insurance claims, regulatory inquiries, or understanding of the root cause. First responders must:
- Document the state of the system before any action is taken — screenshots, log captures, process lists
- Follow chain-of-custody procedures for any evidence collected
- Avoid actions that modify or destroy volatile data (do not reboot without capturing memory; do not delete files; do not run cleanup tools)
- Contain rather than eradicate until forensic collection is complete
- Immediately notify legal counsel when criminal activity or regulatory exposure is suspected
Communication Protocols
Communication during an incident is not ad hoc — it must follow pre-defined protocols that ensure accuracy, consistency, and appropriate authorization. CISM distinguishes between internal and external communications.
Internal communications during an active incident:
- Executive briefings at defined intervals (e.g., every 2 hours for P1) with structured status updates
- Board notification when incident meets defined thresholds (confirmed data breach, major operational disruption, regulatory exposure)
- Employee communications managed through HR and legal — employees should not be left to speculate or turn to external sources for information about an incident affecting their organization
- All communications through secure channels — do not discuss incident details on systems that may be compromised
External communications require legal review before release:
- Regulators: notification timelines are legally mandated (GDPR: 72 hours; SEC: 4 business days for material incidents; state breach laws vary)
- Customers and affected parties: notification drafted by legal and PR, authorized by executive sponsor
- Law enforcement: engagement decision made by legal counsel in consultation with CISO and CEO
- Cyber insurance carrier: must be notified per policy requirements, typically immediately upon confirmed incident
- Media: no uncoordinated statements — all media contact routed through designated spokesperson
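The mandated notification clocks in the list above can be computed mechanically. This sketch assumes the GDPR 72-hour clock runs from awareness of the breach and the SEC four-business-day clock runs from the materiality determination, and it ignores public holidays; `gdpr_deadline` and `sec_deadline` are invented helper names.

```python
from datetime import datetime, timedelta

def gdpr_deadline(aware_at: datetime) -> datetime:
    """GDPR: supervisory authority notified within 72 hours of awareness."""
    return aware_at + timedelta(hours=72)

def sec_deadline(material_at: datetime) -> datetime:
    """SEC: disclosure within 4 business days of the materiality determination.

    Simplified calendar: Mon-Fri are business days, holidays ignored.
    """
    current, business_days = material_at, 0
    while business_days < 4:
        current += timedelta(days=1)
        if current.weekday() < 5:   # Monday=0 .. Friday=4
            business_days += 1
    return current

aware = datetime(2024, 4, 19, 16, 30)   # a Friday afternoon
print(gdpr_deadline(aware))             # 2024-04-22 16:30:00 (Monday)
print(sec_deadline(aware))              # 2024-04-25 16:30:00 (Thursday)
```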
Domain 4 exam questions consistently test whether you understand the distinction between management oversight of technical response and technical response itself. CISM will never ask you which log to analyze or which forensic tool to use. It will ask: who should the CISO notify, what should the escalation threshold be, how should the IR plan be structured, what is the information security manager's role during an active incident. If you find yourself answering with a technical action, reconsider — the correct answer is almost always a governance, communication, or oversight action.
Session 3 — Digital Forensics & Post-Incident Review (Day 26, Wed Apr 22)
Chain of Custody
Chain of custody is the documented, unbroken record of who had possession of evidence, when, and what was done with it. It is the mechanism that makes digital evidence admissible in legal proceedings and credible in regulatory inquiries. For the information security manager, chain of custody is a governance requirement — the IR plan must mandate it, the team must be trained on it, and it must be enforced from the moment evidence is collected.
Chain of custody documentation includes:
- Identification of the evidence (description, serial number, hash value)
- Date, time, and location of collection
- Identity of the person who collected it
- Transfer records — each hand-off is documented with signatures and timestamps
- Storage conditions and access controls applied to the evidence
- Any analysis performed on the evidence, by whom, and when
A broken chain of custody does not necessarily mean evidence is wrong — but it means evidence can be challenged as potentially tampered with. In legal proceedings, this can destroy the value of otherwise conclusive digital evidence. The information security manager must ensure that forensic procedures are built into the IR plan before an incident occurs, not improvised during one.
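A custody record is fundamentally an append-only log plus a content hash fixed at collection time. The `Evidence` and `CustodyEntry` structures below are an illustrative sketch, not a standard evidence format: re-hashing the artifact later and comparing against the collection-time hash is what lets the organization demonstrate integrity.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class CustodyEntry:
    handler: str
    action: str      # collected / transferred / analyzed / stored
    timestamp: str   # ISO-8601

@dataclass
class Evidence:
    description: str
    sha256: str                                 # hash fixed at collection time
    chain: list = field(default_factory=list)   # ordered, append-only log

    def verify(self, data: bytes) -> bool:
        """Re-hash the artifact; a mismatch suggests tampering or corruption."""
        return hashlib.sha256(data).hexdigest() == self.sha256

disk_image = b"...raw image bytes..."
ev = Evidence("Laptop disk image, asset #4512",
              hashlib.sha256(disk_image).hexdigest())
ev.chain.append(CustodyEntry("A. Analyst", "collected", "2024-04-22T09:14:00Z"))
ev.chain.append(CustodyEntry("B. Examiner", "transferred", "2024-04-22T11:02:00Z"))
print(ev.verify(disk_image))   # True -- integrity demonstrable
print(ev.verify(b"tampered"))  # False
```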
Forensic Evidence Collection — Order of Volatility
Digital evidence exists on storage media with varying lifespans. The order of volatility principle holds that evidence should be collected from most volatile (shortest lifespan) to least volatile. This ensures that the most transient data — which is often the most valuable for understanding an active attack — is captured before normal system operations or containment actions destroy it.
General order from most to least volatile:
- CPU registers and cache
- RAM (system memory) — active connections, running processes, decryption keys, malware in memory
- Network state — active connections, routing tables, ARP cache
- Running processes and open files
- Disk storage — files, logs, configuration
- Remote logging and monitoring systems
- Archival media — backups, offline storage
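The ordering above can be expressed as a simple collection plan: capture whatever sources exist on the host, most volatile first. The source names below are placeholder labels, not forensic tool commands.

```python
# Order-of-volatility collection plan: most volatile first.
VOLATILITY_ORDER = [
    "cpu_registers_cache",
    "ram",
    "network_state",
    "running_processes",
    "disk",
    "remote_logs",
    "archival_media",
]

def collection_plan(available_sources: set) -> list:
    """Order whichever sources exist on this host by volatility."""
    return [s for s in VOLATILITY_ORDER if s in available_sources]

print(collection_plan({"disk", "ram", "archival_media"}))
# ['ram', 'disk', 'archival_media'] -- memory is captured before disk
```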
CISM does not test forensic tool usage. It tests whether you understand that the information security manager must ensure the IR plan requires volatile data capture before containment actions that might destroy it — and that legal counsel should be consulted before any forensic action when criminal prosecution or litigation is possible.
When to Involve Law Enforcement
The decision to involve law enforcement is a business decision made by legal counsel and senior leadership — not a unilateral technical decision. The information security manager's role is to inform that decision with facts about the incident and to ensure the organization is prepared to cooperate if engagement is decided upon. Consider law enforcement engagement when:
- Criminal activity is suspected or confirmed (unauthorized computer access, theft, fraud)
- National security or critical infrastructure implications exist
- The organization needs law enforcement's investigative capabilities or legal authority
- Regulatory or contractual requirements mandate reporting to law enforcement (some sectors have these)
Factors that may weigh against immediate law enforcement engagement:
- Ongoing law enforcement investigations require preserving the incident state — may delay remediation
- Investigation may become public, creating reputational or business risk
- Attribution is uncertain — premature engagement may not lead to prosecution
Legal counsel must be involved in this decision. The information security manager should never engage law enforcement without legal guidance.
Legal Holds and Evidence Preservation
A legal hold (also called a litigation hold or preservation notice) is a directive, issued by legal counsel, requiring the organization to preserve all information potentially relevant to anticipated or ongoing litigation or regulatory investigation. When an incident triggers potential legal action:
- Legal counsel issues the hold — scope includes email, logs, documents, system images, communications about the incident
- Normal data retention and deletion schedules are suspended for in-scope data
- IT and security teams must implement technical controls to prevent deletion of in-scope data
- Custodians of relevant data are notified and trained on their obligations
- The hold remains in effect until legal counsel lifts it — which may be months or years
The information security manager's role is to ensure that legal hold procedures exist in the IR plan and that the organization can execute them rapidly when required. An organization that deletes relevant data after receiving legal notice faces potential obstruction of justice exposure that is far worse than the original incident.
Post-Incident Review (PIR) / Lessons Learned
The post-incident review is one of the most important — and most frequently neglected — phases of incident management. It is where the organization converts a reactive response into a proactive improvement. CISM tests the purpose, content, and outcomes of PIRs.
Purpose: The PIR exists to improve the organization's future incident response capability — not to assign blame. A blame-focused PIR creates incentives to conceal information, deflect responsibility, and minimize findings. An improvement-focused PIR surfaces root causes honestly and creates actionable remediation. The information security manager must set and enforce this cultural norm.
Timing: PIRs should be conducted within a defined window after incident resolution — typically within 5–15 business days while details are still fresh, but after the immediate recovery pressure has passed. Major incidents may warrant multiple PIRs: an initial rapid review and a deeper retrospective.
What PIR documentation should contain:
- Incident timeline: Chronological reconstruction from initial event to resolution, with timestamps. Used to calculate detection time, response time, and time to containment.
- What happened: Factual description of the incident — initial vector, attacker actions, systems affected, data impacted.
- What worked: Controls, procedures, and team actions that performed as intended. Reinforces effective practices.
- What didn't work: Controls that failed, gaps in detection, process breakdowns, communication failures. This is the most valuable section and must be documented honestly.
- Gaps identified: Identified deficiencies in the security program, IR plan, staffing, technology, or training.
- Recommendations: Specific, actionable improvements with owners, timelines, and success criteria.
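The incident timeline item above implies three standard timing metrics. A minimal sketch, assuming four milestone timestamps (occurred, detected, contained, resolved) are recorded during the incident:

```python
from datetime import datetime

def pir_metrics(occurred: datetime, detected: datetime,
                contained: datetime, resolved: datetime) -> dict:
    """Derive PIR timing metrics from four timeline milestones.

    Milestone names are illustrative; the point is that a timestamped
    timeline makes detection and containment performance measurable.
    """
    hours = lambda a, b: (b - a).total_seconds() / 3600
    return {
        "time_to_detect_h": hours(occurred, detected),
        "time_to_contain_h": hours(detected, contained),
        "time_to_resolve_h": hours(occurred, resolved),
    }

m = pir_metrics(datetime(2024, 4, 20, 2, 0), datetime(2024, 4, 20, 8, 0),
                datetime(2024, 4, 20, 14, 0), datetime(2024, 4, 21, 2, 0))
print(m)  # {'time_to_detect_h': 6.0, 'time_to_contain_h': 6.0, 'time_to_resolve_h': 24.0}
```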
Root Cause Analysis
Root cause analysis (RCA) goes deeper than incident description — it asks why the incident was possible, not just what happened. The goal is to identify the fundamental control failures, architectural weaknesses, or process gaps that allowed the incident to occur and to propagate. Common RCA techniques include:
- 5 Whys: Repeatedly asking "why" to drill past symptoms to root causes. A patch wasn't applied (why?) because the patch management process didn't cover legacy systems (why?) because the asset inventory was incomplete (why?) because there was no formal asset management program...
- Fishbone / Ishikawa diagram: Categorizes potential root causes across people, process, technology, and environment dimensions.
- Timeline analysis: Maps decision points and actions against the incident timeline to identify where different choices would have changed the outcome.
Feeding PIR Findings Back Into the Program
A PIR that produces findings but no action is a compliance exercise, not a management process. CISM tests whether candidates understand how PIR findings drive program improvement. Findings must flow into:
- IR plan updates: Gaps in the plan are corrected. Roles clarified, escalation paths adjusted, communication protocols refined.
- Risk register updates: Newly identified risks are added. Risk ratings for existing entries may be updated based on evidence from the incident.
- Policy and control updates: Control failures identified in the PIR trigger remediation — patching processes, access controls, monitoring coverage.
- Training updates: Human factors identified in the PIR (poor first responder actions, communication failures, classification errors) are addressed through targeted training.
- Exercise scenarios: Major incidents provide realistic scenarios for future tabletop exercises and simulations.
Session 4 — Business Continuity & Disaster Recovery (Day 27, Thu Apr 23)
BCP vs. DRP — The Structural Relationship
The BCP/DRP distinction is one of the most consistently tested concepts in Domain 4. The exam tests both definitional precision and the hierarchical relationship between the two frameworks.
| Dimension | Business Continuity Plan (BCP) | Disaster Recovery Plan (DRP) |
|---|---|---|
| Focus | Maintaining business operations during a disruption — the people, processes, and alternative arrangements that keep the business running | Recovering IT systems, infrastructure, and data after a disruption |
| Scope | Enterprise-wide — encompasses all critical business functions, not just IT | IT and technology infrastructure — systems, data, connectivity |
| Relationship | The umbrella framework — BCP is the broader program | A component under BCP — DRP supports BCP objectives |
| Ownership | Business leadership — business continuity manager or equivalent | IT/technology function — often IT DR coordinator |
| Key Question | How does the business continue to function? | How do we restore the technology the business depends on? |
| Example | Manual order processing procedures during system outage; relocating staff to alternate site; vendor agreements for alternate supply | Failover to hot site; restoring from backup; network rerouting |
BCP is the umbrella; DRP is subordinate to it. When a question asks "what is the relationship between BCP and DRP?", the answer is that BCP is the broader framework and DRP is a component of it focused specifically on technology recovery. An organization can have a DRP without a BCP (though that would be incomplete), but a complete BCP necessarily includes DRP as one of its components.
Recovery Metrics — Memorize the Definitions and Relationships
The quantitative metrics that govern recovery planning are among the most heavily tested items in Domain 4. Know the definitions precisely and understand the mathematical and logical relationships between them.
Recovery Time Objective (RTO): The maximum amount of time the business can tolerate being without a specific function or system before the disruption causes unacceptable business impact. Example: "Our order management system must be restored within 4 hours."
Recovery Point Objective (RPO): The maximum amount of data loss the business can tolerate, expressed as a time interval. It defines how far back in time the organization must be able to restore data. Example: "We cannot lose more than 2 hours of transaction data." If the last backup was 6 hours ago and the system fails, the RPO of 2 hours has been violated.
Maximum Tolerable Downtime (MTD) / Maximum Allowable Downtime (MAD): The absolute maximum time a business function can be disrupted before the impact becomes irreversible — permanent customer loss, regulatory sanction, organizational failure. MTD is the ceiling that RTO must stay under.
Work Recovery Time (WRT): The time needed to return to normal operations after the system is restored — data re-entry, validation, testing before returning to production. Often overlooked in planning.
The Key Relationship: RTO + WRT ≤ MTD
The combined time to restore technology and complete work recovery must not exceed the maximum tolerable downtime.
RTO is about time — how long can you be down? RPO is about data — how much data can you lose? An aggressive RPO (near-zero data loss) requires near-continuous replication. An aggressive RTO (very short outage) requires a hot standby environment. These are independent dimensions with independent cost implications. The exam will present scenarios where you must identify which metric is being violated or what investment is needed to meet a stated requirement.
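The RTO + WRT ≤ MTD relationship is easy to verify mechanically. A one-line feasibility check, with all values expressed in hours:

```python
def recovery_targets_feasible(rto_h: float, wrt_h: float, mtd_h: float) -> bool:
    """The Domain 4 constraint: RTO + WRT must not exceed MTD."""
    return rto_h + wrt_h <= mtd_h

# RTO 4h + WRT 2h = 6h against an 8h MTD: feasible.
print(recovery_targets_feasible(4, 2, 8))   # True
# RTO 6h + WRT 4h = 10h against an 8h MTD: targets must be renegotiated
# or recovery investment increased.
print(recovery_targets_feasible(6, 4, 8))   # False
```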
BCP Development Process
The BCP development lifecycle is sequential and each phase builds on the previous. CISM tests the process and the role of the Business Impact Analysis as its foundation.
- Business Impact Analysis (BIA): The foundational step. Identifies critical business processes, their dependencies, and the impact of disruption over time. The BIA is what assigns RTO/RPO/MTD values — these are business decisions, not IT decisions. The BIA output prioritizes recovery efforts.
- Risk Assessment: Identifies threats that could cause disruption — natural disasters, infrastructure failure, cyberattack, supply chain failure. Evaluates likelihood and impact to prioritize the scenarios the BCP must address.
- Strategy Development: Selects recovery strategies for each critical function based on BIA requirements and risk assessment findings. Technology recovery strategies (hot/warm/cold site), staff recovery strategies (remote work, alternate site), vendor strategies.
- Plan Development: Documents the strategies in actionable plans. Defines roles, procedures, contact lists, resource requirements, activation criteria, and authority levels.
- Testing: Validates that plans work as designed before they are needed. Exercises identify gaps, validate assumptions, and train team members.
- Maintenance: Keeps plans current as the business changes — new systems, changed processes, staff turnover, new threats. Plans that aren't maintained are not plans — they are historical documents.
Business Impact Analysis — What It Produces
The BIA is the most important input to BCP/DR planning. Without it, recovery strategies are guesses and RTO/RPO values are arbitrary. The BIA must be conducted with business unit leaders — not by the IT or security team in isolation. Key BIA outputs:
- List of critical business processes ranked by priority
- Dependencies for each process (systems, staff, vendors, facilities)
- Impact of disruption over time (financial, operational, reputational, regulatory)
- RTO and RPO requirements for each critical process
- MTD thresholds
- Recovery priorities — which functions must be restored first
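These outputs can be pictured as data. The sketch below uses hypothetical process names and field names (`priority`, `mtd_hours`, etc.) purely for illustration: recovery order follows business priority, and a sanity check confirms each RTO leaves headroom below its MTD, since recovery must complete before the disruption becomes intolerable.

```python
# Minimal sketch of a BIA output as data; all names and values are
# illustrative assumptions, not from any real BIA.
bia_output = [
    {"process": "order fulfillment", "priority": 1, "mtd_hours": 24,
     "rto_hours": 8, "rpo_hours": 4, "dependencies": ["ERP", "warehouse staff"]},
    {"process": "payroll", "priority": 2, "mtd_hours": 120,
     "rto_hours": 72, "rpo_hours": 24, "dependencies": ["HR system"]},
    {"process": "marketing site", "priority": 3, "mtd_hours": 72,
     "rto_hours": 48, "rpo_hours": 24, "dependencies": ["CMS", "CDN vendor"]},
]

# Recover highest-priority processes first; break ties by shortest MTD,
# since their disruption becomes intolerable soonest.
recovery_order = sorted(bia_output, key=lambda p: (p["priority"], p["mtd_hours"]))

for p in recovery_order:
    # RTO must sit below MTD, or the plan guarantees an intolerable outage.
    assert p["rto_hours"] < p["mtd_hours"], \
        f"{p['process']}: RTO must leave headroom below MTD"

print([p["process"] for p in recovery_order])
# ['order fulfillment', 'payroll', 'marketing site']
```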
Recovery Site Strategies
| Site Type | Description | Activation Time | Cost | Best For |
|---|---|---|---|---|
| Hot Site | Fully operational duplicate of the production environment — hardware, software, data replication current. Ready for immediate cutover. | Minutes to hours | Highest — paying for full duplicate infrastructure continuously | Critical systems with very aggressive RTO (hours) |
| Warm Site | Pre-configured hardware and software in place, but data must be restored from backup. Not operational until data is loaded. | Hours to days | Moderate | Systems with moderate RTO (1–3 days) |
| Cold Site | Physical space and basic infrastructure (power, cooling, connectivity) available but no hardware, software, or data. Everything must be sourced and installed. | Days to weeks | Lowest | Non-critical systems where extended outage is acceptable |
| Reciprocal Agreement | Two organizations agree to host each other's operations in the event of a disaster. Rarely works as planned — both organizations may be affected by the same regional disaster; capacity rarely exists. | Variable | Low (but unreliable) | Small organizations — better than nothing, but not recommended as primary strategy |
| Cloud-Based Recovery | Recovery infrastructure provisioned on-demand in cloud platforms. Can behave like a hot site at warm-site cost when configured correctly. Dominant modern approach. | Minutes to hours | Pay-per-use — efficient when not in use | Organizations with cloud-compatible workloads and aggressive RTO |
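The cost/RTO trade-off in the table can be expressed as a selection rule: pick the cheapest strategy whose activation time still fits the RTO. The activation thresholds below are illustrative assumptions drawn loosely from the table's ranges, not ISACA guidance.

```python
# Cheapest strategies first; activation times are rough illustrative
# assumptions in hours (cold: ~2 weeks, warm: ~3 days, hot: ~1 hour).
SITE_STRATEGIES = [
    ("cold site", 24 * 14),
    ("warm site", 24 * 3),
    ("hot site / cloud recovery", 1),
]

def cheapest_viable_site(rto_hours: float) -> str:
    """Pick the lowest-cost option that can activate within the RTO.

    If even the fastest option cannot meet the RTO, return it anyway:
    at that point the RTO itself needs renegotiating with the business.
    """
    for name, activation_hours in SITE_STRATEGIES:
        if activation_hours <= rto_hours:
            return name
    return SITE_STRATEGIES[-1][0]

print(cheapest_viable_site(rto_hours=72))  # warm site
print(cheapest_viable_site(rto_hours=2))   # hot site / cloud recovery
```

The design point is the one the text makes: tightening RTO does not make recovery incrementally more expensive; it forces a jump to the next, costlier strategy tier.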
BCP/DRP Testing Methods
A plan that has never been tested is a hypothesis. Testing converts a plan from a document into a verified capability. CISM tests the spectrum of testing approaches — their rigor, their risk, and their appropriate use.
| Testing Method | Description | Business Risk | Value |
|---|---|---|---|
| Document Review / Checklist | Reviewers read through the plan and verify that all required elements are present and current. | None | Low — verifies completeness, not effectiveness |
| Tabletop Exercise | Key stakeholders gather to walk through a simulated disaster scenario verbally. No systems are moved; decisions and procedures are discussed. | None | Moderate — surfaces process gaps and communication issues without operational risk |
| Walkthrough / Structured Walkthrough | Team members walk through the plan step-by-step in detail, identifying gaps and ambiguities. | None | Moderate |
| Simulation / Functional Exercise | A realistic scenario is simulated and teams execute their roles as they would in a real event — communications are made, decisions taken — but production systems are not affected. | Low | High — tests execution under realistic conditions |
| Parallel Test | Recovery systems are brought up at the alternate site while production continues running in parallel. Validates that the recovery environment works without affecting production. | Low to moderate | High — validates technical recovery without operational risk |
| Full Interruption Test | Production systems are taken offline and the organization operates entirely from the recovery environment. The most realistic test. | High — production systems are offline; a failure in recovery is an actual disaster | Highest — the only test that truly validates end-to-end recovery capability |
CISM does not prescribe specific testing frequencies — these are determined by the organization's risk appetite, regulatory requirements, and the criticality of the systems covered. What CISM does emphasize: testing must occur regularly (at minimum annually), plans must be updated based on test findings, and the most critical systems should be tested more rigorously and more frequently. Full interruption tests are rarely conducted for critical production systems given the risk — but they may be required by regulators in certain sectors. The information security manager is responsible for establishing the testing program and ensuring findings drive plan improvement.
Session 5 — Crisis Communications & Program Integration (Day 28, Fri Apr 24)
Crisis Communications — The CISO's Responsibility
Crisis communication is not an afterthought to incident response — it is a parallel workstream that begins the moment a significant incident is confirmed and often determines how the incident defines the organization in its stakeholders' eyes. A technically perfect incident response that is communicated poorly — delayed disclosures, inaccurate statements, conflicting messages — can create secondary crises more damaging than the original incident.
The information security manager is responsible for ensuring that crisis communication capability exists before it is needed. This means working with legal, PR/communications, and executive leadership during the Prepare phase to establish protocols, draft template communications, identify spokespersons, and rehearse the communication process.
Communication Principles
Crisis communications must adhere to four core principles. These are consistent with ISACA's framing and appear in exam scenarios:
- Timely: Communicate as soon as reliable information is available — do not wait until the full picture is clear. Early notification with acknowledged uncertainty is preferable to delayed notification with complete information. Regulatory deadlines make timeliness legally mandatory.
- Accurate: Never speculate, minimize, or overstate. Inaccurate information in a crisis notification — particularly to regulators — compounds the original problem. It is acceptable to say "we are still investigating and will provide an update by [time]."
- Consistent: All communications must be coordinated. Different messages to different audiences create confusion, undermine trust, and create legal exposure. All statements must be aligned and approved before release.
- Appropriate to audience: Technical details belong in internal briefings, not customer notifications. Regulatory notifications have specific required elements. Board briefings require business impact framing, not technical forensic detail.
External Notification Requirements
Regulatory notification obligations are mandatory, have defined deadlines, and apply regardless of the organization's preference. The information security manager must know these obligations in advance and ensure the IR plan's escalation timelines are compatible with meeting them.
| Regulatory Framework | Notification Requirement | Deadline | Who Is Notified |
|---|---|---|---|
| GDPR | Personal data breach affecting rights and freedoms of individuals | 72 hours from discovery (to supervisory authority); "without undue delay" to affected individuals when high risk | Data Protection Authority; affected individuals |
| SEC Rules (US public companies) | Material cybersecurity incidents | 4 business days from determining materiality | SEC (Form 8-K); investors via public disclosure |
| US State Breach Laws | Unauthorized access to personal information of state residents | Varies by state, commonly 30 to 90 days; several states require notification within 30 days | State Attorney General; affected residents |
| HIPAA (US healthcare) | Breach of unsecured protected health information | 60 days from discovery; breaches affecting 500+ individuals require media notice | HHS; affected individuals; media (if 500+) |
| PCI DSS | Suspected or confirmed breach of cardholder data | Immediately upon suspicion | Payment card brands (Visa, Mastercard); acquiring bank |
| NIS2 Directive (EU) | Significant incidents affecting essential/important entities | Early warning within 24 hours; incident notification within 72 hours; final report within 1 month | National CSIRT or competent authority |
Organizations that attempt to manage regulatory notification as a discretionary decision — waiting to see if the incident is "bad enough" to report — face significantly elevated regulatory exposure. GDPR supervisory authorities have levied substantial fines specifically for delayed notification, separate from any fines related to the underlying breach. The information security manager must ensure legal counsel is engaged immediately in any incident with potential regulatory notification implications, and that the IR plan explicitly captures notification timelines and responsibilities.
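Because these clocks start at discovery (or, for the SEC, at the materiality determination), an IR plan can precompute every deadline the moment an incident is confirmed. The sketch below uses a hypothetical discovery timestamp and simplifies "business days" to skipping weekends only (no holiday calendar); the windows follow the table above.

```python
from datetime import datetime, timedelta

# Hypothetical discovery time (a Thursday), purely for illustration.
DISCOVERY = datetime(2025, 4, 24, 9, 30)

def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by N business days, skipping Saturdays and Sundays only.
    A real implementation would also consult a holiday calendar."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            days -= 1
    return current

deadlines = {
    "NIS2 early warning": DISCOVERY + timedelta(hours=24),
    "GDPR supervisory authority": DISCOVERY + timedelta(hours=72),
    "SEC Form 8-K (from materiality determination)": add_business_days(DISCOVERY, 4),
    "HIPAA individual notice": DISCOVERY + timedelta(days=60),
}

for obligation, due in sorted(deadlines.items(), key=lambda kv: kv[1]):
    print(f"{due:%Y-%m-%d %H:%M}  {obligation}")
```

Sorting the obligations by due date gives the escalation team a single ordered checklist, which is exactly what the IR plan's notification section should contain.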
Media Relations — Structure and Discipline
Media engagement during a security incident must be structured, disciplined, and coordinated. Uncontrolled media contact — a technical responder speaking to a reporter, a social media post from an employee, an executive making an off-the-cuff comment — can undermine legal strategy, provide attackers with intelligence, and create false narratives that become permanent.
Pre-incident media relations preparation includes:
- Designated spokesperson: One individual (typically the CEO, CCO, or designated PR lead) is authorized to speak to media. All other employees are trained to route media inquiries to this individual.
- Template statements: Pre-drafted initial statements ("We are aware of a security incident and are investigating. The safety of our customers' information is our priority. We will provide updates as the investigation progresses.") that can be issued quickly without revealing sensitive details.
- Legal review protocol: All media statements require legal review before release. No exceptions.
- Social media monitoring: Active monitoring of social media for breach-related posts, misinformation, and public sentiment during an active incident.
Internal Communications During a Crisis
Internal communications are frequently underplanned relative to external communications, but they matter significantly for maintaining operational coherence and employee trust.
- Employee notification: Employees will learn about a security incident one way or another, whether through internal channels, news media, or customer complaints, so it is far better that they hear it first from the organization. Information shared with employees must be controlled — enough to enable appropriate action and prevent panic, not so much that it creates legal exposure or enables social media disclosure.
- Board briefing during a crisis: The board must be notified for material incidents. Board briefings are structured communications — executive summary of what happened, business impact, response status, regulatory exposure, and what decisions or resources are needed from the board. The CISO typically delivers these in coordination with legal and the CEO.
- Crisis communication cadence: Establish a defined schedule for status updates — internal stakeholders should not be left wondering about status during an active P1 incident. Silence creates speculation.
Integration with the Overall Security Program
Domain 4 does not end with incident closure. CISM emphasizes that incident management is a closed-loop process — every incident provides information that should improve the broader security program. This integration is a governance responsibility, not a technical one.
Incident findings feed risk register updates. A confirmed attack vector that was not previously identified as a risk must be added to the risk register with an appropriate rating. A risk that was rated "low likelihood" but materialized must have its likelihood rating revisited. The risk register should reflect the organization's actual threat environment, not a theoretical one. Post-incident is the highest-confidence moment to update it.
Lessons learned feed policy and procedure updates. If the incident was enabled by a policy gap (no requirement for MFA on privileged accounts, no restriction on USB devices in sensitive areas, no mandatory patch timeline for critical systems) then the policy must be updated. If response was hampered by a procedural gap (no defined escalation path, no forensic collection procedure, unclear authority to take systems offline), the IR plan must be updated.
Threat intelligence from incidents improves detection. Indicators of compromise (IOCs) identified during an incident — IP addresses, domain names, file hashes, attack techniques — are fed into monitoring tools to improve detection of similar attacks against other parts of the organization or future recurrence. The security operations function should have a formal process for operationalizing threat intelligence from incident investigations.
Program metrics include incident response effectiveness. Key performance indicators for the incident management program include: mean time to detect (MTTD), mean time to respond (MTTR), mean time to contain, incident recurrence rate, percentage of incidents meeting classification accuracy targets, percentage of notifications meeting regulatory deadlines. These metrics must be tracked, reported to leadership, and used to drive program improvement.
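A minimal sketch of how these KPIs fall out of incident records. The field names (`occurred`, `detected`, `resolved`, `deadline_met`) are assumptions for illustration; in practice the data would come from a ticketing or SOAR platform.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; timestamps and outcomes are invented.
incidents = [
    {"occurred": datetime(2025, 3, 1, 2, 0),  "detected": datetime(2025, 3, 1, 8, 0),
     "resolved": datetime(2025, 3, 2, 8, 0),   "deadline_met": True},
    {"occurred": datetime(2025, 3, 10, 9, 0), "detected": datetime(2025, 3, 10, 9, 30),
     "resolved": datetime(2025, 3, 10, 17, 30), "deadline_met": True},
    {"occurred": datetime(2025, 3, 20, 0, 0), "detected": datetime(2025, 3, 21, 0, 0),
     "resolved": datetime(2025, 3, 23, 0, 0),  "deadline_met": False},
]

def hours(delta):
    return delta.total_seconds() / 3600

# MTTD: occurrence -> detection. MTTR: detection -> resolution.
mttd = mean(hours(i["detected"] - i["occurred"]) for i in incidents)
mttr = mean(hours(i["resolved"] - i["detected"]) for i in incidents)
pct_deadlines = 100 * sum(i["deadline_met"] for i in incidents) / len(incidents)

print(f"MTTD: {mttd:.1f} h, MTTR: {mttr:.1f} h, "
      f"notifications on time: {pct_deadlines:.0f}%")
```

The value for leadership reporting is the trend, not the snapshot: an MTTD that falls quarter over quarter is evidence the detection investment is working.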
Continuous Improvement — How the IR Program Matures
An incident management program at initial capability looks very different from a mature one. CISM expects the information security manager to understand program maturity as an ongoing investment, not a one-time achievement. Program maturity progresses through recognizable stages:
- Initial / Ad hoc: Incidents are responded to without a documented plan. Response is improvised, inconsistent, and dependent on specific individuals. No metrics. Learning is informal.
- Documented / Repeatable: A basic IR plan exists and is followed. Roles are defined. Escalation procedures exist. Classification schema is in place. Basic metrics are tracked.
- Defined / Managed: The plan is regularly tested and updated. Metrics are tracked and reported to management. PIRs are consistently conducted and findings are actioned. Integration with BCP/DR is established.
- Quantitatively managed: Program metrics drive investment decisions. MTTD and MTTR are trending in the right direction. Detection coverage is measured and known. Threat intelligence is integrated into detection.
- Optimizing: Continuous improvement is embedded. Exercises include advanced scenarios (insider threats, supply chain compromises, multi-vector attacks). Threat hunting is proactive. The program is a recognized organizational asset.
The information security manager's role is to understand where the program sits on this maturity curve, communicate that honestly to leadership, and make the investment case for moving to the next level based on business risk and regulatory requirements — not on technical preferences.
Before moving to flashcards, verify you can answer these without notes:
- What is the difference between an event, an alert, and an incident?
- Name the seven phases of the incident management lifecycle in order.
- What are the four classification criteria CISM emphasizes?
- Who must approve risk acceptance for a P1 incident escalation?
- What is chain of custody and why does it matter?
- What is the order of volatility principle and why does it matter for IR planning?
- What is the difference between RTO and RPO?
- What is the relationship between MTD and RTO?
- How does BCP relate to DRP hierarchically?
- What are the six BCP testing methods from least to most rigorous?
- What is the GDPR notification deadline?
- What should a PIR document and what should it NOT do?
- Name three ways incident findings integrate back into the security program.
- What is the difference between the incident management program and the incident response capability?