Incident Response
When something goes wrong, the difference between a survivable incident and a catastrophic one is preparation. The runbook should not be written at 3 a.m. while the SIEM is on fire. This guide covers building response capability, handling incidents effectively, and learning from them afterward.
The Incident Response Lifecycle
Incident response is not a single action. It is a cycle with five phases: detection, containment, eradication, recovery, and lessons learned.
Detection
Activities
- Monitoring alerts from SIEM, EDR, or IDS
- User reports of suspicious activity
- Automated anomaly detection
- Third-party notifications
- Threat intelligence feeds
Key Questions
- What triggered the alert?
- Who reported it and when?
- What systems are potentially affected?
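One way to keep these questions from being skipped under pressure is to capture them in a structured intake record. The sketch below is illustrative only: the `AlertIntake` class, its field names, and the sample alert are assumptions, not part of any particular SIEM or ticketing tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AlertIntake:
    """Hypothetical intake record mirroring the key detection questions."""
    trigger: str                      # what triggered the alert
    reported_by: str                  # who reported it
    reported_at: datetime             # when it was reported
    affected_systems: list[str] = field(default_factory=list)

    def unanswered(self) -> list[str]:
        """Return the names of fields that are still empty."""
        gaps = []
        if not self.trigger.strip():
            gaps.append("trigger")
        if not self.reported_by.strip():
            gaps.append("reported_by")
        if not self.affected_systems:
            gaps.append("affected_systems")
        return gaps


# Sample data is made up for illustration.
intake = AlertIntake(
    trigger="EDR alert: unsigned binary executed on ws-042",
    reported_by="",
    reported_at=datetime(2024, 1, 15, 3, 0, tzinfo=timezone.utc),
)
print(intake.unanswered())
```

A responder can see at a glance which questions remain open before the incident is escalated.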
Containment
Activities
- Isolating affected systems
- Blocking malicious IPs or accounts
- Changing compromised credentials
- Enabling additional logging
- Activating backup procedures
Key Questions
- What is the blast radius?
- What can we do right now to stop the spread?
- What do we need to preserve for investigation?
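The blast radius question can be approximated if you have some record of which systems can reach which. The sketch below assumes a hypothetical `graph` dictionary of host-to-host connectivity (the host names are invented) and walks it breadth-first from the compromised host. It estimates reachability only; it is not evidence that an attacker actually moved to those hosts.

```python
from collections import deque


def blast_radius(graph: dict[str, list[str]], start: str, max_hops: int) -> dict[str, int]:
    """Return every host reachable from `start` within `max_hops`, with its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        if seen[host] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(host, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[host] + 1
                queue.append(neighbor)
    return seen


# Hypothetical connectivity map: key can reach each host in its list.
graph = {
    "ws-042": ["file-01", "print-01"],
    "file-01": ["db-01"],
    "db-01": ["backup-01"],
}
radius = blast_radius(graph, "ws-042", max_hops=2)
print(radius)
```

Hosts at one hop are the first candidates for isolation; hosts at two hops are candidates for additional logging.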
Eradication
Activities
- Identifying root cause
- Removing malware and backdoors
- Patching exploited vulnerabilities
- Resetting compromised accounts
- Verifying system integrity
Key Questions
- How did the attacker get in?
- What did they leave behind?
- What needs to be rebuilt versus cleaned?
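Verifying system integrity and deciding what to rebuild often comes down to comparing files against a known-good baseline. The following is a minimal sketch, assuming you already hold a trusted mapping of relative paths to SHA-256 digests (the `baseline` dictionary and the file names here are invented for the demo). A matching hash only shows that a file equals the baseline, so the baseline itself must come from a source the attacker could not touch.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def integrity_report(root: Path, baseline: dict[str, str]) -> dict[str, list[str]]:
    """Compare files under `root` with a baseline of relative path -> SHA-256."""
    current = {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in root.rglob("*")
        if p.is_file()
    }
    return {
        "modified": sorted(k for k in current if k in baseline and current[k] != baseline[k]),
        "unexpected": sorted(k for k in current if k not in baseline),
        "missing": sorted(k for k in baseline if k not in current),
    }


# Demo on a throwaway directory; every file name below is made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "app.conf").write_text("port=443\n")
    (root / "run.sh").write_text("echo ok\n")
    baseline = {
        "app.conf": sha256_of(root / "app.conf"),
        "run.sh": sha256_of(root / "run.sh"),
        "notes.txt": "0" * 64,  # in the baseline but absent on disk
    }
    (root / "run.sh").write_text("echo tampered\n")   # simulate a modified file
    (root / "dropper.bin").write_bytes(b"\x00\x01")   # simulate a leftover file
    report = integrity_report(root, baseline)

print(report)
```

Files listed as modified or unexpected are candidates for "rebuild" rather than "clean".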
Recovery
Activities
- Restoring from clean backups
- Rebuilding compromised systems
- Verifying security controls
- Resuming normal operations
- Monitoring for recurrence
Key Questions
- Is the environment clean?
- Have we addressed the root cause?
- What monitoring do we need long-term?
Lessons Learned
Activities
- Documenting the timeline
- Identifying what worked and what did not
- Updating procedures
- Training the team
- Implementing improvements
Key Questions
- What controls failed?
- What would we do differently?
- How do we prevent recurrence?
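A documented timeline is easier to learn from when it yields a few numbers, such as how long the attacker went unnoticed and how long containment took. The sketch below assumes the scribe's notes have been reduced to hypothetical `(event, timestamp)` pairs; the event names and times are invented.

```python
from datetime import datetime, timedelta

# Entries as they might be recorded during the incident, not necessarily in order.
timeline = [
    ("contained", datetime(2024, 1, 15, 4, 30)),
    ("initial_access", datetime(2024, 1, 14, 22, 0)),
    ("detected", datetime(2024, 1, 15, 3, 0)),
]

# Sort chronologically for the written report.
ordered = sorted(timeline, key=lambda entry: entry[1])

# Derive the two intervals most reviews ask about.
stamps = dict(timeline)
time_to_detect: timedelta = stamps["detected"] - stamps["initial_access"]
time_to_contain: timedelta = stamps["contained"] - stamps["detected"]

for name, when in ordered:
    print(when.isoformat(), name)
print("time to detect:", time_to_detect)
print("time to contain:", time_to_contain)
```

Tracking these intervals across incidents gives a rough signal of whether procedure changes are helping.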
Common Incident Types
Different incidents require different responses. Understanding the patterns helps your team respond faster and more effectively.
Phishing
Indicators
- Suspicious emails to multiple users
- Users reporting credential prompts
- External emails bypassing filters
Response Steps
- Isolate affected workstations
- Reset compromised credentials
- Review email logs for scope
- Block sender and domain
- Notify potentially affected users
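Reviewing email logs for scope usually means finding everyone who received mail from the offending sender domain. The sketch below assumes the mail logs have been exported into a list of dictionaries with hypothetical `sender` and `recipient` keys; real gateway logs will use different field names, and the addresses here are invented.

```python
def phishing_scope(log: list[dict[str, str]], sender_domain: str) -> list[str]:
    """Return the distinct recipients of mail sent from `sender_domain`."""
    domain = sender_domain.lower()
    recipients = set()
    for entry in log:
        # Compare only the part after the last "@", case-insensitively.
        if entry["sender"].lower().rsplit("@", 1)[-1] == domain:
            recipients.add(entry["recipient"].lower())
    return sorted(recipients)


# Made-up log entries using reserved .test domains.
log = [
    {"sender": "it-support@login-portal.test", "recipient": "Ana@corp.test"},
    {"sender": "IT-Support@Login-Portal.test", "recipient": "ben@corp.test"},
    {"sender": "newsletter@vendor.test", "recipient": "cy@corp.test"},
    {"sender": "it-support@login-portal.test", "recipient": "ana@corp.test"},
]
exposed = phishing_scope(log, "login-portal.test")
print(exposed)
```

The resulting list is a starting point for credential resets and user notifications, not proof that each recipient clicked.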
Prevention Measures
Ransomware
Indicators
- Encrypted files
- Ransom notes
- Widespread file access issues
- Suspicious executables
Response Steps
- Isolate affected systems immediately
- Do NOT pay the ransom without leadership approval
- Identify ransomware variant
- Check for backups and determine restoration path
- Engage law enforcement and cyber insurance
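When scoping which files were encrypted, one commonly used heuristic is byte entropy: encrypted data looks close to random, so its Shannon entropy approaches 8 bits per byte. The sketch below is a rough triage aid under that assumption. The `looks_encrypted` helper and its 7.5 threshold are illustrative choices, and compressed formats such as archives or media files also score high, so treat a hit as a prompt to look closer rather than a verdict.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose entropy is at or above an assumed threshold."""
    return shannon_entropy(data) >= threshold


uniform_sample = bytes(range(256)) * 4          # every byte value equally likely
text_sample = b"quarterly report draft " * 50    # ordinary repetitive text

print(round(shannon_entropy(uniform_sample), 2), looks_encrypted(uniform_sample))
print(round(shannon_entropy(text_sample), 2), looks_encrypted(text_sample))
```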
Prevention Measures
Insider Threat
Indicators
- Unauthorized data access
- Large data transfers
- Access to systems outside normal role
- Policy violations
Response Steps
- Verify the alert is not a false positive
- Engage HR and legal
- Preserve evidence without alerting the person under investigation
- Document access patterns
- Plan response with legal counsel
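Checking whether a "large data transfer" alert is a false positive is easier when each user is compared against their own history instead of a single global limit. The sketch below assumes hypothetical per-user daily transfer volumes in megabytes; the `factor` of 5 is an arbitrary illustrative threshold, and the names and numbers are invented.

```python
from statistics import median


def unusual_transfers(
    history_mb: dict[str, list[float]],
    today_mb: dict[str, float],
    factor: float = 5.0,
) -> dict[str, float]:
    """Return users whose volume today exceeds `factor` times their own median."""
    flagged = {}
    for user, volume in today_mb.items():
        past = history_mb.get(user)
        if not past:
            continue  # no baseline for this user; handle separately
        baseline = median(past)
        if baseline > 0 and volume > factor * baseline:
            flagged[user] = volume / baseline  # how many times the usual volume
    return flagged


history_mb = {"alice": [100, 120, 110], "bob": [50, 60, 55]}
today_mb = {"alice": 115, "bob": 900, "carol": 4000}
flagged = unusual_transfers(history_mb, today_mb)
print(flagged)
```

Users with no history, like `carol` in this example, are skipped by design and need a separate review.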
Prevention Measures
Data Breach
Indicators
- Unauthorized access to sensitive data
- Data appearing elsewhere
- Credential misuse
- Anomalous database queries
Response Steps
- Contain the breach
- Identify what data was accessed
- Determine scope and affected individuals
- Notify legal and compliance
- Notify affected individuals per regulatory requirements
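Determining the affected individuals typically starts from audit records: which data subjects were touched by the compromised account during the incident window. The sketch below assumes a hypothetical audit export with `account`, `subject_id`, and `at` fields; real database or application audit logs will be shaped differently, and all values here are invented.

```python
from datetime import datetime


def affected_individuals(
    audit_rows: list[dict],
    account: str,
    start: datetime,
    end: datetime,
) -> list[str]:
    """Return distinct subject IDs accessed by `account` between `start` and `end`."""
    return sorted({
        row["subject_id"]
        for row in audit_rows
        if row["account"] == account and start <= row["at"] <= end
    })


audit_rows = [
    {"account": "svc-report", "subject_id": "cust-17", "at": datetime(2024, 1, 15, 2, 10)},
    {"account": "svc-report", "subject_id": "cust-03", "at": datetime(2024, 1, 15, 2, 12)},
    {"account": "svc-report", "subject_id": "cust-17", "at": datetime(2024, 1, 15, 2, 40)},
    {"account": "svc-report", "subject_id": "cust-99", "at": datetime(2024, 1, 10, 9, 0)},   # before the window
    {"account": "jdoe", "subject_id": "cust-42", "at": datetime(2024, 1, 15, 2, 15)},        # different account
]
affected = affected_individuals(
    audit_rows,
    account="svc-report",
    start=datetime(2024, 1, 15, 0, 0),
    end=datetime(2024, 1, 15, 6, 0),
)
print(affected)
```

The window boundaries should come from the investigation; if they are uncertain, widening them errs on the side of over-notification.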
Prevention Measures
DDoS Attack
Indicators
- Services unavailable
- Traffic spikes
- Server resource exhaustion
- Network latency
Response Steps
- Verify it is not a legitimate traffic spike
- Enable DDoS mitigation if available
- Block malicious IPs at firewall
- Scale infrastructure if possible
- Engage ISP or CDN for upstream filtering
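Before blocking IPs at the firewall, it helps to see whether a handful of sources account for most of the traffic. The sketch below counts requests per source address and reports those above an assumed share of the total; the `share_threshold` of 0.2 is illustrative, and the addresses come from ranges reserved for documentation. A widely distributed attack will not show up this way, which is one reason upstream filtering is listed as a separate step.

```python
from collections import Counter


def top_talkers(source_ips: list[str], share_threshold: float = 0.2) -> list[tuple[str, float]]:
    """Return (ip, share of requests) for sources at or above the threshold, busiest first."""
    counts = Counter(source_ips)
    total = sum(counts.values())
    return [
        (ip, n / total)
        for ip, n in counts.most_common()
        if n / total >= share_threshold
    ]


# Made-up request sources for illustration.
requests = ["192.0.2.10"] * 6 + ["198.51.100.7"] * 3 + ["203.0.113.5"] * 1
suspects = top_talkers(requests)
print(suspects)
```

A legitimate spike, such as a marketing campaign, tends to spread across many sources, so an empty result is a hint to check business context before blocking anything.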
Prevention Measures
Building Response Capability
Document your response procedures before incidents happen. A crisis is not the time to discover gaps in your plan. At a minimum, the documentation should cover:
- Escalation paths and contact information
- Decision authority and approval processes
- Communication templates and notification sequences
- Legal and regulatory notification requirements
Define roles before incidents happen. Chaos multiplies when nobody knows who does what.
- Incident Commander: Leads the response and coordinates the other roles
- Technical Lead: Directs investigation and remediation work
- Communications Lead: Manages internal and external notifications
- Scribe: Documents timeline and decisions
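A roster like this can be checked mechanically before an incident, for example to catch an unfilled role or one person holding two roles. The sketch below is a minimal illustration; the `roster_gaps` function and the names in the sample roster are assumptions.

```python
REQUIRED_ROLES = (
    "Incident Commander",
    "Technical Lead",
    "Communications Lead",
    "Scribe",
)


def roster_gaps(roster: dict[str, str]) -> dict[str, list[str]]:
    """Report roles with nobody assigned and people assigned to more than one role."""
    unfilled = [role for role in REQUIRED_ROLES if not roster.get(role)]
    holders: dict[str, list[str]] = {}
    for role in REQUIRED_ROLES:
        person = roster.get(role)
        if person:
            holders.setdefault(person, []).append(role)
    doubled_up = sorted(person for person, roles in holders.items() if len(roles) > 1)
    return {"unfilled": unfilled, "doubled_up": doubled_up}


# Hypothetical on-call roster.
roster = {
    "Incident Commander": "Priya",
    "Technical Lead": "Sam",
    "Communications Lead": "Priya",
}
gaps = roster_gaps(roster)
print(gaps)
```

Small teams may accept some doubling up; the point is to make that a deliberate choice instead of a surprise.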
Pre-position tools and access so that technical hurdles do not delay the response.
- Isolated analysis environments
- Out-of-band communication channels
- Forensics tooling and evidence preservation
- Emergency contact and escalation lists
Plans that are not practiced are assumptions. Test your response capability regularly.
- Tabletop exercises quarterly
- Full simulation exercises annually
- Lessons learned reviews after incidents
- Procedure updates based on learnings
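The cadence above is easy to let slip, so a small check against the dates of the last exercises can help. The sketch below approximates "quarterly" as 91 days and "annually" as 365 days; those intervals, the exercise keys, and the sample dates are assumptions for illustration.

```python
from datetime import date, timedelta

# Assumed maximum gap between exercises, in days.
CADENCE_DAYS = {"tabletop": 91, "full_simulation": 365}


def overdue(last_run: dict[str, date], today: date) -> list[str]:
    """Return exercises that have never run or whose last run is older than the cadence."""
    return sorted(
        name
        for name, days in CADENCE_DAYS.items()
        if name not in last_run or today - last_run[name] > timedelta(days=days)
    )


last_run = {"tabletop": date(2024, 1, 10), "full_simulation": date(2023, 9, 1)}
late = overdue(last_run, today=date(2024, 6, 1))
print(late)
```

Passing `today` explicitly keeps the check deterministic, which makes it straightforward to run from a scheduled job or a test.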
Do you have an incident response plan?
If your answer is "we will figure it out when something happens," that is not a plan. That is optimism.