From "Fundamentals of Software Architecture"
The Collaborative Risk Storming Process
Key Insight
Risk storming is a collaborative exercise for determining architectural risk. It is collaborative because no single architect knows the whole system or can identify every potential risk. Involving multiple architects, senior developers, and tech leads gives the exercise a comprehensive, implementation-aware perspective. Each session focuses on specific risk dimensions, such as unproven technology, performance, scalability, availability, transitive dependencies, data loss, single points of failure, and security. The exercise is built around an architecture diagram that must be current and accessible to all participants: a holistic diagram for an overall assessment, or a contextual diagram for a specific area of the application.
The risk storming process unfolds in three primary activities: Identification, Consensus, and Mitigation. Identification is always an individual, non-collaborative activity. Each participant independently uses the risk matrix to classify the risks they see as low, medium, or high, and records each one on a color-coded Post-it note. Working individually first keeps participants from influencing one another and produces a broader range of identified risks. When constraints force a session to cover multiple dimensions, participants write the dimension on each note. Even so, focusing on a single dimension per session is recommended, because it keeps participants focused and the results clear.
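The classification step can be sketched in Python. The summary states only that participants classify risks as low, medium, or high and that unknown technologies receive the maximum rating of 9. The rest of this sketch is an assumption made for illustration: scoring a risk as impact multiplied by likelihood on a 1 to 3 scale, the band thresholds, and the names `Level`, `risk_score`, and `classify`.

```python
from enum import IntEnum


class Level(IntEnum):
    """Scale for impact and likelihood (assumed 1-3 scale)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# The summary states that unknown technologies are assigned the maximum risk (9).
UNKNOWN_TECHNOLOGY_SCORE = 9


def risk_score(impact: Level, likelihood: Level) -> int:
    """Combine impact and likelihood into a score from 1 to 9.

    Assumption: the matrix multiplies the two values.
    """
    return int(impact) * int(likelihood)


def classify(score: int) -> str:
    """Map a score to a band (assumed thresholds: 1-2 low, 3-4 medium, 6-9 high)."""
    if not 1 <= score <= 9:
        raise ValueError(f"score must be between 1 and 9, got {score}")
    if score <= 2:
        return "low"
    if score <= 4:
        return "medium"
    return "high"


if __name__ == "__main__":
    score = risk_score(impact=Level.HIGH, likelihood=Level.MEDIUM)
    print(score, classify(score))
    print(UNKNOWN_TECHNOLOGY_SCORE, classify(UNKNOWN_TECHNOLOGY_SCORE))
```

Under these assumptions, each participant would run this classification privately for every area of the diagram and choose the note color from the resulting band.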
Consensus and Mitigation are collaborative activities. During Consensus, participants place their Post-it notes on a large architecture diagram, and the team discusses each marked area until it agrees on a risk rating. Two situations trigger a discussion of the rationale behind the notes: discrepancies (for example, some participants rate a component medium risk while others rate it high) and single-person identifications (for example, one participant marks 'Push Expansion Servers' as high risk while no one else marks it at all). Such a discussion might reveal, from one participant's prior experience, that the 'Push Expansion Servers' are unreliable, or that a participant rated the 'Redis cache' high risk incorrectly because they were unfamiliar with it. Both cases highlight the value of diverse perspectives and developer involvement, especially for unknown technologies, which are assigned the maximum risk rating (9).
Mitigation, the final and most important activity, involves collaboratively devising architecture changes or enhancements that reduce or eliminate the agreed-upon risks. Such changes, for example splitting a central database into two clustered databases or introducing asynchronous queues with an 'Ambulance Pattern' for priority traffic, incur costs. Stakeholders therefore negotiate trade-offs and decide whether the cost of a change outweighs the risk it addresses, which often leads to an alternative, more cost-effective mitigation strategy.
Risk storming is an ongoing process throughout a system's lifecycle. It typically takes place after major features or at the end of iterations, so that risks are continuously identified and mitigated before production.
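The two Consensus triggers, differing ratings and single-person identifications, can be sketched as a small check over the collected notes. This is an illustrative model only: the `Note` type, the `needs_discussion` function, and the participant names are hypothetical, and a real session resolves these cases through conversation rather than code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """One Post-it note: who placed it, where, and with what rating."""
    participant: str
    component: str
    rating: str  # "low", "medium", or "high"


def needs_discussion(notes: list[Note], participants: set[str]) -> dict[str, str]:
    """Return the components the team should discuss, with the reason.

    A component is flagged when participants disagree on its rating, or
    when exactly one participant placed a note on it.
    """
    ratings_by_component: dict[str, dict[str, str]] = defaultdict(dict)
    for note in notes:
        ratings_by_component[note.component][note.participant] = note.rating

    flagged: dict[str, str] = {}
    for component, ratings in ratings_by_component.items():
        if len(set(ratings.values())) > 1:
            flagged[component] = "ratings differ"
        elif len(ratings) == 1 and len(participants) > 1:
            flagged[component] = "identified by one participant only"
    return flagged


if __name__ == "__main__":
    participants = {"alice", "bob", "carol"}  # hypothetical names
    notes = [
        Note("alice", "Push Expansion Servers", "high"),
        Note("alice", "Central Database", "medium"),
        Note("bob", "Central Database", "high"),
        Note("carol", "Central Database", "high"),
        Note("alice", "Redis cache", "low"),
        Note("bob", "Redis cache", "low"),
        Note("carol", "Redis cache", "low"),
    ]
    for component, reason in needs_discussion(notes, participants).items():
        print(f"{component}: {reason}")
```

In this sketch, a component that every participant rates identically is not flagged, which mirrors the idea that discussion time goes to areas where views diverge.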
Summarized from Fundamentals of Software Architecture by Mark Richards and Neal Ford.