From "The Pragmatic Programmer"
DRY—The Evils of Duplication
Key Insight
Duplicated knowledge is a major source of maintenance problems, and those problems start well before an application is deployed. Knowledge, which spans specifications, code, and tests, is volatile: it changes as requirements, regulations, and algorithms evolve. The Don't Repeat Yourself (DRY) principle states that 'every piece of knowledge must have a single, unambiguous, authoritative representation within a system.' Violating DRY means expressing the same knowledge in more than one place, so a change to one copy must be repeated in every other copy; sooner or later an update is forgotten and the system contradicts itself. The principle is not limited to copied and pasted code. It covers any duplication of knowledge or intent, including the same concept expressed in different formats or in different parts of a system.
Code duplication is common and takes many forms. For example, financial formatting functions that each repeat the logic for handling negative numbers, or print statements that each repeat the same field width, violate DRY; both can be fixed by extracting the shared pattern into a reusable function. A subtler violation is the implicit link between the number of hyphens in a separator line and the width of the amount field. Not all identical code is a DRY violation, however: if two identical code segments represent distinct pieces of knowledge that merely happen to share rules, such as separate age and quantity validations, that is coincidence, not duplication. In documentation, comments that restate what the code does tend to drift out of date and contradict it. Clear naming and layout in the code itself should remove the need for most explanatory comments, which keeps the knowledge in one place.
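The formatting example above can be sketched in Python as follows. This is an illustrative sketch, not the book's own code: the names AMOUNT_WIDTH, format_amount, report_line, separator, and balance_report are hypothetical, and the trailing-minus convention for negative amounts is an assumption. The field width and the negative-number rule each live in exactly one place, and the separator derives its hyphen count from the same constant.

```python
# Hypothetical names throughout; a sketch of removing duplicated
# formatting knowledge, not a definitive implementation.

AMOUNT_WIDTH = 10  # the only place the amount column width is stated
LABEL_WIDTH = 10   # the only place the label column width is stated


def format_amount(value):
    """Format with two decimals; negatives get a trailing minus (assumed convention)."""
    text = f"{abs(value):{AMOUNT_WIDTH}.2f}"
    return text + ("-" if value < 0 else " ")


def report_line(label, amount_text):
    """Lay out one report row: left-justified label, then the amount text."""
    return f"{label:<{LABEL_WIDTH}}{amount_text}"


def separator():
    """The hyphen count is derived from AMOUNT_WIDTH, not typed by hand."""
    return report_line("", "-" * AMOUNT_WIDTH)


def balance_report(debits, credits, fees, balance):
    """Build the report rows without repeating any formatting rule."""
    return [
        report_line("Debits:", format_amount(debits)),
        report_line("Credits:", format_amount(credits)),
        report_line("Fees:", format_amount(fees)),
        separator(),
        report_line("Balance:", format_amount(balance)),
    ]


if __name__ == "__main__":
    print("\n".join(balance_report(100.0, 20.0, -12.5, 92.5)))
```

Changing AMOUNT_WIDTH now widens every amount and the separator together, so the implicit link between the two can no longer be broken by a forgotten update.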
Data structures can also violate DRY when they store derived information, such as a line's length that could be calculated on demand from its start and end points. Caching a derived value for performance is a form of controlled duplication: it is acceptable as long as it stays localized within a class and is exposed only through accessor functions, so the rest of the system never depends on the stored copy. Interfacing with external systems (APIs, remote services, data sources) often introduces 'representational duplication,' where code holds knowledge that is also present in the external system. This can be mitigated by specifying internal APIs in a neutral format, relying on formal specifications such as OpenAPI for public APIs, and generating data containers from database schemas. For external data, a pragmatic alternative is to hold the data in key/value structures and check it with a table-driven validation suite. The hardest duplication to detect occurs between developers, where entire pieces of functionality may be re-implemented unknowingly. It is best addressed through strong team communication: daily standups, discussion forums, a project librarian appointed to facilitate knowledge exchange, and code reviews, all aimed at making reuse easier than reinvention.
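As a minimal sketch of the derived-data point above, the hypothetical Line class below computes its length from its endpoints and caches the result privately. The class name, the tuple representation of points, and the property-based accessors are assumptions made for illustration. Callers only ever see the length accessor, so the cache can be added or removed without affecting them.

```python
import math


class Line:
    """A line whose length is derived from its endpoints (hypothetical example)."""

    def __init__(self, start, end):
        self._start = start
        self._end = end
        self._length = None  # private cache; None means "not computed yet"

    @property
    def start(self):
        return self._start

    @start.setter
    def start(self, point):
        self._start = point
        self._length = None  # invalidate the cached derived value

    @property
    def end(self):
        return self._end

    @end.setter
    def end(self, point):
        self._end = point
        self._length = None  # invalidate the cached derived value

    @property
    def length(self):
        # Computed on demand from the authoritative data, then cached.
        if self._length is None:
            self._length = math.dist(self._start, self._end)
        return self._length


if __name__ == "__main__":
    line = Line((0.0, 0.0), (3.0, 4.0))
    print(line.length)
    line.end = (6.0, 8.0)  # moving an endpoint keeps the length consistent
    print(line.length)
```

Because the endpoints are the only authoritative data and every write goes through a setter, the stored length cannot silently disagree with them.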
Summarized from The Pragmatic Programmer by Andrew Hunt and David Thomas.